Since version 3.2.2, MaSuRCA uses modified algorithms and settings for assembly of heterozygous diploid/polyploid genomes. For that reason there is a ploidy setting that is auto-computed and saved in PLOIDY.txt. Valid values for PLOIDY are 1 and 2. Editing this file forces the assembler to use the ploidy value it contains.
Ploidy 1 means haploid and ploidy 2 means diploid. This is a gross over-simplification that is used in the assembler for the time being.
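As a minimal sketch of checking and overriding the setting, assuming PLOIDY.txt sits in the assembly working directory and holds a single integer (neither the location nor the exact file format is stated here, and the helper names `read_ploidy` and `write_ploidy` are hypothetical, not part of MaSuRCA):

```python
from pathlib import Path

VALID_PLOIDY = (1, 2)  # the only values accepted at this time


def read_ploidy(workdir):
    """Return the ploidy stored in <workdir>/PLOIDY.txt.

    Assumption: the file contains one integer, optionally followed by whitespace.
    """
    value = int(Path(workdir, "PLOIDY.txt").read_text().strip())
    if value not in VALID_PLOIDY:
        raise ValueError(f"unexpected ploidy {value}; expected 1 or 2")
    return value


def write_ploidy(workdir, value):
    """Overwrite <workdir>/PLOIDY.txt so the assembler uses `value`."""
    if value not in VALID_PLOIDY:
        raise ValueError(f"ploidy must be 1 or 2, got {value}")
    Path(workdir, "PLOIDY.txt").write_text(f"{value}\n")
```

For example, `write_ploidy("my_assembly", 2)` followed by `read_ploidy("my_assembly")` would store and read back a forced diploid setting, where `my_assembly` is a placeholder directory name.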
Ploidy for non-clonal genomes is always 2, but for the internal algorithms ploidy 1 means that the genome is relatively inbred and ploidy 2 means that it is relatively outbred. The reasoning is that in most genomes a proportion of the sequence is conserved between the two haplotypes, and a proportion is divergent. I treat ploidy as a measurement of the ratio (total amount of unique sequence in the genome) / (haploid genome size). This is a number between 1 and 2: 1 means no divergence (the homologous chromosomes are identical) and 2 means the two haplotypes are 100% different. At this time I do not treat this as a floating parameter between 1 and 2; instead I set a threshold in the middle based on a heuristic computation. This is an over-simplification and I will introduce a refinement of this parameter in later versions.
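To make the ratio concrete, here is a small sketch. The function names are hypothetical, and the 1.5 cutoff is only an assumption standing in for "a threshold in the middle"; the actual heuristic the assembler uses is not described here.

```python
def ploidy_ratio(unique_bp, haploid_bp):
    """Ratio of total unique sequence to haploid genome size (between 1 and 2)."""
    if haploid_bp <= 0:
        raise ValueError("haploid genome size must be positive")
    return unique_bp / haploid_bp


def ploidy_setting(ratio, threshold=1.5):
    """Collapse the ratio to the two supported values.

    Assumption: a fixed midpoint cutoff of 1.5; the real threshold is
    computed heuristically by the assembler.
    """
    return 2 if ratio >= threshold else 1


# Worked example: a 100 Mb haploid genome where 80% of the sequence is
# conserved between haplotypes and 20% is divergent. The conserved part is
# counted once and the divergent part twice: 80 Mb + 2 * 20 Mb = 120 Mb.
unique_bp = 80_000_000 + 2 * 20_000_000
ratio = ploidy_ratio(unique_bp, 100_000_000)
setting = ploidy_setting(ratio)
```

Under these assumptions a genome with 20% divergent sequence has a ratio of about 1.2 and falls on the ploidy 1 (relatively inbred) side of the cutoff.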
No, at this time the possible values are 1 and 2. When I talk about ploidy I refer only to ploidy between pairs of homologous chromosomes (maternal and paternal), not polyploidy.
Where can I check the ploidy information, and where do I set the ploidy?