Friday, September 15, 2017

New version MaSuRCA 3.2.3

I have just finished testing the new version of MaSuRCA, version 3.2.3.  The only notable new feature in this version is gap closing for assemblies that use PacBio/Oxford Nanopore data.  All other changes are improvements to stability, usability and speed:

1. Added scaffold gap closing for hybrid assemblies that use PacBio/Oxford Nanopore data.
2. Improved the speed and stability of the filter for Illumina mate pairs.
3. Ploidy and estimated genome size are now saved and can be read from the PLOIDY.txt and ESTIMATED_GENOME_SIZE.txt files.
4. Nucmer now runs multi-threaded when SOAPdenovo2 is used as the contigger/scaffolder, to filter out redundant small contigs after gap closing.
5. Updated MUMmer to the latest version.
6. Many small performance improvements that avoid re-running steps that have already completed when the assembler is restarted.
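
As a rough illustration of item 3, the saved values could be picked up from a script along these lines. This is only a sketch: it assumes each file holds a single number as its first token, which the post does not specify, and `read_assembly_estimates` and `assembly_dir` are hypothetical names used for illustration.

```python
from pathlib import Path
from typing import Optional


def read_number(path: Path) -> Optional[float]:
    """Return the first whitespace-separated token of `path` as a number.

    Returns None if the file is missing, empty, or not numeric.
    Assumption: the file stores one number as its first token.
    """
    if not path.is_file():
        return None
    tokens = path.read_text().split()
    if not tokens:
        return None
    try:
        return float(tokens[0])
    except ValueError:
        return None


def read_assembly_estimates(assembly_dir: str) -> dict:
    """Collect the two estimates from an assembly directory (hypothetical helper)."""
    base = Path(assembly_dir)
    return {
        "estimated_genome_size": read_number(base / "ESTIMATED_GENOME_SIZE.txt"),
        "ploidy": read_number(base / "PLOIDY.txt"),
    }


if __name__ == "__main__":
    # "." stands in for the directory where the assembly was run.
    print(read_assembly_estimates("."))
```

A missing or unparsable file yields `None` rather than an exception, so the script can also be run on a directory where the assembly has not yet reached that step.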

The new version is available from my ftp site:


  1. Hi! Sorry for asking questions here, I couldn't find an adequate forum for users.
    We are doing a hybrid Illumina + MinION assembly using MaSuRCA 3.2.3, and the assembly just dies in the overlapcorrection stage. Some chunks say: ERROR: Bad alignment ends a_end = 0 b_end = 0, and then all hell breaks loose: we get segfaults, insults, and then it's dead. Would you have any opinion about this? Thanks

    1. I have never seen this kind of problem. Try re-running the overlaps after deleting genome.ovlStore, 1-* and 3-*.
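
The cleanup suggested in this reply could be scripted roughly as follows. This is a hedged sketch, not part of MaSuRCA: the reply does not say which directory holds `genome.ovlStore`, `1-*` and `3-*`, so `work_dir` is an assumption the caller must supply, and `clear_overlap_outputs` is a hypothetical helper name. It defaults to a dry run so nothing is deleted until the listed paths have been checked.

```python
import shutil
from pathlib import Path

# Names taken from the reply above; where they live is left to the caller.
PATTERNS = ("genome.ovlStore", "1-*", "3-*")


def clear_overlap_outputs(work_dir, dry_run=True):
    """List (and optionally delete) overlap outputs directly under work_dir.

    Returns the sorted names of the matching entries. With dry_run=True
    nothing is removed.
    """
    base = Path(work_dir)
    targets = sorted({p for pattern in PATTERNS for p in base.glob(pattern)})
    for path in targets:
        if dry_run:
            continue
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)  # remove a real directory and its contents
        else:
            path.unlink()  # remove a plain file or a symlink
    return [p.name for p in targets]


if __name__ == "__main__":
    # Dry run against the current directory: prints what would be removed.
    for name in clear_overlap_outputs(".", dry_run=True):
        print("would remove:", name)
```

Once the dry-run listing looks right, calling the function with `dry_run=False` removes those entries, after which the assembly can be restarted so the overlaps are recomputed.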

  2. Dear Aleksey,

    I am using MaSuRCA 3.2.3. At the moment I am running an Illumina-only assembly with SOAP_ASSEMBLY=1.
    I have previously done the assembly using PE data only, then I soft linked the previous data into a new directory and launched the assembly adding 2 MP libraries that I named s3 and j3 and j5.
    The assembly consistently fails and the SOAPdenovo.err says at the end:

    Import reads from file:
    Cannot open ../j3.cor.clean.fa. Now exit to system...

    However, there is no such file (nor one for j5). The only file is sj.cor.fa, which was created after renaming both library fastqs:

    devel 102G Oct 26 16:11 sj.cor.fa
    -rw-r--r-- 1 fcruz devel 1.2G Oct 26 16:11 sj.cor.log
    -rw-r--r-- 1 fcruz devel 228M Oct 26 14:11 pe.cor.log
    -rw-r--r-- 1 fcruz devel 62G Oct 26 11:59 quorum_mer_db.jf
    -rw-r--r-- 1 fcruz devel 106G Oct 26 11:34 j5.renamed.fastq
    -rw-r--r-- 1 fcruz devel 117G Oct 26 11:20 j3.renamed.fastq

    Is this a bug or a problem particular to my setup? Does this version create a generic sj. file with all libraries in it?

    Thanks in advance,

    sj.cor.clean.fa work1

    1. Softlinks will break the assembly. You can simply re-run the assembly in the existing folder where the PE-only assembly was run; MaSuRCA will re-use the appropriate files.

  3. Hi,

    I tried running this new version and it crashed (after 11 days of running). This was on a single HPC node, with 500GB of RAM allocated for the MaSuRCA job.

    Here are some snippets of error logs. Would you happen to have any thoughts on how to avoid this?

    slurm-94620.out (tail)

    compute_psa 6601202 2632582819
    Refining alignments
    Generating assembly input files
    Coverage of the mega-reads less than 5 -- using the super reads as well
    Coverage threshold for splitting unitigs is 138 minimum ovl 63
    Running assembly
    /gscratch/srlab/programs/MaSuRCA-3.2.3/bin/ line 85: 24330 Aborted (core dumped) overlapStoreBuild -o $ASM_DIR/$ASM_PREFIX.ovlStore -M 65536 -g $ASM_DIR/$ASM_PREFIX.gkpStore $ASM_DIR/overlaps_dedup.ovb.gz > $ASM_DIR/overlapStore.rebuild.err 2>&1
    Assembly stopped or failed, see
    [Mon Oct 30 23:19:37 PDT 2017] Assembly stopped or failed, see

    --- (tail)

    number of threads = 28 (OpenMP default)

    ERROR: overlapStore '/gscratch/scrubbed/samwhite/20171019_masurca_oly_assembly/' is incomplete; previous overlapStoreBuild probably crashed.

    Failure message:

    failed to unitig


    Scanning overlap files to count the number of overlaps.
    Found 277.972 million overlaps.
    Memory limit 65536MB supplied. Ill put 3246167525 IIDs (3435.97 million overlaps) into each of 1 buckets.
    bucketizing DONE!
    overlaps skipped:
    0 OBT - low quality
    0 DUP - non-duplicate overlap
    0 DUP - different library
    0 DUP - dedup not requested
    terminate called after throwing an instance of std::bad_alloc
    what(): std::bad_alloc

    Failed with Aborted

    Backtrace (mangled):


    Backtrace (demangled):

    [0] overlapStoreBuild() [0x40523a]
    [1] /usr/lib64/ + 0xf100 [0x2af83b3c0100]
    [2] /usr/lib64/ + 0x37 [0x2af83c0395f7]
    [3] /usr/lib64/ + 0x148 [0x2af83c03ace8]
    [4] /usr/lib64/ + 0x165 [0x2af83b62d9d5]
    [5] /usr/lib64/ + 0x5e946 [0x2af83b62b946]
    [6] /usr/lib64/ + 0x5e973 [0x2af83b62b973]
    [7] /usr/lib64/ + 0x5eb93 [0x2af83b62bb93]
    [8] /usr/lib64/ new(unsigned long) + 0x7d [0x2af83b62c12d]
    [9] /usr/lib64/ new[](unsigned long) + 0x9 [0x2af83b62c1c9]
    [10] overlapStoreBuild() [0x402e10]
    [11] /usr/lib64/ + 0xf5 [0x2af83c025b15]
    [12] overlapStoreBuild() [0x403089]