Wednesday, May 2, 2018

MaSuRCA 3.2.6 official

I have released the official 3.2.6 version of MaSuRCA.  It is available on the ftp site and also on github.

Upgrading is easy: simply remove the 3.2.4 (or older) version and install this one.  Please use this version going forward. Big thanks to all users who reported errors and bugs!

Please see this post for the list of improvements in 3.2.6 version: 

Thursday, April 19, 2018

Reporting issues with MaSuRCA on github

MaSuRCA is now on github.  Github has an excellent system for reporting bugs/issues with the software.  I encourage all users of MaSuRCA to use this resource and report issues there.

Also if you are having a problem, please check the github issues page to see if the problem has been addressed already.

Thursday, April 5, 2018

Tree tobacco plant assembled with MaSuRCA

I am glad to see published assemblies of novel genomes that used MaSuRCA.  Here is a recent data note published in BMC:

This is an assembly of tree tobacco Nicotiana glauca from Illumina-only data (350bp fragment Paired End library and ~4,000bp fragment mate pair library) that yielded an N50 contig size of about 31Kbp.  The assembly size (~3.2Gbp) was bigger than the estimated genome size (~2Gbp), which points to relatively high heterozygosity of the plant.
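
For readers unfamiliar with the N50 statistic quoted above, here is a minimal Python sketch of how it is usually computed from a list of contig lengths. The function name `n50` and the toy lengths are made up for illustration; this is not code from MaSuRCA, and the numbers are not from the tree tobacco assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one contig length")


# Toy contig lengths in bp (illustrative only, not real data).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150 is half of the 300 bp total
```

An N50 of about 31Kbp therefore means that half of the assembled sequence sits in contigs of roughly 31Kbp or longer.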

Monday, March 26, 2018

Pre-release version MaSuRCA 3.2.6beta

Over the past several weeks I have been working on improving the stability of MaSuRCA.  I thank all users who reported problems to me; I have been addressing these problems in the code. The improved pre-release version of MaSuRCA 3.2.6beta is posted here:

This is a maintenance release.  There are no new features relative to version 3.2.4, but there are many stability and performance improvements based on feedback from the users (AGAIN BIG THANKS EVERYONE!!!) and on my own use of MaSuRCA for the assemblies that I run.

List of major improvements:

1. Workaround for an occasional failure in overlap correction
2. Workaround for a unitig consensus failure in Illumina-only assemblies
3. Improved performance and stability when running mega-reads on an SGE grid
4. Cleaned up the code and improved restarting of assemblies with Illumina-only data
5. Included an updated version of MUMmer4
6. Improved the compilation and install script on platforms where @ is present in the PWD
7. Fixed bugs and improved performance of the assembly polishing code
8. Improved speed and stability of the Oxford Nanopore correction code
9. Fixed a bug that caused gap filling to run in an endless loop

The complete list of bugfixes and improvements for MaSuRCA and its submodules can be found on github.

I would like this release to be a stable point before I continue adding new features.  Please let me know in the comments if you have any issues with this release.  I will remove the beta status after 2 weeks of testing and post it as an official release.

Monday, January 22, 2018

MaSuRCA is now on github

MaSuRCA has a new home on github. MaSuRCA combines jellyfish, QuORUM, and other modules into one repository. The individual modules are submodules in the repository. The master branch of the masurca repository tracks the latest working commits. To check out and compile MaSuRCA, do the following:

git clone

git submodule init

git submodule update


MaSuRCA will compile under build/inst/bin/.

To create a distribution, run make install. This will create the distributable tarball MaSuRCA-3.2.4.tar.gz.

EDIT: to compile MaSuRCA from the development tree, you will need the following dependencies:
swig and yaggo. Both must be available on the path.

Please post all questions and bug reports under "issues" in github:

Friday, January 12, 2018

New MaSuRCA version 3.2.4

I have just finished testing a new release of MaSuRCA, version 3.2.4. The major improvement in this version is the ability to run the hybrid assembly (Illumina+PacBio/Oxford Nanopore data) on a grid.  At this point only SGE is supported; I am working on SLURM support, which will be implemented shortly. Other improvements include:

1. gzipped fasta/fastq input files of PacBio/Oxford Nanopore reads supported
2. general speed and accuracy improvements
3. minor bugfixes based on user feedback

The new version is designed to allow mammalian genome assembly on a grid of computers with 128GB of RAM.

The new release is available here

I am now updating the MaSuRCA manual to reflect the new options for grid execution, and I will upload it later today.