

The bio-samtools package is a wrapper around libbam.so (on Linux) and libbam.1.dylib (on Mac OS X), the core shared object library from the SAMtools package. Here we describe this Ruby language binding to the SAMtools library, developed for our own work and distributed as a BioRuby plug-in. The SAMtools utilities are implemented in C and provide an API for programmatic access, for which there are multiple language bindings, notably in Perl, Python and Java. Large SAM files can be converted to their binary equivalent, BAM files, a compressed and indexed variant that supports random access and greatly facilitates genetic analyses that rely on high-throughput alignment.
High-throughput DNA sequencing in the biological sciences has made it possible for researchers to obtain many millions of sequence reads in single, low-cost experiments. These sequence reads are typically very short compared to the parent genome (reads will usually be in the range of 36 to 200 nucleotides long, while genomes are many millions of nucleotides long) and very redundant: many reads may have the same sequence. One widespread use for these sequences is in detecting small differences in the genome sequence of the sample donor, which is achieved by using computational methods to align each short sequence read against a long reference genome sequence, then examining the derived alignments and determining positions at which there are differences. Many programs have been created for alignment, including BWA, Bowtie, SOAP, NOVOALIGN and BFAST, each implementing different algorithms optimised to address different aspects of the alignment problem. Most high-throughput alignment programs produce a standard output file in Sequence Alignment/Map (SAM) format, a tab-delimited, text-based format for describing alignments. The SAMtools utilities comprise a very useful and widely used suite of software for manipulating files and alignments in the SAM format.
Conclusions: Bio-samtools is a flexible and easy-to-use interface that programmers of many levels of experience can use to access information in the popular and common SAM/BAM format. It encapsulates the utility of SAMtools in three main classes: Bio::DB::Sam, representing the alignment files and providing access to the data in them; Bio::DB::Alignment, representing the individual read alignments inside the files; and Bio::DB::Pileup, representing the summarised nucleotides of reads over a single point in the nucleotide sequence to which the reads are aligned.
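To make the idea of a pileup concrete, the following is a minimal sketch in plain Ruby of how tab-delimited SAM records can be summarised over a single reference position. It does not use bio-samtools or reproduce its API: the names `Read`, `parse_sam_line` and `pileup_at` and the three example records are hypothetical, and the sketch assumes ungapped alignments (a CIGAR string made up only of matches), which real data will not always satisfy.

```ruby
# Hypothetical, simplified read record: name, 1-based leftmost position, sequence.
Read = Struct.new(:name, :pos, :seq)

# Pull the columns needed here out of one tab-delimited SAM line.
# In the SAM format, column 1 is QNAME, column 4 is POS and column 10 is SEQ.
def parse_sam_line(line)
  fields = line.chomp.split("\t")
  Read.new(fields[0], Integer(fields[3]), fields[9])
end

# Count the bases that the reads place over one 1-based reference position.
# Assumption: every read aligns without insertions, deletions or clipping.
def pileup_at(reads, position)
  counts = Hash.new(0)
  reads.each do |read|
    offset = position - read.pos
    # Skip reads that do not cover the requested position.
    next if offset.negative? || offset >= read.seq.length
    counts[read.seq[offset]] += 1
  end
  counts
end

# Invented example records; read3 carries a mismatch at its first base.
SAM_LINES = [
  "read1\t0\tchr_1\t1\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
  "read2\t0\tchr_1\t3\t60\t8M\t*\t0\t0\tGTACGTAC\tIIIIIIII",
  "read3\t0\tchr_1\t5\t60\t8M\t*\t0\t0\tCCGTACGT\tIIIIIIII"
].freeze

reads = SAM_LINES.map { |line| parse_sam_line(line) }
p pileup_at(reads, 5)
```

A position at which the counts are split between two bases, as at position 5 in these invented records, is the kind of site a variant-detection analysis would examine further.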
The SAMtools utilities are implemented in C and provide an API for programmatic access. To help make this functionality available to programmers wishing to develop in the high-level Ruby language, we have developed bio-samtools, a Ruby binding to the SAMtools library. The SAMtools utilities comprise a very useful and widely used suite of software for manipulating files and alignments in the SAM and BAM formats, used in a wide range of genetic analyses.
