MATLAB: How to create/find a bam index file for using the bamread function

Bioinformatics Toolbox

I use Galaxy to align DNA sequences with Bowtie. This produces a BAM file, which I can read with baminfo from the Bioinformatics Toolbox, but no index file. How do I go about creating a proper index file to use with bamread? Thanks,

Best Answer

  • You can use a BioMap object to load the information from the BAM file without the BAI file. See instructions here: http://www.mathworks.com/help/bioinfo/ug/storing-and-managing-short-read-sequence-data-in-objects.html
    For example, to load the data from chromosome 1, do this:
    info = baminfo(bamFilename,'ScanDictionary',true);
    bm = BioMap(bamFilename, 'SelectReference', 'ChrI');
    Instead of 'ChrI', you can use any of the references that were used for mapping. These are stored here:
    info.ScannedDictionary
    The BioMap line will also generate a BAI file, which you can then use with bamread, although reading with bamread usually takes a lot more memory and is slower than working with the BioMap object.
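
    Putting the steps above together, here is a minimal sketch of the whole workflow. The file name 'aligned.bam' and the range [1 10000] are placeholders I made up for illustration; substitute your own file and region. It assumes the BAM file is readable by baminfo, as in the question.

    ```matlab
    % Hypothetical file name; replace with the BAM file exported from Galaxy.
    bamFilename = 'aligned.bam';

    % Scan the file to find which reference sequences actually have reads.
    info = baminfo(bamFilename, 'ScanDictionary', true);
    refs = info.ScannedDictionary;   % cell array of reference names
    disp(refs)

    % Pick one of the scanned references (here simply the first one).
    refName = refs{1};

    % Constructing the BioMap object loads the reads for that reference;
    % per the answer above, this step also generates the BAI index file.
    bm = BioMap(bamFilename, 'SelectReference', refName);
    fprintf('%d reads mapped to %s\n', bm.NSeqs, refName);

    % With the index in place, bamread can pull reads from a region.
    % The range below is an arbitrary example, not taken from the question.
    reads = bamread(bamFilename, refName, [1 10000]);
    ```

    If all you need is the reads for one reference, the BioMap object alone may be enough, and the bamread call at the end can be skipped.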