I have sequencing data (Illumina). Library prep was focused on short ncRNAs. I would like to identify human miRNAs. Do you think it is feasible to simply BLAST against a current version of miRBase and filter the output for mature, human miRNAs without using TopHat?
You don't need to use TopHat, but it is better to use bowtie than BLAST. First of all, you need to remove the adapter sequences (along with other processing steps prior to alignment).
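To make the adapter-removal step concrete, here is a minimal Python sketch. It only handles exact matches of the adapter (or of an adapter prefix at the very end of the read); a dedicated trimming tool would also tolerate sequencing errors. The adapter sequence, the function name `trim_3p_adapter` and the `min_overlap` parameter are assumptions for illustration, so substitute the adapter from your own library prep kit.

```python
# Assumed 3' adapter for illustration only; use the one from your kit.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"

def trim_3p_adapter(read, adapter=ADAPTER, min_overlap=6):
    """Remove an exact 3' adapter, or an adapter prefix running off the read end."""
    for start in range(len(read) - min_overlap + 1):
        # Number of adapter bases that fit between `start` and the read end.
        n = min(len(read) - start, len(adapter))
        if read[start:start + n] == adapter[:n]:
            return read[:start]
    # No adapter found: return the read unchanged.
    return read

if __name__ == "__main__":
    insert = "TGAGGTAGTAGGTTGTATAGTT"  # hsa-let-7a-5p
    print(trim_3p_adapter(insert + ADAPTER[:15]))
```

Because short RNA inserts are usually shorter than the read length, most reads in such a library will contain at least part of the adapter, which is why this step matters before any alignment.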
Now there are two aspects here:
- miRNA quantification
- miRNA discovery
The first is relatively straightforward, whereas the second requires additional tests such as prediction of stem loops. There is published software that handles both aspects (miRScan etc.). See this review for details on different miRNA gene finders. I have used miRdeep, and it uses bowtie for alignment.
BLAST would work, but you have to set parameters that are suitable for short sequences (increase the E-value cutoff, reduce the word size, etc.). Moreover, BLAST does not have a cutoff option for the number of mismatches or the length of the alignment (it only has a post-alignment filter for percentage identity).
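If you do go with BLAST, the missing mismatch and alignment-length cutoffs can be applied afterwards to the tabular output. The sketch below assumes the default 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name and the thresholds are hypothetical choices, not recommended values.

```python
import csv

def filter_blast_hits(handle, max_mismatches=1, min_align_len=18, max_gaps=0):
    """Yield (query, subject, length, mismatches) for hits passing the cutoffs.

    `handle` is any iterable of lines in BLAST default tabular format.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid = row[0], row[1]
        length, mismatch, gapopen = int(row[3]), int(row[4]), int(row[5])
        if (length >= min_align_len
                and mismatch <= max_mismatches
                and gapopen <= max_gaps):
            yield qseqid, sseqid, length, mismatch

if __name__ == "__main__":
    import sys
    # Usage: python filter_hits.py < blast_output.tsv
    for hit in filter_blast_hits(sys.stdin):
        print(*hit, sep="\t")
```

Note that the alignment length reported by BLAST is the length of the local alignment, so a short hit inside a longer read passes or fails on `min_align_len` alone; you may also want to compare it with the read length.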
I also personally prefer bowtie because it has something called n-alignment mode. In this mode the entire read is divided into seed and non-seed regions. You can specify the length of the seed region and the mismatch cutoff. Since the seed region of a miRNA (bases 2-8) is critical for its function, I generally set the seed mismatch cutoff to zero while allowing one or two mismatches in the non-seed region (note that in bowtie the seed always starts at base 1).
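The seed/non-seed policy described above can be illustrated with a small Python sketch. This is not how bowtie itself scores alignments; it is only an ungapped comparison of a read against a candidate mature sequence of the same length. The function name and defaults are assumptions, with the seed taken as the first 8 bases so that it covers bases 2-8.

```python
def passes_seed_policy(read, target, seed_len=8, max_seed_mm=0, max_nonseed_mm=2):
    """Check mismatch limits separately in the seed and the rest of the read."""
    if len(read) != len(target):
        raise ValueError("read and target must have the same length")
    # Mismatches inside the seed (the first `seed_len` bases).
    seed_mm = sum(a != b for a, b in zip(read[:seed_len], target[:seed_len]))
    # Mismatches in the remaining, non-seed part of the read.
    nonseed_mm = sum(a != b for a, b in zip(read[seed_len:], target[seed_len:]))
    return seed_mm <= max_seed_mm and nonseed_mm <= max_nonseed_mm

if __name__ == "__main__":
    mature = "TGAGGTAGTAGGTTGTATAGTT"  # hsa-let-7a-5p
    tail_variant = mature[:15] + "A" + mature[16:]  # mismatch outside the seed
    seed_variant = mature[:2] + "C" + mature[3:]    # mismatch inside the seed
    print(passes_seed_policy(tail_variant, mature))
    print(passes_seed_policy(seed_variant, mature))
```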
In general, I would advise that you go for miRdeep. miRdeep doesn't immediately align against the mature sequences. It maps the location of the mature sequences within the pre-miRNA sequences (stem loops) and then aligns the reads to the pre-miRNA sequences. If the location of a read overlaps significantly with the mature region (you can adjust the window), the read is considered a valid miRNA.
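The overlap test can be sketched as follows. This is not miRdeep's actual code, only an illustration of the idea; the function name, the `window` of 2 bases and the `min_fraction` of 0.9 are hypothetical. Coordinates are 1-based, inclusive positions on the pre-miRNA.

```python
def overlaps_mature(read_start, read_end, mature_start, mature_end,
                    window=2, min_fraction=0.9):
    """Return True if enough of the read lies within the mature region.

    The mature region is extended by `window` bases on each side before
    the fraction of the read that falls inside it is computed.
    """
    lo = max(read_start, mature_start - window)
    hi = min(read_end, mature_end + window)
    covered = max(0, hi - lo + 1)          # bases of the read inside the region
    read_len = read_end - read_start + 1
    return covered / read_len >= min_fraction

if __name__ == "__main__":
    # Mature sequence annotated at positions 6-27 of a hypothetical stem loop.
    print(overlaps_mature(4, 25, 6, 27))   # shifted by 2 bases, inside the window
    print(overlaps_mature(40, 61, 6, 27))  # maps to the other arm
```

Reads that map elsewhere on the stem loop (for example the loop or the opposite arm) are rejected by this check even though they align to the pre-miRNA.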