I ran an Illumina RNA-seq experiment comparing cells from wild-type animals with cells from animals that carry a deletion in a splicing factor. I now have my data in FASTQ format and need to analyze it to figure out which transcripts are changed and how (see below). The problem is that I have no idea how to go about it.
Could someone be so kind as to write down a basic outline of the analysis to follow?
My understanding from what I've been reading online is that once you have FASTQ files, you:
1. Use FASTQ Groomer to convert to Sanger format
2. Evaluate the quality with FASTQ Summary Stats (and get boxplot of data)
3. Trim reads if their quality doesn't look good
What is considered "ok" quality? A score above 20? Is that the mean score or the absolute score? Can I trim only those reads that have a low quality score, and if so, how?
4. Map the reads
What mapping software do you recommend? BWA, Bowtie, or TopHat? And what comes next?
Everything after the trimming step is very fuzzy to me.
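To make the quality questions in steps 1 to 3 concrete, here is a minimal Python sketch of what I understand the encoding conversion and quality trimming to mean. It is only my own illustration, not what FASTQ Groomer or any real trimming tool actually does. The function names, the threshold of 20, and the example read are all made up, and I am assuming that Sanger format means an ASCII offset of 33 while older Illumina output uses an offset of 64.

```python
SANGER_OFFSET = 33    # assumed: Sanger / Illumina 1.8+ quality encoding
ILLUMINA_OFFSET = 64  # assumed: older Illumina 1.3-1.7 quality encoding


def phred_scores(quality, offset=SANGER_OFFSET):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def to_sanger(quality, offset=ILLUMINA_OFFSET):
    """Re-encode a quality string from the given offset to Sanger (Phred+33)."""
    return "".join(chr(ord(ch) - offset + SANGER_OFFSET) for ch in quality)


def mean_quality(quality, offset=SANGER_OFFSET):
    """Mean Phred score over the whole read (0.0 for an empty read)."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores) if scores else 0.0


def trim_3prime(sequence, quality, threshold=20, offset=SANGER_OFFSET):
    """Drop bases from the 3' end until one has a score >= threshold."""
    scores = phred_scores(quality, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return sequence[:end], quality[:end]


# Hypothetical read: five bases at Q40 ('I') followed by three at Q2 ('#').
seq, qual = "ACGTACGT", "IIIII###"
print(mean_quality(qual))      # 25.75
print(trim_3prime(seq, qual))  # ('ACGTA', 'IIIII')
```

In this made-up read the mean score is 25.75, so the read passes a mean cutoff of 20 even though its last three bases are Q2. That is exactly why I am unsure whether "above 20" refers to the mean or to each base.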
The questions I am interested in are:
- Which transcripts are upregulated or downregulated in mutant vs. control? (I have 3 replicates of each.)
- Are there introns that are retained in the mutant (but not, or less so, in the control)?
- Are there exons that are excluded in the mutant? (Basically, I want to look at patterns of alternative splicing.)
Sorry for the very long message, but I have no idea who else to ask.