[Galaxy-user] inconsistent sequences


Melissa Wilson
Can anyone help figure out the following problem?

I am interested in the exon positions and sequences of EIF1AX.  I used Galaxy to
find the coding exon positions through UCSC (RefSeq gene NM_001412) using the
hg18 (and subsequently hg17 and hg16) sequences.  Then I used Fetch Sequences
and Alignments to Extract genomic DNA corresponding to query coordinates.

The tools appear to work: 435 nt are extracted for the EIF1AX CDS. However, the
composition of those 435 nt differs from the 435 nt I get from the NCBI
NM_001412 CDS, and from what I get when I extract the EIF1AX CDS directly
through UCSC.

The result is the same for hg17, and almost the same for hg16 except for a few
additional exons (I am not sure why those appear).

Perhaps I'm running my search incorrectly, but would someone please check this
out?  I think a fresh view might clarify things.

Best,
Melissa




Re: inconsistent sequences

Yi Zhang
Dear Melissa,

Sorry for the late reply. I repeated your steps in Galaxy and got the
following genomic DNA for RefSeq gene NM_001412 (hg17):

>hg17_chrX_19906080_19906086_-
ATCTAA
>hg17_chrX_19908290_19908382_-
CTAAAATCAATGAAACTGATACATTTGGTCCTGGAGATGATGATGAAATT
CAGTTTGATGACATTGGAGATGATGATGAAGATATTGATGAC
>hg17_chrX_19909956_19910038_-
GATAACAAAGCTGATGTAATTTTAAAATACAATGCAGACGAAGCTAGAAG
TCTGAAGGCATACGGCGAGCTTCCAGAGCATG
>hg17_chrX_19911731_19911782_-
GTTTGGATAAATACCTCGGACATTATTTTGGTTGGTCTCCGAGACTACCA
G
>hg17_chrX_19913512_19913616_-
AGTATGCTCAGGTAATCAAAATGTTGGGAAATGGACGGCTAGAAGCAATG
TGTTTCGATGGTGTAAAGAGGTTATGTCACATCAGAGGAAAATTGAGAAA
AAAG
>hg17_chrX_19916313_19916397_-
GTAAAGGAGGTAAAAACAGACGCAGGGGTAAGAATGAGAATGAATCTGAA
AAAAGAGAACTGGTATTCAAAGAGGATGGTCAGG
>hg17_chrX_19919399_19919415_-
ATGCCCAAGAATAAAG
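
One thing worth checking when comparing this output against the NCBI CDS: the gene is on the minus strand, and the records above are listed in ascending genomic order, so the record containing the start codon (ATG...) comes last and the one ending in the stop codon (...TAA) comes first. The sketch below is only an illustration, not part of Galaxy: it assumes each record is already reverse-complemented (which the ATG and TAA ends suggest) and joins the records in transcript order. The helper names `parse_fasta` and `assemble_cds` are made up for this example.

```python
# FASTA records copied from the Galaxy output above (hg17, minus strand).
FASTA = """\
>hg17_chrX_19906080_19906086_-
ATCTAA
>hg17_chrX_19908290_19908382_-
CTAAAATCAATGAAACTGATACATTTGGTCCTGGAGATGATGATGAAATT
CAGTTTGATGACATTGGAGATGATGATGAAGATATTGATGAC
>hg17_chrX_19909956_19910038_-
GATAACAAAGCTGATGTAATTTTAAAATACAATGCAGACGAAGCTAGAAG
TCTGAAGGCATACGGCGAGCTTCCAGAGCATG
>hg17_chrX_19911731_19911782_-
GTTTGGATAAATACCTCGGACATTATTTTGGTTGGTCTCCGAGACTACCA
G
>hg17_chrX_19913512_19913616_-
AGTATGCTCAGGTAATCAAAATGTTGGGAAATGGACGGCTAGAAGCAATG
TGTTTCGATGGTGTAAAGAGGTTATGTCACATCAGAGGAAAATTGAGAAA
AAAG
>hg17_chrX_19916313_19916397_-
GTAAAGGAGGTAAAAACAGACGCAGGGGTAAGAATGAGAATGAATCTGAA
AAAAGAGAACTGGTATTCAAAGAGGATGGTCAGG
>hg17_chrX_19919399_19919415_-
ATGCCCAAGAATAAAG
"""


def parse_fasta(text):
    """Return a list of (start, end, strand, sequence) tuples.

    Headers are assumed to look like '>build_chrom_start_end_strand'.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Split from the right so the build/chromosome part stays intact.
            _prefix, start, end, strand = line[1:].rsplit("_", 3)
            records.append([int(start), int(end), strand, []])
        else:
            records[-1][3].append(line)
    return [(s, e, strand, "".join(parts)) for s, e, strand, parts in records]


def assemble_cds(records):
    """Join exon records in transcript order.

    Assumption: each sequence is already given on the transcript strand,
    so minus-strand exons only need to be ordered by descending start.
    """
    strands = {strand for _s, _e, strand, _seq in records}
    if len(strands) != 1:
        raise ValueError("records are on mixed strands")
    on_minus = strands == {"-"}
    ordered = sorted(records, key=lambda r: r[0], reverse=on_minus)
    return "".join(seq for _s, _e, _strand, seq in ordered)


cds = assemble_cds(parse_fasta(FASTA))
print(len(cds), cds[:3], cds[-3:])
```

If the records are instead concatenated in the order listed, the total length is still 435 nt but the sequence will not match a CDS downloaded as a single record, which may be one explanation for the mismatch.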

The result is the same as UCSC's. Here is what I did:
1. Use Galaxy to find the coding exons through UCSC. The result is:
chrX 19906080 19906086 NM_001412_cds_0_0_chrX_19906081_r 0  -
chrX 19908290 19908382 NM_001412_cds_1_0_chrX_19908291_r 0  -
chrX 19909956 19910038 NM_001412_cds_2_0_chrX_19909957_r 0  -
chrX 19911731 19911782 NM_001412_cds_3_0_chrX_19911732_r 0  -
chrX 19913512 19913616 NM_001412_cds_4_0_chrX_19913513_r 0  -
chrX 19916313 19916397 NM_001412_cds_5_0_chrX_19916314_r 0  -
chrX 19919399 19919415 NM_001412_cds_6_0_chrX_19919400_r 0  -

2. Use the "Extract genomic DNA" tool under "Fetch Sequences and
Alignments" to get the DNA sequences.
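
As a quick sanity check on step 1, the interval lengths can be summed to confirm they account for the full 435 nt. This is a minimal sketch, assuming the coordinates follow the usual UCSC BED convention (0-based start, end not included), so each length is simply end minus start; that reading is consistent with the 1-based start embedded in each name field (for example 19906081 for a start of 19906080). The function name `bed_lengths` is hypothetical.

```python
# Coding-exon intervals copied from step 1 above (hg17).
BED = """\
chrX 19906080 19906086 NM_001412_cds_0_0_chrX_19906081_r 0  -
chrX 19908290 19908382 NM_001412_cds_1_0_chrX_19908291_r 0  -
chrX 19909956 19910038 NM_001412_cds_2_0_chrX_19909957_r 0  -
chrX 19911731 19911782 NM_001412_cds_3_0_chrX_19911732_r 0  -
chrX 19913512 19913616 NM_001412_cds_4_0_chrX_19913513_r 0  -
chrX 19916313 19916397 NM_001412_cds_5_0_chrX_19916314_r 0  -
chrX 19919399 19919415 NM_001412_cds_6_0_chrX_19919400_r 0  -
"""


def bed_lengths(text):
    """Return the length of each interval, assuming 0-based half-open coordinates."""
    lengths = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)
    return lengths


lengths = bed_lengths(BED)
total = sum(lengths)
print(lengths, total)
```

A total that is a multiple of three and matches the expected CDS length suggests the coordinates themselves are fine, and that any mismatch is more likely to come from how the extracted pieces are ordered or joined.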

Thank you for using Galaxy in your research. If you have
other questions, please let us know.

Sincerely,
Yi Zhang

In reply to Melissa Wilson's message of Fri, Jun 30, 2006 at 05:35:03PM -0400.