Help with Summary Statistics


Help with Summary Statistics

D. A. Cowart

Hello,


I am attempting to use Galaxy to calculate the mean sequence read length and identify the range of read lengths for my 454 data. The data has already been organized and sorted by species. The format of the data is as follows:


>HD4AU5D01BHBCQCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
>HD4AU5D01A093MCTCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT

etc...for each species

I have attempted to use the "Summary Statistics" tool, but it appears to accept only numerical data, not sequence data. Is a tool for this task available in Galaxy?

Thank you,


Dominique Cowart
User name: dac330



___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

Re: Help with Summary Statistics

Peter Cock
On Thu, Aug 2, 2012 at 7:50 PM, D. A. Cowart <[hidden email]> wrote:
> Hello,
>
>
> I am attempting to use Galaxy to calculate the mean sequence read length and
> identify the range of read lengths for my 454 data. The data has already
> been organized and sorted by species. The format of the data is as follows:
>

That was probably FASTA format (but mangled in the email).

> I have attempted to use the "Summary Statistics" button, however it appears
> to only be for numerical data and not sequence data. Is this tool/task
> available via Galaxy?

Use the "Compute sequence length" tool to compute the read lengths,
and then you should be able to compute some statistics about the lengths.

Peter
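
For readers who want to check the numbers outside Galaxy, here is a minimal sketch of the first step (read ID plus sequence length per FASTA record). It is not Galaxy's implementation of "Compute sequence length"; the function name `fasta_lengths` is made up for illustration, and the input is assumed to be plain FASTA with the sequence possibly wrapped over several lines. The two example reads are the ones from the later post in this thread.

```python
from typing import Iterable, Iterator, Tuple


def fasta_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (read_id, sequence_length) for each record in FASTA text."""
    read_id = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if read_id is not None:
                yield read_id, length
            # The identifier is the first word after ">".
            parts = line[1:].split()
            read_id = parts[0] if parts else ""
            length = 0
        elif read_id is not None:
            # A sequence may be wrapped over several lines; add them up.
            length += len(line)
    if read_id is not None:
        yield read_id, length


example = """>HD4AU5D01BHBCQC
TCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
>HD4AU5D01A093MC
TCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT
"""

# Print one "id<TAB>length" line per read.
for read_id, length in fasta_lengths(example.splitlines()):
    print(f"{read_id}\t{length}")
```

The resulting two-column table (ID, length) is the kind of numerical input a statistics step can then work on.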

Summary Statistics

Dominique Cowart
In reply to this post by D. A. Cowart

Hello,


I am attempting to use Galaxy to calculate the mean sequence read length and identify the range of read lengths for my 454 data. The data has already been divided into columns:

>HD4AU5D01BHBCQC    TCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
>HD4AU5D01A093MC    TCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT


I have attempted to use the "Summary Statistics" tool, but it appears to accept only numerical data, not sequence data. Is a tool for this task available in Galaxy?

Thank you in advance,


Dominique Cowart





___________________________________________________________
The Galaxy User List is being replaced by the Galaxy Biostar
User Support Forum at https://biostar.usegalaxy.org/

Posts to this list will be disabled in May 2014.  In the
meantime, you are encouraged to post all new questions to
Galaxy Biostar.

For discussion of local Galaxy instances and the Galaxy
source code, please use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/

Re: Summary Statistics

Björn Grüning-2
Hi,

"Summary Statistics" is fine, but you first need to run the 'Compute
sequence length' tool.

Ciao,
Bjoern

On 23.05.2014 at 13:29, Dominique Cowart wrote:

> Hello,
>
> I am attempting to use Galaxy to calculate the mean sequence read
> length and identify the range of read lengths for my 454 data. The
> data has already been divided into columns:
>
> >HD4AU5D01BHBCQC TCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
> >HD4AU5D01A093MC TCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT
>
> I have attempted to use the "Summary Statistics" button, however it
> appears to only be for numerical data and not sequence data. Is this
> tool/task available via Galaxy?
>
> Thank you in advance,
>
> Dominique Cowart

Re: Summary Statistics

Jen Hillman-Jackson
In reply to this post by Dominique Cowart
Hi Dominique,

There are a few ways to do this.

If you already have the data in FASTA format in your history, skip the FASTA-to-tabular conversion and use the FASTA dataset directly with the tool "FASTA manipulation -> Compute sequence length". Then run the statistics tool on the output.

Or, if this is the format you have to start with (tabular), the tool "Text Manipulation -> Compute" can be used with the option "length(c2)" to generate the length of column 2. Adjust the "c2" portion as needed if this is not your complete file or if you had to do extra manipulations to isolate these columns (potentially skip those steps and use the earlier file).
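
As a rough illustration of what the tabular route computes, the sketch below appends the length of column 2 to every row of a tab-separated table. It is plain Python, not the Galaxy "Compute" tool itself; the helper name `add_length_column` and the assumption that the two columns are separated by a single tab are illustrative only.

```python
import csv
import io


def add_length_column(rows, column=2):
    """Append the string length of the given 1-based column to each row."""
    for row in rows:
        if len(row) < column:
            continue  # skip blank or short rows that lack the column
        yield row + [str(len(row[column - 1]))]


# The two reads from the post, laid out as ID <TAB> sequence.
table = (
    ">HD4AU5D01BHBCQC\tTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC\n"
    ">HD4AU5D01A093MC\tTCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT\n"
)

reader = csv.reader(io.StringIO(table), delimiter="\t")
for row in add_length_column(reader, column=2):
    print("\t".join(row))
```

Changing `column` plays the same role as adjusting the "c2" portion when the sequence sits in a different column.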

Thanks!

Jen
Galaxy team

ps. This mailing list has moved to Galaxy Biostar and will be closing soon. Please join us there! Here is how to get set up: https://wiki.galaxyproject.org/Support/Biostar

-- 
Jennifer Hillman-Jackson
http://galaxyproject.org


Re: Summary Statistics

graham etherington (TSL)
In reply to this post by Dominique Cowart
Hi Dominique,
I’d take the original FASTA file and feed it into the 'Fasta Manipulation > Compute Sequence Length' tool.
Then, using the output, run the 'Statistics > Summary Statistics for any numerical column' tool on c2.
That will give you all the info you’re after.
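
The statistics step can be sketched in a few lines of Python as well. This assumes the length table is tab-separated with the read length in column 2 (c2) and no header line; the function names and the three example rows are hypothetical, and the set of statistics reported here is a choice made for the sketch, not a copy of the Galaxy tool's output.

```python
import statistics


def column_values(lines, column=2):
    """Parse the given 1-based column of tab-separated lines as integers."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= column and fields[column - 1].strip():
            yield int(fields[column - 1])


def summarize(values):
    """Return basic summary statistics for a collection of numbers."""
    values = list(values)
    if not values:
        raise ValueError("no values to summarize")
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Hypothetical length table: read identifier, then read length.
lengths_table = ["read_a\t42", "read_b\t36", "read_c\t51"]

stats = summarize(column_values(lengths_table))
for name, value in stats.items():
    print(f"{name}\t{value}")
```

The mean answers the "mean read length" part of the question, and min and max together give the range.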
Cheers,
Graham

Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park, 
Norwich NR4 7UH.
UK
Tel: +44 (0)1603 450601
Twitter: @bioinformatiks
