Peaks2genes: modifications #1030
Thanks @mblue9!
Our goal is to compare the 2 region files (the genes file and the peak file from the publication)
to know which peaks are related to which genes. If you only want to know which peaks are located **inside** genes you
can skip the next step. Otherwise, it might be reasonable to include the promoter region into the comparison, e.g. because
you want to include Transcriptions factors in ChIP-seq experiments.
Should we put the term "gene body" in somewhere?
ok I've added that in now, see what you think
Our goal is to compare the 2 region files (the genes file and the peak file from the publication)
to know which peaks are related to which genes. If you only want to know which peaks are located **inside** genes you
can skip the next step. Otherwise, it might be reasonable to include the **promoter** region of the genes in the comparison, e.g. because
you want to include transcription factors in ChIP-seq experiments. There is no strict definition for promoter region but 2kb upstream of the Transcription Start Site (start of region) is commonly used. We'll use the **Get Flanks** tool to get regions 2kb upstream of the start of the gene to 10kb downstream of the start (12kb in length). To do this we tell the Get Flanks tool we want regions upstream of the start, with an offset of 10kb, that are 12kb in length, as shown in the diagram below.
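For anyone following the thread, the arithmetic described in that paragraph can be sketched in a few lines of Python. This is only an illustration of the coordinates, not the Get Flanks tool's actual code: the `Region` type and the `upstream_flank_of_start` function are hypothetical names, the example gene is made up, and the minus-strand branch assumes that a positive offset shifts the flank in the direction of transcription.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED-style)
    end: int     # exclusive
    strand: str  # "+" or "-"


def upstream_flank_of_start(gene: Region, offset: int, length: int) -> Region:
    """Flank upstream of the gene start, shifted by `offset` toward the gene body.

    Hypothetical helper: it mirrors the settings named in the text
    (upstream of start, offset 10kb, length 12kb), nothing more.
    """
    if gene.strand == "-":
        # Assumption: on the minus strand the TSS is the gene's end coordinate
        # and transcription runs toward smaller coordinates.
        flank_start = max(0, gene.end - offset)
        flank_end = gene.end - offset + length
    else:
        flank_end = gene.start + offset
        flank_start = max(0, flank_end - length)
    return Region(gene.chrom, flank_start, flank_end, gene.strand)


# Made-up gene on the plus strand with its TSS at position 50,000.
gene = Region("chr1", 50_000, 80_000, "+")
flank = upstream_flank_of_start(gene, offset=10_000, length=12_000)
print(flank)  # covers 2kb upstream of the TSS to 10kb downstream of it
```

With these settings the flank ends 10kb past the start and is 12kb long, so it begins 2kb before the start, which matches the "2kb upstream to 10kb downstream" description.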
Use TSS as an abbreviation?
have changed it to TSS
Just wondering, what do you think about having "Introduction" in the name at the top of the tutorial, e.g. "Galaxy Introduction: From peaks to genes" instead of just "From peaks to genes", to emphasise that this is material to help introduce people to Galaxy?
But the entire section/topic is called
Maybe. If you look below, although the objectives show it's an intro, to me the name makes it look like its focus is on annotating peak regions (and it's not the most efficient way to do that currently in Galaxy, e.g. in this case you could just use ChIPseeker and not even bother getting data from UCSC). Having it in the name could help emphasise that this is an example to introduce people to Galaxy, but I don't have a strong opinion on this.
Maybe the topic name could be written above/before the tutorial name?
That sounds like a good idea to me. But I'm happy to have that in the future and not wait for it in this PR.
@@ -349,35 +358,27 @@ you want to include Transcriptions factors in ChIP-seq experiments.
> 3. Rename your dataset to reflect your findings
{: .hands_on}

You might have noticed that the UCSC file is in `BED` format and has a database associated to it. That's what we want for our peak file as well
We do this to demonstrate the conversion feature, with respect to "change filetype". As a bonus, the presenter can also demonstrate implicit conversion if desired.
Ok I've added it back in. But imho if it's not necessary for users to do that conversion here then it's just adding confusion and I'd demonstrate the conversion somewhere else where it is necessary.
I disagree here. The Get Flanks tool is also not needed, as a similar result can be achieved by UCSC directly. It's important that people know the interface, and that is the purpose of this tutorial: it's an introduction to Galaxy.
Also please note that there are different intersect tools in Galaxy, and not all of them can cope with interval files and convert them implicitly. It's therefore good to know that there are implicit and explicit conversions. Maybe this should be made clearer in the text; I stress this a lot during my session.
Thanks for adding it back.
The Get Flanks tool is also not needed as a similar result can be achieved by UCSC directly.
Well, while we're at it, UCSC is not needed here either 😄 But yes, the text could be made clearer I think. What do you think about what I've added here now:
You might have noticed that the UCSC file is in `BED` format and has a database associated to it. That's what we want for our peak file as well. The **Intersect** tool we will use can automatically convert interval files to BED format but we'll convert our interval file explicitly here to show how this can be achieved with Galaxy.
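To make the explicit conversion a bit more concrete, here is a minimal Python sketch of what turning an interval line into a three-column BED line amounts to. It is an illustration under assumptions, not Galaxy's converter: `interval_to_bed3` is a hypothetical name, the column positions and the example line are made up, and the coordinates are assumed to already follow the BED convention.

```python
def interval_to_bed3(line: str, chrom_col: int, start_col: int, end_col: int) -> str:
    """Reorder one tab-separated interval line into BED3 (chrom, start, end).

    Hypothetical helper: an interval file may keep these columns in any
    position, whereas BED expects them as the first three columns.
    """
    fields = line.rstrip("\n").split("\t")
    return "\t".join([fields[chrom_col], fields[start_col], fields[end_col]])


# Made-up interval line where the peak name comes first.
interval_line = "peak_1\tchr1\t3660676\t3661050\n"
bed_line = interval_to_bed3(interval_line, chrom_col=1, start_col=2, end_col=3)
print(bed_line)
```

An implicit conversion would do the equivalent behind the scenes when a tool needs BED input; doing it explicitly leaves a BED dataset in the history that can be inspected.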
I like the text. Thanks.
If anyone could please merge this, it would be great to have for the workshop tomorrow. If it needs more changes, please let me know.
Thanks a lot @bgruening!!
In response to this feedback comment on the peaks to genes tutorial:
In this PR I've tried to clarify in the text what the Get Flanks tool is doing, and I've added a small diagram (made with PowerPoint 😳 if there's a better way to make one, please let me know).
I've made a few changes here: