docs/paper.md (+19 -16)
@@ -25,9 +25,9 @@ bibliography: paper.bib
25
25
26
26
The Gene Ontology (GO) [@Ashburner2000; @GO2023] is a structured vocabulary that describes gene products in the context of their associated functions. The ontology takes the form of a directed graph, where each node defines a term, and each edge represents a hierarchical relationship between the terms (the words of the vocabulary).
27
27
28
-
For example, in the GO data, `GO:0090630` defines *activation of GTPase activity* and is a child of `GO:0043547`, which is a *positive regulation of GTPase activity* which in turn is a child of `GO:0051345` representing a*positive regulation of hydrolase activity*.
28
+
For example, in the GO data, `GO:0090630` defines *activation of GTPase activity* and is a child of `GO:0043547`, defined as *positive regulation of GTPase activity*, which in turn is a child of `GO:0051345`, representing *positive regulation of hydrolase activity*.
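The parent-child chain in this example can be sketched as a small lookup table. This is only an illustration of the hierarchy described above, not GeneScape's internal representation; the `PARENTS` mapping and the `lineage` helper are hypothetical names, and real GO terms may have several parents.

```python
# Child -> parents mapping for the three terms named in the text.
# The structure and names here are assumptions made for illustration.
PARENTS = {
    "GO:0090630": ["GO:0043547"],  # activation of GTPase activity
    "GO:0043547": ["GO:0051345"],  # positive regulation of GTPase activity
    "GO:0051345": [],              # positive regulation of hydrolase activity
}


def lineage(term, parents):
    """Return the term followed by its ancestors, nearest first."""
    seen = [term]
    queue = [term]
    while queue:
        current = queue.pop(0)
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


print(lineage("GO:0090630", PARENTS))
```

Walking upward from a term in this way is what places a single annotation into its broader functional context.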
29
29
30
-
Gene association files (GAF) are text files used to annotate an organism's gene products with Gene Ontology terms, associating a function to a gene product. For example, a GAF file connects a gene product label, such as `ZC3H11B`, with multiple GO terms, such as `GO:0046872` or `GO:0016973`. The complete human genome GAF representation contains 288,575 associations of 19,606 gene symbols over 18,680 GO terms.
30
+
Gene association files (GAF) are text files used to annotate an organism's gene products with Gene Ontology terms, associating a function to a gene product. For example, a GAF file connects a gene product label, such as `ZC3H11B`, with multiple GO terms, such as `GO:0046872` or `GO:0016973`. The complete human genome GAF representation contains 288,575 associations of 19,606 gene symbols with over 18,680 GO terms.
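As a rough sketch of how such associations can be read, the snippet below collects GO terms per gene symbol from GAF-style text. It relies on the GAF layout in which lines starting with `!` are comments, the third tab-separated column is the gene product symbol, and the fifth is the GO ID. The sample rows are abbreviated placeholders rather than real annotation records, and `gene_to_terms` is a hypothetical helper, not part of GeneScape.

```python
import io
from collections import defaultdict

# Abbreviated, made-up rows: only the first five GAF columns are present,
# and the database and identifier fields are placeholders.
SAMPLE_GAF = (
    "!gaf-version: 2.2\n"
    "DB\tPLACEHOLDER_ID\tZC3H11B\tQUALIFIER\tGO:0046872\n"
    "DB\tPLACEHOLDER_ID\tZC3H11B\tQUALIFIER\tGO:0016973\n"
)


def gene_to_terms(handle):
    """Map each gene symbol to the set of GO IDs annotated to it."""
    mapping = defaultdict(set)
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        symbol, go_id = fields[2], fields[4]
        mapping[symbol].add(go_id)
    return mapping


associations = gene_to_terms(io.StringIO(SAMPLE_GAF))
print(sorted(associations["ZC3H11B"]))
```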
31
31
32
32
The [Gene Ontology Consortium][GO] maintains GAF files for various organisms. Typical genomic analysis protocols generate gene lists that must be placed in a functional context.
33
33
@@ -39,12 +39,14 @@ The most annotated gene in the human genome, `HTT1`, currently has 1098 annotati
39
39
40
40
Web-based tools designed to visualize and filter gene ontology data include `AmiGO` [@AmiGO] and `QuickGO` [@QuickGO]. Command line tools like `goatools` [@goatools] support GO term lineage visualization. R packages like `topGO` [@topGO] implement GO structure visualizations of enriched GO terms. We are unaware of locally installable software that specifically allows for interactive filtering and visualization of gene ontology data derived from gene lists.
41
41
42
-
GeneScape is a Python package that allows users to visualize a list of gene products in terms of the functional context represented by the Gene Ontology. GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software that does not require Python to be installed on the user's computer, thus making it accessible to a wide range of users.
42
+
GeneScape is a Python package that allows users to visualize a list of gene products in terms of the functional context represented by the Gene Ontology. GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software via the [Shiny platform][shiny], thus making it accessible to a wide range of users.
43
+
44
+
[shiny]: https://shiny.posit.co/
43
45
44
46
GeneScape is distributed with prebuilt databases for human and mouse genomes. For other organisms, users need to download the GAF files from the Gene Ontology website and run the command:
The `build` command will create a database that can then be used for all subsequent analyses with the software. Users should consult the [GeneScape documentation][docs] for up-to-date details.
@@ -64,17 +66,18 @@ GRTP1
64
66
GeneScape first transforms the above gene input list into a GO term list, where additional information is added to each term:
65
67
66
68
```
67
-
gid,root,count,function,source,size,label
68
-
GO:0090630,BP,1,activation of GTPase activity,GRTP1,4,(1/4)
In the next step, GeneScape visualizes the GO terms as the graph structure that represents the functional context of the genes relative to the larger Gene Ontology.
80
+
In the next step, GeneScape draws the GO terms as a graph using the NetworkX package [@networkx], helping users visualize the functional context of the genes relative to the larger Gene Ontology.
78
81
79
82
{height="216pt"}
80
83
@@ -83,9 +86,9 @@ Various colors are used to provide additional context to the nodes in the graph;
83
86
Since the resulting graphs may also be large, with thousands of nodes, the main interface provides input widgets that allow users to interactively
84
87
reduce the subgraph to nodes for which:
85
88
86
-
1. The function definitions match certain patterns
87
-
2. A minimum number of genes share a function,
88
-
3. Nodes belong to a specific GO subtree: Biological Process (BP), Molecular Function (MF), Cellular Component (CC)
89
+
1. The function definitions match certain patterns.
90
+
2. A minimum number of genes share a function.
91
+
3. Nodes belong to a specific GO subtree: Biological Process (BP), Molecular Function (MF), Cellular Component (CC).
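The three criteria above can be sketched as a filter over term records. This is a minimal illustration and not GeneScape's implementation: the `filter_terms` helper is hypothetical, the field names follow the `gid,root,count,function,...` header shown earlier, and the two `GO:X...` rows are invented purely to exercise the filter.

```python
import re

# The first row comes from the example in the text; the other two are
# hypothetical rows added only for this sketch.
ROWS = [
    {"gid": "GO:0090630", "root": "BP", "count": 1,
     "function": "activation of GTPase activity"},
    {"gid": "GO:X000001", "root": "MF", "count": 3,
     "function": "hypothetical binding activity"},
    {"gid": "GO:X000002", "root": "BP", "count": 2,
     "function": "hypothetical GTPase regulation"},
]


def filter_terms(rows, pattern=None, min_count=1, root=None):
    """Keep rows that match the pattern, gene count, and GO subtree."""
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    kept = []
    for row in rows:
        if regex and not regex.search(row["function"]):
            continue  # function definition does not match the pattern
        if row["count"] < min_count:
            continue  # too few genes share this function
        if root and row["root"] != root:
            continue  # term belongs to a different subtree
        kept.append(row)
    return kept


selected = filter_terms(ROWS, pattern="gtpase", min_count=2, root="BP")
print([row["gid"] for row in selected])
```

Each criterion is optional in this sketch, so the filters can be combined or applied one at a time.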
89
92
90
93
As an example, take the input gene list of just four genes: