Commit 2072431

committed
updated the paper
1 parent f08c5f8 commit 2072431

5 files changed: +39 -22 lines changed

Makefile

+5 -6
@@ -41,9 +41,6 @@ tag: test
 exe: test
 	python src/genescape/exe.py --build
 
-shiny: test
-	pip install rsconnect
-	rsconnect deploy shiny src/genescape --name biostar --title GeneScape
 
 # Generate images for the documentation
 docimg:
@@ -63,8 +60,6 @@ fix:
 push:
 	git commit -am 'saving work' && git push
 
-
-
 build: clean
 	rm -rf build dist
 	hatch build
@@ -106,6 +101,10 @@ clean:
 	rm -rf build dist ${IDX_FILE}
 
 env:
-	conda create -n genescape python=3.11 shiny rsconnect graphviz
+	micromamba create -n shiny python=3.11 rsconnect-python graphviz make -y
+
+shiny:
+	cp -f src/genescape/web.py src/app/app.py
+	rsconnect deploy shiny src/app --name biostar --title GeneScape
 
 .PHONY: test lint fix push clean build publish obo docimg web

README.md

+4
@@ -8,6 +8,10 @@
 
 **GeneScape** is a Python-based [Shiny][pyshiny] application that be run both at the command line and also via a graphical user interface.
 
+There is a public version of the software at:
+
+* https://biostar.shinyapps.io/genescape/
+
 [pyshiny]: https://shiny.posit.co/py/
 
 ## Quickstart

docs/paper.bib

+10
@@ -92,3 +92,13 @@ @ebi.ac.uk;
 eprint = {https://academic.oup.com/bioinformatics/article-pdf/25/22/3045/48997998/bioinformatics\_25\_22\_3045.pdf},
 }
 
+@InProceedings{networkx,
+author = {Aric A. Hagberg and Daniel A. Schult and Pieter J. Swart},
+title = {Exploring Network Structure, Dynamics, and Function using NetworkX},
+booktitle = {Proceedings of the 7th Python in Science Conference},
+pages = {11 - 15},
+address = {Pasadena, CA USA},
+year = {2008},
+editor = {Ga\"el Varoquaux and Travis Vaught and Jarrod Millman},
+}
+

docs/paper.md

+19 -16
@@ -25,9 +25,9 @@ bibliography: paper.bib
 
 The Gene Ontology (GO) [@Ashburner2000; @GO2023] is a structured vocabulary that describes gene products in the context of their associated functions. The ontology takes the form of a directed graph, where each node defines a term, and each edge represents a hierarchical relationship between the terms (the words of the vocabulary).
 
-For example, in the GO data, `GO:0090630` defines *activation of GTPase activity* and is a child of `GO:0043547`, which is a *positive regulation of GTPase activity* which in turn is a child of `GO:0051345` representing a *positive regulation of hydrolase activity*.
+For example, in the GO data, `GO:0090630` defines *activation of GTPase activity* and is a child of `GO:0043547`, defined as *positive regulation of GTPase activity* which in turn is a child of `GO:0051345` representing *positive regulation of hydrolase activity*.
 
-Gene association files (GAF) are text files used to annotate an organism's gene products with Gene Ontology terms, associating a function to a gene product. For example, a GAF file connects a gene product label, such as `ZC3H11B`, with multiple GO terms, such as `GO:0046872` or `GO:0016973`. The complete human genome GAF representation contains 288,575 associations of 19,606 gene symbols over 18,680 GO terms.
+Gene association files (GAF) are text files used to annotate an organism's gene products with Gene Ontology terms, associating a function to a gene product. For example, a GAF file connects a gene product label, such as `ZC3H11B`, with multiple GO terms, such as `GO:0046872` or `GO:0016973`. The complete human genome GAF representation contains 288,575 associations of 19,606 gene symbols with over 18,680 GO terms.
 
 The [Gene Ontology Consortium][GO] maintains GAF files for various organisms. Typical genomic analysis protocols generate gene lists that must be placed in a functional context.
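
The child-to-parent chain described in this hunk can be sketched with a plain dictionary. This is only an illustration of the relationship, not GeneScape's implementation: the `PARENT` and `NAMES` tables and the `lineage` helper are hypothetical names, and the sketch assumes one parent per term, whereas real GO terms may have several.

```python
# Hypothetical single-parent table holding only the three terms named in the text.
PARENT = {
    "GO:0090630": "GO:0043547",
    "GO:0043547": "GO:0051345",
}

NAMES = {
    "GO:0090630": "activation of GTPase activity",
    "GO:0043547": "positive regulation of GTPase activity",
    "GO:0051345": "positive regulation of hydrolase activity",
}


def lineage(term):
    """Return the term followed by its ancestors, nearest ancestor first."""
    path = [term]
    # Follow parent links until a term with no recorded parent is reached.
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


if __name__ == "__main__":
    for goid in lineage("GO:0090630"):
        print(goid, NAMES[goid])
```

A full ontology would need a graph structure that allows several parents per node; the dictionary is used here only because the example in the text is a simple chain.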

@@ -39,12 +39,14 @@ The most annotated gene in the human genome, `HTT1`, currently has 1098 annotati
 
 Web-based tools designed to visualize and filter gene ontology data include `AmiGO` [@AmiGO] and `QuickGO` [@QuickGO]. Command line tools like `goatools` [@goatools] support GO term lineage visualization. R packages like `topGO` [@topGO] implement GO structure visualizations of enriched GO terms. We are unaware of locally installable software that specifically allows for interactive filtering and visualization of gene ontology derived on gene lists.
 
-GeneScape is a Python package that allows users to visualize a list of gene products in terms of the functional context represented by the Gene Ontology. GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software that does not require Python to be installed on the user's computer, thus making it accessible to a wide range of users.
+GeneScape is a Python package that allows users to visualize a list of gene products in terms of the functional context represented by the Gene Ontology. GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software via the [Shiny platform][shiny], thus making it accessible to a wide range of users.
+
+[shiny]: https://shiny.posit.co/
 
 GeneScape is distributed with prebuilt databases for human and mouse genomes. For other organisms, users need to download the GAF files from the Gene Ontology website and run the command:
 
 ```
-genescape build --gaf mydata.gaf
+genescape build --gaf mydata.gaf.gz --index mydata.index.gz
 ```
 
 The `build` command will create a database that can then be used for all subsequent analyses with the software. Users should consult the [GeneScape documentation][docs] for up-to-date details.
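
To make the input of the `build` step more concrete, here is a minimal sketch of reading gene-to-term associations from a GAF file. It does not reproduce what `genescape build` does internally or the layout of its index file. It relies only on the public GAF layout: tab-separated columns, `!` comment lines, the gene symbol in column 3 and the GO id in column 5. The function names and the truncated sample rows (including the `ID0001` identifier) are hypothetical.

```python
import gzip
from collections import defaultdict


def symbol_to_terms(lines):
    """Map each gene symbol to the set of GO ids annotated to it."""
    mapping = defaultdict(set)
    for line in lines:
        # Skip GAF header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed row: too few columns
        # Column 3 is the gene symbol, column 5 is the GO id.
        mapping[cols[2]].add(cols[4])
    return dict(mapping)


def read_gaf(path):
    """Read a gzip-compressed GAF file such as mydata.gaf.gz."""
    with gzip.open(path, "rt") as stream:
        return symbol_to_terms(stream)


# Made-up rows cut to five columns for illustration; real GAF rows carry more.
SAMPLE = [
    "!gaf-version: 2.2\n",
    "UniProtKB\tID0001\tZC3H11B\tenables\tGO:0046872\n",
    "UniProtKB\tID0001\tZC3H11B\tinvolved_in\tGO:0016973\n",
]

if __name__ == "__main__":
    print(symbol_to_terms(SAMPLE))
```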
@@ -64,17 +66,18 @@ GRTP1
 GeneScape first transforms the above gene input list into a GO term list, where additional information is added to each term:
 
 ```
-gid,root,count,function,source,size,label
-GO:0090630,BP,1,activation of GTPase activity,GRTP1,4,(1/4)
-GO:0046982,MF,1,protein heterodimerization activity,ABTB3,4,(1/4)
-GO:0031083,CC,1,BLOC-1 complex,BCAS4,4,(1/4)
-GO:0016020,CC,1,membrane,ABTB3,4,(1/4)
-GO:0005737,CC,1,cytoplasm,BCAS4,4,(1/4)
-GO:0005615,CC,1,extracellular space,C3P1,4,(1/4)
-...
+count,function,root,goid,source,size,label
+1,activation of GTPase activity,BP,GO:0090630,GRTP1,4,(1/4)
+1,protein heterodimerization activity,MF,GO:0046982,ABTB3,4,(1/4)
+1,BLOC-1 complex,CC,GO:0031083,BCAS4,4,(1/4)
+1,membrane,CC,GO:0016020,ABTB3,4,(1/4)
+1,cytoplasm,CC,GO:0005737,BCAS4,4,(1/4)
+1,extracellular space,CC,GO:0005615,C3P1,4,(1/4)
+1,GTPase activator activity,MF,GO:0005096,GRTP1,4,(1/4)
+1,endopeptidase inhibitor activity,MF,GO:0004866,C3P1,4,(1/4)
 ```
 
-In the next step, GeneScape visualizes the GO terms as the graph structure that represents the functional context of the genes relative to the larger Gene Ontology.
+In the next step, GeneScape draws the GO terms as the graph structure using the Networkx package [@networkx] helping users visualize the functional context of the genes relative to the larger Gene Ontology.
 
 ![Ontology subgraph for a gene list \label{fig:interface}](images/genescape-output1.png){height="216pt"}

@@ -83,9 +86,9 @@ Various colors are used to provide additional context to the nodes in the graph;
 Since the resulting graphs may also be large, with thousands of nodes, the main interface provides input widgets that allow users to interactively
 reduce the subgraph to nodes for which:
 
-1. The function definitions match certain patterns
-2. A minimum number of genes share a function,
-3. Nodes belong to a specific GO subtree: Biological Process (BP), Molecular Function (MF), Cellular Component (CC)
+1. The function definitions match certain patterns.
+2. A minimum number of genes share a function.
+3. Nodes belong to a specific GO subtree: Biological Process (BP), Molecular Function (MF), Cellular Component (CC).
 
 As an example, take the input genelist of just four genes:
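
The three filtering criteria listed in this hunk can be illustrated on the term table from the earlier hunk. This is a hedged sketch rather than GeneScape's code: it filters rows of the CSV output instead of graph nodes, and `filter_terms` with its `pattern`, `min_count` and `root` parameters is a hypothetical helper. The rows are copied from the updated term list in the diff.

```python
import csv
import io
import re

# Rows copied from the updated term list shown in the paper.md diff.
TERMS = """\
count,function,root,goid,source,size,label
1,activation of GTPase activity,BP,GO:0090630,GRTP1,4,(1/4)
1,protein heterodimerization activity,MF,GO:0046982,ABTB3,4,(1/4)
1,BLOC-1 complex,CC,GO:0031083,BCAS4,4,(1/4)
1,membrane,CC,GO:0016020,ABTB3,4,(1/4)
1,GTPase activator activity,MF,GO:0005096,GRTP1,4,(1/4)
"""


def filter_terms(rows, pattern=None, min_count=1, root=None):
    """Keep rows that satisfy every criterion that was supplied."""
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    kept = []
    for row in rows:
        # 1. The function definition must match the pattern, if one is given.
        if regex and not regex.search(row["function"]):
            continue
        # 2. At least min_count genes must share the function.
        if int(row["count"]) < min_count:
            continue
        # 3. The term must belong to the requested subtree (BP, MF or CC).
        if root and row["root"] != root:
            continue
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(TERMS)))

if __name__ == "__main__":
    for row in filter_terms(rows, pattern="GTPase"):
        print(row["goid"], row["function"])
```

Each criterion is optional, so the same helper covers a pattern search, a count threshold, a subtree restriction, or any combination of the three.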

src/app/requirements.txt

+1
@@ -0,0 +1 @@
+genescape
