This repository contains a Python script, written as a Jupyter Notebook, that retrieves all Protein Data Bank (PDB) structures associated with a UniProt entry.
The tool queries the PDBe, UniProt, and RCSB PDB APIs to gather structural data and the bibliographic references (preprints and PubMed IDs) linked to each PDB structure. It also picks up newly released structures that are not yet indexed in UniProt.
- PDBe API: Retrieve PDB structures associated with a UniProt entry, including recently released ones.
- UniProt API: Find all PDBs listed in the UniProt entry.
- RCSB PDB API: Gather references and PubMed IDs (PMIDs) for each structure.
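
The first two lookups can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the endpoint URLs, the JSON field names (`uniProtKBCrossReferences`, `pdb_id`), and all function names are assumptions about the public UniProt and PDBe REST APIs. The sketch uses the standard library's `urllib` so that it runs without extra installs, whereas the notebook itself lists `requests` as a dependency.

```python
import json
from urllib.request import urlopen

# Assumed public endpoints; check the current UniProt and PDBe API docs.
UNIPROT_URL = "https://rest.uniprot.org/uniprotkb/{ac}.json"
PDBE_URL = "https://www.ebi.ac.uk/pdbe/api/mappings/best_structures/{ac}"


def fetch_json(url, timeout=30):
    """Download a URL and decode its JSON body (HTTP errors propagate)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def pdb_ids_from_uniprot(entry):
    """Collect PDB IDs from the cross-references of a UniProt entry."""
    return {
        xref["id"].lower()
        for xref in entry.get("uniProtKBCrossReferences", [])
        if xref.get("database") == "PDB"
    }


def pdb_ids_from_pdbe(payload, ac):
    """Collect PDB IDs from a PDBe mapping response keyed by accession."""
    return {hit["pdb_id"].lower() for hit in payload.get(ac, [])}


def merge_pdb_ids(uniprot_ids, pdbe_ids):
    """Combine both sources and flag structures known only to PDBe."""
    return {
        "all": sorted(uniprot_ids | pdbe_ids),
        "pdbe_only": sorted(pdbe_ids - uniprot_ids),
    }
```

A possible call sequence for one accession (this needs network access):

```python
ac = "P40967"
uniprot_ids = pdb_ids_from_uniprot(fetch_json(UNIPROT_URL.format(ac=ac)))
pdbe_ids = pdb_ids_from_pdbe(fetch_json(PDBE_URL.format(ac=ac)), ac)
print(merge_pdb_ids(uniprot_ids, pdbe_ids))
```

Keeping the parsing separate from the download makes it easy to see which structures come only from PDBe, which is where entries not yet indexed in UniProt would show up.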
- UniProt Accession Code (AC), e.g. P40967.
- List of all associated PDB entries.
- References and PMIDs for each structure.
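
As a rough sketch of how the references and PMIDs could be collected per structure, the snippet below reads the primary citation from an RCSB entry record. The endpoint URL, the `rcsb_primary_citation` field and its keys, and the function names are assumptions about the RCSB Data API rather than the notebook's own code. Only the standard library is used here.

```python
import json
from urllib.request import urlopen

# Assumed RCSB Data API endpoint for a single PDB entry.
RCSB_ENTRY_URL = "https://data.rcsb.org/rest/v1/core/entry/{pdb_id}"


def fetch_rcsb_entry(pdb_id, timeout=30):
    """Download the RCSB entry record for one PDB ID (needs network access)."""
    with urlopen(RCSB_ENTRY_URL.format(pdb_id=pdb_id), timeout=timeout) as response:
        return json.load(response)


def citation_row(pdb_id, entry):
    """Flatten the primary citation of an entry into one table row.

    Missing fields become None, so structures without a published
    reference still get a row.
    """
    citation = entry.get("rcsb_primary_citation") or {}
    return {
        "pdb_id": pdb_id.upper(),
        "title": citation.get("title"),
        "journal": citation.get("journal_abbrev"),
        "year": citation.get("year"),
        "pmid": citation.get("pdbx_database_id_PubMed"),
        "doi": citation.get("pdbx_database_id_DOI"),
    }
```

Given a list of PDB IDs in a hypothetical variable `pdb_ids`, the rows can be built with `[citation_row(i, fetch_rcsb_entry(i)) for i in pdb_ids]` and passed to `pandas.DataFrame` for display in the notebook.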
Below is an example output generated by the tool:
- Jupyter Notebook: You can install it via Anaconda or pip.
- Python 3.x
- Python libraries: requests, json, pandas, and numpy. The json module ships with Python; install the others using: pip install requests pandas numpy
- Conny Yu – GitHub Profile
October 2024