1️⃣ We present an exploratory attempt, illustrated in the figure below: using the GPT-4o large language model to generate a schematic of cancer gene identification. LLMs have versatile and powerful capabilities, and LLM-aided approaches hold great promise for addressing complex scientific problems.
2️⃣ This is the official implementation of the paper "Cancer Gene Identification through Integrating Causal Prompting Large Language Model with Omics Data-Driven Causal Inference". A cleaned-up version of the code will be released soon.
Key Points
- We propose a novel framework, ICGI, for cancer gene identification that leverages the emergent capabilities of LLMs and the advantages of causal inference.
- Owing to its causal prompting and causal learning mechanisms, ICGI exhibits superior performance in identifying cancer genes from variations in the genome, transcriptome, and even other omics. Furthermore, it effectively distinguishes between cancer and normal samples and elucidates underlying biological mechanisms.
- The study offers valuable insights into harnessing the power of LLMs for bioinformatics tasks through chain-of-thought (CoT) prompting and retrieval-augmented generation (RAG). This LLM-aided strategy holds great promise for tackling other complex scientific challenges as well.
- A web application has been developed and deployed to make the new causal cancer gene identification tool easier to use.
Citation
If you use ICGI for your analysis and research work, please cite our paper:
Zeng, H., Yin, C., Chai, C., Wang, Y., Dai, Q., & Sun, H. (2025). Cancer gene identification through integrating causal prompting large language model with omics data-driven causal inference. Briefings in Bioinformatics, 26(2), bbaf113. https://doi.org/10.1093/bib/bbaf113
If you have any questions or ideas to discuss, please contact us by e-mail.