About
Foundation models and large language models (LLMs) represent a transformative advancement in life sciences, offering a new paradigm for understanding complex biological systems. These large-scale models are pretrained on extensive datasets encompassing various biological data types, enabling them to perform a wide range of tasks such as predicting protein structures, analyzing genomic sequences, and simulating cellular processes. Despite these advances, most existing foundation models and LLMs in life sciences focus on a single modality, such as protein sequences, genomic sequences, or single-cell transcriptomics. However, biological systems are inherently multi-modal, with complex interactions spanning different molecular and structural levels. For example, understanding protein function requires integrating sequence information with structural properties and interaction networks. Similarly, deciphering metabolic pathways involves linking small-molecule representations with reaction kinetics and pathway annotations. Single-modal models often fail to capture these intricate relationships, limiting their applicability in real-world biological research.
To address these limitations, there is a growing need for multi-modal foundation models and LLMs that can effectively integrate and reason over diverse biological modalities. Such models must align heterogeneous data types, handle missing or noisy modalities, and support biologically meaningful interpretations. Developing them poses unique challenges, including the design of modality fusion strategies and cross-modal pretraining objectives, as well as computational scalability. Additionally, ensuring interpretability and robustness remains essential for their adoption in life sciences.
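As one concrete illustration of the cross-modal pretraining objectives mentioned above, the sketch below shows a minimal CLIP-style contrastive alignment between two biological modalities (here, protein sequence and structure embeddings). This is only an illustrative toy, not a method endorsed by the workshop: all class, variable, and dimension names are hypothetical, and the modality encoders are stubbed out with random tensors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAligner(nn.Module):
    """Toy CLIP-style aligner: projects two modality embeddings
    into a shared space and scores matched pairs contrastively."""

    def __init__(self, seq_dim, struct_dim, shared_dim=128):
        super().__init__()
        self.seq_proj = nn.Linear(seq_dim, shared_dim)
        self.struct_proj = nn.Linear(struct_dim, shared_dim)
        # Learnable temperature, as in CLIP-style objectives.
        self.log_temp = nn.Parameter(torch.tensor(0.0))

    def forward(self, seq_emb, struct_emb):
        # L2-normalize so the pairwise scores are cosine similarities.
        z_seq = F.normalize(self.seq_proj(seq_emb), dim=-1)
        z_struct = F.normalize(self.struct_proj(struct_emb), dim=-1)
        logits = z_seq @ z_struct.t() / self.log_temp.exp()
        # Matched pairs lie on the diagonal of the similarity matrix.
        targets = torch.arange(logits.size(0), device=logits.device)
        # Symmetric InfoNCE loss over both matching directions.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.t(), targets))

# Toy usage: random stand-ins for pooled per-protein embeddings from
# a (hypothetical) sequence encoder and structure encoder.
model = CrossModalAligner(seq_dim=256, struct_dim=64)
seq_emb = torch.randn(32, 256)
struct_emb = torch.randn(32, 64)
loss = model(seq_emb, struct_emb)
loss.backward()
```

Real systems of this kind must additionally cope with the issues raised above, such as missing modalities within a batch and fusion beyond a shared embedding space.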
This workshop will bring together researchers working at the intersection of multi-modal learning, foundation models, LLMs, and life sciences to discuss recent advancements, explore methodological innovations, and identify key challenges in designing multi-modal foundation models and LLMs for biological data. By fostering interdisciplinary collaboration, the workshop aims to accelerate progress toward more comprehensive and biologically grounded AI models for life sciences.
Topics
The topics include but are not limited to:
- Multi-modal foundation models for learning representations of proteins, DNA, RNA, transcriptomic data, metabolomic data, and other biological modalities.
- Multi-modal LLMs for predicting the functions of proteins, DNA, RNA, and other biomolecules.
- LLM agents for multi-modal biomedical data.
- Multi-modal foundation models for learning joint representations of multi-omics data.
- Multi-modal generative models for designing proteins, DNA, RNA, and other biomolecules.
- Applications of multi-modal foundation models and LLMs in drug discovery, precision medicine, personalized treatment, and beyond.
- Interpretability and robustness in biological multi-modal foundation models and LLMs.
Info
In-person workshop at ICML 2026
Date: TBD
Time: TBD
Location: TBD
Room: TBD
Paper Submission: OpenReview
Contact Email: icml2026fm4ls@gmail.com