AI for Science

The BioAI Systems Lab studies how artificial intelligence can become an active partner in biological discovery. Our goal is not only to apply AI to biological data, but to build AI-native computational systems that help scientists analyze results, integrate evidence, generate hypotheses, prioritize experiments, and support interpretable discovery.

Modern biology is rapidly becoming an AI-driven science. Large-scale experiments now generate massive, heterogeneous, and high-dimensional datasets, including genome sequences, proteomics measurements, metabolomics profiles, and imaging data, while the scientific literature continues to grow alongside them. These sources contain rich signals about biological systems, but turning them into mechanistic insight, testable hypotheses, and actionable knowledge remains a major challenge.

We are particularly interested in research at the interface of machine learning, data mining, large-scale biological data analysis, and systems biology. By combining biological knowledge with modern AI methods, we aim to create computational tools that make scientific investigation faster, more systematic, and more insightful.

Research Focus Areas

AI for Hypothesis Generation

Biological discovery often begins with observations scattered across multiple datasets, experiments, and publications. Human experts are highly effective at synthesizing such evidence, but the scale of modern scientific data increasingly exceeds what can be examined manually.

We develop AI methods that learn from experimental results and prior biological knowledge to propose plausible mechanistic hypotheses, prioritize promising explanations, and identify questions worth testing experimentally. A long-term goal is to build systems that assist scientists in moving from data to discovery in a more structured and reproducible manner.

AI for Multi-Omics and Scientific Data Integration

Important biological phenomena are rarely captured by a single data type. Understanding genotype-phenotype relationships, cellular states, microbial communities, and disease mechanisms often requires integrating genomic, transcriptomic, proteomic, metabolomic, and phenotypic data with literature-derived knowledge.

We study AI models for integrating heterogeneous biological data, extracting shared structure across modalities, and uncovering associations that are difficult to detect with conventional single-view analyses. We are especially interested in methods that remain robust, interpretable, and useful under noisy, sparse, or incomplete experimental conditions.

AI for Automated Scientific Reasoning

Scientific progress depends on more than predictive accuracy. Useful AI systems for science should reason over evidence, handle uncertainty, compare competing explanations, and produce outputs that scientists can inspect and validate.

Our research explores AI systems that support scientific reasoning by organizing evidence, linking observations to candidate mechanisms, evaluating alternative hypotheses, and suggesting next-step experiments. We are interested in frameworks that combine statistical learning, symbolic reasoning, domain knowledge, and human feedback to make AI more reliable for scientific discovery.

AI for Experiment Design and Discovery Workflows

Scientific discovery is an iterative cycle involving data collection, analysis, interpretation, hypothesis generation, and validation. We aim to build computational tools that support this full workflow rather than isolated prediction tasks.

Examples include AI systems that recommend follow-up experiments, prioritize candidate biomarkers or functional variants, identify missing evidence, and help researchers allocate effort toward the most informative next step. In the long run, we envision human-AI collaborative platforms that function as research assistants for computational biology.

Trustworthy and Interpretable AI for Biology

In scientific settings, black-box predictions are often insufficient. Researchers need to understand why a model made a suggestion, how uncertain it is, and whether the result is biologically plausible.

We therefore emphasize interpretable, knowledge-aware, and trustworthy AI methods. Our goal is to develop models that not only perform well, but also produce explanations, confidence estimates, and biologically meaningful representations that support downstream validation and discovery.

Representative Topics

AI for hypothesis generation from multi-omics data
Foundation and language models for biological discovery
Knowledge-guided machine learning for systems biology
AI-assisted proteomics, metaproteomics, and metabolomics analysis
Computational methods for genotype-phenotype discovery
AI systems for experiment recommendation and prioritization
Interpretable and trustworthy AI for scientific research

Vision

Through this research area, the BioAI Systems Lab seeks to help shape a new paradigm for computational biology: one in which AI is not just a tool for data analysis, but an active partner in scientific reasoning and discovery. We are interested in building methods and systems that assist scientists in understanding complex biological processes and transforming large-scale data into knowledge.