The research paper "Flexibility-Aware Geometric Latent Diffusion for Full-Atom Peptide Design," co-authored by research teams from MindRank AI and from Professor Zhen Li's group at Qingdao University, has been accepted to ICML 2026, the 43rd International Conference on Machine Learning, where it will be presented. The study introduces PepFGLD, a full-atom peptide design framework built to address the challenge of "intrinsic flexibility" in AI-aided peptide drug design, offering a new technical path for tackling complex diseases.
About ICML
The International Conference on Machine Learning (ICML) is one of the most influential top-tier academic conferences in the field of machine learning. It consistently focuses on the frontiers of machine learning theory, algorithms, systems, applications, and interdisciplinary directions, serving as a vital exchange platform for global AI researchers. Together with ICLR and NeurIPS, it is recognized as one of the "big three" conferences in machine learning.
Known for its rigorous review standards and high requirements for innovation, ICML attracts cutting-edge submissions from top universities, research institutions, and enterprises worldwide. This year, the conference received 23,918 valid submissions, with 6,352 accepted, representing an acceptance rate of approximately 26.6%. The conference will be held from July 6 to 11, 2026, in Seoul, South Korea.
Research Background: A New Model for Solving Peptide Design Challenges
Peptide molecules possess excellent target recognition capabilities and regulatory potential, making them highly valuable in modulating protein-protein interactions (PPI), constructing targeted binding molecules, and enabling innovative therapeutic modalities. Compared to traditional small molecules, peptides offer unique advantages in binding to flexible and shallow interfaces, high-specificity recognition, and the regulation of complex biological functions.
However, peptides naturally possess high conformational flexibility. Especially at the receptor binding interface, their sequence, three-dimensional structure, and binding conformation are often strongly coupled, placing higher demands on AI modeling. Generating peptide structures that satisfy both receptor-conditioned constraints and physical feasibility remains a significant challenge in AI-driven peptide design.
To overcome these obstacles, the algorithm team at MindRank AI and the research team at Qingdao University proposed PepFGLD, a receptor-conditioned, flexibility-aware framework for full-atom peptide design. While accounting for the high flexibility of peptides, the model generates both the amino acid sequence and the 3D structure at the level of individual atoms, with the goal that the designed peptides are physically feasible and bind tightly to the target receptor.
Collaborative Achievement: The PepFGLD Framework for Full-Atom Peptide Design
The PepFGLD framework is built on a latent diffusion model (LDM) architecture. Its core methodology combines a flexibility-aware sequence-structure variational autoencoder (Flex-VAE) with time-dependent energy-guided diffusion (TDEG).
1. Structure Representation and Data Modeling:
- Full-Atom Geometric Graph: Each peptide structure is represented as a full-atom geometric graph to capture fine-grained, atom-level geometric constraints.
- Channel Graph: For more efficient modeling, the team defined a channel-augmented residue graph (Channel Graph), where each residue is represented by 14 predefined atomic channel coordinates (including backbone and side-chain atoms).
- Latent Point Clouds: Structural samples are mapped through the encoder into latent point clouds in a continuous latent space, allowing for diffusion and sampling in a lower-dimensional manifold.
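The channel-graph idea above can be illustrated with a minimal sketch. It assumes that the 14 channels hold each residue's heavy atoms in a fixed order (4 backbone atoms followed by up to 10 side-chain atoms), with a mask marking unused channels; the article does not specify the ordering or the masking scheme, and `build_channel_graph` and `HEAVY_ATOM_COUNT` are hypothetical names used only for illustration.

```python
import numpy as np

NUM_CHANNELS = 14  # channels per residue, as stated in the article

# Heavy-atom counts for a few standard residues (backbone N, CA, C, O plus side chain).
# Partial table for illustration only; a full model would cover all 20 amino acids.
HEAVY_ATOM_COUNT = {"G": 4, "A": 5, "S": 6, "K": 9, "F": 11, "W": 14}


def build_channel_graph(sequence, atom_coords):
    """Pack per-residue atom coordinates into a fixed (n_res, 14, 3) array.

    sequence:    string of one-letter residue codes
    atom_coords: list of (n_atoms_i, 3) arrays, one per residue
    Returns (coords, mask); mask is True where a channel holds a real atom.
    """
    n_res = len(sequence)
    coords = np.zeros((n_res, NUM_CHANNELS, 3), dtype=np.float32)
    mask = np.zeros((n_res, NUM_CHANNELS), dtype=bool)
    for i, (aa, xyz) in enumerate(zip(sequence, atom_coords)):
        xyz = np.asarray(xyz, dtype=np.float32)
        n_atoms = HEAVY_ATOM_COUNT[aa]
        if xyz.shape != (n_atoms, 3):
            raise ValueError(f"residue {i} ({aa}) expects {n_atoms} atoms")
        coords[i, :n_atoms] = xyz   # real atoms fill the leading channels
        mask[i, :n_atoms] = True    # trailing channels stay zero-padded
    return coords, mask
```

A fixed channel count gives every residue the same tensor shape, so a peptide can be batched as a dense array instead of a ragged list of atoms.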
2. Three Innovative Modules:
- FlexEGNN (Flexibility-Aware Equivariant Graph Neural Network): Improves the sensitivity of geometric representations to local flexibility, capturing conformational changes and interface deformations at the receptor binding site.
- SSBIM (Bidirectional Sequence–Structure Interaction Module): Simultaneously models the dynamic relationship between amino acid sequences and 3D structures, ensuring generated results align with realistic conformations in a receptor environment.
- TDEG (Time-Dependent Energy-Guided Diffusion): Injects physical constraints during the generation process, progressively guiding the model to generate more stable and reasonable full-atom structures.
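For readers unfamiliar with equivariant graph networks, the sketch below shows a generic, simplified EGNN-style coordinate update. It is not the FlexEGNN module itself, whose flexibility-aware details the article does not describe. The weight shapes, the `tanh` message function, and the name `egnn_style_update` are assumptions chosen for illustration.

```python
import numpy as np


def egnn_style_update(x, h, w_edge, w_coord):
    """One simplified EGNN-style step on a fully connected graph.

    x: (n, 3) coordinates, h: (n, d) invariant node features,
    w_edge: (2d + 1, m) and w_coord: (m,) stand in for learned weights.
    """
    n, d = h.shape
    diff = x[:, None, :] - x[None, :, :]                 # (n, n, 3) relative vectors
    dist2 = np.sum(diff ** 2, axis=-1, keepdims=True)    # (n, n, 1) invariant distances
    h_i = np.broadcast_to(h[:, None, :], (n, n, d))
    h_j = np.broadcast_to(h[None, :, :], (n, n, d))
    messages = np.tanh(np.concatenate([h_i, h_j, dist2], axis=-1) @ w_edge)  # (n, n, m)
    scale = messages @ w_coord                            # (n, n) invariant scalars
    # Move each node along relative vectors weighted by invariant scalars.
    x_new = x + (diff * scale[..., None]).sum(axis=1) / max(n - 1, 1)
    return x_new, messages.sum(axis=1)


# Usage: transforming the input by an orthogonal matrix and a shift
# transforms the output coordinates the same way.
rng = np.random.default_rng(0)
x, h = rng.normal(size=(6, 3)), rng.normal(size=(6, 4))
w_edge, w_coord = rng.normal(size=(9, 5)), rng.normal(size=5)
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
shift = rng.normal(size=3)
x_a, m_a = egnn_style_update(x, h, w_edge, w_coord)
x_b, m_b = egnn_style_update(x @ q.T + shift, h, w_edge, w_coord)
print(np.allclose(x_a @ q.T + shift, x_b), np.allclose(m_a, m_b))
```

Because the messages depend only on distances and node features, they are unchanged by rotating or translating the input, while the coordinate update rotates along with it. This property is what makes such layers suitable for 3D molecular structures.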
3. Two-Stage Training Strategy: The team employed a two-stage training strategy. First, unsupervised pretraining is performed on a large-scale dataset of tens of thousands of protein fragments, allowing the model to learn general structural recovery capabilities. Second, the model is fine-tuned on real protein-peptide complex data, enabling it to learn conformational preferences and spatial constraints for specific receptor interfaces, effectively customizing the "key" for a specific "lock".
4. Sampling and Reconstruction Process: During the inference stage, the model performs reverse diffusion sampling in the latent space (with TDEG guidance) to obtain an optimized sequence-structure joint latent state. A decoder then reconstructs these latent variables into a protein-peptide complex structure containing full atomic coordinates. This approach ensures that PepFGLD not only generates high-affinity peptide sequences but also provides physically consistent and geometrically accurate 3D binding poses.
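The guided sampling step described above can be sketched as a standard DDPM-style reverse loop in which an energy gradient shifts each denoising step. This is a toy illustration under stated assumptions, not the TDEG procedure from the paper: the linear noise schedule, the guidance weight that grows as sampling approaches the final step, and the names `guided_reverse_diffusion`, `predict_noise`, and `energy_grad` are all hypothetical.

```python
import numpy as np


def guided_reverse_diffusion(predict_noise, energy_grad, shape,
                             num_steps=50, guidance_max=2.0, seed=0):
    """Toy energy-guided ancestral sampling in a latent space.

    predict_noise(z, alpha_bar) stands in for a trained denoiser;
    energy_grad(z) returns the gradient of an energy to be lowered.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.05, num_steps)     # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    z = rng.standard_normal(shape)                 # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        eps_hat = predict_noise(z, alpha_bars[t])
        # Standard DDPM posterior mean from the predicted noise.
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Assumed time-dependent weight: weak at high noise, strong near the end.
        weight = guidance_max * (1.0 - t / num_steps)
        mean = mean - weight * betas[t] * energy_grad(z)
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        z = mean + np.sqrt(betas[t]) * noise
    return z


# Usage with stand-ins: a denoiser that is exact for unit-Gaussian data and a
# quadratic energy pulling latents toward a hypothetical target point.
target = np.full((64, 8), 3.0)
toy_denoiser = lambda z, alpha_bar: np.sqrt(1.0 - alpha_bar) * z
toy_energy_grad = lambda z: z - target
guided = guided_reverse_diffusion(toy_denoiser, toy_energy_grad, target.shape)
unguided = guided_reverse_diffusion(toy_denoiser, toy_energy_grad, target.shape,
                                    guidance_max=0.0)
```

In a full pipeline, the returned latent would then be passed to the decoder to reconstruct full atomic coordinates; here the comparison between `guided` and `unguided` only shows that the energy term steers samples toward lower-energy regions.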
Experimental Success: More Precise and Stable Drug Candidates
Testing on authoritative benchmarks such as PepBench and PepBDB has demonstrated the significant advantages of PepFGLD:
- Significant Performance Boost: PepFGLD outperforms existing leading models such as HSRN, dyMEAN, and PepGLAD on metrics including binding energy, success rate, diversity, and consistency.
- Higher Physical Feasibility: Experiments indicate that PepFGLD-generated peptide conformations are closer to the reference energy landscape and that the model is more adaptable and accurate in high-flexibility regions (such as coil regions).
- Geometric Coherence: Case studies show that PepFGLD can generate continuous and stable molecular backbone trajectories, avoiding common issues such as structural fragmentation or atomic overlap found in traditional methods.
By introducing flexibility-aware mechanisms and physical energy guidance, the PepFGLD model shifts peptide drug design from "trial and error" toward AI-driven precision prediction. Its exceptional performance in handling dynamic interfaces and full-atom details makes it an ideal digital engine for the development of next-generation, high-specificity peptide therapies.
This technology not only improves the efficiency of designing high-quality, drug-like peptide molecules but also brings a new digital paradigm and hope for future treatments in high-risk areas such as cancer and immunological diseases.
