Duy H. M. Nguyen

Nguyen Ho Minh Duy


Universitätsstraße 32

70569 Stuttgart, Germany

Room: 2.321

I am currently a Ph.D. candidate under the supervision of Prof. Mathias Niepert at the International Max Planck Research School for Intelligent Systems (IMPRS-IS) and the University of Stuttgart. I have also been a Researcher at the German Research Center for Artificial Intelligence (DFKI) since 2021.

My topics of interest are Hybrid Discrete-Continuous Learning (differentiable relaxations for discrete intermediate representations), Scalable Algorithms for Multi-modal Learning with applications in Healthcare and Simulation Science, and Efficient Deep Learning (model compression, accelerated training/inference, etc.).

Please visit my Google Scholar for a full list of publications and my GitHub for source code.

news

Nov 08, 2025 🔔 Exciting News! We’re thrilled to share that two of our recent works have been accepted to AAAI 2026 — one as an oral and the other as a poster presentation! 🎉 (i) Multi-Mood — a multi-modal large language model that integrates video, audio, and text with psychological criteria through reinforcement learning to enable trustworthy and emotionally aligned responses. (ii) LIBERO-Mem — a non-Markovian task suite for short- and long-horizon object tracking and manipulation, featuring temporally sequenced subgoals that challenge models to reason beyond the current observation. 📄 Papers and code will be released soon 🎉 — stay tuned!
Sep 26, 2025 🔔 Excited to share that our works on (i) ExGra-Med — a data-efficient multimodal large language model (LLM) for healthcare; (ii) Token Redundancy in 3D Point Cloud Transformers — uncovering how existing 3D transformers (e.g., PTv3, Sonata) are over-tokenized and proposing an efficient token merging strategy that reduces computation by up to 90-95% while preserving accuracy; and (iii) Over-Optimization in RLHF for LLM Post-Training — exploring how reinforcement learning from human feedback can lead to alignment instability and offering new insights into optimizing LLM post-training, have been accepted to NeurIPS 2025 🎉. Excited to present and discuss them in San Diego 🚀
Sep 09, 2025 🌟 Excited to give talks about my current research on Scaling Multi-Modal Learning: Hybrid Representations and Efficient Adaptation at (i) the Machine Learning Lab, School of Information and Communications Technology (SOICT), Hanoi University of Science and Technology, Vietnam, and (ii) the School of Computing, National University of Singapore (NUS).
Sep 02, 2025 :bell: MGPath has been accepted to Transactions on Machine Learning Research (TMLR). Congratulations to all co-authors on this milestone!
May 01, 2025 🎉 (i) A preliminary version of MGPath has been accepted to the Workshop on Foundation Models in the Wild at ICLR 2025, and (ii) our work on LLaMA-Adapter’s prompt learning has been accepted at ICML 2025.
Apr 20, 2025 🎉 Our work on building a new Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data has been accepted at Scientific Reports, Nature Portfolio.
Feb 20, 2025 :bell: Excited to share our latest works 🎉: (i) On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation – we introduce a Mixture of Experts (MoE) perspective to explain the mechanism behind LLaMA-Adapter’s prompt learning. (ii) MGPath – a novel multi-granular prompt learning method for few-shot WSI pathology prediction, leveraging the power of foundation vision-language models.
Oct 08, 2024 🇨🇭 Starting my research visit at the ETH AI Center, ETH Zurich, focusing on Multi-Modal LLMs for Healthcare empowered by Retrieval-Augmented Generation.
Oct 07, 2024 :bell: Excited to introduce our latest work on medical multi-modal LLMs: LoGra-Med, a novel pre-training algorithm that incorporates multi-graph alignment to effectively address the data-hungry nature of autoregressive learning.
Oct 06, 2024 :rocket: The paper PiToMe has been accepted at NeurIPS 2024. Our code will be available soon!
Jun 10, 2024 :bell: Our new preprint PiToMe is online. We propose a new spectrum-preserving method for token merging in Transformers.
May 01, 2024 :rocket: A paper submitted to ICML 2024 on molecular conformer aggregation networks has been accepted.
Jan 15, 2024 :rocket: A paper submitted to ICLR 2024 on the topic of accelerating transformers has been accepted as an oral talk.
Sep 22, 2023 :rocket: A paper submitted to NeurIPS 2023 on large-scale medical image pre-trained models using second-order graph matching has been accepted.

preprints

  1. Under Review
    S-Chain: Structured Visual Chain-of-Thought for Medicine
    Khai Le-Duc*, Duy MH Nguyen*, Phuong T.H. Trinh*, Tien-Phat Nguyen*, et al.
    2025
    *Co-first contributions
  2. Under Review
    The Reasoning Boundary Paradox: How Reinforcement Learning Constrains Language Models
    Phuc Minh Nguyen, Chinh D La, Duy MH Nguyen, Nitesh V Chawla, Binh T Nguyen, Khoa D Doan
    2025
  3. Under Review
    From Fragments to Geometry: A Unified Graph Transformer for Molecular Representation from Conformer Ensembles
    Duy MH Nguyen, Trung Quoc Nguyen, Ha Thi Hong Le, Mai TN Truong, TrungTin Nguyen, Nhat Ho, Khoa D Doan, Duy Duong-Tran, Li Shen, Daniel Sonntag, and 4 more authors
    2025

selected publications

  1. AAAI (Oral)
    Reinforce Trustworthiness in Multimodal Emotional Support System
    Huy M. Le, Dat Tien Nguyen, Ngan T. T. Vo, Tuan D. Q. Nguyen, Nguyen Le Binh, Duy MH Nguyen, Daniel Sonntag, Lizi Liao, Binh T. Nguyen
    Proceedings of the AAAI Conference on Artificial Intelligence, 2026
  2. AAAI
    Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective
    Nhat Chung, Taisei Hanyu, Toan Nguyen, Huy Le, Frederick Bumgarner, Duy MH Nguyen, Khoa Vo, Kashu Yamazaki, Chase Rainwater, Tung Kieu, and 2 more authors
    Proceedings of the AAAI Conference on Artificial Intelligence, 2026
  3. NeurIPS
    ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models
    Duy MH Nguyen, Nghiem T. Diep, Trung Q. Nguyen, Hoang-Bao Le, Tai Nguyen, Tien Nguyen, TrungTin Nguyen, Nhat Ho, Pengtao Xie, Roger Wattenhofer, and 3 more authors
    Advances in Neural Information Processing Systems (NeurIPS), 2025
    A short version was accepted at the Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences, ICML 2025
  4. NeurIPS
    How Many Tokens Do 3D Point Cloud Transformer Architectures Really Need?
    Tuan Anh Tran, Duy MH Nguyen, Hoai-Chau Tran, Michael Barz, Khoa D Doan, Roger Wattenhofer, Vien Anh Ngo, Mathias Niepert, Daniel Sonntag, Paul Swoboda
    Advances in Neural Information Processing Systems (NeurIPS), 2025
    A short version was accepted at the 3rd Workshop on Efficient Systems for Foundation Models, ICML 2025
  5. NeurIPS
    Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
    Phuc Minh Nguyen, Ngoc-Hieu Nguyen, Duy MH Nguyen, Anji Liu, An Mai, Binh T. Nguyen, Daniel Sonntag, Khoa D. Doan
    Advances in Neural Information Processing Systems (NeurIPS), 2025
  6. TMLR
    MGPATH: Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot WSI Classification
    Anh-Tien Nguyen, Duy MH Nguyen, Nghiem Tuong Diep, Trung Quoc Nguyen, Nhat Ho, Jacqueline Michelle Metsch, Miriam Cindy Maurer, Daniel Sonntag, Hanibal Bohnenberger, Anne-Christin Hauschild
    Transactions on Machine Learning Research (TMLR), 2025
    A short version was accepted at the Workshop on Foundation Models in the Wild, ICLR 2025
  7. ICML
    On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation
    Nghiem T. Diep*, Huy Nguyen*, Chau Nguyen*, Minh Le, Duy MH Nguyen, Daniel Sonntag, Mathias Niepert, Nhat Ho
    International Conference on Machine Learning (ICML), 2025
  8. NeurIPS
    Accelerating Transformers with Spectrum-Preserving Token Merging
    Hoai-Chau Tran*, Duy MH Nguyen*, Duy M Nguyen, Trung-Tin Nguyen, Ngan Le, Pengtao Xie, Daniel Sonntag, James Y Zou, Binh T Nguyen, Mathias Niepert
    Advances in Neural Information Processing Systems (NeurIPS), 2024
  9. ICML
    Structure-aware E(3)-invariant molecular conformer aggregation networks
    Duy MH Nguyen, Nina Lukashina, Tai Nguyen, An T Le, TrungTin Nguyen, Nhat Ho, Jan Peters, Daniel Sonntag, Viktor Zaverkin, Mathias Niepert
    International Conference on Machine Learning (ICML), 2024
  10. ICLR (Oral)
    Energy minimizing-based token merging for accelerating Transformers
    Hoai-Chau Tran*, Duy MH Nguyen*, Manh-Duy Nguyen, Ngan Hoang Le, Binh T Nguyen
    5th Workshop on Practical ML for Limited/Low Resource Settings, International Conference on Learning Representations (ICLR), 2024
  11. NeurIPS
    LVM-Med: Learning large-scale self-supervised vision models for medical imaging via second-order graph matching
    Duy MH Nguyen, Hoang Nguyen, Nghiem Diep, Tan Ngoc Pham, Tri Cao, Binh Nguyen, Paul Swoboda, Nhat Ho, Shadi Albarqouni, Pengtao Xie, and 1 more author
    Advances in Neural Information Processing Systems (NeurIPS), 2023
  12. AAAI
    Joint self-supervised image-volume representation learning with intra-inter contrastive clustering
    Duy MH Nguyen, Hoang Nguyen, Truong TN Mai, Tri Cao, Binh T Nguyen, Nhat Ho, Paul Swoboda, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag
    Proceedings of the AAAI Conference on Artificial Intelligence, 2023
  13. CVPR
    LMGP: Lifted multicut meets geometry projections for multi-camera multi-object tracking
    Duy MH Nguyen, Roberto Henschel, Bodo Rosenhahn, Daniel Sonntag, Paul Swoboda
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022