I’m a final-year Ph.D. candidate in Computer Science at the University of Maryland, College Park, and a pre-doctoral research fellow at the National Library of Medicine (NIH).
My research combines natural language processing, biomedical informatics, and AI safety to build clinical language models that are accurate, safe, and robust in real-world healthcare applications.
Research Focus
- Domain-Augmented Language Models – Enhance biomedical reasoning in LLMs using retrieval, structured knowledge (e.g., gene databases, ontologies), and tool integration (e.g., clinical calculators, genomic indexing).
- Clinical AI for Real-World Use – Develop interactive systems for patient-trial matching, diagnostic imaging, and genomic analysis that are both explainable and efficient for clinical workflows.
- Safe and Trustworthy Medical LLMs – Investigate hallucination, bias, adversarial attacks, and privacy risks in clinical LLMs; design decoding-time safety mechanisms and benchmark evaluation frameworks.
Awards
- NIH Director’s Challenge Award
- NIH Predoctoral Visiting Program Award
Recent Publications
- Protecting Patient Privacy Through Controlled Text Generation (AMIA, 2025)
- MedGuard: A Safety Benchmark for Medical Large Language Models (Under review, 2025)
- Knowledge-Guided Contextual Gene Set Analysis with Large Language Models (Under review, 2025)
- Matching patients to clinical trials with large language models (Nature Communications, 2024)
A complete list is available on my Google Scholar profile.
Experience
- Pre-Doctoral Fellow, National Library of Medicine (NIH) — 2022 – present
- Ph.D. Candidate, University of Maryland — 2020 – present
- Research Intern, Genentech (gRED) — Summer 2022
Contact
yang7832@umd.edu · yifan.yang3@nih.gov · CV · Google Scholar