PLACID: Privacy-preserving Large Language Models for
Acronym Clinical Inference and Disambiguation
The Problem
Electronic health records contain tens of thousands of acronyms. "PT" could mean patient, physical therapy, prothrombin time, or posterior tibial — depending on context. Misinterpretation contributes to clinical errors and hinders downstream NLP tasks.
Existing solutions require sending sensitive patient data to external APIs — a non-starter in healthcare. PLACID addresses both problems at once.
Figure: Acronym ambiguity example. The distribution of expansions depends entirely on clinical context.
The Approach
Runs entirely on-premises. Patient data never leaves the institution. No external API calls, no data sharing.
Uses surrounding clinical context — the full sentence, section headers, patient history — to infer the most likely acronym expansion.
Designed to operate at scale across millions of clinical notes, improving downstream NLP tasks including coding, summarisation, and QA.
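The context-driven inference described above can be illustrated with a toy stand-in that never calls an external service. PLACID itself uses an on-premises LLM; the `disambiguate` helper and the cue-word lists below are purely illustrative assumptions, not the paper's method.

```python
import re

# Candidate expansions for "PT", each paired with context cue words.
# These cue lists are hypothetical, chosen only to illustrate the idea.
PT_EXPANSIONS = {
    "patient": {"reports", "denies", "states", "admitted"},
    "physical therapy": {"therapy", "exercise", "mobility", "rehab", "gait"},
    "prothrombin time": {"inr", "warfarin", "coagulation", "elevated", "lab"},
    "posterior tibial": {"pulse", "artery", "tendon", "palpable", "dorsalis"},
}

def disambiguate(context: str, expansions=PT_EXPANSIONS) -> str:
    """Pick the expansion whose cue words best overlap the note context.

    A real system would score candidates with a local language model;
    keyword overlap stands in here so the sketch stays self-contained.
    """
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    scores = {exp: len(cues & tokens) for exp, cues in expansions.items()}
    return max(scores, key=scores.get)

print(disambiguate("PT pulse palpable, dorsalis pedis intact"))
# With these cue lists this prints "posterior tibial".
```

Everything runs locally, mirroring the design constraint that patient text never leaves the institution; swapping the overlap score for a local model's likelihood over candidate expansions is the step this sketch elides.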
Performance
Figure: Disambiguation accuracy by acronym frequency. Figures are representative; see the arXiv paper for full experimental details.
Team
Manjushree B. Aithal
Lead Author · PhD
Privacy-preserving NLP and clinical language models
Alexander Kotz
Co-author · CPBS PhD Student
Computational Bioscience · CU Anschutz
James Mitchell
Senior Author · PI
Department of Biomedical Informatics · CU Anschutz
arXiv:2603.23678 · March 2026
Available now as a preprint on arXiv. Peer review in progress.
View on arXiv →