Joining the conversation: introducing a dedicated medical education corpus

Document Type

Journal Article

Publication Date

1-1-2026

Journal

Academic medicine : journal of the Association of American Medical Colleges

Volume

101

Issue

1

DOI

10.1093/acamed/wvaf008

Keywords

bibliometrics; information retrieval; machine learning; medical education; scholarship

Abstract

PROBLEM: Medical education scholars struggle to join ongoing conversations in their field due to the lack of a dedicated medical education corpus. Without such a corpus, scholars must search too widely across thousands of irrelevant journals or too narrowly by relying on PubMed's Medical Subject Headings (MeSH). In tests conducted for this study, MeSH missed 34% of medical education articles.
APPROACH: From January to December 2024, the authors developed the Medical Education Corpus (MEC), the first dedicated collection of medical education articles, through a 3-step process. First, using the core-periphery model, they created the Medical Education Journals (MEJ), a collection of 2 groups of journals based on participation and influence in medical education discourse: the MEJ-Core (formerly the MEJ-24, 24 journals) and the MEJ-Adjacent (127 journals). Second, they developed and evaluated a machine learning model, the MEC Classifier, trained on 4,032 manually labeled articles to identify medical education content. Third, they applied the MEC Classifier to extract medical education articles from the MEJ-Core and MEJ-Adjacent journals.
OUTCOMES: As of December 2024, the MEC contained 119,137 medical education articles from the MEJ-Core (54,927 articles) and MEJ-Adjacent journals (64,210 articles). In an evaluation using 1,358 test articles, the MEC Classifier demonstrated significantly improved sensitivity compared with MeSH (90% vs 66%, P = .001), while maintaining a similar positive predictive value (82% vs 81%).
NEXT STEPS: The MEC provides a focused corpus that enables medical education scholars to more easily join conversations in the field. Scholars can rely on the MEC when reviewing literature to frame their work, and the MEC also creates opportunities for field-wide analyses and meta-research.
The core methodology also underlies the MedEdMentor Paper Database (mededmentor.org), a separately maintained online tool that complements the versioned MEC snapshot with a web-based search interface.
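For readers less familiar with the two evaluation metrics reported in the abstract, the following is a minimal Python sketch of how sensitivity and positive predictive value are conventionally computed from classification counts. The function names and the counts TP, FN, and FP are illustrative assumptions, not data or code from the study; the abstract reports only the resulting percentages. The last two lines simply check that figures stated in the abstract are mutually consistent.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual medical education articles that were identified."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of articles flagged as medical education that truly are."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts chosen only to illustrate the formulas
# (NOT taken from the study's 1,358 test articles).
TP, FN, FP = 90, 10, 20

sens = sensitivity(TP, FN)               # 90 / (90 + 10)
ppv = positive_predictive_value(TP, FP)  # 90 / (90 + 20)
miss_rate = 1 - sens                     # share of relevant articles missed

# Consistency checks using only figures stated in the abstract.
mec_total = 54_927 + 64_210          # MEJ-Core + MEJ-Adjacent articles
mesh_miss_rate = round(1 - 0.66, 2)  # MeSH sensitivity of 66% implies 34% missed
```

Under these definitions, a method's miss rate is one minus its sensitivity, which is why a MeSH sensitivity of 66% corresponds to the 34% of medical education articles that the abstract reports MeSH missed.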

Department

Health, Human Function, and Rehabilitation Sciences
