Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

BACKGROUND: Despite numerous past endeavors for the semantic harmonization of Alzheimer's disease (AD) cohort studies, an automatic tool has yet to be developed. OBJECTIVE: As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool. METHODS: We created a common data model (CDM) through cross-mapping data from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we evaluated the model using three previously unseen cohorts and compared its performance to a string-matching baseline model. RESULTS: Here, we present our AD-Mapper interface for automatic harmonization of AD cohort studies, which outperformed a string-matching baseline on previously unseen cohort studies. We showcase our CDM comprising 1218 unique variables. CONCLUSION: AD-Mapper leverages semantic similarities in naming conventions across cohorts to improve mapping performance.

Original publication




Journal article


J Alzheimers Dis

Publication Date





1409 - 1423


Alzheimer’s disease, automatic data harmonization, cohort study, common data model, data interoperability, semantic mapping, Alzheimer Disease, Humans, Semantics, Cohort Studies