Pre-trained Large Language Models (LLMs) have revolutionised Natural Language Processing (NLP) tasks, but often struggle when applied to specialised domains such as healthcare. The traditional approach of pre-training on large datasets followed by task-specific fine-tuning is resource-intensive and poorly aligned with the constraints of many healthcare settings. This presents a significant challenge for deploying LLM-based NLP solutions in medical contexts, where data privacy, computational resources, and domain-specific language pose unique obstacles. This study aims to develop and evaluate efficient methods for adapting smaller LLMs to healthcare-specific datasets and tasks. We seek to identify pre-training approaches that can effectively instil healthcare competency in compact LLMs under tight computational budgets, a crucial capability for responsible and sustainable deployment in local healthcare settings. We explore three specialised pre-training methods to adapt smaller LLMs to different healthcare datasets: traditional Masked Language Modelling (MLM), Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR), and a novel approach utilising metadata categories from healthcare settings. These methods are assessed across multiple healthcare datasets, with a focus on downstream document classification tasks. We evaluate the resulting LLMs through classification accuracy and analysis of the derived embedding spaces. Contrastively trained models consistently outperform other approaches on classification tasks, delivering strong performance with limited labelled data and fewer model parameter updates. While our novel metadata-based pre-training does not further improve classification performance across datasets, it yields interesting embedding cluster separability. Importantly, all domain-adapted LLMs outperform their publicly available, general-purpose base models, validating the importance of domain specialisation.
This research demonstrates the efficacy of specialised pre-training methods in adapting compact LLMs to healthcare tasks, even under resource constraints. We provide guidelines for pre-training specialised healthcare LLMs and motivate continued inquiry into contrastive objectives. Our findings underscore the potential of these approaches for aligning small LLMs with privacy-sensitive medical tasks, offering a path toward more efficient and responsible NLP deployment in healthcare settings. This work contributes to the broader goal of making advanced NLP techniques accessible and effective in specialised domains, particularly where resource limitations and data sensitivity are significant concerns.
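
As an illustration of what a contrastive objective optimises, the following is a minimal sketch of an InfoNCE-style loss in plain Python. It is not the implementation used in the study: the function names, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration, and the exact loss used by the authors may differ. Each anchor embedding is pulled towards its paired positive, while the other positives in the batch act as negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (illustrative sketch).

    anchors[i] is expected to match positives[i]; every other
    positives[j] in the batch is treated as a negative for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Temperature-scaled similarities of this anchor to every candidate.
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the paired positive as the correct class.
        total += log_denominator - logits[i]
    return total / len(anchors)


# Hypothetical toy embeddings (assumed values, not data from the study).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive lies near its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive lies near the wrong anchor

loss_aligned = info_nce_loss(anchors, aligned)
loss_swapped = info_nce_loss(anchors, swapped)
print(loss_aligned < loss_swapped)
```

In this toy batch the loss is lower when each positive sits close to its own anchor than when the pairs are swapped, which is the behaviour a contrastive pre-training objective rewards.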

Original publication

DOI

10.1016/j.artmed.2024.103009

Type

Journal article

Journal

Artif Intell Med

Publication Date

31/10/2024

Volume

158

Keywords

Classification, Contrastive loss, Embeddings, Healthcare, LLMs