Prediction of treatment dosage and duration from free-text prescriptions: an application to ADHD medications in the Swedish prescribed drug register
Zhang L., Lagerberg T., Chen Q., Ghirardi L., D'Onofrio BM., Larsson H., Viktorin A., Chang Z.
Background: Accurate estimation of daily dosage and duration of medication use is essential to pharmacoepidemiological studies using electronic healthcare databases. However, such information is not directly available in many prescription databases, including the Swedish Prescribed Drug Register.
Objective: To develop and validate an algorithm for predicting prescribed daily dosage and treatment duration from free-text prescriptions, and to apply the algorithm to ADHD medication prescriptions.
Methods: We developed an algorithm to predict daily dosage from free-text prescriptions using 8000 ADHD medication prescriptions as the training sample, and estimated treatment periods while taking into account several features, including titration, stockpiling and imperfect adherence. The algorithm was applied to all ADHD medication prescriptions from the Swedish Prescribed Drug Register in 2013. A validation sample of 1000 ADHD medication prescriptions, independent of the training sample, was used to assess the accuracy of the predicted daily dosage.
Findings: In the validation sample, the overall accuracy for predicting daily dosage was 96.8%. Specifically, the natural language processing models (NLP1 and NLP2) had accuracies of 99.2% and 96.3%, respectively. In an application to ADHD medication prescriptions in 2013, young adult ADHD medication users had the highest probability of discontinuing treatment compared with other age groups. The daily dose of methylphenidate increased substantially with age.
Conclusions: The algorithm provides a flexible approach to estimating prescribed daily dosage and treatment duration from free-text prescriptions using register data. The algorithm performed well in predicting daily dosage in external validation.
Clinical implications: The structured output of the algorithm could serve as a basis for future pharmacoepidemiological studies evaluating the utilization, effectiveness, and safety of medication use, which would facilitate evidence-based treatment decision-making.
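To make the two quantities concrete, the following is a minimal sketch of how a daily dosage might be read from a free-text instruction and turned into a days-of-supply estimate. It is not the authors' algorithm: the regular expression, the English example strings, and the function names `daily_units` and `duration_days` are assumptions made for illustration, and the sketch ignores titration, stockpiling and imperfect adherence, which the study does take into account.

```python
import re
from typing import Optional

# Hypothetical pattern for instructions such as "1 tablet 2 times daily".
# Illustrative only; these are not the rules used in the study.
_PATTERN = re.compile(
    r"(?P<units>\d+(?:\.\d+)?)\s*(?:tablets?|capsules?)\s+"
    r"(?P<times>\d+)\s*times?\s+(?:daily|a day|per day)",
    re.IGNORECASE,
)


def daily_units(text: str) -> Optional[float]:
    """Return tablets or capsules per day, or None if the text does not match."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    return float(match.group("units")) * int(match.group("times"))


def duration_days(units_dispensed: int, units_per_day: float) -> float:
    """Days of supply if the dispensed units are taken exactly as prescribed."""
    if units_per_day <= 0:
        raise ValueError("units_per_day must be positive")
    return units_dispensed / units_per_day
```

Under these assumptions, `daily_units("1 tablet 2 times daily")` gives 2 units per day, and a dispensed pack of 60 units would then cover `duration_days(60, 2.0)` = 30 days. Text that does not fit the pattern returns `None`, which is the kind of case where a learned model rather than a fixed rule would be needed.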