Collecting language, speech acoustics, and facial expression to predict psychosis and other clinical outcomes: strategies from the AMP® SCZ initiative.
Bilgrami ZR., Castro E., Agurto C., Liebenthal E., Ennis M., Baker JT., Scott I., Colton B-L., Cho KIK., Li L., Tamayo Z., Henecks M., Rahimi Eichi H., Henry T., Addington J., Alameda LK., Arango C., Breitborde NJK., Broome MR., Cadenhead KS., Calkins ME., Chen EYH., Choi J., Conus P., Cornblatt BA., Ellman LM., Fusar-Poli P., Gaspar PA., Gerber C., Glenthøj LB., Horton LE., Hui C., Kambeitz J., Kambeitz-Ilankovic L., Keshavan MS., Kim S-W., Koutsouleris N., Kwon JS., Langbein K., Mamah D., Diaz-Caneja CM., Mathalon DH., Mittal VA., Nordentoft M., Pearlson GD., Perez J., Perkins DO., Powers AR., Rogers J., Sabb FW., Schiffman J., Shah JL., Silverstein SM., Smesny S., Stone WS., Yassin W., Strauss GP., Thompson JL., Upthegrove R., Verma S., Wang J., Wolf DH., McGorry PD., Kahn RS., Kane JM., Anticevic A., Bearden CE., Dwyer D., Billah T., Bouix S., Pasternak O., Shenton ME., Woods SW., Nelson B., Accelerating Medicines Partnership® Schizophrenia (AMP® SCZ)., Cecchi GA., Corcoran CM., Wolff PM.
Speech-based detection of early psychosis is progressing at a rapid pace. Within this evolving field, the Accelerating Medicines Partnership® in Schizophrenia (AMP® SCZ) is uniquely positioned to deepen our understanding of how language and related behaviors reflect early psychosis. We begin with detailed standard operating procedures (SOPs) that govern every stage of collection. These SOPs specify how to elicit speech, capture facial expressions, and record acoustics in synchronized audio-video files, both on-site and through remote platforms. We then explain how we chose our sampling tasks, hardware, and software, and how we built streamlined pipelines for data acquisition, aggregation, and processing. Robust quality-assurance and quality-control (QA/QC) routines, along with standardized interviewer training and certification, ensure data integrity across sites. Using natural language processing parsers, large language models, and machine-learning classifiers, we analyzed Data Release 3.0 to uncover systematic grammatical markers of psychosis risk. Speakers at clinical high risk (CHR) produced more referential language but fewer adjectives, adverbs, and nouns than community controls (CC), a pattern that replicated across sampling tasks. Some effects were task-specific: CHR participants showed elevated use of complex syntactic embeddings in two elicitation conditions but not the third, underscoring the importance of the language sampling task. Together, these results demonstrate how computational linguistics can turn everyday speech into a scalable, objective biomarker, paving the way for earlier and more precise detection of psychosis. Video link: https://vimeo.com/1112291965?fl=pl&fe=sh
