Automated measurements of fluency, syntax and semantics in the language of persons with primary progressive aphasia

One of the clinical manifestations of dementia is a decline in the ability to use language. The term Primary Progressive Aphasia (PPA) describes a neurodegenerative condition in which the primary, dominant symptom is a progressive language disorder. PPA is the focus of this thesis. Persons with PPA are relatively young (< 65 years old) compared with persons with dementia caused by Alzheimer's or Parkinson's disease. The young age of patients, paired with the atypical symptoms, often leads to misdiagnosis.
The central question of the thesis is to what extent software can recognize from spoken language whether someone is likely to have PPA. We collected data by recording individuals with and without PPA. We then studied the extent to which acoustic variables (speech fluency) and word-choice variables differ between the two groups, and whether these variables can be used in a machine learning model that predicts the likelihood of having PPA.
Given a language fragment, the model predicted with 90% accuracy whether it was produced by someone in the PPA group or by someone in the control group, and with 85% accuracy when the subvariants of PPA were also taken into account. The variables and techniques can be used in software applications that assist in clinical diagnosis or in monitoring the progression of the disease over time.