Catching words in a stream of speech. Computational simulations of segmenting transcribed child-directed speech

08 December 2011

PhD ceremony: Mr. C. Çöltekin, 2.30 p.m., Aula Academiegebouw, Broerstraat 5, Groningen

Dissertation: Catching words in a stream of speech. Computational simulations of segmenting transcribed child-directed speech

Promotor(s): prof. J. Nerbonne

Faculty: Arts

Segmenting continuous speech into lexical units is one of the early tasks an infant needs to tackle during language acquisition. Çağrı Çöltekin’s thesis investigates this particular problem, segmentation, by means of computational modeling and simulations.

The segmentation problem is more difficult than it may appear at first sight. Children need to find words in a continuous stream of speech, with no knowledge of words to start with. Fortunately, experimental studies reveal that children and adults use a number of cues in the input, together with simple strategies that exploit these cues, in order to segment speech. More interestingly, some of these cues are language independent, allowing a learner to segment the continuous input before knowing any words.

Two major aspects set the models presented in this thesis apart from other computational models in the literature. First, the models presented here use simple local strategies - as opposed to global optimization - that rely on cues known to be used by children, namely, predictability statistics, phonotactics and lexical stress. Second, these cues are combined using an explicit cue-combination model which can easily be extended to include more cues.
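
To make the idea of a simple local strategy based on predictability statistics concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes one particular predictability measure (the transitional probability between adjacent symbols) and one particular local rule (place a boundary where that probability is a strict local minimum). The function names and the toy corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter


def train_counts(utterances):
    """Count how often each symbol is followed by anything, and by what.

    Assumption: utterances are unsegmented strings, one symbol per phoneme.
    """
    contexts, bigrams = Counter(), Counter()
    for u in utterances:
        contexts.update(u[:-1])          # symbols that have a successor
        bigrams.update(zip(u, u[1:]))    # adjacent symbol pairs
    return contexts, bigrams


def transitional_probability(a, b, contexts, bigrams):
    """Estimate P(b | a); returns 0.0 if a was never seen with a successor."""
    return bigrams[(a, b)] / contexts[a] if contexts[a] else 0.0


def segment(utterance, contexts, bigrams):
    """Split an utterance at strict local minima of transitional probability."""
    tps = [
        transitional_probability(a, b, contexts, bigrams)
        for a, b in zip(utterance, utterance[1:])
    ]
    words, start = [], 0
    # tps[i] is the transition between utterance[i] and utterance[i + 1].
    for i in range(1, len(tps) - 1):
        if tps[i] < tps[i - 1] and tps[i] < tps[i + 1]:
            words.append(utterance[start:i + 1])
            start = i + 1
    words.append(utterance[start:])
    return words


# Toy corpus built from three made-up "words": ab, cd, ef.
corpus = ["abcd", "abef", "cdef", "efab", "cdab", "efcd"]
contexts, bigrams = train_counts(corpus)
print(segment("abcdef", contexts, bigrams))  # ['ab', 'cd', 'ef']
```

The rule only compares each transition with its two neighbours, so it needs no lexicon and no global search over possible segmentations. A combined model in the spirit described above would let several such boundary detectors, for example ones based on phonotactics or lexical stress, contribute to each boundary decision.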

The models are tested using real-world transcribed child-directed speech. The simulation results show that the performance of the individual strategies is comparable to that of state-of-the-art computational models of segmentation. Furthermore, combinations of individual cues provide a consistent increase in performance. The combined model performs on a par with the reference state-of-the-art model, while employing only mechanisms more similar to those available to humans performing the same task.

Last modified: 13 March 2020 01.13 a.m.

