Antal van den Bosch - Example-based modeling of syntactic alternations
08 January 2013
Based on corpus data such as children's and child-directed speech data from CHILDES, or any large digital corpus of written text, computational models can be trained to predict certain choices made during speech or writing. Memory-based models are a class of computational models that store examples of alternations and use analogical or similarity-based reasoning over these stored examples to predict which choice will be made given a new, unseen input context. I will discuss experiments performed with memory-based models in two case studies, both of which are work in progress.
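As a rough illustration of the store-and-compare idea described above, the following is a minimal sketch of a memory-based classifier. It is not the setup used in the experiments: the class name `MemoryBasedClassifier`, the simple overlap distance, the three features (verb, recipient type, theme type), and the toy examples are all assumptions made up for this illustration.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two examples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


class MemoryBasedClassifier:
    """Hypothetical k-nearest-neighbour classifier over symbolic features."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # stored (features, label) examples

    def store(self, features, label):
        # "Training" is nothing more than adding the example to memory.
        self.memory.append((tuple(features), label))

    def predict(self, features):
        features = tuple(features)
        # Rank stored examples by similarity to the new context.
        ranked = sorted(
            self.memory,
            key=lambda ex: overlap_distance(features, ex[0]),
        )
        # Majority vote among the k most similar stored examples.
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Invented toy examples: (verb, recipient type, theme type) -> construction.
clf = MemoryBasedClassifier(k=3)
clf.store(("give", "pronoun", "noun"), "double_object")
clf.store(("give", "noun", "pronoun"), "prepositional")
clf.store(("send", "pronoun", "noun"), "double_object")
clf.store(("send", "noun", "pronoun"), "prepositional")
clf.store(("show", "pronoun", "noun"), "double_object")

pred_tell = clf.predict(("tell", "pronoun", "noun"))
pred_offer = clf.predict(("offer", "noun", "pronoun"))
print(pred_tell, pred_offer)
```

Neither "tell" nor "offer" occurs in memory, so both predictions rest entirely on the similarity of the remaining features to the stored examples.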
First, in joint work with Joan Bresnan, we train models on individual children's data as well as on other children's data and child-directed speech to predict new alternation choices in the English dative construction. Learning curve studies indicate that having child-directed speech in memory leads to better predictions than having only children's data, but if sufficient data are available for a single child, that child's own history of data points is also a good predictor of its next choices.
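A learning curve of the kind mentioned here can be sketched as follows: the memory is grown step by step, and accuracy on held-out examples is measured at each size. This is only an illustrative sketch under assumed conditions. The helper names `predict_1nn` and `learning_curve`, the 1-nearest-neighbour rule, and the toy data are invented for this example and do not reflect the actual experimental regimen.

```python
def overlap_distance(a, b):
    """Count the feature positions at which two examples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def predict_1nn(memory, features):
    """Return the label of the closest stored example (first one wins ties)."""
    return min(memory, key=lambda ex: overlap_distance(features, ex[0]))[1]


def learning_curve(train, test, sizes):
    """Accuracy on `test` when only the first n training examples are stored."""
    curve = []
    for n in sizes:
        memory = train[:n]
        correct = sum(predict_1nn(memory, f) == label for f, label in test)
        curve.append((n, correct / len(test)))
    return curve


# Invented toy data: (verb, recipient type, theme type) -> construction.
train = [
    (("give", "pronoun", "noun"), "double_object"),
    (("send", "pronoun", "noun"), "double_object"),
    (("give", "noun", "pronoun"), "prepositional"),
    (("send", "noun", "pronoun"), "prepositional"),
]
test = [
    (("show", "pronoun", "noun"), "double_object"),
    (("show", "noun", "pronoun"), "prepositional"),
]

curve = learning_curve(train, test, sizes=[1, 2, 3, 4])
for n, accuracy in curve:
    print(n, accuracy)
```

In this toy setting accuracy stays at 0.5 until the first prepositional example enters memory, and then rises to 1.0. Comparing such curves for different memory contents (for example, a child's own data versus child-directed speech) is one way to compare data sources.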
Second, in joint work with Stef Grondelaers and Dirk Speelman, we model the complex distribution of Dutch existential 'er' (there) in Flemish and Northern Dutch locative inversion constructions. Our data show that using only lexical features produces prediction scores for the Northern Dutch data that are on a par with those of previously tested regression models containing abstract linguistic features. The fact that the Flemish distribution of 'er' cannot be modelled exclusively on the basis of lexical input reveals deep-rooted differences between two language varieties which seem to be no further apart than British and American English.
Generalizing over these case studies, I discuss issues in comparing lexical versus abstract linguistic features and issues in experimental regimens for testing computational models.
Last modified: 10 February 2021 14:56