The natural languages of the world are characterized by largely rigid, often highly idiosyncratic rule systems. Within any given language, it is hard to find instances of true optionality (i.e., a one-to-many mapping from meaning to form). The ExSynOp project sets out to explore this central puzzle: what drives the evolution of complex, rigid rule systems in natural languages? We do this by investigating the sources of regularization, i.e., the reduction of variation in a language. Our focus on regularization will bring new insights to one of the most debated issues in linguistics: are languages shaped predominantly by the usage patterns of adults (evident in processing and register/style choices; MacDonald 2013, Bybee 2015), or by the learning preferences and limitations of children (Clark 1987, Newport 2005, Yang 2017, Chomsky 1986)?
The ExSynOp project takes as its starting point the word order variation found in the closely related Mainland Scandinavian languages and varieties. The empirical basis will be four word order phenomena where at least one of the languages shows an apparent word order optionality and at least one is restricted to a single word order: subject shift, particle shift, object shift and long object shift. By studying acquisition (L1 child and L2 adult) and processing (in both production and comprehension) in closely related languages that differ in the presence or absence of variation for a given phenomenon, we can pin down where preferences for regular systems arise. Four questions addressing fundamental issues in the establishment of rigid grammars will be asked: (I) are there processing benefits (or costs) associated with categorical rules; (II) is the L1 learner disposed to categorical rules, or do categorical tendencies develop later; (III) is the variation within speakers conditioned by register/dialect; and (IV) how should this kind of non-meaning-related variation in word order be modelled theoretically?
The project is built on four foundations: language technology, experimental linguistics, quantitative analysis and linguistic theory. Prior work on regularization is almost exclusively based on artificial languages (e.g., Hudson Kam & Newport 2009, Smith & Wonnacott 2010, Culbertson et al. 2012). This project is the first rigorous test of predictions generated from this line of research using micro-variation in natural languages, through experiments on language acquisition and processing. In order to do this, we have designed a novel large-scale elicitation paradigm which can be used with both adults and children. This paradigm will generate a new source of data on the evolutionary roots of rule formation. In order to quickly and efficiently extract data from studies using this paradigm, we will develop software for automatic speech-text alignment. Quantitative analysis will then be carried out by a group with world-leading expertise in the analysis of linguistic variation, using information-theoretic measures. The results of our project will feed directly into linguistic theory in the form of precise models of linguistic variation. Such models have been developed for phonetic variation (Tanner et al. 2017, Tamminga 2018), but have not previously been applied to word order variation. The starting point of this project is that linguistic variation should be modelled in the same way, independent of linguistic level (Saldana et al. 2017); models from socio-phonetics will therefore be applied to syntactic variation. The project will provide, as far as we are aware, the first exact models of intra-speaker word order variation. The methods and software we develop will benefit the field beyond this project. A speech-text aligner for the Scandinavian languages is something that many linguistics departments have long needed, as it will save hundreds of hours of manual segmentation.
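To make concrete what an information-theoretic measure of regularization can look like, the sketch below computes the Shannon entropy of a speaker's word-order variant choices: a fully categorical speaker scores 0 bits, while a speaker alternating freely between two variants approaches 1 bit. This is a minimal illustration only, not the project's actual analysis pipeline; the variant labels and the example data are hypothetical.

```python
from collections import Counter
from math import log2

def variant_entropy(productions):
    """Shannon entropy (in bits) of a list of word-order variant labels.

    0.0 means the speaker is fully regular (one variant only);
    1.0 means a 50/50 split between two variants (maximal optionality).
    """
    counts = Counter(productions)
    n = len(productions)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Hypothetical particle-shift productions from two speakers:
categorical = ["V-Prt"] * 20                      # always verb before particle
variable = ["V-Prt"] * 12 + ["Prt-V"] * 8         # 60/40 alternation

print(variant_entropy(categorical))  # 0.0 bits: fully regularized
print(variant_entropy(variable))     # ~0.971 bits: near-maximal variation
```

Comparing such per-speaker (or per-condition) entropies across languages, age groups and registers is one straightforward way to quantify how far a system has regularized.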
The speech elicitation paradigm provides a new source of data with a temporal resolution comparable to self-paced reading and eye-tracking during reading, but without the need for written input or reading.