Old Japanese (592–794 AD) had a uniquely complex writing system: variant Chinese, classical Chinese, man’yōgana, and senmyōgaki. This study takes a mathematical linguistic approach, employing word length and dependency distance as metrics of the lexical and syntactic complexity of Old Japanese. We find that the distribution of dependency directions in Japanese is balanced, indicating that Japanese is neither a strongly head-initial nor a strongly head-final language. Neither an advcl relation nor a cc relation is detected, suggesting that syntactic structure in Old Japanese is simpler than that of Modern Japanese. Among all the dependency relations, 46.3 per cent were adjacent (DD = 1), realized by case, mark, and det, while nsubj, advmod, obl, and acl were long-distance relations spanning a wide range; nsubj, for example, ranged from 1 to 29. Mean dependency distance and frequency fit a power law function (y = ax^b) well. Among texts, Senmyōgaki has a relatively short mean word length, while Kojiki has the longest. The mean word length-frequency distributions of Bussokusekika and Fudoki fit the Cohen-binomial model, and that of Senmyō fits the Palm-Poisson model. The distribution of mean word lengths and their frequencies supports Zipf’s (1949) principle of least effort: shorter words tend to be used more frequently.
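The two quantitative ideas above, dependency distance and the power law y = ax^b, can be illustrated with a minimal Python sketch. It assumes the common definition of dependency distance as the absolute difference between the linear positions of a dependent and its head, and it fits the power law by least squares on log-transformed values. Both choices are assumptions made for illustration, not necessarily the procedure used in the study. The function names and the toy sentence are hypothetical and are not drawn from the Old Japanese corpus.

```python
import math


def dependency_distances(heads):
    """Return the dependency distance of every non-root token.

    heads[i] is the 1-based position of the head of token i + 1;
    0 marks the root, which has no distance (assumed convention).
    """
    return [abs(head - pos)
            for pos, head in enumerate(heads, start=1)
            if head != 0]


def mean_dependency_distance(heads):
    """Average dependency distance of one sentence (0.0 if no dependents)."""
    dists = dependency_distances(heads)
    return sum(dists) / len(dists) if dists else 0.0


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on log x and log y.

    Requires strictly positive xs and ys. Returns the pair (a, b).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Made-up five-token sentence: token 5 is the root.
toy_heads = [2, 5, 4, 5, 0]
print(dependency_distances(toy_heads))      # [1, 3, 1, 1]
print(mean_dependency_distance(toy_heads))  # 1.5
```

In this toy sentence three of the four dependents are adjacent to their heads (DD = 1), which is the kind of relation the abstract reports for case, mark, and det. To examine a power law relationship, one would tabulate how often each distance occurs in a corpus and pass the distances and their frequencies to `fit_power_law`.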
Citation: Wenchao Li (2022) Morphosyntactic Complexity in Old Japanese, European Journal of Statistics and Probability, Vol. 10, No. 2, pp. 14-28
This work by European American Journals is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License