8  Computational approaches

8.1 Introduction

One of the more recent perspectives on language has viewed it as information. This treatment arose initially from the field of information theory (Shannon, 1948), which used a mathematical lens to view communication as a means of sending information from a sender to a receiver, subject to constraints on the communication system (e.g., its channel capacity or noise). More broadly, the application of computational approaches to linguistics has seen exponential growth in interest over the last half-century, from early efforts in machine translation for military intelligence applications (Hutchins, 1999) to recent sophisticated chatbots (OpenAI, 2024). Much has been written elsewhere on the history of computational linguistics and natural language processing (e.g., Johri et al., 2021; K. S. Jones, 1994; Schubert, 2020); here we focus on surveying some theoretical and philosophical issues regarding such approaches.

8.2 Why computational modelling?

Computational approaches afford a few particular advantages given the methods used to construct, fit, and employ computational models. The first is formalisation: since computational models require the operationalisation of constructs related to language, they demand an explicit, quantitative account of how language is processed and/or acquired, rather than relying on verbal theory. Such formalisation is useful because it allows for the instantiation and evaluation of proposed mechanisms of language processing and acquisition, demonstrating how these mechanisms can (or cannot) explain the observed variation in actual human language. For example, computational models have been used to explain how humans handle communication in settings with noise or errors (Gibson et al., 2013; Levy, 2008), how children acquire regular and irregular past tense forms in English (Plunkett & Juola, 1999; Rumelhart & McClelland, 1987), and how unexpected words slow down reading speed (Oh & Schuler, 2023; Wilcox et al., 2023). By investigating input–output correspondences in these computational models, linguisticians can not only validate theories of language use, but also conduct experiments that may not be possible on humans (e.g., controlled rearing studies, Christiansen & Chater, 1999), or search through a larger parameter space for optimal experiment design (Huan et al., 2024).
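The link between word predictability and reading speed is typically formalised via surprisal: the negative log-probability of a word given its context. As a minimal sketch of this quantity (the toy corpus and bigram estimator below are invented for illustration, not drawn from the cited studies):

```python
import math
from collections import Counter

# Toy corpus; real surprisal estimates come from far larger models and data.
corpus = "the dog chased the cat the cat chased the mouse".split()

# Maximum-likelihood bigram model: p(word | context) = count(context, word) / count(context)
bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])

def surprisal(context, word):
    """Surprisal of `word` after `context`, in bits: -log2 p(word | context)."""
    p = bigrams[(context, word)] / contexts[context]
    return -math.log2(p)

# "the" is followed by dog, cat, cat, mouse, so p(cat | the) = 2/4,
# giving 1 bit of surprisal; the rarer continuation "dog" yields 2 bits.
print(surprisal("the", "cat"))  # → 1.0
print(surprisal("the", "dog"))  # → 2.0
```

Under surprisal theory (Levy, 2008), higher values predict longer reading times on the corresponding word.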

Another benefit of computational approaches is the ability to handle large volumes of data. Continued advancements in corpus collection have vastly increased the amount of available language data (e.g., Common Crawl, 2025), which would be intractable to manually annotate. The use of computational models allows for the automatic processing and annotation of such data (e.g., Qi et al., 2020; Straka et al., 2016), permitting much larger-scale analyses and possibly the detection of lower-frequency constructions or phenomena with smaller effect sizes, which may not have otherwise appeared in smaller datasets (e.g., Roland et al., 2007).

A third contribution of computational methods is that they can represent the rich, high-dimensional nature of language. One significant advance is the shift towards sub-symbolic representations of language, especially distributional semantics, which suggests that word meanings can be elucidated from the contexts in which that word appears (Firth, 1957). Hence, word meanings can be represented as vectors or embeddings, which capture statistical patterns of the contexts in which the word occurs (e.g., Mikolov et al., 2013); this approach stands in stark contrast with formal symbolic theories of semantics, in which it is difficult to express a comprehensive description of meaning that can account for the entire lexicon. These distributed representations of meaning allow word meanings to be composed mathematically in arbitrary ways, and can also serve as numerical representations for other kinds of operations (including those in modern neural network models). Furthermore, embeddings appear to have properties which align with humans’ linguistic representations (Grand et al., 2022), suggesting that they do in fact capture relevant dimensions of variance in semantics. We can also probe the internal representations of language models to determine how much semantic information is accessible from purely linguistic information—for example, it is possible to read out human colour perceptions (Marjieh et al., 2024) and cyclic representations of time (Engels et al., 2024) as emergent properties of language model representations.
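To make the vector view concrete, here is a minimal sketch of similarity and composition in an embedding space. The vectors are hand-assigned toy values (real systems such as word2vec learn them from co-occurrence statistics over large corpora), so only the qualitative behaviour matters:

```python
import math

# Hypothetical three-dimensional "embeddings"; real embeddings have
# hundreds of learned dimensions rather than hand-picked values.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity, the standard measure of closeness in embedding space."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

# Composition by vector arithmetic: king - man + woman should land near queen.
target = [k - m + w for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])]
best = max(vectors, key=lambda word: cosine(vectors[word], target))
print(best)  # → queen
```

This kind of analogy-by-arithmetic is the behaviour reported for learned embeddings by Mikolov et al. (2013); the toy vectors above are merely constructed to reproduce it in miniature.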

Broadly, the quantitative nature of computational methods has enabled mechanistic, large-scale, robust, and sophisticated analyses of language that would be difficult to conduct otherwise. It is important to note that these characteristics may not apply to every computational approach—for example, modern language models are often difficult to interpret mechanistically (but see Rai et al., 2024). Nonetheless, these tools have provided us with new insights into the structure and usage of language.

8.3 The push towards language modelling

We can also approach the question of computational linguistics from the opposite angle: What makes language a good target for computational approaches? Some possible responses are clear, including the fact that language is essential for human communication, and that it is ubiquitous and thus has a large quantity of potentially available data. There are several other features that make language learning an interesting problem for computational approaches. First, it appears to be effectively universal across humans (barring developmental difficulties), and learnt early and without much explicit instruction—recall that these are the same arguments initially used to support Universal Grammar. That language is so pervasive is a good indicator that progress in machine use of language would be very useful for many applications. On the other hand, language appears to be difficult to learn and represent from a formal perspective. For example, early research into machine translation quickly revealed that it is not as straightforward as had been assumed, particularly because translation is not a simple linear, word-for-word mapping (e.g., due to hierarchical grammatical structure, differing categorisations of semantic space, and information structure); thus, early symbolic approaches were relatively limited in what they could accomplish (e.g., Weizenbaum, 1966). Hence, natural language processing has emerged as an important challenge task for computational approaches.

Progress in language modelling has often been driven by difficult aspects of language representation and usage. For example, the streamed, linear format of language contrasts with the static, single-snapshot format of vision or other modalities of data; as such, handling complex time series information is necessary for language modelling, and drove early neural network approaches for handling dynamic data, including recursive neural networks (e.g., Costa et al., 2003). Language also exhibits long-distance dependencies (whether the narrowly-defined grammatical phenomenon, or more general informational dependencies), which was one of the impetuses for the development of attentional mechanisms, such that computations involving later words can “attend” more or less to earlier words depending on relevance (Vaswani et al., 2023). More recent approaches have also emphasised the importance of multimodal grounding in semantics and natural language understanding (Radford et al., 2021), as well as the distinction between truthfulness and usefulness in language use (Ouyang et al., 2022).
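The core of such attentional mechanisms can be sketched as scaled dot-product attention: a word's representation becomes a relevance-weighted mixture of other words' representations, with weights derived from vector similarity. The following is a minimal single-query sketch with toy vectors, not the full multi-head architecture described by Vaswani et al.:

```python
import math

def softmax(xs):
    """Normalise scores into weights that sum to 1."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query.

    query: vector for the current word; keys/values: one vector per
    earlier word. Returns a relevance-weighted average of `values`.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]

# The query closely matches the first key, so the output is dominated
# by the first value vector: the model "attends" to that earlier word.
out = attend([10.0, 0.0], [[10.0, 0.0], [0.0, 10.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Because the weighting depends only on content similarity, not on distance in the string, attention handles long-distance dependencies that are awkward for strictly sequential architectures.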

Furthermore, the modelling of “language” in fact encompasses a very large range of phenomena and capacities. These phenomena include traditional topics in linguistic analyses, including grammatical parsing (e.g., Bai et al., 2023; Vinyals et al., 2015), reference resolution (e.g., Moniz et al., 2024), natural language inference (e.g., Gubelmann et al., 2024), language acquisition (e.g., Elman, 1993; Wang et al., 2023), and the distinction between formal and functional competence (e.g., Mahowald et al., 2024). Computational approaches to language have also addressed issues related to different modalities of language data, including speech recognition (e.g., Dahl et al., 2012; Radford et al., 2022) and optical character recognition (e.g., Poznanski et al., 2025), or even further afield to decoding neural representations of language (e.g., Défossez et al., 2023; Hong et al., 2024). The diversity of potential target phenomena has driven a corresponding expansion in the methods and techniques employed under the broad umbrellas of computational linguistics and natural language processing, and continues to encourage innovation in contemporary computational approaches.

8.4 Philosophical issues in computational linguistics

The computational modelling of language has always been associated with corresponding philosophical issues related to these models. Turing famously introduced the idea of the Turing test, which suggests that a machine can be considered intelligent if a human interrogator is unable to distinguish between it and another human (Turing, 1950). This test is also related to Searle’s Chinese room thought experiment (Searle, 1980), which (contra Turing) suggests that it is possible for a person in a room to follow a set of instructions for constructing appropriate responses to inputs given in Chinese, even if they do not understand Chinese themselves; on this view, the Turing test is too crude a criterion for genuine understanding. These arguments have been naturally extended to modern large language models (LLMs), which do exhibit language performance sophisticated enough to ostensibly pass some Turing tests (C. R. Jones & Bergen, 2024).

Linguisticians have taken up a very broad range of perspectives on the modern version of this debate—that is, whether LLMs can tell us anything about linguistics. Some researchers believe that they cannot, largely because the context in which LLMs learn and use language is qualitatively different from that of humans, who use different mechanisms for learning, have much less input data, and are embodied in a multisensory, social environment that drives true meaning-making (e.g., Bender et al., 2021; Bender & Koller, 2020; Bolhuis et al., 2024; Gomes, 2024; Kodner et al., 2023). Under this view, the inherent differences between human and machine learning imply that language models cannot truly serve as effective models of language learning and use. However, a key under-addressed issue is the validity of the assumptions made—for example, do models in fact require human-like learning mechanisms in order to be effective models of language? Given that modern LLMs do show relatively sophisticated language behaviour, it seems plausible to posit that even “unnatural” learning mechanisms can extract meaningful structural features of language, such that these models remain interesting artefacts for investigation, especially since they permit analyses that would not be possible with humans.

A much more bullish perspective on LLMs is that they can themselves serve as theories of language, which may even surpass traditional linguistic theories, since they provide more accurate predictions about language behaviour in humans (e.g., Baroni, 2022; Piantadosi, 2024). While LLMs do indeed have increasingly strong predictive power, they lack explanatory power, since they only provide descriptions either at a very high, abstract level (e.g., regarding phenomena), or at a very low, implementational level (e.g., regarding statistical learning), neither of which is useful in providing interpretable, analytical explanations of linguistic phenomena (see Opitz et al., 2025).

In contrast with both of these more extreme perspectives, a growing group of researchers have laid out something of a via media: language models can serve as interesting ways to probe and evaluate linguistic theories, even if they do not serve as complete theories themselves (e.g., Binz et al., 2025; Frank & Goodman, 2025; Futrell & Mahowald, 2025; Mansfield & Wilcox, 2025; Millière, 2024; Pater, 2019; Portelance & Jasbi, 2024). Two ideas are key in this regard. The first is representations: probing the internal representations of LLMs allows us to understand what kinds of representations are able to support complex language behaviour (see Tosato et al., 2024). For example, language models appear to encode hierarchical syntactic information (Rogers et al., 2020) as well as syntactic relations (Diego-Simón et al., 2024), suggesting that such representations are important for appropriate language production, as opposed to merely operating over linear positional features. Another key idea is learnability: understanding what can be acquired by language models reflects the inductive biases that may or may not be necessary for language learning in humans. A recent line of work has demonstrated that actual human languages are easier for LLMs to learn than implausible languages (e.g., with inconsistent word order; Kallini et al., 2024; Xu et al., 2025; Yang et al., 2025), refuting the supposition that language models are able to learn any arbitrary language (Moro et al., 2023), and conversely suggesting that structural regularities in the input are crucial for a language to be learnable—even for learning algorithms like statistical learning. This moderate perspective draws connections among symbolic linguistic theory, information theory, and language modelling, allowing for more multifaceted approaches toward understanding language.
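The learnability contrast can be illustrated even with a deliberately simple statistical learner. The simulation below is our own toy construction (not the LLM experiments of the cited papers): a smoothed bigram model finds an artificial language with consistent word order more predictable, in bits per word, than the same sentences with order scrambled sentence-by-sentence:

```python
import math
import random
from collections import Counter

random.seed(0)

# Invented mini-lexicon; the "consistent" language is strictly
# subject-verb-object, mimicking a stable word-order rule.
subjects = ["dogs", "cats", "birds"]
verbs = ["chase", "see", "like"]
objects = ["mice", "fish", "worms"]

consistent = [[random.choice(subjects), random.choice(verbs), random.choice(objects)]
              for _ in range(500)]
# "Impossible" counterpart: the same sentences with word order
# scrambled independently per sentence, i.e., no stable ordering rule.
inconsistent = [random.sample(s, len(s)) for s in consistent]

def avg_bits_per_word(sentences):
    """Average bigram surprisal (bits/word) with add-one smoothing."""
    bigrams, contexts = Counter(), Counter()
    vocab = {w for s in sentences for w in s} | {"<s>"}
    for s in sentences:
        prev = "<s>"
        for w in s:
            bigrams[(prev, w)] += 1
            contexts[prev] += 1
            prev = w
    total, n = 0.0, 0
    for s in sentences:
        prev = "<s>"
        for w in s:
            p = (bigrams[(prev, w)] + 1) / (contexts[prev] + len(vocab))
            total += -math.log2(p)
            n += 1
            prev = w
    return total / n

# The consistently-ordered language is cheaper to predict per word.
print(avg_bits_per_word(consistent) < avg_bits_per_word(inconsistent))  # → True
```

The effect here is of course far cruder than the results with LLMs, but it captures the underlying logic: stable structural regularities in the input are what make a language compressible, and hence learnable, for a statistical learner.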

8.5 Conclusion

Computational approaches have attracted considerable attention in recent years, and much debate continues over their application and implications for linguistics. Nonetheless, it is exciting that these approaches have permitted many analyses which were hitherto impossible, and it will be interesting to observe how this young field continues to develop and mature over time, through both technical and methodological improvements, as well as continued theoretical and philosophical discussion.