We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
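As a minimal sketch of what investigating a trend quantitatively could look like, the snippet below computes the relative frequency of one word per year in a toy corpus. The function name `yearly_relative_frequency`, the whitespace tokenization, and the two sample documents are assumptions made for illustration; they are not the corpus or pipeline described in the abstract.

```python
from collections import Counter

def yearly_relative_frequency(docs, target):
    """Return {year: occurrences of target / total tokens in that year}.

    docs is an iterable of (year, text) pairs. Tokenization is a plain
    lowercase whitespace split, which is an illustrative simplification.
    """
    hits = Counter()
    totals = Counter()
    for year, text in docs:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += sum(1 for token in tokens if token == target)
    # Skip years with no tokens to avoid dividing by zero.
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

# Hypothetical toy data, not drawn from the real corpus.
toy_corpus = [
    (1900, "the telephone rang and the telephone rang again"),
    (1950, "the radio played while the telephone sat silent"),
]
freqs = yearly_relative_frequency(toy_corpus, "telephone")
```

Normalizing by the total number of tokens per year keeps years with many printed words comparable to years with few.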
Speakers often do not state requests directly but employ innuendos such as Would you like to see my etchings? Though such indirectness seems puzzlingly inefficient, it can be explained by a theory of the strategic speaker, who seeks plausible deniability when he or she is uncertain of whether the hearer is cooperative or antagonistic. A paradigm case is bribing a policeman who may be corrupt or honest: A veiled bribe may be accepted by the former and ignored by the latter. Everyday social interactions can have a similar payoff structure (with emotional rather than legal penalties) whenever a request is implicitly forbidden by the relational model holding between speaker and hearer (e.g., bribing an honest maitre d’, where the reciprocity of the bribe clashes with his authority). Even when a hearer’s willingness is known, indirect speech offers higher-order plausible deniability by preempting certainty, gossip, and common knowledge of the request. In supporting experiments, participants judged the intentions and reactions of characters in scenarios that involved fraught requests varying in politeness and directness.
Words, grammar, and phonology are linguistically distinct, yet their neural substrates are difficult to distinguish in macroscopic brain regions. We investigated whether they can be separated in time and space at the circuit level using intracranial electrophysiology (ICE), namely by recording local field potentials from populations of neurons using electrodes implanted in language-related brain regions while people read words verbatim or grammatically inflected them (present/past or singular/plural). Neighboring probes within Broca’s area revealed distinct neuronal activity for lexical (~200 milliseconds), grammatical (~320 milliseconds), and phonological (~450 milliseconds) processing, identically for nouns and verbs, in a region activated in the same patients and task in functional magnetic resonance imaging. This suggests that a linguistic processing sequence predicted on computational grounds is implemented in the brain in fine-grained spatiotemporally patterned activity.
Why do compounds containing regular plurals, such as rats-infested, sound so much worse than corresponding compounds containing irregular plurals, such as mice-infested? Berent and Pinker (2007) reported five experiments showing that this theoretically important effect hinges on the morphological structure of the plurals, not their phonological properties, as had been claimed by Haskell, MacDonald, and Seidenberg (2003). In this note we reply to a critique by these authors. We show that the connectionist model they invoke to explain the data has nothing to do with compounding but exploits fortuitous properties of adjectives, and that our experimental results disconfirm explicit predictions the authors had made. We also present new analyses which answer the authors’ methodological objections. We conclude that the interaction of compounding with regularity is a robust effect, unconfounded with phonology or semantics.
When people speak, they often insinuate their intent indirectly rather than stating it as a bald proposition. Examples include sexual come-ons, veiled threats, polite requests, and concealed bribes. We propose a three-part theory of indirect speech, based on the idea that human communication involves a mixture of cooperation and conflict. First, indirect requests allow for plausible deniability, in which a cooperative listener can accept the request, but an uncooperative one cannot react adversarially to it. This intuition is supported by a game-theoretic model that predicts the costs and benefits to a speaker of direct and indirect requests. Second, language has two functions: to convey information and to negotiate the type of relationship holding between speaker and hearer (in particular, dominance, communality, or reciprocity). The emotional costs of a mismatch in the assumed relationship type can create a need for plausible deniability and, thereby, select for indirectness even when there are no tangible costs. Third, people perceive language as a digital medium, which allows a sentence to generate common knowledge, to propagate a message with high fidelity, and to serve as a reference point in coordination games. This feature makes an indirect request qualitatively different from a direct one even when the speaker and listener can infer each other’s intentions with high confidence.
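The plausible-deniability argument can be illustrated with a small expected-payoff calculation, taking the concealed-bribe example. This is a deliberately simplified sketch, not the model from the paper: the scenario (a speaker trying to avoid a fine), the payoff values, and the assumptions that a cooperative hearer always accepts a veiled offer while an uncooperative hearer cannot penalize it are all hypothetical.

```python
def expected_payoffs(p_cooperative, fine=-100.0, bribe=-50.0, penalty=-1000.0):
    """Expected payoff to the speaker for each strategy.

    p_cooperative: probability that the hearer accepts a bribe.
    fine:    payoff when no bribe is accepted (assumed value).
    bribe:   payoff when the bribe is accepted (assumed value).
    penalty: payoff when an uncooperative hearer punishes an overt bribe
             (assumed value).
    """
    p = p_cooperative
    return {
        # Saying nothing always results in the fine.
        "no_bribe": fine,
        # An overt bribe is accepted or punished.
        "direct": p * bribe + (1 - p) * penalty,
        # A veiled bribe is accepted or ignored, so the worst case is the fine.
        "indirect": p * bribe + (1 - p) * fine,
    }

payoffs = expected_payoffs(0.5)
best = max(payoffs, key=payoffs.get)
```

Under these assumptions the indirect request is never worse than staying silent and never worse than the direct request, which is the intuition behind deniability; relaxing the assumptions (for example, letting a cooperative hearer sometimes miss the hint) would change the comparison.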
English speakers disfavor compounds containing regular plurals compared to irregular ones. Haskell, MacDonald and Seidenberg (2003) attribute this phenomenon to the rarity of compounds containing words with the phonological properties of regular plurals. Five experiments test this proposal. Experiment 1 demonstrated that novel regular plurals (e.g., loonks-eater) are disliked in compounds compared to irregular plurals with illicit (hence less frequent) phonological patterns (e.g., leevk-eater, plural of loovk). Experiments 2–3 found that people show no dispreference for compounds containing nouns that merely sound like regular plurals (e.g., hose-installer vs. pipe-installer). Experiments 4–5 showed a robust effect of morphological regularity when phonological familiarity was controlled: Compounds containing regular plural nonwords (e.g., gleeks-hunter, plural of gleek) were disfavored relative to irregular, phonologically-identical, plurals (e.g., breex-container, plural of broox). The dispreference for regular plurals inside compounds thus hinges on the morphological distinction between irregular and regular forms and it is irreducible to phonological familiarity.
This paper proposes a new analysis of indirect speech in the framework of game theory, social psychology, and evolutionary psychology. It builds on the theory of Grice, which tries to ground indirect speech in pure rationality (the demands of efficient communication between two cooperating agents) and on the Politeness Theory of Brown and Levinson, who proposed that people cooperate not just in exchanging data but in saving face (both the speaker’s and the hearer’s). I suggest that these theories need to be supplemented because they assume that people in conversation always cooperate. A reflection on how a pair of talkers may have goals that conflict as well as coincide requires an examination of the game-theoretic logic of plausible denial, both in legal contexts, where people’s words may be held against them, and in everyday life, where the sanctions are social rather than judicial. This in turn requires a theory of the distinct kinds of relationships that make up human social life, a consideration of a new role for common knowledge in the use of indirect speech, and ultimately the paradox of rational ignorance, where we choose not to know something relevant to our interests.
The role of Broca’s area in grammatical computation is unclear, because syntactic processing is often confounded with working memory, articulation, or semantic selection. Morphological processing potentially circumvents these problems. Using event-related functional magnetic resonance imaging (fMRI), we had 18 subjects silently inflect words or read them verbatim. Subtracting the activity pattern for reading from that for inflection, which indexes processes involved in inflection (holding constant lexical processing and articulatory planning), highlighted left Brodmann area (BA) 44/45 (Broca’s area), BA 47, anterior insula, and medial supplementary motor area. Subtracting activity during zero inflection (the hawk; they walk) from that during overt inflection (the hawks; they walked), which highlights manipulation of phonological content, implicated subsets of the regions engaged by inflection as a whole. Subtracting activity during verbatim reading from activity during zero inflection (which highlights the manipulation of inflectional features) implicated distinct regions of BA 44, 47, and a premotor region (thereby tying these regions to grammatical features), but failed to implicate the insula or BA 45 (thereby tying these to articulation). These patterns were largely similar in nouns and verbs and in regular and irregular forms, suggesting these regions implement inflectional features cutting across word classes. Greater activity was observed for irregular than regular verbs in the anterior cingulate and supplementary motor area (SMA), possibly reflecting the blocking of regular or competing irregular candidates. The results confirm a role for Broca’s area in abstract grammatical processing, and are interpreted in terms of a network of regions in left prefrontal cortex (PFC) that are recruited for processing abstract morphosyntactic features and overt morphophonological content.
In my book How the Mind Works, I defended the theory that the human mind is a naturally selected system of organs of computation. Jerry Fodor claims that ‘the mind doesn’t work that way’ (in a book with that title) because (1) Turing Machines cannot duplicate humans’ ability to perform abduction (inference to the best explanation); (2) though a massively modular system could succeed at abduction, such a system is implausible on other grounds; and (3) evolution adds nothing to our understanding of the mind. In this review I show that these arguments are flawed. First, my claim that the mind is a computational system is different from the claim Fodor attacks (that the mind has the architecture of a Turing Machine); therefore the practical limitations of Turing Machines are irrelevant. Second, Fodor identifies abduction with the cumulative accomplishments of the scientific community over millennia. This is very different from the accomplishments of human common sense, so the supposed gap between human cognition and computational models may be illusory. Third, my claim about biological specialization, as seen in organ systems, is distinct from Fodor’s own notion of encapsulated modules, so the limitations of the latter are irrelevant. Fourth, Fodor’s arguments dismissing the relevance of evolution to psychology are unsound.
We examine the question of which aspects of language are uniquely human and uniquely linguistic in light of recent suggestions by Hauser, Chomsky, and Fitch that the only such aspect is syntactic recursion, the rest of language being either specific to humans but not to language (e.g. words and concepts) or not specific to humans (e.g. speech perception). We find the hypothesis problematic. It ignores the many aspects of grammar that are not recursive, such as phonology, morphology, case, agreement, and many properties of words. It is inconsistent with the anatomy and neural control of the human vocal tract. And it is weakened by experiments suggesting that speech perception cannot be reduced to primate audition, that word learning cannot be reduced to fact learning, and that at least one gene involved in speech and language was evolutionarily selected in the human lineage but is not specific to recursion. The recursion-only claim, we suggest, is motivated by Chomsky’s recent approach to syntax, the Minimalist Program, which de-emphasizes the same aspects of language. The approach, however, is sufficiently problematic that it cannot be used to support claims about evolution. We contest related arguments that language is not an adaptation, namely that it is “perfect,” non-redundant, unusable in any partial form, and badly designed for communication. The hypothesis that language is a complex adaptation for communication which evolved piecemeal avoids all these problems.
In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the “narrow language faculty”) consists only of recursion, and that this part cannot be considered an adaptation to communication. We argue that their characterization of the narrow language faculty is problematic for many reasons, including its dichotomization of cognitive capacities into those that are utterly unique and those that are identical to nonlinguistic or nonhuman capacities, omitting capacities that may have been substantially modified during human evolution. We also question their dichotomy of the current utility versus original function of a trait, which omits traits that are adaptations for current use, and their dichotomy of humans and animals, which conflates similarity due to common function and similarity due to inheritance from a recent common ancestor. We show that recursion, though absent from other animals’ communications systems, is found in visual cognition, hence cannot be the sole evolutionary development that granted language to humans. Finally, we note that despite Fitch et al.’s denial, their view of language evolution is tied to Chomsky’s conception of language itself, which identifies combinatorial productivity with a core of “narrow syntax.” An alternative conception, in which combinatoriality is spread across words and constructions, has both empirical advantages and greater evolutionary plausibility.
The distinction between singular and plural enters into linguistic phenomena such as morphology, lexical semantics, and agreement and also must interface with perceptual and conceptual systems that assess numerosity in the world. Three experiments examine the computation of semantic number for singulars and plurals from the morphological properties of visually presented words. In a Stroop-like task, Hebrew speakers were asked to determine the number of words presented on a computer screen (one or two) while ignoring their contents. People took longer to respond if the number of words was incongruent with their morphological number (e.g., they were slower to determine that one word was on the screen if it was plural, and in some conditions, that two words were on the screen if they were singular, compared to neutral letter strings), suggesting that the extraction of number from words is automatic and yields a representation comparable to the one computed by the perceptual system. In many conditions, the effect of number congruency occurred only with plural nouns, not singulars, consistent with the suggestion from linguistics that words lacking a plural affix are not actually singular in their semantics but unmarked for number.