
How do you predict a cognitive bias theoretically using dual process theory?

Disclaimer: I am a complete layman in psychology, with no education in the field whatsoever. This question was prompted by my reading Kahneman's "Thinking Fast and Slow", in which he discusses the model of System 1 and System 2 extensively.

I thought many of the cognitive biases and other psychological effects were fascinating to read about, and I really enjoyed it. But every time he brought the "two systems" into the discussion of a particular cognitive bias (such as the affect heuristic or hindsight bias), it felt like he was using them as a post hoc explanation, describing how the effect could be accounted for by certain characteristics of the two systems. It was never clear to me how you can show that the dual-process model implies the existence of the effect under discussion.

I think he did a far better job of showing how certain psychological effects follow from prospect theory. It's easy to show, for example, how prospect theory implies the endowment effect, as shown in the given link. Even though knowledge of the endowment effect existed before prospect theory, it's easy to see how the effect could have been predicted by the theory. You can even prove that it would exist, given prospect theory's value function and the right axioms.

But I'm confused about how you can do the same with dual process theory and the cognitive biases it supposedly explains. How can you use dual process theory to predict the existence of a cognitive bias? Are there examples of cognitive biases that were, or can be, explicitly predicted using DPT?

If this is just a poorly framed question--if predicting cognitive biases is not one of the purposes of the theory--then what is the purpose of dual process theory?


EDIT: I want to make a few clarifications here. My concern is neither with the predictive success of DPT (or of the other theories to which I compare it), nor with whether the things DPT can predict had already been discovered before DPT. The example I used, prospect theory and its "prediction" of the endowment effect, is both retroactive (the effect was known before the theory) and, according to the answer by Fizz, of questionable predictive verification. However, the manner in which you can use the theory to make the prediction is quite clear. If we assume the value curve given by prospect theory, and prospect theory's claim that a person's perceived value of a gain or loss is relative to their current position rather than dependent on the absolute final state, then it follows logically that an individual's perceived gain in value from acquiring something is smaller than their perceived loss in value from losing the same thing. This can be worked out in theory, with pen and paper, given the framing of the scenario and the axioms of prospect theory.
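For concreteness, here is a minimal sketch of that pen-and-paper argument in code. The value-function form and the parameters (α = 0.88, λ = 2.25) are the median estimates from Tversky and Kahneman (1992); any value curve that is concave for gains and steeper for losses yields the same qualitative conclusion.

```python
# A minimal sketch of the pen-and-paper derivation of the endowment effect
# from prospect theory's value function. Parameters are the Tversky &
# Kahneman (1992) median estimates; the specific item value (10) is arbitrary.

def value(x, alpha=0.88, lam=2.25):
    """Prospect-theory value of a gain or loss x relative to the reference point."""
    if x >= 0:
        return x ** alpha           # diminishing sensitivity to gains
    return -lam * (-x) ** alpha     # losses loom larger by the factor lam > 1

gain = value(+10)   # perceived value of acquiring the item
loss = value(-10)   # perceived value of giving the same item up

print(f"v(+10) = {gain:.2f}, v(-10) = {loss:.2f}")
# v(+10) ~ 7.59 but v(-10) ~ -17.07, so |v(-10)| > v(+10): owners demand more
# to part with an item than non-owners will pay for it -- the endowment effect.
```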

With dual process theory I want to understand:

  1. what kind of predictions it can make about human behavior
  2. how to make those predictions using the theory

Since this is seemingly a very celebrated modern theory of human behavior, I just want to know how it's used. I don't mind if every known consequence of dual process theory was discovered before the theory, nor do I mind if the method is not as rigorous as that of a physics theory.


Well, whether something is truly predicted or merely explained post hoc can be difficult to disentangle in this area. What you're asking for is for dual-process theory to predict the existence of a new type of bias, previously unobserved. And that's a tall order, because observational psychology has been around for a long time.

Probably the area in which I buy the dual-process explanations the most is belief bias: dual-process theory predicts that our intuitions (T-1 processes) bias us toward plausible conclusions (and against implausible ones), even when the reasoning (T-2 process) behind them is faulty (or, respectively, correct).

Predicting an entirely new type of bias is not the only way to test dual-process theory. One can also test it quantitatively, by taking known biases and verifying experimentally whether response timing affects the level of bias. And in the case of belief bias, that seems to be the case.
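To make that quantitative route concrete, here is a minimal sketch of how such a timing experiment can be scored, assuming the standard 2 (validity) × 2 (believability) conclusion-evaluation design used in this literature. The acceptance rates below are invented placeholders (loosely in the ballpark of classic findings), not data from any actual study.

```python
# Scoring sketch for a speeded belief-bias experiment. `acc` maps
# (validity, believability) -> proportion of conclusions accepted as valid.
# The logic index is acceptance of valid minus invalid arguments; the belief
# index is acceptance of believable minus unbelievable conclusions.

def indices(acc):
    logic = (acc[("valid", "believable")] + acc[("valid", "unbelievable")]
             - acc[("invalid", "believable")] - acc[("invalid", "unbelievable")]) / 2
    belief = (acc[("valid", "believable")] + acc[("invalid", "believable")]
              - acc[("valid", "unbelievable")] - acc[("invalid", "unbelievable")]) / 2
    return logic, belief

# Placeholder acceptance rates (not real data):
free_time = {("valid", "believable"): 0.89, ("valid", "unbelievable"): 0.56,
             ("invalid", "believable"): 0.71, ("invalid", "unbelievable"): 0.10}
rapid     = {("valid", "believable"): 0.85, ("valid", "unbelievable"): 0.40,
             ("invalid", "believable"): 0.78, ("invalid", "unbelievable"): 0.15}

for label, acc in (("free-time", free_time), ("rapid", rapid)):
    logic, belief = indices(acc)
    print(f"{label:>9}: logic = {logic:+.2f}, belief = {belief:+.2f}")
# Dual-process theory predicts exactly this pattern: under time pressure the
# logic index shrinks and the belief index grows, because the slow analytic
# (T-2) check on the intuitive response is cut short.
```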

The purpose of any scientific theory is to explain some aspect of the real world. Just because we don't continuously discover (qualitatively) new facts based on a theory doesn't invalidate it as a scientific theory.

In psychology in general, experimental data is fairly easy to come by, so evidence often leads theory. It's not like modern physics, where you predict the Higgs boson and it takes decades to discover it in an actual experiment, or you predict the string structure of the universe but it's unlikely we'll ever be able to test that.


By request, here's the introductory / "theoretical" part of the paper:

[Dual-process] theories posit two distinct processes of reasoning that compete for control of the response that participants make in reasoning tasks. Heuristic or System 1 processes are characterised as rapid, implicit, associative, and heavily contextualised, whereas analytic or System 2 processes are described as slow and sequential but capable of abstraction and generalisation. Note that a key difference is the speed of processing. The reason for this is that analytic processing is a sequential process requiring use of central working memory and is constrained by its limited capacity. By contrast, heuristic processes operate through massively parallel implicit systems that exert an unconscious influence on responding. In support of this distinction is evidence of substantial correlations between general intelligence and working memory capacity with abstract deductive reasoning but not pragmatic, belief-based reasoning (Stanovich, 2004).

While a mass of evidence has been recorded supporting the idea that there are two different processes of reasoning, more research is needed to confirm the characteristics that theorists have attributed to the cognitive systems that may underlie these observations. A relevant methodological innovation reported by Roberts and Newton (2002) is the rapid-response reasoning task. The idea is that by constraining participants to respond within a short period of time, the slower analytic process of reasoning will be differentially inhibited. For example, on the Wason selection task, there is a well established non-logical tendency, known as "matching bias" (Evans, 1998), to select cards that match the explicit content of conditional statements, regardless of the presence of negations. Roberts and Newton (2002) showed that measures of matching bias were significantly increased when a rapid response version of the selection task was compared with a free-time version. This is as dual-process theory would predict, since any influence of analytic reasoning to inhibit the bias would be suppressed by the requirement to respond quickly.

A central phenomenon in dual-process accounts of reasoning is that of "belief bias": the tendency to evaluate the validity of an argument on the basis of whether or not one agrees with the conclusion, rather than on whether or not it follows logically from the premises. The phenomenon is one of the earliest reported in the psychology of reasoning (Wilkins, 1928) but the modern study of the effect dates from the paper of Evans, Barston, and Pollard (1983), which established the phenomenon with all relevant experimental controls. Evans et al. showed that participants' evaluations of conclusions were substantially affected both by the logical validity of the arguments and by the believability of the conclusions. With the help of protocol analyses Evans et al. characterised the effect as involving a within-participant conflict between logic-based (analytic) processes and belief-based (heuristic) processes. Such a conflict is to be expected in view of contemporary dual-process theories. In support of this, it has been shown that participants high in general intelligence are more able to resist belief biases (Stanovich & West, 1997) and that logical and belief-based responding are neurologically differentiated (Goel & Dolan, 2003).

In this study, we extend the rapid-response methodology of Roberts and Newton (2002) to the study of belief bias in syllogistic reasoning. Using a conclusion-evaluation paradigm with problems similar to those of Evans et al. (1983), we compare performance under rapid-response and free-time tasks. We predict that the participants required to respond rapidly will show (a) an increased level of belief bias and (b) a reduced level of logical responding.


And regarding prospect theory, quoting from the Wikipedia page on it:

Critics from the field of psychology argued that even if Prospect Theory arose as a descriptive model, it offers no psychological explanations for the processes stated in it.

So I don't see how it is any less of a reverse-fitting of an (arguably more math-oriented) description to prior data.

Furthermore, the endowment effect is not even certain to exist, if Wikipedia is correct (there are some failed replications, and no meta-analysis that I can find). There are a lot of other possible theoretical explanations given for it there as well; in fact, there is more theory than there are experiments. Also, being much closer to economics, the effect is more likely to have mathematical models built for it.


Strengths, Limitations, and Conclusions

Overall, this research represents a novel investigation of emotion-cognition linkages framed within a differential susceptibility model, and includes several methodological strengths. First, use of a behavioral paradigm to index cognitive processing eliminated distortion due to response biases such as social desirability, which may occur when informants select responses that will be viewed favorably by others (e.g., the endorsement of positive but not negative maternal attributes). Second, emotional reactivity was assessed in response to naturally occurring events, thereby minimizing confounds associated with estimating reactions to hypothetical stressors. Finally, the administration of semi-structured diagnostic interviews provided a comprehensive and refined assessment of maternal psychopathology.

In spite of these strengths, several limitations are worth noting. First, maternal psychopathology served as a proxy for the emotional quality of caregiving experiences; it would be helpful in future research to assess specific parenting behaviors during mother-child interactions (e.g., maternal sensitivity) that may shape youths' cognitive processing. Second, the study included a relatively small sample of youth, in which only a subset of caregivers experienced diagnoses or subclinical symptoms of psychopathology. Thus, future research will need to replicate these findings in a large, ethnically diverse sample of youth as well as in samples of caregivers with diagnostic levels of psychopathology. Third, our emotional reactivity index reflected the experience of negative emotionality in response to stress. Although this index is consistent with the construct of difficult temperament, which is the focus of theory and research on differential susceptibility, it is unclear whether the cognitive benefits accrued to youth with high emotional reactivity resulted from non-depressed mothers' ability to react in an emotionally supportive manner when youth are stressed or whether the same youth also displayed heightened positive emotionality in response to support, thereby resulting in positive cognitive biases. Finally, this study specifically examined cognitive biases during the processing of mother-referent information, and it remains to be determined whether results generalize to youths' cognitive processing of other relationships (e.g., peers, siblings) or non-interpersonal domains (e.g., academics, health).

In sum, these findings illuminate one personal characteristic of youth that shapes emotion-cognition linkages during early adolescence, and reveal trade-offs of emotional reactivity for cognitive processing such that both enhancing and impairing effects emerge as a function of socialization environment. That is, in the context of maternal depression, youths' heightened emotional arousal and distress may impair cognition by generating a perseverative focus on negative features of the environment, including information about emotionally insensitive or unavailable caregivers. In contrast, in parenting contexts characterized by low maternal depression (and, perhaps, accompanying warmth and sensitivity), youths' emotional reactivity may enhance cognition by allowing youth to interpret caregiving interactions in a positive light. Given that negative cognitive biases represent a risk factor for depression, these findings implicate youths' emotional reactivity and maternal depression as joint targets of intervention and prevention endeavors. Overall, this research emphasizes the importance of considering integrative, developmentally sensitive perspectives of the complex interplay between emotion and cognition, which may involve mutually enhancing or impairing associations, particularly as emotion-cognition linkages pertain to the onset and maintenance of psychopathology across the lifespan.


VISUO-SPATIAL SKETCHPAD

Interest in visuo-spatial memory developed during the 1960s, when Posner & Konick (1966) showed that memory for a point on a line was well retained over a period ranging up to 30 seconds, but it was disrupted by an interpolated information-processing task, suggesting some form of active rehearsal. Dale (1973) obtained a similar result for remembering a point located in an open field. In contrast to these spatial memory tasks, Posner & Keele (1967) produced evidence suggesting a visual store lasting for only two seconds. However, their method was based on speed of processing letters, in which a visual letter code appeared to be superseded by a phonological code after two seconds. Although this could reflect the duration of the visual trace, it could equally well reflect a more slowly developing phonological code that then overrides the visual.

Visual STM

A colleague, Bill Phillips, and I decided to test this using material that would not be readily nameable. We chose 5 × 5 matrices in which approximately half the cells would be filled at random on any given trial. We tested retention over intervals ranging from 0.3 to 9 seconds, by presenting either an identical stimulus or one in which a single cell was changed, with participants making a same/different judgment. We found a steady decline over time, regardless of whether we measured performance in terms of accuracy or reaction time (Phillips & Baddeley 1971). A range of studies by Kroll et al. (1970), using articulatory suppression to disrupt the use of a name code in letter judgments, came to a similar conclusion, that the Posner and Keele result was based on switching from a visual to a phonological code, perhaps because of easier maintenance by subvocal rehearsal. Meanwhile, Phillips went on to investigate the visual memory store using matrix stimuli, demonstrating that accuracy declines systematically with number of cells to be remembered (Phillips 1974), suggesting limited visual STM capacity. It was this work that influenced our initial concept of the visuo-spatial sketchpad.

Spatial STM

The most frequently used clinical test of visuo-spatial memory is the Corsi block-tapping test (Milner 1971), which is spatially based and involves sequential presentation and recall. The participant views an array of nine blocks scattered across a test board. The tester taps a sequence of blocks, and the participant attempts to imitate this. The number of blocks tapped is increased until performance breaks down, with Corsi span typically being around five, about two less than digit span. Della Sala et al. (1999), using a modified version of the Phillips matrix task, showed that visual pattern span is dissociable from spatial Corsi span, with some patients being impaired on one while the other is preserved, and vice versa. Furthermore, pattern span can be disrupted by concurrent visual processing, whereas Corsi span is more susceptible to spatial disruption (Della Sala et al. 1999). I return to the visual-spatial distinction at a later point.
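As an illustration of the adaptive procedure described above, here is a minimal sketch in which sequence length grows until imitation breaks down and span is the longest length still reproduced. The `present_and_recall` function is a hypothetical stand-in for an actual block-tapping trial, with an arbitrary placeholder success model.

```python
# A minimal sketch of the Corsi-style span procedure: increase sequence
# length until performance breaks down. The success model is a placeholder,
# not an empirical model of participants.
import random

def present_and_recall(sequence):
    """Hypothetical trial: True if the participant reproduces the sequence."""
    return random.random() < max(0.0, 1.0 - 0.3 * (len(sequence) - 4))

def corsi_span(n_blocks=9, max_len=9, trials_per_len=2):
    span = 0
    for length in range(2, max_len + 1):
        sequences = (random.sample(range(n_blocks), length)
                     for _ in range(trials_per_len))
        if not any(present_and_recall(seq) for seq in sequences):
            break                    # performance has broken down
        span = length
    return span

print("estimated span:", corsi_span())  # value depends on the placeholder model
```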

Visuo-Spatial WM

During the 1970s, research moved from visual STM to its role in visual imagery. Our own studies used a technique developed by Brooks (1968), in which participants are required to remember and repeat back a sequence of spoken sentences. In half of the cases the sentences could be encoded as a path through a visually presented matrix; the other half were not readily encodable spatially. We found that recall of the visuo-spatially codable sentences was differentially disrupted by pursuit tracking (Baddeley et al. 1975a). We interpreted this result in terms of the sketchpad, leading to the question of whether the underlying store was visual or spatial. This we tested using a task in which blindfolded participants tracked a sound source (spatial but not visual) or detected the brightening of their visual field (visual but not spatial), again while performing the Brooks task. We found that the tracking still disrupted the spatial task but did not interfere with the verbal task, whereas the brightness judgment showed a slight tendency in the opposite direction, leading us to conclude that the system was spatial rather than visual (Baddeley & Lieberman 1980).

Although these results convinced me that the system was essentially spatial, Robert Logie, who was working with me at the time, disagreed and set out to show that I was wrong. He succeeded, demonstrating that some imagery tasks were visual rather than spatial. He used a visual imagery mnemonic whereby two unrelated items are associated by forming an image of them interacting; for example, cow and chair could be remembered as a cow sitting on a chair. Logie (1986) showed that this process can be disrupted by visual stimuli such as irrelevant line drawings or indeed by simple patches of color. There are now multiple demonstrations of the dissociation of visual and spatial WM. Klauer & Zhao (2004) critically review this literature before performing a very thorough series of investigations controlling for potential artifacts; their results support the distinction between visual and spatial STM, a distinction that is also supported by neuroimaging evidence (Smith & Jonides 1997).

Yet further fractionation of the sketchpad seems likely. Research by Smyth and colleagues has suggested a kinesthetic or movement-based system used in gesture and dance (Smyth & Pendleton 1990). Another possible channel of information into the sketchpad comes from haptic coding as used in grasping and holding objects, which in turn is likely to involve a tactile component. Touch itself depends on a number of different receptor cells capable of detecting pressure, vibration, heat, cold, and pain. We currently know very little about these aspects of STM, and my assumption that information from all of these sources converges on the sketchpad is far from clearly established.

The nature of rehearsal in the sketchpad is also uncertain. Logie (1995, 2011) suggests a distinction between a “visual cache,” a temporary visual store, and a spatial manipulation and rehearsal system, the “inner scribe,” although the precise nature of visuo-spatial rehearsal remains unclear.


Language Development

Given the remarkable complexity of a language, one might expect that mastering a language would be an especially arduous task; indeed, for those of us trying to learn a second language as adults, this might seem to be true. However, young children master language very quickly and with relative ease. B. F. Skinner (1957) proposed that language is learned through reinforcement. Noam Chomsky (1965) criticized this behaviorist approach, asserting instead that the mechanisms underlying language acquisition are biologically determined. The use of language develops in the absence of formal instruction and appears to follow a very similar pattern in children from vastly different cultures and backgrounds. It would seem, therefore, that we are born with a biological predisposition to acquire a language (Chomsky, 1965; Fernández & Cairns, 2011). Moreover, it appears that there is a critical period for language acquisition, such that this proficiency at acquiring language is maximal early in life; generally, as people age, the ease with which they acquire and master new languages diminishes (Johnson & Newport, 1989; Lenneberg, 1967; Singleton, 1995).

Children begin to learn about language from a very early age (Table 1). In fact, it appears that this is occurring even before we are born. Newborns show a preference for their mother’s voice and appear to be able to discriminate between the language spoken by their mother and other languages. Babies are also attuned to the languages being used around them and show preferences for videos of faces that are moving in synchrony with the audio of spoken language versus videos that do not synchronize with the audio (Blossom & Morgan, 2006; Pickens, 1994; Spelke & Cortelyou, 1981).

Table 1. Stages of Language and Communication Development
Stage Age Developmental Language and Communication
1 0–3 months Reflexive communication
2 3–8 months Reflexive communication; interest in others
3 8–12 months Intentional communication; sociability
4 12–18 months First words
5 18–24 months Simple sentences of two words
6 2–3 years Sentences of three or more words
7 3–5 years Complex sentences; has conversations

Each language has its own set of phonemes that are used to generate morphemes, words, and so on. Babies can discriminate among the sounds that make up a language (for example, they can tell the difference between the “s” in vision and the “ss” in fission); early on, they can differentiate between the sounds of all human languages, even those that do not occur in the languages that are used in their environments. However, by the time that they are about 1 year old, they can only discriminate among those phonemes that are used in the language or languages in their environments (Jensen, 2011; Werker & Lalonde, 1988; Werker & Tees, 1984).


Newborn Communication

Figure 2. Before they develop language, infants communicate using facial expressions.

Do newborns communicate? Certainly, they do. They do not, however, communicate with the use of language. Instead, they communicate their thoughts and needs with body posture (being relaxed or still), gestures, cries, and facial expressions. A person who spends adequate time with an infant can learn which cries indicate pain and which ones indicate hunger, discomfort, or frustration.

Intentional Vocalizations

Infants begin to vocalize and repeat vocalizations within the first couple of months of life. That gurgling, musical vocalization called cooing can serve as a source of entertainment to an infant who has been laid down for a nap or seated in a carrier on a car ride. Cooing serves as practice for vocalization. It also allows the infant to hear the sound of their own voice and try to repeat sounds that are entertaining. Infants also begin to learn the pace and pause of conversation as they alternate their vocalization with that of someone else and then take their turn again when the other person’s vocalization has stopped. Cooing initially involves making vowel sounds like “oooo.” Later, as the baby moves into babbling (see below), consonants are added to vocalizations such as “nananananana.”

Babbling and Gesturing

Between 6 and 9 months, infants begin making even more elaborate vocalizations that include the sounds required for any language. Guttural sounds, clicks, consonants, and vowel sounds stand ready to equip the child with the ability to repeat whatever sounds are characteristic of the language heard. These babies repeat certain syllables (ma-ma-ma, da-da-da, ba-ba-ba), a vocalization called babbling because of the way it sounds. Eventually, these sounds will no longer be used as the infant grows more accustomed to a particular language. Deaf babies also use gestures to communicate wants, reactions, and feelings. Because gesturing seems to be easier than vocalization for some toddlers, sign language is sometimes taught to enhance one’s ability to communicate by making use of the ease of gesturing. The rhythm and pattern of language are used when deaf babies sign just as when hearing babies babble.

At around ten months of age, infants can understand more than they can say. You may have experienced this phenomenon as well if you have ever tried to learn a second language. You may have been able to follow a conversation more easily than to contribute to it.


Holophrasic Speech

Children begin using their first words at about 12 or 13 months of age and may use partial words to convey thoughts at even younger ages. These one-word expressions are referred to as holophrasic speech (holophrase). For example, the child may say “ju” for the word “juice” and use this sound when referring to a bottle. The listener must interpret the meaning of the holophrase. When the listener is someone who has spent time with the child, interpretation is not too difficult: they know that “ju” means “juice,” and that the baby wants some juice. But someone who has not been around the child will have trouble knowing what is meant. Imagine the parent who exclaims to a friend, “Ezra’s talking all the time now!” The friend hears only “ju da ga,” which, the parent explains, means “I want some juice when I go with Daddy.”

Underextension

A child who learns that a word stands for an object may initially think that the word can be used for only that particular object. Only the family’s Irish Setter is a “doggie.” This is referred to as underextension. More often, however, a child may think that a label applies to all objects that are similar to the original object. In overextension, all animals become “doggies,” for example.

First words and cultural influences

First words for English-speaking children tend to be nouns. The child labels objects such as a cup or a ball. In a verb-friendly language such as Chinese, however, children may learn more verbs. This may also be due to the different emphasis given to objects based on culture. Chinese children may be taught to notice action and relationship between objects while children from the United States may be taught to name an object and its qualities (color, texture, size, etc.). These differences can be seen when comparing interpretations of art by older students from China and the United States.

Vocabulary growth spurt

One-year-olds typically have a vocabulary of about 50 words. But by the time they become toddlers, they have a vocabulary of about 200 words and begin putting those words together in telegraphic speech (short phrases). This language growth spurt is called the naming explosion because many early words are nouns (persons, places, or things).

Two-word sentences and telegraphic speech

Words are soon combined and 18-month-old toddlers can express themselves further by using phrases such as “baby bye-bye” or “doggie pretty.” Words needed to convey messages are used, but the articles and other parts of speech necessary for grammatical correctness are not yet included. These expressions sound like a telegraph (or perhaps a better analogy today would be that they read like a text message) where unnecessary words are not used. “Give baby ball” is used rather than “Give the baby the ball.” Or a text message of “Send money now!” rather than “Dear Mother. I really need some money to take care of my expenses.” You get the idea.

Child-directed speech

Why is a horse a “horsie”? Have you ever wondered why adults tend to use “baby talk” or that sing-song type of intonation and exaggeration used when talking to children? This represents a universal tendency and is known as child-directed speech or motherese or parentese. It involves exaggerating the vowel and consonant sounds, using a high-pitched voice, and delivering the phrase with great facial expression. Why is this done? It may be in order to clearly articulate the sounds of a word so that the child can hear the sounds involved. Or it may be because when this type of speech is used, the infant pays more attention to the speaker and this sets up a pattern of interaction in which the speaker and listener are in tune with one another. When I demonstrate this in class, the students certainly pay attention and look my way. Amazing! It also works in the college classroom!


Theories of Language Development

How is language learned? Each major theory of language development emphasizes different aspects of language learning: that infants’ brains are genetically attuned to language, that infants must be taught, and that infants’ social impulses foster language learning. The first two theories of language development represent two extremes in the level of interaction required for language to occur (Berk, 2007).

Chomsky and the language acquisition device

This theory posits that infants teach themselves and that language learning is genetically programmed. The view is known as nativism and was advocated by Noam Chomsky, who suggested that infants are equipped with a neurological construct referred to as the language acquisition device (LAD), which makes infants ready for language. The LAD allows children, as their brains develop, to derive the rules of grammar quickly and effectively from the speech they hear every day. Therefore, language develops as long as the infant is exposed to it. No teaching, training, or reinforcement is required for language to develop. Instead, language learning comes from a particular gene, brain maturation, and the overall human impulse to imitate.

Skinner and reinforcement

This theory is the opposite of Chomsky’s theory because it suggests that infants need to be taught language. This idea arises from behaviorism. Learning theorist B. F. Skinner suggested that language develops through the use of reinforcement. Sounds, words, gestures, and phrases are encouraged by following the behavior with attention, words of praise, treats, or anything that increases the likelihood that the behavior will be repeated. This repetition strengthens associations, so infants learn the language faster when parents speak to them often. For example, when a baby says “ma-ma,” the mother smiles and repeats the sound while showing the baby attention. So, “ma-ma” is repeated due to this reinforcement.

Social pragmatics

Another language theory emphasizes the child’s active engagement in learning the language out of a need to communicate. Social impulses foster infant language because humans are social beings and must communicate, being dependent on each other for survival. The child seeks information, memorizes terms, imitates the speech heard from others, and learns to conceptualize using words as language is acquired. Tomasello & Herrmann (2010) argue that all human infants, as opposed to chimpanzees, seek to master words and grammar in order to join the social world. Many would argue that all three of these theories (Chomsky’s argument for nativism, conditioning, and social pragmatics) are important for fostering the acquisition of language (Berger, 2004).



1 Introduction

How should we study cognition? How can we understand how the mind works? Such questions are inextricably linked to the notion of levels of explanation for cognitive phenomena. Some researchers argue that cognitive phenomena are best studied and understood at some particular level, for example at the functional level, at the level of neural networks in the brain, or at the level of biological or cultural evolution. Such positions are sometimes associated with the conviction that the ultimate aim of cognitive science is to establish a single, comprehensive account of cognitive phenomena based on their causal mechanisms, or on a single set of principles, at some privileged level of explanation.

Other researchers disagree and maintain that for any cognitive phenomenon there is no privileged level of explanation. From this view, a full understanding of cognitive phenomena requires explanations that integrate multiple levels of explanation. For these researchers, cognitive science should pursue integration and become a genuine interdisciplinary endeavour.

A third position is that the explanatory targets of cognitive science have a much more disunified and unstable character than the targets of disciplines such as physics. Hence, the understanding of cognitive phenomena necessarily resembles a patchwork of relatively autonomous levels and approaches. These are just three positions, and there are many other opinions in between.

Intuitively, it is plausible that any cognitive phenomenon can be studied and explained at different levels. For example, if you want to explain how you can learn something new by reading this text, you might seek an explanation at some “high” level, considering psychological, social, or cultural factors that are relevant to acquiring new knowledge by reading. But you might also pursue explanations at some “lower” levels, for instance, the functional level of syntactic parsing, lexical processing, and memory update. Or you might go at “lower” levels still, and study eye movement or brain activity during reading.

But does it really make sense to talk about different levels of explanations in cognitive science? Does the notion of a level of explanation play any important role in studying and understanding cognitive phenomena? A positive answer would beget several other questions. What exactly is a level of explanation? Is there any privileged level of explanation for a given cognitive phenomenon? How are different levels of explanation in cognitive science related, or how should they be related? How can unified multi-level accounts of cognitive phenomena be effectively pursued? A negative answer would instead suggest that the notion of level of explanation is confused and does not do much epistemic work in cognitive science. Level of explanation would be a generic notion related to a number of distinct research strategies and questions, which are more precisely defined in terms of concepts such as scale, composition, complexity, and hierarchy.

  • What is a level of explanation for cognitive phenomena? Is there a privileged level or kind of explanation in cognitive science? How could we tell?
  • How do different levels of explanation fit together, or relate to one another? How should explanations at one level inform or constrain explanations at some other level?
  • Can the different approaches to the mind, brain, and culture be unified? Or is a plurality of approaches and levels of explanation a genuine feature of cognitive science? What would it take to unify or integrate different levels of explanation?
  • What is reductionism in the sciences of mind, brain, and culture? How does reductionism promote or hinder our understanding of the mind?
  • Which kinds of explanation should be more prominent in the future of cognitive science?

Such questions have always been the subject of controversy. So it is unsurprising that this is not the first topiCS issue on the theme. For example, there is another issue devoted to David Marr and “Levels of analysis in Cognitive Science” (Peebles & Cooper, 2015). However, the authors of the present volume express opinions that go far beyond the debates about Marr’s trichotomy between what is computed, how it is computed, and in which hardware it is computed.

In the remainder of this introduction, we first provide an overview of the different contributions in this special issue. Then we try an experiment: we present a dialogue between a student of cognitive science and two professors representing different views on the theme of this volume. The aim of this dialogue is to lay out, for a wider readership, a complex pattern of arguments and counterarguments concerning levels of explanation in cognitive science. We close with some general remarks and a list of recommended readings.


Discussion

Relationship between self-control/impulsivity and interference control

Participants completed Whiteside and Lynam’s (2001) subscales for three facets of impulsivity (premeditation, urgency, and perseverance) and Tangney et al.’s (2011) Brief Self-Control Scale (BSCS). Earlier reviews and analyses by Allom et al. (2016) and Duckworth and Kern (2011) reported very small correlations between self-report trait measures of self-control and objective measures of EF obtained with a variety of laboratory tasks, but did not specifically examine the nonverbal interference tasks that are the focus of the present study.

As described in more detail in the results, and as shown in Table 6, the correlations between the trait measures of impulsivity/self-control and the interference effects that presumably reflect some type of conflict resolution processing are nonsignificant. The strong and significant correlation reported by Enticott et al. (2006) between trait impulsivity and spatial Stroop interference was not significant in our data for premeditation, urgency, or perseverance (see Table 6). With the exception of Enticott et al., the cumulative evidence shows that interference effects do not predict self-reported impulsivity in everyday life. As Wolff et al. (2016) note, a persisting gap between EFs and self-control implies that adequate EF could be a necessary condition, but it is clearly not a sufficient condition for successful self-control.

Another potential cause of the disconnect may be that the laboratory tasks are very sensitive to the participant’s calibration of speed and accuracy, a skill that has little relevance to delaying gratification (urgency), planning before acting (premeditation), or having the grit to persist in the face of adversity (perseverance). Either implicitly or explicitly, the computerized EF tasks almost always encourage the participant to go as fast as possible without making more than an occasional error. The mechanisms needed to filter out competing information in the nick of time, when there is little intrinsic value associated with a “correct” response, may be different from those needed to resist actions that are affect laden and/or habitual and that carry genuine costs and benefits. Moreover, competing information in the real world does not typically appear at random; it is exquisitely tied to the onset of new task-relevant information, and the conflict need not be resolved within the first couple of hundred milliseconds of the event. In fact, any rapid suppression of responses counter to long-term goals often needs to be sustained in order to be ultimately successful.

Relationship between special experiences and interference control

Bilingualism

As shown in Table 4, the correlation between the ratio of L2/L1 proficiency and the composite measure of interference control was near zero. For this dataset, Paap et al. (2019) also reported no significant relationships between interference control and any of the following dimensions of bilingual experience: L2 proficiency, similarity of L2 to L1, age-of-acquisition of L2, percentage of time speaking L2, frequency of language switching per day, frequency of code switching, the mean number of languages used per context (e.g., at home, at work, at school, with friends, etc.), and the number of languages spoken. The results from this study are consistent with the meta-analyses described earlier (Donnelly et al., 2019; Lehtonen et al., 2018; Paap, 2019). The most straightforward conclusion is that bilingualism does not enhance inhibitory control. Paap, Johnson, and Sawi (2015, 2016) present an extended discussion of why a steady drip of significant findings occurs in the published literature, and Paap et al. (2019) conclude that bilingual language control may be encapsulated within the language-processing system and, consequently, have no beneficial effect on domain-general control.

Video game playing

In the present study, the composite interference score significantly correlated with the frequency of video game play (r = −.214), but when Raven’s scores, sex, and other factors were entered into the model, the regression coefficient for video game playing was no longer significant. Likewise, the frequency of video game play was not a predictor in the regression analyses of the individual tasks. The regression results are consistent with the results of Dye et al. (2009), showing no difference between players and nonplayers on flanker effects, and with the analyses of Unsworth et al. (2015), showing no correlation between a continuous measure of video gaming and either Simon effects or flanker effects. Of the studies reviewed in the introduction, only the training study by Hutchinson et al. is consistent with the hypothesis that video game play improves interference control, and that study was restricted to Simon effects. However, as shown in Fig. 3, frequency of video game play was not a significant predictor of Simon effects either. In summary, the present study offers little to counter what appears to be a rising tide of evidence that video game play has little or no impact on interference control as expressed in nonverbal interference tasks.

Music training

Years of music training was not a significant predictor of the composite interference scores. Neither was it a significant predictor in any of the separate stepwise analyses of interference scores. However, it was a significant predictor of Simon incongruent-trial residuals. This was the first time that the relationship between music training and Simon effects was assessed, and accordingly, no prior literature exists to support or guide an interpretation that music performance may hone interference control in the Simon task but not produce benefits on other nonverbal interference tasks. Consistent with the expectations laid out in the introduction, the current results provide no compelling evidence that music training or performance enhances inhibitory control to the extent that this hypothesis can be confirmed across a set of nonverbal interference tasks.

Mindfulness/meditation

The effects of mindfulness/meditation in our data are very inconsistent. The bivariate correlation between frequency of meditation and the composite interference scores was near zero (r = +0.05), as was the beta coefficient for the regression analysis on the composite interference scores (β = +0.07). However, significant positive beta coefficients were found for the meditation/mindfulness predictor in both the stepwise analysis of spatial Stroop interference scores (β = +0.14) and the stepwise analysis of spatial Stroop residuals (β = +0.07). These positive regression coefficients are, of course, the opposite of what one would predict if mindfulness/meditation led to smaller interference scores and faster incongruent trials. The reliability of these positive regression coefficients in the analysis of the spatial Stroop is further questioned by the finding that the bootstrapped 95% CIs for both regression coefficients included zero. In contrast, in the analysis of the incongruent RT residuals for the Simon task, the beta for the mindfulness/meditation predictor was significant and in the expected negative direction (β = −0.06). However, it was not a significant predictor in either the stepwise or LASSO regressions on Simon interference scores, which reduces the impact of the positive outcome in the regression on the Simon incongruent-RT residuals.
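For readers unfamiliar with the bootstrap check referenced above, here is a minimal sketch of the idea: resample participants with replacement, refit the regression, and ask whether the 95% CI for a coefficient excludes zero. The variables `X`, `y`, and the predictor index are hypothetical stand-ins, not the study's actual data.

```python
# A minimal sketch of a bootstrapped 95% CI for a regression coefficient.
import numpy as np

def bootstrap_beta_ci(X, y, coef_index, n_boot=5000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])        # add an intercept column
    betas = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample participants
        beta, *_ = np.linalg.lstsq(Xd[idx], y[idx], rcond=None)
        betas[b] = beta[coef_index + 1]          # +1 skips the intercept
    return np.percentile(betas, [2.5, 97.5])

# Usage: lo, hi = bootstrap_beta_ci(X, y, coef_index=0)
# If lo <= 0 <= hi, the coefficient's sign is not trustworthy -- the situation
# described above for the spatial Stroop mindfulness/meditation betas.
```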

Recall that many training studies did not show significant facilitation and that most of the cross-sectional comparisons of meditators to non-meditators showed no group differences. We offer the following conjecture regarding why this pattern occurs in studies of mindfulness/meditation. Potential effects of bilingualism, music performance, or playing video games on nonverbal interference tasks are clear cases of far transfer in the sense that, for example, musicians are not practicing music when they are doing a flanker task; meditators, however, may be in a meditative state. This seems more probable when the last session of training culminates with the post-test of the interference task. Whether intentional or not, if a meditative state continues into the post-test, all types of cognitive control may be enhanced. Posner (2018) has recently reported that connectivity in the anterior cingulate cortex is improved following 2 to 4 weeks of meditation training and that the increase in frontal theta following meditation training might be the cause of improved connectivity. A critical question is whether improved connectivity is relatively durable and facilitates any processing employing those networks, or whether meditation induces temporary states that must be reinstated to produce benefit.

Team-sports ability

Team-sports ability was self-rated using this item, originally developed by Paap and Greenberg (2013): “Team sports often involve dividing your attention between a ball, a goal, your opponents, and your teammates. Do you excel at these sports?” Team-sports ability has the third-highest zero-order correlation with the composite interference scores (r = −0.19), and its beta coefficient was significant in the analysis of Simon interference effects (β = −0.19). However, it did not enter the final stepwise model for any of the other tasks, or for any of the tasks in the regression analyses of incongruent RT residuals.

In regression analyses similar to those used in the present study, Paap and Greenberg reported significant beta coefficients in their Study 3 for separate analyses of flanker effects and switching costs, but not for Simon effects. A further complication to the interpretation of the relationship between sports ability and inhibitory control is that males rated their sports ability higher than females did, and as reported above, these nonverbal interference tasks often produce male advantages.

A possible relationship between team-sports ability and interference control may be surprising to those familiar with contemporary theories in sports psychology, given their emphasis on the role of deliberate practice in the automatization of skilled sport performance (e.g., Ericsson, Charness, Feltovich, & Hoffman, 2006). However, Toner and Moran (2014) have advocated for more research on the role of controlled processing, and Furley and Wood (2016) review evidence that working memory capacity is often associated with better performance in team sports. The study most related to the type of interference control that is the focus of the present investigation is that of Vestberg, Gustafson, Maurex, Ingvar, and Petrovic (2012), who tested soccer players with different levels of advanced skills using the D-KEFS test battery of executive functions (Homack, Lee, & Riccio, 2005). The design fluency component requires participants to update working memory and use inhibition in order not to repeat previous responses. Also included were a color-word Stroop test and the Trail-Making Test. Players from the highest Swedish national soccer leagues outperformed players from the lower division on all of these measures of EF. Furthermore, the EF test scores obtained in the fall of 2007 were used to predict a performance measure combining goals and assists over a 17-month interval in 2008 and 2009. The correlation (r = 0.54, p = .006) was statistically significant and noteworthy in magnitude. These results are consistent with the interpretation that EF contributes to team-sports ability, even at very high levels of skill.

Physical exercise

Individuals with superior team-sports ability are also likely to be fit, and in the present study, the frequency of exercise, working out, and participation in team sports notably did not predict the composite interference scores or the outcome measure in any of the task-specific regression analyses. Furthermore, these small correlations are positive, rather than negative, indicating that individuals reporting higher levels of physical exercise were actually trending toward larger interference effects.

Socioeconomic status

In several large-scale studies (Paap et al., 2017; Paap & Greenberg, 2013; Paap & Sawi, 2014), the correlations between parents’ educational levels and a variety of EF measures were always nonsignificant and often near zero. The participants in each case were university students. In the present study, the proxies for SES were extended to include family income. Neither the composite measure of SES nor the separate factors predicted the composite interference scores. Studies using children often report effects of SES on EF. For example, Calvo and Bialystok (2014) tested six-year-old children and reported main effects for both bilingualism and SES on the flanker and Stroop effects. A possible explanation for why the relationship is consistently weak and nonsignificant in our studies is that the lower-SES students in our college population either had enriching early experiences despite their parents’ education and income or have otherwise managed to compensate for disadvantages in early childhood.

The conundrum of sex, sports, gF, and their relationship to interference control

Males had smaller interference scores in the composite measure and in the individual regression analyses of the spatial and vertical Stroop tasks. Although sex was confounded with Raven’s scores, the same male advantage was observed when the 52 males were matched on Raven’s scores to 52 females. This evidence for sex differences in interference control should be interpreted cautiously, but two recent studies using spatial Stroop tasks similar to ours also reported statistically significant male advantages in the form of smaller interference effects. Stoet (2016) tested 236 males and 182 females in an online study and reported interference scores of 42 ms for males and 29 ms for females. Evans and Hampson (2015) tested 90 males and 86 females; estimating from their Fig. 4, the interference effects were approximately 60 ms and 40 ms, respectively. For purposes of comparing across studies, a separate two-way ANOVA on our spatial Stroop RT data yielded a significant Sex × Congruency interaction (F(1, 199) = 14.92, p < .001, partial η² = .070). The interference effect for males was 70 ms compared to 96 ms for females. The overall spatial Stroop effects in our study are atypically large. This is not too surprising, as only 25% of the trials were incongruent compared to the usual 50–50 balance. An even more extreme bias was used by Christakou et al. (2009), with only 11.5% incongruent trials, and it led to even larger spatial Stroop effects, namely, 110 ms for males and 129 ms for females. That male advantage was not statistically significant, but the study was underpowered, with only 38 males and 25 females. When incongruent trials are rare, a strategy of relying entirely on reactive mechanisms may be induced. Further pursuit of the sex effect in the spatial Stroop task, with a systematic manipulation of the proportion of incongruent trials and a determination of whether the male advantage is nested primarily in a preference for reactive over proactive inhibition, may be worthwhile.
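For readers unfamiliar with the analysis behind that interaction, here is a minimal sketch. With two groups and a within-participant congruency factor, the Sex × Congruency interaction is equivalent to an independent-samples t-test on each participant's interference score (mean incongruent RT minus mean congruent RT). The arrays below are simulated around the reported means, not the study's data.

```python
# A minimal sketch of the Sex x Congruency interaction as a t-test on
# per-participant interference scores. Simulated placeholder data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
male_interference = rng.normal(loc=70, scale=40, size=52)    # ms, simulated
female_interference = rng.normal(loc=96, scale=40, size=52)  # ms, simulated

t, p = stats.ttest_ind(male_interference, female_interference)
print(f"t = {t:.2f}, p = {p:.4f}")   # a male advantage appears as a negative t
```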

Lynn and Irwing (2004) suggest that the male advantage in the Raven’s test may be nested in the spatial-visualization ability in hierarchical factor models like Carroll’s (1993). In contrast to Raven’s, the ability to manipulate visual-spatial representations may play little role in interference tasks that require decisions about a single stimulus (e.g., spatial Stroop, vertical Stroop, and Simon) that remains in view until a response is made. Although quite speculative, this provides one explanation for why matching on Raven’s scores does not reduce or eliminate the male advantage in interference control.

The Raven’s test was developed to assess an individual’s abstract reasoning ability without relying on declarative knowledge or the influence of language, education, or cultural factors (Carpenter, Just, & Shell, 1990; Raven, 1939). As reviewed by Lynn and Irwing (2004), many experts judge it one of the best tests of gF as defined by Cattell (1971), because it taps the abilities to discriminate relations, reason abstractly, solve novel problems, and adapt to new situations. Paap and Sawi (2014) note that EF should be related to gF because the components of EF (monitoring, updating, switching, and inhibiting) logically serve successful reasoning, problem solving, and adapting, whereas high-quality reasoning seems to require more than the sum of the parts of EF. However, the degree to which EF and gF are actually separate constructs has been questioned, if not challenged, by Salthouse (Salthouse, Atkinson, & Berish, 2003; Salthouse, Pink, & Tucker-Drob, 2008), who showed that multiple measures of gF were strongly related to several measures of EF and that performance on classic EF tasks will sometimes load on the gF factor rather than the EF factor when allowed to do so. Salthouse (2010) observes, in a somewhat dispiriting manner, that if gF encompasses a broad spectrum of controlled processing, then investigators working from different research traditions may be giving different names to the same dimension of individual differences. That said, the intimate relationship between EF and gF appears less promiscuous for the inhibiting function of EF than for updating (Salthouse et al., 2003, Tables 9 and 10). This would be consistent with a working hypothesis that the interference effects measured in the present study and Raven’s scores share some dimensions of individual differences but are separable constructs.

Recall that in the present study males outperformed females on the Raven’s test. Setting aside the omnipresent possibility of a Type 1 error, the difference could be due to a bias favoring higher-gF males in our student population, or it could reflect a genuine difference in the general population of young adults. Although the presence of sex differences in the Raven’s test remains controversial, Lynn and Irwing’s (2004) meta-analysis of 57 studies showed a statistically significant male advantage emerging at the age of 15 (0.10d) that grew to 0.33d among young adults aged 20–29 and remained stable through old age. Their meta-analysis had two notable strengths: (1) it avoided apples-and-oranges comparisons by including only versions of the Raven’s test and excluding other intelligence tests, and (2) it included only general-population studies with samples of at least 50 males and 50 females.

Limitations

Although four different nonverbal interference tasks were used, varying in S-S compatibility and in whether conflict arose from distractors versus a task-irrelevant dimension of the imperative stimulus, some results might differ if the proportion of incongruent trials encouraged greater reliance on proactive inhibition. Likewise, some of our background variables relied on a single item; future research might focus on developing scales for these predictors with desirable psychometric properties. The complete absence of significant relationships between interference scores and measures of self-control and impulsivity may be attributed, in part, to the reliance on self-reports, which depend on memory and are subject to various types of bias.

An optimist’s conclusions

The interference scores from the four nonverbal interference tasks have adequate split-half reliabilities and three (i.e., Simon, spatial Stroop, and vertical Stroop) cohered into a latent variable that may reflect the ability to resolve conflict between two dimensions of a single stimulus (namely, identity and location). This latent variable, expressed as a standardized composite of each task’s interference scores, is significantly related to sex and gF in that males and individuals with higher intelligence are better at resolving this type of conflict. The male advantage is sustained in a subset of males and females that are matched on Raven’s scores. Years of musical experience did not predict the composite interference scores but was associated with the magnitude of the Simon effect in incongruent RT residuals. As the Simon task is a pure S-R task (see Fig. 1), it may be more sensitive to a form of conflict resolution common to music performance, although we have no reason to believe that music performance is richer in S-R incompatibilities compared to S-S. Future research could test this hypothesis. Likewise, frequency of mindfulness/meditation did not predict the composite interference scores, but its regression coefficient was significant in predicting both Simon and spatial-Stroop effects. In the previous research (see Table 1), the relationship between mindfulness/meditation and interference control appears more consistent in the training studies than in studies comparing meditators to non-meditators. Thus, the possibility that mindfulness/meditation enhances interference control remains a plausible hypothesis but may be more robust following training. Finally, a surprising disconnect exists between the composite measure of interference control and self-ratings of impulsivity and control in everyday life.

A pessimist’s conclusions

The problem with the conclusions offered by optimists is that they are often influenced by a confirmation bias for reporting positive effects and a penchant for seeing any positive findings as a roadmap to future research that might eventually validate the constructs of interest, albeit with a more complicated theory than initially envisioned. But if the constructs do not exist or are markedly different, then the roadmap is a blind alley that prevents self-correction. Therefore, a pessimist might offer a different conclusion.

Four common nonverbal interference tasks that are typically assumed to measure inhibitory control did not all load on a common latent variable. The three tasks that did form a latent variable were not the ones one would expect on the basis of Kornblum’s taxonomy (see Paap et al., 2019). Prior to the present study, no latent variable analysis had been able to extract a latent variable that includes the interference scores from two or more nonverbal interference tasks. When prior studies do succeed in extracting a latent variable that includes a single nonverbal interference score, it loads weakly and is dominated by a different measure—often the antisaccade task (Rey-Mermet et al., 2018). In the same vein, Friedman and Miyake (2016) could not extract an inhibition factor that was separable from updating and shifting.

The formation of a latent variable for three of our tasks could be an artifact of the stimulus and response similarities across the tasks. Rey-Mermet et al. (2018) recommended, and followed, the practice of deliberately introducing differences in the stimulus displays and response modes of tasks selected to load on the same latent variable. As Friedman and Miyake (2016) noted, task impurity seems to be an unavoidable property of EF tasks like the nonverbal interference tasks. By definition, EFs involve controlling lower-level processes, so any inhibitory control task must include nonexecutive processes that can influence performance in addition to the EF of interest. One method for removing the influence of unreliability and task impurity is latent variable analysis. For present purposes, the important characteristic is that latent variables capture only the common variance across multiple measures; this common variance cannot include random measurement error and will not include non-EF variance to the extent that tasks are selected to have different lower-level processes. The perceptual encoding, response selection, and response execution processes in the present study are, unfortunately, very similar and could well explain the significant but small intertask correlations.
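To illustrate the logic of extracting common variance, the toy sketch below fits a single-factor model to simulated interference scores using scikit-learn’s FactorAnalysis and inspects the loadings. It is only a schematic stand-in: latent variable analyses of this kind are typically run with dedicated SEM software, and the loadings and noise levels here are invented.

```python
# Toy illustration: extract one common factor from three interference scores.
# Simulated data, not the study's; real latent variable analyses typically use SEM.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 200
ability = rng.normal(size=n)  # hypothetical shared conflict-resolution factor
tasks = {}
for name, loading in [("simon", 0.6), ("spatial_stroop", 0.5), ("vertical_stroop", 0.55)]:
    # observed score = shared factor + task-specific (non-EF) variance and error
    tasks[name] = loading * ability + rng.normal(scale=0.8, size=n)

X = np.column_stack(list(tasks.values()))
fa = FactorAnalysis(n_components=1).fit(X)
print(dict(zip(tasks, fa.components_.ravel())))  # estimated loadings on the common factor
```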

In the regression analyses, when a set of 11 predictors hypothesized to be related to inhibitory control was entered in a stepwise regression on the composite interference scores, only sex and Raven’s score entered the model. When the same stepwise regression was conducted on the interference scores from each individual task, Raven’s score was the only predictor that was significant for all four tasks. Sex was included in the model for two of the tasks, with music training, mindfulness/meditation, and team sports each included in only one model. Two of these predictors in the bootstrapped analysis of individual tasks had 95% CIs that included zero and are likely to be unreliable in future tests. The three methods (stepwise regression on interference scores, hierarchical regression on incongruent-trial RT, and LASSO) intended to provide converging evidence each identified a predictor that the other two did not: music was selected in the analysis of incongruent-trial RT residuals (Simon task), team sports was selected by the stepwise regression of the interference scores (Simon task), and team sports was selected by the LASSO regression (composite of three tasks). The only solid relationship is that the Simon, spatial Stroop, and vertical Stroop effects decrease as Raven’s scores increase. Taking at face value that Raven’s taps gF abilities rather than skills, this suggests that interference control in these generic nonverbal tasks is, at the individual-differences level, influenced more by heritability than by experience (see Paap, 2018b, for a discussion of the possible role of heritability in EF).
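As a concrete illustration of the LASSO step, the sketch below fits a cross-validated LASSO to simulated data and reports which coefficients survive shrinkage. The predictor names are hypothetical placeholders for a subset of the 11 predictors, and the code illustrates the method rather than reproducing the authors’ analysis.

```python
# Sketch: LASSO with a cross-validated penalty to select predictors of a
# composite interference score. Data are simulated; predictor names are hypothetical.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 150
predictors = ["sex", "ravens", "music_years", "mindfulness", "team_sports",
              "video_games", "bilingualism"]  # placeholder subset of the 11
X = rng.normal(size=(n, len(predictors)))
y = 0.4 * X[:, 1] + rng.normal(scale=1.0, size=n)  # only Raven's truly predicts

X_std = StandardScaler().fit_transform(X)          # LASSO needs comparable scales
lasso = LassoCV(cv=5, random_state=0).fit(X_std, y)
selected = {p: round(c, 3) for p, c in zip(predictors, lasso.coef_) if c != 0.0}
print("alpha:", round(lasso.alpha_, 3), "| nonzero coefficients:", selected)
```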

The possibility of a causal relationship between EF and gF is important, as illustrated by Engle, Kane, and colleagues’ theory that EF/EA drives both gF and WMC. But the only nonverbal interference task typically included in their EA battery is the flanker task, and the flanker effect always loaded weakly on the EF/EA latent variable. A related but different issue was raised by Chuderski et al. (2012), who reported that latent variables for both inhibition and interference did not account for any meaningful portion of gF variance, because the simple correlations were completely mediated by the storage-capacity latent variable. The coup de grâce for the claim that inhibitory control is related to gF may be the Rey-Mermet et al. (2019) finding that a coherent latent variable for EF could not be established despite good reliabilities for all measures. Furthermore, WMC and gF—modeled as separate but correlated factors—were unrelated to the individual measures of EF, which included modified versions of both the arrow flanker and Simon tasks. In summary, inhibitory control is probably task-specific, not domain-general, and not causally related to gF. At best, subsets of nonverbal interference tasks may exist that share more specific mechanisms of conflict resolution. Going forward, we should stop using the flanker, Simon, and spatial Stroop tasks.

Another major purpose was to further evaluate the relationship between trait measures of self-control or impulsivity and measures of inhibitory control that are commonly used in cognitive psychology laboratories. Although the array of nonverbal interference tasks used in the present study was different from most of the cognitive control tasks surveyed by Duckworth and Kern (2011), our results sustain their conclusion that trait-like measures of self-control and interference control measured in RT tasks are not measuring the same thing. The differences in temporal dynamics and motivation may contribute to this dissociation. In any event, one should not interpret interference scores as “inhibitory control,” “self-control,” or “impulsivity” without converging evidence supporting such a generalization.


Appendix E. Invariance Testing for Multi-Group Confirmatory Factor Analysis (CFA) Modeling

Although we previously established that the latent constructs were adequately represented by their observed indicators, the multi-group CFA models allowed us to test whether the means and factor loadings were similar or dissimilar across the two groups. To do so, we specified three models: a configural model, a metric model, and a scalar model. In the configural model, the observed variable means and the factor loadings are allowed to vary; essentially, we are testing whether the factor structure is equivalent across boys and girls. In the metric model, we imposed equality constraints on the factor loadings; however, the means were still allowed to vary across the two groups. Finally, in the scalar model, we constrained both the means and the factor loadings to equality across groups. Invariance of both means and factor loadings is necessary to meaningfully compare the two groups. Given that we included a higher-order factor in our CFA models, we first tested invariance for the first-order factors and then followed up with invariance testing for the higher-order factor separately.

The configural model (Model MGa1) provided an excellent fit to the data (see Table 4), indicating the factor structure was appropriately represented for the first-order factors for boys and girls. Following this, we tested for metric invariance. Because the configural model was nested within the metric model (Model MGa2), we used chi-square difference testing to determine whether adding equality constraints led to a significant worsening of model fit. When compared to the configural model, imposing equality constraints on the factor loadings did not result in a significantly worse-fitting model, Δχ²(8) = 14.442, p = .071, suggesting the factor loadings were similar across the two groups. Thus, the more parsimonious metric model was retained. Comparing the metric model to the scalar model (Model MGa3), imposing additional equality constraints on the means across the two groups did not result in significantly worse fit, Δχ²(8) = 12.985, p = .112, so the scalar model was the most parsimonious and preferred model. Thus, we established that the mean structure and factor loadings were equivalent for the first-order factors across boys and girls.
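The chi-square difference test itself is easy to verify: the difference between the nested models’ chi-square statistics is referred to a chi-square distribution with degrees of freedom equal to the difference in the models’ degrees of freedom. A minimal check of the reported values using SciPy (a verification sketch, not the modeling code):

```python
# Verify a chi-square difference test for nested models (e.g., configural vs. metric).
from scipy.stats import chi2

def chi2_difference_p(delta_chi2: float, delta_df: int) -> float:
    """p-value for the change in chi-square between two nested models."""
    return chi2.sf(delta_chi2, delta_df)

print(round(chi2_difference_p(14.442, 8), 3))  # configural vs. metric: ~0.071
print(round(chi2_difference_p(12.985, 8), 3))  # metric vs. scalar:     ~0.112
```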

Because we were able to establish invariance for the first-order factors, we tested for metric invariance for the second-order factor (i.e., whether the loadings for the subtraction and addition factors were similar across the two groups). For the multi-group models, we fixed the unstandardized path from the computation factor to one of the first-order factors to 1, given that standardization is inappropriate in multi-group analyses (see Kline, 2011). First, we specified a configural model for the factor structure of the second-order factor only (MGb1). Following this, we constrained the factor loadings to equality across boys and girls (MGb2). A comparison of this model to the configural model resulted in a nonsignificant chi-square difference, Δχ²(2) = 1.363, p = .506. Scalar invariance was tested by constraining the latent factor means to equality (i.e., fixed to zero across groups; Model MGb3). This did not lead to a significant degradation in model fit compared to the metric model, Δχ²(7) = 10.960, p = .140. Because we established invariance for the full CFA model, we proceeded to test the equivalence of the estimates for the best-fitting SEMs for boys and girls (across Models B5 and G5).


Introduction

Cognitive training has become increasingly popular (see Strobach and Karbach 2020, for a review) as the life expectancy of the elderly population rises, and with it the risk of cognitive and functional decline. Moreover, the cognitive demands for academic and occupational success are increasing with each generation. The main promise of cognitive training interventions is to induce lasting performance gains in cognitive domains that go beyond the practiced task and are relevant for daily functioning. Training-induced changes are thought to be triggered by a prolonged mismatch between situational demands and the range of functions and performance an individual’s cognitive system is able to support (Lövdén et al. 2010). This mismatch fosters adaptive structural brain changes (e.g., neurogenesis, synaptogenesis, long-term potentiation) that effectively increase the possible range of cognitive performance to meet the altered environmental demands. Although the results of many training studies are promising, there is high variability across studies and individuals in such training-induced plastic changes (e.g., Katz et al. 2016, for a review), and even meta-analyses on the topic reach conflicting conclusions (e.g., Kassai et al. 2019; Melby-Lervåg and Hulme 2016 vs. Au et al. 2016; Karbach and Verhaeghen 2014; Nguyen et al. 2019). On the one hand, this reflects large differences between studies in terms of training type, training features, and target population (see Fig. 1, panel 3), but it also highlights large inter-individual differences in performance gains.

Fig. 1 (caption): Tentative model of a study design. We suggest including neuroimaging pre- and post-training, as well as at follow-up, to be able to understand the mechanisms leading to cognitive training gain, generalization to unrelated tasks, and maintenance of these effects over longer periods of time.

In recent years, these individual differences in training-induced cognitive performance gains have attracted considerable scientific interest (e.g., Bürki et al. 2014; Karbach et al. 2017; Lövdén et al. 2012). According to the supply-demand mismatch model (Lövdén et al. 2010), the extent to which mismatch drives plastic changes depends on the current state of flexibility of the cognitive system. For instance, if environmental demands greatly exceed the existing functional capacity—as when asking a 4-year-old to maintain 7 digits in working memory—the impetus for change will be reduced. When inter-individual variation is high, averaging across participants can be misleading (e.g., Moreau and Corballis 2018). It is now clear that “one-fits-all” solutions are not working for cognitive training, and it is time to move towards individualized training programs (Colzato and Hommel 2016; Karbach and Unger 2014; Kliegel and Bürki 2012). To do so, we need to (1) determine which inter-individual differences lead to the variation in training-related outcomes and (2) understand the mechanisms leading to training gain and transfer.

In an attempt to identify individual characteristics that might influence the success of a training regimen, previous studies have focused on age, sex, education, baseline cognitive performance, intelligence, personality, and motivation (e.g., Katz et al. 2016, for a review; see Fig. 1, panel 2). The results of these studies remain inconclusive. For instance, most studies agree that baseline cognitive performance is associated with training-related changes, but there is no agreement on the direction of this relation. Some reports found greater training-induced gains in individuals with higher baseline performance (Foster et al. 2017; Wiemers et al. 2019), whereas others concluded that individuals with low baseline performance benefit more because they have more room to improve (Jaeggi et al. 2011; Zinke et al. 2014). There is an increasing trend to combine basic demographic, psychometric, and behavioral measures with magnetic resonance imaging (MRI)–based measures of brain morphology and function to resolve these inconsistencies. The rationale is that brain markers are reliable indicators of the current functional organismic capacity, i.e., the possible range of cognitive performance. Neural predictors can be rather specific (e.g., hippocampal subfield volume) or more general (e.g., whole-brain functional connectivity patterns), depending on the complexity of the cognitive functions they are believed to support. Moreover, direct assessment of training-induced changes in brain structure and function has advanced the understanding of the mechanisms underlying cognitive performance increments. In this article, we give an overview of findings on brain structural and functional predictors of cognitive improvement, as well as training-related brain changes. We discuss implications for future training research and address existing practical challenges.
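One methodological reason the direction of the baseline effect is contested is regression to the mean: regressing raw gain (post minus pre) on a noisy baseline produces a spurious negative slope even when true gains are uniform, whereas regressing post-test on pre-test avoids part of that bias. The simulation below illustrates the artifact; it is a pedagogical sketch, not an analysis from any of the cited studies.

```python
# Illustration: raw gain (post - pre) regressed on a noisy pre-test shows a
# spurious negative relation even when every participant gains the same amount.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 300
true_ability = rng.normal(size=n)
pre = true_ability + rng.normal(scale=0.5, size=n)          # noisy baseline
post = true_ability + 0.5 + rng.normal(scale=0.5, size=n)   # uniform training gain

post_model = LinearRegression().fit(pre.reshape(-1, 1), post)
gain_model = LinearRegression().fit(pre.reshape(-1, 1), post - pre)
print("post-on-pre slope:", round(post_model.coef_[0], 2))          # ~0.8
print("gain-on-pre slope (biased):", round(gain_model.coef_[0], 2))  # ~-0.2
```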


Real-time prediction of short-timescale fluctuations in cognitive workload

Human operators often experience large fluctuations in cognitive workload over timescales of seconds, which can lead to sub-optimal performance ranging from overload to neglect. Adaptive automation could potentially address this issue, but to do so it needs to be aware of real-time changes in operators’ spare cognitive capacity, so that it can provide help in times of peak demand and take advantage of troughs to elicit operator engagement. However, it is unclear whether rapid changes in task demands are reflected in similarly rapid fluctuations in spare capacity, and if so, which aspects of responses to those demands are predictive of the current level of spare capacity. We used the ISO-standard detection response task (DRT) to measure cognitive workload approximately every 4 s in a demanding task requiring monitoring and refueling of a fleet of simulated unmanned aerial vehicles (UAVs). We showed that the DRT provided a valid measure that can detect differences in workload due to changes in the number of UAVs. We used cross-validation to assess whether measures related to task performance immediately preceding the DRT could predict detection performance as a proxy for cognitive workload. Although the simple occurrence of task events had weak predictive ability, composite measures that tapped operators’ situational awareness with respect to fuel levels were much more effective. We conclude that cognitive workload does vary rapidly as a function of recent task events, and that real-time predictive models of operators’ cognitive workload provide a potential avenue for automation to adapt without an ongoing need for intrusive workload measurements.
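As a sketch of the cross-validation approach described here, one can fit a classifier that predicts each DRT response (hit vs. miss) from features of the task state immediately preceding the probe and score it out of sample. The features and data below are simulated placeholders, not the study’s measures, so this outlines the method only.

```python
# Sketch: cross-validated prediction of DRT detection from recent task events.
# Features and data are simulated placeholders, not the study's measures.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 500
X = np.column_stack([
    rng.poisson(1.0, n),     # hypothetical: task events in the preceding 4 s
    rng.uniform(0, 1, n),    # hypothetical: minimum UAV fuel level
    rng.integers(1, 4, n),   # hypothetical: number of UAVs being monitored
])
# Simulate misses becoming more likely when fuel is low and load is high:
logit = -1.0 + 2.0 * X[:, 1] - 0.5 * X[:, 2]
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)  # 1 = DRT hit

scores = cross_val_score(LogisticRegression(), X, y, cv=5, scoring="roc_auc")
print("out-of-sample AUC per fold:", np.round(scores, 2))
```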


3 A dialogue on the right level of explanation to understand the mind

This dialogue allegedly took place at one of the Annual Meetings of the Cognitive Science Society, and it has three protagonists: Alex, a student of cognitive science, and two well-known professors, Professor B from the University of Brain City and Professor M from the University of Mind City. Alex just presented a poster for the first time at an international conference, received very good feedback on the poster, and started to think…

“I’m now seriously considering applying for a PhD program in cognitive science. There are two professors over there right now who work at different universities, both of which have an excellent reputation for their cognitive science programs. But I’ve also heard that the two programs differ in many ways. Most of the professors at the University of Brain City use brain imaging and techniques from genetics to try to understand the biological foundations of cognition. The program at the University of Mind City relies mainly on methods from cognitive psychology, artificial intelligence, linguistics, and anthropology to understand cognition beyond the level of biology. So I may as well introduce myself to Professors B and M and ask them about their two programs and their ‘philosophy.’”

“Hi, I am Alex!”—I started the conversation. “I’m interested in your opinion, if you have a moment. I’d love to apply for a PhD position in cognitive science, and I’m considering applying to the programs at your universities.”

The two professors were happy to tell me about their programs. They initially asked me if I liked any particular talk at the conference. After thinking about it for a moment, I replied: “Did you hear this talk on understanding human cognition from the ground up at the level of genes? I very much enjoyed it. The speaker argued that genetic factors will become a more prominent level of explanation in the cognitive sciences. What do you two think about that?”

Professor B, who works at the University of Brain City and is well known as an advocate of biological approaches to understanding the mind, said: “No, I didn’t hear that presentation.”

Professor M, from the University of Mind City, did not attend the talk either, but asked me what its main claim was, and why I liked it.

I stammered a bit, but then said: “I think the main claim was that genetic variation can help predict cognitive performance in a wide swathe of tasks. Certain genes would predict, for example, which people are best at solving different spatial problems, or who is at risk of developing specific spatial disorders.”

Professor M, a well-known critic of cognitive scientists trying to understand how the mind works only by looking in the brain or genes, asked me right away: “But did the speaker say anything substantial about the cognitive mechanisms of spatial problem-solving, or of any other cognitive ability?”

Before I could answer, Professor B intervened: “Well, you’re asking too much. At this stage of research, it’s unrealistic to fully explain how something like spatial cognition works from the ‘bottom-up’—all the way from individual genes to knowledge structures for spatial navigation.”

“But that should be the goal, right?”—I asked. Professor M did not hesitate and argued that human reasoning and problem-solving rely on cognitive representations: “I don’t think that we can, or should, reduce phenomena that appear on the cognitive level all the way down to the level of genes and molecules.”

I was a bit disappointed, and wondered: “But why not? I think it would be exciting if we could understand the genetic components of cognitive capacities.”

“I’m sceptical about how much that helps when we want to understand how spatial thinking works and why it works the way it does. Results from cognitive research also allow us to predict human performance, for instance, when people get lost or draw false inferences in reasoning. Predicting behavior is also an important goal, but I’ve never seen it done on the basis of brain imaging or DNA sequencing.”

“But wouldn’t evidence from biological levels also be relevant to confirm or disconfirm competing cognitive explanations?”

“I agree with you that the levels of genes, molecules, and neural pathways are relevant sources of evidence about the correct explanatory model of spatial thinking.”

Professor M looked at Professor B and said: “But most explanations of how genes affect human cognition are currently too unconstrained to give us testable predictions. And the existing empirical evidence is too weak.”

“I think there’s a risk of conflating explanatory issues with issues of evidential relevance here”—said Professor B —“One issue is: ‘What is the relevant evidence to evaluate an explanation?’ A different issue is: ‘To what level does the explanation belong?’ Consider, for instance, alternative models that explain spatial reasoning in terms of representations, processes, and resources. These models belong to the cognitive level. But this doesn’t mean that evidence from neuroscience and genetics cannot be relevant to evaluate them.

“You know, I have just read an article in which the researchers examined 18 candidate genes for depression, genes that have often been studied for their relevance to depression phenotypes. No evidence was found that any of these candidate genes was associated with depression; indeed, they were no more strongly associated with depression phenotypes than noncandidate genes. I was disappointed.”

“Don’t worry, there’s still a lot more work to do before we can answer questions about the genetic pathways and mechanisms underlying cognitive capacities and mental illnesses, but it’s an exciting time. We have many new experimental tools we can use to examine and test cognitive models at a micro-level. Our lab at the University of Brain City just bought state-of-the-art technologies for neurostimulation and imaging. I once collaborated with a geneticist on a project on intelligence and conditional reasoning, and the geneticist was much less reductionist than many of our colleagues in cognitive science, including my colleague here, seem to believe.”

“And how did it work?” I asked.

Professor B answered: “The main challenge was to find a common vocabulary for talking about the same things. It took some time, but it eventually worked.”

Professor M replied: “I’ve heard several talks from genetic psychology, and I must say that they were always disappointing. The effect sizes were often extremely small, if there were any effects at all. Typically, there was no interpretable pattern of results in these studies. The only thing most of these studies indicate is that the people who do this research are strong believers in biology as the key to understanding cognition.”

I was finding this discussion entertaining. Professor B said sharply: “Not sure that’s fair. Those studies show, at the very least, that we should clearly define what phenomenon we want to explain before we can say anything sensible about the most adequate level of its causal structure at which testing should be conducted and understanding pursued.”

A question was already revolving in my mind: “Professor B, you mentioned your collaboration with a geneticist on conditional reasoning. But aren’t the norms of what we consider rational defined by social agreement? It seems strange to use the vocabulary of biology to explore accurate or fallacious conditional reasoning. Or am I wrong?”

“No, you’re not wrong, in a way. But I think many colleagues would agree that cognitive theories should at least be consistent with relevant neural evidence; otherwise they are wrong.”

“But why shouldn’t it be the other way around?” I asked.

Professor M smiled: “Good point. I actually think that a theory at the neural level must be consistent with what we know at the cognitive level; otherwise it is wrong. We know much more at the cognitive level than at the neural level, and most of the terms that cognitive neuroscientists use are cognitive terms. Imagine if neuroscientists just talked about synapses, dendrites, neurotransmitters, action potentials, and so forth. It would be rather boring, don’t you think?”

“Well, sorry, but this is a bad argument,” Professor B protested. “You know that brain research doesn’t boil down to cognitive neuroscience. It includes important—and, if you ask me, exciting—fields like neuroanatomy, neurochemistry, and biophysics. These fields can do just fine without cognitive terms. And their results are relevant to understanding cognitive capacities anyway.”

“Okay, brain research might be exciting, just like physics and many other areas of the sciences and humanities”—said Professor M. “But I disagree”—M continued—“regarding its importance for cognitive science. Reductionist approaches can tell us little, if anything, about the nature of cognitive phenomena.”

“I am sorry”—I said—“but what exactly do you mean by reductionism? You have mentioned this term already, but I don’t know what it means.” Professor B explained that reductionism is a term that broadly refers to the view that we can fully understand cognitive phenomena by studying their underlying physical or biological component structures, like genes and neurons.

But then I asked: “Do cognitive scientists really believe that any cognitive phenomenon, spatial navigation or conditional reasoning for example, can be adequately explained at a single level? My sense is that most think molecular biology and its techniques are just one method, and should not serve as the only, or the best, model for gaining understanding in cognitive science.”

“You’re right,” Professor B replied. “But, just to tell you about another ‘ism,’ the view you have just described may be better called monism. That’s the opinion that there is only one correct level of explanation or one correct method targeting the correct level of explanation for a phenomenon.”

“I am sorry again. But what do you mean by ‘level’?”

Professor B explained: “When I use the term ‘level’ in my work, I simply mean ‘some scientific domain of interest.’ Nothing deep.”

“Does that mean that you think there’s no ‘right’ level for explaining any cognitive phenomenon?” Professor M asked the colleague.

“You know what”— B admitted—“I really don’t have a rigorous justification for what makes a level ‘right’. I don’t think we need one. What the ‘right level’ is depends on the specific context of research.”

I was surprised because Professor M agreed: “It makes little sense to talk about a ‘right level’ of explanation in a vacuum, without taking into account our background knowledge about the phenomenon we want to understand, relevant theoretical and practical interests, and so on. When I use the term ‘level,’ I have in mind certain properties of a mechanism, like scale, granularity, or hierarchical composition.”

Once again I had to ask: “What do you mean by that?”

“I mean,” Professor M answered, “that higher levels are composed of things at lower levels. Explanations of cognitive phenomena at lower levels would refer to smaller entities, or to faster interactions between entities. Explanations that refer to axons are at a lower level than explanations that refer to, say, cultural processes, which rely on spatially distributed social networks.”

Professor B clarified that the kind of hierarchical composition Professor M was talking about is not a hierarchical model of scientific disciplines or theories, where each level corresponds to a theory, and fundamental physics is at the bottom level. I nodded, and Professor B continued to explain: “And it is important to see that scale is also related to the notion of complexity. Think of consciousness. Many scholars claim that conscious experience emerges from interactions among brain components at a micro-scale. Thus, even if we perfectly knew how brains are organized and wired, we may not be able to predict the more complex phenomena emerging at a macro-scale.”

Professor M suddenly interjected: “Right, assume that Professor B here finds a 1-to-1 mapping between two particular mental and physical states, at least in one individual. This discovery would not be of much interest, because scientists are interested in general principles, not just explanations of single observations. So we want to find identities between kinds of mental states and kinds of neural states in the brain. Yet finding such identities is impossible, because any given type of mental state can be realized by many distinct types of brain states, both across and within individuals. And any given brain state can implement many different mental states.”

“That’s too quick,” objected Professor B. “The idea that mental states are multiply realized by different types of physical states is less prevalent in scientific practice. The physical differences between humans come with functional differences, and this means that it is really hard to identify a genuine case of multiple realization, in which two different physical structures realize the same psychological capacity but in very different ways.”

Professor M said: “That is really a strong statement. When you and I think about the concept of ‘love,’ does that rely on the same neural processes in our brains? Okay, we might not have the same concept of love, but even if we did, it would not be represented in the same way in our brains, because the concept is acquired by learning, and what has been learned is mapped onto different neural structures.”

Professor B cringed, and turned to me: “How do you understand the term ‘level’?”

The first thing that came to mind was David Marr’s three-level framework for analysing cognitive systems, which I had studied in different courses in my Master’s.

B said: “If anything, Marr’s three-level framework stands for one of the fundamental convictions of the cognitive sciences, namely that the mind consists in some sort of information processing.”

“Why?”—I asked—“Is there any other way to characterize what cognitive science is about?”

Professor M answered: “Well, an increasing number of people say there is. Many who subscribe to dynamical systems theory criticize the information-processing paradigm. They say that cognitive systems should be studied in terms of their dynamics and interactions with the environment, instead of computations over representations. The idea that the mind is an information-processing system has become something like an unquestioned axiom of our discipline. Over the last decades, ‘information’ has developed into a vague and overused term.”

I was a little confused at this stage, as I had assumed that cognitive scientists do not have to choose between dynamical systems theory on the one hand and computation and information on the other. That seemed a false dichotomy to me.

Professor B clarified: “It’s important to remember, though, that not all information consists in symbolic, language-like representations. For example, at the representational level, we assume the states of a calculator represent specific numbers; we ascribe meaningful representational content to its different states. At the syntactic level, we individuate operations on meaningless numerals; at this level, strictly speaking, the calculator doesn’t represent numbers and doesn’t perform arithmetical operations over numbers.”

Growing impatient, Professor M asked: “What’s your point, B?”

“My point is that to be a good calculator, it’s not necessary that the calculator employ symbolic representations in a way that involves comprehension of arithmetic. I think the debate between representationalists and anti-representationalists often neglects this point, confusing representational ascriptions to a system with ascriptions of comprehension.”

I interrupted and said: “The pocket calculator example reminds me of what I learned in my classes about the question of whether a thermostat is a cognitive system. I think it isn’t. Whenever the sensor detects a certain temperature, it turns the heating on or off. It performs this function without any flexibility: a certain input always leads to a certain output. But that is not how cognitive systems behave. They respond flexibly to input from the environment and make predictions. Don’t you think?”

Professor M agreed and said that this is the topic of one of the popular courses in the program at the University of Mind City.

Still a little confused, I asked for clarification: “Many cognitive scientists, also at this conference, seem to offer explanations in terms of representations. Does that mean the level of representations is particularly important to explain cognitive phenomena?”

Professor M replied confidently: “Genuinely cognitive phenomena can be adequately explained only by considering this level. For instance, when we want to do what the philosopher Daniel Dennett calls taking an intentional stance toward a cognitive system’s behavior, representations are essential for studying cognitive systems. We know from the history of psychological research that explanations that do not account for internal representations and processes don’t have much power. The cognitive turn showed that the assumption of intermediate internal processes can explain how the human mind works in a much more powerful way than representation-free approaches. Chomsky showed that for language, Bandura and Bruner for learning, Atkinson and Shiffrin for memory; the list is endless.”

“But”—I insisted—“do you think that all adequate explanations of cognitive phenomena should be pitched at the level of algorithms and representations? It seems to me that capacities like perception and motor control may be better explained in terms of a representation-free, dynamic, and distributed interaction between organisms and their environment.”

Professor B said: “Our goal, at least the ‘research philosophy’ of our program at the University of Brain City, is not to explain the less fundamental, whatever that is, in terms of the more fundamental. Our goal is to integrate different explanations, both at a level and across levels of organization and functionality. From this perspective, different researchers in different sub-disciplines in our department contribute different causal, constitutive, and contextual constraints to mechanistic or biophysical explanations of cognitive phenomena.”

Professor M pointed out: “Mechanistic integration and unification in cognitive science are more easily asserted than achieved. Have you seen the program of this conference? It includes sessions on ‘memory encoding,’ ‘explanation,’ ‘perception,’ ‘creativity,’ ‘word learning,’ ‘neural dynamics,’ ‘rationality,’ and so on. Within each session, different researchers use different theoretical frameworks, different methodological protocols, and different datasets to study what they assume to be the same kinds of phenomena or capacities. In fact, our field is often explicitly named in the plural, as the cognitive sciences. What makes it, or should make it, an integrated, cohesive science is the goal to explain cognition at the algorithmic level. At least, that’s the ‘research philosophy’ of our program at the University of Mind City. More generally, we aim to understand the mind at the personal level.”

I was starting to get a better idea of the programs at the Universities of Brain City and Mind City but, once again, I had a question: “What’s the personal level?”

Professor B explained: “The distinction between the levels of sub-personal and personal explanations is a distinction between the explanation of cognitive phenomena in terms of concepts that refer to components of a system—for instance, in terms of brain states or circuits—and in terms of concepts that refer to mental states of a whole system, for instance in terms of a person’s thoughts, motivations, and emotions.”

Professor M added: “Another way to cash out the distinction is in terms of causal-mechanistic explanation and reason-based explanation. Sometimes we need explanations that are grounded in people’s beliefs, goals, motives, and values. But we also sometimes need mechanistic explanations that view the human mind as driven, not by reasons, but by sub-personal causal factors.”

“So, if I understand the distinction correctly, we may explain why a person is reading a book on cognition by saying that this person is a student of psychology, has an exam next week, and wants to pass the exam. This would be a reason-based explanation. However, if she has forgotten some of the content and thus cannot recall it during the exam, this calls for a mechanistic explanation of how human memory works.”

Professor M said: “Well stated. Learning is essential. By the way, we should not forget that behaviorism flourished primarily in the US. European psychology was largely unaffected by behaviorism and was always more open to mental concepts.”

I was surprised and said, “Oh, I didn’t know that. Very interesting."

Professor M added: “Yes, and the link between science and the values of a society is also important when we talk about levels of explanation. Today, many people in our society think of biological explanations as describing something unchangeable and hardwired. But this is a fatal, although quite common, misunderstanding.”

Professor B asked: “Are you talking about epigenetics?”

Professor M responded: “Not only. Indeed, epigenetics shows that genes can be switched on or off in particular environmental conditions. Nevertheless, biological approaches are often used to justify injustice and the unfair distribution of goods and chances in society. We should emphasize much more convincingly to the wider public that learning fundamentally shapes our brains, and that environmental and cultural conditions have an enormous impact on the functioning of our brains and genes.”

Professor B nodded and said: “Yes, that’s right, and there are good reasons to say that culture has largely replaced biology as the major driving force in human evolution. The famous example is the selection for lactose tolerance in groups with dairy traditions. And heritable diseases can persist and spread if they occur in families with social power. But this is not the point here. I agree we should think very carefully about the consequences that explanations pitched at a certain level can have for our society.”

Now it suddenly became loud, because a session had ended and everyone rushed for coffee. Professor B said, “Oh, there is my husband, I should leave now.” Professor M said, “Oh, and my husband, too.” They both thanked me for approaching them and wished me all the best with my applications. It was a nice conversation with these two professors, and it helped me get a better sense of the programs at their universities and their approaches to explaining the mind. I am sure I will encounter several of the ideas and arguments we discussed in my future studies. Now I just have to be admitted to a PhD program in cognitive science, maybe one that combines the two lines of thinking I have learned about.


1 Introduction

How should we study cognition? How can we understand how the mind works? Such questions are inextricably linked to the notion of levels of explanation for cognitive phenomena. Some researchers argue that cognitive phenomena are best studied and understood at some particular level, for example at the functional level, at the level of neural networks in the brain, or at the level of biological or cultural evolution. Such positions are sometimes associated with the conviction that the ultimate aim of cognitive science is to establish a single, comprehensive account of cognitive phenomena based on their causal mechanisms, or on a single set of principles, at some privileged level of explanation.

Other researchers disagree and maintain that for any cognitive phenomenon there is no privileged level of explanation. From this view, a full understanding of cognitive phenomena requires explanations that integrate multiple levels of explanation. For these researchers, cognitive science should pursue integration and become a genuine interdisciplinary endeavour.

A third position is that the explanatory targets of cognitive science have a much more disunified and unstable character than the targets of disciplines such as physics. Hence, the understanding of cognitive phenomena necessarily resembles a patchwork of relatively autonomous levels and approaches. These are just three positions, and there are many other opinions in between.

Intuitively, it is plausible that any cognitive phenomenon can be studied and explained at different levels. For example, if you want to explain how you can learn something new by reading this text, you might seek an explanation at some “high” level, considering psychological, social, or cultural factors that are relevant to acquiring new knowledge by reading. But you might also pursue explanations at some “lower” level, for instance, the functional level of syntactic parsing, lexical processing, and memory updating. Or you might go “lower” still, and study eye movements or brain activity during reading.

But does it really make sense to talk about different levels of explanation in cognitive science? Does the notion of a level of explanation play any important role in studying and understanding cognitive phenomena? A positive answer would beget several other questions. What exactly is a level of explanation? Is there any privileged level of explanation for a given cognitive phenomenon? How are different levels of explanation in cognitive science related, or how should they be related? How can unified multi-level accounts of cognitive phenomena be effectively pursued? A negative answer would instead suggest that the notion of a level of explanation is confused and does not do much epistemic work in cognitive science. Level of explanation would then be a generic notion related to a number of distinct research strategies and questions, which are more precisely defined in terms of concepts such as scale, composition, complexity, and hierarchy. The contributions to this issue engage with questions such as the following:

  • What is a level of explanation for cognitive phenomena? Is there a privileged level or kind of explanation in cognitive science? How could we tell?
  • How do different levels of explanation fit together, or relate to one another? How should explanations at one level inform or constrain explanations at some other level?
  • Can the different approaches to the mind, brain, and culture be unified? Or is a plurality of approaches and levels of explanation a genuine feature of cognitive science? What would it take to unify or integrate different levels of explanation?
  • What is reductionism in the sciences of mind, brain, and culture? How does reductionism promote or hinder our understanding of the mind?
  • Which kinds of explanation should be more prominent in the future of cognitive science?

Such questions have always been the subject of controversy, so it is unsurprising that this is not the first topiCS issue on the theme. For example, another issue was devoted to David Marr and “Levels of Analysis in Cognitive Science” (Peebles & Cooper, 2015). However, the authors of the present volume express opinions that go far beyond the debates about Marr’s trichotomy between what is computed, how it is computed, and in which hardware it is computed.

In the remainder of this introduction, we first provide an overview of the different contributions to this special issue. Then we try an experiment: we present a dialogue between a student of cognitive science and two professors representing different views on the theme of this volume. The aim of this dialogue is to lay out, for a wider readership, a complex pattern of arguments and counterarguments concerning levels of explanation in cognitive science. We close with some general remarks and a list of recommended readings.


Real-time prediction of short-timescale fluctuations in cognitive workload

Human operators often experience large fluctuations in cognitive workload over seconds timescales that can lead to sub-optimal performance, ranging from overload to neglect. Adaptive automation could potentially address this issue, but to do so it needs to be aware of real-time changes in operators’ spare cognitive capacity, so it can provide help in times of peak demand and take advantage of troughs to elicit operator engagement. However, it is unclear whether rapid changes in task demands are reflected in similarly rapid fluctuations in spare capacity, and if so what aspects of responses to those demands are predictive of the current level of spare capacity. We used the ISO standard detection response task (DRT) to measure cognitive workload approximately every 4 s in a demanding task requiring monitoring and refueling of a fleet of simulated unmanned aerial vehicles (UAVs). We showed that the DRT provided a valid measure that can detect differences in workload due to changes in the number of UAVs. We used cross-validation to assess whether measures related to task performance immediately preceding the DRT could predict detection performance as a proxy for cognitive workload. Although the simple occurrence of task events had weak predictive ability, composite measures that tapped operators’ situational awareness with respect to fuel levels were much more effective. We conclude that cognitive workload does vary rapidly as a function of recent task events, and that real-time predictive models of operators’ cognitive workload provide a potential avenue for automation to adapt without an ongoing need for intrusive workload measurements.


3 A dialogue on the right level of explanation to understand the mind

This dialogue allegedly took place at one of the Annual Meetings of the Cognitive Science Society, and it has three protagonists: Alex, a student of cognitive science, and two well-known professors, Professor B from the University of Brain City and Professor M from the University of Mind City. Alex just presented a poster for the first time at an international conference, received very good feedback on the poster, and started to think…

“I’m now seriously considering to apply for a PhD program in cognitive science. There are two professors over there right now, who work at different universities, which both have an excellent reputation for their cognitive science programs. But I’ve also heard that the two programs are different in many ways. Most of the professors at the University of Brain City use brain imaging and techniques from genetics to try to understand the biological foundations of cognition. The program at the University of Mind City relies mainly on methods from cognitive psychology, artificial intelligence, linguistics, and anthropology to understand cognition beyond the level of biology. So I may well introduce myself to Professors B and M and ask them about their two programs and their ‘philosophy.’”

“Hi, I am Alex!”—I started the conversation. “I’m interested in your opinion, if you have a moment. I’d love to apply for a PhD position in cognitive science, and I’m considering applying to the programs at your universities.”

The two professors were happy to tell me about their programs. They initially asked me if I liked any particular talk at the conference. After thinking about it for a moment, I replied: “Did you hear this talk on understanding human cognition from the ground up at the level of genes? I very much enjoyed it. The speaker argued that genetic factors will become a more prominent level of explanation in the cognitive sciences. What do you two think about that?”

Professor B, who is working at the University of Brain City and well known as an advocate of biological approaches to understand the mind said: “No, I didn’t hear that presentation.”

Professor M, from the University of Mind City, did not visit the talk either, but asked me what the main claim of the talk was, and why I liked it.

I stammered a bit, but then said: “I think, the main claim was that genetic variation can help to predict cognitive performance in a wide swathe of tasks. Certain genes would predict, for example, which people are the best at solving different spatial problems, or who is at risk to develop specific spatial disorders.”

Professor M, a well-known critic of cognitive scientists trying to understand how the mind works only by looking in the brain or genes, asked me right away: “But did the speaker say anything substantial about the cognitive mechanisms of spatial problem-solving, or of any other cognitive ability?”

Before I could answer, Professor B intervened: “Well, you’re asking too much. At this stage of research, it’s unrealistic to fully explain how something like spatial cognition works from the ‘bottom-up’—all the way from individual genes to knowledge structures for spatial navigation.”

“But that should be the goal, right?”—I asked. Professor M did not hesitate and argued that human reasoning and problem-solving rely on cognitive representations: “I don’t think that we can, or should, reduce phenomena that appear on the cognitive level all the way down to the level of genes and molecules.”

I was a bit disappointed, and wondered: “But why not? I think it would be exciting if we could understand the genetic components of cognitive capacities.”

“I’m sceptical about how much that helps, when we want to understand how spatial thinking works and why it works the way it does. Results from cognitive research also allow us to predict human performance, for instance, when people get lost or draw false inferences in reasoning. Predicting behavior is also an important goal, but I’ve never seen that this is possible based on brain imaging or DNA sequencing.”

“But wouldn’t evidence from biological levels also be relevant to confirm or disconfirm competing cognitive explanations?”

“I agree with you that the levels of genes, molecules, and neural pathways are relevant sources of evidence about the correct explanatory model of spatial thinking.”

Professor M looked at Professor B and said: “But most explanations of how genes affect human cognition are currently too unconstrained to give us testable predictions. And the existing empirical evidence is too weak.”

“I think there’s a risk of conflating explanatory issues with issues of evidential relevance here”—said Professor B —“One issue is: ‘What is the relevant evidence to evaluate an explanation?’ A different issue is: ‘To what level does the explanation belong?’ Consider, for instance, alternative models that explain spatial reasoning in terms of representations, processes, and resources. These models belong to the cognitive level. But this doesn’t mean that evidence from neuroscience and genetics cannot be relevant to evaluate them.

“You know, I have just read an article where the researchers empirically identified 18 candidate genes for depression, which have often been studied for their relevance to depression phenotypes. No evidence was found for any of these candidate genes to be associated with depression. None of these candidate genes for depression were more strongly associated with depression phenotypes than noncandidate genes. I was disappointed.”

“Don’t worry, there’s still a lot more work to do before we can answer questions about the genetic pathways and mechanisms underlying cognitive capacities and mental illnesses but it’s an exciting time. We have many new experimental tools we can use to examine and test cognitive models at a micro-level. Our lab at the University of Brain City just bought state-of-the-art technologies for neurostimulation and imaging. I once collaborated with a geneticist on a project on intelligence and conditional reasoning and the geneticist was much less reductionist than what many of our colleagues in cognitive science, including my colleague here, seem to believe.”

“And how did it work?” I asked.

Professor B answered: “The main challenge was to find a common vocabulary for talking about the same things. It took some time, but it eventually worked.”

Professor M replied: “I’ve heard several talks from genetic psychology, and I must say that they were always disappointing. The effect sizes were often extremely small, if there were any. Typically, there was no interpretable pattern of results in these studies. The only thing most of these studies indicate is that the people who do this research are strong believers of biology as key to understanding cognition.”

I was finding this discussion entertaining. Professor B said sharply: “Not sure that’s fair. Those studies certainly show, at least, that we should clearly define what phenomenon we want to explain, before we can say anything sensible about the most adequate level in its causal structure, at which testing should be conducted, and understanding pursued.”

A question was already revolving in my mind: “Professor B, you mentioned your collaboration with a geneticist on conditional reasoning. But aren’t the norms of what we consider rational defined by social agreement? It seems strange to use the vocabulary of biology to explore accurate or fallacious conditional reasoning. Or am I wrong?”

“No, you’re not wrong, in a way. But I think many colleagues would agree that cognitive theories should at least be consistent with relevant neural evidence, otherwise they are wrong.”

“But why shouldn’t it be the other way around?” I asked.

Professor M smiled: “Good point, I actually think that a theory on the neural level must be consistent with what we know on the cognitive level otherwise it is wrong. We know much more on the cognitive level than on the neural level, and most of the terms that cognitive neuroscientists use are cognitive terms. Imagine if neuroscientists just talked about synapses, dendrites, neurotransmitters, action potentials, and so forth. It would be rather boring, don’t you think?”

“Well, sorry, but this is a bad argument,” Professor B protested. “You know that brain research doesn’t boil down to cognitive neuroscience. It includes important—and, if you ask me, exciting—fields like neuroanatomy, neurochemistry, and biophysics. These fields can do just fine without cognitive terms. And their results are relevant to understanding cognitive capacities anyway.”

“Okay, brain research might be exciting just as physics and many other areas of the sciences and humanities”—said Professor M. “But I disagree”— M continued —“regarding the importance for cognitive science. Reductionist approaches can tell us little, if anything, about the nature of cognitive phenomena.”

“I am sorry”—I said—“but what do you exactly mean with reductionism? You mentioned this term already, but I don’t know what it exactly means.” Professor B explained that reductionism is a term that broadly refers to the view that we can fully understand cognitive phenomena by studying their underlying physical or biological component structures like genes and neurons.”

But then I asked: “Do cognitive scientists really believe that any cognitive phenomenon, spatial navigation or conditional reasoning for example, can be adequately explained at a single level? My sense is that most think molecular biology and its techniques are just one method, and should not serve as the only, or the best, model for gaining understanding in cognitive science.”

“You’re right,” Professor B replied. “But, just to tell you about another ‘ism,’ the view you have just described may be better called monism. That’s the opinion that there is only one correct level of explanation or one correct method targeting the correct level of explanation for a phenomenon.”

“I am sorry again. But what do you mean with level?”

Professor B explained: “When I use the term ‘level’ in my work, I simply mean ‘some scientific domain of interest.’ Nothing deep.”

“Does that mean that you think there’s no 'right' level for explaining any cognitive phenomenon?”, Professor M asked the colleague.

“You know what”— B admitted—“I really don’t have a rigorous justification for what makes a level ‘right’. I don’t think we need one. What the ‘right level’ is depends on the specific context of research.”

I was surprised because Professor M agreed: “It makes little sense to talk about a ‘right level’ of explanation in a vacuum, without taking into account our background knowledge about the phenomenon we want to understand, relevant theoretical and practical interests, and so on. When I use the term ‘level,’ I have in mind certain properties of a mechanism, like scale, granularity, or hierarchical composition.”

Once again I had to ask: “What do you mean by that?”

“I mean,” Professor M answered, “that higher levels are composed of things at lower levels. Explanations of cognitive phenomena at lower levels would refer to smaller entities, or to faster interactions between entities. Explanations that refer to axons are at a lower level than explanations that refer to, say, cultural processes, which rely on spatially distributed social networks.”

Professor B clarified that the kind of hierarchical composition Professor M was talking about is not a hierarchical model of scientific disciplines or theories, where each level corresponds to a theory, and fundamental physics is at the bottom level. I nodded, and Professor B continued to explain: “And it is important to see that scale is also related to the notion of complexity. Think of consciousness. Many scholars claim that conscious experience emerges from interactions among brain components at a micro-scale. Thus, even if we perfectly knew how brains are organized and wired, we may not be able to predict the more complex phenomena emerging at a macro-scale.”

Professor M suddenly interjected: “Right, assume that Professor B here finds a 1-to-1 mapping between particular mental and physical states, at least in one individual. This discovery would not be of much interest, because scientists are interested in general principles, not just explanations of single observations. So we want to find identities between kinds of mental states and kinds of neural states in the brain. Yet finding such identities is impossible, because any given type of mental state can be realized by many distinct types of brain states, both across and within individuals. And any given brain state can implement many different mental states.”

“That’s too quick,” objected Professor B. “The idea that mental states are multiply realized by different types of physical states is less prevalent in scientific practice. The physical differences between humans come with functional differences, and this means that it is really hard to identify a genuine case of multiple realization, where two different physical structures realize the same psychological capacity, but in very different ways.”

Professor M said: “That is really a strong statement. When you and I think about the concept of ‘love,’ does that rely on the same neural processes in our brains? Okay, we might not have the same concept of love, but even if we did, it would not be represented in the same way in our brains, because the concept is acquired by learning, and what has been learned is mapped onto different neural structures.”

Professor B cringed, and turned to me: “How do you understand the term ‘level’?”

The first thing that came to mind was David Marr’s three-level framework for analysing cognitive systems, which I had studied in different courses in my Master’s.

B said: “If anything, Marr’s three-level framework stands for one of the fundamental convictions of the cognitive sciences, namely that the mind consists in some sort of information processing.”

“Why?”—I asked—“Is there any other way to characterize what cognitive science is about?”

Professor M answered: “Well, an increasing number of people say there is. Many who subscribe to dynamical systems theory criticize the information-processing paradigm. They say that cognitive systems should be studied in terms of their dynamics and interactions with the environment, instead of computations over representations. The idea that the mind is an information processing system has become something like an unquestioned axiom of our discipline. Over the last decades, ‘information’ has developed into a vague and overused term.”

I was a little confused at this stage, as I assumed that cognitive scientists do not have to decide between either dynamical systems theory, or computation and information. That seemed a false dichotomy to me.

Professor B clarified: “It’s important to remember, though, that not all information consists in symbolic, language-like representations. For example, at the representational level, we assume the states of a calculator represent specific numbers; we ascribe meaningful representational content to different states of the calculator. At the syntactic level, we individuate operations on meaningless numerals; at this level, strictly speaking, the calculator doesn’t represent numbers and doesn’t perform arithmetical operations over them.”

Growing impatient, Professor M asked: “What’s your point, B?”

“My point is that to be a good calculator, it’s not necessary that the calculator employ symbolic representations in a way that involves comprehension of arithmetic. I think the debate between representationalists and anti-representationalists often neglects this point, confusing representational ascriptions to a system with ascriptions of comprehension.”

I interrupted and said: “The pocket calculator example reminds me of what I have learned in my classes about the question of whether a thermostat is a cognitive system. I think it isn’t. Whenever the sensor detects a certain temperature, it turns the heating on or off. It performs this function without any flexibility: a certain input always leads to a certain output. But that is not how cognitive systems behave. They respond flexibly to input from the environment and make predictions. Don’t you think?”

Professor M agreed and said that this is a topic of one of the popular courses in their program at Mind University.

Still a little confused, I asked for clarification: “Many cognitive scientists, also at this conference, seem to offer explanations in terms of representations. Does that mean the level of representations is particularly important to explain cognitive phenomena?”

Professor M replied confidently: “Genuinely cognitive phenomena can be adequately explained only by considering this level. For instance, when we want to deal with what the philosopher Daniel Dennett calls taking an intentional stance toward a cognitive system’s behavior, representations are essential in studying cognitive systems. We know from the history of psychological research that explanations that do not account for internal representations and processes don’t have much explanatory power. The cognitive turn showed that the assumption of intermediate internal processes can explain how the human mind works in a much more powerful way than representation-free approaches. Chomsky showed that for language, Bandura and Bruner for learning, Atkinson and Shiffrin for memory; the list is endless.”

“But”—I insisted—“do you think that all adequate explanations of cognitive phenomena should be pitched at the level of algorithms and representations? It seems to me that capacities like perception and motor control may be better explained in terms of a representation-free, dynamic, and distributed interaction between organisms and their environment.”

Professor B said: “Our goal, at least the ‘research philosophy’ at our program at the University of Brain City, is not to explain the less fundamental, whatever that is, in terms of the more fundamental. Our goal is to integrate different explanations, both at a level and across levels of organization and functionality. From this perspective, different researchers in different sub-disciplines in our department contribute different causal, constitutive, and contextual constraints to mechanistic or biophysical explanations of cognitive phenomena.”

Professor M pointed out: “Mechanistic integration and unification in cognitive science are more easily asserted than achieved. Have you seen the program of this conference? It includes sessions on ‘memory encoding,’ ‘explanation,’ ‘perception,’ ‘creativity,’ ‘word learning,’ ‘neural dynamics,’ ‘rationality,’ and so on. Within each session, different researchers use different theoretical frameworks, different methodological protocols, and different datasets to study what they assume to be the same kinds of phenomena or capacities. In fact, our field is often explicitly named in the plural, as the cognitive sciences. What makes it, or should make it, an integrated, cohesive science is the goal of explaining cognition at the algorithmic level. At least, that’s the ‘research philosophy’ at our program at the University of Mind City. More generally, we aim to understand the mind at the personal level.”

I was starting to get a better idea of the programs at Universities of Brain City and Mind City but, once again, I had a question: “What’s the personal level?”

Professor B explained: “The distinction between the levels of sub-personal and personal explanations is a distinction between the explanation of cognitive phenomena in terms of concepts that refer to components of a system—for instance, in terms of brain states or circuits—and in terms of concepts that refer to mental states of a whole system, for instance in terms of a person’s thoughts, motivations, and emotions.”

Professor M added: “Another way to cash out the distinction is in terms of causal-mechanistic explanation and reason-based explanation. Sometimes we need explanations that are grounded in people’s beliefs, goals, motives, and values. But we also sometimes need mechanistic explanations that view the human mind as driven, not by reasons, but by sub-personal causal factors.”

“So, if I understand the distinction correctly, we may explain why a person is reading a book on cognition by saying this person is a student of psychology, has an exam next week, and wants to pass the exam. This would be a reason-based explanation. However, if she has forgotten some parts of the content and thus cannot remember them during the exam, this requires a mechanistic explanation of how human memory works.”

Professor M said: “Well stated. Learning is essential. By the way, we should not forget that behaviorism primarily flourished in the US. European psychology was largely unaffected by behaviorism and always more open to mental concepts.”

I was surprised and said, “Oh, I didn’t know that. Very interesting."

Professor M added: “Yes, and the link between science and the values of a society is also important when we talk about levels of explanation. Today, many people in our society think of biological explanations as something unchangeable and hardwired. But this is a fatal, although quite common, misunderstanding.”

Professor B asked: “Are you talking about epigenetics?”

Professor M responded: “Not only. Indeed, epigenetics shows that genes can be switched on or off in particular environmental conditions. Nevertheless, biological approaches are often used to justify injustice and the unfair distribution of goods and chances in society. We should emphasize much more convincingly to the wider public that learning fundamentally shapes our brains, and that environmental and cultural conditions have an enormous impact on the functioning of our brains and genes.”

Professor B nodded and said: “Yes, that’s right, and there are good reasons to say that, during the evolution of humankind, biology has been largely replaced by culture as the major driving force in human evolution. The famous example is the selection of lactose tolerance in groups with dairy traditions. And heritable diseases can persist and spread if they occur in families with social power. But this is not the point here. I agree that we should think very carefully about the consequences that explanations pitched at a certain level can have for our society.”

Now it suddenly became loud, because one session had ended and everyone rushed for coffee. Professor B said, “Oh, there is my husband, I should leave now.” Professor M said, “Oh, and my husband, too.” They both thanked me for approaching them and wished me all the best with my applications. I had a nice conversation with these two professors, which helped me get a better sense of the programs at their universities and their approaches to explaining the mind. I am sure I will encounter several of the ideas and arguments we discussed in my future studies. I just have to be admitted into a PhD program in cognitive science now. Maybe one that combines the two lines of thinking I have learned about.


VISUO-SPATIAL SKETCHPAD

Interest in visuo-spatial memory developed during the 1960s, when Posner & Konick (1966) showed that memory for a point on a line was well retained over a period ranging up to 30 seconds, but it was disrupted by an interpolated information-processing task, suggesting some form of active rehearsal. Dale (1973) obtained a similar result for remembering a point located in an open field. In contrast to these spatial memory tasks, Posner & Keele (1967) produced evidence suggesting a visual store lasting for only two seconds. However, their method was based on speed of processing letters, in which a visual letter code appeared to be superseded by a phonological code after two seconds. Although this could reflect the duration of the visual trace, it could equally well reflect a more slowly developing phonological code that then overrides the visual.

Visual STM

A colleague, Bill Phillips, and I decided to test this using material that would not be readily nameable. We chose 5 × 5 matrices in which approximately half the cells would be filled at random on any given trial. We tested retention over intervals ranging from 0.3 to 9 seconds, by presenting either an identical stimulus or one in which a single cell was changed, with participants making a same/different judgment. We found a steady decline over time, regardless of whether we measured performance in terms of accuracy or reaction time (Phillips & Baddeley 1971). A range of studies by Kroll et al. (1970), using articulatory suppression to disrupt the use of a name code in letter judgments, came to a similar conclusion, that the Posner and Keele result was based on switching from a visual to a phonological code, perhaps because of easier maintenance by subvocal rehearsal. Meanwhile, Phillips went on to investigate the visual memory store using matrix stimuli, demonstrating that accuracy declines systematically with number of cells to be remembered (Phillips 1974), suggesting limited visual STM capacity. It was this work that influenced our initial concept of the visuo-spatial sketchpad.
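
As a concrete illustration of the stimulus design, the following is a minimal Python sketch of how such matrix stimuli could be generated for a same/different trial; the function name and parameters are illustrative assumptions, not details taken from Phillips & Baddeley (1971).

import numpy as np

def make_trial(rng, size=5, change=False):
    # Fill roughly half the cells of a size x size matrix at random.
    n_cells = size * size
    grid = np.zeros(n_cells, dtype=bool)
    grid[rng.choice(n_cells, size=n_cells // 2, replace=False)] = True
    probe = grid.copy()
    if change:
        # A "different" probe changes exactly one cell.
        probe[rng.integers(n_cells)] ^= True
    return grid.reshape(size, size), probe.reshape(size, size)

rng = np.random.default_rng(0)
study, probe = make_trial(rng, change=True)  # one "different" trial

Participants would then judge whether the study matrix and probe are the same, with the retention interval between them as the variable of interest.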

Spatial STM

The most frequently used clinical test of visuo-spatial memory is the Corsi block-tapping test (Milner 1971), which is spatially based and involves sequential presentation and recall. The participant views an array of nine blocks scattered across a test board. The tester taps a sequence of blocks, and the participant attempts to imitate this. The number of blocks tapped is increased until performance breaks down, with Corsi span typically being around five, about two less than digit span. Della Sala et al. (1999), using a modified version of the Phillips matrix task, showed that visual pattern span is dissociable from spatial Corsi span, with some patients being impaired on one while the other is preserved, and vice versa. Furthermore, pattern span can be disrupted by concurrent visual processing, whereas Corsi span is more susceptible to spatial disruption (Della Sala et al. 1999). I return to the visual-spatial distinction at a later point.
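
The logic of the span procedure can be sketched in a few lines. This is a simplified illustration, assuming a callable reproduce(sequence) that stands in for the participant's imitation; clinical administrations typically give multiple trials at each length, which this sketch omits.

import random

def corsi_span(reproduce, n_blocks=9, start_len=2, max_len=9):
    # Return the longest sequence length the participant imitates correctly.
    span = 0
    for length in range(start_len, max_len + 1):
        sequence = random.sample(range(n_blocks), length)  # tester taps blocks
        if reproduce(sequence) != sequence:                # imitation attempt
            break                                          # performance breaks down
        span = length
    return span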

Visuo-Spatial WM

During the 1970s, research moved from visual STM to its role in visual imagery. Our own studies used a technique developed by Brooks (1968), in which participants are required to remember and repeat back a sequence of spoken sentences. In half of the cases, the sentences could be encoded as a path through a visually presented matrix; the other half were not readily encodable spatially. We found that recall of the visuo-spatially codable sentences was differentially disrupted by pursuit tracking (Baddeley et al. 1975a). We interpreted this result in terms of the sketchpad, leading to the question of whether the underlying store was visual or spatial. This we tested using a task in which blindfolded participants tracked a sound source (spatial but not visual) or detected the brightening of their visual field (visual but not spatial), again while performing the Brooks task. We found that the tracking still disrupted the spatial task but did not interfere with the verbal task, whereas the brightness judgment showed a slight tendency in the opposite direction, leading us to conclude that the system was spatial rather than visual (Baddeley & Lieberman 1980).

Although these results convinced me that the system was essentially spatial, Robert Logie, who was working with me at the time, disagreed and set out to show that I was wrong. He succeeded, demonstrating that some imagery tasks were visual rather than spatial. He used a visual imagery mnemonic whereby two unrelated items are associated by forming an image of them interacting; for example, cow and chair could be remembered as a cow sitting on a chair. Logie (1986) showed that this process can be disrupted by visual stimuli such as irrelevant line drawings or indeed by simple patches of color. There are now multiple demonstrations of the dissociation of visual and spatial WM. Klauer & Zhao (2004) critically review this literature before performing a very thorough series of investigations controlling for potential artifacts; their results support the distinction between visual and spatial STM, a distinction that is also supported by neuroimaging evidence (Smith & Jonides 1997).

Yet further fractionation of the sketchpad seems likely. Research by Smyth and colleagues has suggested a kinesthetic or movement-based system used in gesture and dance (Smyth & Pendleton 1990). Another possible channel of information into the sketchpad comes from haptic coding as used in grasping and holding objects, which in turn is likely to involve a tactile component. Touch itself depends on a number of different receptor cells capable of detecting pressure, vibration, heat, cold, and pain. We currently know very little about these aspects of STM, and my assumption that information from all of these sources converges on the sketchpad is far from clearly established.

The nature of rehearsal in the sketchpad is also uncertain. Logie (1995, 2011) suggests a distinction between a “visual cache,” a temporary visual store, and a spatial manipulation and rehearsal system, the “inner scribe,” although the precise nature of visuo-spatial rehearsal remains unclear.


Introduction

Cognitive training has become increasingly popular (see Strobach and Karbach 2020, for a review) as the elderly population has rising life expectancy and therefore growing risk of cognitive and functional decline. Moreover, the cognitive demands for academic and occupational success are increasing with each generation. The main promise of cognitive training interventions is to induce lasting performance gains in cognitive domains that go beyond the practiced task and are relevant for daily functioning. Training-induced changes are thought to be triggered by a prolonged mismatch between situational demands and the range of functions and performance an individual’s cognitive system is able to support (Lövdén et al. 2010). This mismatch fosters adaptive structural brain changes (e.g., neurogenesis, synaptogenesis, long-term potentiation) that effectively increase the possible range of cognitive performance to meet the altered environmental demands. Although the results of many training studies are promising, there is high variability across studies and individuals in such training-induced plastic changes (e.g., Katz et al. 2016, for a review), and even meta-analyses on the topic reveal conflicting conclusions (e.g., Kassai et al. 2019; Melby-Lervåg and Hulme 2016 vs. Au et al. 2016; Karbach and Verhaeghen 2014; Nguyen et al. 2019). On the one hand, this reflects large differences between studies in terms of training type, training features, and target population (see Fig. 1, panel 3), but it also highlights large inter-individual differences in performance gains.

Fig. 1 Tentative model of a study design. We suggest including neuroimaging pre- and post-training as well as at follow-up, to be able to understand the mechanisms leading to cognitive training gains, generalization to unrelated tasks, and maintenance of these effects over longer periods of time

In recent years, these individual differences in training-induced cognitive performance gains have attracted considerable scientific interest (e.g., Bürki et al. 2014; Karbach et al. 2017; Lövdén et al. 2012). According to the supply-demand mismatch model (Lövdén et al. 2010), the extent to which mismatch drives plastic changes depends on the current state of flexibility of the cognitive system. For instance, if environmental demands greatly exceed the existing functional capacity—as would be the case when asking a 4-year-old to maintain 7 digits in working memory—the impetus for change will be reduced. When inter-individual variation is high, averaging across participants can be misleading (e.g., Moreau and Corballis 2018). It is now clear that “one-size-fits-all” solutions are not working for cognitive training, and it is time to move towards individualized training programs (Colzato and Hommel 2016; Karbach and Unger 2014; Kliegel and Bürki 2012). In order to do that, we need to (1) determine which inter-individual differences lead to the variation in training-related outcomes and (2) understand the mechanisms leading to training gain and transfer.

In an attempt to identify individual characteristics that might influence the success of a training regimen, previous studies focused on age, sex, education, baseline cognitive performance, intelligence, personality, and motivation (e.g., Katz et al. 2016, for a review; see Fig. 1, panel 2). The results of these studies remain inconclusive. For instance, most studies agree that baseline cognitive performance is associated with training-related changes, but there is no agreement on the direction of this relation. Some reports found greater training-induced gains in individuals with higher baseline performance (Foster et al. 2017; Wiemers et al. 2019), whereas others concluded that individuals with low baseline performance benefit more because they have more room to improve (Jaeggi et al. 2011; Zinke et al. 2014). There is an increasing trend to combine basic demographic, psychometric, and behavioral measures with magnetic resonance imaging (MRI)–based measures of brain morphological and functional characteristics to resolve these inconsistencies. The rationale is that brain markers are reliable indicators of the current functional organismic capacity, i.e., the possible range of cognitive performance. Neural predictors can be rather specific (e.g., hippocampal subfield volume) or more general (e.g., whole-brain functional connectivity patterns), depending on the complexity of the cognitive functions they are believed to support. Moreover, direct assessment of training-induced change in brain structure and function has advanced the understanding of the mechanisms underlying cognitive performance increments. In this article, we give an overview of findings on brain structural and functional predictors of cognitive improvement as well as training-related brain changes. We discuss implications for future training research and address existing practical challenges.


Appendix E. Invariance Testing for Multi-Group Confirmatory Factor Analysis (CFA) Modeling

Although we previously established that the latent constructs were adequately represented by their observed indicators, the multi-group CFA models allowed for a test of whether the means and factor loadings were similar or dissimilar across the two groups. To test this, we specified three different models: a configural model, a metric model, and a scalar model. In the configural model, the observed variable means and the factor loadings are allowed to vary; essentially, we are testing whether the factor structure is equivalent across boys and girls. In the metric model, we imposed constraints on the factor loadings; however, the means are allowed to vary across the two groups. Finally, in the scalar model, we constrained the means and factor loadings to equality across groups. Invariance across both means and factor loadings is necessary to meaningfully compare the two groups. Given that we included a higher-order factor within our CFA models, we first tested invariance for the first-order factors and then followed up with invariance testing for the higher-order factor separately.

The configural model (Model MGa1) provided an excellent fit to the data (see Table 4), indicating the factor structure was appropriately represented for the first-order factors for boys and girls. Following this, we tested for metric invariance. Because the metric model (Model MGa2) was nested within the configural model, we used chi-square difference testing to determine whether adding equality constraints led to a significant worsening of model fit. When compared to the configural model, imposing equality constraints on the factor loadings did not result in a significantly worse fitting model, Δχ² = 14.442 with 8 df, p = .071, suggesting the factor loadings were similar across the two groups. Thus, the more parsimonious metric model was retained. Comparing the metric model to the scalar model (Model MGa3), which imposed additional equality constraints on the means across the two groups, did not result in significantly worse fit, Δχ² = 12.985 with 8 df, p = .112, so the scalar model was the most parsimonious and preferred model. Thus, we established that the mean structure and factor loadings were equivalent for the first-order factors across boys and girls.
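
The p-values for such nested-model comparisons follow directly from the chi-square distribution. As a minimal sketch (using SciPy; the helper function name is ours, not from the original analysis), the values reported above can be reproduced from the change in chi-square and the change in degrees of freedom:

from scipy.stats import chi2

def chi_square_difference_p(delta_chi2, delta_df):
    # Upper-tail p-value for a chi-square difference test on nested models.
    return chi2.sf(delta_chi2, delta_df)

print(chi_square_difference_p(14.442, 8))  # ~.071, configural vs. metric
print(chi_square_difference_p(12.985, 8))  # ~.112, metric vs. scalar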

Because we were able to establish invariance for the first-order factors, we tested for metric invariance for the second-order factor (i.e., whether the loadings for the subtraction and addition factors were similar across the two groups). For the multi-group models, we fixed the unstandardized path from the computation factor to one of the first-order factors to 1, given that standardization is inappropriate in multi-group analyses (see Kline, 2011). First, we specified a configural model for the factor structure of the second-order factor only (MGb1). Following this, we constrained the factor loadings to equality across boys and girls (MGb2). A comparison of this model to the configural model resulted in a nonsignificant chi-square difference, Δχ² = 1.363 with 2 df, p = .506. Scalar invariance was tested by constraining the latent factor means to equality (i.e., fixed to zero across groups; Model MGb3). This did not lead to a significant degradation in model fit compared to the metric model, Δχ² = 10.960 with 7 df, p = .140. Because we established invariance for the full CFA model, we proceeded by testing equivalence of the estimates for the best-fitting SEMs for boys and girls (across Models B5 and G5).


Discussion

Relationship between self-control/impulsivity and interference control

Participants completed Whiteside and Lynam’s (2001) subscales for three facets of impulsivity (premeditation, urgency, and perseverance) and Tangney et al.’s (2011) Brief Self-Control Scale (BSCS). Earlier reviews and analyses by Allom et al. (2016) and Duckworth and Kern (2011) reported very small correlations between self-report trait measures of self-control and objective measures of EF obtained with a variety of laboratory tasks but did not specifically examine the nonverbal interference tasks that are the focus of the present study.

As described in more detail in the results, and as shown in Table 6, the correlations between the trait measures of impulsivity/self-control and the interference effects that presumably reflect some type of conflict resolution processing are nonsignificant. The strong and significant correlation reported by Enticott et al. (2006) between trait impulsivity and spatial Stroop interference was not significant in our data for premeditation, urgency, or perseverance (see Table 6). With the exception of Enticott et al., the cumulative evidence shows that interference effects do not predict self-reported impulsivity in everyday life. As Wolff et al. (2016) note, a persisting gap between EFs and self-control implies that adequate EF could be a necessary condition, but it is clearly not a sufficient condition for successful self-control.

Another potential cause of the disconnect may be that the laboratory tasks are very sensitive to the participant’s calibration of speed and accuracy, a skill that has little relevance to delaying gratification (urgency), planning before acting (premeditation), or having the grit to persist in the face of adversity (perseverance). Either implicitly or explicitly, the computerized EF tasks almost always encourage the participant to go as fast as possible without making more than an occasional error. The mechanisms needed to filter out competing information in the nick of time, when there is little intrinsic value associated with a “correct” response, may be different from those needed to resist actions that are affect-laden and/or creatures of habit and have genuine costs and benefits. Moreover, competing information in the real world does not typically appear at random; it is exquisitely tied to the onset of new task-relevant information, and the conflict need not be resolved within the first couple of hundred milliseconds of the onset of the event. In fact, any rapid suppression of responses counter to long-term goals often needs to be sustained in order to be ultimately successful.

Relationship between special experiences and interference control

Bilingualism

As shown in Table 4, the correlation between the ratio of L2/L1 proficiency and the composite measure of interference control was near zero. For this dataset, Paap et al. (2019) also reported no significant relationships between interference control and any of the following dimensions of bilingual experience: L2 proficiency, similarity of L2 to L1, age-of-acquisition of L2, percentage of time speaking L2, frequency of language switching per day, frequency of code switching, the mean number of languages used per context (e.g., at home, at work, at school, with friends, etc.), and the number of languages spoken. The results from this study are consistent with the meta-analyses described earlier (Donnelly et al., 2019; Lehtonen et al., 2018; Paap, 2019). The most straightforward conclusion is that bilingualism does not enhance inhibitory control. Paap, Johnson, and Sawi (2015, 2016) present an extended discussion of why a steady drip of significant findings occurs in the published literature, and Paap et al. (2019) conclude that bilingual language control may be encapsulated within the language-processing system and, consequently, have no beneficial effect on domain-general control.

Video game playing

In the present study, the composite interference score significantly correlated with the frequency of video game play (r = −.214), but when Raven’s scores, sex, and other factors were entered into the model, the regression coefficient for video game playing was no longer significant. Likewise, the frequency of video game play was not a predictor in the regression analyses of the individual tasks. The regression results are consistent with the results of Dye et al. (2009), showing no difference between players and nonplayers on flanker effects, and with the analyses of Unsworth et al. (2015) showing no correlation between a continuous measure of video gaming and either Simon effects or flanker effects. Of the studies reviewed in the introduction, only the training study by Hutchinson et al. is consistent with the hypothesis that video game play improves interference control, and that study was restricted to Simon effects. However, as shown in Fig. 3, frequency of video game play was not a significant predictor for Simon effects either. In summary, little in the present study stems what appears to be a rising tide of evidence that video game play has little or no impact on interference control as expressed in nonverbal interference tasks.

Music training

Years of music training was not a significant predictor of the composite interference scores. Neither was it a significant predictor in any of the separate stepwise analyses of interference scores. However, it was a significant predictor of Simon incongruent-trial residuals. This was the first time that the relationship between music training and Simon effects was assessed, and accordingly, no prior literature exists to support or guide an interpretation that music performance may hone interference control in the Simon task but not produce benefits on other nonverbal interference tasks. Consistent with the expectations laid out in the introduction, the current results provide no compelling evidence that music training or performance enhances inhibitory control to the extent that this hypothesis can be confirmed across a set of nonverbal interference tasks.

Mindfulness/meditation

The findings for mindfulness/meditation in our data are very inconsistent. The bivariate correlation between frequency of meditation and the composite interference scores was near zero (r = +0.05), as was the beta coefficient for the regression analysis on the composite interference scores (β = +0.07). However, significant positive beta coefficients were found for the meditation/mindfulness predictor in both the stepwise analysis of spatial Stroop interference scores (β = +0.14) and the stepwise analysis of spatial Stroop residuals (β = +0.07). These positive regression coefficients are, of course, the opposite of what one would predict if mindfulness/meditation led to smaller interference scores and faster incongruent trials. The reliability of these positive regression coefficients in the analysis of the spatial Stroop is further questioned by the finding that the bootstrapped 95% CIs for both regression coefficients included zero. In contrast, in the analysis of the incongruent RT residuals for the Simon task, the beta for the mindfulness/meditation predictor was significant and in the expected negative direction (β = −0.06). However, it was not a significant predictor in either the stepwise or LASSO regressions on Simon interference scores, which reduces the impact of the positive outcome in the regression on the Simon incongruent-RT residuals.
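
For readers unfamiliar with the bootstrapped CIs mentioned above, the following is a minimal sketch of the percentile method for one regression coefficient. It assumes a predictor matrix X whose first column is an intercept and an outcome vector y; the variable names are illustrative, and the original analyses may have differed in detail.

import numpy as np

def bootstrap_coef_ci(X, y, coef_index, n_boot=5000, seed=0):
    # Percentile 95% CI for one OLS coefficient, resampling participants.
    rng = np.random.default_rng(seed)
    n = len(y)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        boots[b] = beta[coef_index]
    return np.percentile(boots, [2.5, 97.5])

A coefficient whose interval includes zero, as for the spatial Stroop coefficients above, would be flagged as unreliable.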

Recall that many training studies did not show significant facilitation and that most of the cross-sectional comparisons of meditators to non-meditators showed no group differences. We offer the following conjecture regarding why this pattern occurs in studies of mindfulness/meditation. Potential effects of bilingualism, music performance, or playing video games on nonverbal interference tasks are clear cases of far transfer in the sense that, for example, musicians are not practicing music when they are doing a flanker task, but meditators may be in a meditative state. This seems more probable when the last session of training culminates with the post-test of the interference task. Whether intentional or not, if a meditative state continues into the post-test, all types of cognitive control may be enhanced. Posner (2018) has recently reported that connectivity in the anterior cingulate cortex is improved following 2 to 4 weeks of meditation training and that the increase in frontal theta following meditation training might be the cause of improved connectivity. A critical question is whether improved connectivity is relatively durable and facilitates any processing employing those networks, or whether meditation induces temporary states that must be reinstated to produce benefit.

Team-sports ability

Team-sports ability was self-rated using this item, originally developed by Paap and Greenberg (2013): “Team sports often involve dividing your attention between a ball, a goal, your opponents, and your teammates. Do you excel at these sports?” Team-sports ability had the third highest zero-order correlation with the composite interference scores (r = −0.19), and its beta coefficient was significant in the analysis of Simon interference effects (β = −0.19). However, it did not enter the final stepwise model for any of the other tasks or for any of the tasks in the regression analyses of incongruent RT residuals.

In regression analyses similar to those used in the present study, Paap and Greenberg reported significant beta coefficients in their Study 3 for separate analyses of flanker effects and switching costs, but not for Simon effects. A further complication in interpreting the relationship between sports ability and inhibitory control is that males rated their sports ability higher than females, and as reported above, these nonverbal interference tasks often produce male advantages.

A possible relationship between team-sports ability and interference control may be surprising to those familiar with contemporary theories in sports psychology, given the emphasis on the role of deliberate practice leading to automatization of skilled sport performance (e.g., Ericsson, Charness, Feltovich, & Hoffman, 2006). However, Toner and Moran (2014) have advocated for more research on the role of controlled processing, and Furley and Wood (2016) review evidence that working memory capacity is often associated with better performance in team sports. The study most related to the type of interference control that is the focus of the present investigation is that of Vestberg, Gustafson, Maurex, Ingvar, and Petrovic (2012), who tested soccer players with different levels of advanced skills using the D-KEFS test battery of executive functions (Homacka, Lee, & Ricco, 2005). The design fluency component requires participants to update working memory for previous responses and use inhibition in order not to repeat them. Also included were a color-word Stroop test and the Trail-Making Test. Players from the highest Swedish national soccer leagues outperformed players from the lower division on all of these measures of EF. Furthermore, the EF test scores obtained in the fall of 2007 were used to predict a performance measure combining goals and assists over a 17-month interval in 2008 and 2009. The correlation (r = 0.54, p = .006) was statistically significant and noteworthy in magnitude. These results are consistent with the interpretation that EF contributes to team-sports ability, even at very high levels of skill.

Physical exercise

Individuals with superior team-sports ability are also likely to be fit, and in the present study, the frequency of exercise, working out, and participation in team sports notably did not predict the composite interference scores or the outcome measure in any of the task-specific regression analyses. Furthermore, these small correlations are positive, rather than negative, indicating that individuals reporting higher levels of physical exercise were actually trending toward larger interference effects.

Socioeconomic status

In several large-scale studies (Paap et al., 2017; Paap & Greenberg, 2013; Paap & Sawi, 2014), the correlations between parents’ educational levels and a variety of EF measures were always nonsignificant and often near zero. The participants in each case were university students. In the present study, the proxies for SES were extended to include family income. Neither the composite measure of SES nor the separate factors predicted the composite interference scores. Studies using children often report effects of SES on EF. For example, Calvo and Bialystok (2014) tested six-year-old children and reported main effects for both bilingualism and SES on the flanker and Stroop effects. A possible explanation for why the relationship is consistently weak and nonsignificant in our studies is that the lower-SES students in our college student population either had enriching early experiences despite their parents’ education and income or have otherwise managed to compensate for disadvantages in early childhood.

The conundrum of sex, sports, gF, and their relationship to interference control

Males had smaller interference scores in the composite measure and in individual regression analyses of the spatial and vertical Stroop tasks. Although sex was confounded with Raven’s scores, the same male advantage was observed when the 52 males were matched on Raven’s scores to 52 females. This evidence for sex differences in interference control in the present study should be interpreted cautiously, but two recent studies using spatial Stroop tasks similar to ours also reported statistically significant male advantages in the form of smaller interference effects. Stoet (2016) tested 236 males and 182 females in an online study and reported interference scores of 42 ms for males and 29 ms for females. Evans and Hampson (2015) tested 90 males and 86 females and, estimating from their Fig. 4, the interference effects were approximately 60 ms and 40 ms, respectively. For purposes of comparing across the studies, a separate two-way ANOVA on our spatial Stroop RT data yielded a significant Sex x Congruency interaction (F(1, 199) = 14.92, p < .001, partial η² = .070). The interference effect for males was 70 ms compared to 96 ms for females. The overall spatial Stroop effects in our study are atypically large. This is not too surprising, as only 25% of the trials were incongruent compared to the usual 50–50 balance. An even more extreme bias was used by Christakou et al. (2009), with only 11.5% incongruent trials, and led to even larger spatial Stroop effects, namely, 110 ms for males and 129 ms for females. That male advantage was not statistically significant, but the study was underpowered with only 38 males and 25 females. When incongruent trials are rare, a strategy of relying entirely on reactive mechanisms may be induced. Further pursuit of the sex effect in the spatial Stroop task, with a systematic manipulation of the proportion of incongruent trials and a determination of whether the male advantage is nested primarily in a preference for reactive over proactive inhibition, may be worthwhile.
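
Because the Sex x Congruency interaction in a 2 (between) x 2 (within) design is equivalent to comparing per-participant interference scores across the two groups, the comparison can be sketched as follows. The arrays are assumed to hold per-participant mean RTs; this is an illustrative simplification, not the exact ANOVA reported above.

import numpy as np
from scipy.stats import ttest_ind

def sex_difference_in_interference(male_inc, male_con, female_inc, female_con):
    # Interference score = incongruent RT minus congruent RT, per participant;
    # comparing these scores across groups tests the Sex x Congruency interaction.
    return ttest_ind(male_inc - male_con, female_inc - female_con)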

Lynn and Irwing (2004) suggest that the male advantage in the Raven’s test may be nested in the spatial-visualization ability of hierarchical factor models like Carroll’s (1993). In contrast to Raven’s, the ability to manipulate visual-spatial representations may play little role in interference tasks that require decisions about a single stimulus (e.g., spatial Stroop, vertical Stroop, and Simon) that remains in view until a response is made. Although quite speculative, this provides one explanation for why matching on Raven’s scores does not reduce or eliminate the male advantage in interference control.

The Raven’s test was developed to assess an individual’s abstract reasoning ability without having to rely on declarative knowledge or the influence of language, education, or cultural factors (Carpenter, Just, & Shell, 1990; Raven, 1939). As reviewed by Lynn and Irwing (2004), many experts judge it one of the best tests of gF as defined by Cattell (1971) because it assesses the abilities to discriminate relations, reason abstractly, solve novel problems, and adapt to new situations. Paap and Sawi (2014) note that EF should be related to gF because the components of EF (monitoring, updating, switching, and inhibiting) logically serve successful reasoning, problem solving, and adapting, whereas high-quality reasoning seems to require more than the sum of the parts of EF. However, the degree to which EF and gF are actually separate constructs has been questioned, if not challenged, by Salthouse (Salthouse, Atkinson, & Berish, 2003; Salthouse, Pink, & Tucker-Drob, 2008), who showed that multiple measures of gF were strongly related to several measures of EF and that performance on classic EF tasks will sometimes load on the gF factor rather than the EF factor when allowed to do so. Salthouse (2010) observes, in a somewhat dispiriting manner, that if gF encompasses a broad spectrum of controlled processing, then investigators working from different research traditions may be giving different names to the same dimension of individual differences. That said, the intimate relationship between EF and gF appears less promiscuous for the inhibiting function of EF than for updating (Salthouse et al., 2003, Tables 9 and 10). This would be consistent with a working hypothesis that the interference effects measured in the present study and Raven’s scores share some dimensions of individual differences but are separable constructs.

Recall that in the present study males outperformed females on the Raven’s test. Setting aside the omnipresent possibility of a Type 1 error, the difference could be due to a bias favoring higher-gF males in our student population, or it could reflect a genuine difference in the general population of young adults. Although the presence of sex differences in the Raven’s test remains controversial, Lynn and Irwing’s (2004) meta-analysis of 57 studies showed a statistically significant male advantage emerging at the age of 15 (0.10d) that grew to 0.33d among young adults aged 20–29 and remained stable through old age. Their meta-analysis had two notable strengths: (1) avoiding apples-and-oranges comparisons by including only versions of the Raven’s test and excluding other intelligence tests, and (2) including only general-population studies with samples of at least 50 males and 50 females.

Limitations

Although four different nonverbal interference tasks were used that varied in terms of S-S compatibility and whether conflict arose from distractors versus a task-irrelevant dimension of the imperative stimulus, some results possibly would be different if the proportion of incongruent trials encouraged greater reliance on proactive inhibition. Likewise, some of our background variables relied on a single item. Future research might focus on developing scales for these predictors that have desirable psychometric properties. The complete absence of significant relationships between interference scores and measures of self-control and impulsivity may be attributed, in part, to the reliance on self-reports that rely on memory and are subject to various types of bias.

An optimist’s conclusions

The interference scores from the four nonverbal interference tasks have adequate split-half reliabilities and three (i.e., Simon, spatial Stroop, and vertical Stroop) cohered into a latent variable that may reflect the ability to resolve conflict between two dimensions of a single stimulus (namely, identity and location). This latent variable, expressed as a standardized composite of each task’s interference scores, is significantly related to sex and gF in that males and individuals with higher intelligence are better at resolving this type of conflict. The male advantage is sustained in a subset of males and females that are matched on Raven’s scores. Years of musical experience did not predict the composite interference scores but was associated with the magnitude of the Simon effect in incongruent RT residuals. As the Simon task is a pure S-R task (see Fig. 1), it may be more sensitive to a form of conflict resolution common to music performance, although we have no reason to believe that music performance is richer in S-R incompatibilities compared to S-S. Future research could test this hypothesis. Likewise, frequency of mindfulness/meditation did not predict the composite interference scores, but its regression coefficient was significant in predicting both Simon and spatial-Stroop effects. In the previous research (see Table 1), the relationship between mindfulness/meditation and interference control appears more consistent in the training studies than in studies comparing meditators to non-meditators. Thus, the possibility that mindfulness/meditation enhances interference control remains a plausible hypothesis but may be more robust following training. Finally, a surprising disconnect exists between the composite measure of interference control and self-ratings of impulsivity and control in everyday life.
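
The standardized composite referred to above can be sketched as the mean of z-scored interference measures from the three cohering tasks; the array names are illustrative, not the authors' variable names.

import numpy as np

def zscore(x):
    return (x - x.mean()) / x.std(ddof=1)

def composite_interference(simon, spatial_stroop, vertical_stroop):
    # Average the z-scored per-participant interference scores of the three
    # tasks that cohered into a single latent variable.
    return (zscore(simon) + zscore(spatial_stroop) + zscore(vertical_stroop)) / 3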

A pessimist’s conclusions

The problem with the conclusions offered by optimists is that they are often influenced by a confirmation bias for reporting positive effects and a penchant for seeing any positive findings as a roadmap to future research that might eventually validate the constructs of interest, albeit with a more complicated theory than initially envisioned. But if the constructs do not exist or are markedly different, then the roadmap is a blind alley that prevents self-correction. Therefore, a pessimist might offer a different conclusion.

Four common nonverbal interference tasks that are typically assumed to measure inhibitory control did not all load on a common latent variable. The three tasks that did form a latent variable were not the tasks one would expect on the basis of Kornblum’s taxonomy (see Paap et al., 2019). Prior to the present study, no latent variable analysis had been able to extract a latent variable that includes the interference scores from two or more nonverbal interference tasks. When prior studies do succeed in extracting a latent variable that includes a single nonverbal interference score, it loads weakly and is dominated by a different measure—often the antisaccade task (Rey-Mermet et al., 2018). In the same vein, Friedman and Miyake (2016) could not extract an inhibition factor that was separable from updating and shifting.

The formation of a latent variable for three of our tasks could be an artifact of the stimulus and response similarities across the tasks. Rey-Mermet et al. (2018) recommended and practiced the advice to deliberately introduce differences in the stimulus displays and response modes for tasks selected to load on the same latent variable. As Friedman and Miyake (2016) noted, task impurity seems to be an unavoidable quality of EF tasks like the nonverbal interference tasks. By definition, EFs involve controlling lower-level processes, so any inhibitory control task must include nonexecutive processes that could influence performance in addition to the EF of interest. One method for removing the influence of unreliability and task impurity is latent variable analysis. For present purposes, the important characteristic is that latent variables capture only common variance across multiple measures; this common variance cannot include random measurement error and will not include non-EF variance to the extent that tasks are selected to have different lower-level processes. The perceptual encoding, response selection, and response execution processes in the present study are, unfortunately, very similar and could very well explain the significant but small intertask correlations.

With the regression analyses, when a set of 11 predictors hypothesized to be related to inhibitory control was entered in a stepwise regression on the composite interference scores, only sex and Raven’s score entered the model. When the same stepwise regression was conducted on the interference scores from each individual task, Raven’s score was the only significant predictor for all four tasks. Sex was included in the model for two of the tasks, with music training, mindfulness/meditation, and team sports included in only one model each. Two of these predictors in the bootstrapped analysis of individual tasks had 95% CIs that included zero and are likely to be unreliable in future tests. The three methods (stepwise regression on interference scores, hierarchical regression on incongruent-trial RT, and LASSO) intended to provide converging evidence each identify a predictor that the other two do not: music is selected in the analysis of incongruent-trial RT residuals (Simon task), team sports is selected by the stepwise regression of the interference scores (Simon task), and team sports is selected by the LASSO regression (composite of three tasks). The only solid relationship is that Simon, spatial Stroop, and vertical Stroop effects decrease as Raven’s scores increase. Taking at face value that Raven’s is tapping into gF abilities and not skills, this would suggest that interference control in these generic nonverbal tasks is, at the level of individual differences, influenced more by heritability than by experience (see Paap, 2018b, for a discussion of the possible role of heritability in EF).
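
Of the three methods, the LASSO step can be sketched as follows, assuming a predictor matrix and an outcome vector of composite interference scores; the helper function, cross-validation setting, and predictor names are illustrative rather than the authors' exact specification.

from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

def lasso_selected(X, y, predictor_names, cv=10, seed=0):
    # Fit a cross-validated LASSO on standardized predictors and return
    # the names of predictors with nonzero coefficients.
    Xz = StandardScaler().fit_transform(X)
    model = LassoCV(cv=cv, random_state=seed).fit(Xz, y)
    return [name for name, b in zip(predictor_names, model.coef_) if b != 0.0]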

The possibility of a causal relationship between EF and gF is important, as illustrated by the theory of Engle, Kane, and colleagues that EF/EA drives both gF and WMC. But the only nonverbal interference task typically included in their EA battery is the flanker task, and the flanker effect always loaded weakly on the EF/EA latent variable. A related but different issue was raised by Chuderski et al. (2012), who reported that latent variables for both inhibition and interference did not account for any meaningful portion of gF variance because the simple correlations were completely mediated by the storage-capacity latent variable. The coup de grâce for the claim that inhibitory control is related to gF may be the Rey-Mermet et al. (2019) finding that a coherent latent variable for EF could not be established despite good reliabilities for all measures. Furthermore, WMC and gF—modeled as separate but correlated factors—were unrelated to the individual measures of EF, which included modified versions of both the arrow flanker and Simon tasks. In summary, inhibitory control is probably task-specific, not domain-general, and not causally related to gF. At best, there may be subsets of nonverbal interference tasks that share more specific mechanisms of conflict resolution. Going forward, we should stop using the flanker, Simon, and spatial Stroop tasks.

Another major purpose was to further evaluate the relationship between trait measures of self-control or impulsivity and measures of inhibitory control that are commonly used in cognitive psychology laboratories. Although the array of nonverbal interference tasks used in the present study was different from most of the cognitive control tasks surveyed by Duckworth and Kern (2011), our results sustain their conclusion that trait-like measures of self-control and interference control measured in RT tasks are not measuring the same thing. The differences in temporal dynamics and motivation may contribute to this dissociation. In any event, one should not interpret interference scores as “inhibitory control,” “self-control,” or “impulsivity” without converging evidence supporting such a generalization.


Language Development

Given the remarkable complexity of a language, one might expect that mastering a language would be an especially arduous task; indeed, for those of us trying to learn a second language as adults, this might seem to be true. However, young children master language quickly and with relative ease. B. F. Skinner (1957) proposed that language is learned through reinforcement. Noam Chomsky (1965) criticized this behaviorist approach, asserting instead that the mechanisms underlying language acquisition are biologically determined. The use of language develops in the absence of formal instruction and appears to follow a very similar pattern in children from vastly different cultures and backgrounds. It would seem, therefore, that we are born with a biological predisposition to acquire language (Chomsky, 1965; Fernández & Cairns, 2011). Moreover, it appears that there is a critical period for language acquisition, such that proficiency at acquiring language is maximal early in life; generally, as people age, the ease with which they acquire and master new languages diminishes (Johnson & Newport, 1989; Lenneberg, 1967; Singleton, 1995).

Children begin to learn about language from a very early age (Table 2). In fact, it appears that this is occurring even before we are born. Newborns show a preference for their mother’s voice and appear to be able to discriminate between the language spoken by their mother and other languages. Babies are also attuned to the languages being used around them and show preferences for videos of faces that are moving in synchrony with the audio of spoken language versus videos that do not synchronize with the audio (Blossom & Morgan, 2006; Pickens, 1994; Spelke & Cortelyou, 1981).

Table 2. Stages of Language and Communication Development
Stage | Age | Developmental Language and Communication
1 | 0–3 months | Reflexive communication
2 | 3–8 months | Reflexive communication; interest in others
3 | 8–12 months | Intentional communication; sociability
4 | 12–18 months | First words
5 | 18–24 months | Simple sentences of two words
6 | 2–3 years | Sentences of three or more words
7 | 3–5 years | Complex sentences; has conversations

Each language has its own set of phonemes that are used to generate morphemes, words, and so on. Babies can discriminate among the sounds that make up a language (for example, they can tell the difference between the “s” in vision and the “ss” in fission) early on. At first, they can differentiate between the sounds of all human languages, even those that do not occur in the languages used in their environments. However, by the time they are about 1 year old, they can only discriminate among those phonemes that are used in the language or languages in their environments (Jensen, 2011; Werker & Lalonde, 1988; Werker & Tees, 1984).

Watch It

This video explains some of the research surrounding language acquisition in babies, particularly those learning a second language.

Newborn Communication

Figure 2. Before they develop language, infants communicate using facial expressions.

Do newborns communicate? Certainly, they do. They do not, however, communicate with the use of language. Instead, they communicate their thoughts and needs with body posture (being relaxed or still), gestures, cries, and facial expressions. A person who spends adequate time with an infant can learn which cries indicate pain and which ones indicate hunger, discomfort, or frustration.

Intentional Vocalizations

Infants begin to vocalize and repeat vocalizations within the first couple of months of life. That gurgling, musical vocalization called cooing can serve as a source of entertainment to an infant who has been laid down for a nap or seated in a carrier on a car ride. Cooing serves as practice for vocalization. It also allows the infant to hear the sound of their own voice and try to repeat sounds that are entertaining. Infants also begin to learn the pace and pause of conversation as they alternate their vocalization with that of someone else and then take their turn again when the other person’s vocalization has stopped. Cooing initially involves making vowel sounds like “oooo.” Later, as the baby moves into babbling (see below), consonants are added to vocalizations such as “nananananana.”

Babbling and Gesturing

Between 6 and 9 months, infants begin making even more elaborate vocalizations that include the sounds required for any language. Guttural sounds, clicks, consonants, and vowel sounds stand ready to equip the child with the ability to repeat whatever sounds are characteristic of the language heard. These babies repeat certain syllables (ma-ma-ma, da-da-da, ba-ba-ba), a vocalization called babbling because of the way it sounds. Eventually, sounds that are not part of the surrounding language drop out as the infant grows more accustomed to that language. Deaf babies also use gestures to communicate wants, reactions, and feelings. Because gesturing seems to be easier than vocalization for some toddlers, sign language is sometimes taught to enhance the child’s ability to communicate by making use of the ease of gesturing. The rhythm and pattern of language are used when deaf babies sign just as when hearing babies babble.

At around ten months of age, infants can understand more than they can say. You may have experienced this phenomenon as well if you have ever tried to learn a second language. You may have been able to follow a conversation more easily than to contribute to it.


Holophrasic Speech

Children begin using their first words at about 12 or 13 months of age and may use partial words to convey thoughts at even younger ages. These one-word expressions are referred to as holophrasic speech (holophrase). For example, the child may say “ju” for the word “juice” and use this sound when referring to a bottle. The listener must interpret the meaning of the holophrase. When this is someone who has spent time with the child, interpretation is not too difficult. They know that “ju” means “juice,” which means the baby wants some juice! But someone who has not been around the child will have trouble knowing what is meant. Imagine the parent who exclaims to a friend, “Ezra’s talking all the time now!” The friend hears only “ju da ga,” which, the parent explains, means “I want some juice when I go with Daddy.”

Underextension

A child who learns that a word stands for an object may initially think that the word can be used for only that particular object. Only the family’s Irish Setter is a “doggie.” This is referred to as underextension. More often, however, a child may think that a label applies to all objects that are similar to the original object. In overextension, all animals become “doggies,” for example.

First words and cultural influences

First words for English-speaking children tend to be nouns: the child labels objects such as a cup or a ball. In a verb-friendly language such as Chinese, however, children may learn more verbs. This may also be due to the different emphasis given to objects based on culture. Chinese children may be taught to notice actions and relationships between objects, while children from the United States may be taught to name an object and its qualities (color, texture, size, etc.). These differences can be seen when comparing interpretations of art by older students from China and the United States.

Vocabulary growth spurt

One-year-olds typically have a vocabulary of about 50 words. But by about two years of age, they have a vocabulary of roughly 200 words and begin putting those words together in telegraphic speech (short phrases). This language growth spurt is called the naming explosion because many early words are nouns (persons, places, or things).

Two-word sentences and telegraphic speech

Words are soon combined, and 18-month-old toddlers can express themselves further by using phrases such as “baby bye-bye” or “doggie pretty.” Words needed to convey messages are used, but the articles and other parts of speech necessary for grammatical correctness are not yet included. These expressions read like a telegram (or, perhaps a better analogy today, like a text message) in which unnecessary words are omitted. “Give baby ball” is used rather than “Give the baby the ball,” just as a text message might read “Send money now!” rather than “Dear Mother, I really need some money to take care of my expenses.” You get the idea.

Child-directed speech

Why is a horse a “horsie”? Have you ever wondered why adults tend to use “baby talk” or that sing-song type of intonation and exaggeration used when talking to children? This represents a universal tendency and is known as child-directed speech or motherese or parentese. It involves exaggerating the vowel and consonant sounds, using a high-pitched voice, and delivering the phrase with great facial expression. Why is this done? It may be in order to clearly articulate the sounds of a word so that the child can hear the sounds involved. Or it may be because when this type of speech is used, the infant pays more attention to the speaker and this sets up a pattern of interaction in which the speaker and listener are in tune with one another. When I demonstrate this in class, the students certainly pay attention and look my way. Amazing! It also works in the college classroom!

Watch It

This video examines new research on infant-directed speech.


Theories of Language Development

How is language learned? Each major theory of language development emphasizes different aspects of language learning: that infants’ brains are genetically attuned to language, that infants must be taught, and that infants’ social impulses foster language learning. The first two theories of language development represent two extremes in the level of interaction required for language to occur (Berk, 2007).

Chomsky and the language acquisition device

This theory posits that infants teach themselves and that language learning is genetically programmed. The view is known as nativism and was advocated by Noam Chomsky, who suggested that infants are equipped with a neurological construct referred to as the language acquisition device (LAD), which makes infants ready for language. The LAD allows children, as their brains develop, to derive the rules of grammar quickly and effectively from the speech they hear every day. Therefore, language develops as long as the infant is exposed to it. No teaching, training, or reinforcement is required for language to develop. Instead, language learning comes from a particular gene, brain maturation, and the overall human impulse to imitate.

Skinner and reinforcement

This theory is the opposite of Chomsky’s because it suggests that infants need to be taught language. The idea arises from behaviorism. Learning theorist B. F. Skinner suggested that language develops through reinforcement: sounds, words, gestures, and phrases are encouraged by following the behavior with attention, words of praise, treats, or anything else that increases the likelihood that the behavior will be repeated. This repetition strengthens associations, so infants learn the language faster when parents speak to them often. For example, when a baby says “ma-ma,” the mother smiles and repeats the sound while showing the baby attention, so “ma-ma” is repeated due to this reinforcement.
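As a toy illustration of this reinforcement account (our sketch, not Skinner’s own formalism), one can model each vocalization’s emission probability as a weight that is nudged upward whenever the sound is rewarded and downward otherwise. The sound names and the learning rate below are hypothetical.

```python
# Toy operant-conditioning sketch: rewarded vocalizations become more likely,
# unrewarded ones are gradually extinguished.
import random

sounds = {"ma-ma": 0.2, "ba-ba": 0.2, "goo": 0.2}  # initial emission weights
reinforced = {"ma-ma"}                             # sounds the caregiver rewards
alpha = 0.1                                        # learning rate (assumed)

random.seed(0)
for _ in range(200):
    # The infant emits a sound in proportion to its current weight.
    sound = random.choices(list(sounds), weights=sounds.values())[0]
    reward = 1.0 if sound in reinforced else 0.0
    # Nudge the emitted sound's weight toward reward (1) or extinction (0).
    sounds[sound] += alpha * (reward - sounds[sound])

print(sounds)  # "ma-ma" dominates after repeated reinforcement
```

Under these assumptions, the rewarded sound quickly comes to dominate the infant’s output, which is the core of the behaviorist claim, and also why critics like Chomsky argued that reinforcement alone cannot explain the rapid acquisition of full grammar.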

Social pragmatics

Another language theory emphasizes the child’s active engagement in learning the language out of a need to communicate. Social impulses foster infant language because humans are social beings and must communicate, being dependent on each other for survival. The child seeks information, memorizes terms, imitates the speech heard from others, and learns to conceptualize using words as language is acquired. Tomasello & Herrmann (2010) argue that all human infants, as opposed to chimpanzees, seek to master words and grammar in order to join the social world. Many would argue that all three of these theories (Chomsky’s nativism, conditioning, and social pragmatics) are important for fostering the acquisition of language (Berger, 2004).



Strengths, Limitations, and Conclusions

Overall, this research represents a novel investigation of emotion-cognition linkages framed within a differential susceptibility model, and includes several methodological strengths. First, use of a behavioral paradigm to index cognitive processing eliminated distortion due to response biases such as social desirability, which may occur when informants select responses that will be viewed favorably by others (e.g., the endorsement of positive but not negative maternal attributes). Second, emotional reactivity was assessed in response to naturally occurring events, thereby minimizing confounds associated with estimating reactions to hypothetical stressors. Finally, the administration of semi-structured diagnostic interviews provided a comprehensive and refined assessment of maternal psychopathology.

In spite of these strengths, several limitations are worth noting. First, maternal psychopathology served as a proxy for the emotional quality of caregiving experiences; it would be helpful in future research to assess specific parenting behaviors during mother-child interactions (e.g., maternal sensitivity) that may shape youths' cognitive processing. Second, the study included a relatively small sample of youth, in which only a subset of caregivers experienced diagnoses or subclinical symptoms of psychopathology. Thus, future research will need to replicate these findings in a large, ethnically diverse sample of youth, as well as in samples of caregivers with diagnostic levels of psychopathology. Third, our emotional reactivity index reflected the experience of negative emotionality in response to stress. Although this index is consistent with the construct of difficult temperament, which is the focus of theory and research on differential susceptibility, it is unclear whether the cognitive benefits accrued to youth with high emotional reactivity resulted from non-depressed mothers' ability to react in an emotionally supportive manner when youth are stressed, or whether the same youth also displayed heightened positive emotionality in response to support, thereby resulting in positive cognitive biases. Finally, this study specifically examined cognitive biases during the processing of mother-referent information, and it remains to be determined whether the results generalize to youths' cognitive processing of other relationships (e.g., peers, siblings) or non-interpersonal domains (e.g., academics, health).

In sum, these findings illuminate one personal characteristic of youth that shapes emotion-cognition linkages during early adolescence, and reveal trade-offs of emotional reactivity for cognitive processing such that both enhancing and impairing effects emerge as a function of socialization environment. That is, in the context of maternal depression, youths' heightened emotional arousal and distress may impair cognition by generating a perseverative focus on negative features of the environment, including information about emotionally insensitive or unavailable caregivers. In contrast, in parenting contexts characterized by low maternal depression (and, perhaps, accompanying warmth and sensitivity), youths' emotional reactivity may enhance cognition by allowing youth to interpret caregiving interactions in a positive light. Given that negative cognitive biases represent a risk factor for depression, these findings implicate youths' emotional reactivity and maternal depression as joint targets of intervention and prevention endeavors. Overall, this research emphasizes the importance of considering integrative, developmentally sensitive perspectives of the complex interplay between emotion and cognition, which may involve mutually enhancing or impairing associations, particularly as emotion-cognition linkages pertain to the onset and maintenance of psychopathology across the lifespan.

