Chapter 6 Perceiving the World Around Us: How Divergent Methods Illustrate Convergent Perspectives

Jordan Wylie, The Graduate Center, CUNY, May 25 2018

6.1 Introduction

Exogenous sensory data help us navigate through our daily lives. Our body’s specialized cells and tissues receive raw sensory information and translate it into signals that the mind and body can understand. The architecture of our brains is well-suited to handle, sort, and filter through the enormous amounts of sensory signals and noise that we encounter every day. One such sensory modality, vision, dominates phenomenological experience and has, in turn, dominated research on both bottom-up (or outside-in) and top-down (or inside-out) approaches to perception. The rich visual system literature spans domains and methodologies within psychology and related disciplines, many of which suggest a shared understanding of how the brain is able to integrate stored information while continually processing new incoming stimuli. Though the answer may still be unclear, new computational approaches and computer-informed methodologies have shed light on an age-old debate: do our motivations and expectations inform conscious perception?

Research spanning decades has demonstrated the amazing capacity of the human visual system. Much of the brain’s posterior cortical structure is devoted in some way to processing visual information, with nearly half of the nonhuman primate neocortex being devoted to such processes (DiCarlo, Zoccolan, & Rust, 2012; Felleman & Van Essen, 1991). The dense visual network within the human brain is not contained within the occipital lobe, but recruits assistance from surrounding cortical areas. During visual processing, neural pathways work together to quickly discriminate between stimuli on a vast number of features; patterns, colors, motion, and many other structural features of our visual environments are registered on the retina and then integrated into ongoing neural and cognitive processing.

Despite a vast literature, a complete mechanistic understanding of visual perception and recognition is still absent. Researchers continue to debate how exactly we interpret the world around us, and which methods are most appropriate for tackling that question. Some theorists have proposed a functionally impenetrable visual perception that is unadulterated by cognitive processes occurring elsewhere in the brain (Pylyshyn, 1999), while others disagree (see Friston, 2010), believing perception to be integrated with cognition, as are nearly all other functions within the brain. The present chapter will attempt to review this issue and relevant research findings guided by the predictive brain lens.

Specifically, the focus of this chapter will be to examine how predictions inform visual recognition, as supported by neuroscientific and cognitive findings. How might social norms, our motivations and emotions, and informational assumptions influence what we see? Perceptual and recognition accuracy are fundamental to our visual experience, allowing us to interpret and make sense of the world around us. While research on the visual system touches on many other important aspects of visual experience (e.g., attention), those are not within the scope of the present review. Instead, I will focus on how neuroscientific, affective and motivational, and cognitive approaches can reveal important information about visual object recognition.

6.2 The Visual System and Present Controversy

It is well documented that human beings have exceptional visual capabilities. While an owl may see with acuity at night, and the lizard may lack a visual blind spot, the human visual system, which has evolved from a shared primate brain, allows for flexibility and an emergent, powerful ability to predict. The combination of our physiology and cognitive abilities enables the integration of vast amounts of visual information to create perceptual experiences that do not deviate much from those of the people around us. Indeed, visual experience requires some uniformity to ground humans in an agreed-upon representation of reality. Perception is, therefore, rooted in this understanding and must be tethered to some similarity across people. As a counterexample, hallucinations demonstrate how perception without reality constraints lacks any observable order (see Clark, 2013). We must agree that a particular pattern of waves hitting the retina yields the color green; this is the first step in semantic development and in abstracting away important ideas. But the question remains: how do we (or do we at all) use previous information, memories, and states to inform and facilitate visual perception?

However, these questions are not novel. Beginning as far back as Descartes (1637), there has been a marked intrigue in how, mechanistically, humans are able to assimilate the extensive visual information present at any given moment to adequately traverse our social environments. This curiosity has not waned. Visual system-centered work has extended beyond the musings of Da Vinci and Descartes to more modern-day science like that of Hubel (Polyak, 1957; Schmolesky, 1995). Today, we capitalize on access to neurophysiological components of the visual system and a general template for information processing to inform questions and research concerning vision. Research on a similar primate visual system, that of the macaque monkey, illuminated some key neural structures that process visual information (Fitzpatrick, Itoh, & Diamond, 1983; Shipp & Zeki, 1989). By mapping specific connections across brain regions, we have begun to piece together where the visual system is distributed in the brain, which regions are most crucial for processing visual information, and the cascade of processes and networks responsible for bringing visual information to conscious awareness.

However, correlating activity and locating areas within the brain can only answer so many questions. It is necessary to extend these models to other domains to understand how these areas work. Are they running in parallel or serially? Modular accounts of visual recognition suggest that distinct visual cortical areas (V1-V5) process different types of visual information and together make up the primary cascade for recognizing objects (Ungerleider & Haxby, 1994). Visual information first hits the retina and is pooled by ganglion cells. It is here that the other important facets of the visual system are most salient. Namely, attending to specific areas allows information to enter the field of vision, with information situated close to the fovea most effectively represented. After this, information is relayed through the lateral geniculate nucleus (LGN), which finally sparks the cascade of processing to create the final representation (Van Essen, Anderson, & Felleman, 1992; Felleman & Van Essen, 1991).

Beginning at the primary visual cortex, or V1, rudimentary object features are calculated from raw visual information (Desimone & Ungerleider, 1989). From there, increasingly specialized areas selectively fill in missing information that builds up to a representation of the object at the conscious level (Kastner & Ungerleider, 2000). Moreover, computational evidence suggests that the cascade of processing within the visual system that leads to the conscious categorization and subsequent identification of objects proceeds in a hierarchical fashion. Again, using the macaque monkey’s visual system as a proxy, findings suggest that specific cortical areas (e.g., V1, V4) process distinct components of the overall sensory input. This processing occurs through a series of stages in which low-level features of the two-dimensional space are processed first, eventually producing a three-dimensional (3-D) object representation (Perrett & Oram, 1993). Other theorists have extended this work to evaluate its applicability given human biological constraints, finding that the physiological evidence implicating the inferior temporal cortex (IT) can be modelled to reproduce basic recognition performance (Riesenhuber & Poggio, 1999). Essential to this line of reasoning is the feed-forward building of complexity.
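
The feed-forward building of complexity can be illustrated with a toy example. The sketch below is my own simplification and not the model of Riesenhuber and Poggio (1999): it assumes a one-dimensional "image", a single hypothetical edge-detecting filter standing in for an early feature stage, and a max-pooling stage standing in for a later, more position-tolerant stage. The function names and parameters are illustrative only.

```python
def simple_layer(signal, kernel):
    """Early stage: slide a small filter over the input (template matching)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def complex_layer(responses, pool):
    """Later stage: keep only the strongest response in each neighbourhood."""
    return [max(responses[i:i + pool]) for i in range(0, len(responses), pool)]


edge_filter = [-1, 1]  # responds to a dark-to-light transition

# Two inputs with the same edge at slightly different positions,
# and a third with the edge far away.
edge_at_3 = [0, 0, 0, 1, 1, 1, 1, 1]
edge_at_4 = [0, 0, 0, 0, 1, 1, 1, 1]
edge_at_6 = [0, 0, 0, 0, 0, 0, 1, 1]

pooled_3 = complex_layer(simple_layer(edge_at_3, edge_filter), pool=4)
pooled_4 = complex_layer(simple_layer(edge_at_4, edge_filter), pool=4)
pooled_6 = complex_layer(simple_layer(edge_at_6, edge_filter), pool=4)

print(pooled_3, pooled_4, pooled_6)
```

In this sketch the early stage is sensitive to exactly where the edge falls, whereas the pooled stage gives the same output for small shifts and a different output for a large one. That is the sense in which each successive stage trades positional detail for a more abstract description of the input.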

Neurophysiological mapping has established specific subdivisions, each of which contributes to the overall functioning. These subdivisions are largely made up of different cellular channels, including the magnocellular (M-pathway) and the parvocellular (P-pathway), research that has been spearheaded by studies of the macaque monkey visual system (Maunsell, Nealey, & DePriest, 1990). The M-pathway and P-pathway have also been linked to specific spatial frequency information, whereby the M-pathway, situated primarily in the dorsal stream, is sensitive to low-spatial frequency information, and the P-pathway, primarily located in the ventral stream, is sensitive to high-spatial frequency information (Burr, Morrone, & Ross, 1994; Goffaux et al., 2005). Whereas the P-pathway primarily facilitates perception of contrast and color, notably higher-order features of visual perception, the M-pathway facilitates perception of motion and coarse greyscale information (Merigan & Maunsell, 1993; Vuilleumier et al., 2003). Interestingly, these pathways are also thought to map onto unconscious (M-pathway) and conscious (P-pathway) visual processing (Tapia & Breitmeyer, 2011). These divergent features of cellular channels within the visual system highlight important characteristics of our primate visual system; namely, the parallel, coordinated nature of visual processing.
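
To make the distinction between low- and high-spatial frequency information concrete, the following sketch splits a one-dimensional luminance profile into a coarse component and a fine component. It is an illustration only: the box-blur filter, the `radius` parameter, and the toy luminance values are my own assumptions, and a simple blur is a stand-in for, not a model of, magnocellular or parvocellular filtering.

```python
def box_blur(signal, radius):
    """Low-pass filter: average each sample with its neighbours within `radius`."""
    n = len(signal)
    blurred = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def split_spatial_frequencies(signal, radius=2):
    """Return (low, high): the coarse layout and the fine detail left over."""
    low = box_blur(signal, radius)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# A coarse dark-to-light step with a fine alternating texture on top.
step = [0.0] * 8 + [1.0] * 8
luminance = [v + (0.1 if i % 2 else -0.1) for i, v in enumerate(step)]

low, high = split_spatial_frequencies(luminance, radius=2)
print([round(v, 2) for v in low])
print([round(v, 2) for v in high])
```

The low-frequency component preserves the overall step (the global outline) while smoothing away most of the texture; the high-frequency component carries the texture and the sharpness of the edge. Adding the two components back together recovers the original profile.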

Whereas these approaches are founded on the bottom-up nature of visual perception and object recognition, recent research has begun to challenge this view. For instance, the subjective value of an object may affect how close that object is perceived to be. Subjectively more desired objects (e.g., money) are estimated as closer in proximity than less desirable objects (Balcetis & Dunning, 2010). Further, research on the potential for action suggests that perception of hill steepness is influenced by metabolic costs (Proffitt, 2006). This line of research has revived the New Look debate, which claims that our beliefs, motivations, affordances, and more directly affect how we interpret incoming visual data (Balcetis, 2016). The updated New Look advocates a penetrability of perception by cognition, while other researchers continue to advocate a cognitively “impenetrable” V1, suggesting methodological limitations have stymied legitimate challenges to conventional conceptions of bottom-up processing (see Firestone & Scholl, 2016). Are the differences in observed behavior a function of response biases or actual perceptual modulation? Does the money actually appear closer? Or is it just a function of the relative desirability? If a judgement is driving the observed differences in responses, early regions of the visual processing cascade may be independent of cognitive influence. While evidence is mounting, these questions remain to be answered.
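
One common way to formalize the difference between a response bias and a change in perceptual sensitivity is signal detection theory. That framework is not developed in this chapter, so the sketch below is offered only as an illustration under the standard equal-variance Gaussian assumption. The function names and the numerical values are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def rates_from(d_prime, criterion):
    """Hit and false-alarm rates implied by a given sensitivity and criterion."""
    hit = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    false_alarm = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit, false_alarm


def sensitivity_and_bias(hit, false_alarm):
    """Recover d' (sensitivity) and c (criterion) from observed response rates."""
    z_hit = _STD_NORMAL.inv_cdf(hit)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Two hypothetical observers with identical sensitivity. The second is simply
# more willing to say "yes, I saw it" (a more liberal criterion).
neutral = rates_from(d_prime=1.5, criterion=0.0)
liberal = rates_from(d_prime=1.5, criterion=-0.5)

print(neutral, sensitivity_and_bias(*neutral))
print(liberal, sensitivity_and_bias(*liberal))
```

The liberal observer reports the target more often, yet the recovered sensitivity is the same for both. A difference in raw reports of the kind described above is therefore compatible with a shift in the criterion alone, which is why both hits and false alarms are needed before attributing an effect to perception itself.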

In the following sections, I will attempt to answer these questions through the predictive lens. First, I will touch on prominent approaches to studying object recognition in humans. While the lens through which object recognition is studied varies greatly, an overarching goal is to better understand how human brains utilize prior information to inform ongoing processing of visual information. How might stimulus-driven conceptions of vision be limiting? What units or features of visual stimuli make up the foundation for understanding the complex objects we encounter? The scope of our knowledge about our environments is vast, informed by physiological states, memory, and sensory signals across modalities. How all of this information is integrated represents a critical question within vision research. Beginning with neuropsychological approaches to vision and then moving to cognitive science methods, the following sections will attempt to consider the relative advances and limitations of studying vision through the predictive lens.

To answer these questions of general perception and more specific questions of object recognition, methodological approaches began with neural pathway mapping. This was useful in identifying particular neurons that process specific information, but it does not get us closer to understanding how this happens. Computational approaches are currently gaining momentum. These approaches differ slightly from the traditional neurobiological study of perception, instead assessing how well information processing models fit observed data.

6.2.1 Neurophysiological Evidence

Neural approaches have guided much of the research on the visual system and predictive vision. These approaches capitalize on the physical accessibility of neuroscientific methodology, enabled by the similarity in the visual cortices of other primates (Milner & Goodale, 1995). Early work has isolated two major pathways by which our visual system preferentially processes distinct components of what we see (Milner & Goodale, 2008). Specifically, the ventral stream tells us what an object is, while the dorsal tells us where it is (Goodale & Milner, 1992; Goodale & Westwood, 2004). Moreover, the ventral stream is a low-level visual pathway, made up of descending and ascending routes, that is necessary for detailed visual information (Bar, 2000). Much of the research on ventral stream processing has focused on the traditional bottom-up approach, or on the ascending pathways, whereby visual information hits the V1, V2, and V4 cortical areas and then projects onto the high-level regions that are implicated in object perception like the IT (Bar, 2000). Generally, it is through these visual cortices (V1-V4) that the ventral visual stream hierarchically creates the visual representation (Hong et al., 2016). In contrast, the dorsal stream is implicated in movement-based vision, aiding in the calibration of motor functions and detection of movement in the periphery. This pathway is less implicated in the process of object recognition and is instead fine-tuned for perception for action (Goodale & Milner, 1992). However, some research has suggested that our brain may rely on predictions from gross low-level features to supply an initial guess for other parts of the visual system to inform (Bar, 2003; Bar et al., 2006; Kveraga, Boshyan, & Bar, 2007). These findings highlight the potentially important contribution of the dorsal stream to complex visual object recognition and have opened the door to new explorations within the purview of the predictive lens.

The dual-system process model posits that ascending neural pathways in the ventral stream hierarchically build the representation of a given object. Though this is undoubtedly a piece of the puzzle, there is much to be gleaned from incorporating feedback loops into the model of visual object recognition. Specifically, research has demonstrated the importance of context in object perception (Bar, 2004; Fenske, Aminoff, Gronau, & Bar, 2006), engagement of higher-order structures in processing degraded or ambiguous stimuli (Wyatte, Curran, & O’Reilly, 2012), and perceptibility of low spatial frequency objects (Kveraga et al., 2007). Recent evidence has corroborated these findings, demonstrating context-dependent oscillation patterns within the prefrontal and parietal cortices (Helfrich, Huang, Wilson, & Knight, 2017). These findings emphasize the role of expectation in guiding and enhancing visual perception. Contextual factors provide cues to associate with similar objects, facilitating and even biasing visual processing.

The two-system approach garnered a great deal of support, engendering a flood of scientific research. However, there are fundamental limitations to such approaches. Namely, these approaches rely on isolating pathways in the brain, ostensibly overlooking the complexities and interdependences that exist between these pathways. It is well documented that the brain is an iterative, dynamic organ (Cunningham & Zelazo, 2007). Research in this domain has implicated the PFC and reflective processing to extricate human neural processing from the more automatic associative processing seen in animals, favoring a hierarchical brain architecture that allows for afferent and efferent connections between brain regions (see Zelazo & Cunningham, 2007). This approach has also been situated within the object recognition literature. Work done by Bar (2000) underscores how studying the pathways separately may have neglected important contributions from prefrontal brain areas to ongoing visual processing.

To fully understand how neurobiology might suggest a top-down, cognitive penetrability, it is important to reconcile the role of the orbitofrontal cortex (OFC) in the processing of affective information. Specifically, the amygdala has established connectivity to the OFC, which implicates it in encoding emotionally salient information (Pessoa & Adolphs, 2010). Studies have implicated the OFC in representing threat and reward (Kringelbach & Rolls, 2004), as well as in processing and representing auditory and visual information (Kringelbach, 2005). This research has further linked the OFC to visual processing by demonstrating that the OFC is activated around 80-130 ms after stimulus onset (Lamme & Roelfsema, 2000). While this is not the earliest component of visual processing (<100 ms), utilization of fMRI and MEG imaging has established the temporal activity during the short latency period, a time early enough to modulate ongoing processing (Barrett & Bar, 2009; O’Callaghan, Kveraga, Shine, Adams, & Bar, 2016).

Finally, while not the primary focus of the present chapter, attention is an important part of visual perception that often obfuscates the interpretation of visual system penetrability. Attention is an obviously critical facet of the visual system and is important to understanding the ways in which higher-order cognitive processes bias visual processing. We can only process what we attend to, and as such, visual attention operates as a sort of gatekeeper in the cascade of conscious object representation. Specific stimulus properties, like emotion, are prioritized, which increases the likelihood that they will be attended to (Öhman, Flykt, & Esteves, 2001). Indeed, like a perceptual bias or expectation, attentional biases increase the probability of recognition. This makes attentional and perceptual biases difficult to disentangle. However, even at the attentional level, some findings suggest an influence of visual expectations (Gantman & Van Bavel, 2014). Attentional control refers to processes that both suppress irrelevant stimuli (temporal attention) and broaden the breadth of visual field input (spatial attention), and research has shown that these attentional systems are differentially biased by emotional states (Clore & Huntsinger, 2007; Gable & Harmon-Jones, 2008, 2010). Spatial, or broadened, attention increases target detection in peripheral locations, yet increases inaccurate responses due to the costs associated with unfocused processing of visual stimuli. Conversely, temporal, or flexible, attention requires a focused lens, making irrelevant and peripheral targets difficult to process, but increases the accuracy of target identification. Specifically, previous research has shown that high arousal emotions increase local target detection compared to happiness and sadness (Easterbrook, 1959; Eysenck et al., 2007; Gable & Harmon-Jones, 2008, 2010; Clore & Huntsinger, 2007; Wells & Matthews, 1996).
These results highlight the ways in which motivations and emotions may influence attentional processing, subsequently influencing perception. Therefore, even on the attentional level, top-down expectancies can produce influences on visual processing.

6.2.2 Emotions, Motivations, and Perception

A critical piece of the present argument is that emotions, often referred to as a part of system-one (Kahneman, 2011), create powerful preferences through which we see the world. Emotions are a crucial component of our capacity to navigate social systems. As mentioned above, emotion saliency provides an interesting intersection between attention and prediction. Emotions guide the way we see our worlds, interact with others, and motivate goal-directed behavior. Research highlights the impact of emotions on our lives, demonstrating that emotions influence our attitudes (DeSteno, Dasgupta, Bartlett, & Cajdric, 2004; Esses & Dovidio, 2002), our decisions (Lerner & Keltner, 2001; Lerner, Small, & Loewenstein, 2004), and our judgements (Forgas, 2013; Clore & Huntsinger, 2007).

As mentioned above, recent studies have illustrated that there is a top-down contribution to object recognition stemming from the dorsal stream (Kveraga et al., 2007). One prominent theory, the Frame and Fill Theory (FnF), posits that object processing within the ventral stream relies on contributions from the dorsal stream (magnocellular connections; M-pathway), which contributes global outlines of visual input and estimates of what the object is via the OFC. The ventral visual stream (parvocellular channels; P-pathway) relies on the global template and ‘fills’ in necessary details for accurate object recognition (Bullier, 2001; Chen et al., 2007). Further, by introducing the dorsal stream as a mechanism through which emotions may bias object recognition, there may also be important implications for the biasing of attention. Research suggests that the dorsal stream governs the shifting of attention (Siegel, Donner, Oostenveld, Fries, & Engel, 2008). Thus, the FnF theory provides a cohesive model of attention and object recognition for studying biases that influence early processing, specifically in relation to biases caused by emotional content. Through this lens, the relationships between emotion and object recognition can be better tested: by biasing processing toward the M-pathway and the dorsal stream, object recognition (and flexible attention) may be facilitated.

Findings from the lens of affective neuroscience also suggest that the primacy of emotion may guide gating mechanisms of early visual inputs as well as recruit the engagement of the OFC, altering ongoing perceptual and visual processing (Barrett & Bar, 2009; Schmitz et al., 2009). Emotions and affective states are informationally rich, intimately interacting with cognitive processes (see Clore, Gasper, & Garvin, 2001). Together, emotion and perception optimize the visual identification process. Specific evidence has implicated positive and negative affect in the encoding of visual information, suggesting that differences between positive and negative states interact with the encoding of peripheral information by altering the field of vision (Schmitz, De Rosa, & Anderson, 2009; Rowe et al., 2007). Moreover, binocular rivalry studies have unlocked interesting insights into what achieves perceptual dominance. Emotional faces (Alpers & Gerdes, 2007), affectively conditioned stimuli (Alpers et al., 2005), and motivationally valued stimuli (Balcetis, Dunning, & Granot, 2012) all overtake the perceptual experience compared to neutral and control stimuli. Collectively, these findings hint at important relationships between emotional or motivational value and visual prioritization. If emotions interact with value representations of stimuli, expectations and predictions may be reflected in the processing of visual information. However, within the dominant framing of visual processing, connectivity between emotion “centers” in the brain (e.g., the amygdala) and top-down contributors (e.g., OFC) is not well established.

6.2.3 Zooming in on Fear

Fear has emerged as an important emotion for assessing differences in processing during visual perception. A surfeit of research has investigated the link between amygdala functioning and emotions, much of which has focused on amygdala activation in the recognition of and response to potential threats through feedback from the visual cortex (Amaral et al., 1992; LeDoux, 1998; LeDoux, 2002; Pessoa et al., 2002; Adolphs & Spezio, 2006). Further, studies have shown automatic detection of threatening stimuli presented outside of conscious awareness (Öhman & Mineka, 2001; Öhman, 2005), enhanced attention in visual searches when in fearful states (Öhman, Flykt, & Esteves, 2001), and biased perceptions when afraid (Stefanucci, Proffitt, Clore, & Parekh, 2008). Such results suggest that emotion, and specifically fear, may play a unique role in initial attention and perception, and may influence higher-level processes like object identification.

While the role of the amygdala in the processing of emotion is well established, the purported mechanisms that affect cognitive processes in fear states are incompatible, relying on conflicting top-down and bottom-up processing models to explain a variety of phenomena (Pessoa & Adolphs, 2010). Specifically, two routes have been posited for amygdala-directed processing: the low route, which has the advantage of speed and runs directly from the thalamus through subcortical structures, and the high route, which runs from the thalamus to the visual cortex to the amygdala (Rolls, 1999). Indeed, the amygdala provides a critical source of input for affective processing. Yet evidence of a low route existing in higher-order species is lacking, and high route processing has yet to reconcile speed issues (Shi & Davis, 2001). This inconsistency diminishes understanding of how the amygdala gets information to the level of consciousness quickly enough to incite action.

It is well established that fear, compared to other emotional states, facilitates attention toward and perception of dangerous entities. Again, the common framework for explaining these findings is that primary cortical visual pathways send low-grade visual information to the amygdala, which then identifies threatening entities. Mixed findings and the lack of a unifying theory have limited understanding of how cognition might influence ongoing processing. Moreover, these approaches rely on the primacy of affect, suggesting that cognition has little to do with initial processing of affective information, a notion that continues to be contested (Lazarus, 1982; Storbeck & Clore, 2007).

The emphasis on subcortical processing of information restricts the role of top-down contributions, such as motivations, perceptions, and attitudes, in the processing of emotional stimuli. For instance, the visual system is sensitive to and biased by endogenous factors (Balcetis & Dunning, 2006; Tiedens, Unzueta, & Young, 2007; Skelly & Decety, 2012) and exogenous factors such as environmental cues (Proffitt, 2006; Cole, Balcetis, & Dunning, 2013), which can change the nature of processing of visual features and perceptions of such objects. Similarly, endogenous and exogenous factors may even bias attention. Fear, for example, may increase attentional flexibility and enhance the ability to detect peripheral objects, which is contrary to the standard assumption that fear only narrows attention (Awh & Pashler, 2010). Current paradigms examining such rapid detection of objects and subsequent object recognition rely on assuming the independence of dorsal and ventral streams, with scarce focus on how they interact with one another and how emotion may modulate such interactions.

6.2.4 Other Emotions

The majority of visual perception research examines how fear interacts with processing, though some research has suggested that other emotions (particularly negatively valenced ones) also produce biases. One such study investigated how faces are perceived as fundamentally different depending on the context in which the face was presented (Aviezer et al., 2008). Although this is notably not a study on object recognition, it nonetheless highlights how the visual system is context dependent. This occurs with something as vital as the accurate recognition and identification of other human faces. Emotion has marked effects on a number of cognitive processes and may have dissociable effects on object recognition during instances in which emotion states are congruent with predicted sensory input compared to when they are not.

6.2.5 Motivated Perception

In addition, research within the motivational domain has suggested an influence of motivational drive on perception. The role of goals in biasing visual system processing is a particularly consequential possibility (Weber, 1996; Inbar & Pizarro, 2009). For instance, within the social framework, moral goals have been shown to influence our decisions (Haidt, 2007), our attitudes (Helzer & Pizarro, 2011), our emotions (Haidt, 2003), and even the ‘popping out’ of salient words (Gantman & Van Bavel, 2014). Through a motivated perceptual lens, moral goals may facilitate object recognition of salient images. Research has also shown that race-based processing may rely on low-spatial frequency cues (Correll, Hudson, Guillermo, & Earls, 2017), suggesting some integration of social conceptualizations and expectancies into the ongoing visual perceptual process.

The effect of motivation on perception is not limited to the moral sphere. Research has also demonstrated goal- and action-driven effects on perception. For example, externally incentivizing a specific construal of ambiguous figures drives differences in reported encounters with the target construal (Balcetis & Dunning, 2006). A number of psychological studies have evidenced top-down effects informed by subjective value, race and stereotypes, and political climate. These types of social knowledge mark a high-level form of context, which has previously been implicated in object perception under isolated laboratory conditions. One study has suggested that, beyond frequency, learning, and biases in responses, perceptual dominance is explained by subjectively valued influences (Balcetis, Dunning, & Granot, 2012). Further, research done by Levin and Banaji (2006) revealed differences in the perceived lightness of faces matched on luminance. African American faces were seen as darker than European-descent faces, a finding that has been attributed to top-down knowledge of differences in facial features between these two groups (Levin & Banaji, 2006). Unfortunately, other research has corroborated the modulatory effects of racial stereotyping on perception. For instance, one study found that the race of the target predicted erroneously firing a gun in a computer game, even when the incentive was structured to reward accuracy, that is, shooting only targets who were holding a gun, not a tool (Correll, Park, Judd, & Wittenbrink, 2002). Another set of findings has demonstrated how a self-identified political group and government stability can alter perception of skin color, favoring lighter skin representations when the target is identified as a member of one’s political in-group (Caruso, Mead, & Balcetis, 2009) and under instances of in-group instability (Stern et al., 2016). Acquiring accurate social knowledge is important to individual functioning.
However, this means that our socially determined biases may permeate into cognition and perception. These findings highlight how such knowledge constrains visual processing and biases in the direction of one’s goals or beliefs.

6.2.6 Conclusion

To synthesize the dominant themes so far, the amygdala is crucial for processing of emotional stimuli and is specifically sensitive to fear. Motivational work also provides an informative perspective on how individual internal states or internally valued goals can fundamentally alter perception. Research on the mechanisms that govern how fear impacts attention and object recognition relies on conflicting cortical processing routes, routes which preclude top-down contributions and omit the early activation of prefrontal cortical areas of the brain. Consequently, more research is needed to establish and extend mechanistic understanding of the influence of fear. Fear imparts biases onto what we see, biases which can produce and reinforce maladaptive behavioral responses (e.g., always seeing a snake instead of a sock increases the stress response, a process that is ultimately corrosive for the body). The biases and predictions we bring into our phenomenological experiences constrain what we see, especially in instances in which relevant objects or scenes are obfuscated in some way.

On a broader level, this literature pulls at intuition because we know the human brain to be a remarkable predictor (though we are objectively bad lay statisticians). We have developed the ability to quickly detect threatening objects, and to do so in a way that errs toward more false positives than false negatives. Given what is known about the adaptability of the brain, its proclivity to make predictions (in terms of lessening energy costs), and the false positive bias, learned associations may be driving many of these predictions. Such predictions require descending neural processing, and like our conscious navigation of complex social environments, they are susceptible to errors and heuristically biased assumptions.
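The false positive bias can be illustrated with a standard equal-variance signal detection calculation. The sketch below is not taken from any of the studies cited in this chapter; the function names, the sensitivity value, and the cost values are illustrative assumptions. It shows that when missing a threat is costlier than a false alarm, the criterion that minimizes expected error cost shifts so that false alarms outnumber misses.

```python
import math
from statistics import NormalDist


def optimal_criterion(d_prime, p_threat, cost_miss, cost_false_alarm):
    """Evidence criterion that minimizes expected error cost.

    Assumes an equal-variance Gaussian model: evidence on no-threat
    trials ~ N(0, 1), evidence on threat trials ~ N(d_prime, 1).
    The observer reports "threat" when evidence exceeds the criterion.
    """
    # Optimal likelihood-ratio threshold when only errors carry a cost.
    beta = ((1 - p_threat) * cost_false_alarm) / (p_threat * cost_miss)
    # The likelihood ratio at x is exp(d' * x - d'^2 / 2); solve for x.
    return math.log(beta) / d_prime + d_prime / 2


def error_rates(d_prime, criterion):
    """Return (false alarm rate, miss rate) for a given criterion."""
    no_threat = NormalDist(0.0, 1.0)
    threat = NormalDist(d_prime, 1.0)
    false_alarm = 1 - no_threat.cdf(criterion)
    miss = threat.cdf(criterion)
    return false_alarm, miss


if __name__ == "__main__":
    # Hypothetical values: equal costs versus a miss that is 10x costlier.
    for cost_miss in (1.0, 10.0):
        c = optimal_criterion(1.5, 0.5, cost_miss, 1.0)
        fa, miss = error_rates(1.5, c)
        print(f"cost_miss={cost_miss}: criterion={c:.2f}, "
              f"false alarms={fa:.3f}, misses={miss:.3f}")
```

With equal costs the two error rates are balanced; raising the assumed cost of a miss lowers the criterion, trading more false alarms for fewer misses.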

6.2.7 Limits to these approaches

While some evidence has converged on the predictive advantage of both emotional and motivational states, there is still an ongoing debate as to where exactly these differences exist. Attention, response biases and demand characteristics are each potentially contributing to findings (for a detailed review see Firestone & Scholl, 2016). Indeed, parsing through whether prior information biases early visual processing (e.g., V1) is a controversial topic. Though scientists would agree on many substantive evaluations of visual processing (e.g., parallel processing), prefrontal cortical access to V1 is one discrepancy that is challenging to resolve using reverse inference (imaging studies) and ineffectually controlled behavioral paradigms. Critics of the descending neural pathway view of object recognition suggest that evidence cited above may be primarily attributed to judgements, and that the scope of our current technologies limits the claims that can be made. For instance, imaging studies using functional magnetic resonance imaging (fMRI) often rely on patterns of activation and lack temporal resolution.

Moreover, it is completely uncontroversial to note the interconnected nature of the brain. A large number of studies utilizing neurophysiological or emotion/motivational methods have emphasized specific “centers” in the brain. However, it remains unclear how selective these neural regions are. For instance, the amygdala was once thought to selectively attend to fear/threat stimuli (Davis, 1997). Recent research has suggested this may not be the case. Instead, the amygdala seems to come online for a number of stimulus features, including emotional saliency (not just fear) and novelty (Sander, Grafman, & Zalla, 2003). What about how we process faces? What makes this different from how we process objects? Additionally, distinguishing effects on perception from effects on judgement remains an important and difficult task.

6.3 Cognitive and computational approaches

Again, the primary aim of the present chapter is to synthesize evidence across domains that hints at the penetrability of perception. Instead of being able to draw a hard line between cognition and perception, evidence continues to mount suggesting perceptual experiences are shaped by temporal and spatial predictions (Rohenkohl, Gould, Pessoa, & Nobre, 2014). The location of this influence, then, is the critical component in the debate. Indeed, it would be uncontroversial to find that expectations shape judgements.

Instead of focusing on the physical regions of the brain that may allow for a particular sequence of neuronal firing, cognitive approaches are rooted in information processing theory. Namely, exogenous stimuli provide particular information that is then transduced and processed through the brain, much like a metaphorical computer. These approaches to object recognition have, as with models of instance theory (Hintzman, 1986), back-propagation (Rumelhart, Hinton, & Williams, 1986), and hierarchical models of associative memory (Fukushima, 1984), illustrated the benefit of representing structures in a way that makes mapping and generalizing patterns feasible. Though it is unlikely that human object recognition and computer vision converge, combining these models with theories of human vision may reveal interesting predictions and advances in the understanding of perception.

6.3.1 Cognitive Approaches

Before diving into cognitive theories of object recognition, it is first important to situate visual perception within the cognitive domain. Research within this domain focused on characteristics and patterns of visual perception (Gibson, 1950) and dominant processing styles (Navon, 1977). These research enterprises have spurred a great deal of subsequent visual research and continue to inform models today. For instance, Navon (1977) explicated how processing of global features precedes that of local features. This perspective directly maps onto other theories that consider the predictive mind and the quick processing of coarse information through M-pathway channels.

Early cognitive theories of object recognition were grounded in information processing perspectives. One such perspective came from Biederman (1987), who posited that objects are made up of reducible units called ‘geons’. These units provide the foundation for the visual system to build up and recognize more complex objects and scenes. This view, much like early semantic models (see McClelland & Rumelhart, 1981), relied on basic feature detection as the mechanism that allows for complex combinations and processing of visual information. From this perspective, object attributes like size and location do not appear to be integral to the recognition process. For example, researchers in a priming study found that object representations were not affected by removal of attributes or alteration of left-right orientation, suggesting that identification is occurring on the geon level (Biederman & Cooper, 1991; Hummel & Biederman, 1992). Further, controlled cognitive processes (e.g., semantic) cannot account for differences in priming effects. Instead, matched exemplars do not see a recognition advantage (Biederman & Cooper, 1991). While these approaches have yielded important results, there are still limitations in the organization of these attributes and geons, perspective and field of vision variations, and the concentration on a traditional bottom-up sequence that is triggered by individual stimulus properties.

Several other factors contribute to and moderate successful object recognition. Some work has attempted to extend basic units for recognition approaches by including subliminal priming. These approaches allow for a more specific understanding of how object representations reach the conscious level. For example, without inclusion of semantic effects, visual subliminal priming facilitated object recognition, even in instances in which the object’s location had changed (Bar & Biederman, 1998). Other work has demonstrated the effects of color on object recognition, which highlight some important features. Namely, color did not facilitate the identification and recognition of objects that were manufactured or that did not naturally occur in the presented color scheme (Humphrey, Goodale, Jakobson, & Servos, 1994). These findings underscore two important concepts: first, the cascade of processing that allows for integration of more complex information (including color) over the series, and second, the importance of expectation in the identification process. Colors that did not match their typical or predicted form did not facilitate the recognition process (Humphrey et al., 1994). Evidence in this direction supports the notion that priors and expectancies are informing ongoing visual perception. Encountering a mismatch of expectation requires more effortful, controlled processing to make sense of the prediction error.

6.3.2 Computational Approaches

Computational models of object recognition include a variety of methods and purposes. Many of these recent models primarily focus on error-reduction or variance mapping as a means to achieve a specific outcome, with little care for cognitive or neurophysiological theories. However, even these models still enable interesting, fruitful tests of object perception and cognitive penetration.

Attneave (1954) first emphasized that the primary role of visual perception is to process relevant information. From there, he claimed, it becomes clear how repetitive and interdependent the majority of our visual experience is, and how these associations allow for perceptual processes to incorporate higher-order information, as it is purely economical to do so. In spite of this perspective, much of the literature still focused on ascending, feed-forward processing. For example, Marr (1982) developed a prominent theoretical approach to object recognition that emphasized computational methods, the complexity of constructing 3-D representations, and the bottom-up nature of processing stages. Additionally, Marr (1982) emphasized the influence of viewpoint on object perception. Indeed, it seems that viewpoint is an important factor for the object identification process but not for the object categorization process (for review see Milivojevic, 2012). Again, the bottom-up approach has illuminated a number of important facets of the visual perceptual process, but it fundamentally excludes how top-down processes interact. To understand how we utilize prior knowledge and how it is integrated, descending neural pathways and the corresponding information must be included.

As mentioned above, some computational models have focused less on updating or integrating a cognitive theory of visual perception, instead favoring an outcome-related, engineering approach. Machine learning and computer algorithms have given rise to research devoted to creation of technological advances in the categorization and decoding of objects. One study found that stimulus representation cortical patterns can predict the contents of sleep imagery by correlating patterns of hallucinations during sleep with specific patterns of stimuli representations while awake (Horikawa, Tamaki, Miyawaki, & Kamitani, 2013). Another interesting study reconstructed faces by correlating trained faces with patterns of voxel activity (Cowen, Chun, & Kuhl, 2014). However, unlike objects, faces do not contain much variance in the general structure, which makes conservation of integral information after a principal component analysis much more straightforward. These studies highlight how computer-driven methodology can produce meaningful results, but without a theoretical foundation, how these results can inform ongoing debates on human vision is often obfuscated. How are these patterns of neural activity representing different features of objects or faces? How do viewpoint and an object’s position in space and time affect perception and recognition? These questions are fundamental to the understanding of complex object recognition, and necessary to the question of integration of expectation and prediction.

Other computational models have focused more specifically on the combination of theory and empirical evidence. Such models, like predictive coding models (Friston, 2010; Clark, 2013), utilize Bayesian theory and generative models to propose hierarchical perceptual processing that is integrated with descending connections from high-order cortical structures. Instead of purely processing in a feed-forward manner, the human brain is constantly maintaining a representation of the external environment that is informed by past experience, motivations and emotions, memory, and object values. Predictive coding proposes that optimization of perception and action relies on the minimization of prediction errors with recurrent loops (Friston, 2008; Friston, 2010). This idea harkens back to neurophysiological models of the dynamic, iterative brain (Cunningham & Zelazo, 2007). Here, the predictive coding model informs both computational theories by introducing ways in which information is represented (e.g., prediction error) and how those signals are integrated into ongoing processes.
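The idea of minimizing prediction error through recurrent loops can be made concrete with a deliberately simplified, single-level sketch. This is not the model of Friston or Clark; it is a toy example that assumes a one-dimensional estimate, Gaussian noise, and an identity mapping from the estimated cause to the expected input, and the function and parameter names are hypothetical. The estimate starts at the top-down prediction and is repeatedly nudged by two precision-weighted prediction errors, one from the senses and one from the prior.

```python
def predictive_coding_estimate(sensory_input, prior_mean,
                               sensory_var, prior_var,
                               step=0.05, n_iter=500):
    """Iteratively reduce precision-weighted prediction errors.

    Toy assumptions: one hidden cause, Gaussian noise, and an identity
    mapping from the cause to the expected sensory input.
    """
    mu = prior_mean  # start from the top-down prediction
    for _ in range(n_iter):
        # Bottom-up error: how far the input is from the current estimate.
        eps_sensory = (sensory_input - mu) / sensory_var
        # Top-down error: how far the estimate has drifted from the prior.
        eps_prior = (mu - prior_mean) / prior_var
        # Move the estimate in the direction that reduces both errors.
        mu += step * (eps_sensory - eps_prior)
    return mu


if __name__ == "__main__":
    # Hypothetical values: a weak prior versus a strong (low-variance) prior.
    print(predictive_coding_estimate(2.0, 0.0, sensory_var=1.0, prior_var=1.0))
    print(predictive_coding_estimate(2.0, 0.0, sensory_var=1.0, prior_var=0.1))
```

Under these assumptions the loop settles on a precision-weighted compromise between input and expectation, so a more confident prior pulls the final estimate further from what the sensory signal alone would suggest.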

Predictive coding reveals how the brain may be economically reducing the processing power required to manage the massive amounts of sensory information. This information is quickly assessed to allow appropriate responding to environment momenta. By understanding that the brain is relying on presuppositions about the organization and probabilities of specific pieces of information, we can begin to make sense of the disparate findings within the psychological study of vision. Whereas bottom-up, hierarchical processes explain how representation units are passed from one area to the next when encountering input, descending pathways hold information about expectations and predictions, and incorporate error (Clark, 2013). This approach parallels models of associative learning whereby bottom-up learning provides necessary cues for encoding, but retrieval is not perfect. Instead, retrieval is related to a number of environmental aspects of the encounter. Research suggests that frequency (Tulving, 1972), emotional value (Carstensen, Fung, & Charles, 2003; Teasdale & Russell, 1983), and many other features (for another example, see Storbeck & Clore, 2005) of both internal and external experience overlap and account for variance in accurate retrieval.

Still, evidence has not adequately delineated to what extent and at what level priors and expectations may influence perception and object recognition. Bayesian priors have often been utilized as a means of incorporating high-level information into processing. But the question of where the priors interact still remains. Moreover, the distinction between types of top-down influences and cognitive penetration may be an important feature to explore (Hohwy, 2017). What kind of information are they holding? Hohwy (2017) describes the ways in which the minimization of prediction error coupled with Bayesian priors can lead to instances of cognitive penetrability. Indeed, there is both a theoretical and a practical requirement to include error minimization processes in cases in which expectations of encountering particular stimuli are especially strong. Otherwise, it is difficult to reconcile how information is learned so that it may be integrated into an expectation.
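How a prior can carry high-level information into an interpretation of ambiguous input can be sketched with a plain application of Bayes' rule. The example below is illustrative only: the two hypotheses echo the snake-versus-sock example used earlier in the chapter, and the function name and all probability values are assumptions rather than empirical estimates.

```python
def posterior(prior, likelihood):
    """Combine a prior and a likelihood over the same hypotheses (Bayes' rule)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


if __name__ == "__main__":
    # Hypothetical sensory evidence that slightly favors "sock".
    likelihood = {"snake": 0.4, "sock": 0.6}

    # A flat prior leaves the evidence in charge of the outcome.
    print(posterior({"snake": 0.5, "sock": 0.5}, likelihood))

    # A strong fear-related expectation outweighs the same evidence.
    print(posterior({"snake": 0.8, "sock": 0.2}, likelihood))
```

In this sketch the sensory evidence is identical in both cases; only the prior differs, which is the sense in which strong expectations can dominate what is reported when input is ambiguous. The sketch is silent on where in the processing stream such a combination would occur, which is the open question raised above.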

Further, the iterative nature of processing allows for prediction and corroborating (or not) evidence to occur time and time again. Much like associative memory models, the very repetition across contexts can allow for the decoupling of the stimulus from the original environment while holding onto the co-occurrence of an object and the surrounding environment. The goal of perception is to be accurate on a global level, meaning prediction errors that occur in response to visual illusions should be considered functional for the minimization of error over time (Lupyan, 2015; Purves et al., 2011). Although it is indeed true that low-level sensory signals are necessary input for the process, the synthesizing of visual input is not passive. Instead, perception has been called a “constructive process of turning various forms of energy (mechanical, chemical, electromagnetic) into information useful for guiding behavior” (Lupyan, 2015). Moreover, recent research in the emotion field has suggested that language is a major contributor to the emotional cascade. This sentiment has been paralleled in other perceptual processes (Nook et al., 2017). Lupyan and Clark (2015) have proposed that language plays a vital role in visual perceptual processes as well. If language is one top-down constraint on perception, a number of mixed findings within cultural psychology can begin to merge. This represents an informative and interesting perspective for the many cognitive processes that require prediction and perception.

6.3.3 Conclusion

In sum, cognitive and computational approaches have incorporated findings from neurophysiology to support understanding the process by which visual object recognition occurs. Cognitive approaches have demonstrated specific units that are represented in the cascade, uncovering the increasingly complex series. Two-dimensional surfaces, patterns, colors, edges, and many other aspects of objects all inform the visual system and aid in the overall identification process. Computational approaches allow for modelling of unobserved phenomena, which has extended what was previously understood about the hierarchical nature of the brain. These models have demonstrated how prediction and prediction error can make sense of our complex perceptual system.

6.3.4 Limits to these approaches

However, these approaches are also subject to limitations. Namely, cognitive studies which reduce objects to basic units suffer much of the same problem that neurophysiological findings do: they are constrained by only studying aspects in isolation. For instance, Biederman (1991) cannot account for a number of naturally occurring visual “environments”. Point of view, field of vision, emotional or motivational state, attentional biases, and more all interact with the most fundamental features of visual object recognition. Computational cognitive approaches run into slightly different issues. Most models require training on a set of stimuli, which can lead to biases. Further, this research has given rise to the technologies that have crept into nearly every facet of our daily lives. Facial recognition is used to unlock smartphones, and while this may be an efficient means for accessing a handheld device, there are still a number of implications. Importantly, the machine learning training sets may be systematically biased, leading to a biased algorithm and subsequent codification systems that rely on this type of data synthesis.

6.4 Conclusion and Implications

Both the cognitive and the neuropsychological approaches to studying the visual system and object recognition have revealed unique solutions and more interesting questions. Yet, there is much still to be learned and integrated across these domains. Indeed, computational cognitive models have been exceedingly informative, particularly in regard to application in computer vision. What can be gleaned from integrating these approaches to object recognition?

The models and approaches mentioned were by no means exhaustive. Instead, the purpose of the present chapter was to investigate what perceptual and object recognition evidence exists within both neurophysiological and cognitive studies that provides a foundation for conceptualizing the human brain as a predictive machine. Neurophysiology has demonstrated where information is entering, how it proceeds, and demarcated what cells and areas are doing distinct work. However, a number of these models do not incorporate theory on dynamic, integrated processing, instead focusing on isolated units. Cognitive theories diverge on this point. They center around extensive information processing formulations. Though some still consider modular accounts of cognition, the orientation toward process is valuable and something to be incorporated in future research.

Informed by neurophysiological findings, we see advances in abnormal mind perception. From schizophrenia (Butler, Silverstein, & Dakin, 2008) to autism spectrum disorder (Dakin & Frith, 2005; Grice et al., 2001), the etiological underpinnings of nonconforming minds may be better understood by defining specializations of specific cortical areas in the brain. Studies that are concerned with representations and information processing have been particularly helpful in the technological domain, because computers process information in a comparable way. For example, major advances in computer vision have been spearheaded by cognitive computational models (Brown, 1985; Ullman et al., 2016). Recently, attempts at computer vision have shifted to convolutional networks to better match the complexity and accuracy achieved by the human brain (Simonyan & Zisserman, 2014). Again, there are clear applications of this work, and the theories that emerge through trials are potentially informative for a number of other phenomena. Namely, perception is one link to consciousness. Understanding how prediction and sensory information are integrated in our brain can provide a meaningful step toward solving problems of consciousness.

The biases that predictions and expectations produce exist elsewhere (such as visual attention) and have profound effects on downstream cognitive processes like judgments and behavior. For instance, evidence clearly shows that biases influence important social customs like eye-witness reporting (MacLeod, 2002; Storbeck & Clore, 2005). The consequences of these decisions are clear, and understanding how emotion regulates and biases affective feelings and associations may contribute to understanding how to recalibrate social systems where a great deal of the ultimate decision depends on individual accounts.

6.5 References

Adolphs, R., & Spezio, M. (2006). Role of the amygdala in processing visual social stimuli. Progress in brain research, 156, 363-378.

Alpers, G. W., & Gerdes, A. (2007). Here is looking at you: emotional faces predominate in binocular rivalry. Emotion, 7(3), 495.

Alpers, G. W., Ruhleder, M., Walz, N., Mühlberger, A., & Pauli, P. (2005). Binocular rivalry between emotional and neutral stimuli: A validation using fear conditioning and EEG. International Journal of Psychophysiology, 57(1), 25-32.

Amaral, J. L. Price, A. Pitkänen, S. T. Carmichael (1992). Anatomical organization of the primate amygdaloid complex. In J. P. Aggleton (Ed.), The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction (pp. 1-66). New York: Wiley-Liss.

Amodio, D. M., Zinner, L. R., & Harmon-Jones, E. (2007). Social psychological methods of emotion elicitation. Handbook of emotion elicitation and assessment, 91.

Attneave, F. (1954). Some informational aspects of visual perception. Psychological review, 61(3), 183.

Aviezer, H., Hassin, R. R., Ryan, J., Grady, C., Susskind, J., Anderson, A., … & Bentin, S. (2008). Angry, disgusted, or afraid? Studies on the malleability of emotion perception. Psychological science, 19(7), 724-732.

Awh, E. & Pashler, H. (2000). Evidence for split attentional foci. Journal of Experimental Psychology: Human Perception and Performance, 26, 834-846.

Balcetis, E. & Dunning, D. (2006). See what you want to see: Motivational influences on visual perception. Journal of Personality and Social Psychology, 91, 612-625.

Balcetis, E., Dunning, D., & Granot, Y. (2012). Subjective value determines initial dominance in binocular rivalry. Journal of Experimental Social Psychology, 48(1), 122-129.

Balcetis, E. (2016). Approach and avoidance as organizing structures for motivated distance perception. Emotion Review, 8(2), 115-128.

Balcetis, E., & Dunning, D. (2010). Wishful seeing: More desired objects are seen as closer. Psychological science, 21(1), 147-152.

Bar, M. (2000). A Cortical Mechanism for Triggering Top-Down, Journal of Cognitive Neuroscience. 15:4, pp. 600–609.

Bar, M. (2003). A cortical mechanism for triggering top-down facilitation in visual object recognition. Journal of Cognitive Neuroscience, 15, 600-609.

Bar, M., Kassam, K. S., Ghuman, A. S., Boshyan, J., Schmid, A. M., Dale, A. M., Hamalainen, M. S., Marinkovic, K., Schacter, D. L., Rosen, B. R., & Halgren, E. (2006). Top-down facilitation of visual recognition. Proceedings of the National Academy of Sciences, 103, 449-454.

Bar, M., & Biederman, I. (1998). Subliminal visual priming. Psychological Science, 9(6), 464-468.

Barrett, L. F., & Bar, M. (2009). See it with feeling: affective predictions during object perception. Philosophical Transactions of The Royal Society. 1325–1334.

Bradley, M. M., Sabatinelli, D., Lang, P. J., Fitzsimmons, J. R., King, W., & Desai, P. (2003). Activation of the visual cortex in motivated attention. Behavioral neuroscience, 117(2), 369.

Brown, S. W. (1985). Time perception and attention: The effects of prospective versus retrospective paradigms and task demands on perceived duration. Perception & Psychophysics, 38(2), 115-124.

Biederman, I. (1987). Recognition-by-components: a theory of human image understanding. Psychological review, 94(2), 115.

Biederman, I., & Cooper, E. E. (1991). Priming contour-deleted images: Evidence for intermediate representations in visual object recognition. Cognitive psychology, 23(3), 393-419.

Britton, J. C., Taylor, S. F., Sudheimer, K. D., & Liberzon, I. (2006). Facial expressions and complex IAPS pictures: Common and differential networks. NeuroImage, 31(2), 906–919. https://doi.org/10.1016/j.neuroimage.2005.12.050

Bullier, J. (2001). Integrated model of visual processing. Brain Research. Brain Research Reviews, 36(2/3), 96–107.

Burr, D. C., Morrone, M. C., & Ross, J. (1994). Selective suppression of the magnocellular visual pathway during saccadic eye movements. Nature, 371(6497), 511.

Butler, P. D., Silverstein, S. M., & Dakin, S. C. (2008). Visual perception and its impairment in schizophrenia. Biological psychiatry, 64(1), 40-47.

Campbell, J. I., & Thompson, V. A. (2012). MorePower 6.0 for ANOVA with relational confidence intervals and Bayesian analysis. Behavior research methods, 44(4), 1255-1265.

Carstensen, L. L., Fung, H. H., & Charles, S. T. (2003). Socioemotional selectivity theory and the regulation of emotion in the second half of life. Motivation and emotion, 27(2), 103-123.

Caruso, E. M., Mead, N. L., & Balcetis, E. (2009). Political partisanship influences perception of biracial candidates’ skin tone. Proceedings of the National Academy of Sciences, 106(48), 20168-20173.

Chen, C. M., Lakatos, P. S., Shah, A. S., Mehta, A. D., Givre, S. J., Javitt, D. C., &

Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and brain sciences, 36(3), 181-204.

Clore, G. L., & Huntsinger, J. R. (2007). How emotions inform judgment and regulate thought. Trends in Cognitive Sciences, 11(9), 393–399. https://doi.org/10.1016/j.tics.2007.08.00

Clore, G. L., Gasper, K., & Garvin, E. (2001). Affect as information. Handbook of affect and social cognition, 121-144.

Cole, S., Balcetis, E., & Dunning, D. (2013). Affective signals of threat increase perceived proximity. Psychological science, 24(1), 34-40.

Correll, J., Park, B., Judd, C. M., & Wittenbrink, B. (2002). The police officer’s dilemma: Using ethnicity to disambiguate potentially threatening individuals. Journal of personality and social psychology, 83(6), 1314.

Correll, J., Hudson, S. M., Guillermo, S., & Earls, H. A. (2017). Of kith and kin: Perceptual enrichment, expectancy, and reciprocity in face perception. Personality and Social Psychology Review, 21(4), 336-360.

Cowen, A. S., Chun, M. M., & Kuhl, B. A. (2014). Neural portraits of perception: reconstructing face images from evoked brain activity. Neuroimage, 94, 12–22.

Cunningham, W. A., & Zelazo, P. D. (2007). Attitudes and evaluations: A social cognitive neuroscience perspective. Trends in cognitive sciences, 11(3), 97-104.

Dakin, S., & Frith, U. (2005). Vagaries of visual perception in autism. Neuron, 48(3), 497-507.

Davis, M. (1997). Neurobiology of fear responses: the role of the amygdala. The Journal of neuropsychiatry and clinical neurosciences.

Descartes, R. (1984). The philosophical writings of Descartes: Volume 3, the correspondence (Vol. 3). Cambridge University Press.

DeSteno, D., Dasgupta, N., Bartlett, M. Y., & Cajdric, A. (2004). Prejudice from thin air: The effect of emotion on automatic intergroup attitudes. Psychological Science, 15(5), 319-324.

DiCarlo, J. J., Zoccolan, D., & Rust, N. C. (2012). How does the brain solve visual object recognition? Neuron, 73(3), 415-434.

Easterbrook, J. A. (1959). The effect of emotion on cue utilization and the organization of behavior. Psychological Review, 66, 183-201.

Esses, V. M., & Dovidio, J. F. (2002). The role of emotions in determining willingness to engage in intergroup contact. Personality and Social Psychology Bulletin, 28(9), 1202-1214.

Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1), 1-47.

Fenske, M. J., Aminoff, E., Gronau, N., & Bar, M. (2006). Top-down facilitation of visual object recognition: object-based and context-based contributions. Progress in brain research, 155, 3-21.

Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the evidence for “top-down” effects. Behavioral and brain sciences, 39.

Fitzpatrick, D., Itoh, K., & Diamond, I. T. (1983). The laminar organization of the lateral geniculate body and the striate cortex in the squirrel monkey (Saimiri sciureus). Journal of Neuroscience, 3, 673–702. PubMed PMID: 6187901.

Forgas, J. P. (2013). Don’t worry, be sad! On the cognitive, motivational, and interpersonal benefits of negative mood. Current Directions in Psychological Science, 22(3), 225-232.

Friston, K. (2008). Hierarchical models in the brain. PLoS computational biology, 4(11), e1000211.

Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127.

Friston and Clark

Gable, P. A. & Harmon-Jones, E. (2010). The blues broaden, but the nasty narrows: Attentional consequences of negative affects low and high in motivational intensity. Psychological Science, 21, 211-215.

Gantman, A. P., & Van Bavel, J. J. (2014). The moral pop-out effect: Enhanced perceptual awareness of morally relevant stimuli. Cognition, 132(1), 22-29.

Gibson, J. J. (1950). The perception of the visual world.

Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in neurosciences, 15(1), 20-25.

Goodale, M. A., & Westwood, D. A. (2004). An evolving view of duplex vision: separate but interacting cortical pathways for perception and action. Current opinion in neurobiology, 14(2), 203-211.

Grice, S. J., Spratling, M. W., Karmiloff-Smith, A., Halit, H., Csibra, G., de Haan, M., & Johnson, M. H. (2001). Disordered visual processing and oscillatory brain activity in autism and Williams syndrome. Neuroreport, 12(12), 2697-2700.

Haidt, J. (2003). The moral emotions. Handbook of affective sciences, 11(2003), 852-870.

Haidt, J. (2007). The new synthesis in moral psychology. Science, 316(5827), 998-1002.

Harmon-Jones, E., & Gable, P. A. (2008). Incorporating motivational intensity and direction into the study of emotions: Implications for brain mechanisms of emotion and cognition-emotion interactions. Netherlands Journal of Psychology, 64(4), 132-142.

Harmon-Jones, E., Gable, P. A., & Price, T. F. (2013). Does Negative Affect Always Narrow and Positive Affect Always Broaden the Mind? Considering the Influence of Motivational Intensity on Cognitive Scope. Current Directions in Psychological Science, 22(4), 301–307. https://doi.org/10.1177/0963721413481353

Harmon-Jones, E., & Gable, P. A. (2017). On the role of asymmetric frontal cortical activity in approach and withdrawal motivation: An updated review of the evidence. Psychophysiology, (May 2016). https://doi.org/10.1111/psyp.12879

Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2002). Human neural systems for face recognition and social communication. Biological Psychiatry, 51(1), 59–67.

Helfrich, R. F., Huang, M., Wilson, G., & Knight, R. T. (2017). Prefrontal cortex modulates posterior alpha oscillations during top-down guided visual perception. Proceedings of the National Academy of Sciences, 114(35), 9457-9462.

Helzer, E. G., & Pizarro, D. A. (2011). Dirty liberals! Reminders of physical cleanliness influence moral and political attitudes. Psychological science, 22(4), 517-522.

Hintzman, D. (1986). Schema abstraction in a multiple-trace memory model. Psychological Review, 93, 411–428.

Fukushima, K. (1984). A hierarchical neural network model for associative memory. Biological cybernetics, 50(2), 105-113.

Hummel, J. E., & Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychological review, 99(3), 480.

Horikawa, T., Tamaki, M., Miyawaki, Y., & Kamitani, Y. (2013). Neural decoding of visual imagery during sleep. Science, 340, 639–642.

Hohwy, J. (2017). Priors in perception: Top-down modulation, Bayesian perceptual learning rate, and prediction error minimization. Consciousness and Cognition, 47, 75-85.

Humphrey, G. K., Goodale, M. A., Jakobson, L. S., & Servos, P. (1994). The role of surface information in object recognition: Studies of a visual form agnosic and normal subjects. Perception, 23(12), 1457-1481.

Inbar, Y., Pizarro, D. A., Knobe, J., & Bloom, P. (2009). Disgust sensitivity predicts intuitive disapproval of gays. Emotion, 9(3), 435.

Johnson, M. H. (2001). Disordered visual processing and oscillatory brain activity in autism and Williams syndrome. Neuroreport, 12(12), 2697-2700.

Kahneman, D. (2011). Thinking, fast and slow. Macmillan.

Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23(1), 315-341.

Kringelbach, M. L., & Rolls, E. T. (2004). The functional neuroanatomy of the human orbitofrontal cortex: Evidence from neuroimaging and neuropsychology. Progress in Neurobiology, 72(5), 341-372.

Kringelbach, M. L. (2005). The human orbitofrontal cortex: Linking reward to hedonic experience. Nature Reviews Neuroscience, 6(9), 691-702.

Kveraga, K., Boshyan, J., & Bar, M. (2007). Magnocellular projections as the trigger of top-down facilitation in recognition. Journal of Neuroscience, 27, 13232-13240.

Lamme, V. A., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23, 571–579.

Lang, P., Bradley, M., & Cuthbert, B. (1999). International Affective Picture System (IAPS): Technical manual and affective ratings. Gainesville, FL: The Center for Research in Psychophysiology, University of Florida.

Lang, P., & Bradley, M. M. (2007). The International Affective Picture System (IAPS) in the study of emotion and attention. Handbook of Emotion Elicitation and Assessment, 29.

Lazarus, R. S. (1982). Thoughts on the relations between emotion and cognition. American Psychologist, 37(9), 1019.

LeDoux, J. (1998). Fear and the brain: Where have we been, and where are we going? Biological Psychiatry, 44(12), 1229-1238.

LeDoux, S. F. (2002). Defining natural sciences. Behaviorology Today, 5(1), 34-36.

Lerner, J. S., & Keltner, D. (2001). Fear, anger, and risk. Journal of Personality and Social Psychology, 81(1), 146.

Lerner, J. S., Small, D. A., & Loewenstein, G. (2004). Heart strings and purse strings: Carryover effects of emotions on economic decisions. Psychological Science, 15(5), 337-341.

Lupyan, G. (2015). Cognitive penetrability of perception in the age of prediction: Predictive systems are penetrable systems. Review of Philosophy and Psychology, 6(4), 547-569.

Lupyan, G., & Clark, A. (2015). Words and the world: Predictive coding and the language-perception-cognition interface. Current Directions in Psychological Science, 24(4), 279-284.

MacLeod, M. D. (2002). Retrieval-induced forgetting in eyewitness memory: Forgetting as a consequence of remembering. Applied Cognitive Psychology, 16, 135–149.

Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. W. H. Freeman.

Maunsell, J. H., Nealey, T. A., & DePriest, D. D. (1990). Magnocellular and parvocellular contributions to responses in the middle temporal visual area (MT) of the macaque monkey. Journal of Neuroscience, 10(10), 3323-3334.

McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: I. An account of basic findings. Psychological Review, 88(5), 375.

Merigan, W. H., & Maunsell, J. H. (1993). How parallel are the primate visual pathways? Annual Review of Neuroscience, 16(1), 369-402.

Milivojevic, B. (2012). Object Recognition Can Be Viewpoint Dependent or Invariant–It’s Just a Matter of Time and Task. Frontiers in computational neuroscience, 6, 27.

Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.

Milner, A., & Goodale, M. (2008). Two visual systems re-viewed. Neuropsychologia, 46, 774-785.

Navon, D. (1977). Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 9(3), 353-383.

Nook, E. C., Sasse, S. F., Lambert, H. K., McLaughlin, K. A., & Somerville, L. H. (2017). Increasing verbal knowledge mediates development of multidimensional emotion representations. Nature Human Behaviour, 1(12), 881.

O’Callaghan, C., Kveraga, K., Shine, J. M., Adams, R. B., & Bar, M. (2016). Convergent evidence for top-down effects from the “predictive brain”. Behavioral and Brain Sciences, 39.

Öhman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: Toward an evolved module of fear and fear learning. Psychological Review, 108, 483-522.

Öhman, A., Flykt, A., & Esteves, F. (2001). Emotion drives attention: Detecting the snake in the grass. Journal of Experimental Psychology: General, 130(3), 466–478. https://doi.org/10.1037/0096-3445.130.3.466

Öhman, A. (2005). The role of the amygdala in human fear: Automatic detection of threat. Psychoneuroendocrinology, 30(10), 953-958.

Perrett, D. I., & Oram, M. W. (1993). Neurophysiology of shape processing. Image and Vision Computing, 11(6), 317-333.

Pessoa, L., & Adolphs, R. (2010). Emotion processing and the amygdala: From a ‘low road’ to ‘many roads’ of evaluating biological significance. Nature Reviews Neuroscience, 11, 77 -782.

Pessoa, L., McKenna, M., Gutierrez, E., & Ungerleider, L. G. (2002). Neural processing of emotional faces requires attention. Proceedings of the National Academy of Sciences, 99(17), 11458-11463.

Phelps, E. A., Ling, S., & Carrasco, M. (2006). Emotion facilitates perception and potentiates the perceptual benefits of attention. Psychological Science, 17, 292-299.

Polyak, S. (1957). The vertebrate visual system. Chicago: University of Chicago Press.

Proffitt, D. R. (2006). Embodied perception and the economy of action. Perspectives on Psychological Science, 1(2), 110-122.

Purves, D., Wojtach, W. T., & Lotto, R. B. (2011). Understanding vision in wholly empirical terms. Proceedings of the National Academy of Sciences of the United States of America, 108(Suppl 3), 15588–15595.

Pylyshyn, 1999

Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11), 1019.

Rohenkohl, G., Gould, I. C., Pessoa, J., & Nobre, A. C. (2014). Combining spatial and temporal expectations to improve visual perception. Journal of Vision, 14(4), 8-8.

Rolls, E.T. (1999). The Brain and Emotion. Oxford, UK: Oxford University Press.

Sabatinelli, D., Fortune, E. E., Li, Q., Siddiqui, A., Krafft, C., Oliver, W. T., … & Jeffries, J. (2011). Emotional perception: meta-analyses of face and natural scene processing. Neuroimage, 54(3), 2524-2533.

Sander, D., Grafman, J., & Zalla, T. (2003). The human amygdala: An evolved system for relevance detection. Reviews in the Neurosciences, 14(4), 303-316.

Schmitz, T. W., De Rosa, E., & Anderson, A. K. (2009). Opposing influences of affective state valence on visual cortical encoding. Journal of Neuroscience, 29(22), 7199-7207.

Schmolesky, M. (1995). The Primary Visual Cortex. Webvision: The Organization of the Retina and Visual System, 1–38. https://doi.org/NBK11524

Schroeder, C. E. (2007). Functional anatomy and interaction of fast and slow visual pathways in macaque monkeys. Cerebral Cortex, 17, 1561-1569.

Shi, C. & Davis, M. (2001). Visual pathways involved in fear conditioning measured with fear potentiated startle: behavioral and anatomic studies. Journal of Neuroscience, 21, 9844-55.

Shipp, S., & Zeki, S. (1989). The organization of connections between areas V5 and V1 in macaque monkey visual cortex. European Journal of Neuroscience, 1, 309–332. PMID: 12106142.

Skelly, L. R., & Decety, J. (2012). Passive and motivated perception of emotional faces: Qualitative and quantitative changes in the face processing network. PLoS One, 7(6), e40371.

Siegel, M., Donner, T. H., Oostenveld, R., Fries, P., & Engel, A. K. (2008). Neuronal synchronization along the dorsal visual pathway reflects the focus of spatial attention. Neuron, 60, 709-719.

Simonyan & Zisserman, 2014

Somerville, L. H., Wagner, D. D., Wig, G. S., Moran, J. M., Whalen, P. J., & Kelley, W. M. (2013). Interactions between transient and sustained neural signals support the generation and regulation of anxious emotion. Cerebral Cortex, 23(1), 49–60. https://doi.org/10.1093/cercor/bhr373

Stefanucci, J. K., Proffitt, D. R., Clore, G. L., & Parekh, N. (2008). Skating down a steeper slope: Fear influences the perception of geographical slant. Perception, 37(2), 321-323.

Stern, C., Balcetis, E., Cole, S., West, T. V., & Caruso, E. M. (2016). Government instability shifts skin tone representations of and intentions to vote for political candidates. Journal of Personality and Social Psychology, 110(1), 76.

Storbeck, J., & Clore, G. L. (2005). With sadness comes accuracy, with happiness, false memory: Mood and the false memory effect. Psychological Science, 16, 785–791.

Storbeck, J., & Clore, G. L. (2007). On the interdependence of cognition and emotion. Cognition & Emotion, 21, 1212-1237.

Storbeck, J. (2012). Performance costs when emotion tunes inappropriate cognitive abilities: Implications for mental resources and behavior. Journal of Experimental Psychology: General, 141(3), 411–416.

Storbeck, J., Dayboch, J., & Wylie, J. (2016). Fear broadens attention: Fear and happiness motivate attentional flexibility impairing split attentional foci. Emotion. Submitted.

Tapia, E., & Breitmeyer, B. G. (2011). Visual consciousness revisited: Magnocellular and parvocellular contributions to conscious and nonconscious vision. Psychological Science, 22(7), 934-942.

Teasdale, J. D., & Russell, M. L. (1983). Differential effects of induced mood on the recall of positive, negative and neutral words. British Journal of Clinical Psychology, 22(3), 163-171.

Tiedens, L. Z., Unzueta, M. M., & Young, M. J. (2007). An unconscious desire for hierarchy? The motivated perception of dominance complementarity in task partners. Journal of personality and social psychology, 93(3), 402.

Tulving, E. (1972). Episodic and semantic memory. Organization of Memory, 1, 381-403.

Ullman, S., Assif, L., Fetaya, E., & Harari, D. (2016). Atoms of recognition in human and computer vision. Proceedings of the National Academy of Sciences, 113(10), 2744-2749.

Ungerleider, L. G., & Haxby, J. V. (1994). ‘What’ and ‘where’ in the human brain. Current Opinion in Neurobiology, 4(2), 157-165.

Van Essen, D. C., Anderson, C. H., & Felleman, D. J. (1992). Information processing in the primate visual system: An integrated systems perspective. Science, 255(5043), 419-423.

Weber, J. (1996). Influences upon managerial moral decision making: Nature of the harm and magnitude of consequences. Human Relations, 49(1), 1-22.

Whalen, P. J., Rauch, S. L., Etcoff, N. L., McInerney, S. C., Lee, M. B., & Jenike, M. A. (1998). Masked presentations of emotional facial expressions modulate amygdala activity without explicit knowledge. Journal of Neuroscience, 18(1), 411–418. https://doi.org/9412517

Wiens, S., Peira, N., Golkar, A., & Öhman, A. (2008). Recognizing masked threat: Fear betrays, but disgust you can trust. Emotion, 8(6), 810.

Wyatte, D., Curran, T., & O’Reilly, R. (2012). The limits of feedforward vision: Recurrent processing promotes robust object recognition when objects are degraded. Journal of Cognitive Neuroscience, 24(11), 2248-2261.

Zelazo, P. D., & Cunningham, W. A. (2007). Executive function: Mechanisms underlying emotion regulation.