What is an example of sensorimotor learning?
Neuron. Author manuscript; available in PMC 2017 Nov 23. PMCID: PMC5131723; NIHMSID: NIHMS823763

The relationship between the brain and the environment is flexible, forming the foundation for our ability to learn. Here we review the current state of our understanding of the modifications in the sensorimotor pathway related to sensorimotor learning. We divide the process into three hierarchical levels with distinct goals: 1) sensory perceptual learning, 2) sensorimotor associative learning, and 3) motor skill learning. Perceptual learning optimizes the representations of important sensory stimuli. Associative learning and the initial phase of motor skill learning are ensured by feedback-based mechanisms that permit trial-and-error learning. The later phase of motor skill learning may primarily involve feedback-independent mechanisms operating under the classic Hebbian rule. With these changes under distinct constraints and mechanisms, sensorimotor learning establishes dedicated circuitry for the reproduction of stereotyped neural activity patterns and behavior.

Many of our behaviors are modified through sensorimotor learning. Here we broadly define sensorimotor learning as an improvement in one’s ability to interact with the environment by interpreting the sensory world and responding to it with the motor system. Let’s take the example of braking a car while driving in traffic. To perfect this task, one needs to learn to accurately estimate the flow of traffic (perceptual learning; novices tend to focus on the car in front of them, while experts can selectively use a more diverse set of cues). When one identifies the slowing of the traffic, the visual information initiates a motor program to brake the car (associative learning). One also improves the skill of manipulating the brake smoothly (motor skill learning; try braking with your left foot in an empty parking lot and you’ll be surprised). As illustrated by this example, even a relatively simple behavior involves a multi-level learning process. Accordingly, this review discusses neural changes during sensorimotor learning at these three hierarchical levels. We note, however, that these levels are closely intertwined with each other and often occur simultaneously. Therefore some mechanisms are likely shared across these levels.

An unfortunate consequence of the broad scope of this review is that many studies or even systems that deserve attention had to be excluded. Despite this compromise, we hope that the broad scope helps us to underscore the distinct requirements of each step, which provide distinct constraints on the underlying neural mechanisms
(Figure 1).

Figure 1. Three hierarchical levels of sensorimotor learning and their unique tasks.

1. Sensory Perceptual Learning

Learning of sensorimotor behavior involves selective extraction and efficient processing of sensory information to generate an appropriate action. At the sensory processing stage, rich and multiplex information in the environment is transmitted to the sensory organs where attributes of sensory stimuli are transduced to electrical signals, such as action potentials. As the transduced signal reaches the central nervous system, cognitive factors actively determine what is sampled and what is ignored in the environment. In this vein, the perceptual stage of sensorimotor learning is a process of establishing optimal representations of external stimuli that are deemed to be meaningful, a process known as perceptual learning. This process involves changes in response properties of individual neurons and of neural populations. In this section, we review recent attempts to understand dynamic changes in sensory representations during perceptual learning, and discuss how these changes are implemented through alterations in operation modes of the underlying circuit.

Nature of physiological changes during perceptual learning

Despite decades of research, there is still a controversy as to where in the brain neurons change their response properties with perceptual enhancement during sensorimotor learning and whether and how such changes are causally linked to behavioral improvement. Experiments in visual psychophysics demonstrated that the improved perceptual ability is restricted to the trained stimulus feature (e.g. orientation) as well as the location in visual space. These results are often interpreted as evidence for the involvement of early stages of cortical visual processing, where neurons are highly selective to physical attributes of visual stimuli, have relatively small receptive fields, and the retinotopic organization is preserved.
However, recent experiments using a newly developed double-training paradigm challenged this notion by demonstrating that the feature discrimination (e.g. contrast) ability can be transferred to a new retinal location if subjects were primed at the second location with a task-irrelevant feature (e.g. orientation) (Xiao et al., 2008). This observation indicates that perceptual learning may also involve changes in non-retinotopic higher brain areas. Neurophysiological mechanisms underlying these observations in psychophysics have been under intense scrutiny. Theoretical studies have proposed that changes in the tuning curve of individual neurons, such as sharpening, gain modulations, or shift in the peak in early stages of sensory processing could increase the neuron’s ability to discriminate similar stimuli (Teich and Qian, 2003) (Figure 2A). These theories are supported by several experimental studies, where neurons in V1 and V4 increase their selectivity to task-relevant stimuli (Poort et al., 2015; Schoups et al., 2001; Yan et al., 2014; Yang and Maunsell, 2004). Similar effects, such as increase in the tuning sharpness or expansion in the cortical area of representation, were also observed in primary sensory areas during frequency discrimination learning involving somatosensation or audition in owl monkeys (Recanzone et al., 1992; Recanzone et al., 1993). Other theories, however, have postulated that the enhanced behavioral performance is due to improved perceptual judgment in later stages of sensory processing. These theories propose that perceptual learning involves appropriate routing and weighting of the most informative inputs from the sensory processing stage to the decision stage, while neural properties in early sensory areas are unaltered (Law and Gold, 2009; Petrov et al., 2005). 
Consistently, during motion discrimination training in monkeys, little change was observed in motion-evoked responses in the middle temporal area, a motion-sensitive sensory area, but responses to task-specific motion stimuli emerged and gradually increased in the lateral intraparietal area, a region known to be involved in decision-making (Law and Gold, 2008). Furthermore, a more recent experiment showed minimal changes in stimulus discriminability of neural ensembles in mouse vibrissal primary somatosensory cortex (vS1) during learning of a whisker-mediated object-localization task, also supporting such late-stage models (Peron et al., 2015).

Figure 2. Emerging principles and changes in the circuit operation during perceptual learning. (A) Changes in neural activity during perceptual learning. Left, changes in single neuron activity. Perceptual learning could involve changes in the tuning of individual neurons by increasing the sharpness or gain of the tuning curve, or shifting its peak. Right, changes in population activity. Perceptual learning could enhance discriminability of stimuli by decreasing the trial-by-trial response fluctuations (σ), increasing the distance between mean responses (d) or changing noise correlations. Individual dots indicate single trials. Note that the changes in fluctuations and distance can be achieved by independent changes of single neurons, while noise correlation changes would require coordination across neurons. (B) Perceptual learning could involve changes in the circuit operation. Learning-dependent suppression of distal dendritic inhibition (top) or perisomatic inhibition (bottom) could enhance the impact of top-down processing or the gain of principal neurons, respectively.

Single neuron responses must be considered in the context of the underlying population activity structures.
Recent simulations suggested that, at least within certain constraints, sharpening or amplification in the tuning of single neurons at early stages of sensory processing is neither necessary nor sufficient to improve population codes. For instance, sharpening of the tuning curve can be mediated by changes in intracortical connectivity, which can alter correlation statistics and lead to a large loss of information (Bejjanki et al., 2011; Series et al., 2004). To reconcile these issues, theoretical studies have proposed how different correlation structures could affect sensory coding. Enhanced discriminability during perceptual learning may, for example, depend on the relationship between two forms of correlation structures in the ensemble activity: similarity in tuning properties between a pair of neurons, known as signal correlation, and shared trial-by-trial response fluctuations to identical stimuli, known as noise correlation (Averbeck et al., 2006; Oram et al., 1998; Zohary et al., 1994). For similarly tuned neurons (i.e. positive signal correlation), reduction in noise correlations would increase the information about stimulus identity since the degree of overlap in firing rate distributions between two neurons decreases. Likewise, an increase in noise correlations in neurons with dissimilar tuning (i.e. negative signal correlation) would improve coding accuracy since common noise can be subtracted (Figure 2A) (Romo et al., 2003). Indeed, a recent study in songbirds found that, after auditory discrimination learning, larger signal correlations in cortical neurons coincided with smaller noise correlations for task-relevant auditory stimuli but not for task-irrelevant or novel stimuli (Jeanne et al., 2013). In contrast, two monkey studies demonstrated a reduction in noise correlations in neurons in the medial superior temporal area or V1 during perceptual learning (Gu et al., 2011; Yan et al., 2014).
The reduction, however, was observed across a range of signal correlations and did not seem to be related to the improvement in coding fidelity. These discrepancies clearly point out that a unified account of the correlational nature of population-level changes underlying perceptual learning is yet to be achieved. Importantly, discrimination learning does not always improve the discriminability by neural ensembles. By monitoring odor representations by mitral cells in the mouse olfactory bulb, it was found that mitral cells became better at discriminating the odorants when mice were trained to discriminate between very similar odorants. However, when mice discriminated between very dissimilar odorants, counterintuitively, the representations of the two odorants gradually became more similar. This bidirectional effect was interpreted such that learning achieves an optimal separation of representations of familiar stimuli, balancing the robustness of discrimination and capacity of coding (Chu et al., 2016). Metabolic efficiency might be another major design principle that sensory systems aim to achieve during sensorimotor learning. Sparse coding, where information is represented by a relatively small number of spikes and/or neurons, is observed in different sensory modalities across a wide range of species (Brecht and Sakmann, 2002; DeWeese et al., 2003; O’Connor et al., 2010; Olshausen and Field, 2004; Perez-Orive et al., 2002). The reduction in population responses may be a common feature of learning-driven changes in population coding (see also the motor skill learning section below), which could reduce overlaps between representations in space and time and facilitate decoding by downstream areas (Laurent, 2002). 
Indeed, chronic tracking of the same neural population over sensorimotor learning demonstrated a decrease in the number of responsive neurons and/or magnitudes of responses to the same sensory stimuli (Chu et al., 2016; Gdalyahu et al., 2012; Makino and Komiyama, 2015).

Generation of neural assemblies dedicated to learned behavior

Representations of behaviorally relevant sensory stimuli are gradually stabilized through learning. Recent advances in two-photon calcium imaging permit long-term monitoring of the same neural population, providing insights into how sensory representations evolve over time. For instance, responses of neurons in the mouse V1 become more reliable and selective over the course of visual discrimination training (Poort et al., 2015). Similarly, representations of mouse vS1 neurons become more stable following a whisker-mediated object-localization task (Peron et al., 2015). Such a learning-dependent stabilization of activity patterns is one of the emergent properties observed in many brain areas, including motor cortex (Huber et al., 2012; Peters et al., 2014). These processes are likely facilitated by synaptic plasticity whereby interconnected subnetworks are formed to generate learned activity patterns. For instance, neurons sharing similar receptive field properties are more likely to be connected (Cossell et al., 2015; Ko et al., 2011; Lee et al., 2016; Wertz et al., 2015) and these features emerge upon eye opening (Ko et al., 2013). Sensory experience further refines the circuit by pruning connections between visually non-responsive neurons (Ko et al., 2014), suggesting that repeated exposure to natural statistical features, together with intrinsic spontaneous activity, establishes a dedicated neural circuit for sensory processing. Stable representations with low trial-to-trial variability might help fine discrimination through a more robust readout of task-relevant information by downstream neurons.
For instance, perceptual grouping of different mixture ratios of tones or odors may be achieved via attractor-like, discrete representations of neural assemblies. In this scheme, representations within the same category share similar and highly reproducible neural trajectories in a high dimensional state space while representations across categories diverge their response dynamics (Bathellier et al., 2012; Niessing and Friedrich, 2010). Importantly, these distinct categorical representations can predict the performance of perceptual grouping (Bathellier et al., 2012).

Inhibitory circuits in perceptual learning

The changes in sensory representations during perceptual learning likely involve a variety of mechanisms, among which inhibitory circuits have garnered considerable attention in recent years. This was partially due to the development of genetic tools to identify and manipulate specific subtypes of inhibitory neurons. Inhibition is mediated by the neurotransmitter GABA, which shapes the activity of principal glutamatergic neurons in space and time. Inhibition contributes to gain modulations by altering the slope of the input-output function. It can also sharpen tuning curves of principal neurons by suppressing responses to non-preferred stimuli through an increase in spike threshold (‘iceberg effect’). These two parameters, changes in gain and sharpening of the tuning curve, are two of the aforementioned potential mechanisms to increase the individual neuron’s ability to discriminate similar stimuli (Figure 2A). Consistent with these notions, activation of parvalbumin (PV)-expressing inhibitory neurons in the mouse visual cortex sharpens orientation tuning and improves behavioral discrimination of similarly oriented visual stimuli (Lee et al., 2012).
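The ‘iceberg effect’ mentioned above can be illustrated with a toy calculation. The following is a minimal sketch, assuming a Gaussian subthreshold tuning curve and a threshold-linear output; the function names and every parameter value (amplitude, tuning width, thresholds) are hypothetical choices for illustration and are not taken from the cited studies. Raising the threshold leaves only the tip of the tuning curve above it, so the width of the output tuning shrinks.

```python
import math

PREFERRED_DEG = 90.0  # hypothetical preferred orientation


def drive(theta_deg, amplitude=20.0, sigma=25.0):
    """Assumed subthreshold drive: Gaussian tuning over orientation (degrees)."""
    return amplitude * math.exp(-((theta_deg - PREFERRED_DEG) ** 2) / (2 * sigma ** 2))


def firing_rate(theta_deg, threshold):
    """Threshold-linear output: only drive above the spike threshold yields spikes."""
    return max(0.0, drive(theta_deg) - threshold)


def half_width_at_half_max(threshold, step=0.1):
    """Step away from the preferred orientation until the output falls to half its peak."""
    peak = firing_rate(PREFERRED_DEG, threshold)
    offset = 0.0
    while firing_rate(PREFERRED_DEG + offset, threshold) > peak / 2:
        offset += step
    return offset


low_threshold_width = half_width_at_half_max(threshold=0.0)
high_threshold_width = half_width_at_half_max(threshold=10.0)
print(f"Half-width with low threshold:  {low_threshold_width:.1f} deg")
print(f"Half-width with high threshold: {high_threshold_width:.1f} deg")
```

In this toy model the higher threshold narrows the output tuning but also lowers the peak response, which is one reason sharpening and gain changes are treated as separate parameters.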
Furthermore, in the mouse olfactory bulb, local GABAergic neurons contribute to pattern separation of similar odors in mitral/tufted cells and enhance discrimination performance of the animal (Gschwend et al., 2015). Together with the theoretical support, it is possible that these inhibitory neurons play an active role in enhancing the principal neurons’ discriminability of stimuli during perceptual learning. Longitudinal recording from genetically defined inhibitory neural populations over learning will be a useful approach to test this idea.

GABAergic inhibitory neurons are highly heterogeneous in morphology, physiological properties and gene expression. By regulating distinct subcellular compartments of principal neurons, different subtypes of inhibitory interneurons may function to regulate the flow of information (Chen et al., 2013; Kepecs and Fishell, 2014; Lovett-Barron et al., 2014). For example, PV-expressing basket cells or chandelier cells modulate gain through their inhibitory action on perisomatic regions or axon initial segments (Atallah et al., 2012; Wilson et al., 2012). Somatostatin (SOM or SST)-expressing Martinotti cells inhibit distal dendrites of principal neurons and may regulate inputs carried by long-range projections (Gentet et al., 2012). It is likely that different types of learning involve distinct changes in inhibitory network activity and computations in order to gate or route various incoming signals. For instance, auditory associative fear learning in mice was associated with cholinergic activation of layer 1 inhibitory interneurons, which then suppress layer 2/3 PV inhibitory neurons. The resulting disinhibition of the feedforward drive could enhance cortical representations of sensory information by increasing the gain of principal neurons (Figure 2B) (Letzkus et al., 2011).
In contrast, the increased influence of non-sensory information in mouse V1, likely carried by long-range feedback inputs, coincided with the reduced activity of SOM interneurons (Figure 2B). Artificial reactivation of SOM interneurons partially reversed the learning-related change in principal neuron activity (Makino and Komiyama, 2015). These results are consistent with the notion that SOM inhibitory interneurons act as a gate for long-range inputs and that this gate can be flexibly adjusted by learning. Unraveling how distinct types of inhibitory neurons interact with each other to modulate the firing pattern of individual principal neurons and their population correlation structures during learning is an important future direction.

Bottom-up and top-down processing during perceptual learning

So far, we have discussed changes in sensory representations during perceptual learning within a local circuit. However, neurons receive convergent inputs from other brain areas and inter-areal interactions likely play important roles for perceptual learning. For instance, it is now evident that sensory processing involves intricate interactions of concurrent streams of information flow, one from the environment in a bottom-up manner and the other from higher-order brain areas in a top-down manner (Figure 2B). Even neurons in early stages of sensory processing may therefore be subject to influences of contexts and cognitive factors, which could profoundly modify their receptive field properties.

Traditionally, it has been considered that perceptual learning is mainly driven by bottom-up processes. For example, psychologists showed that passive tactile stimulation of human fingers improved two-point discrimination (Godde et al., 2000). Likewise, mere exposure to task-irrelevant stimuli that are below the subject’s detection threshold (i.e. without their awareness) improved task performance when subjects were tested subsequently (Watanabe et al., 2001).
These studies have often been used as evidence that bottom-up information processing is sufficient to induce sustainable changes in the brain to improve behavioral performance, under the assumption that top-down processing is disengaged during passive or subthreshold experience. Recent studies, however, provide an alternate view advocating that top-down processing, such as attention, expectation and motor commands, is an essential component of perceptual learning. In this view, it is argued that perceptual learning could dynamically switch the operation modes of downstream circuits according to ongoing behavioral requirements (Gilbert and Li, 2013). For instance, neurons in monkey V1 exhibit stronger top-down mediated contextual modulations after training with a three-line bisection task, where the subjects were asked to report which of the two reference lines was closer to the central line (Crist et al., 2001). In mouse V1, enhanced orientation discriminability by neural populations was diminished when mice were disengaged from the task, further supporting the importance of top-down control in learning (Poort et al., 2015). In addition, attention can rapidly control gain of single neurons (Reynolds et al., 2000) or change interneuronal correlations (Cohen and Maunsell, 2009) on a moment-by-moment basis. For example, it was recently shown that attention can increase or decrease noise correlations in monkey V4 depending on whether neurons provide evidence for the same or opposite stimulus choices in a contrast discrimination task (Ruff and Cohen, 2014), in a manner similar to how learning alters the relationship between signal and noise correlations (Jeanne et al., 2013). These acute top-down modulations are somewhat distinct from the traditional notion of perceptual learning, but they can underlie the improved perceptual discriminability during learning. 
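The opposite effects of noise correlations on similarly and oppositely tuned neuron pairs can be made concrete with a two-neuron toy model. The sketch below is an idealization, assuming Gaussian response noise with equal standard deviation in both neurons and an optimal linear readout; in that case discriminability is d'² = Δμᵀ Σ⁻¹ Δμ, where Δμ is the vector of mean response differences between the two stimuli and Σ is the noise covariance. The function name and the numerical values are hypothetical, and this is not the analysis used in the cited studies.

```python
import math


def d_prime(delta1, delta2, sigma, rho):
    """Optimal linear discriminability of two stimuli from two neurons.

    delta1, delta2: difference in each neuron's mean response between the stimuli
    sigma: noise standard deviation (assumed equal for both neurons)
    rho: noise correlation between the two neurons, strictly between -1 and 1
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Closed form of delta^T * inverse(covariance) * delta for a 2x2 covariance matrix.
    quadratic = delta1 ** 2 - 2 * rho * delta1 * delta2 + delta2 ** 2
    return math.sqrt(quadratic / (sigma ** 2 * (1 - rho ** 2)))


for rho in (0.1, 0.5):
    similar = d_prime(1.0, 1.0, sigma=1.0, rho=rho)    # positive signal correlation
    opposite = d_prime(1.0, -1.0, sigma=1.0, rho=rho)  # negative signal correlation
    print(f"rho={rho}: similar tuning d'={similar:.2f}, opposite tuning d'={opposite:.2f}")
```

Under these assumptions, lowering the noise correlation helps the similarly tuned pair, whereas raising it helps the oppositely tuned pair because the shared noise cancels when the two responses are subtracted.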
Furthermore, in the primate, neurons in V1 produce sparse responses when images spanning non-classical receptive fields are included (Vinje and Gallant, 2000). This well-known phenomenon of surround suppression may be explained by the predictive coding scheme, whose goal is to reduce redundancy by removing predictive components of the input by top-down modulation. In this scenario, higher brain areas with larger receptive fields can predict stimulus attributes on smaller receptive fields in lower brain areas because of statistical regularities in space inherent in natural scenes (Rao and Ballard, 1999). Learning of such regularities in the sensory environment may “explain away” bottom-up sensory representations by suppressing the activity in lower brain areas with the inhibitory machinery, which could lead to sparse coding and enhance metabolic efficiency. Understanding the circuit mechanisms by which top-down control selectively modifies single neuron properties or population structures during perceptual learning is an area of active investigation. Recent efforts to directly visualize and manipulate top-down processing provided compelling evidence that adaptive sensory representations require top-down processing. By expressing GCaMP in mouse vibrissal motor cortex and imaging the activity of their axons in vS1, feedback projections were shown to be functionally heterogeneous, including responses to touch or whisker movement (Petreanu et al., 2012). With similar approaches, responses of top-down inputs from mouse piriform cortex to olfactory bulb were shown to have various tuning properties, and these inputs contributed to decorrelation of mitral cell responses to odors (Boyd et al., 2015; Otazu et al., 2015). 
Chronic monitoring of top-down inputs during associative learning in mice showed enhancement of top-down influences from retrosplenial cortex to V1, possibly carrying information about the timing of the associated event (Figure 2B) (Makino and Komiyama, 2015). Interestingly, such a signal may also be dependent on the cholinergic input (Chubykin et al., 2013), implying an additional mechanism involving changes in neuromodulation. In line with these studies, learning of an object localization task in mice led to initial enhancement in dendritic spine growth in the barrel cortex at layer 1, where top-down inputs make synaptic connections (Kuhlman et al., 2014). The causal link of top-down processing for perceptual tasks has also been demonstrated by microstimulation or pharmacological inactivation of top-down sources (Moore and Armstrong, 2003; Xu et al., 2012) or optogenetic manipulations of top-down axons (Manita et al., 2015; Zhang et al., 2014). These results confirm the importance of top-down processing in sensorimotor learning.

Remaining questions in perceptual learning

It is important to synthesize these diverse physiological phenomena into a coherent conceptual framework. Receptive field properties of individual neurons are tightly related to the activity of other circuit components. For example, a better understanding of the roles of different subtypes of inhibitory neurons in learning-dependent changes would clarify how information is differentially routed through learning (Figure 2B). In addition, roles of inter-areal interactions involving bottom-up and top-down processing, including neuromodulatory systems, in regulating learning-related changes in inhibitory network activity or local correlation structures (Chen et al., 2015a; Fu et al., 2014; Nelson and Mooney, 2016; Zhang et al., 2014) need further investigation. Moreover, how the layered structure of the cortex integrates and segregates incoming information during learning is an important issue.
Such an approach to reverse engineer the brain circuit underlying learning requires identification and perturbation of the activity of individual circuit elements dedicated to the task. It is also important to note that the changes in sensory representations during sensorimotor learning, including correlations of neural activity, should ultimately be discussed in light of the downstream readout mechanisms, which are often unknown. Finally, although microcircuit dynamics during learning have been extensively studied in recent years, it is equally important to understand how the meso- and macro-scopic dynamics influence sensory representations during sensorimotor learning (Wekselblatt et al., 2016).

2. Sensorimotor Associative Learning

In addition to the enhanced stimulus detection and discrimination discussed in the previous section, sensorimotor learning requires linking particular aspects of environmental stimuli with specific actions. This section discusses neural mechanisms related to the associative component of learning by focusing on cases in which conspicuously distinct stimuli are paired with motor responses that the subjects already know how to perform proficiently. Although conditioned reflexes such as fear conditioning belong to such a category, we will discuss mostly associative learning producing non-reflexive movements, as neural circuitry and mechanisms underlying conditioned reflexes are extensively dealt with in other recent reviews (Gründemann and Lüthi, 2015; Herry and Johansen, 2014; Mahan and Ressler, 2012; Maren et al., 2013). We first review neural circuits and activity changes involved in sensorimotor associative learning, and then neural mechanisms underlying those changes.
Neural representation changes during sensorimotor associative learning: formation of dedicated pathways between sensory input and motor output

The locus of arbitrary associative learning in mammalian nervous systems was first inferred from human patients with brain lesions. For instance, damage to the human lateral frontal cortex resulted in a severe impairment in learning arbitrary sensorimotor associations without deficits in sensory discrimination or movements (Milner, 1982). To more precisely delineate the neural circuits involved in associative learning, subsequent studies employed controlled lesions in specific brain areas and/or axon bundles of non-human primates and measured the effect on learning arbitrary sensorimotor associations. In a typical experiment, a set of sensory stimuli (e.g., different shapes of visual stimuli, different colors, etc.) was paired arbitrarily with a set of motor responses (e.g., gripping a stick versus touching a button, saccade to the left versus right). Learning such stimulus-response relationships by trial and error was impaired by lesions in diverse areas, including the dorsal premotor cortex (PMd), prefrontal cortex (PFC), connections between inferior temporal cortex and PFC, connections from the basal ganglia to the frontal cortex via thalamus, hippocampal formation, and fornix (Canavan et al., 1989; Gaffan and Harrison, 1988, 1989; Murray and Wise, 1996; Petrides, 1982; Rupniak and Gaffan, 1987). In contrast, lesions in the posterior parietal cortex, a region that has been widely implicated in the perceptual decision-making process, did not compromise arbitrary associative learning, but instead impaired spatial control of movements, consistent with more recent acute perturbation results (Hwang et al., 2012; Rushworth et al., 1997). These findings motivated studies to examine neural activity changes in those identified brain areas during associative learning using the kind of tasks described above.
The commonly observed learning-related change across areas including PMd, dorsolateral PFC, orbitofrontal cortex (OFC), amygdala, and the striatum is that neurons become active selectively for a particular stimulus, response, or response outcome over the course of learning (Asaad et al., 1998; Mitz et al., 1991; Pasupathy and Miller, 2005; Schoenbaum et al., 1998). Notably, in dorsolateral PFC, many neurons become active only for a particular sensory and motor combination (Asaad et al., 1998). For example, when monkeys had already associated a stimulus and leftward saccades, and then learned to associate a new stimulus with the same leftward saccades, some neurons became active in trials in which leftward saccades were made in response to the new stimulus, but not in response to the first stimulus. Such neurons recruited for a specific stimulus-response combination seem to be involved in creating a dedicated pathway between the newly paired sensory input and motor output. In contrast, neurons in OFC and amygdala appear to encode the valence of the stimulus irrespective of the nature of the stimulus-motor response combinations, as neurons in these regions show the same activity for different stimuli or for different responses as long as the stimuli predict the same outcome (e.g., reward) (Schoenbaum et al., 1998; Wallis and Miller, 2003). Therefore, OFC and amygdala might contribute to associative learning by providing predicted outcome information for the computation of reward prediction error (i.e., discrepancy between the actual and predicted reward), while PFC might actually build an express pathway between the learned sensory input and motor output (Schoenbaum et al., 2009; Wallis and Miller, 2003). 
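The reward prediction error defined above, the discrepancy between the actual and the predicted reward, can be written as a simple update rule. The sketch below is a minimal illustration that assumes a delta-rule (Rescorla-Wagner style) update of a single predicted value; the function name, the learning rate, and the reward schedule are hypothetical, and nothing here is meant as a model of how OFC, amygdala, or PFC actually perform this computation.

```python
def update_prediction(predicted_reward, actual_reward, learning_rate=0.1):
    """Return the updated prediction and the reward prediction error for one trial."""
    prediction_error = actual_reward - predicted_reward  # actual minus predicted reward
    new_prediction = predicted_reward + learning_rate * prediction_error
    return new_prediction, prediction_error


# Hypothetical schedule: one stimulus-response pair rewarded with 1.0 on every trial.
prediction = 0.0
errors = []
for trial in range(50):
    prediction, error = update_prediction(prediction, actual_reward=1.0)
    errors.append(error)

print(f"Prediction error on first trial: {errors[0]:.3f}")
print(f"Prediction error on last trial:  {errors[-1]:.3f}")
print(f"Learned prediction:              {prediction:.3f}")
```

In this toy setting the error is largest on the first trial and shrinks as the prediction approaches the delivered reward, which is the sense in which a predicted outcome signal is needed to compute the error at all.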
Although neural changes related to associative sensorimotor learning might be similar across different brain areas (e.g., the emergence of selective activity for a specific stimulus-response combination, or selective activity for the predicted outcome), the temporal dynamics of neural changes could differ, hinting at a hierarchical order in learning-related changes, transfer of information between areas, and potentially different roles of those areas. For instance, in both dorsolateral PFC and striatum, as the animal’s association performance improved, neurons became more active for a specific stimulus-response pair between stimulus onset and response onset. Intriguingly, this association-selective activity developed earlier in the striatum than in the dorsolateral PFC during the training, suggesting that rewarded associations are first identified by the basal ganglia, and the basal ganglia output may train slower learning mechanisms in PFC (Pasupathy and Miller, 2005). Different temporal dynamics were also found in the responses related to predicted outcomes in OFC and amygdala (Morrison et al., 2011). Neurons that predict aversive outcomes evolved during learning earlier in amygdala than in OFC, whereas neurons that predict reward appeared earlier in OFC, suggesting complex inter-areal interactions underlying associative behaviors. In line with this view, lesions in one area reduced the expected outcome coding in the other (Rudebeck et al., 2013; Saddoris et al., 2005).

While the studies mentioned above focus primarily on brain areas outside the primary sensory and motor areas, activity changes related to sensorimotor associative learning have also been reported in the primary regions. Some of the neural changes in the primary areas may be attributable to concurrent perceptual enhancement or motor skill learning discussed in the other sections, but other changes seem to be related to the associative component of learning.
As mentioned above, visual cortical neurons become more sensitive to top-down signals, anticipating the arrival of the associated event during associative learning (Makino and Komiyama, 2015). Additionally, in the primary motor cortex of macaques, neurons became sensitive to visual features of the stimulus, such as color, after the animals learned to associate different colors with different reaching movements (Zach et al., 2008).

Neural mechanisms underlying associative learning

The previous section examined neural changes related to associative learning across various areas, mostly in primate brains, after their involvement was inferred from gross lesion studies. This section reviews more recent discoveries revealing the neural mechanisms leading to such changes by breaking down the learning process into three conceptual elements: exploration, reinforcement, and path optimization. Many of these new studies were conducted in non-primate animals in which advanced molecular tools for dissecting neural circuits, such as optogenetics and cell-type specific labelling, are available. Nonetheless, the majority of brain areas under discussion share functional homology between species, and our hope is that the principles we describe are general across species.

Exploration

When first facing a new sensorimotor task, we do not necessarily know the defined set of action goals relevant to the task, but instead discover them by exploring our motor/action repertoire. During this behavioral exploration, not only are different action goals tested, but various motor patterns to achieve the same goals are also probed. In this section, we focus on the exploration of action goals. Exploration of motor patterns is discussed further in the motor skill learning section. A number of brain areas appear to be involved in controlling exploration during sensorimotor association tasks.
In macaques, neurons in the globus pallidus internus, the output structure of the basal ganglia, showed lower pre-movement activity during exploratory behavior, and higher activity during an exploitive phase of associative learning (Sheth et al., 2011). The supplementary eye field has also been implicated in promoting animals to explore alternative responses (Donahue et al., 2013). Enhanced exploration was accompanied by axonal bouton loss in mouse OFC neurons that project to the dorsomedial PFC, raising the possibility that the interconnectivity between the two areas might adjust the extent of exploration (Johnson et al., 2016). In humans, blood oxygen level dependent (BOLD) signals in the rostral PFC and the intraparietal sulcus increase in explorative trials during reinforcement learning (Daw et al., 2006). Neuromodulators also seem to play a role in controlling exploration. Activating locus coeruleus noradrenergic input to anterior cingulate cortex (ACC), likely suppressing ACC activity, enhanced explorative behaviors of rats (Tervo et al., 2014). The increased BOLD signal in the rostral PFC during exploration might be controlled, in part, by dopamine, as individuals with a gene allele that inefficiently breaks down dopamine in PFC tend to explore more than those with different alleles during learning (Frank et al., 2009). Further supporting the role of dopamine for exploration, blocking dopamine receptors in the macaque PFC reduced the monkey’s tendency to switch motor responses during associative learning (Puig and Miller, 2012, 2015). This dopamine-dependent exploration might be related to dopamine-dependent synaptic plasticity in PFC (Seamans and Yang, 2004). More specifically, in mouse PFC slices, LTP is absent in layer 5 pyramidal neurons due to GABAergic inhibition, but dopamine enables LTP by acting on D2 receptors on inhibitory interneurons and reducing GABAergic transmission to pyramidal neurons (Xu and Yao, 2010). 
Also, dopamine extends the temporal window of coincidence detection for LTP between pre- and postsynaptic activation by acting on D1 receptors on pyramidal neurons (Xu and Yao, 2010). Thus, one possibility is that dopamine opens the window of plasticity in PFC, during which behavioral exploration is permitted.

Reinforcement

During exploration, the brain must fortify or weaken certain pathways to ultimately exploit the most effective pathway to achieve the desired behaviors. A widely hypothesized neural mechanism underlying this process is the updating of synaptic weights in association areas based on reward prediction error (Pessiglione et al., 2006; Sutton and Barto, 1998). This learning mechanism has gained popularity since the finding that the activity of dopamine neurons closely reflects reward prediction error: it is enhanced by unexpected reward or any indicator of potential reward (such as conditioned stimuli) and suppressed when an expected reward is omitted (Eshel et al., 2015; Schultz et al., 1993; Waelti et al., 2001). In this hypothesis, the brain continuously computes the discrepancy between the expected reward and the actual outcome following each executed behavior, and strengthens the weights of active synapses after positive prediction error, while weakening them after negative error (Pessiglione et al., 2006; Sutton and Barto, 1998). The modified synaptic weights reflect the newly evaluated likelihood that the behavior will generate beneficial outcomes, allowing the brain to adaptively route sensory information and elicit optimal motor actions. The most plausible locus of such plasticity is the striatum in the basal ganglia, which is heavily innervated by dopaminergic neurons, receives convergent sensory information through cortico-striatal projections, and sends its output to influence cortical and subcortical motor control regions.
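The prediction-error hypothesis described above can be illustrated with a minimal delta-rule sketch. This is a toy model under stated assumptions, not a description of any specific circuit: the function name `rpe_update`, the learning rate, and the two-pathway setup are all hypothetical choices made for illustration.

```python
def rpe_update(weights, active, reward, learning_rate=0.1):
    """One trial of a delta-rule update driven by reward prediction error.

    All names and parameter values here are illustrative assumptions.
    """
    # Predicted reward: summed weight of the synapses active on this trial.
    predicted = sum(weights[i] for i in active)
    # Prediction error: actual outcome minus predicted reward.
    error = reward - predicted
    # Only active synapses change: strengthened after positive error,
    # weakened after negative error.
    updated = list(weights)
    for i in active:
        updated[i] += learning_rate * error
    return updated, error


weights = [0.0, 0.0]      # two hypothetical stimulus-response pathways
errors = []
for _ in range(50):       # pathway 0 is executed and rewarded on every trial
    weights, err = rpe_update(weights, active=[0], reward=1.0)
    errors.append(err)

# Omitting the now-expected reward yields a negative prediction error.
_, omission_error = rpe_update(weights, active=[0], reward=0.0)
```

In this sketch the prediction error is large on the first rewarded trial and shrinks as the reward becomes expected, while the unused pathway is left unchanged. Omitting an expected reward produces a negative error, loosely mirroring the suppression of dopamine activity noted in the text.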
Supporting the associative role of the striatum, after rats learned to associate two different types of auditory stimuli with two different actions, optogenetic stimulation of the cortico-striatal projection neurons that represent one type of stimuli caused the rats to more frequently generate the action paired with that stimulus type (Znamenskiy and Zador, 2013). Importantly, this associative learning was accompanied by a selective potentiation of cortico-striatal synapses in a manner that conforms to the specifically learned associative rules, demonstrating that cortico-striatal synapses are indeed a site of plasticity during associative learning (Xiong et al., 2015). The synaptic reshaping in the striatum is likely guided by dopaminergic neurons encoding reward prediction error, as indicated by multiple lines of evidence. First, long-term potentiation (LTP) and long-term depression (LTD) of cortico-striatal synapses depend on the phasic burst of dopamine (Shen et al., 2008; Yagishita et al., 2014). Second, perturbing the balance of dopamine or dopamine receptors impairs associative learning, probably due to aberrant plasticity (Bach et al., 2008; Eyny and Horvitz, 2003; Smith-Roe and Kelley, 2000). Furthermore, delivering microstimulation in the striatum or optogenetically activating dopamine neurons during the reinforcement period of correct trials, supposedly mimicking positive prediction error, significantly increased the rate of associative learning or prevented blocking/extinction of association (Steinberg et al., 2013; Williams and Eskandar, 2006). As examined so far, there is compelling evidence that dopamine-dependent plasticity in cortico-striatal synapses plays a critical role in sensorimotor associative learning. However, relatively little is known about how this plasticity in the inputs to the basal ganglia relates to the selection and execution of particular motor programs (Hélie et al., 2015; Hikosaka et al., 2006). 
One possibility is that the basal ganglia output through the thalamus generates appropriate motor actions by activating specific PFC neurons, which would then activate specific motor cortical circuits. Subsequently, the coincident activity between inputs from the basal ganglia and from sensory areas may strengthen the specific synapses onto PFC neurons from sensory areas via Hebbian plasticity (Hélie et al., 2015). Such plasticity could generate a shortcut pathway from sensory to prefrontal to motor cortices, bypassing the basal ganglia circuit (Figure 3A). Thus, well-practiced associations may be driven more efficiently through this shortcut pathway at later learning stages. However, this simple model has several unresolved issues. First, this model assumes that the striatum contains neural activity patterns that can specifically drive a variety of precise motor patterns, a notion that has yet to be demonstrated. Second, as reviewed above, striatal neurons start discriminating different stimuli earlier than PFC neurons during associative learning, and the time course of behavioral improvement matches that of PFC neural changes (Pasupathy and Miller, 2005). The different time courses between PFC and the striatum are difficult to explain with a model in which the basal ganglia drive motor responses through PFC. An alternative hypothesis is that dopamine-dependent plasticity renders striatal neurons that receive input from both sensory and motor areas selectively active for specific stimulus-response pairs (Figure 3B). Such associative striatal activity, while it may not drive motor patterns, can serve as a teaching signal to strengthen the specific sensory input synapses to PFC neurons driving that specific motor pattern (Figure 3B).
Two important, unproven assumptions for this model are that 1) individual striatal neurons receive convergent inputs of specific sensory and motor information, and 2) the striatal neurons that receive projections from a specific motor circuit return their output preferentially to PFC neurons driving that specific motor circuit. We also note that, given the highly divergent projections of basal ganglia outputs to many cortical and subcortical regions, these models are almost certainly oversimplified. Dissecting the projection circuit from the basal ganglia to downstream areas using optogenetic tools could be an important step towards a better understanding of the output function of the basal ganglia.

Figure 3. Circuit models of sensorimotor associative learning
(A) In this model, sensorimotor association is initially executed by dopamine-dependent plasticity to strengthen the corticostriatal synapses in the basal ganglia carrying specific sensory inputs (‘1’). The downstream pathway drives specific motor responses via PFC (blue). The basal ganglia output to PFC also strengthens sensory input synapses in PFC (‘2’), which subsequently forms a pathway from sensory to prefrontal to motor cortices, bypassing the basal ganglia (green). Further training creates direct cortico-cortical pathways between sensory and motor cortices, via coincidental activation-dependent plasticity (red, ‘3’).
(B) Alternative hypothesis: The basal ganglia output to PFC provides a teaching signal, without driving specific motor responses. During the exploration phase of learning, striatal neurons that receive convergent inputs carrying specific sensory and motor information undergo plasticity based on the dopamine prediction error signal (left). This association-specific activity in the basal ganglia provides a teaching signal for PFC neurons that drive the specific motor program to strengthen the synapses carrying the specific sensory information (right). Grey boxes denote the sites of plasticity.
In this model, learning is behaviorally evident only after the plasticity in PFC.

Path optimization

One effect of associative learning is the decreased reaction time of the associated motor response, indicating an increased efficiency in information processing. The increased efficiency might be achieved by shortening signal transduction pathways between sensory and motor ends. The shortcut circuit that bypasses the basal ganglia discussed above would serve this purpose (Figure 3A). Furthermore, strengthening direct synaptic connections between the sensory and motor cortices would further expedite signal transduction (Figure 3A). As reinforcement learning progresses, coincidental activations of the sensory and motor neural populations representing the learned stimulus and response, respectively, occur more frequently. Such coincidental activation would permit Hebbian plasticity at the cortico-cortical synapses that correspond to the associations. A potential cellular basis for such plasticity has been studied in the barrel cortex, where the coincident arrival of long-range input from the motor cortex in the apical dendritic tuft and ascending sensory input onto layer 5 pyramidal neurons evoked long-lasting plateau potentials in the tuft (Xu et al., 2012). Such plateau potentials have been shown to induce LTP in the apical tuft in hippocampal slices (Takahashi and Magee, 2009). Thus, in the barrel cortex, the coincident arrival of motor and sensory signals might drive LTP in the apical dendritic synapses via plateau potentials, strengthening the direct connection between the two regions. Likewise, in the motor cortex, where the apical tuft receives long-range input from the sensory cortex, coincident sensory and motor signals may drive LTP in the apical dendrites, strengthening the connectivity between task-relevant sensory and motor signals during associative learning.
Supporting this idea, a loss of NMDA receptor function that impaired primary motor cortex LTP slowed down associative learning in mice (Hasan et al., 2013). Therefore, this non-linear cellular mechanism of integrating concurrent sensory and motor inputs (i.e., the formation of plateau potentials) could generate direct, fast signal transduction pathways between repeatedly associated stimuli and motor responses. Overtraining can further increase efficiency, producing reflexive, habitual responses that are insensitive to action outcome contingency (Smith and Graybiel, 2013). As behavior shifts from goal-directed action to habit, dominant control over behavior also moves from the dorsomedial striatum (DMS) to the dorsolateral striatum (DLS) (Yin and Knowlton, 2006). Recent experiments suggest that the shift from DMS to DLS requires activity attenuation of cortico-striatal neurons in OFC and post-synaptic depression in D2 neurons in DLS (Gremel et al., 2016; Shan et al., 2015). A transition from DMS to DLS over time is also observed during motor skill learning, suggesting that DLS ultimately permits automatic, stereotypical behaviors.

3. Motor Skill Learning

Even after attaining the perceptual improvement and flexible stimulus-response associations described in the previous sections, successful sensorimotor learning still ultimately depends on the generation of a skilled motor behavior that consistently yields favorable outcomes. This process is known as motor skill learning, and is canonically defined as the repetition-mediated increase in the speed and accuracy of a newly acquired motor behavior (Diedrichsen and Kornysheva, 2015; Shmuelof and Krakauer, 2014). Such learning follows a well characterized temporal pattern, beginning with a rapid initial improvement (a “fast learning” phase), followed by more moderate refinements over a longer time course (a “slow learning” phase) (Karni et al., 1998).
The early stages involve exploration of a range of behaviors and concomitant outcome-based selection, whereafter repetition-based refinements of the task dominate, driving the formation of a highly stereotyped movement with little trial-to-trial variability. The task of the motor-associated brain regions, therefore, is to create a dedicated pathway for the effortless and stereotyped execution of a learned skill by first exploring possible distributions of behaviors that yield positive outcomes, then defining and refining a final distribution. The goal of the following section is to highlight the current understanding of the governing set of principles that likely guide learning-mediated changes in the brain during the acquisition of a motor skill. Specifically, we propose that the combination of behavioral exploration, outcome-mediated feedback, and Hebbian mechanisms of plasticity is sufficient to generate a stable circuit that can accurately and reliably produce a novel motor behavior.

Exploration of neural representations

During the initial stage of behavioral exploration, the brain must likewise sample a variety of circuits that could potentially elicit effective movements. As such, the early stages of motor learning should be characterized by a large number of movement-related circuits, which can then be refined as the motor behavior is honed. Consistent with this idea, a large body of literature suggests that the neural representations of a newly acquired motor skill can expand during the initial phase of learning.

Expansion of motor representations in the cortex

The motor cortex contains a ‘somatotopic map’, inasmuch as stimulation of different cortical areas evokes movements of different body parts. Far from being a static representation of motor primitives, the somatotopic map of the motor cortex is a highly plastic feature.
A dramatic example of this comes from peripheral nerve lesions in rats, which shrink the cortical areas corresponding to injured inputs, allowing uninjured regions to encroach onto this newly available cortical territory (Donoghue and Sanes, 1987). Similarly, studies in a host of model organisms ranging from rodents to humans have repeatedly shown that motor learning causes an expansion of the somatotopic motor map for the associated muscle groups, suggesting that these muscles now have an elaborated representation in the cortex (Karni et al., 1995; Kleim et al., 1998a; Kleim et al., 2004; Nudo et al., 1996; Pearce et al., 1999). In one example, training squirrel monkeys in an object-retrieval task – which required fine coordination of the involved digits – caused an expansion of the cortical region over which micro-stimulation could induce movements in those same digits (Nudo et al., 1996). Similarly, in humans, repetition of fine finger movements caused an expansion of the cortical region over which transcranial magnetic stimulation could induce finger movements (Pascual-Leone et al., 1994; Pascual-Leone et al., 1995). Critically, this process seems to be unique to the early stages of learning (Classen et al., 1998; Pascual-Leone et al., 1994), when the animal was in a largely exploratory phase of learning, and only just started to show signs of producing more stereotyped behavior. The expansion of the cortical map itself is difficult to interpret in terms of the underlying neural representations, as cortical microcircuits consist of individual neurons that are highly heterogeneous. Therefore it is noteworthy that the map expansion has been observed to occur in concert with – and is perhaps explained by – an increase in the size of neural ensembles associated with the learned skill (Costa et al., 2004; Peters et al., 2014). 
As an example, the population of cells in layer 2/3 of the motor cortex of mice whose firing correlated with a particular motor task was shown to expand as the animals repeatedly performed a lever-press task (Peters et al., 2014). The total number of active cells for each movement bout, however, remained constant, meaning that the activated population of cells was more variable from movement to movement during this phase. The initial expansion of the ensemble size, therefore, provides a larger pool from which to make a selection, increasing the likelihood that the selected circuit corresponds to a global rather than a local optimum.

Potential mechanisms of population expansion

The changes in brain ensemble activity are likely subserved by changes at the synaptic level. In support of this notion, the learning of forelimb reaching tasks in rats has been shown to result in enhanced synaptic responses in M1 excitatory neurons after learning (Hodgson et al., 2005; Rioult-Pedotti et al., 1998). Such training also briefly occluded the induction of long-term potentiation (LTP), suggesting that LTP-like mechanisms are invoked during motor skill learning (Hodgson et al., 2005; Rioult-Pedotti et al., 2007; Rioult-Pedotti et al., 2000). More recently, it was shown that thalamocortical inputs in the rat motor cortex are potentiated specifically for those cells that correspond to the trained motor group (in this case, the distal forelimb used for a reaching task), indicating that thalamo-recipient synapses in the motor cortex undergo LTP in a use-dependent fashion (Biane et al., 2016). Other indications of LTP have also been observed during motor learning in rodents, such as an increase in dendritic spine size (Fu et al., 2012). Motor learning has also been shown to cause an increase in the density of incoming axonal projections (Sampaio-Baptista et al., 2013) as well as an elaboration of the dendritic arbor of M1 neurons (Gloor et al., 2015; Greenough et al., 1985).
Furthermore, dendritic spines on M1 pyramidal cells increase in number during the early stages of learning (Fu et al., 2012; Peters et al., 2014; Xu et al., 2009), indicating the formation of new putative synaptic sites. The individual newly formed spines are long-lasting and thus might represent enduring physical traces of motor learning (Xu et al., 2009). The increase in spine number overlaps temporally with the expansion of the size of the neuronal ensemble, suggesting that these two processes are potentially related (Peters et al., 2014). The overall spine density subsequently returns to pre-learning levels, notably also in parallel with the late-stage reduction in ensemble size (Chen et al., 2015b; Xu et al., 2009). Interestingly, spines that form during learning have been shown to spatially cluster on a subset of dendritic branches (Lai et al., 2012; Yang et al., 2014) as well as within branches (Fu et al., 2012). Such an arrangement of dendritic spines might afford nonlinear behavior of dendrites, increasing the efficacy of new spines in driving the neuron to spike (Govindarajan et al., 2006). The mechanisms by which the addition of dendritic spines is controlled during learning are likely numerous, allowing the recruitment of a variety of context-specific signals to influence the process. Local inhibitory circuits may play a role to gate synaptic changes onto motor cortical neurons during motor learning (Chen et al., 2015b; Donato et al., 2013). One study used longitudinal imaging to show that motor learning induces a reduction in the number of inhibitory synapses onto apical dendritic tufts of excitatory neurons, the dendritic compartment where the addition of dendritic spines is the most pronounced. Furthermore, specific stimulation of a subset of inhibitory neurons that selectively inhibit apical dendritic tufts impaired the stabilization of new spines and motor learning (Chen et al., 2015b). 
Thus, local inhibitory microcircuits can tune excitatory neurons to be more or less plastic and determine their incorporation into a learning-related ensemble. In summary, the early stages of motor learning are marked by an expansion of the neural ensemble in the motor cortex available for use by the learned motor skill, thus allowing the sampling of a number of new circuit options (Figure 4A). This expansion is potentially explained by plasticity of cells in the motor cortex that renders these cells more synaptically connected and more sensitive to synaptic input. Importantly, the motor cortex is a layered structure in which the superficial layer sends feedforward excitation to deep layer output neurons (Weiler et al., 2008), and neurons in different layers likely exhibit distinct dynamics during learning (Masamizu et al., 2014). It should be noted that other brain regions also contribute to the early phase of learning. Indeed, signals from other regions likely act to drive or facilitate the cortical plasticity described above. For instance, as described in the previous section, the basal ganglia are thought to provide important signals for increasing the variability of motor behaviors via output from the globus pallidus internus. Determining how such signals interact with the cortical networks during the processes described above requires further study. The advent of new imaging approaches that allow for the simultaneous imaging of multiple brain regions, combined with projection-specific labeling and perturbation, will help to facilitate this advancement.

Figure 4. Hierarchical mechanisms of circuit modification shape the formation of novel motor skills
(A) During the early phases of learning, the system explores a variety of behavioral options, which coincides with an expansion of the neuronal ensemble size in the motor cortex.
(B) Favorable outcomes reinforce a corresponding population of cells, shifting the mean behavior in the process.
(C) The repetition of the selected behavior drives Hebbian plasticity in the associated population of cells, eventually resulting in a refined ensemble and highly stereotyped behavior.

Selection of a mean: creating effective motor circuits

The exploration of movements and circuits requires that the brain select a circuit that can reliably produce the target movement. How is such a selection made? A natural expectation is that successful/unsuccessful pathways are reinforced/punished through feedback-based mechanisms. In fact, there is significant evidence implicating the basal ganglia and cerebellum in performing exactly these tasks for the selection of an appropriate motor behavior. In particular, the basal ganglia are specialized in reinforcement learning, as discussed in the previous section, while the cerebellum is thought to facilitate learning based on error signals. Thus, it is perhaps the joint efforts of these brain areas that allow for the selection of an appropriate target behavior and corresponding circuit. If behavioral exploration by ensemble expansion broadens the distribution of behavioral options, these feedback-based mechanisms may dictate a new mean about which the final distribution will center (Figure 4B).

Basal Ganglia

As briefly mentioned in the previous section, the recruitment of the basal ganglia seems to occur in two parallel, anatomically distinct streams. DMS, or the ‘associative’ striatum, which receives inputs from association cortices (e.g. the prefrontal cortex) (McGeorge and Faull, 1989; Voorn et al., 2004), is involved primarily in the early stages of motor learning, probably reflecting associative learning to establish action goals (Yin et al., 2009). Correspondingly, there is an increase in the glutamatergic sensitivity of medium spiny neurons (MSNs, the primary output neurons of the striatum) in DMS during this period (Yin et al., 2009), suggesting the occurrence of learning-induced potentiation.
In contrast, DLS, or ‘sensorimotor’ striatum, which receives sensory and motor inputs from a variety of cortical regions, is primarily engaged during the later stages of learning, when task performance starts to plateau (Yin et al., 2009). Likewise, the glutamatergic sensitivity of MSNs in this region was found to increase only in the late stages of learning (Yin et al., 2009). The changes in synaptic strength are likely due to potentiation of currents through AMPAR-type glutamate receptors at the synaptic surface of MSNs (Yin et al., 2009). Thus, LTP-like plastic changes of MSNs in these regions are likely critical for motor learning. Consistent with this notion, genetic removal of functional NMDARs from MSNs impairs motor learning (Beutler et al., 2011). The differential temporal recruitment of DMS and DLS suggests an evolving importance of different information streams (i.e. associative vs. sensorimotor) for reward-mediated shaping of behavior, consistent with the hierarchical reinforcement learning model (Haruno and Kawato, 2006).

Cerebellum

Complementing the role of the basal ganglia, the cerebellum is also critical in the learning of new motor skills (Sanes et al., 1990). Cerebellar learning is thought to be driven by error signals that indicate differences between the intended movement and the one that was actually executed. The best studied cellular modifications observed in the cerebellum during learning involve long-term depression (LTD) of the parallel fiber-to-Purkinje cell synapse (De Zeeuw and Yeo, 2005). This LTD is triggered by movement errors signaled by climbing fiber input (Ito and Kano, 1982). More prolonged bursts of activity from climbing fibers, scaling with movement error, induce more complex spiking in Purkinje cells. Thus, the size of the error proportionally increases intracellular calcium levels and therefore the expression of LTD (Yang and Lisberger, 2014).
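The error-scaled depression described above can be sketched as a toy update rule. This is a deliberately simplified illustration, assuming that depression is multiplicative, proportional to the absolute error, and restricted to parallel fibers that were active during the movement. The function name `climbing_fiber_ltd`, the `ltd_rate` parameter, and the numeric values are hypothetical.

```python
def climbing_fiber_ltd(pf_weights, pf_active, movement_error, ltd_rate=0.2):
    """Depress active parallel-fiber synapses in proportion to error size.

    A toy rule: names, rate, and the multiplicative form are assumptions.
    """
    # Larger errors stand in for longer climbing-fiber bursts; the
    # fractional depression is capped at complete loss of strength.
    depression = min(1.0, ltd_rate * abs(movement_error))
    return [w * (1.0 - depression) if active else w
            for w, active in zip(pf_weights, pf_active)]


pf_weights = [1.0, 1.0, 1.0]
pf_active = [True, True, False]   # only the first two fibers were active
after_small_error = climbing_fiber_ltd(pf_weights, pf_active, movement_error=0.5)
after_large_error = climbing_fiber_ltd(pf_weights, pf_active, movement_error=2.0)
```

In this sketch a larger movement error produces stronger depression of the active synapses, while the inactive synapse is untouched. The real dependence on calcium dynamics is certainly more complex than a linear scaling.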
Cerebellar LTD has been repeatedly observed to occur in response to learning, and forms the basis of many standard models of cerebellar learning. However, there is also evidence that potentiation of the Purkinje cell response (e.g. enhanced simple and complex spike discharge rate) occurs during learning (Berthier and Moore, 1986; Ojakangas and Ebner, 1994), and that there are accompanying structural modifications, including the addition of dendritic spines. The acquisition of complex motor skills, for instance, has been shown to increase the number of parallel fiber-to-Purkinje cell synapses in the cerebellum (Kleim et al., 1998b). Furthermore, a recent study showed that optogenetic activation of Purkinje cells was sufficient to drive learned changes in the vestibulo-ocular reflex (Nguyen-Vu et al., 2013), suggesting that Purkinje cell output, in addition to changes in Purkinje cell inputs, can drive behavioral modifications.

Motor cortex

Basal ganglia and cerebellar circuits provide major inputs to the motor cortex through the thalamus, so the feedback-based learning in these circuits likely assists in selecting effective circuits in the motor cortex from the broadened population initially explored. Less is known about mechanisms that work within motor cortex to select appropriate circuits. However, it has been proposed that dopamine plays an important role in regulating spine plasticity in the motor cortex during learning. Dopaminergic projections from the ventral tegmental area are present in the motor cortex, and ablation of dopaminergic terminals in the motor cortex impaired the learning of a reaching task (Hosp et al., 2011). A subsequent study revealed that D2-type dopamine receptors mediate spine addition in the motor cortex (Guo et al., 2015).
The information conveyed by dopaminergic input to the motor cortex during learning is still unclear; dopamine could be actively selecting rewarding pathways, analogous to its role in the striatum, but it is also possible that it simply functions as a permissive factor for normal plasticity. Importantly, however, recent studies have indicated that there is significant degeneracy in the cortical populations corresponding to a particular movement; i.e., there are multiple populations that are effective in eliciting a similar motor behavior. Furthermore, early movements that were by chance very similar to the expert movement were shown to utilize a cortical population that did not necessarily resemble the expert circuit in any clear way (Peters et al., 2014). Thus, the mechanisms and criteria by which cells are selected for inclusion into a stable motor ensemble require further investigation. This process involves activity-induced transcriptional mechanisms, as motor cortex neurons that activate the immediate early gene Arc during motor learning are more likely to be active during subsequent execution of the learned behavior (Cao et al., 2015).

Refinement of a final learned representation

Once a successful motor behavior has been identified, continued practice leads to a highly refined, low-variability version of the skill. What are the circuit dynamics that correspond to this change? Since the population of possible cells initially expanded so as to increase behavioral variability, does the population then decrease to reduce this variability? In line with exactly this possibility, some studies suggest that the initial expansion of the somatotopic map can be followed by a period of contraction, returning the map to a near pre-training size without a corresponding deterioration in the performance of the skill (Molina-Luna et al., 2008; Pascual-Leone et al., 1994).
This phenomenon is echoed by several studies that suggest an overall reduced level of cortical activation during execution of a highly practiced motor behavior (Jenkins et al., 1994; Ma et al., 2010; Picard et al., 2013; Toni et al., 1998; Ungerleider et al., 2002; Wymbs and Grafton, 2015). For example, fMRI measurements in humans showed that professional piano players recruit smaller regions of cortex than control subjects when performing a complex finger movement (Krings et al., 2000). It should be noted, however, that other studies support the notion that M1 activity actually increases during performance of a highly learned skill (Floyer-Lea and Matthews, 2005; Karni et al., 1995; Penhune and Doyon, 2002). This apparent discrepancy could be due to differences in the nature of the motor tasks, or perhaps to the different time points used. Nonetheless, consistent with the reduction in activation size during skilled movements, the later stages of learning in mice yield a renormalization of the neuronal ensemble size in layer 2/3 of motor cortex. This phase of learning coincides with an increased rate of spine elimination on the dendrites of excitatory neurons (Peters et al., 2014). It is likely that this process of refinement is at least partially dissociable from the feedback-based selection discussed earlier, in that the refinement process is selecting from a variety of circuits that can all successfully lead to the desired motor output. While outcome-mediated feedback likely still provides basic boundaries for the behavior to ensure that it is shaped based on outcome, the existing motor-related ensemble at this stage of learning likely operates mostly within these bounds. Thus, any plasticity acting on such an ensemble may rely mostly on outcome-independent mechanisms. Refinement in this context can therefore be thought of as a reduction in the circuit-level degeneracy for the corresponding motor skill (Figure 4C). How could this degeneracy reduction be achieved?
The best-known form of feedback-independent plasticity is of the classical Hebbian form. Such a mechanism of selectively strengthening connections between co-active neurons, combined with homeostatic plasticity to keep the total synaptic strengths constant, could generate a local circuit that can autonomously generate a particular activity pattern in response to repeated activation coinciding with the repetition of the motor skill. In support of this idea, modeling work suggests that unconstrained repeated activation of an artificial neural network, accounting for only Hebbian mechanisms of plasticity and heterosynaptic competition, can lead to the emergence of a stable, reproducible activity pattern (Fiete et al., 2010). This process could thus be classified as unsupervised learning, in which neither an error nor a reward signal need be present for the system to continue to evolve. Unsupervised learning is likely of primary importance for the later stages of skill learning, when a basic model of the behavior has already been generated, allowing mere repetition to ultimately achieve effortless, reproducible performance. Motor cortex is likely a central locus of unsupervised learning, as suggested previously (Doya, 1999). We again note that it is probable that unsupervised learning and feedback-based mechanisms overlap to a certain degree, with feedback continuing to provide a level of supervision over the behavioral products of the Hebbian plasticity in the cortex.
As a final note, it should be pointed out that the role of motor cortex in motor learning and movement execution is debated. While motor cortex is unambiguously required for the learning of motor skills, its involvement in the execution of learned or highly stereotyped movements is controversial.
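The Hebbian mechanism described above, combined with homeostatic normalization, can be illustrated with a toy sketch. This is not the model of Fiete et al. (2010); it is a minimal illustration under stated assumptions. The network size, binary units, learning rate, recall threshold, and multiplicative row normalization are all hypothetical choices made for clarity. The sketch shows that repeating one activity pattern, with no error or reward signal, can shape a recurrent circuit that regenerates the full pattern from a partial cue.

```python
def normalize_rows(weights, total=1.0):
    """Homeostatic step (assumed form): rescale each neuron's incoming
    weights so that their sum stays fixed at `total`."""
    for row in weights:
        row_sum = sum(row)
        if row_sum > 0:
            for j in range(len(row)):
                row[j] *= total / row_sum


def hebbian_step(weights, activity, learning_rate):
    """Hebbian step: strengthen connections between co-active neurons.
    Self-connections are excluded."""
    n = len(activity)
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] += learning_rate * activity[i] * activity[j]


def train(pattern, repetitions=50, learning_rate=0.1):
    """Repeat the same activity pattern; no error or reward signal is used."""
    n = len(pattern)
    # Start from uniform all-to-all connectivity (no self-connections).
    weights = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    normalize_rows(weights)
    for _ in range(repetitions):
        hebbian_step(weights, pattern, learning_rate)
        normalize_rows(weights)
    return weights


def recall(weights, cue, threshold=0.5):
    """One recall step: a neuron fires if external cue plus recurrent
    input exceeds the (hypothetical) threshold."""
    n = len(cue)
    output = []
    for i in range(n):
        drive = cue[i] + sum(weights[i][j] * cue[j] for j in range(n))
        output.append(1 if drive > threshold else 0)
    return output


PATTERN = [1, 1, 1, 1, 0, 0, 0, 0]      # ensemble active during the skill
PARTIAL_CUE = [1, 1, 0, 0, 0, 0, 0, 0]  # only half the ensemble is driven

naive_weights = train(PATTERN, repetitions=0)
trained_weights = train(PATTERN, repetitions=50)

print("before repetition:", recall(naive_weights, PARTIAL_CUE))
print("after repetition: ", recall(trained_weights, PARTIAL_CUE))
```

In this sketch the untrained network returns only the cued neurons, whereas after repetition the partial cue recruits the whole ensemble. Because each neuron's total incoming weight is held constant, strengthening connections within the ensemble necessarily weakens connections from neurons outside it, which is one simple way to picture a reduction in circuit-level degeneracy.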
A recent study showed that post-learning lesion of motor cortex did not affect the execution of stereotyped sequences of movements learned in an unconstrained manner (Kawai et al., 2015), while another study has shown that the same manipulation eliminated the ability to perform a skilled reach-and-grasp task (Conner et al., 2005). A unifying principle seems to be that the more dexterous and awkward the learned movements are, the more dependent the execution is on the motor cortex. Furthermore, the degree of training may also be an important factor such that overtraining may gradually reduce cortical dependence.
Concluding Remarks
We have reviewed a variety of changes in neural activity and connectivity patterns during sensorimotor learning. While these changes underscore the dynamic nature of the sensorimotor pathway, drawing a causal link between neural changes and behavioral improvement remains a fundamental challenge in the study of learning. This is especially challenging in cases where neural changes are highly distributed across many brain areas. For example, we have reviewed evidence that perceptual learning induces neural changes at multiple levels of sensory processing. An ideal test for the necessity of neural changes in learning would be to block the changes without affecting other aspects of circuit functions. Pharmacological or genetic inactivation of NMDARs has been used as a means to assess the necessity of synaptic plasticity in a brain region of interest. However, NMDAR signaling is required not only for synaptic plasticity but also for basal synaptic transmission, so the interpretation of these experiments is not straightforward. As a potentially more specific approach, a study reported the development of a molecular genetic tool that is designed to reverse recent synaptic potentiation events when activated by light (Hayashi-Takagi et al., 2015).
When this tool was activated in the motor cortex following training, impairment in motor skill learning was observed. Another study identified a plasticity event in a specific class of inhibitory neurons associated with motor skill learning, which could play a permissive role in allowing excitatory circuit plasticity. The authors attempted to test this idea by controlling the activity of these inhibitory neurons using optogenetics, which blocked normal synaptic plasticity in excitatory neurons and impaired motor skill learning (Chen et al., 2015b). Additionally, an interesting study reported that the expression of a conditioned fear response could be inactivated and reactivated by optogenetic protocols that would weaken or strengthen the synapses of auditory inputs onto the amygdala (Nabavi et al., 2014). While the need to test the specificity of these manipulations cannot be overstated, the expanding molecular toolkit will allow researchers to perform increasingly specific manipulations to test the causality of neural changes in learning. Lastly, we note that the pursuit of the precise neural changes that support learning is further confounded by the fundamentally fluid nature of memory. It has long been appreciated that the stability of a memory is dependent on time such that older memories are often more stably maintained. This implies that the underlying neural mechanisms, including involved brain regions, may be dynamically shifting over time. In fact, an emerging principle that we proposed in the associative learning section is a gradual shortening of the pathway connecting sensory inputs to motor outputs. Such fluidity and the distributed nature of the memory trace make it a major challenge to identify the precise changes in brain circuits that mediate behavioral improvement during learning. Major progress would be afforded by holistic, brain-wide observations of changes combined with manipulations with high molecular, temporal and spatial precision.
Acknowledgments
We thank members of the Komiyama lab, especially M. Chu, B. Danskin, R. Hattori, and H. Liu for comments and discussions. This research was supported by grants from NIH (R01 DC014690-01, R21 DC012641, R01 NS091010A, U01 NS094342 and R01 EY025349), Human Frontier Science Program, Japan Science and Technology Agency (PRESTO), New York Stem Cell Foundation, David & Lucile Packard Foundation, Pew Charitable Trusts and McKnight Foundation to T.K. and by the NARSAD Young Investigator Grant to H.M. N.G.H. is supported by an NIH training grant (T32NS007220). T.K. is a NYSCF-Robertson Investigator.
Footnotes
AUTHOR CONTRIBUTIONS The first, second and third sections were written mainly by H.M., E.J.H. and N.H., respectively, with inputs from all other authors.
What are some examples of sensorimotor?
Sensorimotor stage examples include instances when you hide an object under a blanket and the child tries to find it. Other examples are seeing, touching, sucking, and feeling.
What are sensorimotor activities?
Crawling, balancing, visual tracking, and coordination are all ways that a baby experiences the world while simultaneously developing their brain and body. Often, children who struggle with learning or developmental disorders have sensorimotor system delays.
What is learning through in Piaget's sensorimotor stage?
During this earliest stage of cognitive development, infants and toddlers acquire knowledge through sensory experiences and manipulating objects. A child's entire experience at the earliest period of this stage occurs through basic reflexes, senses, and motor responses.
What are the 6 stages of sensorimotor development?
The sensorimotor stage is composed of six sub-stages and lasts from birth through 24 months. The six sub-stages are reflexes, primary circular reactions, secondary circular reactions, coordination of reactions, tertiary circular reactions, and early representational thought.