Prior precision modulates the minimisation of prediction error in human auditory cortex

The predictive coding model of perception proposes that successful representation of the perceptual world depends upon cancelling out the discrepancy between prediction and sensory input (i.e., prediction error). Recent studies further suggest a distinction between prediction error associated with non-predicted stimuli of different prior precision (i.e., inverse variance). However, it is not fully understood how prediction error from different precision levels is minimised in the predictive process. The current research used magnetoencephalography (MEG) to examine whether prior precision modulates the cortical dynamics of perceptual inference. We presented participants with cycles of repeated tone quartets, each consisting of three prime tones and one probe tone. Within each cycle, the three prime tones remained identical while the probe tone changed at some random point (e.g., from repetition of 123X to repetition of 123Y). Therefore, the repetition of probe tones can reveal the development of perceptual inferences in low and high precision contexts depending on its position within the cycle. We found that the two conditions resembled each other in terms of N1m modulation (as both were associated with N1m suppression) but differed in terms of N2m modulation. While repeated probe tones in the low precision context did not exhibit any modulatory effect, repeated probe tones in the high precision context elicited a suppression and rebound of the N2m source power. The differentiation suggests that the minimisation of prediction error in low and high precision contexts likely involves distinct mechanisms.


Our brain constantly predicts forthcoming sensory inputs. The predictive coding model of as prediction error and communicate it forward to the next higher level. The message-passing between hierarchical cortical levels iterates to match prediction and sensory input as much as possible, that is, to minimise prediction error in the system.

Recent research further suggested the necessity to distinguish between two conditions inducing prediction error: the unpredicted condition (where there is no precise prediction) and the mispredicted condition (where a precise prediction is violated). Conceptually, the unpredicted condition is mainly associated with prediction error generated by sensory input that is not anticipated, whereas the mispredicted condition triggers not only prediction error generated by sensory input that is not anticipated but also prediction error generated by a prediction that is not fulfilled (Arnal & Giraud, 2012). The dissociation was supported by

The differentiation raised the question of whether the minimisation of prediction error in low and high precision contexts also involves distinct mechanisms. Here we looked into the neuronal underpinnings of the minimisation of prediction error in low and high precision contexts using magnetoencephalography (MEG). Specifically, we examined whether there is a difference between the two conditions in terms of N1m and N2m modulation, given that these long-latency components are mediated by top-down effects in cortical networks and therefore rest on backward connections (Garrido et al., 2007). We presented participants with cycles of repeated tone quartets, each consisting of three prime tones and one probe tone. Within each cycle, the three prime tones remained identical while the probe tone changed at some random point (e.g., from repetition of 123X to repetition of 123Y). Therefore, the repetition of probe tones can reveal the development of perceptual inferences in low and high precision contexts depending on its position within the cycle. In the beginning of a cycle, where the three prime tones are of little predictive value, the presentation of probe tone X triggers prediction error in a low precision context (because listeners would predict a probe tone to be presented but cannot be quite sure of its frequency). In the middle of a cycle, where the prime tones are already associated with probe tone X, the presentation of probe tone Y triggers prediction error in a high precision context (because listeners would tend to predict probe tone X to appear but such

Eighteen healthy adults (average age: 24; 6 males; 14 right-handed) with no history of neurological, psychiatric, or visual/hearing impairments as indicated by self-report participated in the experiment. Participants gave written informed consent and were paid for participation. Ethical approval was granted by the ethics committee of National Taiwan Normal University (Taiwan) and the University of Jyväskylä (Finland). Four participants were excluded from data analysis for excessive measurement noise, leaving fourteen participants in the final sample (average age: 24; 3 males; 12 right-handed).

Sinusoidal tones with a loudness of 80 phons (i.e., 80 dB for tones of 1000 Hz) were generated using Matlab. The duration of each tone was 50 ms (including 5 ms rise/fall times). The frequency of each tone was within the range of 261.626-493.883 Hz, matching the absolute frequencies of a series of seven natural keys on a modern piano (i.e., C4 D4 E4 F4 G4 A4 B4).

A total of 90 pairs of tone quartets (each consisting of three prime tones and one probe tone) were created. The two quartets in each pair were identical in the prime tones but differed in the frequency of the probe tone (e.g., F4-E4-G4-A4 and F4-E4-G4-D4). The frequency of the prime
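The frequency range reported for the seven natural keys can be reproduced from standard twelve-tone equal temperament. The following Python sketch is illustrative only: the reference pitch of A4 = 440 Hz and the names `A4_HZ`, `SEMITONES_FROM_A4`, and `note_frequency` are assumptions introduced here and are not taken from the original stimulus scripts, which were written in Matlab.

```python
# Equal-temperament frequencies of the natural keys C4..B4.
# Assumption: concert pitch A4 = 440 Hz (not stated in the text).
A4_HZ = 440.0

# Semitone distance of each natural key from A4.
SEMITONES_FROM_A4 = {
    "C4": -9, "D4": -7, "E4": -5, "F4": -4, "G4": -2, "A4": 0, "B4": 2,
}


def note_frequency(note: str) -> float:
    """Return the frequency in Hz of a natural key between C4 and B4."""
    return A4_HZ * 2.0 ** (SEMITONES_FROM_A4[note] / 12.0)


# Frequencies rounded to three decimals, as in the reported range.
frequencies = {note: round(note_frequency(note), 3) for note in SEMITONES_FROM_A4}

for note, hz in frequencies.items():
    print(f"{note}: {hz:.3f} Hz")
```

Under this assumption the lowest and highest keys (C4 and B4) fall on the two endpoints of the range given above.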

A total of 10 blocks of 9 cycles were presented. Each cycle consisted of the repetition of a pair of tone quartets, where the first tone quartet was repeated 4 to 6 times before the second tone quartet was repeated 4 to 6 times. We presented each tone quartet 4 to 6 times to prevent participants from learning high-order regularities (e.g., correctly anticipating a change in probe tone). Therefore, a cycle could contain 8 to 12 tone quartets. While the repetition of the first tone quartet turned the initially non-predicted probe tone into a predicted tone in a low precision context, the repetition of the second tone quartet turned the initially non-predicted probe tone into a predicted tone in a high precision context (Figure 1A).

Figure 1B illustrates a tone quartet, which started with a silent interval of 500 ms. The tones were separated by a 500 ms stimulus onset asynchrony (SOA). 10% of the probe tones were attenuated by 20 dB. To maintain their attention, participants were required to press a key as soon as possible when they detected a softer probe tone. The offset of the probe tone was followed by a jittered inter-trial interval (ITI) of 700-800 ms. There was no separation between cycles distinct from the ITI. A fixation cross remained on the screen for the duration of the block. The whole experiment took around 42 minutes (i.e., 900 trials x 2800 ms).
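The trial count and duration quoted above follow from the timing parameters. The Python sketch below retraces that arithmetic under two assumptions that the text does not state explicitly: that the ITI jitter is symmetric around its midpoint (750 ms), and that each quartet is repeated 5 times on average. All constant and function names are hypothetical.

```python
# Timing parameters taken from the procedure description (milliseconds).
SILENCE_MS = 500           # silent interval at the start of a quartet
SOA_MS = 500               # onset-to-onset interval between successive tones
TONE_MS = 50               # duration of each tone
ITI_RANGE_MS = (700, 800)  # jittered inter-trial interval after probe offset

N_BLOCKS = 10
CYCLES_PER_BLOCK = 9
QUARTETS_PER_CYCLE = 2     # first and second quartet of a pair
REPS_RANGE = (4, 6)        # repetitions of each quartet


def trial_duration_ms(iti_ms: float) -> float:
    """Duration of one quartet (trial) for a given ITI."""
    probe_onset = SILENCE_MS + 3 * SOA_MS  # three SOAs separate four tone onsets
    return probe_onset + TONE_MS + iti_ms


# Assumption: jitter and repetition counts are symmetric around their midpoints.
mean_iti_ms = sum(ITI_RANGE_MS) / 2
mean_reps = sum(REPS_RANGE) / 2

mean_trial_ms = trial_duration_ms(mean_iti_ms)
expected_trials = N_BLOCKS * CYCLES_PER_BLOCK * QUARTETS_PER_CYCLE * mean_reps
total_minutes = expected_trials * mean_trial_ms / 60_000

print(mean_trial_ms, expected_trials, total_minutes)
```

With these midpoint values the sketch gives a 2800 ms trial, 900 trials, and 42 minutes, consistent with the figures reported in the text.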

Presentation (Neurobehavioral Systems, Inc., USA) was used for stimulus presentation.

Data recording and analysis
MEG data was collected using a 306-channel whole-head device (Elekta Neuromag, Finland) in a two-layered magnetically shielded room at the University of Jyväskylä. The sampling rate was 1000 Hz. A high-pass filter of 0.03 Hz and a low-pass filter of 200 Hz were used.

Continuous head position monitoring was used based on five Head-Position Indicator (HPI) coils, with three at the forehead and two behind the ears. Electrooculography (EOG) was recorded using electrodes lateral to each eye and above and below the left eye.

Offline, head movements were corrected and external noise sources were attenuated using the temporal extension of the source subspace separation algorithm (Taulu et al., 2005) in the MaxFilter program (Elekta Neuromag, Finland).

After the initial head movement correction, the data was analysed using BrainStorm 3.2 (Tadel et al., 2011). Signal subspace projection was used to correct for eye blinks. The MEG signal was filtered at 1-40 Hz and segmented from -100 ms to 500 ms relative to stimulus onset, using a 100 ms pre-stimulus baseline. Segments with peak-to-peak values over 5000 fT/cm in gradiometers or 7000 fT in magnetometers were rejected. As all tone quartets were repeated at least 4 times, segments corresponding to the 5th and 6th presentations of tone quartets were also rejected to ensure that our analysis was based on an equal number of trials. The trial numbers after artefact rejection in each condition are listed in Table 1.

The experimental effects were examined in source space. As individual magnetic resonance images (MRI) were not available from the participants, the ICBM152 MRI template was used.
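The segmentation window and the peak-to-peak rejection criteria described above can be illustrated with a short sketch. This is not the BrainStorm pipeline used in the study: it is a plain-Python approximation in which the function names, the nested-list data layout (one list of samples per channel), and the assumption that amplitudes are already expressed in fT/cm and fT are all introduced here for illustration.

```python
SFREQ_HZ = 1000                  # sampling rate reported in the text
TMIN_MS, TMAX_MS = -100, 500     # segment limits relative to stimulus onset
GRAD_LIMIT_FT_CM = 5000.0        # gradiometer peak-to-peak rejection threshold
MAG_LIMIT_FT = 7000.0            # magnetometer peak-to-peak rejection threshold


def epoch_bounds(onset_sample: int) -> tuple[int, int]:
    """First and last sample index (inclusive) of the segment around an onset."""
    start = onset_sample + TMIN_MS * SFREQ_HZ // 1000
    stop = onset_sample + TMAX_MS * SFREQ_HZ // 1000
    return start, stop


def baseline_correct(samples: list[float], n_baseline: int = 100) -> list[float]:
    """Subtract the mean of the pre-stimulus samples from the whole segment."""
    baseline = sum(samples[:n_baseline]) / n_baseline
    return [value - baseline for value in samples]


def peak_to_peak(samples: list[float]) -> float:
    """Difference between the largest and smallest value in one channel."""
    return max(samples) - min(samples)


def keep_epoch(grad_channels: list[list[float]],
               mag_channels: list[list[float]]) -> bool:
    """Keep a segment only if no channel exceeds its sensor-type threshold."""
    grads_ok = all(peak_to_peak(ch) <= GRAD_LIMIT_FT_CM for ch in grad_channels)
    mags_ok = all(peak_to_peak(ch) <= MAG_LIMIT_FT for ch in mag_channels)
    return grads_ok and mags_ok
```

At 1000 Hz, a -100 to 500 ms window corresponds to 601 samples per segment, of which the first 100 form the pre-stimulus baseline.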

The weighted minimum norm estimates (wMNE) were calculated using the unconstrained

To extract the N1m and N2m measures, we first identified the N1m and N2m from the grand average global field power (GFP) of the gradiometers (across 14 participants and 32 conditions) (Figure 2A). Then, we identified brain regions from the Desikan-Killiany parcellation which showed the largest source activity around the auditory cortices at the N1m and N2m, including

The grand average source solution (across 2 hemispheres, 5 brain regions, 14 participants, and 32 conditions) was used to identify the N1m and N2m time windows for statistical analysis. The source power in the N1m and N2m time windows of the probe tones was submitted to a 2 (precision: low/high precision context) x 4 (repetition: 1st/2nd/3rd/4th presentation) repeated measures analysis of variance (ANOVA). Greenhouse-Geisser correction was applied when appropriate (and will be indicated in the following section with epsilon values).

10
The ANOVA on the N1m source power showed only a main effect of repetition (F(3,39) = 6.33, p < 0.01, partial eta squared = 0.33) (Figure 4 left). The effect was due to a larger response to the 1st presentation compared to all the other presentations (Table 2). No significant differences were found between the response strengths for the other presentations.
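The degrees of freedom and effect size reported for the repetition effect can be cross-checked with the standard relation between an F statistic and partial eta squared, eta_p^2 = (F x df_effect) / (F x df_effect + df_error). The Python sketch below is a consistency check only, not part of the original analysis; the function names are hypothetical, and the degrees of freedom are the uncorrected ones for a within-subject factor.

```python
def repeated_measures_dfs(n_participants: int, n_levels: int) -> tuple[int, int]:
    """Uncorrected effect and error degrees of freedom for one within-subject factor."""
    df_effect = n_levels - 1
    df_error = (n_participants - 1) * (n_levels - 1)
    return df_effect, df_error


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# 14 participants and 4 repetition levels, as reported in the text.
df_effect, df_error = repeated_measures_dfs(n_participants=14, n_levels=4)
eta_p2 = partial_eta_squared(6.33, df_effect, df_error)

print(df_effect, df_error, round(eta_p2, 2))
```

For 14 participants and 4 repetition levels this yields degrees of freedom of 3 and 39, and F = 6.33 corresponds to a partial eta squared of about 0.33, in agreement with the reported values.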
The ANOVA on the N2m source power revealed a precision x repetition interaction (F(3,39)

The current research used MEG to examine whether prior precision modulates the cortical

error in low and high precision contexts involves distinct mechanisms.

In electrophysiology literature, N1/N1m is known to reflect multiple processes of signalling

precision context) and prediction alteration (as in high precision context). Instead, N1m seems to reflect the overall reduction in prediction error.

According to the predictive coding model of perception, prediction error can be adjusted

Prediction error is weighted less in low than high precision contexts (Schröger et al., 2015), leading to smaller N1 responses to target tones following random than regular tone sets in EEG (Hsu et al., 2015). We speculate that the difference between low and high precision contexts might be less conspicuous here so that we did not obtain a main effect of precision on the N1m

Nevertheless, the differentiation between prediction error processing in low and high precision contexts was evident on the N2m source power. Specifically, tones in the low precision context triggered smaller source power than tones in the high precision context upon the 1st presentation and the 4th presentation. More importantly, stimulus repetition triggered different response patterns in low and high precision contexts. While repeated tones in low precision context did

error can manifest differently at the cortical level depending on its initial precision status.

Specifically, novel probe tones presented in the beginning of each cycle are associated with lower prior precision, as listeners had a general expectation that a probe tone would appear but had little if any idea concerning its frequency. The repetition of these stimuli increases the precision of prediction error, which in turn minimises the cortical responses encoding prediction error. It is possible that this process takes place automatically in the auditory cortices near planum temporale, hence modulating the N1m but not the N2m.

On the other hand, novel probe tones presented in the middle of each cycle are associated with

Although measures were taken to prevent participants from learning high-order regularities in the current research, it cannot be excluded that participants might have become aware of the stimulus structure (i.e., that the probe tones would change after 4 to 6 repetitions). However, if this happened, participants would expect changes of probe tones in both the low and high precision contexts. Therefore, it cannot account for the difference between conditions reported here. It is also unlikely that the dissociation of probe tones in low and high precision contexts was due to how much the probe tones differed from their preceding tones (i.e., the three prime tones) in terms of frequency, because the frequencies of these tones were determined by random sampling. The allocation of these tones to the low/high precision context was dependent on their position within a cycle (i.e., whether they were presented in the beginning/middle of a cycle) rather than their frequency.

The dissociation of probe tones in low and high precision context is closely related to the mixed