Dentate Gyrus Integrity Is Necessary for Behavioral Pattern Separation But Not Statistical Learning

Abstract
Pattern separation, the creation of distinct representations of similar inputs, and statistical learning, the rapid extraction of regularities across multiple inputs, have both been linked to hippocampal processing. It has been proposed that there may be functional differentiation within the hippocampus, such that the trisynaptic pathway (entorhinal cortex > dentate gyrus > CA3 > CA1) supports pattern separation, whereas the monosynaptic pathway (entorhinal cortex > CA1) supports statistical learning. To test this hypothesis, we investigated the behavioral expression of these two processes in B. L., an individual with highly selective bilateral lesions in the dentate gyrus that presumably disrupt the trisynaptic pathway. We tested pattern separation with two novel auditory versions of the continuous mnemonic similarity task, requiring the discrimination of similar environmental sounds and trisyllabic words. For statistical learning, participants were exposed to a continuous speech stream made up of repeating trisyllabic words. They were then tested implicitly through an RT-based task and explicitly through a rating task and a forced-choice recognition task. B. L. showed significant deficits in pattern separation on the mnemonic similarity tasks and on the explicit rating measure of statistical learning. In contrast, B. L. showed intact statistical learning on the implicit measure and the familiarity-based forced-choice recognition measure. Together, these results suggest that dentate gyrus integrity is critical for high-precision discrimination of similar inputs, but not the implicit expression of statistical regularities in behavior. Our findings offer unique new support for the view that pattern separation and statistical learning rely on distinct neural mechanisms.


INTRODUCTION
The ability to distinctly remember events with overlapping features is an integral part of episodic memory functioning. Pattern separation refers to the formation of discrete neural representations of similar inputs and is thought to support the ability to distinguish between similar events in memory (Marr, Willshaw, & McNaughton, 1971). A separate, yet equally important feature of memory is the extraction of commonalities shared across events, which supports the ability to predict future events. The process of extracting such regularities in the environment over time is referred to as statistical learning (SL). At the functional level, pattern separation and SL serve different goals and may rely on computations that cannot be supported by the same neural circuitry. The current study aims to investigate whether the two processes are dissociable in their neural mechanisms by examining the impact of a highly selective lesion in a structure that has previously been linked to both of them, namely, the hippocampus. The hippocampus has long been known to play an important role in episodic memory (Vargha-Khadem et al., 1997; Squire & Zola-Morgan, 1991; Scoville & Milner, 1957). Within the hippocampus, computational models have posited that the sparse coding of granule cells in the dentate gyrus subregion forms distinct representations of each episodic instance, which are then passed on to the downstream CA3 through the mossy fiber pathway (Norman & O'Reilly, 2003; McClelland, McNaughton, & O'Reilly, 1995; Marr et al., 1971). Recordings from granule cells in rodent models have shown heightened activation levels during spatial pattern separation (Leutgeb, Leutgeb, Moser, & Moser, 2007). Evidence of pattern separation in humans comes from studies that employed fMRI adaptation paradigms in combination with presentation of pictures of objects of varying similarity (Lacy, Yassa, Stark, Muftuler, & Stark, 2011; Bakker, Kirwan, Miller, & Stark, 2008). 
In these studies, the activation level of a hippocampal region that comprised dentate gyrus and CA3 elicited by visual "lures" (items similar to previously encountered items) was found to more closely resemble that elicited by new items than by repeated items. This result has been taken to suggest that dentate gyrus/CA3 is sensitive to small changes in input signals (Bakker et al., 2008), supporting a potential role in pattern separation. Ultrahigh-resolution functional neuroimaging has also revealed distinguishable activation patterns for similar scenes in the dentate gyrus, and not in other hippocampal subfields, when the stimuli were matched on the novelty-familiarity dimension (Berron et al., 2016). There is also evidence from patient studies indicating that individuals with amnesic mild cognitive impairment and reduced volume in dentate gyrus/CA3 show deficits in their ability to discriminate previously seen objects from similar lures (Yassa et al., 2010). Collectively, these results support the role of dentate gyrus in pattern separation. It must be noted, however, that some of the imaging protocols employed did not have sufficient resolution to allow for clear differentiation of dentate gyrus from neighboring CA3.
Further support for the critical role of the dentate gyrus in pattern separation is provided by the case report of B. L., an amnesic individual with highly selective hippocampal lesions affecting the dentate gyrus bilaterally (Baker et al., 2016; Kwan et al., 2015). B. L.'s pattern separation abilities were assessed on a behavioral task consisting of incidental encoding of a series of visual objects, followed by a test requiring the classification of old objects, lures, and novel foils as "old," "similar," or "new" (mnemonic similarity task [MST]; Stark, Yassa, Lacy, & Stark, 2013). B. L. was able to correctly identify most of the targets and foils, but showed a heightened tendency to endorse lures as previously seen, indicating normal recognition abilities along with a specific deficit in hippocampally dependent pattern separation (Baker et al., 2016).
Although many different sources of evidence have implicated the hippocampus in computations of pattern separation that allow for the representation of highly similar events, recent findings suggest it is also involved in rapidly generalizing and extracting statistical regularities across multiple events (Ellis et al., 2021; Henin et al., 2021; Covington, Brown-Schmidt, & Duff, 2018; Schapiro, Gregory, Landau, McCloskey, & Turk-Browne, 2014; Turk-Browne, Scholl, Chun, & Johnson, 2009). SL is typically studied by presenting participants with a continuous stream of repeating triplets (e.g., ABCDEFABCGHI…), such that items within a triplet (e.g., A-B) co-occur more frequently than cross-boundary items (e.g., C-D). Participants' sensitivity to this hidden structure is subsequently assessed on a recognition test, in which participants discriminate between target triplets and foils. Neuroimaging studies have reported enhanced hippocampal activation during viewing of visual items in a structured order compared with a random order (Ellis et al., 2021; Schapiro, Kustner, & Turk-Browne, 2012; Turk-Browne et al., 2009). Similarly, using intracranial recording, Henin et al. (2021) found evidence for unit (pairs or triplets)-based item organization in the hippocampus during both visual and auditory SL. Further evidence comes from developmental work showing that individual differences in the volume of the hippocampus predict auditory (Finn, Kharitonova, Holtby, & Sheridan, 2019) and visual (Schlichting, Guarino, Schapiro, Turk-Browne, & Preston, 2017) SL in young children and adults. Although beyond the focus of the current study, many other brain regions outside of the hippocampus have also been implicated in SL, including sensory/perceptual cortical regions, the left inferior frontal gyrus, and the striatum (see Batterink, Paller, & Reber, 2019, for review). 
Their involvement has been suggested to differ depending on various aspects of SL, such as sensory modality, input complexity, engagement of attention, and the types of statistical representations that are acquired (Henin et al., 2021; Conway, 2020).
Neuropsychological evidence also supports the role of the hippocampus in SL. Schapiro et al. (2014) tested the SL abilities of the case L.S.J., who suffered complete bilateral hippocampal tissue loss and broader medial temporal lobe (MTL) damage, across four different types of visual and auditory stimuli (shapes, scenes, syllables, and tones). L.S.J. showed at-chance performance in the explicit recognition of target triplets from novel foils across all four types of stimuli. Using the same paradigm, Covington et al. (2018) reported that four other individuals with either selective hippocampal damage or more extensive MTL damage also showed impaired recognition performance relative to healthy controls, although here performance was still above chance. However, a key limitation of these studies is that statistical-learning performance was tested only using the typical forced-choice recognition task, which primarily captures explicit knowledge based on stimulus familiarity. Given previous work that SL produces dissociable implicit and explicit knowledge in healthy adults (Batterink, Reber, Neville, & Paller, 2015), these previous studies leave open the possibility that sensitivity to statistical structure in behavior that does not require explicit memory judgments may occur independently of hippocampal contributions. Such implicit markers of statistical structure may be present, for example, in implicit RT measures of learning (Batterink et al., 2015).
To reconcile the involvement of the hippocampus in both pattern separation and SL, Schapiro, Turk-Browne, Botvinick, and Norman (2017) have postulated that the two processes may rely on separate neural circuitry within the hippocampus. Building on the influential complementary learning systems model that differentiates between functionally distinct hippocampal and neocortical learning mechanisms (McClelland et al., 1995), Schapiro et al. hypothesized that similar complementary division of labor exists within the hippocampus, in the form of the trisynaptic pathway and the monosynaptic pathway, respectively. In Schapiro et al.'s model, the trisynaptic pathway, which projects from the entorhinal cortex to the dentate gyrus, CA3, and CA1 in successive stages, is considered to underlie the encoding of specific individual episodic instances. By contrast, the monosynaptic pathway, projecting directly from the entorhinal cortex to CA1, is proposed to support the integration of inputs across multiple temporally separated instances. Schapiro et al. tested their theory using a neural network model that simulated these physiological properties of the hippocampus. When the model was presented with an item sequence with explicit pair boundaries (AB|CD|EF|GH| AB…), paired items were represented distinctly within the dentate gyrus and CA3. In contrast, when items were presented continuously such that item pairings could only be learned through tracking the statistical regularities over time (ABCDABEFABEF…), item pairs were represented more strongly within CA1 than in the dentate gyrus or CA3. Moreover, a model with a "lesioned" trisynaptic pathway still successfully tracked statistical regularities, suggesting that the monosynaptic pathway alone is sufficient to support SL.
A few empirical studies provide some preliminary support for the proposed division of labor between the trisynaptic pathway and the monosynaptic pathway. In infants, the correlation between SL and hippocampal volume has been found to be stronger in the anterior hippocampus (Ellis et al., 2021), which contains a greater proportion of the CA1 subfield (Canada, Hancock, & Riggins, 2021; Malykhin, Lebel, Coupland, Wilman, & Carter, 2010), than in the posterior hippocampus. Sherman, Graves, and Turk-Browne (2020) hypothesized that, because the CA1 region is shared by the two pathways, episodic encoding and SL are in direct competition with each other. Supporting this hypothesis, a neuroimaging study reported that when faced with competing goals of encoding the current, predictive (A) stimulus and predicting the upcoming (B) stimulus, the hippocampus represented the current stimulus more weakly than the future stimulus, which also coincided with lower recognition performance for predictive stimuli. The finding that prediction of future events impairs the episodic encoding of the current event is in line with the view that the two processes are in direct competition because of the overlap between their proposed neural mechanisms. However, to our knowledge, no study in human participants has directly tested the involvement of the trisynaptic pathway and the monosynaptic pathway in pattern separation and SL, respectively.
We had a unique opportunity to assess whether there is a division of labor for pattern separation and SL within the hippocampus by testing B. L., who has a rare, highly selective lesion of bilateral dentate gyrus, along with documented highly specific behavioral impairments in discriminating similar objects in memory (Baker et al., 2016). Based on the model of hippocampal circuit organization proposed by Schapiro et al. (2017), we specifically tested the hypothesis that a selective dentate gyrus lesion would impair pattern separation but leave SL intact. To match learning materials across these two domains, highly similar spoken syllables were used as stimuli across both pattern separation and SL tasks. Following the rationale put forward in the computational modelling work conducted by Schapiro et al. (2017), participants' ability to encode and distinguish individually chunked items was taken as a measure of pattern separation, and their ability to extract items from a continuous syllable sequence was taken as a measure of SL. To capture both implicit and explicit knowledge accrued as a result of SL, we used three separate SL tasks: (1) a rating task in which participants rated the familiarity of target triplets (words) and two types of foil triplets (partwords and nonwords), each presented in isolation, (2) a forced-choice recognition task in which participants had to select target triplets that were pitted directly against nonwords, and (3) an RT-based task that indirectly measured learning of statistical regularities through RT facilitation. Moreover, to ensure that the behavioral evidence for auditory pattern separation in the case of B. L. was not specific to linguistic stimuli, we also administered a second pattern separation task using common environmental sounds. We predicted that B. L. would demonstrate a deficit in behavioral markers of pattern separation on both tasks. 
Critically, we further predicted that implicit expression of SL would be preserved in B. L., but that this intact performance would possibly go hand in hand with impairments on SL tasks that require explicit stimulus discrimination, as previously observed in patients with less selective hippocampal lesions (Covington et al., 2018; Schapiro et al., 2014).

B. L.
B. L. was 60 years old at the time of testing and has 13 years of education. At age 24 years, B. L. suffered anoxic brain injury as a result of an electrical injury and cardiac arrest. As reported by Baker et al. (2016), high-resolution 3 T MRI scans revealed highly selective bilateral ischemic lesions in the hippocampus that were primarily restricted to the dentate gyrus and a portion of CA3 (Figure 1), with signal abnormalities detected in the dentate gyrus bilaterally but not in CA1-2 or subiculum. Comparison of the volume of hippocampal subfields between B. L. and 119 age-matched controls revealed that B. L.'s dentate gyrus volume is approximately 50% smaller along the entire anterior-posterior axis relative to controls (Baker et al., 2016). CA1, in contrast, is not reduced in size; in fact, it is numerically larger (8%) than in controls (Baker et al., 2016). Whole-brain imaging (Baker et al., 2016) also revealed small volume reductions in the left superior-posterior parietal cortex and right precuneus. B. L.'s recent neurocognitive performance was reported in Mitchnick et al. (2022). B. L. scored 24/30 on the Montreal Cognitive Assessment, which is slightly below the cutoff point for identifying mild cognitive impairment (26/30; Nasreddine et al., 2005).

Controls
Control participants matched to B. L. in terms of age, education level, and linguistic background were recruited both from the local community and through a crowdsourcing platform for online studies, Prolific.co. Community participants (n = 12, six women) had an average age of 60.5 (range = 57-65) years and completed an average of 15 years of education (range = 10-18 years). All participants were monolingual English speakers, had normal vision and hearing, and had no history of neurological or psychiatric disorders. Testing of the community participants took place at the University of Western Ontario over two separate days.
Additional participants were recruited from Prolific to achieve a sample size of control participants similar to that of a previous study that also compared B. L.'s MST performance to controls (Baker et al., 2016, who report a sample of n = 20). Prolific participants were filtered based on their response to Prolific's screening questionnaire. Participants were required to be between 55 and 65 years old, to identify as male, to be native English speakers, to have completed 12-14 years of education, to have no hearing difficulties, and to have no history of neurological or psychiatric disorders. Recruitment on Prolific was conducted separately for word MST, sound MST, and SL tasks, and each participant completed just a single task. Eleven participants were recruited for word MST, 11 for sound MST, and 14 for SL (nonoverlapping samples). Informed consent was obtained from all participants in compliance with the research ethics board of the University of Western Ontario, York University, and Baycrest Health Sciences. All participants were compensated for their time.

Word Mnemonic Similarity Task
The stimuli for this task were generated from a set of 25 unique trisyllabic nonsense words (e.g., gopula), created from 75 unique syllables. These 25 words were presented in the task as "First presentation" items. Of this set, five words were each repeated 10 additional times to create "Repeat" items. Another five of the original set were each used to create five additional "Lures" by recombining the three syllables in five different ways (e.g., gopula: golapu, lagopu, lapugo, pugola, pulago; referred to as a "lure family"; see the bottom of Figure 2A). The words in each of these lure families are highly similar to the original word and to one another; correctly differentiating them requires one to remember not only the three syllables but also the order in which they were combined (Forest, Finn, & Schlichting, 2022; Park, Rogers, & Vickery, 2018). The remaining 15 first presentation items served as foils and were never repeated. Thus, the task consisted of 25 first presentation trials, 25 lure trials, and 50 repeat trials, for a total of 100 trials (Figure 2A). During the task, the items were ordered such that each repeat or lure item was separated by an average of six intervening items (range = 2-12 items). The first trial associated with an "Old" response occurred approximately 10 items into the task.
All word items were created by randomly pairing three different consonants with three different vowels (i.e., CVCVCV), with the constraint that the syllables used in lure and repeat items were never used in another word. To create the auditory stimuli for this task, Microsoft Word's "Read Aloud" function was used to produce individual syllables that were recorded and combined into words with Audacity. Each word was approximately 1 sec long, and the perceived loudness was normalized across all sound files.
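The lure-family construction and the trial counts above can be illustrated with a short sketch. It is not the authors' stimulus-generation script: the function name `lure_family` and the variable names are hypothetical, and the only inputs are the example word and the counts given in the text.

```python
from itertools import permutations

def lure_family(syllables):
    """Return every reordering of the syllables except the original order."""
    original = tuple(syllables)
    return ["".join(p) for p in permutations(original) if p != original]

# Example word from the text: gopula = go + pu + la
lures = lure_family(["go", "pu", "la"])

# Trial counts as stated in the text
n_first_presentation = 25                 # 25 unique words, each presented once
n_lure_trials = 5 * len(lures)            # 5 lure families x 5 reorderings
n_repeat_trials = 5 * 10                  # 5 words x 10 additional presentations
n_trials = n_first_presentation + n_lure_trials + n_repeat_trials
```

Three syllables admit six orderings, so excluding the original leaves exactly the five lures listed for gopula.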

Environmental Sound Mnemonic Similarity Task
The experimental stimuli consisted of 88 unique common environmental sounds collected from Internet sources (FindSounds.com; Free SFX.co.uk; ZapSplat.com), which included sounds produced by animals, humans, and manmade objects. Task stimuli were selected on the basis of a pilot study, in which 25 online participants between ages 18 and 30 years listened to 100 sounds and generated descriptive labels for each sound. The sounds were then ranked in terms of concept agreement (i.e., the proportion of participants who assigned the same label to a sound) and the 67 sounds with the highest agreement were used in either the actual experiment (64 sounds) or the practice (three sounds). Of the 64 experimental sounds, 40 sounds were duplicated to create 40 "Repeat" pairs. The remaining 24 sounds were each paired with an additional sound from the same semantic category collected from the same Internet sources, creating 24 "Lure" pairs (e.g., two different "ringing bell" sounds). In total, the task consisted of 64 first presentation trials (40 first presentations of repeat sounds + 24 first presentations of lure sounds), 24 lure trials (24 second presentations of lure sounds), and 40 repeat trials (40 second presentations of repeat sounds; Figure 2B).

Figure 2. (A) Of the initial 25 unique word items (orange), five were used to create 50 repeat trials (green) and another five were used to create 25 lure trials (blue). The items were ordered in a manner such that there was an average of six intervening items between a repeat/lure item and the subsequent repeat/lure item. Participants responded "old" to repeated items and "new" to items that were presented for the first time or items that sounded similar to but different from a previous item. (B) The stimulus set for the environmental sound MST consisted of 40 "repeat pairs" and 24 "lure pairs." Note that visual presentation of items is included for illustration of task design only. It was not part of the experiment.
All stimuli were manipulated using Audacity software to be monophonic, have a duration of 1-2.5 sec, and have approximately the same perceived loudness. Items were ordered such that repeats and lures were separated by an average of 13 intervening items (range = 5-24) from their counterparts. The first trial associated with an "Old" response occurred approximately 10 items into the task.

SL Task
The stimuli for the SL tasks consisted of 12 unique syllables, taken from Batterink and Paller (2019), which were combined to form four trisyllabic nonsense words (tafuko, regeme, rupuni, fetisu; Figure 3). The sound file for each syllable was 300 msec in duration.
Exposure phase. To create the continuous speech streams used in the initial Exposure Phase, each word was concatenated in pseudorandom order with the constraint that the same word never appeared consecutively, at a rate of 380 msec per syllable. Each word was repeated 90 times, resulting in a 6.84-min-long continuous stream.
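As an illustration of the stream construction, the following sketch builds a word order with no immediate repetitions and checks the stated duration. The sampling scheme (weighted random choice with a retry on dead ends) is an assumption made for this example, since the text specifies only the constraint and not the randomization procedure. The function names are hypothetical.

```python
import random
from collections import Counter

WORDS = {
    "tafuko": ["ta", "fu", "ko"],
    "regeme": ["re", "ge", "me"],
    "rupuni": ["ru", "pu", "ni"],
    "fetisu": ["fe", "ti", "su"],
}
SYLLABLE_RATE_S = 0.380  # 380 msec per syllable, as stated in the text

def build_word_order(words, reps_per_word, rng):
    """Pseudorandom order in which the same word never appears twice in a row."""
    total = len(words) * reps_per_word
    while True:  # retry if only the previous word is left (rare dead end)
        remaining = {w: reps_per_word for w in words}
        order = []
        while len(order) < total:
            options = [w for w in words
                       if remaining[w] > 0 and (not order or w != order[-1])]
            if not options:
                break
            weights = [remaining[w] for w in options]
            choice = rng.choices(options, weights=weights, k=1)[0]
            remaining[choice] -= 1
            order.append(choice)
        else:
            return order

def transitional_probability(syllables, first, second):
    """P(second | first), estimated from adjacent syllable pairs in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return pair_counts[(first, second)] / first_counts[first]

word_order = build_word_order(list(WORDS), 90, random.Random(0))
stream = [syl for word in word_order for syl in WORDS[word]]
duration_min = len(stream) * SYLLABLE_RATE_S / 60  # 1,080 syllables at 380 msec
```

Within a word, each syllable is always followed by the same syllable, whereas the syllable after a word-final syllable varies. This difference in predictability is the regularity available to the learner.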
Target detection task. Thirty-six speech streams were created. Each stream consisted of all four words in the language presented 4 times each (48 total syllables), concatenated together in the same manner as in the Exposure Phase but with the constraint that the word containing the target syllable never appeared as the first or the last word in the stream. This yielded four targets per stream, and 48 targets per triplet position across the entire task. Each speech stream was 18.24 sec long.
To examine possible learning effects during the target detection task itself, the 36 streams were subdivided into three blocks of 12 streams, with each of the 12 syllables serving as the target syllable once per block. Within each block, the 12 syllables were ordered such that the syllable position of the target (word-initial, word-middle, word-final) was evenly distributed across the block (e.g., initial, middle, final, middle, initial, final…).
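The counts above follow from the design, and one way to obtain a position-balanced target order is sketched below. The balancing scheme (shuffling the three positions within each successive group of three streams) is an assumption for illustration, because the text gives only an example ordering. `block_target_order` is a hypothetical name.

```python
import random

WORDS = [["ta", "fu", "ko"], ["re", "ge", "me"],
         ["ru", "pu", "ni"], ["fe", "ti", "su"]]
POSITIONS = ["initial", "middle", "final"]

def block_target_order(words, rng):
    """Order the 12 target syllables so positions alternate in shuffled triads."""
    by_position = {pos: [w[i] for w in words] for i, pos in enumerate(POSITIONS)}
    for syllables in by_position.values():
        rng.shuffle(syllables)
    order = []
    for cycle in range(len(words)):
        cycle_positions = POSITIONS[:]
        rng.shuffle(cycle_positions)
        for pos in cycle_positions:
            order.append((by_position[pos][cycle], pos))
    return order

rng = random.Random(1)
blocks = [block_target_order(WORDS, rng) for _ in range(3)]

# Design arithmetic from the text
syllables_per_stream = 4 * 4 * 3                  # 4 words x 4 presentations x 3 syllables
stream_duration_s = syllables_per_stream * 0.380  # 380 msec per syllable
targets_per_stream = 4                            # the target's word occurs 4 times
streams_per_position = sum(1 for block in blocks for _, pos in block
                           if pos == "initial")
targets_per_position = streams_per_position * targets_per_stream
```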
Explicit SL tasks. The stimuli for the rating task consisted of 12 trisyllabic items: four words from the exposure phase, four partwords that contained two syllables from the same word and one syllable from another word (1. rege + ko, 2. feti + me, 3. ta + puni, 4. ru + tisu), and four nonwords that contained syllables from three different words that never occurred adjacent to each other during the exposure phase (1. pu + ge + ti, 2. ni + su + ta, 3. fu + ru + me, 4. ko + re + fe). None of the partwords appeared across word boundaries during the exposure phase. The same four words and four nonwords were used as the stimuli for the two-alternative forced choice (2AFC) recognition task (partwords were not included).
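The item definitions can be checked mechanically. The sketch below classifies each test item by how many exposure words its syllables come from. The helper names are hypothetical, and the fixed two-letter syllable split is an assumption that holds for this particular syllable inventory. The sketch checks only where each syllable comes from, not which syllables were adjacent in the exposure stream.

```python
WORDS = ["tafuko", "regeme", "rupuni", "fetisu"]
PARTWORDS = ["regeko", "fetime", "tapuni", "rutisu"]
NONWORDS = ["pugeti", "nisuta", "furume", "korefe"]

def split_syllables(item):
    """Split an item into two-letter consonant-vowel syllables."""
    return [item[i:i + 2] for i in range(0, len(item), 2)]

# Map each syllable to the exposure word that contains it
SOURCE_WORD = {syl: word for word in WORDS for syl in split_syllables(word)}

def classify(item):
    """Label an item by the number of distinct exposure words it draws on."""
    if item in WORDS:
        return "word"
    n_sources = len({SOURCE_WORD[syl] for syl in split_syllables(item)})
    return {2: "partword", 3: "nonword"}.get(n_sources, "other")

labels = {item: classify(item) for item in WORDS + PARTWORDS + NONWORDS}
```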

Word and Sound MSTs
Both word and sound MSTs were modeled after the continuous version of the MST (Stark, Stevenson, Wu, Rutledge, & Stark, 2015; Figure 2), which requires participants to make continuous "Old" and "New" recognition judgments to a list of items. Each trial began with a 1.5-sec pause followed by the auditory presentation of a word or a sound item along with a prompt on the screen ("New or Old?"). Participants were instructed to label an item as "Old" if the item had been presented before or "New" if the item had never been presented before. They were also specifically instructed that some of the items may sound similar to one another, but that similar-sounding items should also be labelled as "New" if they had not been previously presented. Before the task started, participants underwent five practice trials during which they were given feedback on whether their responses were correct or incorrect.

Figure 3. Diagram of the SL paradigm. Following a 6.8-min exposure phase, participants completed (1) the target detection task, (2) the rating task, and (3) the 2AFC recognition task. The four partwords used in the familiarity rating task were created by combining two syllables from the same word and one syllable from another word. The nonwords used in the familiarity rating and the 2AFC recognition tasks consisted of syllables from three different words.

SL Task
Exposure phase. Participants listened to a 6.84-min speech stream, which was divided into three 2.28-min blocks. At the end of each block, participants were asked to guess the total number of unique syllables used in the speech stream and were then given an optional break (maximum 30 sec). While listening, participants also performed a cover task in which they responded to pauses within the speech with a keypress. Eighteen short pauses were inserted into the speech stream, and the number of hits to the pauses was used to confirm that participants were continuously listening to the stream. The timing of the pauses was pseudorandom with the constraint that they always occurred after the second syllable in a word, so as not to indicate word boundaries. All participants performed well on this cover task, with no participant missing more than one pause.
Target detection task. After the exposure phase, participants completed the target detection task, designed to indirectly assess participants' knowledge of the statistics of the speech stream. This task requires participants to make speeded responses to target syllables embedded in short segments of the continuous speech stream. At the beginning of each trial, a written form of the target syllable (e.g., "ta") was displayed on the screen while the auditory syllable was presented twice. The written form of the syllable then remained on the screen while the short speech stream was presented. Participants were required to make a keypress each time they detected the target syllable. Both speed and accuracy were emphasized.
Before starting the task, participants completed two practice trials. A different speech stream made up of three trisyllabic words was used for the practice. The words in the practice speech stream contained none of the 12 syllables in the exposure speech stream and were generated using a different speech synthesizer voice. At the end of each practice trial, participants were given their average RT and their total number of hits.
Rating task. This task was designed to assess participants' explicit knowledge of the words. On each trial, participants listened to a trisyllabic item and rated their familiarity with the item on a scale of 1-4 (1 = least familiar). The task consisted of 12 trials (four words, four partwords, and four nonwords), presented in random order.
Two-alternative forced-choice recognition task. This task served as an additional measure of participants' explicit knowledge. On each trial, participants listened to a word and nonword pair, and selected the one that sounded more familiar to them. The numbers "1" and "2" appeared on the screen for 1 sec before each word was presented, and participants responded with the number associated with the more familiar sounding word. The same four nonwords used in the familiarity rating task were paired exhaustively with the four words, resulting in 16 trials. The word appeared as the first item in half of the trials, and trials were presented in random order.
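The exhaustive pairing can be sketched as follows. The way word-first trials are assigned here (a shuffled half-and-half list of flags) is an assumption for illustration, as the text states only that the word came first on half of the trials. `build_2afc_trials` is a hypothetical name.

```python
import random
from itertools import product

WORDS = ["tafuko", "regeme", "rupuni", "fetisu"]
NONWORDS = ["pugeti", "nisuta", "furume", "korefe"]

def build_2afc_trials(words, nonwords, rng):
    """Pair every word with every nonword; the word is first on half the trials."""
    pairs = list(product(words, nonwords))
    rng.shuffle(pairs)  # random trial order
    half = len(pairs) // 2
    word_first_flags = [True] * half + [False] * (len(pairs) - half)
    rng.shuffle(word_first_flags)
    trials = []
    for (word, nonword), word_first in zip(pairs, word_first_flags):
        trials.append({
            "first": word if word_first else nonword,
            "second": nonword if word_first else word,
            "correct_response": 1 if word_first else 2,
        })
    return trials

trials = build_2afc_trials(WORDS, NONWORDS, random.Random(2))
```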

Perceptual Similarity Rating Task
B. L. and the community participants were administered an additional perceptual similarity rating task with the stimuli used for the sound MST to ensure that any performance deficits were not because of difficulties with discriminating the sounds at the perceptual level. The task consisted of 83 pairs of sounds, which included 24 lure pairs, 24 "different" pairs in which a sound was paired with another sound (not its lure counterpart), and 35 "identical" pairs in which the same sound was repeated twice. All of the 88 unique sounds were used at least once in this task. Each pair was presented successively with a 1-sec interstimulus interval followed by a prompt on the screen asking participants to rate the similarity of the two sounds on a scale of 1-4. Participants were instructed to use 1 when the sounds were completely different, 4 when they were identical, and 2 or 3 when they were similar.

Visual MST
In addition to the two auditory MSTs, B. L. was also tested on the original, visual version of the MST used in Baker et al. (2016) to examine whether his pattern separation abilities for visual objects had changed since his last testing. The original MST (Stark et al., 2013) was administered in a third separate session that took place 4 weeks after the testing of the two auditory MSTs using the same protocol as in Baker et al. (2016). To avoid possible practice effects from viewing set "C" in 2016, B. L. was tested using set "D" in 2021. During the initial study phase, B. L. viewed 128 images of everyday objects (e.g., toothbrush) while judging whether they were indoor or outdoor items. Later, in a separate test phase, he was instructed to discriminate between previously studied target items (e.g., toothbrush), new items that did not appear in the study phase (e.g., spoon), and lures that were visually and conceptually similar to the studied items (e.g., toothbrush in a different color), by identifying them as "old," "new," or "similar." There were 64 of each stimulus type.
General Testing Setup
B. L. and the community controls were tested in person over two sessions separated by a 1-week period. The first session consisted of the two MSTs and the perceptual similarity rating task. The second session consisted of the SL tasks. In addition, the controls were also administered pure-tone audiometry and the Montreal Cognitive Assessment in the second session to assess their hearing and neurocognitive status. Participants were asked to verbally provide responses for all the tasks except for the target detection task. The experiment was carried out in a quiet room with an experimenter who provided task instructions at the beginning of each task and recorded participants' verbal responses. The tasks were administered on a ThinkPad laptop with two portable speakers on either side of the laptop.
The online controls completed the tasks on their own personal computers and provided responses using the keyboard. They were instructed to use headphones to listen to the auditory stimuli. Each task began with a volume adjustment task during which participants listened to music and adjusted their sound volume to a comfortable level. To ensure that participants were using working headphones as instructed, a headphone check task adapted from Woods, Siegel, Traer, and McDermott (2017) was administered.
For the entire sample, all tasks were created using PsychoPy 2020.2.10 and were hosted and administered online via Pavlovia.

Data Analysis
To test for potential differences between the controls tested in person and those tested online, t tests were conducted to compare the two control groups in the core analyses; as reported throughout the Results, no significant group differences were found on any measure. Therefore, data from both control groups were collapsed for comparisons between B. L. and control participants.

MST
Auditory MSTs. Estimates derived from signal detection theory were used to index discrimination between different types of items while controlling for response biases (Stark et al., 2015; Leal, Tighe, & Yassa, 2014; Yassa et al., 2011). Participants' ability to discriminate similar lures from repeat items (lure discrimination score) was computed by subtracting the probability of responding "New" to any repeat item from the correct rejection rate for lure items (p("New" | Lure) − p("New" | Repeat)). Recognition score was computed by subtracting the probability of responding "New" to any repeat item from the correct rejection rate for first presentation items (p("New" | First presentation) − p("New" | Repeat)). B. L.'s and controls' performance was compared using a modified t test designed for comparing a single individual to a small sample (< 50) of controls (Crawford & Howell, 1998). Effect sizes were estimated as B. L.'s score expressed as a z score relative to controls' scores (Zcc), following the method for effect size estimation described in Crawford, Garthwaite, and Porter (2010). Four separate one-tailed t tests were conducted comparing B. L.'s and controls' lure discrimination and recognition scores from the two MSTs.
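The scoring and case-control comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the response coding ("new"/"old" strings) are assumptions, and the test statistic follows the standard Crawford and Howell (1998) formula, t = (case − M) / (SD × sqrt((n + 1) / n)) with n − 1 degrees of freedom. The one-tailed p value would then be read from a t distribution with those degrees of freedom.

```python
import math
from statistics import mean, stdev


def p_new(responses):
    """Proportion of "new" responses in a list of response labels."""
    return sum(r == "new" for r in responses) / len(responses)


def discrimination_scores(lure, repeat, first):
    """Return (lure discrimination, recognition) scores.

    Both scores are differences between p("New") for one item type
    and p("New") for repeat items, as defined in the text.
    """
    lure_discrimination = p_new(lure) - p_new(repeat)
    recognition = p_new(first) - p_new(repeat)
    return lure_discrimination, recognition


def crawford_howell_t(case, controls):
    """Modified t test for one case versus a small control sample.

    Returns the t statistic and its degrees of freedom (n - 1).
    """
    n = len(controls)
    t = (case - mean(controls)) / (stdev(controls) * math.sqrt((n + 1) / n))
    return t, n - 1


def z_cc(case, controls):
    """Case score expressed as a z score of the control distribution."""
    return (case - mean(controls)) / stdev(controls)


if __name__ == "__main__":
    # Hypothetical responses and control scores, for illustration only.
    lure_score, rec_score = discrimination_scores(
        lure=["new", "old", "old", "old"],
        repeat=["old", "old", "old", "old"],
        first=["new", "new", "old", "old"],
    )
    controls = [0.6, 0.7, 0.8, 0.7, 0.7]
    print(lure_score, rec_score)
    print(crawford_howell_t(lure_score, controls), z_cc(lure_score, controls))
```

Because the control standard deviation is inflated by sqrt((n + 1) / n), the t statistic is always somewhat smaller in magnitude than the corresponding z score, which is what makes the test appropriate for small control samples.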
Visual MST. B. L.'s performance on the original visual MST was analyzed in the same way as in Baker et al. (2016). Pattern separation was measured using the Lure Discrimination Index (LDI; Stark et al., 2013), computed by subtracting the proportion of "Similar" responses to foils from the proportion of "Similar" responses to lures (p("Similar" | Lure) − p("Similar" | Foil)). His general recognition was computed by subtracting the proportion of "Old" responses to foils from the proportion of "Old" responses to repeats (p("Old" | Repeat) − p("Old" | Foil)).
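As a brief illustration of the two visual MST indices, the sketch below computes them from response counts per item type. The function names and the counts are hypothetical (they are not B. L.'s data); only the two difference formulas are taken from the text.

```python
def response_rate(counts, label):
    """Proportion of trials of one item type that received `label`."""
    return counts.get(label, 0) / sum(counts.values())


def visual_mst_indices(repeat, lure, foil):
    """Return (LDI, recognition index) from per-type response counts.

    LDI         = p("Similar" | Lure) - p("Similar" | Foil)
    Recognition = p("Old" | Repeat)   - p("Old" | Foil)
    """
    ldi = response_rate(lure, "similar") - response_rate(foil, "similar")
    recognition = response_rate(repeat, "old") - response_rate(foil, "old")
    return ldi, recognition


if __name__ == "__main__":
    # Made-up counts for 64 trials of each type, for illustration only.
    repeat = {"old": 48, "similar": 10, "new": 6}
    lure = {"old": 40, "similar": 16, "new": 8}
    foil = {"old": 4, "similar": 24, "new": 36}
    print(visual_mst_indices(repeat, lure, foil))
```

Note that a high rate of "Similar" responses to foils lowers the LDI without affecting the recognition index, since the latter depends only on "Old" responses.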

SL Task
Target detection task. Responses made within 0-1200 msec of a target syllable onset were considered "hits" and were included in the analyses (Batterink & Paller, 2017; Batterink et al., 2015). All other responses were considered false alarms. Average RT was computed for syllables at the initial, middle, and final triplet position. Controls' RTs were then entered into a repeated-measures ANOVA with Triplet Position (1-3) and Stream Position (4-45) as within-subject factors; Stream Position was included as a factor to rule out any potential confounds of stream position on the main triplet effect of interest (Himberger, Finn, & Honey, 2019). B. L.'s RTs were analyzed with an item-based ANOVA, with Syllable Position and Stream Position as between-items factors, to test for a single-subject priming effect. Linear contrasts for syllable position are reported. A significant linear effect of Triplet Position, with faster RTs to third syllables than first syllables, was considered to indicate RT facilitation.
To compare B. L.'s and controls' RT facilitation, an "RT prediction score" was computed by subtracting the average RT to the final triplet position from the average RT to the initial triplet position and dividing the difference by the average RT to the initial triplet position. A modified one-tailed t test was then used to compare RT prediction scores between B. L. and controls (Crawford & Howell, 1998). Furthermore, we tested whether B. L.'s observed RT facilitation was significantly greater than would be expected under the null hypothesis of no SL. For a given iteration, B. L.'s RTs for each of the 144 targets were scrambled across triplet positions and an RT prediction score was calculated from the scrambled data, using the same formula as above. This process was repeated 1000 times, generating a null distribution of B. L.'s RT prediction scores. B. L. was considered to have shown significant RT facilitation if his observed RT prediction score fell at or above the 95th percentile of this null distribution.
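The RT prediction score and the scrambling procedure can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, RTs and triplet positions are assumed to be stored as two parallel lists, and the percentile is taken as the share of null scores falling below the observed score. The original analysis may differ in these details.

```python
import random
from statistics import mean


def rt_prediction_score(rts, positions):
    """(mean RT at position 1 - mean RT at position 3) / mean RT at position 1."""
    first = mean([rt for rt, pos in zip(rts, positions) if pos == 1])
    final = mean([rt for rt, pos in zip(rts, positions) if pos == 3])
    return (first - final) / first


def permutation_percentile(rts, positions, n_iter=1000, seed=0):
    """Return the observed score and its percentile in a scrambled null.

    On each iteration the RTs are shuffled relative to the fixed
    triplet-position labels and the score is recomputed.
    """
    rng = random.Random(seed)
    observed = rt_prediction_score(rts, positions)
    shuffled = list(rts)
    null_scores = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        null_scores.append(rt_prediction_score(shuffled, positions))
    below = sum(score < observed for score in null_scores)
    return observed, 100.0 * below / n_iter


if __name__ == "__main__":
    # Simulated RTs (msec) with facilitation across positions; illustration only.
    sim = random.Random(1)
    positions = [1, 2, 3] * 48  # 144 targets
    means = {1: 650, 2: 620, 3: 580}
    rts = [sim.gauss(means[p], 60) for p in positions]
    print(permutation_percentile(rts, positions))
```

Dividing by the initial-position RT expresses facilitation as a proportion of baseline speed, so that a participant with slower overall RTs, such as B. L., can be compared with faster controls.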
Rating task. Average rating was computed for each of the three word categories (word, part-word, and nonword). For controls, ratings were entered into a repeated-measures ANOVA with Word Category as a within-subject factor. B. L.'s ratings were entered into an item-based ANOVA, with Word Category as a between-items factor. In addition, a "Word-Partword (W-PW) score" and a "Word-Nonword (W-NW) score" were computed for each individual by taking the difference between the average ratings for words and partwords and between the average ratings for words and nonwords, respectively. B. L.'s W-PW and W-NW scores were each compared with those of controls using the modified one-tailed t test (Crawford & Howell, 1998).
2AFC recognition task. A one-sample t test was conducted to test if the performance of controls was significantly different from chance level (50%). A modified one-tailed t test (Crawford & Howell, 1998) was conducted to compare the average performance accuracy between B. L. and controls.
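For the comparison against chance, a one-sample t test and an accompanying effect size can be sketched as below. The function name and the example accuracies are hypothetical, and the effect size shown (mean difference from chance divided by the sample standard deviation) is an assumption about how Cohen's d was obtained. Under this definition d = t / sqrt(n), which agrees with the control values reported in the Results (t(25) = 7.59 with n = 26 gives d ≈ 1.49).

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, chance=0.5):
    """One-sample t test against a fixed chance level.

    Returns the t statistic, degrees of freedom, and Cohen's d,
    where d = (mean - chance) / sd.
    """
    n = len(scores)
    m, sd = mean(scores), stdev(scores)
    t = (m - chance) / (sd / math.sqrt(n))
    d = (m - chance) / sd
    return t, n - 1, d


if __name__ == "__main__":
    # Hypothetical proportion-correct scores, for illustration only.
    accuracies = [0.6, 0.7, 0.8, 0.7, 0.7]
    print(one_sample_t(accuracies))
```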

MST
On each of the four measures (lure discrimination and recognition scores for both word MST and sound MST), B. L. was the lowest performing individual among all participants (Figure 5).
In the word MST, participants were exposed to multiple lures from the same lure family, that is, words that consisted of the same three syllables combined in different orders (e.g., gopula vs. golapu). Motivated by the finding in Forest et al. (2022) that SL can result in order-independent representations of triplet grouping, we explored whether participants formed increasingly strong generalized lure family representations that would become independent of order as they encountered more lures from the same family; such generalized representations would be reflected in increasing difficulty in separating lures from the same family in memory as the task progressed. To test this, we fitted a logistic regression model to the controls' old/new responses on the lure trials, with the number of prior occurrences of a lure within a given family (1-5), overall trial number, and lure family as predictor variables. Consistent with this idea, a Wald test (df = 4) indicated that the number of prior lure occurrences within a lure family significantly increased the likelihood of responding "Old" to a lure (p = .025; Table 1).
In contrast, B. L.'s data did not show evidence of increased likelihood of responding "Old" to a lure as more lures from a family were encountered (Table 1). On the contrary, B. L.'s likelihood of responding "Old" was the lowest after five prior lure occurrences within a lure family. However, B. L.'s low accuracy on the lure trials (he erroneously labelled 16/25 lures as old) might have made it challenging to detect any pattern in his data.
Although not present in B. L., the finding in controls that mnemonic discrimination became increasingly poorer as more lures from a family were encountered supports the interpretation that general, order-independent representations of triplet membership accrued and strengthened gradually over the course of the task, potentially supported by SL mechanisms.

Visual MST
Because B. L. was the only participant who completed the original visual MST, B. L.'s performance in 2021 (age 60 years) was compared with his performance in 2015 (age 54 years), reported in Baker et al. (2016). When given three response options, B. L. still demonstrated a bias toward responding "Old" on lure trials, similar to his performance on the auditory MSTs (Figure 6A). However, unlike his previous performance, B. L. also exhibited a tendency toward responding "Similar" to foil items. This high rate of "Similar" responses on foil trials resulted in a low accuracy (56%) for foils and an LDI of −0.13, which is lower than his LDI in Baker et al. (2016; 0.01; Figure 6B). B. L.'s recognition score, which is not dependent on the probability of "Similar" responses, showed a slight improvement compared with his earlier performance (Baker et al., 2016).

SL Task
B. L. was severely impaired on the rating task, but performed within normal limits on the 2AFC recognition task and the target detection task.

Target Detection Task
B. L. performed well on the task, with an average hit rate of 84.7% and average false alarm rate of 14.3%. The hit rate and false alarm rate of controls were 86.3% and 10.4%, respectively (Figure 8). Although B. L.'s overall RT (average = 635.8 msec) was slower than that of controls (average = 527.5 msec), he showed a robust SL effect that was comparable to that of controls (Figure 7A). For controls, there was also a significant main effect of Stream Position of the target on RTs, such that RTs became progressively slower toward the end of each trial, F(1, 3215) = 18.22, p < .001, η² = .006. Stream position did not significantly predict B. L.'s RTs, F(1, 118) = 2.08, p = .15. B. L.'s RT prediction score was significantly greater than would be expected from the null distribution generated by scrambling his own RTs (99th percentile), also indicative of significant SL. B. L.'s RT prediction score did not differ from that of controls, t(25) = 0.91, p = .81. The two control groups' RT prediction scores also did not differ from each other, t(24) = 0.45, p = .65. Interestingly, controls' hit rates also increased toward the later triplet positions, F(2, 908) = 3.53, p = .030, such that the hit rate for the last position was significantly higher than the hit rate for the middle position, t(908) = −2.58, p = .03. B. L.'s hit rates also increased numerically as a function of syllable position, but this pattern was not statistically significant, F(2, 33) = 0.55, p = .58 (Figure 8).

2AFC Recognition Task
Controls performed with an average accuracy of 67.8%, which was significantly greater than chance-level performance, t(25) = 7.59, p < .001, d = 1.49. B. L.'s average accuracy on the recognition task was 62.5% and did not significantly differ from that of controls, t(24) = −0.42, p = .34 (Figure 7C). There was no difference in accuracy between the two control groups, t(24) = −0.58, p = .57.
To examine whether B. L.'s poor perceptual discrimination of similar sounds (relative to controls) may account for his deficits in pattern separation on our memory task, we excluded all lures that B. L. failed to discriminate on the perceptual task from the sound MST analysis. B. L.'s mnemonic discrimination accuracy did not improve even when considering only lures that he successfully discriminated at the perceptual level (from 25% correct to 23.5% correct), suggesting a deficit in mnemonic discrimination over and above his perceptual impairments.
In summary, B. L. performed significantly worse than controls on both the word and the sound MST as well as on the rating measure of SL, but showed comparable performance to controls on the RT-based and the 2AFC measures of SL (Table 2).

DISCUSSION
The current study investigated pattern separation and SL in B. L., an individual with a highly selective hippocampal lesion in the dentate gyrus (Baker et al., 2016; Kwan et al., 2015). We tested the hypothesis that the dentate gyrus, a key structure of the trisynaptic pathway, supports pattern separation but is not necessary for SL. By and large, our results supported this hypothesis. When exposed to individually presented trisyllabic words, B. L. showed difficulties in recognizing previously heard words and differentiating previously heard words from similar lures, indicating a deficit in both pattern separation and general recognition. A similar pattern of deficits emerged when B. L.'s pattern separation abilities were assessed with environmental sound items. However, despite his impaired ability to distinctly encode individual episodic events, B. L. showed successful learning of trisyllabic words embedded within a continuous syllable sequence on the target detection task and the 2AFC task, demonstrating a preserved ability to track statistical regularities across multiple inputs over time. In contrast, B. L. showed a severe impairment in recognizing and differentiating the learned words on arguably the purest explicit measure of SL, the rating task. This task requires explicit high-resolution retrieval of the learned words and thus may in part rely on pattern separation, making it sensitive to trisynaptic pathway disruption.
The critical role of the dentate gyrus in pattern separation has been supported by a large number of studies (e.g., Berron et al., 2016; Lacy et al., 2011; Yassa et al., 2010), including a recent study on B. L. himself that revealed his circumscribed deficit in discriminating similar visual objects in memory judgments on the widely used MST developed by Stark and colleagues (Baker et al., 2016; Stark et al., 2013). The current finding that B. L. was additionally impaired in discriminating two types of auditory stimuli further supports the necessity of the dentate gyrus in pattern separation, and extends its involvement across multiple sensory domains (see also Baker, 2022, for additional evidence concerning auditory pattern separation in B. L.). The importance of the hippocampus in auditory episodic memory has been suggested in previous work that has revealed impaired auditory recognition memory in patients with hippocampal damage (e.g., Squire, Schmolck, & Stark, 2001). Neuroimaging evidence indicates that different types of visual stimuli (faces, scenes) are represented uniquely in the hippocampus in a category-agnostic manner (Huffman & Stark, 2014), suggesting that the process of pattern separation is invariant to stimulus type, and by extension sensory domain (Kent, Hvoslef-Eide, Saksida, & Bussey, 2016). However, research on pattern separation outside the visual domain is still scarce (Bjornn, Van, & Kirwan, 2022; Herman, Baker, Cazes, Alain, & Rosenbaum, 2020; Trier, Lacy, & Marsh, 2016), and whether the dentate gyrus supports pattern separation in a domain-general manner across modalities remains a topic of continuing investigation.
B. L. performed comparably to controls on the target detection task and the 2AFC recognition task for word stimuli encountered in our SL paradigm, but did not show evidence of learning on the rating task. Previous studies suggest that indirect and direct measures of SL vary in the level of their dependency on processes of explicit memory retrieval (Kiai & Melloni, 2021;Siegelman, Bogaerts, Kronenfeld, & Frost, 2018;Batterink et al., 2015). The target detection task measures learning without taxing explicit retrieval and has been suggested to capture the implicit knowledge of statistical regularities acquired through new learning (Kiai & Melloni, 2021;Batterink et al., 2015). Although the 2AFC task requires explicit recognition judgments, it can be completed by comparing gist-based familiarity for target words and novel nonwords, in the absence of highly precise memory for the words. Of note, our nonword foils contained multiple types of statistics that learners could potentially use to successfully discriminate them from words (Forest et al., 2022;Henin et al., 2021;Park et al., 2018), including triplet membership, ordinal position of individual syllables, and transitional probability (containing syllable pairs that had a transitional probability of 0). Previous evidence indicates that learners may represent all of these different statistical cues, with different brain regions contributing differentially to these representations (Henin et al., 2021). Furthermore, participants can successfully distinguish target words from nonwords even if they have only developed a general association between the three syllables in a word, without knowing the specific order in which they had occurred (Forest et al., 2022;Park et al., 2018). Thus, B. L. may have succeeded on the current 2AFC task based solely on memory for triplet memberships, without needing to rely on more specific memories of item-to-item transitions. 
There is evidence to suggest that such familiarity-based recognition can also be achieved based on implicit knowledge (Voss & Paller, 2008) and that it may not require hippocampal integrity (Holdstock et al., 2002).
In contrast, performance on the rating task cannot be supported by comparing the familiarity of words and foils, but depends on the recollection of individually presented items, a process that has been suggested to functionally differ from familiarity-based recognition (Yonelinas, 2002) and that is known to depend on the integrity of the hippocampus (Bowles et al., 2010; Holdstock et al., 2002). Moreover, the rating task in the current study also included partwords, which share more overlap with target words, whereas the 2AFC task involved only nonword foils with multiple, highly distinct statistical cues, as described above. Taken together, the rating task can thus be seen as relying more heavily on the encoding and retrieval of high-precision representations of the learned triplets, and by extension is more likely to be affected by impairments in pattern separation.
Using both implicit and explicit measures of SL, our study extends previous research on the involvement of the hippocampus in SL (Covington et al., 2018; Schapiro et al., 2014). In these two previous studies, patients with hippocampal damage or damage in the broader MTL beyond the hippocampus showed deficits in both visual and auditory SL on the 2AFC measure. By contrast, B. L. showed intact 2AFC recognition performance, suggesting that recognition of learned triplets may rely on hippocampal regions outside the dentate gyrus. In addition, we also found that the implicit expression of SL is not affected by dentate gyrus lesion. These findings are in line with the proposal that the trisynaptic pathway is not necessary for SL, at least at the level of implicit or familiarity-based expressions of what has been learned. However, more explicit, recollection-based expressions of SL may still depend on pattern separation and the integrity of the trisynaptic pathway, as suggested by B. L.'s poor performance on the rating task. A still-open question is whether patients with broader hippocampal damage (such as those studied by Covington, Schapiro, and colleagues) would also show intact performance on the target detection task we administered, or whether their observed SL deficits would also extend to performance on implicit measures. Future studies of such patients may shed light on whether the hippocampus is strictly necessary for SL, or whether it merely contributes to learning in healthy populations by acquiring explicitly accessible representations of statistical regularities.
The hippocampus is only part of a larger network of regions spanning both cortical and subcortical structures that have been implicated in SL (Frost, Armstrong, Siegelman, & Christiansen, 2015). Neuroimaging studies have reported activation of a number of regions during SL of auditory or visual stimuli, including the striatum, the inferior frontal gyrus, and sensory-related processing areas such as the occipital cortex and the superior temporal gyrus (Moser et al., 2021; Sandoval, Patterson, Dai, Vance, & Plante, 2017; Schapiro et al., 2017; Karuza et al., 2013; Turk-Browne et al., 2009; McNealy, Mazziotta, & Dapretto, 2006). In Henin et al. (2021), intracranial recordings during SL were used to group individual syllables based on the similarity of the neural activity evoked by each syllable. The study found evidence of triplet-based syllable organization within the hippocampus and lower feature (e.g., ordinal position)-based syllable organization in other cortical regions, suggesting that different regions form complementary representations of regularities during SL (Henin et al., 2021). In particular, the authors suggest that prediction-based coding may occur within cortical areas, while the role of the hippocampus may be in accurately and flexibly representing the identity of the extracted units, supporting further use of these representations in various cognitive operations. B. L.'s profile of robust RT facilitation but impaired explicit rating performance is in line with this idea. Similarly, Covington and colleagues' hippocampal-lesioned patients showed low yet above-chance recognition performance, whereas temporal-lobe epilepsy patients performed no better than chance on the explicit 2AFC task but showed intact performance on a RT-based online task (Henin et al., 2021). In the current study, there is no direct evidence that the hippocampus is involved in any of the SL tasks we employed. Future neuroimaging research can investigate the direct involvement of different hippocampal subregions in SL to further our understanding of the exact nature of hippocampal contributions to SL, including for the specific task employed in the current study.
Interestingly, when pairs of similar environmental sounds (e.g., two different types of cow moos) were presented in direct succession during the perceptual similarity rating task, B. L. rated more pairs as being "exactly the same" than controls did. Recent studies have reported that B. L. shows a deficit in perceptually discriminating simultaneously presented similar faces and novel complex objects (Mitchnick et al., 2022; Baker, Youm, Levy, Moscovitch, & Rosenbaum, 2020), supporting the purported role of the hippocampus, and, in particular, the dentate gyrus, not just in mnemonic but also in perceptual discrimination (Inhoff et al., 2019; Erez, Lee, & Barense, 2013; Barense, Gaffan, & Graham, 2007; Lee et al., 2005). The findings in the current study hint that the role of the dentate gyrus in perceptual discrimination extends to the auditory domain as well. However, although B. L. performed poorly in differentiating similar sounds at both perceptual and mnemonic levels, our analyses suggest that his impaired pattern separation performance on our memory task cannot be fully accounted for by his difficulties in auditory perceptual discrimination. Even when considering only lures that B. L. successfully differentiated at the perceptual level, B. L.'s mnemonic discrimination accuracy remained impaired.
In contrast to earlier reports with the visual MST (Baker et al., 2016), it should be noted that B. L. also scored significantly lower than controls on general recognition across the two pattern separation tasks. This was primarily driven by his tendency to respond "Old" to first presentation items. B. L.'s accuracy for first presentation trials was 44% on the word MST and 47% on the sound MST. To determine whether this poor recognition performance was specific to the auditory stimuli used in the current study, we also administered the same visual MST used in Baker et al. (2016). On this task, B. L.'s accuracy for novel, unrelated foil trials was 56%, this time because of his tendency to respond "Similar" to novel items. As a result, B. L.'s current LDI (correct rejection rate for lure items minus the probability of responding "Similar" to foils; −0.13) decreased from the value reported by Baker et al. (2016; LDI = 0.01; Figure 6B), but his Recognition Index (hit rate for target items minus the probability of responding "Old" to foils) did not show any decrease. The variation in response bias exhibited by B. L. across tasks, and across multiple assessments with the same task, poses a challenge for determining whether the changes observed reflect a true decline in his task performance since the initial investigation. In addition, at the time of testing for the current study, B. L. was 60 years old and within the age range in which age-related pattern separation deficits have been reported (e.g., Stark et al., 2015). He was also newly diagnosed with cancer. It is therefore possible that the process of normal aging along with changes in his health status has further exacerbated B. L.'s deficits in pattern separation and by extension affected general recognition. Lastly, given our current design, we cannot be sure whether B. L.'s pattern of impairment and preservation across our tests of SL would extend to different stimulus types and modalities. Prior evidence suggests that SL shows distinct patterns based on modality and stimulus type (Raviv & Arnon, 2018; Siegelman et al., 2018), and additional studies will be necessary to understand the contributions of the dentate gyrus to other types of SL.
Overall, this study sheds light on the role of the dentate gyrus in pattern separation and SL, providing support for a previously proposed neural computational model of these two processes. We found that, consistent with previous literature, the dentate gyrus plays an essential role in pattern separation across both visual and auditory stimuli. On the other hand, SL, when probed with tasks that require implicit or low-resolution, familiarity-based expression of knowledge, can be maintained in the absence of dentate gyrus integrity and the trisynaptic pathway of which it is a critical component. However, a measure of SL that relies upon explicit, high-precision memory retrieval was also disrupted by dentate gyrus impairment and thus may rely on trisynaptic pathway-dependent memory processes. We conclude that SL operates largely independently of pattern separation mechanisms, but that both processes may be called upon interactively when high-resolution statistical memory representations are needed.