The Grounded Expertise Components Approach in the Novel Area of Cryptic Crossword Solving

This paper presents a relatively unexplored area of expertise research which focuses on the solving of British-style cryptic crossword puzzles. Unlike its American “straight-definition” counterparts, which are primarily semantically-cued retrieval tasks, the British cryptic crossword is an exercise in code-cracking detection work. Solvers learn to ignore the superficial “surface reading” of the clue, which is phrased to be deliberately misleading, and look instead for a grammatical set of coded instructions which, if executed precisely, will lead to the correct (and only) answer. Sample clues are set out to illustrate the task requirements and demands. Hypothesized aptitudes for the field might include high fluid intelligence, skill at quasi-algebraic puzzles, pattern matching, visuospatial manipulation, divergent thinking and breaking frame abilities. These skills are additional to the crystallized knowledge and word-retrieval demands which are also a feature of American crossword puzzles. The authors present results from an exploratory survey intended to identify the characteristics of the cryptic crossword solving population, and outline the impact of these results on the direction of their subsequent research. Survey results were strongly supportive of a number of hypothesized skill-sets and guided the selection of appropriate test content and research paradigms which formed the basis of an extensive research program to be reported elsewhere. The paper concludes by arguing the case for a more grounded approach to expertise studies, termed the Grounded Expertise Components Approach. In this, the design and scope of the empirical program flows from a detailed and objectively-based characterization of the research population at the very onset of the program.


INTRODUCTION
Research on expertise development has attempted to reveal the mechanisms through which some individuals are able to show levels of performance, skill-sets, or knowledge which are reproducibly superior to that of others active in that particular domain (Ericsson and Towne, 2010). The relative contributions of deliberate practice and innate cognitive aptitude have been hotly debated [e.g., the recent special issue on expertise development in Intelligence (Detterman, 2014), and the recent review by Hambrick et al. (2016)], and may reflect an ideological clash between the contrasting approaches of experimental and differential psychology, with the former focusing on the general processes of skill acquisition, and the latter upon the identification of key differentiating factors in individual performance (Hambrick et al., 2014b). Studies of expertise development on both sides of the argument have tended to remain focused upon a relatively restricted range of practice-intensive domains (primarily chess, music, sport and Scrabble) and to have followed well-worn investigative paths. These have included diary/retrospective studies of practice (Ericsson et al., 1993); the Expert-Performance Approach (EPA; Ericsson and Ward, 2007), including paradigms based on the original de Groot chess experiments (de Groot, 1946/1965; Tuffiash et al., 2007; Ericsson and Towne, 2010); and tests of either general intelligence ("g") itself, or a restricted set of compartmentalized sub-skills believed on a priori grounds to be relevant to the domain (Bilalić et al., 2007; Grabner et al., 2007; Tuffiash et al., 2007). There is a danger that, in all of these approaches, research may be based more upon preconceived, theoretical assumptions concerning the demands of the domain, or upon strongly held ideological convictions about the nature of expertise, than on grounded empirical evidence.
The time is therefore right for the exploration of new domains and for a fresh theoretical and methodological perspective (Hambrick et al., 2014b, 2016). To this end, the current paper outlines a relatively unexplored domain of investigation, British cryptic crosswords, and proposes a novel methodology, termed the Grounded Expertise Components Approach (GECA). This places a far heavier emphasis than has hitherto been the case upon the detailed understanding and characterization of the research population and upon a holistic and empirically argued view of the demands of the performance domain, rather than upon a small number of isolated elements.

US-Style Crosswords
A recent paper (Toma et al., 2014) contrasted two hypothesized cognitive drivers of proficiency (working memory (WM) capacity and strategy) in two "mind-game" domains: competitive Scrabble and national-level performance at US-style crosswords. In their introductory review, they characterized the skill-set necessary for US-style crossword solving as follows: "In contrast to competitive SCRABBLE proficiency, crossword proficiency relies on semantic aspects of language such as general word knowledge (Hambrick et al., 1999) and superior recognition for word meanings (Underwood et al., 1994). [...] Unlike SCRABBLE, crossword puzzles do not require exceptional visuospatial strategies because the spatial layout of the game board is provided; therefore, visuospatial ability should not be as critical to crossword solving expertise as having an extensive understanding of word meanings. [...] Semantic understanding is necessary for the process of creating a word while solving crossword puzzles; therefore, expert crossword players should primarily rely on superior knowledge of word definitions" (p. 728).
Toma's characterization of crossword expertise is certainly very plausible, so far as it relates to the American crossword puzzle. At the root of the challenge set by US-style crosswords is the nature of the puzzle layout, which consists of a heavily interlocking grid with fully cross-checked letters (Figure 1A). Given the constraints of the US-style grid, the creator of the puzzle ("compiler"/"setter") often has to resort to highly obscure words, slang, brand names, sections of phrases, acronyms and even word fragments in order to populate the squares, and it is primarily this quality which determines the difficulty of the crossword (Hambrick et al., 1999; Nickerson, 2011; Sutherland, 2012). Clues are almost entirely "straight-definition" with very few exceptions; these include puns, quiz-clues and obliquely referenced clues, such as "Present time?" (8 letters), where the answer is "YULETIDE" (Shortz, 2001; Nickerson, 2011). Essentially, therefore, US crosswords may be seen as semantically cued retrieval tasks (Nickerson, 1977, 2011), requiring considerable crystallized knowledge, much of it obscure "crosswordese" [words frequently found in crossword puzzles, but very rarely in conversation (Hambrick et al., 1999; Romano, 2006)]. Additionally, good reasoning ability is hypothesized to be necessary for the evaluation of candidate responses, the deduction of entries from cross-checking letters already present in the grid (pattern recognition/word fragment completion) and the re-examination of earlier interpretations of a recalcitrant clue in order to identify misreadings of punning or ambiguous phrasing (Nickerson, 1977, 2011; Hambrick et al., 1999).
Failure to succeed in US-style crossword solving might therefore be hypothesized to arise from three causes (Hambrick et al., 1999): 1. insufficient knowledge to retrieve the target word from the preceding semantic cue; 2. inefficient retrieval strategies from long term memory; 3. weak reasoning abilities, leading to deficits in both clue interpretation and the use of ancillary information such as intersecting, cross-checking letters.
Consistent with this analysis, Toma's US-style crossword solvers were found to have significantly higher self-reported verbal SAT scores than Scrabble experts, and these scores were similar to those of undergraduates at a highly selective liberal arts college (average verbal SAT scores in the 95th percentile) acting as the control group (Toma et al., 2014). Additionally, in response to a free-text question ("What do you believe is the single most important skill you use during competition?") participants reported their dominant cognitive abilities as "Good memory/recall" (29%) and "Mental flexibility" (26%); other key facets were "Pattern recognition" (16%), "General knowledge" (10%) and "Good vocabulary" (10%).

British-Style Cryptic Crosswords
British-style cryptic crosswords are very popular within the UK and in other countries with historically close ties to Britain (e.g., Canada, Ireland, Australia, New Zealand, India, and Malta). In the UK, they appear in all the daily and weekend newspapers (both broadsheet and tabloid), and also in various literary, political and cultural magazines, on the web and in specialist puzzle collections. Although superficially similar, British-style cryptic crosswords differ from their American counterparts in fundamentally critical ways. A typical layout (Figure 1B) comprises a 15 × 15 grid with half-turn rotational symmetry in which roughly half the letters are checked by intersecting clues. The remaining spaces are filled in with "blocks". The implications of the blocked grid design are profound: with fewer cross-checking letters, the setter no longer has to resort to the level of obscurity seen in American puzzles in order to make the entries mesh. Indeed, a former editor advises would-be setters of the (London) Times daily crossword: "As far as vocabulary goes, obscure words are avoided. A high percentage of the vocabulary should be familiar to a person of reasonable education and knowledge [...] mostly without recourse to reference books, while commuting to work on a train, for example" (Greer, 2001, p. 52). Challenge is therefore no longer provided by the obscurity of the semantic retrieval task. Rather, the cryptic crossword is a tricky linguistic puzzle which "plays using language as a code", exploiting the potential ambiguity of the English language, in multiple senses, levels and contexts (Aarons, 2012, p. 224). In a cryptic clue, the apparent meaning ("surface reading") of the clue is phrased to be deliberately misleading.
The solver learns to ignore this reading, and to look instead for a non-literal interpretation of the deconstructed clue components, comprising a grammatical set of coded instructions which, if executed precisely, will lead to the correct (and only) answer. The problem lies in recognizing and cracking the code: the task of the setter, like that of a magician, is to conceal the mechanism so subtly that the solution pathway is not easily detectable at first sight.
Solvers thus have to learn how to crack a cryptic crossword: the language is an artificial creation, meaning that "no-one is a native speaker of 'Cryptic'" (Aarons, 2012, p. 229). Although there is general agreement that the clues have to be fairly constructed (i.e., solvable), there are no hard-and-fast guidelines as to what the rules of engagement are (Aarons, 2012), leading to an almost infinite number of innovative ways to exploit the "versatile and quirky English language" (Connor, 2013). Nevertheless, there is some consensus over a number of basic mechanism types, and a wealth of "Teach-Yourself" primers exist (Gilbert, 2001; Greer, 2001; Stephenson, 2007; Manley, 2014). The vocabulary of cryptography is often used in the titles of these works: solvers are described as "cracking the codes" and deciphering the setter's "hidden messages". Indeed, in WWII, the British secret service MI6 recruited cryptographers to work on the Enigma Project at Bletchley Park by placing a discreet notice in the Daily Telegraph asking if anyone could solve the cryptic crossword in less than 12 minutes. This eventually resulted in the recruitment of six code-breakers with requisite skills from 25 applicants (Singh, 1999; Greer, 2001; Connor, 2014).
Cryptic crossword clues usually comprise two elements: a straight definition, plus the cryptic instructions for assembling the required solution, the "wordplay" (Box 1). It is not always straightforward to tell which clue element fulfills which purpose, and there is often no clear division between the two parts (Schulman, 1996; Greer, 2001; Aarons, 2012; Sutherland, 2012; Manley, 2014). Furthermore, the setter may frame the surface reading of the clue as an entirely plausible but misleading sentence, thus deliberately trapping the unwary solver in a "red herring" based on the inherent linguistic ambiguities of English (Aarons, 2012). The solver must therefore "overcome a lifetime's parsing habits" in order to avoid being sucked into the "deep structure" of the text: they must remain at the surface in order to explore other non-intuitive interpretations of the clue's components (Schulman, 1996, p. 309). In short, the cryptic is "the complicated, intellectually brooding cousin of the definitional - it had mystique and depth, it played hard to get with a capricious, whimsical air" (Brandreth, 2013).
Of the UK daily cryptics, the most famous is probably the Times crossword. Not all crosswords are equally challenging, however, and there is a widely recognized hierarchy of challenge involved (Biddlecombe, 2011;Connor, 2012;Sutherland, 2012). Difficulty in solving a standard block-style cryptic crossword is largely commensurate with the degree of concealment used by the setter in the clue mechanics, although vocabulary and clueing style can also be a factor.
One way to demonstrate expertise in cryptics is by rapid solving. Cryptic experts may be defined as those who can routinely solve a daily cryptic at the more difficult end of the spectrum in less than 30 minutes (Friedlander and Fine, 2009). As a benchmark, the average solving time required to make the elite grand final of the Times National Crossword Championship (TNCC) is 9-15 minutes per puzzle (Biddlecombe, 2012). Conversely, many ordinary solvers tackle easier cryptics at the same level for decades, taking an hour or more to finish (if at all).

BOX 1 | Illustration of cryptic clue mechanisms.
British-style cryptic crossword clues must be treated as a grammatical set of coded instructions. The following two clues illustrate the process:
Clue (a) Schulman (1996, p. 309): Active women iron some skirts and shirts (9)
The definition is "Active women" = an obliquely phrased straight definition for FEMINISTS.
The wordplay comprises: FE (iron, chemical symbol) + MINIS (plural form of a type of skirt, hence the word "some") + TS (plural of "T," an abbreviation for "T-Shirt").
The surface meaning is highly misleading; the interpretation of iron relies on a linguistic ambiguity (homonym employing a different part of speech: noun, not verb).
Clue (b) 2013 Times Championship clue (http://www.piemag.com/2013/01/23/marks-mind/): Speciality of the Cornish side that's perfect with new wingers (5,4)
The definition is "Speciality of the Cornish" = CREAM TEAS.
The wordplay comprises: DREAM TEAM ("side that's perfect") with D and M replaced by new letters on either edge ("with new wingers").
The surface meaning is misleadingly suggestive of football/rugby and contains some non-intuitive parsing of the components.
The algebraic/programming nature of the cryptic clue means that wordplay components may be flexibly recombined or anagrammed to form new units, e.g.:
• substring(A+B+C) = D (e.g., Part of it 'it an iceberg (7) = TITANIC; Moorey, 2009)
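The clue mechanisms illustrated in Box 1 can be sketched as simple string operations. The following Python fragment is purely illustrative: the function names (`charade`, `is_hidden`, `replace_edges`) are hypothetical labels chosen here, not terms used by the authors or by any crossword software, and the sketch covers only the three mechanisms that appear in the Box.

```python
def charade(*parts: str) -> str:
    """Concatenate wordplay components in order (A + B + C = D)."""
    return "".join(parts)


def letters_only(text: str) -> str:
    """Upper-case a phrase and drop spaces and punctuation."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def is_hidden(answer: str, fodder: str) -> bool:
    """True if the answer is a contiguous run of letters inside the fodder."""
    return letters_only(answer) in letters_only(fodder)


def replace_edges(phrase: str, first: str, last: str) -> str:
    """Swap the first and last letters of a phrase (the "new wingers" device)."""
    return first + phrase[1:-1] + last


# Clue (a): FE (iron) + MINIS (some skirts) + TS (shirts)
assert charade("FE", "MINIS", "TS") == "FEMINISTS"

# Hidden-word clue: the answer is a substring of "it 'it an iceberg"
assert is_hidden("TITANIC", "it 'it an iceberg")

# Clue (b): DREAM TEAM with new letters on either edge
assert replace_edges("DREAM TEAM", "C", "S") == "CREAM TEAS"
```

The sketch only verifies a known answer against its wordplay. It does not model the harder part of solving, namely working out which clue words are definition, which are fodder, and which are instructions.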
A second way to demonstrate expertise is by successfully solving advanced cryptics (Friedlander and Fine, 2009). Advanced cryptic crosswords are found in weekend newspapers and some magazines, and are generally "barred grid" (Figure 1C). Of these, the Listener Crossword is the most notoriously difficult, employing a high degree of clue mechanic concealment, obscure vocabulary, grids of startling originality and a thematic challenge, often involving a number of tricky lateral thinking steps on the basis of minimal guidance. Speed is not important: solvers have 12 days to submit their solution to each Listener puzzle. Very few entrants achieve an all-correct year (21 in 2010; 16 in 2011; 14 in 2012¹) and those submitting 42+ correctly (out of 52) appear on an annual roll of honor. The Magpie², a monthly specialist magazine with 5 highly challenging advanced cryptic crosswords (and 1 mathematical puzzle) per issue, runs a similar all correct/roll of honor system, and is broadly of Listener standard.
Finally, for a few highly expert cryptic solvers, the ultimate challenge is to compose cryptics oneself. There are a number of clue-writing competitions [e.g., Azed monthly challenges in the Observer magazine and competitions run for Crossword Centre club members (Harrison, n.d.)] which attract entries from expert solvers; a few of these go on to become crossword professionals (editors or setters for local or national publications), though for most this is not their full-time occupation.
¹ Total number of individuals submitting entries in each year ranges from 1500 to 3300. See further the official Listener site www.listenercrossword.com.
² http://www.piemag.com/about/.

Hypothesized Cognitive Demands of Solving British-Style Crosswords
Given the above, we might therefore hypothesize that cryptic crossword solvers' skills depend less upon crystallized intelligence and cued vocabulary retrieval than those of their American counterparts, although these factors are still clearly relevant. Indeed, even if a word referenced by a cryptic clue is not known to the solver, it can often be deduced from the wordplay, and there are potentially two quite distinct avenues to the clue's solution (Coffey, 1998): the crystallized route, tapping general knowledge and vocabulary to intuit the response, perhaps using cross-checking letters; and fluid intelligence, which taps the ability to "derive logical solutions to novel problems" (Hicks et al., 2015, p. 187; see also Cattell, 1963; Carroll, 1982; Kane et al., 2005) using clue components. Key cognitive abilities might therefore be hypothesized to include:
(i) The general capacity to analyze, reason, problem-solve and think "on one's feet," which could reasonably be argued to draw heavily upon WM capacity and fluid intelligence (Kane et al., 2005; Shipstead et al., 2014); together with a liking for this type of cognitive challenge;
(ii) A specific aptitude for cryptographic or mathematical thinking. The similarity of cryptic crossword clues to algebra or computer programming has been noted in passing (Manley, 2014), but has not attracted much scholarly attention. An Australian conference paper (Simon, 2004) draws a number of close analogies between solving cryptic crossword puzzles and computer programming problems, and suggests that the cryptic crossword "could one day be harnessed as one of a set of predictors of computer aptitude" claiming that "while intuition can be extremely helpful in solving crossword puzzles, it cannot take the place of clear analytical thought" (p. 302). The hunch that computer programmers and mathematicians might be particularly adept at cryptic crosswords seems to be backed up by informal membership polls undertaken by two free-membership internet-based cryptic crossword clubs (de Cuevas, 2004; Lancaster, 2005): in both surveys those engaged in STEM-based employment (Science, Technology, Engineering, Mathematics), particularly IT, were strongly represented;
(iii) The visuospatial ability to mentally manipulate algebraic components of wordplay "fodder" (see Box 1). While many solvers use a physical jotting pad or electronic anagrammer to manipulate the letters, the ability to rapidly visualize and mentally process potentially promising combinations might be hypothesized to confer a speed advantage in solving cryptics (Minati and Sigala, 2013);
(iv) The ability to pattern-match and, most specifically, complete word fragments provided by cross-checking letters, as already discussed for US-style crosswords (Nickerson, 1977, 2011; Hambrick et al., 1999);
(v) The ability to break free from familiar patterns of thought (particularly red herrings deliberately supplied by the setter) using new and unusual interpretations of clue components. The concept of "breaking the frame" of context-induced fixedness has often been associated with traditional insight problems (Davidson, 2003; Pretz et al., 2003; DeYoung et al., 2008): it is the solver's perseveration with the erroneous approach to the problem which can render it unsolvable. The authors argue elsewhere (article in preparation) that cryptic crosswords are a form of insight problem, and that flexible, divergent, "breaking frame" thinking is critical to successful solving.
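The word-fragment completion described in item (iv) can be sketched as a pattern match against a word list. This is a minimal illustration only: the function name `candidates`, the underscore notation for unchecked squares, and the four-word toy list are all assumptions made here for the example, not materials from the study.

```python
import re


def candidates(pattern: str, words: list[str]) -> list[str]:
    """Return the words that fit a grid pattern.

    Letters in the pattern are checked squares already filled in;
    each '_' stands for one unchecked (unknown) square.
    """
    regex = re.compile(pattern.upper().replace("_", "[A-Z]"))
    return [w for w in words if regex.fullmatch(w.upper())]


# Hypothetical toy word list, for illustration only.
WORDS = ["FEMINISTS", "FEMINISMS", "TITANIC", "YULETIDE"]

# A 9-letter entry with alternate letters checked, as in a blocked grid.
print(candidates("F_M_N_S_S", WORDS))
```

With this toy list the checked letters alone leave two candidates, FEMINISTS and FEMINISMS. In a cryptic clue the wordplay (FE + MINIS + TS in Box 1) would then settle the choice.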
The above review suggests that cryptic crosswords make wide and complex demands on their solvers, who appear to require a good all-round blend of lexical aptitude, logical/analytical thinking skills and breaking frame/lateral thinking abilities. This combination of attributes implies a certain "entry level" of ability that might in other contexts readily translate into academic success.

Previous Work on Cryptic Crosswords
Given the large disparity in cognitive demands and processes between the two forms of crossword (American definitional and British-style cryptic), it is essential to discriminate clearly between research undertaken in each field, since the findings from one area may not be applicable to the other, and cannot be cited uncritically as if the domains were congruent or interchangeable (Almond, 2010). Research into American-style crosswords has recently become more prolific (Nickerson, 1977, 2011; Hambrick et al., 1999; Toma et al., 2014; Moxley et al., 2015), while research into British-style cryptic crosswords has been comparatively sparse: this is mainly, one suspects, because of the separate crossword traditions of Britain and America, effectively making the subject unknown in America. Previous research into cryptics has considered several discrete areas: exploration of the cognitive or linguistic challenges posed by cryptic clues (Forshaw, 1994; Schulman, 1996; Lewis, 2006; Aarons, 2012); the use of cryptic crosswords to preserve cognitive flexibility ("use-it-or-lose-it") in aging populations (Winder, 1993; Forshaw, 1994; Almond, 2010); and finally a cluster of small-scale interrelated studies at the University of Nottingham, exploring individual differences in cryptic crossword solving (Underwood et al., 1988, 1994; Deihim-Aazami, 1999). Of all previous research, only the last three studies explored in any depth the question of what makes an expert solver excel in the cryptic crossword domain.
Starting from an interest in individual differences in reading ability and lexical memory, Underwood adopted the premise that cryptic crossword experts were likely to possess "particularly rich lexical networks" (Underwood et al., 1988, p. 302). In a comparison of 12 locally sourced cryptic crossword solvers with a convenience sample of 12 non-puzzlers on a battery of lexical tasks, the expectation was that the cryptic solvers, who were all of a good standard, would show particularly rapid, novel and accurate lexical data retrieval compared to the non-puzzlers. Unexpectedly, however, both groups showed very similar task performance, leading Underwood to conclude that cryptic crossword skills are "as much bound up in the cryptic puzzle codes as they are in lexical fluency" (1988, p. 306). In other words, in order to crack the cryptic challenge, solvers needed to apply problem-solving skills, in addition to possessing a good working vocabulary.
This conclusion was followed up in small-scale trials involving a convenience sample of 22 cryptic crossword enthusiasts (staff and students) from the University of Nottingham (Underwood et al., 1994; Deihim-Aazami, 1999). This population was split into "experts" (n = 14) and "intermediates" (n = 8) on the basis of their performance in solving 30 stand-alone cryptic crossword clues written by the researchers for the study. Participants again completed a battery of lexical tasks, and also took the AH4 test of fluid intelligence (Heim, 1970). In a reversal of earlier findings, experts out-performed the intermediates in the lexical tasks. Additionally, there was no significant difference between experts and intermediates on the AH4. From this, the researchers reasoned that success in cryptics was bound up with lexical skill and that fluid intelligence was not, after all, a factor (Underwood et al., 1994). They further hypothesized that greater exposure to cryptic crosswords over a number of years had enabled the expert solvers to broaden their vocabulary and their familiarity with cryptic clue architecture (Deihim-Aazami, 1999).
There were a number of issues with this research, however. The less-skilled group were all of student age, and had only recently commenced crossword solving, making them novices of unquantifiable future potential-"initiates" or "apprentices" according to Chi's taxonomy of proficiency (Chi, 2006), rather than experienced intermediates ("journeymen"). Additionally, neither sub-group was externally benchmarked for their performance in solving a grid-based, professionally written, publication-quality crossword of known difficulty. Both of these factors lead to doubts over the real levels of expertise present, and the assignment of participants to the correct groups. Furthermore, the experts were considerably older than the intermediates, which may have facilitated their lexical performance, as crystallized knowledge increases naturally with age (Horn and Cattell, 1967;Underwood et al., 1994). Finally, both groups scored exceptionally highly on the AH4, which was designed to be used on general populations educated only to high-school level (Deary and Smith, 2004), strongly suggesting that there were ceiling effects, given the degree-level education of the trial sample. This might explain the apparent lack of group differences in fluid intelligence.

The Expert-Performance Approach (EPA)
Deihim-Aazami's research (1999) contained many elements of the EPA methodological framework (Ericsson and Smith, 1991; Ericsson, 2000, 2006; Ericsson and Williams, 2007; Tuffiash et al., 2007). The aim of the EPA is to facilitate ecologically valid lab-based research in order to enable researchers to observe, analyze and capture the essence of the performance domain as the participants engage in a representative task. This task is intended to demonstrate a clearly superior performance by the expert, and elucidate potential mechanisms for this superiority (Ericsson, 2000).
The three key stages of the EPA comprise (Tuffiash et al., 2007):
(i) Identifying a representative task which captures the essence of expertise in the target domain. In de Groot's highly influential chess studies (de Groot, 1946/1965; Gobet et al., 2004) this consisted of two tasks: (a) identifying the best next move; (b) recalling the board layout of a briefly displayed game. The first task was intended to simulate game-play, and the second to investigate domain-specific perceptual and mnesic mechanisms. De Groot's two paradigms became classics in other "mind game" studies. For example, in the Nottingham studies, participants cold-solved 30 freestanding cryptic clues, without the aid of a grid structure or intersecting letters (Underwood et al., 1994; Deihim-Aazami, 1999). Similarly, in Scrabble, Tuffiash adapted de Groot's first paradigm, asking participants to identify the best-scoring play, when presented with a set of 12 diverse game positions that might be encountered during a highly competitive Scrabble game (Tuffiash et al., 2007).
(ii) Observing participants engaged in this task, while collecting "process-tracing data," often by means of a think-aloud protocol (Green and Gilhooly, 1996; Ericsson and Williams, 2007). Again, for Scrabble, Tuffiash recorded talk-aloud data for their participants as they debated the best next move (Tuffiash et al., 2007); and in the Nottingham trials, participants were asked to solve a further 37 isolated clues, talking aloud to explain their understanding of each clue's architecture and solution (Deihim-Aazami, 1999). Superior performance displayed by experts is argued to be directly linked to complex representations of the current task, and to be derived from a deeper and more detailed understanding of the underlying domain (Chase and Simon, 1973; Chi et al., 1981; Ericsson and Williams, 2007; Campitelli and Gobet, 2010). A systematic analysis of the material gained from process-tracing might therefore give insights into areas of comparative strength which underpin experts' superior performance (Tuffiash et al., 2007). Such data are commonly extended and enriched by identifying and experimentally manipulating additional cognitive subskills hypothesized to be key to the domain (Tuffiash et al., 2007): Tuffiash's Scrabble participants underwent an additional battery of primarily lexical tasks (Tuffiash et al., 2007); as did Deihim-Aazami's cryptic crossword participants (Deihim-Aazami, 1999), together with the AH4 (Heim, 1970).
(iii) Accounting for the development of experts' superior performance by conducting interviews with participants, and teachers/parents where relevant, to establish key indicators of domain experience, such as starting age, key achievement milestones, and the quantity and type of practice undertaken (Tuffiash et al., 2007). This last step, and the content of the questionnaire administered, is driven by researcher conviction that the origin of high expertise in niche domains arises from extensive exposure to dedicated and structured practice regimes over at least 10 years (Ericsson et al., 1993; Ericsson, 2000). Collection of practice and experience data was therefore a feature of both the Scrabble (Tuffiash et al., 2007) and the cryptic crossword studies (Deihim-Aazami, 1999).

Current Research: The Grounded Expertise Components Approach (GECA)
The current research program could have followed the classic EPA path, by selecting representative tasks and key cognitive components from a purely theoretical standpoint, based on a priori detailed knowledge of the solving process. After all, the above review of hypothesized cognitive demands of British cryptics identified several promising avenues, and the Nottingham research, although small scale and conflicting in results, led the way in this domain. The current authors were reluctant, however, to impose their preconceived ideas upon the direction of the present study in this way. This reluctance was based on the conviction that one cannot conduct objective research on a niche population without first carefully characterizing this population across a large number of dimensions, leading to a grounded understanding of the drivers for participation in the field, the levels of immersion in the activity and the potential skills which are brought to it. A new approach (GECA) was therefore conceived, with the intention of providing empirical support for the direction and design of future controlled studies. The components of the GECA comprise:
1. A wide-ranging survey characterizing the domain-specific population across a large number of dimensions, ranging from demographics, levels of participation and experience to more indirectly related areas such as participants' education, occupation and hobbies, and motivational drivers. This differs from the EPA in that the survey is not limited to the collection of practice and experience data only, and is undertaken at the very outset of the research program.
2. Analysis of the survey data to identify characteristics of both the expert and non-expert population, leading to a grounded research rationale for the design of key elements in subsequent stages of the investigation. This avoids the potential trap of confounds arising from preconceived theoretical or ideological assumptions.
3. A lab-based recording (if appropriate; see Hambrick et al., 2016) of domain-relevant performance, to elucidate both the strategic and cognitive mechanisms involved. Although this stage draws heavily on the EPA, the GECA focuses on a more naturalistic task, fully reflecting the totality of the cognitive demands of the domain, rather than a series of isolated challenges (e.g., the de Groot paradigms: on this see further the Discussion section below).
4. The identification of supplementary sub-tests, to probe specific empirically-indicated cognitive or strategic processes believed to be instrumental in distinguishing experts from non-experts within the domain: again, this flows from the initial characterization of the population.
An exploratory survey was therefore designed to capture responses of cryptic crossword solvers to a broad range of 84 questions (many with extended sub-sections), as above. Analysis of this large body of data has led to a very detailed characterization of the solving population, many aspects of which will be reported elsewhere; in the interests of brevity we report here only the key findings relevant to the design of the subsequent research program and to the establishment of the GECA as a powerful research methodology. Although the net was cast very wide, the questionnaire was specifically set up to address the research question: "What is the nature of the cryptic crossword population in terms of their cognitive skills, motivation and expertise development?" The hypotheses were as follows, largely flowing from the nature of cryptic solving and its cognitive demands identified earlier:
H1: cryptic crossword solvers would generally be academically able adults, given the cognitive complexity of the puzzle demands; this might imply that there is a cognitive ability threshold for entry into the domain;
H2: solvers' education and occupation would predominantly be in scientific or IT-related fields, rather than in language fields, implying that cryptics might particularly appeal to the logical and analytical thinker with an aptitude and liking for problem-solving;
H3: cryptic crossword solving regularly generates "Aha!" or insight moments, supporting the hypothesis that the cryptic clue is a classic type of insight problem through misdirection; and that this pleasurable experience is a salient driver of cryptic crossword participation;
H4: solvers would generally enjoy effortful cognitive activity in all spheres of life including work and hobbies, and that this would be an important driver of cryptic crossword participation;
H5: solving is essentially an intrinsically motivated activity, not generally undertaken for public acclaim or prize money³; practice/engagement levels for both expert and non-expert solvers would consequently be low and relatively unstructured compared to high profile competitive performance areas such as chess and music;
H6: cryptic solving would not normally begin in childhood, in view of the cognitive complexity of the task, but is more likely to commence in late teenage years (for US crosswords, see also Moxley et al., 2015).

Materials
An 84-item wide-ranging questionnaire was developed and piloted. Most of the survey material was devised specifically for the study, but the questionnaire also incorporated the short-form "Need for Cognition" scale (Cacioppo et al., 1984) and the "Work Preference Inventory" (Amabile et al., 1994): both scales are described more fully below (Results Section). The survey was approved by the School of Science Ethics Committee, University of Buckingham. The questionnaire was made available both on the internet through SurveyMonkey® and on paper. Respondents were recruited in two phases:
• Survey 1 involved contacting advanced cryptic crossword solvers, speed solvers and compilers at the very high-expert end of the spectrum. Respondents were sought by means of adverts circulated (i) on the Crossword Centre website [...]
• Survey 2 [...] solvers of daily block-style cryptics to take the same survey in order to obtain comparative data from non-experts. Adverts were placed on a number of websites providing a daily analysis of (and answers to) a wide range of block-style puzzles from UK newspapers; the survey was also re-advertised on the Crossword Centre website.
As the two questionnaires used were identical, data from the two surveys were combined and analyzed together. Participants were assigned to expertise groupings on the basis of responses to key fields within the survey (see below, "Definition of Research Groups"), regardless of survey phase.

Participants
There were 935 responses to the surveys (S1 n = 257; S2 n = 678); however, post-consolidation reviews of the data identified a number of unworkably incomplete (n = 109), duplicated (n = 14) or spoof (n = 7) records, which were excluded from the analysis, leaving 805 responses in total (S1 n = 234; S2 n = 571). Participants were aged 18-84 (mean = 52.1; SD = 12.4), and males (n = 632, 78.5%) outnumbered females (n = 173, 21.5%). The majority of respondents were British (n = 709, 88%); the remainder were from the USA (n = 28, 3%), Australia (n = 26, 3%), Ireland (n = 14, 2%); Canada, New Zealand, India, Holland, France and Spain (each 1% or less). All 805 participants responded to the initial pages of the survey; however, there was some attrition toward the end, with the lowest level of response to any section being 764. Overall reported numbers in the results section therefore vary according to the position of the question in the survey.
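The sample accounting above can be re-derived from the reported counts. The following is a minimal sketch using only figures stated in the text; the variable and function names are illustrative and not part of the original analysis.

```python
# Sample accounting for the two survey phases, using counts reported in the text.
raw = {"S1": 257, "S2": 678}
excluded = {"incomplete": 109, "duplicated": 14, "spoof": 7}
retained = {"S1": 234, "S2": 571}

total_raw = sum(raw.values())
total_retained = total_raw - sum(excluded.values())


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Gender split reported for the retained sample.
male, female = 632, 173

print(total_raw, total_retained)
print(pct(male, total_retained), pct(female, total_retained))
```

The retained total agrees with the sum of the per-phase retained counts, and the gender percentages match those quoted.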

Definition of Research Groups
A key challenge in expertise studies is that of establishing rigorous, objectively-based and externally benchmarked criteria for assigning participants to research categories (Tuffiash et al., 2007). Ranking systems such as the Elo rating in chess (Gobet and Charness, 2006) and official Scrabble tournament metrics (Tuffiash et al., 2007) are particularly valuable, and have tended to encourage research in these areas (Grabner, 2014; Hambrick et al., 2014b). No such mathematically based ranking system exists for cryptics, and alternative methods have therefore been developed (Friedlander and Fine, 2009; Fine and Friedlander, 2010) resulting in both a 2-way (Ordinary/Expert) and a 3-way (Ordinary/High/Super-Expert) categorization of participant expertise.

Ordinary/Expert Solver Classification (O/E)
The Ordinary/Expert groups were designated as follows (Fine and Friedlander, 2010):
• Ordinary (O): solvers who (by self-report) normally take longer than 30 min to solve quality broadsheet cryptics. They do not usually tackle advanced cryptics, and are not successful at this form of puzzle;
• Experts (E): defined as those who can routinely solve one quality broadsheet cryptic in 30 min or less, who compile crosswords professionally, or who tackle advanced cryptics with regular success.

Ordinary/High/Super-Expert Definition (O/H/S)
The above definition of "Expert" is quite broadly conceived, and does not identify world-class performance in the same way as, for example, a FIDE Elo rating of 2200 ("Candidate Master") can do. The concept of "Super-Expert" cryptic crossword solver was therefore developed for earlier publications (Friedlander and Fine, 2009) in order to permit a more rigorous 3-way analysis of expertise, where required. In accordance with this, Super-Experts (S) were defined as those who fell into one or more of the following categories:
• Those who edit or compose cryptics professionally, on at least an occasional basis, for broadsheet or specialist publications ("Pro");
• Those who regularly speed-solve a cryptic in <15 min; and/or had reached the final in the annual TNCC on at least one occasion ("Speed");
• Those who had solved 42+ Listener (or 48+ Magpie) crosswords correctly in 1 year, and were thus named on the official roll of honor ("Advanced").
Expert (E) participants not categorized as S were designated as "High Solvers" (H), enabling data to be analyzed using a 3-way structure (O/H/S). Conceptually, these three groups are similar to Chi's "Journeyman" (O), "Expert" (H) and "Master" (S) proficiency categories (Chi, 2006). Survey responses were strongly encouraged from those who had been tackling cryptics for at least 2 years, and preferably for 5+ years. Respondents could thus be assigned to the appropriate research group on the basis of mature performance in the field, thereby avoiding possible confounds which can arise when classifying inexperienced novices of unquantifiable potential. In the event, only 11 participants with less than 2 years' experience completed the survey (1.4% of the total); of these, 9 were classified as O and 2 as H.
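The O/E and O/H/S rules described above can be expressed as a small decision function. The sketch below is illustrative only: the field names are hypothetical (the survey's actual variables are not given in the text), and checking the Super-Expert criteria first is a simplifying assumption rather than the authors' documented coding procedure.

```python
from dataclasses import dataclass


@dataclass
class Solver:
    # Hypothetical field names standing in for the survey's self-report items.
    broadsheet_minutes: float           # usual time for one quality broadsheet cryptic
    professional: bool = False          # edits or composes cryptics professionally
    advanced_success: bool = False      # tackles advanced cryptics with regular success
    tncc_finalist: bool = False         # reached the annual TNCC final at least once
    listener_correct_in_year: int = 0   # Listener puzzles solved correctly in one year
    magpie_correct_in_year: int = 0     # Magpie puzzles solved correctly in one year


def is_expert(s: Solver) -> bool:
    """2-way rule: 30 min or less, professional compiler, or advanced success."""
    return s.broadsheet_minutes <= 30 or s.professional or s.advanced_success


def is_super_expert(s: Solver) -> bool:
    """3-way rule: any one of the Pro / Speed / Advanced criteria."""
    pro = s.professional
    speed = s.broadsheet_minutes < 15 or s.tncc_finalist
    advanced = s.listener_correct_in_year >= 42 or s.magpie_correct_in_year >= 48
    return pro or speed or advanced


def classify(s: Solver) -> str:
    """Return 'S', 'H' or 'O' (assumption: S criteria are checked first)."""
    if is_super_expert(s):
        return "S"
    if is_expert(s):
        return "H"
    return "O"


print(classify(Solver(45)), classify(Solver(25)), classify(Solver(12)))
```

Under these assumptions, a 45-minute solver is coded O, a 25-minute solver H, and a 12-minute solver S.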

Expertise Ratings of the Surveyed Population
Solvers (n = 805) were split almost evenly between O (n = 401, 49.8%) and E (n = 404, 50.2%). This high proportion of experts was attributable to the recruitment methodology (particularly for Survey 1) and is not believed to be representative of the general level of solver expertise within the cryptic crossword solving community. Within the E group, H solvers (n = 225, 28.0% of overall total) outnumbered S solvers (n = 179, 22.2%).
As in other expertise studies (Tuffiash et al., 2007; Toma et al., 2014), the qualifying bar for the S designation was set rigorously high. Inspection of individuals coded to S revealed that many were acknowledged experts in the field, including: 49 professional setters or editors; 27 TNCC finalists, including 5 outright winners; and 111 roll of honor Listener/Magpie crossword solvers, of whom 31 had achieved an all-correct year. Some individuals qualified as S by virtue of two or more Pro/Speed/Advanced criteria (n = 52, 29% of S group). S group members were all known by name/reputation to the authors, and their skill level may be objectively verified by reference to publicly available competition statistics and compiler listings. The S category thus indisputably represents an elite body of top-class performers in the field.

Age, Experience, and Gender
Demographic data were supplied by 805 respondents; key points for discussion are summarized in Table 1. There was a highly significant association between expertise and gender [χ²(2) = 19.01, p < 0.001, Cramer's V = 0.154]. Post-hoc tests using standardized residuals indicated that this was driven by female participation in the O (z = 2.2, p < 0.05) and S (z = −3.1, p < 0.01) groups, with female participation dropping [...] Balfour, 2004); however, subscriptions to the Magpie⁵ at the time of the survey also showed low levels of actual female participation, with membership running at 178M/10F, plus 6 couples, who solve and submit as a pair. This resulted in a female subscription level of 8% (16F out of 200, counting each member of the couple individually), comparable to our S data.
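The effect size quoted with the chi-square test above can be recovered from the test statistic and the sample size. The sketch below uses the standard Cramér's V formula, V = sqrt(χ² / (N · min(r − 1, c − 1))). It assumes a 3 × 2 table (three expertise groups by two genders, consistent with the reported 2 degrees of freedom) and N = 805; the function name is illustrative.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V for an n_rows x n_cols contingency table with n observations."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


# Expertise (O/H/S) by gender: values as reported in the text.
v_gender = cramers_v(19.01, 805, n_rows=3, n_cols=2)
print(round(v_gender, 3))
```

With these inputs the result rounds to the reported value of 0.154.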

Handedness
Respondents were asked to supply their handedness: responses were coded as right/non-right (i.e., including ambidexterity), but will be referred to as right/left for convenience. Solvers were predominantly right-handed, with overall levels of left-handedness running at 11.3% (see Table 1). Levels of left-handedness among O and H solvers fell between 8 and 12%, which would be considered low-normal within the general population (10 to 13.5%; e.g., Gobet and Campitelli, 2007). S solvers showed higher levels of left-handedness (14.0%); however, the association between handedness and expertise was not statistically significant overall [χ²(2) = 3.893, p = 0.143, Cramer's V = 0.07]. Nevertheless, an analysis of the handedness of professional setters/editors (n = 49, all coded as S) shows a striking level (22.4%) of left-handedness; conversely, the other S group members (n = 130) show a more conventional profile (10.8%). Reanalyzing the handedness statistics with professionals identified separately yielded a statistically significant association [χ²(3) = 8.734, p = 0.033, Cramer's V = 0.104]. Post-hoc inspection of the standardized residuals indicated that this was driven solely by the level of left-handedness within the Pro group (z = 2.3, p < 0.05).
⁴ As confirmed by Derek Harrison, website owner, e-mail dated 18/9/09.
⁵ Data supplied by Chris Lear, Magpie editorial team member, e-mail dated 25/9/09.
Most cryptic crossword solvers began in their teens: 67.3% of respondents (n = 542) had started by the age of 20, and this rose to 88.2% by the age of 30 (n = 710). Starting age ranged from 6 to 65 (Mdn = 18), and the most popular age for commencing was 15 (n = 79; 9.8%) followed by 18 (n = 77; 9.6%). Only 21 respondents (2.6%; O: n = 2; H: n = 7; S: n = 12) claimed to have started solving cryptics before the age of 10; of these the majority (n = 20, 95%) had a parent or family member who also solved cryptics, and they learned the rules of solving from their parents (n = 19, 90%).
There were significant group differences in starting age [H (2) = 122.70, p < 0.001]: S started earliest (Range = 6-40, Mdn = 15) followed by H (Range = 6-55, Mdn = 17) and then O (Range = 8-65, Mdn = 20). Pairwise comparisons, using Dunn's procedure (Dunn, 1964) with Bonferroni correction for multiple comparisons, revealed statistically significant differences in starting years between all groups, all with an adjusted significance of p [...]. However, on average, members of each of the three groups had been tackling cryptics for approximately 30-40 years and were thus all highly experienced; and, regardless of group, 729 of the 805 respondents (90.6%) had been solving for 10 years or more. In common with findings in other expertise areas such as chess (Gobet and Ereku, 2014; Hambrick et al., 2014b; Campitelli, 2015), a small number of cryptic crossword solvers had achieved levels of high expertise within 5 years of starting (H = 9, S = 1). Conversely, many respondents (n = 57) had been solving for 45 years or more, but had remained as O.
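The Bonferroni correction applied to the pairwise comparisons above can be sketched as follows. The raw p-values in the example are hypothetical placeholders, not values from the survey. Multiplying each raw p-value by the number of comparisons (capped at 1) is one common way of reporting adjusted values, and is assumed here rather than taken from the paper.

```python
def bonferroni_adjust(p_values: dict) -> dict:
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Hypothetical raw p-values for the three pairwise group comparisons.
raw_p = {"O-H": 0.003, "O-S": 0.0001, "H-S": 0.4}
adjusted = bonferroni_adjust(raw_p)

for pair, p in adjusted.items():
    print(pair, round(p, 4))
```

With three groups there are three pairwise comparisons, so each raw p-value is tripled before being compared with the 0.05 threshold.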

Practice Levels
The survey collected a wide range of data relating to practice, experience and the range and difficulty of puzzles undertaken. Full details will be discussed elsewhere; the current paper presents summarized data only for key fields.

Hours Spent Solving Per Week
Respondents (n = 802) estimated how many hours they spent each week solving cryptic crosswords. Results are summarized in Table 2A.
Hours spent solving cryptic crosswords increased with expertise (O: M = 7.02; H: M = 7.27; S: M = 7.85); however, the median was 6 hours across all groups. This equates to

Hours Spent on Other Cryptic Crossword Activity per Week
Participants were also asked to estimate the amount of time spent on other crossword activities such as cryptic crossword social gatherings, blogging, consulting on-line solution pages or message boards, composing or test-solving cryptic crosswords and entering competitions. Results (n = 803) are summarized in Table 2B. Much of this difference was driven by the inclusion of the 49 crossword setters/editors within the S category, who would be expected to spend considerable amounts of crossword related time each week to fulfill their professional obligations: see Table 2B. Excluding these individuals, all groups spent on average less than 2 h per week (around 20 min per day) on other crossword related activities. Differences between the groups were still statistically significant [H (2) = 16.76, p < 0.001]. Pairwise comparisons revealed statistically significant differences between O-H (p = 0.007, r = 0.12) and O-S (p = 0.001, r = 0.15) alone; however comparison of H-S was no longer significant.

Education and Occupation
Education
Education data were supplied by 780 respondents (O: n = 383; H: n = 220; S: n = 177); key points for discussion are summarized in Table 3 with further discussion in the following text.

Level of education reached
Respondents were asked to indicate their highest level of educational qualification either by radio button selection of standard UK options (e.g., O Level/GCSE, BA/BSc, PhD), or by free-text description, where these options were inappropriate. Data were reviewed independently by both authors, and assigned to the 8 bands of educational level (e.g., 0 = No Quals; 3 = A Level; 6 = BA/BSc; 8 = PhD) currently recognized by the UK Government (Gov.UK, 2015). An Independent Samples Mann-Whitney U test was conducted to determine whether there were differences in the distribution of educational qualification between expertise categories (O: n = 383; E: n = 397). The mean rank of E (403.76) was higher (more qualified) than for O respondents (376.76), although this trend was not statistically significant (U = 81,288.0, z = 1.77, p = 0.077).
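The reported U statistic can be cross-checked against the reported mean ranks using the standard identity U = R − n(n + 1)/2, where R is the rank sum (mean rank × n) of the group in question. This is a minimal sketch with an illustrative function name; small discrepancies are expected because the published mean ranks are rounded to two decimals.

```python
def u_from_mean_rank(mean_rank: float, n: int) -> float:
    """Mann-Whitney U for a group, given its mean rank and group size."""
    rank_sum = mean_rank * n
    return rank_sum - n * (n + 1) / 2


n_o, n_e = 383, 397
u_e = u_from_mean_rank(403.76, n_e)  # Expert group
u_o = u_from_mean_rank(376.76, n_o)  # Ordinary group

print(round(u_e, 1), round(u_o, 1), n_o * n_e)
```

The value for the Expert group lies within rounding error of the reported U = 81,288.0, and the two U values sum to approximately n_O × n_E, as they should.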

Main subject studied
Survey participants were asked to indicate in a free-text field the main subject they studied in their education. This information was independently reviewed by both authors and coded to 43 subject areas; after resolution of differences these were then aggregated to 17 broad subject fields (see Appendix A in Supplementary Material) within 6 educational sectors: Arts and Design; Business; Human Studies; STEM; Wordsmiths; Other (unclassifiable/too many subjects identified). Most participants specified one subject which, given the generally high level of education achieved, was considered likely to be their degree/postgraduate specialism.

Main subject analyzed by RIASEC coding
Finally, educational subject data by participant (n = 757; O: n = 368; H: n = 214; S: n = 175) were assigned to "RIASEC" coding ("Holland codes": Gottfredson and Holland, 1996; Nauta, 2010: see further Box 2) using the standard listing supplied in the "CIP to HOC" section (Classification of Instructional Program Titles to Holland Occupational Codes) of the coding manual (Gottfredson and Holland, 1996). Forty-one distinct Holland codes were applied. Twenty-three records were not coded: in most cases participants had ceased education at 14-16 years old and were unable to supply a main subject. The mapping of subject area to code was relatively straightforward, and data assignation was double-checked by a second reviewer.
Following assignation of educational subject data to 3-letter codes, responses were analyzed by the primary RIASEC letter (see Table 4).

Field of Paid Occupation
Respondents were asked to supply their main paid occupation during their working life; teachers and lecturers were asked to state their main specialism. 780 responses were received (O: n = 383; H: n = 220; S: n = 177). Independent analysis of these details by both authors allowed for the allocation of respondents to 40 occupation areas based on discipline/field of work. Following resolution of differences, these were then aggregated to 23 occupational fields and thence to 7 occupational sectors: STEM, Finance, Office/Business, Wordsmiths, Performance, Manual, Other. For details, see Appendix B in Supplementary Material.
Overall scores again showed a strong bias toward "I" activities, accounting for 43.8% of the responses (O: 41.8%; H: 45.6%; S: 45.9%), compared to 10% in the 2010 US population norms (McClain and Reardon, 2015). Given the broadly comparable economic and technological profiles of the US and the UK, this appears considerably higher than might be expected. Employment in "I" fields typically involves relatively small numbers of largely STEM-based occupations involving highly qualified individuals (McClain and Reardon, 2015).
"A" occupations (12.4%; publishers, journalists, writers, English teachers) were also considerably more prominent than US population norms (2%), and this Wordsmith activity was once more higher for H participants, with S being particularly low. While a chi-square analysis of primary RIASEC codes by expertise failed to achieve overall significance [χ²(10) = 10.05, p = 0.436, Cramer's V = 0.081], the comparatively low proportion of S occupied in "A" careers approached significance (S: 7.6%, n = 13, z = −1.8).
Code "S" was more prominent for survey occupation data than for education, and tended to reflect occupational team-building, committee and communications skills: for example, "SEC" (n = 29, 3.8%; e.g., civil servant) and "SER" (n = 22, 2.9%; e.g., company board director). Careers assigned to "R" tend to involve high levels of employment in relatively low-grade practical or mechanical tasks (McClain and Reardon, 2015), and were poorly represented among our survey population. Careers coded to "C" were more common for S (z = 1.2) than for O (z = −0.4) or H (z = −0.5), reflecting the larger proportion of accountants in this group ("CSI").

Occupation complexity
As part of the Holland coding process, occupation complexity was also recorded, using the Cx rating score in the "DOT to HOC" section of the coding manual (Gottfredson and Holland, 1996). This rating reflects the cognitive complexity of work demands (for the calculation algorithm, see Gottfredson and Holland, 1996, p. 723). Holland Cx scores range from <40 to >80; a Cx rating of 65 or higher is associated with a college degree and 4-10 years of "On-Job-Training" (Reardon et al., 2007). Survey respondents were engaged in cognitively complex jobs, with the mean and median scores for all groups being close to 70: see Table 7. Over half the respondents (53.8%) fell into the 70-79 band. Typical career options in the 60-69 band include secondary school teachers, middle-ranking civil servants and journalists; and in the 70-79 band lawyers, physicians, software engineers, company directors and university faculty staff. There were 151 teachers: 102 were predominantly secondary school specialists (Cx = 66); 49 were university faculty members (Cx = 74), including 9 at UK professorial level (Cx = 78).
Job complexity increased slightly with expertise (O: M = 69.2; H: M = 69.4; S: M = 69.7) and there were fewer S participants in the two lower complexity bands; however, job complexity did not differ significantly across expertise groups [H (2) = 1.230, p = 0.541].

"Need for Cognition" and Hobbies
"Need for Cognition"
The short-form "Need for Cognition" (NFC) scale seeks responses to 18 statements relating to a person's tendency to seek out, engage in and enjoy effortful thinking (Cacioppo et al., 1984; von Stumm and Ackerman, 2013). Sample questions include (both reverse coded): "I like tasks that require little thought once I've learned them" and "Thinking is not my idea of fun." Respondents (n = 764; O: n = 377; H: n = 212; S: n = 175) rated each statement on a 5-point Likert scale (1 = "Completely Disagree"; 5 = "Completely Agree"). Full details of the results obtained will be discussed elsewhere; the current paper presents summarized data only.
Scores were corrected for reverse coding and averaged by participant to produce an individual NFC score. Overall, respondents showed mean NFC levels significantly in excess of the mid-point, 3: t (763) = 36.55, p < 0.001, d = 1.32. Indeed, no expertise group means fell below 3 on any of the 18 individual statements. NFC increased with expertise (O: M = 3.71, Mdn = 3.78; H: M = 3.77, Mdn = 3.78; S: M = 3.79, Mdn = 3.83) but differences were not statistically significant [H (2) = 3.319, p = 0.190]. A Mann-Whitney U test was also conducted to determine whether there were differences in the distribution of NFC scores between broad expertise categories (O: n = 377; E: n = 387). The mean rank of E (396.54) was higher (greater NFC) than for O respondents (368.09) and this [...]
In a separate question (reported later in Table 12), participants were asked to rate suggested motivators for their engagement with cryptic crossword solving. Full details of these motivators will be discussed elsewhere; however the wording of one suggested motivator is strongly reminiscent of the NFC: "My brain constantly demands to be engaged in intellectual pursuits in all I do [...]
Pairwise comparisons were performed as above. These revealed statistically significant differences between O-H (p < 0.001, r = 0.18) and O-S (p < 0.001, r = 0.23) groups (adjusted p-values are presented). The difference between H-S was not statistically significant.
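The scoring and effect-size steps described above for the NFC scale can be sketched as follows. The example responses and the positions of the reverse-coded items are hypothetical, chosen only to illustrate the arithmetic. The effect size uses the standard one-sample relation d = t / sqrt(n), with n taken as the degrees of freedom plus one.

```python
import math


def scale_score(responses, reverse_items, scale_min=1, scale_max=5):
    """Mean Likert score after flipping reverse-coded items (0-based indices)."""
    adjusted = [
        (scale_min + scale_max - r) if i in reverse_items else r
        for i, r in enumerate(responses)
    ]
    return sum(adjusted) / len(adjusted)


def cohens_d_one_sample(t: float, df: int) -> float:
    """Cohen's d for a one-sample t-test, from t and its degrees of freedom."""
    return t / math.sqrt(df + 1)


# Hypothetical four-item example; items at positions 1 and 3 are reverse coded.
score = scale_score([4, 2, 5, 1], reverse_items={1, 3})

# Values reported in the text for the comparison with the scale mid-point.
d = cohens_d_one_sample(36.55, 763)

print(score, round(d, 2))
```

On a 1-5 scale a reverse-coded response r becomes 6 − r, so the hypothetical respondent scores 4.5, and the reported t and degrees of freedom reproduce d = 1.32.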

Hobbies
Participants were also asked about their hobbies outside cryptic crosswords. Data were collected in two parts: the first collected details of levels of engagement with other mind game activity (e.g., chess, Sudoku, Scrabble, non-cryptic crosswords); the second asked for free-text details of any other significant hobby activity. Data from these two sources were segmented, coded and combined to provide a rounded picture of cryptic crossword enthusiasts' leisure-time activities; mind game data were included only for those participants who engaged "regularly" with these activities. Sixty-two participants (8%) indicated that they had no important hobbies other than cryptic crosswords, or left this field blank (O: n = 31, 8%; H: n = 22, 10%; S: n = 9, 5%).

Hobbies by fields of interest
Hobbies were coded to hobby areas by an independent coder; these were then aggregated to 10 broad hobby sectors (see Table 8A), and three researchers (including the authors) then reviewed and agreed all codings. Overall, hobby replies (n = 2687) showed a pronounced bias toward cognitively challenging pastimes: 39.9% (n = 1073) related to other mind games (9.1% Sudoku, 4.9% Trivial Pursuits/Pub Quizzes; though only 2.8% non-cryptic crosswords and 2.3% Scrabble); 9.2% to reading/writing, and learning foreign languages (predominantly reading 6.8%); and 5.8% to academic or niche pursuits (e.g., astronomy, mycology, philately, philology/semiotics and transport enthusiasms). Responses also showed engagement with a wide range of sporting and outdoor activities, whether as participants or spectators (16.9%); and an active interest in the creative arts (7.2%) and music (6.7%).
Data were then examined by expertise group to identify differences in respondent participation levels (n = 718) for each hobby sector (Table 8B). Findings will be reported in greater detail elsewhere. There was a particularly striking tendency for participants to pursue at least one mind game¹¹ in addition to cryptic crosswords (

Musical interest/ability
In addition to the hobbies question, participants were asked to indicate their level of ability in music-making. Responses were given on a 6-point Likert scale (1 = "No interest/ability"; 6 = "Total immersion/professional level"). There were 767 responses to this question (O: n = 377; H: n = 215; S: n = 175). Overall, participants indicated that they had a modest competence in music (M = 3.02, Mdn = 3). Musical ratings were slightly higher for S participants (O: M = 3.00; H: M = 2.99; S: M = 3.13; all groups Mdn = 3). However, the overall comparison failed to achieve statistical significance [H (2) = 1.864, p = 0.394].
Participants were also asked whether they still participated in musical activities, and whether they sang, played or composed. Responses are summarized in Table 9.
Responses were evenly matched across all groups, with the exception of singing, which was particularly high for S (18.3%, z = 1.4) and low for H (10.7%, z = −1.4). Even so, differences in the distribution of singing participation between groups failed to reach statistical significance [χ²(2) = 4.564, p = 0.102, Cramer's V = 0.077]. Comparison with levels of singing, playing and composing in the general adult English population (DCMS, 2015) suggests that in all three areas the level of active music participation amongst cryptic crossword solvers is markedly above population norms.

Hobbies analyzed by RIASEC coding
Finally, hobbies (n = 2687) were analyzed by RIASEC coding using the 2-letter RIASEC code ("IR," "AS," etc.) as supplied by the "Leisure Activities Finder" (Holmberg et al., 1997). Twenty-six of the 30 available codes were used. Over 70% of the hobbies fell into one of 7 categories: as Table 10 indicates, there were no marked differences in the distribution of responses between expertise groups, although H participants engaged least with both personal and team sport, and most with the arts; and S participants engaged least with word-based games and activities.
On a person-specific basis, primary RIASEC hobby codes were then aggregated and averaged to produce an individual primary RIASEC code profile. Results are shown in Table 11.
Once again, the aggregation revealed a very strong tendency across all groups toward "I" activities (27.2%; O: 26.0%; H: 29.6%; S: 26.8%). "A" activities were even more important, and were higher among H solvers than O and S (30.7%; O: 30.2%; H: 32.8%, S: 29.3%): these involved musical, cultural, literary and word-based activities. "S" activities featured prominently for O and S participants, but not for H (22.8%; O: 24.1%; H: 17.3%; S: 26.3%): these tended to be socializing activities involving clubs, sports and family/friends. Analysis of the distribution of RIASEC hobby scores by expertise was significant only for "S" activities [H (2) = 9.221, p = 0.010]. Pairwise comparisons were performed as above; adjusted p-values are presented. The only statistically significant difference in scores was between H and S, with S showing the highest and H the lowest scores (p = 0.009, r = 0.15).

Motivation in Everyday Life and Crossword Solving
The "Work Preference Inventory"
Participants were asked to complete the "Work Preference Inventory" (WPI; Amabile et al., 1994), which was "designed as a direct, explicit assessment of individual differences in the degree to which adults perceive themselves to be intrinsically and extrinsically motivated toward what they do" (p. 952). Typical questions include "To me, success means doing better than other people" and "I have to feel that I'm earning something for what I do." There were 766 responses (O: n = 377; H: n = 215; S: n = 174) rating each of 29 statements on a 5-point Likert scale (1 = "Completely Disagree"; 5 = "Completely Agree"). Full details of the results obtained will be discussed elsewhere; the current paper presents summarized data only.
Data were analyzed using Amabile et al.'s (1994) 2-factor breakdown, into Intrinsically (IM: n = 14) and Extrinsically (EM: n = 15) motivated statements, and averaged for each participant within these categories. Overall, respondents showed higher scores on IM items (M = 3.63, Mdn = 3.64) than EM items (M = 2.62, Mdn = 2.60), and mean WPI levels were significantly different from the mid-point (3)

Motivation for Solving Crosswords
Participants were also asked to rate 26 statements relating to their motivation for solving cryptic crosswords on a 5-point Likert scale (1 = "Completely Disagree"; 5 = "Completely Agree"). The 26 statements were independently assigned to Intrinsic (IM: n = 19) and Extrinsic (EM: n = 7) motivational categories by the authors, following the methodology used in the WPI (Amabile et al., 1994). There were 786 responses (O: n = 388; H: n = 221; S: n = 177). Full details of the results obtained will be discussed elsewhere; the current paper presents summarized data only.
Data were analyzed into IM and EM statements (as above), and averaged for each participant within these categories. Overall, respondents again showed higher scores on IM items (M = 2.87, Mdn = 2.84) than on EM items (M = 1.53, Mdn = 1.43): see Table 12A. A Wilcoxon signed-rank test determined that there was a highly statistically significant difference (Mdn Diff = 1.41) between responses on IM and EM (z = 24.23, p < 0.001, r = 0.61).
Table 12B shows the 5 highest and 5 lowest ranked responses to the 26 statements (with abbreviated descriptions). All groups rated the "Aha!" or "Penny-Drop Moment" (PDM) as a key motivational factor for solving cryptics; closely allied with this was the statement "Solving well-written clues gives me a buzz - it makes me smile or laugh out loud" which was ranked 4th in importance. In a separate question ("Is your enjoyment of the 'penny-drop' moment enhanced if you have had to struggle with the clue?") only 11 of the 797 respondents (1.4%) claimed not to have had a PDM when solving cryptics, whereas 634 (79.5%) agreed that it had been strengthened by a stiff challenge. All 5 top-rated statements related to intrinsically motivated reasons, primarily concerned with intellectual challenge and the joyful and satisfying feeling of cracking a clue. There was no statistically significant difference between the groups for any of these statements. The lowest rated statements were all concerned with extrinsically driven reasons such as competition, prestige or collaboration: the median score for all of these questions was 1 across all groups, indicating a rejection of these suggestions.
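The effect size r quoted with the Wilcoxon signed-rank test above follows the usual convention r = z / sqrt(N). The sketch below assumes that N is the total number of observations (two per respondent, so 2 × 786). That assumption is an inference from the reported figures, not a method stated in the text, and the function name is illustrative.

```python
import math


def effect_size_r(z: float, n_observations: int) -> float:
    """Effect size r for a rank-based test: z divided by sqrt(N observations)."""
    return abs(z) / math.sqrt(n_observations)


n_respondents = 786
# Assumption: each respondent contributes one IM score and one EM score.
r_wilcoxon = effect_size_r(24.23, 2 * n_respondents)

print(round(r_wilcoxon, 2))
```

Under this assumption the calculation reproduces the reported r = 0.61; dividing by the square root of the number of respondents alone would give a noticeably larger value.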
Respondents were not generally drawn to cryptic crossword solving in order to learn new words (see Table 12B, other items): this suggestion came 14th, and the average response score was significantly lower than the mid-point (3)

DISCUSSION
So far as we are aware, this study is the first to employ a detailed and wide-ranging survey in a performance domain to characterize the nature of expert and non-expert participants across a large number of dimensions. This novel approach proved highly effective, with our results directly influencing subsequent research and providing empirical support for the direction and design of key elements in our studies.
Our findings fully supported the first hypothesis, that cryptic crossword solvers would generally be academically able adults, given the cognitive complexity of the puzzle demands. Over 80% of participants had achieved a university qualification of at least BA/BSc (with nearly 12% of these at doctorate level). This is approximately 8-10 times higher than UK university participation for the most relevant time-period. Furthermore, participants generally pursued cognitively complex post-university careers, with average Holland complexity scores for all groups being indicative of highly-trained, graduate or post-graduate professions. In all group comparisons, the (nonsignificant) trend was for complexity and academic achievement to be higher for E than for O solvers.
Our second hypothesis (that solvers' education and occupation would predominantly be in scientific or IT-related fields, rather than in language-related fields) was also supported. STEM subjects accounted for over half of university courses, compared to just over a quarter for Wordsmith specialisms. Post-university careers in the STEM and Finance areas continued this trend. Overall, nearly a quarter of all participants worked in IT, rising to almost a third among the S group. Indeed, S participants were significantly more likely to have studied Math and to have worked in the areas of IT or Banking/Accountancy than the other groups. Furthermore, when data were viewed through the prism of RIASEC coding, the code "IRE" (e.g., computing, math, engineering, chemistry) was noticeably prominent and increased with expertise: significantly so, in the case of education. Both education and occupation codings as a whole showed a very strong bias toward the RIASEC code "I," generally thought to denote analytical, scholarly, scientific and research-oriented individuals; again this was particularly prominent for S solvers. Comparison with US workforce norms indicated that the level of employment in "I" occupations among our survey population was four times greater than might have been expected. Finally, even in their spare time, solvers opted for hobbies which were weighted toward intellectually stimulating activities ("I"); and the logical challenge of Sudoku (RIASEC code "IC", 9.8% hobby responses) was more popular than word puzzles, languages and writing (RIASEC code "AI", 7.3% hobby responses). Interestingly, playing Scrabble and solving non-cryptic crosswords were comparatively unpopular hobbies.
Wordsmith skills were thus dwarfed by STEM involvement, but were not entirely irrelevant to our participants. Reading for pleasure was a common hobby, although wherever Wordsmith activities occurred, trends (though non-significant) indicated slightly higher levels for H solvers (e.g., pursuing "A" RIASEC-coded hobbies, following "A" RIASEC-coded careers, studying English at university, aiming to increase vocabulary through crosswords, and reading as a hobby). Conversely, S solvers were less likely than H or O to study English/languages at university and to pursue a Wordsmith career. It is possible, therefore, that word skills are a particular feature of H participation in cryptic solving, but that they do not translate into Super-Expert performance, where coding/analytical aspects of cryptic solving may be more relevant.
The findings relating to the first two hypotheses influenced our research in a number of ways. Firstly, they provided strong corroborative endorsement of the cognitive drivers we hypothesized would be key to solving cryptics. Our data thus appear to confirm that cryptics particularly appeal to academically able, logical and analytical thinkers with strong mathematical/computing aptitude. As Underwood originally suggested, cryptic crossword skill therefore appears to be bound up with code-cracking and problem-solving skills of a quasi-algebraic nature (Underwood et al., 1988). Conversely, lexical ability, although no doubt valuable, does not appear to be a critical discriminator of high expertise among elite solvers. Although tests of lexical breadth and word retrieval did form part of our subsequent research programs, they were therefore not our primary focus: in this we deviated from the Nottingham studies (Underwood et al., 1994; Deihim-Aazami, 1999).
Given the high academic achievement across the entire sample, we hypothesized for our later trials that cognitively straightforward tests of WM load (e.g., simple and complex digit span tasks, or tests of visual short-term memory) would be unlikely to discriminate among groups as effectively as challenging fluid intelligence tasks, which (like cryptics) require the segregation and assembly of multiple task parts and the understanding of their controlling rules (Duncan et al., 2012). Fluid intelligence testing had already been shown not to distinguish between expertise levels in the Nottingham trials (Underwood et al., 1994; Deihim-Aazami, 1999); however, our survey results indicated that the test originally selected in these studies (the AH4 (Heim, 1970), designed for those who ceased education at 18) would have been wholly underpowered for the assessment of such a highly academically qualified population, leading to ceiling effects. A rerun of this comparison using the more appropriate AH5 test (Heim, 1968) was therefore a key priority for our research.
Our third hypothesis, that cryptic crosswords regularly trigger "Aha!" moments and function as a classic type of insight problem through misdirection, was also confirmed. All groups rated the "Aha!" moment as a key motivational reason for solving cryptics, and strongly agreed that its intensity was enhanced by the need to struggle with the clue. Confirmation that insight moments are indeed a key feature of the cryptic experience provided an important research rationale for subsequent phases of our studies, in which we explored individual differences in the ability to resist or resolve deliberate misdirection within our crossword solving population.
The "Aha!" finding also had profound implications for our selection of a representative task to capture the essence of expertise in cryptic solving during lab-based trials. Whereas other similar trials (Deihim-Aazami, 1999; Tuffiash et al., 2007) had followed de Groot's highly influential paradigms involving isolated, briefly presented challenges (de Groot, 1946/1965; Gobet et al., 2004), we felt that this would elicit only meager and comparatively trivial process-tracing data for our research domain, while missing key areas of interest. Indeed, Tuffiash acknowledged similar concerns in his Scrabble study, reporting that "the verbal reports of our SCRABBLE players mainly consisted of strings of candidate solution words" (Tuffiash et al., 2007, p. 129), in response to a one-shot "best-next-move" challenge, restricting their subsequent analysis to a head-count of bingo words, non-bingo words and illegal words. Similarly, Deihim-Aazami reported a high proportion of lean responses to the talk-aloud task (the solving of 37 independent clues without a grid), commenting that "many experts gave incomplete verbalizations and this was due to the simplicity of the clues." (Deihim-Aazami, 1999, p. 124).
For this reason, in the GECA, we deliberately chose to set our participants the more representative task of solving an entire professionally-commissioned, high-quality cryptic crossword with grid. Performance was video-recorded and transcribed for both verbal and action-based data. In this way, we hoped to exploit the full potential of the think-aloud protocol, capturing a wide range of strategically important ancillary factors such as: chosen solving order of clues; length of time spent in impasse on each clue before moving onto another; frequency of return to an obstinately resistant item; perseveration with an incorrect solution pathway; the antecedents of "Aha!" solution moments; and the use of cross-checking letters as solution prompts. The approach also permitted data capture on the clarity of understanding of clue architecture, frequency of dictionary use, handwritten jottings (such as candidate anagram letters) and the shifting emotional state of our participants (e.g., frustration, triumph, laughter). In this we deviated from the Nottingham studies which had decided on a priori grounds that a grid layout was unnecessary since "true expertise is reflected in the ability to tackle the clue and not in whether there are intersecting letters" (Deihim-Aazami, 1999, p. 83). By contrast, we felt a reluctance to impose any such preconceived ideas upon the current study; rather, we preferred to allow the verbal protocol paradigm itself to elicit rich, unconstrained and ecologically valid process-tracing data, and to allow this to drive the identification of key cognitive and metacognitive drivers of high expertise in this domain.
We anticipated (hypothesis four) that solvers would generally enjoy effortful cognitive activity in all spheres of life including work and hobbies, and that this would be an important driver of cryptic crossword participation. This, too, was supported by our findings. Solvers voluntarily chose to engage with intellectually and culturally stimulating activities (music, theater, the arts) coded to "I" and "A" (RIASEC) in their leisure time, and their Need for Cognition scores (Cacioppo et al., 1984) were significantly higher than the test mid-point. The general trend was for enjoyment of effortful thought to increase with expertise, and a number of analyses comparing O and E solvers either achieved or approached statistical significance.
Allied to this were the findings that crossword solving is indeed an intrinsically motivated activity, undertaken for the love of mental stimulation and for personally gratifying insight rewards (hypothesis five). Extrinsically motivated reasons, such as prizes, competitions or impressing others, were not important to our participants, who also showed a significantly more intrinsically than extrinsically motivated profile in relation to their workplace role. We anticipated that practice/engagement levels for both expert and non-expert solvers would consequently be low and relatively unstructured compared to high profile competitive performance areas such as chess, Scrabble and music, where those who aspire to monetary rewards and worldwide prestige must undertake a heavy and inescapable burden of intense practice and effortful learning. This proved the case: time spent solving crosswords each week amounted to only 6-7 h, with no statistical difference between expertise groups. This equates to only 45-60 min per day, or 1-2 crosswords at typical solving speeds. Excluding those who set crosswords professionally, participants in all groups recorded only minimal time spent on other crossword related activity, amounting to approximately 20 min per day. Yet, at the end of several decades of solving, participants had achieved quite different expertise outcomes. Differences in the levels of day-to-day solving activities were therefore not significant, and seemed unlikely to account for performance differences between the solver groups.
Similarly, hypothesis six (that cryptic crossword solving would not normally begin in childhood, in view of the cognitive complexity of the task, but was more likely to have commenced in late teenage years) was also supported. This has important implications for the exploration of practice effects in the domain: unlike many other expertise domains (famously, chess and music) the solving of cryptic crosswords is a hobby chosen voluntarily at the onset of adulthood and pursued primarily for intrinsic enjoyment. Study of the skill acquisition phase and of the role of practice in this domain is therefore refreshingly free from potential confounds such as early childhood practice routines, intense parental pressure or extrinsically rewarded competition circuits. Our subsequent research pursued this line, looking at various aspects of cryptic crossword experience, including the degree of difficulty and the range of crosswords tackled, and the definition and extent (if any) of deliberate practice (Ericsson et al., 1993; Hambrick et al., 2014b; Moxley et al., 2015) in this domain.

CONCLUSION
Research into expert performance has traditionally centered upon a limited number of domains and has explored only a small number of aspects such as practice, starting age and WM capacity, based on a priori assumptions about the skill-sets required for excellence in the field. Cryptic crosswords have the potential to bring fresh perspectives to the debate: at the highest level of performance, the domain is represented by an elite body of demonstrably top-class practitioners; yet it is atypically unburdened by extreme practice routines, and is motivated by intrinsic rather than highly competitive extrinsic rewards. We have chosen to investigate this promising domain using a new, broad-based paradigm: the Grounded Expertise Components Approach.
Our paradigm makes two important modifications to the traditional Expert-Performance Approach. First, we argue that although it may be useful to conduct a paper review of hypothesized key cognitive skills in the domain, it is misleading to do this in isolation, without exploring the nature of those who undertake this activity, across a wide number of dimensions.
A detailed and open-minded characterization of the target population has proved invaluable in exploring the cryptic crossword domain, and has provided a secure empirical research rationale on which to base subsequent studies. We believe that it is a valid and valuable approach for other performance domains, and that this should be the first task in any such research program. By contrast, under the traditional paradigm, a questionnaire is more usually conducted at the end of the research, and used as a vehicle to capture experience and practice details alone.
One of the strengths of the GECA is thus its ability to suggest productive avenues for future research based on a secure rationale, and this has been the case for UK-style cryptic crosswords. An additional bonus is its ability to throw up interesting and relatively unexpected findings, such as the significant association between left-handedness and professional cryptic crossword setting, the comparatively high level of active musical participation amongst solvers, and the greater proportion of males apparently engaged in the domain, all of which we intend to investigate further. We suggest that, by framing the initial investigation widely, researchers can avoid that circularity of research which captures data only for theoretically predicted elements, and then attempts to restrict the characterization of the activity as a whole to these highly circumscribed results.
Secondly, we argue that care should be taken in choosing a representative lab-based task which enables a full exploration of all facets of performance in the field, exploiting the talk-aloud protocol to its full potential. One-shot, isolated tasks can never reflect the complete range of skills and strategies involved in carrying out the actual task itself, and the aim should be to replicate the demands of the full challenge wherever possible in order to identify key drivers of expertise. Where isolated tasks are chosen as the basis of the EPA procedure there is a risk that investigator preconceptions may lead to unintentional research biases, resulting in the overestimation of the importance of particular niche skills of interest, and the failure to seek or observe potentially key data on other relevant cognitive factors and broader strategic elements.
In conclusion, we agree wholeheartedly with the recent verdict of Hambrick and colleagues: "For scientists, the task now is to develop and test falsifiable theories of expertise that include as many relevant constructs as possible" (Hambrick et al., 2014a, p. 11). The time is therefore right for a more broad-based approach; and we believe that the Grounded Expertise Components Approach should play a role in this process.

AUTHOR CONTRIBUTIONS
KF designed the survey and analyzed data via an Access database. PF and KF reviewed data and agreed coding treatments. KF drafted the article and KF and PF reviewed and finalized it.

ACKNOWLEDGMENTS
The authors are indebted to a great number of people in the cryptic crossword community. We wish to thank in particular the owners and administrators of the websites who allowed us to advertise for participants (Times for the Times, 15 Squared, Big Dave's Crossword Blog, and Derek Harrison's Crossword Centre); John Green for advertising the survey with the annual Listener statistics; Peter Biddlecombe for recruiting volunteers at the Times National Crossword Championship; the team at the Magpie crossword magazine; Chris Lancaster and John de Cuevas for generously sharing with us the results of their informal surveys; the late and much-missed Roddy Forman for his unfailing and enthusiastic support of the research; and John Henderson and Jane Teather, who have played a major role in facilitating face-to-face contact with solvers. We are also grateful to Paige Vanleer for research assistance in coding parts of the data. A final thank you is due to all the participants who have taken part in our research. Many have now contributed to more than one phase in our crossword studies and we are extremely grateful for their continued faith in the value of our research and their enthusiasm to take part. The layouts of typical US-style and UK-style grids in Figure 1 were generated using Antony Lewis' Crossword Compiler software (version 8.1): http://www.crossword-compiler.com/. The survey was made available on the Internet via SurveyMonkey® (www.SurveyMonkey.com, Palo Alto, CA).

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00567