AUTHOR=Grieve Jack , Montgomery Chris , Nini Andrea , Murakami Akira , Guo Diansheng TITLE=Mapping Lexical Dialect Variation in British English Using Twitter JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 2 - 2019 YEAR=2019 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2019.00011 DOI=10.3389/frai.2019.00011 ISSN=2624-8212 ABSTRACT=There is a growing trend in sociolinguistics and dialectology to analyse large corpora of social media data, but it is unclear if the results of these studies can be generalised to language as a whole. To assess the generalisability of Twitter dialect maps, this paper presents the first systematic comparison of regional lexical variation in Twitter corpora and traditional survey data. We compare the regional patterns found in 139 lexical dialect maps based on a 1.8 billion word corpus of geolocated UK Twitter data and the BBC Voices dialect survey. A spatial analysis of these 139 map pairs finds a strong alignment between these two data sources, offering evidence that both approaches to data collection allow for the same basic underlying regional patterns to be identified. We conclude that these results license the use of Twitter corpora for general inquiries into regional linguistic variation and change.