Linguistic Variation

Seminar ‘Corpus Linguistics’

Quirin Würschinger, LMU Munich

July 10, 2025

Session Overview

  • Research projects group work
  • Linguistic variation in the use of tag questions
    • theory: Tottie and Hoffmann (2006)
      • structure
      • variation between BrE and AmE
      • social variation (gender, age, social class)
    • practice: BNC 2014 Spoken corpus on SkE

Workshop Projects

Tag Questions

What are tag questions?

  • consist of an anchor clause + a tag
  • subject in tag must be pronoun, there, or one
  • verb in tag must be auxiliary or modal

Polarity Patterns

  • reversed polarity most common
    • + -: It’s hot today, isn’t it?
    • - +: It isn’t working, is it?
  • constant positive and constant negative possible but rare
    • + +: She can drive, can she?
    • - -: He can’t swim, can’t he?
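
The four patterns above can be written down as a small labelling scheme. This is a minimal sketch, assuming the polarity of anchor and tag has already been annotated by hand; the function names `polarity_pattern` and `is_reversed` are illustrative and do not come from Tottie and Hoffmann (2006).

```python
def polarity_pattern(anchor_negated: bool, tag_negated: bool) -> str:
    """Return the slide notation ("+ -", "- +", "+ +", "- -") for one tag question."""
    anchor = "-" if anchor_negated else "+"
    tag = "-" if tag_negated else "+"
    return f"{anchor} {tag}"


def is_reversed(anchor_negated: bool, tag_negated: bool) -> bool:
    """Reversed polarity: anchor and tag differ in polarity."""
    return anchor_negated != tag_negated


# The four examples from the list above, with hand-annotated polarity
# as (sentence, anchor negated?, tag negated?).
examples = [
    ("It’s hot today, isn’t it?", False, True),
    ("It isn’t working, is it?", True, False),
    ("She can drive, can she?", False, False),
    ("He can’t swim, can’t he?", True, True),
]

for sentence, anchor_neg, tag_neg in examples:
    kind = "reversed" if is_reversed(anchor_neg, tag_neg) else "constant"
    print(polarity_pattern(anchor_neg, tag_neg), kind, sentence, sep="\t")
```

Detecting negation in the anchor automatically is harder than in the tag, which is one reason the sketch takes annotated values as input.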

Study background

Tottie, Gunnel, and Sebastian Hoffmann. 2006. ‘Tag Questions in British and American English’. Journal of English Linguistics 34 (4): 283–311. https://doi.org/10.1177/0075424206294369.

  • compares tag questions in British (BNC-S) and American (LSAC) spoken corpora
  • examines form, polarity and pragmatic functions
  • explores sociolinguistic factors: gender, age, social class
  • finds tags about nine times more frequent in British English than in American English; reversed polarity dominant (≈ 75 %)
  • women use tags more frequently and for facilitative purposes

Pragmatic Functions

  • Informational
    • You live in London, don’t you?
  • Confirmatory
    • This report was sent yesterday, wasn’t it?
  • Facilitating
    • Let’s move on, shall we?
  • Attitudinal
    • That was a fantastic game, wasn’t it?
  • Peremptory
    • Close the window, will you?
  • Aggressive
    • You’re going to mess this up again, aren’t you?

Variation Across British and American English

  • BNC-S: the spoken demographic subpart of the British National Corpus
  • LSAC: the Longman Spoken American Corpus

Auxiliary Choice

  • modal auxiliaries dominate (especially will, would)
  • regional differences in choice of be vs do

Pronouns and Phrases

Sociolinguistic Factors

Gender Comparison (Women vs Men)

Age-Group Comparison

Interim Summary

  • reversed polarity tags most frequent
  • significant BrE/AmE contrasts
  • gender & age influence usage
  • form–function mapping varies with context

Practice: Corpus Study with Sketch Engine

Corpus: BNC 2014 Spoken

Objective: replicate and extend Tottie and Hoffmann’s (2006) analysis on present-day data

Study Questions

About the structure of tag questions:

  1. How many false positives did your query return?
  2. Which question tags are most frequent (e.g. is n’t it, are n’t you, etc.)?
  3. Which pronouns are most frequent?
  4. Which verbs are most frequent?
  5. Are negated or positive tags more frequent?

About their use in different contexts:

  1. How do tag questions differ in different age groups?
  2. How do tag questions differ in different genders?
  3. How do tag questions differ in different social classes?
  4. How do tag questions differ in different educational levels?

Further study:

  1. Which polarity patterns are most frequent? (e.g. + - vs - +)

Overview of steps

Step 1: Retrieve Attestations

CQL query:

1:[lemma="be|do|have" | tag="MD"] 2:[lemma="not"]? 3:[tag="PP.?"] [word="\?"]

Hints

  • You can exclude utterance-initial hits with <u> []{1,} … within <u/>.
  • You can exclude preceding wh-words with [lemma!="where|who|which|when"].
  • You can exclude cases where a verb follows the pronoun immediately by adding [tag!="V.*"] after the pronoun in your pattern.
  • You can exclude cases where an adjective follows the pronoun by adding [tag!="J.*"] after the pronoun.
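
To see what the query and the first two hints do, the pattern can be approximated outside Sketch Engine on a list of tokens. This is only a rough sketch: it assumes tokens split as in the corpus display (is n’t it ?), and it replaces the lemma and POS constraints with small closed word lists (`AUXILIARIES`, `NEGATORS`, `PRONOUNS`, `WH_WORDS`) that are illustrative assumptions, not the corpus tagset. The function name `find_tags` is hypothetical.

```python
# Word lists standing in for the lemma/tag constraints of the CQL query
# (assumption: plain word forms, no POS tags available).
AUXILIARIES = {
    "am", "is", "are", "was", "were", "do", "does", "did", "has", "have", "had",
    "can", "could", "will", "would", "shall", "should", "may", "might", "must",
}
NEGATORS = {"n't", "not"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "there", "one"}
WH_WORDS = {"where", "who", "which", "when"}


def find_tags(tokens):
    """Return (auxiliary, negator or None, pronoun) for each candidate tag in one utterance."""
    # Lower-case and map the typographic apostrophe to the plain one.
    words = [t.lower().replace("’", "'") for t in tokens]
    hits = []
    for i, word in enumerate(words):
        if word not in AUXILIARIES:
            continue
        if i == 0:
            continue  # first hint: skip utterance-initial hits
        if words[i - 1] in WH_WORDS:
            continue  # second hint: skip hits preceded by a wh-word
        j = i + 1
        negator = None
        if j < len(words) and words[j] in NEGATORS:
            negator = words[j]  # optional negation slot
            j += 1
        # The pronoun must be followed directly by the question mark.
        if j + 1 < len(words) and words[j] in PRONOUNS and words[j + 1] == "?":
            hits.append((word, negator, words[j]))
    return hits


print(find_tags("it 's hot today is n't it ?".split()))
print(find_tags("where is it ?".split()))
```

Because the word lists are much cruder than the tagged corpus, the results will differ from the Sketch Engine concordance; the sketch is meant for reasoning about the pattern, not for replacing the query.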

Step 2: Evaluate False Positives

  • download a random sample of 50 hits
  • annotate in Excel (Label column: 0 = false positive, 1 = true positive)
  • refine the CQL query or use Excel filters to reduce noise
  • Model sheet (template): https://1drv.ms/x/s!AvkgNVl9yS6aokmqDbTz5BmfbU6C
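
Once the sample is annotated, the share of true positives (precision) can be computed from the Label column and used to estimate how many of all hits are real tag questions. The following is a minimal sketch, assuming the sheet has been exported as CSV with a `Label` column as described above; the `Hit` column name, the toy rows, and the total of 1,000 hits are invented for illustration only.

```python
import csv
import io


def read_labels(csv_file, column="Label"):
    """Read the 0/1 annotations from a CSV export of the annotation sheet."""
    return [int(row[column]) for row in csv.DictReader(csv_file)]


def precision(labels):
    """Share of true positives (label 1) among all annotated hits."""
    if not labels:
        raise ValueError("no annotated hits")
    return sum(labels) / len(labels)


# Toy stand-in for an exported sheet; these rows are made up, not corpus data.
toy_sheet = io.StringIO(
    "Hit,Label\n"
    "is n't it ?,1\n"
    "do you ?,0\n"
    "are n't they ?,1\n"
    "have n't you ?,1\n"
)

labels = read_labels(toy_sheet)
p = precision(labels)
print(f"precision in sample: {p:.2f} ({sum(labels)} of {len(labels)} hits)")

# Extrapolate to the full result set (hypothetical total number of hits).
total_hits = 1000
estimated_true = round(p * total_hits)
print(f"estimated true tag questions: {estimated_true} of {total_hits}")
```

With a real export, `open("sample.csv", newline="", encoding="utf-8")` would take the place of the `io.StringIO` object; the file name is again just a placeholder.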

Step 3: Frequency of Tag Forms

  • most frequent complete tag phrases (is n’t it, are n’t you, etc.)
  • most frequent pronouns
  • most frequent verbs
  • export frequency tables to Excel for charts
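
The same counts can also be produced outside Sketch Engine from a list of hits. This is a small sketch using `collections.Counter`, assuming each hit is stored as a triple (verb form, negator or `None`, pronoun); the five hits below are invented to show the mechanics and are not corpus results. Note that it counts word forms, not lemmas.

```python
from collections import Counter

# Invented example hits: (verb form, negator or None, pronoun).
hits = [
    ("is", "n't", "it"),
    ("is", "n't", "it"),
    ("are", "n't", "you"),
    ("do", None, "you"),
    ("is", None, "it"),
]


def tag_phrase(hit):
    """Join the parts of a hit into a complete tag phrase, skipping a missing negator."""
    return " ".join(part for part in hit if part)


phrase_freq = Counter(tag_phrase(hit) for hit in hits)
verb_freq = Counter(verb for verb, _, _ in hits)
pronoun_freq = Counter(pronoun for _, _, pronoun in hits)
polarity_freq = Counter("negative" if neg else "positive" for _, neg, _ in hits)

for name, freq in [
    ("tag phrases", phrase_freq),
    ("verbs", verb_freq),
    ("pronouns", pronoun_freq),
    ("tag polarity", polarity_freq),
]:
    print(name, freq.most_common(3))
```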

Step 4: Distribution across Metadata

cross-tabulate frequencies by

  • `Age range`
  • `Gender`
  • `Class: Social grade`
  • `Highest qualification`
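
Raw counts per group are only comparable if the groups contributed the same amount of speech, so frequencies are usually normalised, for example per million tokens. A minimal sketch of that step follows, assuming one metadata value per hit and known subcorpus sizes; the gender codes, the hit list, and the token counts are all invented for illustration and are not BNC 2014 figures.

```python
from collections import Counter


def per_million(count, tokens):
    """Relative frequency per million tokens."""
    return count * 1_000_000 / tokens


# Invented data: the speaker gender recorded for each tag-question hit.
hit_genders = ["F", "F", "M", "F", "M"]

# Invented subcorpus sizes in tokens for each group.
subcorpus_tokens = {"F": 2_000_000, "M": 1_000_000}

raw = Counter(hit_genders)
normalised = {
    group: per_million(raw[group], size)
    for group, size in subcorpus_tokens.items()
}

for group in subcorpus_tokens:
    print(group, "raw:", raw[group], "per million:", normalised[group])
```

In this toy data the group with more raw hits has the lower relative frequency, which is why the comparison across age, gender, class, and education should rest on normalised values.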

Summary & Discussion

  • compare BNC 2014 findings with Tottie and Hoffmann (2006)
  • reflect on methodological challenges (false positives, metadata)
  • discuss potential research extensions

References

Tottie, Gunnel, and Sebastian Hoffmann. 2006. “Tag Questions in British and American English.” Journal of English Linguistics 34 (4): 283–311. https://doi.org/10.1177/0075424206294369.