Course Wrap-up & Synthesis

Seminar ‘Corpus Linguistics’

Quirin Würschinger, LMU Munich

July 24, 2025

Session Overview

  • Last week’s exercises
  • Recap of previous sessions
  • Key concepts synthesis
  • Methods & tools consolidation
  • Research pathways & future directions
  • Course feedback & reflection

Recap of What We’ve Done

Sessions 01–06: Foundations & Core Methods

Building the foundation

  1. Organisation & introduction to corpus linguistics
  2. Sketch Engine skills (concordancing, frequency)
  3. Lexical innovation & diffusion patterns
  4. Morphology & word-formation processes
  5. Meaning analysis, collocations, word sketches (clipping study: Hilpert, Correia Saavedra, and Rains 2023)
  6. Creating corpora (principles & practice)

Core skills acquired: corpus search, frequency analysis, morphological investigation, corpus design

Basic concordancing skills

Frequency analysis foundations

Sessions 07–10: Advanced Applications

From methods to research

  7. Syntax analysis (constructions, CQL, entrenchment research)
  8. Research project planning & term paper methodology
  9. Linguistic variation (tag questions, social factors)
  10. Language change (modal verbs, diachronic analysis)

Research skills developed: syntactic investigation, sociolinguistic analysis, historical corpus research, project design

Construction Grammar analysis

Tag questions variation study

Language Change: Modal Verbs Over Time

Modal verb frequency changes over time

Session 10: tracking systematic language change patterns
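
Diachronic comparisons only work when raw hit counts are normalised by corpus size, because sub-corpora for different periods rarely have the same number of tokens. A minimal Python sketch of that step follows. The function names and the figures in `toy_data` are invented placeholders for illustration, not results from the session or from any real corpus.

```python
def per_million(hits: int, corpus_size: int) -> float:
    """Normalise a raw hit count to a frequency per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits * 1_000_000 / corpus_size


def percent_change(earlier: float, later: float) -> float:
    """Relative change between two normalised frequencies, in percent."""
    if earlier == 0:
        raise ValueError("earlier frequency must be non-zero")
    return (later - earlier) / earlier * 100


# Invented placeholder figures, NOT taken from any corpus.
toy_data = {
    "period_1": {"hits": 500, "tokens": 2_000_000},
    "period_2": {"hits": 300, "tokens": 3_000_000},
}

normalised = {
    period: per_million(d["hits"], d["tokens"])
    for period, d in toy_data.items()
}

for period, freq in normalised.items():
    print(f"{period}: {freq:.1f} per million tokens")

change = percent_change(normalised["period_1"], normalised["period_2"])
print(f"change from period_1 to period_2: {change:.1f}%")
```

Comparing the raw counts alone (500 vs 300) would understate the drop here, since the second toy sub-corpus is larger than the first.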

Key Concepts Synthesis

Corpus Linguistics as Method & Theory

What we’ve learnt about corpus linguistics

  • Empirical foundation: language use patterns in large datasets
  • Frequency effects: usage frequency shapes linguistic structure
  • Contextual meaning: distributional semantics and collocational profiles
  • Variation & change: systematic patterns across time, space, social groups
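
Collocational profiles rest on an association score that relates how often two words co-occur to how often each occurs overall. The sketch below computes logDice, the score shown in Sketch Engine's word sketches, as 14 + log2(2·f(x,y) / (f(x) + f(y))). The function name and the example counts are assumptions made for illustration only.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """logDice association score for a node word x and a collocate y.

    f_xy: co-occurrence frequency of x and y
    f_x:  total frequency of x
    f_y:  total frequency of y
    """
    if f_xy <= 0:
        raise ValueError("f_xy must be positive")
    if f_xy > min(f_x, f_y):
        raise ValueError("f_xy cannot exceed f_x or f_y")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Invented counts: a pair that always co-occurs reaches the maximum of 14.
print(log_dice(10, 10, 10))

# Invented counts for a weaker association.
print(round(log_dice(50, 1000, 200), 2))
```

Because the score depends only on the three frequencies and not on corpus size, values can be compared across corpora more easily than raw co-occurrence counts.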

Theoretical frameworks encountered:

  • Construction Grammar (syntax study, Session 07)
  • Principle of No Synonymy (clipping study, Session 05)
  • Entrenchment & Conventionalisation (syntactic patterns, Session 07)
  • Sociolinguistic variation theory (tag questions, Session 09)

Language Levels & Phenomena

Corpus methods across linguistic levels

Level            | Phenomena                   | Methods                                    | Focus
-----------------+-----------------------------+--------------------------------------------+----------------------------
Lexis            | Innovation, diffusion       | Frequency tracking, diachronic analysis    | Patterns of lexical change
Morphology       | Word-formation, clipping    | Collocational analysis, meaning comparison | Form-meaning relationships
Syntax           | Constructions, CQL patterns | Frequency analysis, entrenchment measures  | Usage-based grammar
Sociolinguistics | Variation, change           | Social factor analysis, diachronic comparison | Language and society

Corpus linguistics = a versatile methodology across all language levels

Morphology vs Word-formation

Theoretical framework from Session 04

Methods & Tools

Methodological Skills

Your corpus linguistics toolkit

Search & Analysis:

  • Sketch Engine: concordancing, frequency lists, word sketches
  • CQL (Corpus Query Language): advanced pattern searches
  • Collocational analysis: meaning through co-occurrence
  • Diachronic comparison: tracking change over time
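
To make the idea behind concordancing concrete outside Sketch Engine, here is a minimal keyword-in-context (KWIC) sketch in Python. It assumes whitespace-tokenised text and exact, case-insensitive word-form matching. The function `kwic` and the example sentence are invented for illustration and are far simpler than a real concordancer.

```python
def kwic(tokens, node, window=4):
    """Return (left context, keyword, right context) for each hit of `node`."""
    node = node.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Invented example text, pre-tokenised by whitespace.
text = "You can come tomorrow , can't you ? She must leave now , mustn't she ?"
tokens = text.split()

# Print the hits with the keyword aligned in a central column.
for left, keyword, right in kwic(tokens, "must", window=3):
    print(f"{left:>25}  {keyword}  {right}")
```

Sorting the returned lines by the first word of the right or left context would be a natural next step, mirroring the sort options of a concordancer.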

Research Design:

  • Corpus selection: appropriate data for research questions
  • Hypothesis formation: testable predictions about language use
  • Critical evaluation: corpus limitations and biases

CQL Builder interface

Advanced search capabilities
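
A CQL query describes a sequence of tokens, each constrained by attributes such as word form, lemma, or part-of-speech tag. The Python sketch below imitates that logic for a query like [tag="MD"] [lemma="have"] [tag="VB.*"]. It is a toy illustration of the matching principle, not Sketch Engine's implementation. The function name, the hand-annotated tokens, and the Penn-Treebank-style tags are assumptions, and the tagset of an actual corpus may differ.

```python
import re


def match_sequence(tokens, pattern):
    """Return start indices where every token satisfies its constraint.

    tokens:  list of dicts with the keys "word", "lemma", "tag"
    pattern: list of dicts mapping an attribute name to a regular
             expression that must match the whole attribute value
    """
    hits = []
    n = len(pattern)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(
            re.fullmatch(regex, tok[attr])
            for tok, constraint in zip(window, pattern)
            for attr, regex in constraint.items()
        ):
            hits.append(start)
    return hits


# Hand-annotated toy sentence (tags are an assumption, Penn-Treebank style).
tokens = [
    {"word": "She", "lemma": "she", "tag": "PRP"},
    {"word": "must", "lemma": "must", "tag": "MD"},
    {"word": "have", "lemma": "have", "tag": "VB"},
    {"word": "left", "lemma": "leave", "tag": "VBN"},
]

# Rough equivalent of the CQL query: [tag="MD"] [lemma="have"] [tag="VB.*"]
pattern = [{"tag": "MD"}, {"lemma": "have"}, {"tag": "VB.*"}]

for start in match_sequence(tokens, pattern):
    print(" ".join(t["word"] for t in tokens[start:start + len(pattern)]))
```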

Corpus Resources & Data Access

English corpora overview

Sessions 06 & 10: from corpus principles to historical English corpora

From Data to Insights

The corpus research process

  1. Research question formulation → clear, testable hypotheses
  2. Corpus selection → appropriate data for your question
  3. Query design → CQL and search strategies
  4. Data extraction → concordances, frequency lists, collocations
  5. Pattern analysis → quantitative and qualitative interpretation
  6. Contextual interpretation → linguistic and social meaning
  7. Critical reflection → methodological limitations and insights
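
For the quantitative side of step 5, one common option is to test whether a frequency difference between two corpora is larger than chance would predict. The sketch below uses the log-likelihood statistic (G2) that is widely used in corpus comparison. This is a suggested technique, not one named in the course materials, and the function name and all counts are invented placeholders.

```python
import math


def log_likelihood(freq_1: int, freq_2: int, size_1: int, size_2: int) -> float:
    """G2 for one item's frequency in two corpora of the given sizes."""
    total_freq = freq_1 + freq_2
    total_size = size_1 + size_2
    # Expected frequencies if the item were equally common in both corpora.
    expected_1 = size_1 * total_freq / total_size
    expected_2 = size_2 * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_1, expected_1), (freq_2, expected_2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Invented placeholder counts, NOT taken from any corpus.
g2 = log_likelihood(freq_1=500, freq_2=300, size_1=2_000_000, size_2=3_000_000)

# 3.84 is the chi-square critical value for p < 0.05 with 1 degree of freedom.
print(f"G2 = {g2:.2f}; exceeds 3.84: {g2 > 3.84}")
```

A significant G2 only indicates that the difference is unlikely to be chance. Steps 6 and 7 remain necessary to judge whether it is linguistically meaningful or an artefact of corpus composition.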

You can now conduct independent corpus-based research

Research Pathways & Future Directions

Contemporary Corpus Linguistics

Where corpus linguistics is heading

Technological advances:

  • Large language models and corpus data
  • Multimodal corpora (text, speech, video)
  • Real-time corpus compilation (social media, web scraping)
  • Automated annotation (parsing, semantic tagging)

Emerging research areas:

  • Digital humanities applications
  • Forensic linguistics and authorship analysis
  • Second language acquisition and learner corpora
  • Computational sociolinguistics and social media analysis

Course Feedback & Reflection

What worked well?

Your thoughts on the course

Please reflect on:

  • Which sessions were most valuable for your learning?
  • What aspects of corpus linguistics excited you most?
  • Which practical skills do you feel most confident about?
  • How well did the balance of theory and practice work?
  • What connections did you make to your other studies?

Discussion time: Share your comments and insights

Areas for improvement

What could be improved?

Consider:

  • Which topics needed more time or clearer explanation?
  • Were there technical difficulties that slowed learning?
  • What additional topics would you have liked to cover?
  • How could the session structure be improved?
  • What resources would be helpful for continued learning?

Constructive feedback welcome: Your input shapes future courses

Thank you for an engaging semester!

Questions, discussion, final thoughts?

References

Hilpert, Martin, David Correia Saavedra, and Jennifer Rains. 2023. “Meaning Differences Between English Clippings and Their Source Words: A Corpus-Based Study.” ICAME Journal 47 (1): 19–37. https://doi.org/10.2478/icame-2023-0002.