Lexis, innovation, diffusion, and frequency
Lexis and lexical innovation
Lexical innovation and diffusion
Society continually changes as new practices and products emerge (e.g. smartphones).
These changes typically surface first in the lexicon, in the form of neologisms (e.g. the words smartphone or iPhone).
Which recent neologisms can you think of?
- to google
- staycation, baecation
- medfluencer
- byelingual
- glamping
- hangry
- influencer
- gender pay gap
- kindergarchy
- cringe
- social distancing, frontline worker
- millennium bug
- to furlough, dotard
Patterns
- motivations
  - communicative need
  - describe a feeling/phenomenon
  - one-off
  - appeal
- forms
  - blending
Knowledge of words is conventional: speakers learn form-meaning pairings.
Model of the Linguistic Sign (de Saussure 1916)

Theoretical framework
S-curve model
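The S-curve model describes how an innovation spreads: slow initial uptake, a phase of rapid growth, then levelling off. It is applied to linguistic innovation, diffusion, and language change.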

Integration of Milroy’s and Rogers’ models of diffusion stages into an S-curve (Kerremans 2015: 65).
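The S-shape is often formalized as a logistic function. The sketch below illustrates that shape under this assumption; the source does not give a formula, and the parameter names (`midpoint`, `rate`, `ceiling`) and the 11-step timeline are hypothetical choices for illustration only.

```python
import math


def logistic(t, midpoint=0.0, rate=1.0, ceiling=1.0):
    """Share of adopters at time t under a logistic (S-shaped) curve.

    All parameters are illustrative assumptions, not values from the literature.
    """
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Hypothetical timeline: 11 time steps, with the midpoint of diffusion at step 5.
curve = [logistic(t, midpoint=5, rate=1.0) for t in range(11)]

# Print a crude text plot: slow start, rapid middle phase, levelling off.
for t, share in enumerate(curve):
    print(f"t={t:2d}  share={share:.3f}  " + "#" * round(share * 40))
```

Changing `rate` makes the middle phase steeper or flatter; changing `midpoint` shifts when the rapid phase occurs.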
The Entrenchment and Conventionalization Model (Schmid 2015, Schmid 2020)
- The more frequently a word is used, the more likely it is
  - that speakers have stored it in their mental lexicon (entrenchment) and
  - that it is part of the conventional language system (conventionalization).
- Usage, entrenchment, and conventionalization are interconnected.

(p. 4)
Operationalization
Frequency as an indicator of entrenchment and conventionality (Stefanowitsch and Flach 2017).
- corpus-as-input: language used in corpora represents potential exposure to speakers
- corpus-as-output: language used by speakers in corpora represents potential degrees of entrenchment
Pathways of diffusion
- types of linguistic variation and diffusion

- dimensions of diffusion
- across speakers and communities
- across text types
- examples of different degrees of diffusion

Empirical analyses of diffusion based on frequency
Würschinger, Quirin. 2021. ‘Social Networks of Lexical Innovation. Investigating the Social Dynamics of Diffusion of Neologisms on Twitter’. Frontiers in Artificial Intelligence 4:106. https://doi.org/10.3389/frai.2021.648583.
Total frequency
Most frequent

Median

Least frequent

Cumulative frequency

Temporal dynamics of use

Volatility
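Two words can have the same total frequency but very different temporal profiles. The sketch below computes cumulative frequency and one possible volatility measure, the coefficient of variation (standard deviation divided by mean). Using this measure is an assumption made here for illustration, and the monthly counts are made-up placeholders, not results from any corpus.

```python
from itertools import accumulate
from statistics import mean, pstdev


def cumulative(counts):
    """Running total of usage counts over successive time slices."""
    return list(accumulate(counts))


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean of the counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(counts) / m


# Made-up monthly counts for two hypothetical words with equal totals.
steady = [10, 12, 11, 13, 12, 14]  # used at a fairly constant rate
bursty = [0, 1, 60, 8, 2, 1]       # one short spike, little use otherwise

for name, counts in [("steady", steady), ("bursty", bursty)]:
    print(name, cumulative(counts), round(coefficient_of_variation(counts), 2))
```

Both series end at the same cumulative total, but the spiky series has a much higher coefficient of variation, which is the kind of contrast that total frequency alone hides.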

Investigating frequency using Sketch Engine
Frequency over time
Using the English Trends (2014–today) corpus; example: blockchain
Timeline view

Frequency view


Frequency across text types
Using the enTenTen21 corpus; example: alt-right

Topic text type
Practice: Lexical innovation
Use the above case study words:
- upskill
- hyperlocal
- solopreneur
- alt-right
- alt-left
- poppygate
Tasks
- In the English Trends (2014–today) corpus:
  - determine total frequency for each word
    - example for alt-right:
      - query: [lemma="alt-right"]
      - absolute total frequency: 95,094
      - relative total frequency: 0.96 per million words/tokens (pmw)
  - identify the year of highest usage
    - sort by year
    - use Frequency (raw) and Relative in text type (per million words)
- In the enTenTen21 corpus:
  - determine in which Genre each word was used most frequently
    - example for alt-right:
- Compare results:
  - Which words show highest/lowest degrees of conventionality?
  - For which words is there a discrepancy to the results on Twitter?
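Relative frequency makes counts comparable across corpora of different sizes: it is the absolute frequency divided by the corpus size, scaled to one million tokens. The sketch below applies this to the alt-right figures from the task. The function names are hypothetical helpers, and the corpus size is not stated in these notes, so it is back-calculated here; because 0.96 is a rounded value, the result is only an approximation, and the actual size should be read from the corpus information in Sketch Engine.

```python
def per_million(absolute, corpus_size):
    """Relative frequency per million tokens (pmw)."""
    return absolute / corpus_size * 1_000_000


def implied_corpus_size(absolute, relative_pmw):
    """Corpus size implied by an absolute count and a pmw value (approximate)."""
    return absolute / relative_pmw * 1_000_000


# Figures for alt-right as given in the task above.
ABSOLUTE = 95_094
RELATIVE_PMW = 0.96

# Back-calculated size: an estimate, since the pmw value is rounded.
size = implied_corpus_size(ABSOLUTE, RELATIVE_PMW)
print(f"implied corpus size: about {size / 1e9:.1f} billion tokens")
print(f"round trip: {per_million(ABSOLUTE, size):.2f} pmw")
```

The same `per_million` helper can be used to normalize the counts for the other case study words once their absolute frequencies and the corpus size have been looked up.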


Social dynamics of diffusion