





Regional Distribution of the /el/-/æl/ Merger in Australian English





Steven Coats¹, Chloé Diskin-Holdaway², and Debbie Loakes²
¹University of Oulu, Finland; ²University of Melbourne, Australia
steven.coats@oulu.fi
12th VarDial Workshop, Abu Dhabi
January 19th, 2025


Introduction and Background

Traditional view: Regional variation is limited. "Australia is, generally speaking, linguistically unified" (Mitchell & Delbridge 1965: 13)

AusE has “begun to exhibit more widespread social and regional variation than has previously been acknowledged” (Cox and Fletcher, 2017, p. 20)

This study: Investigation of regional variation in the prelateral merger of /e/ and /æ/ in a large dataset

  • Prelateral merger of /e/ and /æ/ (e.g., celery = salary, Ellen = Alan)

  • Investigated in small-scale studies in Southern Victoria/Melbourne (Diskin et al. 2019; Diskin-Holdaway et al. 2024; Loakes et al. 2017, 2024)

  • What about in large datasets and in other locations?


Methods: Vowel extraction

  • Data from CoANZSE Audio (https://coanzse.org) (Coats 2024a,b)
  • ASR transcripts and audio from 38,786 videos uploaded to YouTube channels of Australian councils in 404 locations
  • Alignment with the Montreal Forced Aligner (McAuliffe et al. 2017)
  • F1 and F2 formant values for /e/ and /æ/ extracted at vowel midpoints using Parselmouth (Jadoul et al. 2018), a Python interface to Praat (Boersma & Weenink 2024)
  • Vowels were extracted in two contexts: prelateral (/æl/ and /el/, e.g. value, well) and non-prelateral (/æC/ and /eC/, e.g. fact, next)
  • Vowel classes assigned from the CMU Pronouncing Dictionary (Weide et al. 1998)
  • After filtering (only stressed syllables, only locations with at least 20 tokens in all contexts): 4,297,259 vowel tokens from 252 locations
  • F1 and F2 values normalized with Nearey's method
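
The normalization step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the formant-intrinsic variant of Nearey's method (log formant minus the mean log formant of the same group), and the grouping key `group` and the function name `nearey1_normalize` are hypothetical. The slides do not state which Nearey variant or which grouping unit was used.

```python
import math
from collections import defaultdict

def nearey1_normalize(tokens):
    """Log-mean normalize F1 and F2 within each group.

    tokens: list of dicts with keys 'group', 'f1', 'f2' (formants in Hz).
    Returns new dicts with added 'f1_norm' and 'f2_norm' keys.
    Assumption: 'group' is whatever unit normalization is done over
    (e.g. a speaker or a video); the source does not specify this.
    """
    # Accumulate sum of log F1, sum of log F2 and token count per group.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for t in tokens:
        acc = sums[t["group"]]
        acc[0] += math.log(t["f1"])
        acc[1] += math.log(t["f2"])
        acc[2] += 1

    normalized = []
    for t in tokens:
        s1, s2, n = sums[t["group"]]
        normalized.append({
            **t,
            "f1_norm": math.log(t["f1"]) - s1 / n,  # log F1 minus group mean log F1
            "f2_norm": math.log(t["f2"]) - s2 / n,  # log F2 minus group mean log F2
        })
    return normalized
```

After this step, the normalized values of each formant average to zero within every group, which removes much of the between-group difference in overall formant scale.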

Methods: Bhattacharyya Difference, Spatial Autocorrelation

  • Many recent studies use Pillai's trace to quantify vowel overlap.
  • Pillai's trace comes from MANOVA and is useful for incorporating additional covariates and generating p-values
  • We use the Bhattacharyya distance (Bhattacharyya 1943) $D_B$, defined as:

$$D_B = -\log \int \sqrt{P(x)\,Q(x)}\,dx$$

for two multivariate probability distributions $P$ and $Q$ (i.e., the F1/F2 distributions of /e/ and /æ/)

  • For each location, we calculate $D_B$ between /e/ and /æ/, based on all measurements at that location, for
    • 1) Non-prelateral contexts (vC), and
    • 2) Prelateral contexts (vL)

We then calculate the difference between the two contexts at each location:

$$\mathrm{Diff}_B = D_{vC} - D_{vL}$$
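
A minimal sketch of the per-location calculation follows. It assumes each vowel's (F1, F2) cloud is modeled as a bivariate Gaussian, for which the Bhattacharyya distance has a closed form; the slides do not say how the integral was evaluated, so this is one common choice rather than the authors' method. The function names are hypothetical, and the subtraction order (non-prelateral minus prelateral) follows the order in which the two contexts are listed above.

```python
import math

def mean_cov(points):
    """Sample mean and 2x2 covariance of a list of (F1, F2) pairs."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    s11 = sum((p[0] - m1) ** 2 for p in points) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in points) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in points) / (n - 1)
    return (m1, m2), (s11, s12, s22)

def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance between two samples, assuming each is Gaussian."""
    (ma1, ma2), (a11, a12, a22) = mean_cov(a)
    (mb1, mb2), (b11, b12, b22) = mean_cov(b)
    # Average of the two covariance matrices.
    c11, c12, c22 = (a11 + b11) / 2, (a12 + b12) / 2, (a22 + b22) / 2
    det_c = c11 * c22 - c12 ** 2
    det_a = a11 * a22 - a12 ** 2
    det_b = b11 * b22 - b12 ** 2
    d1, d2 = ma1 - mb1, ma2 - mb2
    # Quadratic form d^T C^{-1} d, written out for the 2x2 case.
    quad = (c22 * d1 * d1 - 2 * c12 * d1 * d2 + c11 * d2 * d2) / det_c
    return quad / 8 + 0.5 * math.log(det_c / math.sqrt(det_a * det_b))

def diff_b(e_vc, ae_vc, e_vl, ae_vl):
    """Distance in non-prelateral contexts minus distance in prelateral contexts."""
    return bhattacharyya_gaussian(e_vc, ae_vc) - bhattacharyya_gaussian(e_vl, ae_vl)
```

Under this subtraction order, a larger positive value means the two vowels are closer together before /l/ than elsewhere at that location, which is the pattern a prelateral merger would produce.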


Methods: Spatial autocorrelation

  • Moran's I and Getis-Ord $G_i^*$ calculated from $\mathrm{Diff}_B$
    • Spatial weights matrix $W$ based on a distance band, with $w_{ij} = 1/d_{ij}$
    • Distance threshold: the minimum distance at which every location has at least one neighbor

Moran's I (Moran 1950): Takes into account attribute values at all locations in a dataset and summarizes the overall extent of spatial correlation

  • 1: perfect clustering, 0: random distribution, -1: perfect dispersion
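
As a rough illustration of the global statistic, the sketch below builds an inverse-distance weights matrix inside a distance band and computes Moran's I from it. It is a simplified stand-in, not the authors' code: it assumes planar coordinates with no two locations at the same point, uses straight-line distance, and sets the band to the largest nearest-neighbor distance so that every location has at least one neighbor. The function names are hypothetical.

```python
import math

def distance_band_weights(coords):
    """Inverse-distance weights within the smallest band that leaves no location isolated."""
    n = len(coords)
    d = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    # Smallest threshold at which every location has at least one neighbor.
    threshold = max(min(d[i][j] for j in range(n) if j != i) for i in range(n))
    return [
        [1.0 / d[i][j] if i != j and d[i][j] <= threshold else 0.0 for j in range(n)]
        for i in range(n)
    ]

def morans_i(values, w):
    """Global Moran's I for one value per location and a weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(x * x for x in dev)

# Toy example: four locations on a line, one value each.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = distance_band_weights(coords)
print(morans_i([1.0, 1.0, -1.0, -1.0], w))   # similar values are adjacent
print(morans_i([1.0, -1.0, 1.0, -1.0], w))   # values alternate along the line
```

The first call gives a positive value because like values sit next to each other; the second gives a negative value because every neighbor pair differs.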

Getis-Ord $G_i^*$ (Getis & Ord, 1992; Ord & Getis, 1995): Identifies spatial clusters by evaluating the values at each location in the dataset in comparison with neighboring locations, in relation to the global dataset

  • Positive: values at and around the location are above the global mean; 0: close to the global mean; negative: below the global mean
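
The local statistic can be sketched in the same spirit. The function below uses the standardized form of $G_i^*$ from Ord & Getis (1995), in which each location counts as its own neighbor. It takes any weights matrix with the self-weights already filled in; how the self-weight was set in this study is not stated in the slides, so the toy example simply uses binary weights with 1 on the diagonal. The function name is hypothetical.

```python
import math

def getis_ord_gi_star(values, w):
    """Standardized Getis-Ord Gi* score for every location.

    values: one number per location (here, a Diff_B-like score).
    w: square weights matrix whose diagonal holds the self-weights.
    Assumes the values are not all identical (otherwise the spread is zero).
    """
    n = len(values)
    mean = sum(values) / n
    spread = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        sum_w = sum(w[i])
        sum_w2 = sum(x * x for x in w[i])
        local = sum(w[i][j] * values[j] for j in range(n))  # weighted local sum
        denom = spread * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append((local - mean * sum_w) / denom)
    return scores

# Toy example: four locations on a line, neighbors are adjacent points plus self.
w = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
print(getis_ord_gi_star([4.0, 4.0, 0.0, 0.0], w))
```

In the toy example the two high-valued locations receive positive scores and the two low-valued locations receive negative ones, with the strongest scores at the ends of the line, where the whole neighborhood is uniformly high or low.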

Results: Regional distribution


Results: Regional distribution, $G_i^*$ values


Caveats, summary, outlook

  • ASR and formant extraction methods can introduce errors
  • No demographic data, but some can be semi-automatically annotated (cf. Bredin 2023; Plaquet & Bredin 2023; Ferreira 2024)
  • Pipeline approaches can be used to automatically process large data volumes and extract formants for millions of vowels
  • Merger confirmed as Southern Victoria/Melbourne phenomenon
  • Need to look more closely at Western Australia

References

Bhattacharyya, A. (1943). On a measure of divergence between two statistical populations defined by their probability distribution. Bulletin of the Calcutta Mathematical Society 35, 99–110.

Boersma, P. and Weenink, D. (2024). Praat: Doing phonetics by computer.

Bredin, H. (2023). pyannote.audio 2.1 speaker diarization pipeline: Principle, benchmark, and recipe. In INTERSPEECH 2023, 1983-1987.

Coats, S. (2024a). Building a searchable online corpus of Australian and New Zealand aligned speech. Australian Journal of Linguistics.

Coats, S. (2024b). CoANZSE Audio: Creation of an online corpus for linguistic and phonetic analysis of Australian and New Zealand Englishes. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 3407-3412.

Cox, F. and Fletcher, J. (2017). Australian English pronunciation and transcription, 2nd edition. Cambridge University Press.

Diskin, C., Loakes, D., Billington, R., Stoakes, H., Gonzalez, S., and Kirkham, S. (2019). The /el/-/æl/ merger in Australian English: Acoustic and articulatory insights. In Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019, pp. 1764–1768.

Diskin-Holdaway, C., Loakes, D., and Clothier, J. (2024). Variability in cross-language and cross-dialect perception. How Irish and Chinese migrants process Australian English vowels. Phonetica, 81(1):1–41.

Getis, A. and Ord, J. K. (1992). The Analysis of Spatial Association by Use of Distance Statistics. Geographical Analysis 24, 189-206.

Jadoul, Y., Thompson, B., and de Boer, B. (2018). Introducing Parselmouth: A Python interface to Praat. Journal of Phonetics 71, 1-15.

Loakes, D., Clothier, J., Hajek, J., and Fletcher, J. (2024). Sociophonetic variation in vowel categorization of Australian English. Language and Speech 67(3), 870–906.

Loakes, D., Hajek, J., and Fletcher, J. (2017). Can you t[æ]ll I’m from M[æ]lbourne? English World-Wide, 38(1), 29–49.

McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., and Sonderegger, M. (2017). Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In Proceedings of the 18th Conference of the International Speech Communication Association, 498-502.

Mitchell, A. G. and Delbridge, A. (1965). The speech of Australian adolescents: A survey. Angus and Robertson.

Moran, P. A. P. (1950). Notes on Continuous Stochastic Phenomena. Biometrika 37, 17-23.

Ord, J. K. and Getis, A. (1995). Local Spatial Autocorrelation Statistics: Distributional Issues and an Application. Geographical Analysis 27, 286-306.

Plaquet, A. and Bredin, H. (2023). Powerset multi-class cross entropy loss for neural speaker diarization. In INTERSPEECH 2023, 3222-3226.

Weide, R. and others. (1998). The Carnegie Mellon Pronouncing Dictionary.
