class: inverse, center, middle
background-image: url(data:image/png;base64,#https://cc.oulu.fi/~scoats/oululogoRedTransparent.png);
background-repeat: no-repeat;
background-size: 80px 57px;
background-position:right top;
exclude: true

---
class: title-slide

<br><br><br><br><br>
.pull-right[
<span style="font-family:Rubik;font-size:24pt;font-weight: 700;font-style: normal;float:right;text-align: right;color:white;-webkit-text-fill-color: black;-webkit-text-stroke: 0.8px;">Regional Distribution of the /el/-/æl/ Merger in Australian English</span>
]
<br><br><br><br>
<p style="float:right;text-align: right;color:white;font-weight: 700;font-style: normal;-webkit-text-fill-color: black;-webkit-text-stroke: 0.5px;">
Steven Coats<sup>1</sup>, Chloé Diskin-Holdaway<sup>2</sup>, and Debbie Loakes<sup>2</sup><br>
<sup>1</sup>University of Oulu, Finland; <sup>2</sup>University of Melbourne, Australia<br>
<a href="mailto:steven.coats@oulu.fi">steven.coats@oulu.fi</a><br>
12<sup>th</sup> VarDial Workshop, Abu Dhabi<br>
January 19th, 2025<br>
</p>

---
layout: true

<div class="my-header"><img border="0" alt="University of Oulu logo" src="https://cc.oulu.fi/~scoats/oululogonewEng.png" width="80" height="80"><img border="0" alt="University of Melbourne logo" src="./uni_melb_logo.png" width="80" height="80"></div>
<div class="my-footer"><span>Coats, Diskin-Holdaway, & Loakes                     /el/-/æl/ Merger | VarDial Abu Dhabi</span></div>

---
exclude: true

## Outline

1. Introduction and Background: Prelateral vowel merger
2. Data: CoANZSE
3. Methods: Vowel extraction, Bhattacharyya difference, spatial autocorrelation
4. Results
5. 
Caveats, summary and outlook

.footnote[Slides for the presentation are on my homepage at https://cc.oulu.fi/~scoats]

---
### Introduction and Background

Traditional view: regional variation is limited. "Australia is, generally speaking, linguistically unified" <span class="small">(Mitchell & Delbridge 1965: 13)</span>

AusE has “begun to exhibit more widespread social and regional variation than has previously been acknowledged” <span class="small">(Cox & Fletcher 2017: 20)</span>

**This study: investigation of regional variation in the prelateral merger of /e/ and /æ/ in a large dataset**

- Prelateral merger of /e/ and /æ/ (e.g., ***celery = salary***, ***Ellen = Alan***)
- Investigated in small-scale studies in Southern Victoria/Melbourne <span class="small">(Diskin et al. 2019; Diskin-Holdaway et al. 2024; Loakes et al. 2017, 2024)</span>
- **What about in large datasets and in other locations?**

---
### Methods: Vowel extraction

- Data from CoANZSE Audio (https://coanzse.org) <span class="small">(Coats 2024a, 2024b)</span>
- ASR transcripts and audio from 38,786 videos uploaded to the YouTube channels of Australian councils in 404 locations
- Alignment with the Montreal Forced Aligner <span class="small">(McAuliffe et al. 2017)</span>
- F1 and F2 formant values for /e/ and /æ/ extracted at vowel midpoints using Parselmouth <span class="small">(Jadoul et al. 2018)</span>, a Python interface to Praat <span class="small">(Boersma & Weenink 2024)</span>
- Vowels were extracted in two contexts: prelateral (/æl/ and /el/, e.g. *value*, *well*) and non-prelateral (/æC/ and /eC/, e.g. 
*fact*, *next*)
- Phoneme labels from the CMU Pronouncing Dictionary <span class="small">(Weide et al. 1998)</span>
- After filtering (only stressed syllables, only locations with at least 20 tokens in all contexts): 4,297,259 vowel tokens from 252 locations
- F1 and F2 formant values normalized with Nearey's method

---
### Methods: Bhattacharyya difference, spatial autocorrelation

- Many recent studies use **Pillai's trace** to quantify vowel overlap.
- Pillai's trace comes from MANOVA and is suited to quantifying additional covariates and generating p-values
- We use the ***Bhattacharyya distance*** <span class="small">(Bhattacharyya 1943)</span> `\(D_B\)`, defined as:

`$$D_B = -\log{\int \sqrt{P(x) \cdot Q(x)} \, dx}$$`

for two probability distributions `\(P\)` and `\(Q\)` (i.e., the densities `\(f_{1}\)` and `\(f_{2}\)` of the two vowel categories)
- We calculate the difference between the two contexts at each location:

`$$\text{Diff}_B = D_{\text{/el/}} - D_{\text{/æl/}}$$`

---
### Methods: Spatial autocorrelation

- Moran's *I* and Getis-Ord *G*<span class='supsub'><sup>*</sup><sub>i</sub></span> computed from `\(\text{Diff}_B\)`
- Spatial weights matrix `\(W\)` based on a distance band, with `\(w_{ij}=1/d_{ij}\)`
- Minimum distance: all locations must have at least one neighbor

Moran's *I* <span class="small">(Moran 1950)</span>: takes into account attribute values at all locations in a dataset and summarizes the overall extent of spatial correlation
- 1: perfect clustering; 0: random distribution; -1: perfect dispersion

Getis-Ord *G*<span class='supsub'><sup>*</sup><sub>i</sub></span> <span class="small">(Getis & Ord 1992; Ord & Getis 1995)</span>: identifies spatial clusters by comparing the value at each location with the values at neighboring locations, relative to the global dataset
- Positive: location value above the global mean; ~0: near the global mean; negative: below the global mean

---
### Results: Regional distribution

<iframe width="860" height="515" src="https://stcoats.github.io/AU_Bhatt_map.html" 
frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---
### Results: Regional distribution, *G*<span class='supsub'><sup>*</sup><sub>i</sub></span> values

<iframe width="860" height="515" src="https://stcoats.github.io/AU_Bhatt_Gi_map_v2.html" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---
### Caveats, summary, outlook

- ASR and formant extraction methods can introduce errors
- No demographic data, but some can be semi-automatically annotated <span class="small">(cf. Bredin 2023; Plaquet & Bredin 2023; Ferreira 2024)</span>
- Pipeline approaches can be used to automatically process large data volumes and extract formants for millions of vowels
- Merger confirmed as a Southern Victoria/Melbourne phenomenon
- Need to look more closely at Western Australia

---
#### References

.small[
.hangingindent[
.pull-left[
Bhattacharyya, A. (1943). On a measure of divergence between two statistical populations defined by their probability distributions. *Bulletin of the Calcutta Mathematical Society* 35, 99–110.

Boersma, P. and Weenink, D. (2024). [Praat: Doing phonetics by computer](https://www.praat.org/).

Bredin, H. (2023). [pyannote.audio 2.1 speaker diarization pipeline: Principle, benchmark, and recipe](https://www.isca-archive.org/interspeech_2023/bredin23_interspeech.html). In *INTERSPEECH 2023*, 1983-1987.

Coats, S. (2024a). Building a searchable online corpus of Australian and New Zealand aligned speech. *Australian Journal of Linguistics*.

Coats, S. (2024b). [CoANZSE Audio: Creation of an online corpus for linguistic and phonetic analysis of Australian and New Zealand Englishes](https://aclanthology.org/2024.lrec-main.302). In *Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)*, 3407-3412.

Cox, F. and Fletcher, J. (2017). 
*Australian English pronunciation and transcription*, 2nd edition. Cambridge University Press.

Diskin, C., Loakes, D., Billington, R., Stoakes, H., Gonzalez, S., and Kirkham, S. (2019). The /el/-/æl/ merger in Australian English: Acoustic and articulatory insights. In *Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019*, 1764–1768.

Diskin-Holdaway, C., Loakes, D., and Clothier, J. (2024). Variability in cross-language and cross-dialect perception: How Irish and Chinese migrants process Australian English vowels. *Phonetica* 81(1), 1–41.

Getis, A. and Ord, J. K. (1992). [The analysis of spatial association by use of distance statistics](https://doi.org/10.1111/j.1538-4632.1992.tb00261.x). *Geographical Analysis* 24, 189-206.
]
.pull-right[
Jadoul, Y., Thompson, B., and de Boer, B. (2018). [Introducing Parselmouth: A Python interface to Praat](https://doi.org/10.1016/j.wocn.2018.07.001). *Journal of Phonetics* 71, 1-15.

Loakes, D., Clothier, J., Hajek, J., and Fletcher, J. (2024). Sociophonetic variation in vowel categorization of Australian English. *Language and Speech* 67(3), 870–906.

Loakes, D., Hajek, J., and Fletcher, J. (2017). Can you t[æ]ll I’m from M[æ]lbourne? *English World-Wide* 38(1), 29–49.

McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., and Sonderegger, M. (2017). [Montreal Forced Aligner: Trainable text-speech alignment using Kaldi](https://doi.org/10.21437/Interspeech.2017-1386). In *Proceedings of the 18th Conference of the International Speech Communication Association*, 498-502.

Mitchell, A. G. and Delbridge, A. (1965). *The speech of Australian adolescents: A survey*. Angus and Robertson.

Moran, P. A. P. (1950). [Notes on continuous stochastic phenomena](http://www.jstor.org/stable/2332142). *Biometrika* 37, 17-23.

Ord, J. K. and Getis, A. (1995). 
[Local spatial autocorrelation statistics: Distributional issues and an application](https://doi.org/10.1111/j.1538-4632.1995.tb00912.x). *Geographical Analysis* 27, 286-306.

Plaquet, A. and Bredin, H. (2023). [Powerset multi-class cross entropy loss for neural speaker diarization](https://doi.org/10.21437/Interspeech.2023-205). In *INTERSPEECH 2023*, 3222-3226.

Weide, R. et al. (1998). [The Carnegie Mellon Pronouncing Dictionary](http://www.speech.cs.cmu.edu/cgi-bin/cmudict).
]
]]
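
---
### Appendix: Computing the Bhattacharyya distance

A minimal sketch of the distance computation (not the pipeline's actual code): if each vowel category's normalized (F1, F2) cloud is approximated as a bivariate Gaussian, `\(D_B\)` has a closed form. The function name and the simulated formant values below are illustrative only.

```python
import numpy as np

def bhattacharyya_distance(cloud_p, cloud_q):
    """Closed-form Bhattacharyya distance between Gaussians fitted to
    two (n x 2) arrays of normalized (F1, F2) formant measurements."""
    m_p, m_q = cloud_p.mean(axis=0), cloud_q.mean(axis=0)
    s_p = np.cov(cloud_p, rowvar=False)
    s_q = np.cov(cloud_q, rowvar=False)
    s = (s_p + s_q) / 2                      # pooled covariance
    diff = m_p - m_q
    # Mahalanobis-like term for the separation of the means
    mean_term = diff @ np.linalg.solve(s, diff) / 8
    # term penalizing mismatched covariance shapes
    cov_term = 0.5 * np.log(
        np.linalg.det(s) / np.sqrt(np.linalg.det(s_p) * np.linalg.det(s_q))
    )
    return mean_term + cov_term
```

At each location, `\(\text{Diff}_B\)` is then the /el/-context distance minus the /æl/-context distance, as defined on the methods slide.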
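
---
### Appendix: Moran's *I* sketch

A toy illustration (made-up coordinates and values, not CoANZSE data) of global Moran's *I* with the inverse-distance weights `\(w_{ij}=1/d_{ij}\)` inside a distance band described on the methods slide. The function name and the `band` parameter are illustrative.

```python
import numpy as np

def morans_i(values, coords, band):
    """Global Moran's I for `values` observed at `coords` (n x 2),
    with weights 1/d_ij within `band` (0 beyond it and on the diagonal)."""
    x = np.asarray(values, dtype=float)
    # pairwise Euclidean distances between all locations
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    safe_d = np.where(d == 0, 1.0, d)        # avoid division by zero on the diagonal
    w = np.where((d > 0) & (d <= band), 1.0 / safe_d, 0.0)
    z = x - x.mean()                         # deviations from the global mean
    return len(x) / w.sum() * (w * np.outer(z, z)).sum() / (z ** 2).sum()
```

On a small grid, a smooth spatial gradient of values yields a positive *I* (clustering), while a checkerboard pattern yields a negative *I* (dispersion), matching the interpretation on the methods slide.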