class: inverse, center, middle background-image: url(data:image/png;base64,#https://cc.oulu.fi/~scoats/oululogoRedTransparent.png); background-repeat: no-repeat; background-size: 80px 57px; background-position:right top; exclude: true --- class: title-slide <br><br><br><br><br> .pull-right[ <span style="font-family:Rubik;font-size:24pt;font-weight: 700;font-style: normal;float:right;text-align: right;color:yellow;-webkit-text-fill-color: black;-webkit-text-stroke: 0.8px;">YouTube Phonetics Pipeline Workshop</span> ] <br><br><br><br> <p style="float:right;text-align: right;color:yellow;font-weight: 700;font-style: normal;-webkit-text-fill-color: black;-webkit-text-stroke: 0.5px;"> Steven Coats<br> English, University of Oulu, Finland<br> <a href="mailto:steven.coats@oulu.fi">steven.coats@oulu.fi</a><br> ALOES Pre-conference Workshop, Paris<br> March 28th, 2024<br> </p> --- layout: true <div class="my-header"><img border="0" alt="University of Oulu logo" src="https://cc.oulu.fi/~scoats/oululogonewEng.png" width="80" height="80"></div> <div class="my-footer"><span>Steven Coats                          YouTube Phonetics Pipeline | ALOES Pre-conference workshop</span></div> --- exclude: true <div class="my-header"><img border="0" alt="University of Oulu logo" src="https://cc.oulu.fi/~scoats/oululogonewEng.png" width="80" height="80"></div> <div class="my-footer"><span>Steven Coats                          YouTube Phonetics Pipeline | ALOES Pre-conference workshop</span></div> ## Outline 1. Background 2. yt-dlp 3. Montreal Forced Aligner 4. Praat-Parselmouth 5. Examples: Double modals, acoustic analysis pipeline 6. Caveats, summary .footnote[Slides for the presentation are on my homepage at https://cc.oulu.fi/~scoats] --- ### Background - Vast amounts of streamed audio, video, and transcript data are available online - Standard technical protocols for online streaming: DASH and HLS - Creation of specialized corpora for specific locations/topics/speech genres 1. Transcript corpora from YouTube (or other platforms): CoANZSE, CoNASE, CoBISE, CoGS - Analysis of grammar/syntax, lexis, pragmatics, discourse 2. Audio extraction and forced alignment - Visualization and analysis of phonetic and prosodic variation 3. 
Video extraction - Analysis of multimodal communication --- ### Existing tools - FAVE, PolyglotDB/ISCAN, LaBB-CAT, DARLA, WebMAUS, EMU-SDMS - All are versatile and well-designed tools - FAVE and PolyglotDB users may run into issues with dependency inconsistencies - LaBB-CAT requires setting up an Apache Tomcat server and many other installation steps - EMU-SDMS is for analysis, not alignment - DARLA is not customizable - WebMAUS is only customizable to a limited extent - None of these tools is suitable for downloading DASH-streamed content --- exclude: true ### Pipeline for acoustic analysis ![:scale 50%](data:image/png;base64,#./Github_phonetics_pipeline_screenshot.png) - A Jupyter notebook for Python that collects transcripts and audio from YouTube, aligns the transcripts, and extracts vowel formants - Click your way through the process in a Google Colab environment - Can be used for any language that has ASR transcripts - With a few script modifications, also works for manual transcripts https://github.com/stcoats/phonetics_pipeline --- ### Potential advantages of a notebook pipeline - Dependency conflict issues are minimal - Can be used immediately without extensive setup of servers or databases - All modules are customizable - Can be adapted to non-YouTube content - Can be adapted to handle large amounts of data --- ### Potential disadvantages - If used in Colab, potential processor/memory/storage issues - Formant extraction technique could be optimized - Customization requires Python scripting --- ### Component: yt-dlp .pull-left[ ![](data:image/png;base64,#yt-dlp_screenshot.png) ] .pull-right[ - Open-source fork of youtube-dl - When a viewer accesses a video on the YouTube web page, a cryptographic key is generated. Video, audio, and transcript content is then streamed to the browser using this key - yt-dlp gets this key to access the content streams (video, audio, captions, etc.) - Can be used to access any content streamed with the DASH or HLS protocols (a minimal download sketch is included in the appendix) ] --- ### Component: Montreal Forced Aligner <span class="small">(McAuliffe et al. 2017)</span> .pull-left30[ .small[ - Forced alignment aligns a transcript to an audio track so that the exact start and end times of segments (words, phones) can be determined - Necessary for automated analysis of vowel quality or other phonetic analysis - MFA may perform better than some other aligners (P2FA, MAUS) - MFA can be fragile (sensitive to installation environment and version dependencies) ] ] .pull-right70[ ![](data:image/png;base64,#mfa_screenshot.png) ] --- ### Component: Praat-Parselmouth <span class="small">(Jadoul et al. 2018)</span> - Python interface to Praat, a widely used program for acoustic analysis <span class="small">(Boersma & Weenink 2023)</span> - Integration into Python simplifies workflows and analysis ![:scale 75%](data:image/png;base64,#praat_screenshot.png) --- ### Google Colab - Google Colaboratory is a hosted notebook environment for running Python or R code - You need a Google account to use Colab - Advantages include access to GPU/TPU, collaborative editing, cloud-based execution, and integration with code on GitHub/GitLab ![:scale 70%](data:image/png;base64,#Colab.png) --- ### Colab link ## https://t.ly/3HhGJ Go there now and we will align a video from YouTube! --- exclude: true ### Example: CoANZSE Audio - Cut YouTube transcripts into 20-word chunks - Using transcript timing information and the DASH manifest, extract the audio segment for each chunk with yt-dlp - Feed audio and transcript excerpts to Montreal Forced Aligner <span class="small">(McAuliffe et al. 
2017)</span> - <span class="small">Grapheme to phoneme dictionary, pronunciation dictionary: US ARPAbet</span> - <span class="small">Acoustic model: from Librispeech Corpus (Panayotov et al. 2015)</span> - <span class="small">Language model: MFA English 2.0.0</span> - Output is Praat textgrids - Get features of interest from textgrids + audio chunks with Praat-Parselmouth - Analyze phenomena of interest (formants, voice onset time, pitch, etc.) --- exclude: true ### Example: Excerpt from a video of the City of Adelaide <span class="small">(former mayor Sandy Verschoor, https://www.youtube.com/watch?v=f-GX8-qszPE)</span> <iframe width="500" height="400" controls src="https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/adelaide_output.mp4" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" scrolling="no" sandbox allowfullscreen></iframe> --- exclude: true ### Example video <iframe width="560" height="315" src="https://www.youtube.com/embed/cn8vWlUae7Y?rel=0&&showinfo=0&cc_load_policy=1&cc_lang_pref=en" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> --- exclude: true ### WebVTT file ![](data:image/png;base64,#./Maranoa_webvtt_example.png) --- exclude: true ### YouTube captions files - Videos can have multiple captions files: user-uploaded captions, auto-generated captions created using automatic speech recognition (ASR), or both, or neither - User-uploaded captions can be manually created or generated automatically by 3rd-party ASR software - Auto-generated captions are generated by YT's speech-to-text service - CoANZSE, CoNASE, CoBISE: target YT ASR captions --- exclude: true ### CoANZSE and other YouTube ASR Corpora Corpus of Australian and New Zealand Spoken English - [CoANZSE](https://cc.oulu.fi/~scoats/CoANZSE.html): 196m tokens, 472 locations, 56k transcripts corresponding to 24,007 hours of video from 2007-2022 <span class="small">(Coats 2023a)</span> Corpus of North American Spoken English - [CoNASE](https://cc.oulu.fi/~scoats/CoNASE.html): 1.25b token corpus of 301,846 word-timed, part-of-speech-tagged Automatic Speech Recognition (ASR) transcripts <span class="small">(Coats 2023c, also available with a searchable online interface: https://lncl6.lawcorpus.byu.edu)</span> Corpus of Britain and Ireland Spoken English - [CoBISE](https://cc.oulu.fi/~scoats/CoBISE.html): 112m tokens, 452 locations, 38,680 ASR transcripts <span class="small">(Coats 2022b)</span> Corpus of German Speech - [CoGS](https://cc.oulu.fi/~scoats/CoGS.html): 50.5m tokens, 1,308 locations, 39.5k transcripts <span class="small">(Coats in review)</span> All are freely available for research use; download from the Harvard Dataverse ([CoNASE](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/X8QJJV), [CoBISE](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/UGIIWD), [CoGS](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3Y1YVB), [CoANZSE](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GW35AK)) --- exclude: true ### Data collection and processing - Identification of relevant channels (lists of councils with web pages -> scrape pages for links to YouTube) - Inspection of returned channels to remove false positives - Retrieval of ASR transcripts using [YT-DLP](https://github.com/yt-dlp/yt-dlp) - Geocoding: String containing council name + address + country location to Google's geocoding service - 
PoS tagging with SpaCy <span class="small">(Honnibal et al. 2019)</span> --- exclude: true ### CoANZSE corpus size by country/state/territory .small[ Location |nr_channels|nr_videos |nr_words|video_length (h) ----------------------------|---|-------|-----------|---- Australian Capital Territory| 8 |650 |915,542 |111.79 New South Wales |114|9,741 |27,580,773 |3,428.87 Northern Territory |11 | 289 |315,300 |48.72 New Zealand |74 |18,029 |84,058,661 |10,175.80 Queensland |58 |7,356 |19,988,051 |2,642.75 South Australia |50 |3,537 |13,856,275 |1,716.72 Tasmania |21 |1,260 |5,086,867 |636.99 Victoria |78 |12,138 |35,304,943 |4,205.40 Western Australia |68 |3,815 |8,422,484 |1,063.78 | | | | Total |482|56,815 |195,528,896|24,030.82 ] --- exclude: true ### CoANZSE channel locations .small[Circle size corresponds to channel size in number of words] <div class="container"> <iframe src="https://cc.oulu.fi/~scoats/anz_dot2.html" style="width: 100%; height: 450px;" style="max-width = 100%" sandbox="allow-same-origin allow-scripts" scrolling="yes" seamless="seamless" frameborder="0" align="middle"></iframe> </div> --- exclude: true ### ASR transcript and audio quality metric - The quality of ASR transcripts can be evaluated by using a language model trained on a very large set of ASR transcripts generated for the same audio files at different rates of compression <span class="small">(Yuksel et al. 2023)</span> .pull-left[ .small[ *5 ASR transcripts generated from the same video* rank |compression|quality|hypothetical ASR excerpt --------|-----------|-------|------------------------- 1 | none | best |it's really fantastic that we 2 | little | good | it's really fantastic we 3 | medium | middle| it's really fantasy with 4 | high | poor | it rifle fantasy that wonder 5 | most | worst | Ik reed met fantasie ]] .pull-right[ .large[ <br><br> ➡️ language model ➡️ classification of transcripts/audio ]] <br><br> - Applied with an adapted PyTorch model <span class="small">(https://huggingface.co/aixplain/NoRefER)</span> - Assigns a numerical rating 0 (very bad ASR/audio) to 1 (excellent ASR/audio) --- exclude: True ### Corpus use cases: Syntax/grammar/pragmatics - Regional variation in syntax, mood and modality - Lexical items - Contractions - Hortatives/commands/interjections - Pragmatics: Turn-taking, politeness markers - Multidimensional analysis à la Biber - Typological comparison at country/state/regional level --- exclude: true ### Example analysis: Double modals - Non-standard rare syntactic feature<span class="small"> (Montgomery & Nagle 1994; Coats 2022a)</span> - *I might could help you with this* - Occurs only in the American Southeast and in Scotland/Northern England/Northern Ireland? - Most studies based on non-naturalistic data with limited geographical scope <span class="small">(data from linguistic atlas interviews, surveys administered mostly in American Southeast and North of Britain)</span> - More widely used in North America and the British Isles than previously thought <span class="small">(Coats 2022a, Coats 2023b)</span> - Little studied in Australian and New Zealand speech .verysmall[
] --- exclude: true ### Script: Generating a table for manual inspection of double modals - Base modals *will, would, can, could, might, may, must, should, shall, used to, 'll, ought to, oughta* - Script to generate regexes of two-tier combinations and search the corpus (one way to build the combination list itself is sketched on a later slide)

```python
import re
import pandas as pd

# Assumes `modals` is a list of two-modal combinations, e.g. ("might", "could"),
# and `coanzse_df` is a DataFrame with "text_pos", "country", "channel_title",
# and "video_id" columns; tokens in "text_pos" have the form word_POS_starttime
hits = []
for x in modals:
    # match the two modals in sequence
    pat1 = re.compile("("+x[0]+"_\\w+_\\S+\\s+"+x[1]+"_\\w+_\\S+\\s)", re.IGNORECASE)
    for i, y in coanzse_df.iterrows():
        for z in pat1.findall(y["text_pos"]):
            seq = z.split()[0].split("_")[0].strip()+" "+z.split()[1].split("_")[0].strip()
            time = z.split()[0].split("_")[-1]
            # link to the video at a point 3 seconds before the hit
            hits.append((y["country"], y["channel_title"], seq,
                         "https://youtu.be/"+y["video_id"]+"?t="+str(round(float(time)-3))))
pd.DataFrame(hits)
```

- The script creates a URL for each search hit at a time 3 seconds before the targeted utterance - In the resulting data frame, each utterance can be annotated after examining the targeted video sequence - Filter out non-double-modals (clause overlap, speaker self-repairs, ASR errors) --- exclude: true class: small ### Excerpt from generated table
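--- exclude: true ### Sketch: building the two-tier modal combinations

A minimal sketch of one way to build the `modals` list used by the search script on the earlier slide; the variable names, the use of ordered pairs of distinct base modals, and the `re.escape` call are assumptions for illustration, and the combination list actually used for the corpus searches may differ (multi-word forms such as *used to* would need separate handling, for example).

```python
import itertools
import re

# Base modals from the earlier slide
base_modals = ["will", "would", "can", "could", "might", "may", "must",
               "should", "shall", "used to", "'ll", "ought to", "oughta"]

# All ordered pairs of distinct modals: ("might", "could"), ("will", "can"), ...
modals = list(itertools.permutations(base_modals, 2))

# The same word_POS_starttime pattern template as in the search script
patterns = {pair: re.compile("("+re.escape(pair[0])+"_\\w+_\\S+\\s+"
                             + re.escape(pair[1])+"_\\w+_\\S+\\s)", re.IGNORECASE)
            for pair in modals}

print(len(modals), "two-tier combinations")
```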
--- exclude: true ### Finding features - Regular-expression search and manual annotation approach - Double modals can be found in the US North and West and in Canada; in Scotland, N. Ireland, and N. England, but also in the English Midlands and South and in Wales <span class="small">(Coats in review)</span> - Also in Australia and (especially) New Zealand! --- exclude: true ### Vowel formants from underlying audio For each transcript/audio pair in the collection: - Send transcript + audio to Montreal Forced Aligner <span class="small">(McAuliffe et al. 2017)</span>; output is Praat TextGrids <span class="small">(Boersma & Weenink 2023)</span> - Select features of interest using TextGrid timings and Parselmouth <span class="small">(Python interface to Praat functions; Jadoul et al. 2018)</span> <pre style="font-size:11px">were raised by councillors which discussed [oʊ]<br/>a broad range of topics and issues of<br />particular note was the further promotion</pre> <audio controls preload="none"> <source src="data:image/png;base64,#https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/OdhGckWy5Dw_0001358500014315_17.wav" type="audio/wav"> </audio> <audio controls preload="none"> <source src="data:image/png;base64,#https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/OdhGckWy5Dw_0001358500014315_17_vw.wav" type="audio/wav"> </audio> <img src="https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/adelaide_praat_example2.png" width="600px" class="center"> --- exclude: true ### Formants: F1/F2 values for a single utterance .pull-left[ <iframe src="https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/example_Adelaide_sandy.html" height="500" width="500" class="center"></iframe> ] .pull-right[ - The script makes nine F1/F2 measurements per token, at deciles of the token duration (a Parselmouth sketch is included in the appendix) - Circles are individual measurement points - The line represents the formant trajectory for a single token - Retain segments for which at least 5 measurements were possible ] --- exclude: true ### Formants: F1/F2 values for a single location (filtered) .pull-left[ <iframe src="https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/example_Adelaide_t4.html" height="500px" width="500px" class="center"></iframe> ] .pull-right[ - Sample of [oʊ] realizations from the City of Adelaide channel - Retain tokens for which at least 5 measurements were possible - This visualization filters out segments of less than 100 milliseconds in duration ] --- exclude: true ### Formants: Mean values .pull-left[ ![:scale 100%](data:image/png;base64,#./adelaide_formant_plot2.png) ] .pull-right[ - Mean values for a single video, a single channel, a single location, etc. 
- Circle locations represent the average value for that duration quantile (subscript) - Circle size is proportional to the number of measurements for that quantile (formant values are more likely to be measurable in the middle of the vowel than at the beginning/end) ] --- exclude: true ### GOAT vowel - The first target of /oʊ/ is more back and more close in South Australia than in other Australian locations <span class="small">(Butcher 2007, Cox & Palethorpe 2019)</span> --- exclude: true #### Average F1 and F2 values for the first targets of the diphthongs /eɪ/, /aɪ/, /oʊ/, and /aʊ/, spatial autocorrelation <span class="small">(2,339,812 vowel tokens)</span> <iframe width="800" height="500" src="https://a3s.fi/swift/v1/AUTH_319b50570e56446f94b58088b66fcdb2/test_sounds1/coanzse_diph_formants_WA_NT_SA_TAS.html" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" scrolling="no" allowfullscreen></iframe> <span style="float: right; width:20%;">- Locations with at least 100 tokens<br>- Getis-Ord Gi* values based on a 20-nearest-neighbor binary spatial weights matrix<br>- Only SA, WA, NT, and TAS in this visualization (other states still being downloaded)</span> --- exclude: true ### Comparison <small>(Grieve, Speelman & Geeraerts 2013, p. 37)</small> .pull-left[ ![](data:image/png;base64,#./Grieve_et_al_2013_eY.png) ] .pull-right[ - Grieve et al. (2013) used a similar technique to analyze formant measurements from the *Atlas of North American English* (Labov et al. 2006) - The ANAE contains approximately 134,000 vowel measurements in total ] --- exclude: true ### Multimodality - Use regular expressions to search the corpus - Extract video as well as audio - Manually or automatically analyze: - Gesture - Posture/body/head inclination - Facial expression - Handling of objects - Touching - (etc.) --- exclude: true ### 'Heaps of' in Australian English <iframe width="800" height="600" src="https://cc.oulu.fi/~scoats/heaps_of_CoANZSE_excerpt.mp4" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" scrolling="no" sandbox allowfullscreen></iframe> --- exclude: true ### Extracted *today* tokens <iframe width="800" height="500" src="https://cc.oulu.fi/~scoats/coanzse_today.html" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" scrolling="no" allowfullscreen></iframe> --- exclude: true ### In development: CoANZSE Audio .pull-left30[ - 195 million words of Australian and NZ English - Audio and TextGrids - coanzse.org ] .pull-right70[ ![](data:image/png;base64,#./coanzse.org_screenshot.png) ] --- exclude: true ### A few caveats - ASR errors (mean WER after filtering ~14%); transcript quality is related to audio quality as well as to dialect features <span class="small">(Tatman 2017; Meyer et al. 2020; Markl & Lai 2021)</span> - Low-frequency phenomena: manually inspect corpus hits - High-frequency phenomena: the signal of correct transcriptions will be stronger <span class="small">(Agarwal et al. 2007)</span> → classifiers - Machine learning model to identify higher-quality transcripts/audio <span class="small">(Yuksel et al. 2023)</span> - MFA pronunciation dictionary and acoustic model: US English models might fail for some features (rhoticity)? <span class="small">BUT see Gonzalez et al. 
(2020), MacKenzie and Turton (2020)</span> - Need to analyze error rates of forced alignment - Diarization, speaker demographic information --- exclude: true ### Summary and outlook - Access to online audio data via DASH and HLS protocols - Pipeline can get audio data from YouTube or other sites - Automatic acoustic analysis of vowel formants or other speech properties - CoANZSE Audio, built with the pipeline, for Australian and New Zealand English --- exclude: true # Thank you! --- exclude: true ### References .small[ .hangingindent[ Agarwal, S., Godbole, S., Punjani, D. & Roy, S. (2007). [How much noise is too much: A study in automatic text classification](https://doi.org/10.1109/ICDM.2007.21). In *Seventh IEEE International Conference on Data Mining (ICDM 2007)*, 3–12. Boersma, P. & Weenink, D. (2023). Praat: doing phonetics by computer. Version 6.3.09. http://www.praat.org Coats, S. (2023a). CoANZSE: [The Corpus of Australian and New Zealand Spoken English: A new resource of naturalistic speech transcripts](https://doi.org/10.2478/plc-2022-13). In P. Parameswaran, J. Biggs & D. Powers (Eds.), *Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association*, 1–5. Australasian Language Technology Association. Coats, S. (2023b). [Double modals in contemporary British and Irish speech](https://doi.org/10.1017/S1360674323000126). *English Language and Linguistics*. Coats, S. (2023c). [Dialect corpora from YouTube](https://doi.org/10.1515/9783111017433-005). In B. Busse, N. Dumrukcic & I. Kleiber (Eds.), *Language and linguistics in a complex world*, 79–102. Walter de Gruyter. Coats, S. (2022a). [Naturalistic double modals in North America](https://doi.org/10.1215/00031283-9766889). *American Speech*. Coats, S. (2022b). [The Corpus of British Isles Spoken English (CoBISE): A new resource of contemporary British and Irish speech](http://ceur-ws.org/Vol-3232/paper15.pdf). In K. Berglund, M. La Mela & I. Zwart (Eds.), *Proceedings of the 6th Digital Humanities in the Nordic and Baltic Countries Conference, Uppsala, Sweden, March 15–18, 2022*, 187–194. CEUR. Gonzalez, S., Grama, J. & Travis, C. (2020). [Comparing the performance of forced aligners used in sociophonetic research](https://doi.org/10.1515/lingvan-2019-0058). *Linguistics Vanguard*, 5. Honnibal, M. et al. (2019). [Explosion/spaCy v2.1.7: Improved evaluation, better language factories and bug fixes](https://doi.org/10.5281/zenodo.3358113). ]] --- exclude: true ### References II .small[ .hangingindent[ Jadoul, Y., Thompson, B. & de Boer, B. (2018). Introducing Parselmouth: A Python interface to Praat. *Journal of Phonetics*, 71, 1–15. https://doi.org/10.1016/j.wocn.2018.07.001 MacKenzie, L. & Turton, D. (2020). [Assessing the accuracy of existing forced alignment software on varieties of British English](https://doi.org/10.1515/lingvan-2018-0061). *Linguistics Vanguard*, 6. Markl, N. & Lai, C. (2021). [Context-sensitive evaluation of automatic speech recognition: Considering user experience & language variation](https://aclanthology.org/2021.hcinlp-1.6). In *Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing*, 34–40. Association for Computational Linguistics. McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M. & Sonderegger, M. (2017). Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In *Proceedings of the 18th Conference of the International Speech Communication Association*. 
Meyer, J., Rauchenstein, L., Eisenberg, J. D. & Howell, N. (2020). [Artie Bias Corpus: An open dataset for detecting demographic bias in speech applications](https://aclanthology.org/2020.lrec-1.796). In *Proceedings of the 12th Language Resources and Evaluation Conference*, 6462–6468. European Language Resources Association. Panayotov, V., Chen, G., Povey, D. & Khudanpur, S. (2015). [Librispeech: An ASR corpus based on public domain audio books](https://doi.org/10.1109/ICASSP.2015.7178964). In *Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, 5206–5210. Tatman, R. (2017). [Gender and dialect bias in YouTube’s automatic captions](https://aclanthology.org/W17-1606). In *Proceedings of the First ACL Workshop on Ethics in Natural Language Processing*, 53–59. Association for Computational Linguistics. Yuksel, K. A., Ferreira, T., Javadi, G., El-Badrashiny, M. & Gunduz, A. (2023). [NoRefER: A referenceless quality metric for Automatic Speech Recognition via semi-supervised language model fine-tuning with contrastive learning](https://arxiv.org/abs/2306.12577). arXiv:2306.12577 [cs.CL]. ]]
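--- exclude: true ### Appendix A: yt-dlp download sketch

A minimal sketch of fetching the audio stream and the auto-generated captions for a single video with yt-dlp's Python API, assuming the `yt_dlp` package and ffmpeg are installed; the URL is a placeholder, the option values are illustrative, and the actual pipeline notebook may use different settings.

```python
import yt_dlp

url = "https://www.youtube.com/watch?v=VIDEO_ID"  # placeholder video URL

ydl_opts = {
    "format": "bestaudio/best",            # best available audio stream (DASH/HLS)
    "writeautomaticsub": True,             # save YouTube's ASR captions
    "subtitleslangs": ["en"],              # caption language(s)
    "subtitlesformat": "vtt",              # WebVTT, as shown earlier in the slides
    "outtmpl": "%(id)s.%(ext)s",           # name output files by video id
    "postprocessors": [
        {"key": "FFmpegExtractAudio", "preferredcodec": "wav"}  # convert to WAV
    ],
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([url])
```

--- exclude: true ### Appendix B: Formant measurements at deciles (sketch)

A minimal sketch of taking nine F1/F2 measurements at the duration deciles of one vowel token with Praat-Parselmouth, as described on the formant slides; the file name and the interval start/end times are placeholders (in practice they come from the MFA TextGrids), and the formant settings are Praat's defaults rather than the settings used for CoANZSE Audio.

```python
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("chunk.wav")   # placeholder audio chunk
start, end = 1.23, 1.41                # placeholder vowel interval from a TextGrid

# Formant object (Burg method) with Praat's default parameters
formant = snd.to_formant_burg()

# Nine measurement points at deciles of the token duration
measurements = []
for i in range(1, 10):
    t = start + i * (end - start) / 10
    f1 = call(formant, "Get value at time", 1, t, "Hertz", "Linear")
    f2 = call(formant, "Get value at time", 2, t, "Hertz", "Linear")
    measurements.append((round(t, 3), f1, f2))

print(measurements)
```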