class: inverse, center, middle
background-image: url(https://cc.oulu.fi/~scoats/oululogoRedTransparent.png)
background-repeat: no-repeat
background-size: 80px 57px
background-position: right top
exclude: true

---
class: title-slide

<br><br><br><br><br>
.pull-left[]
.pull-right[
<span style="font-family:Rubik;font-size:24pt;font-weight: 700;font-style: normal;float:right;text-align: right;color:white;-webkit-text-fill-color: black;-webkit-text-stroke: 0.8px;">Combined Audio and Chat Transcripts for Recorded Video Streams</span>
]

<p style="float:right;text-align: right;color:white;font-weight: 700;font-style: normal;-webkit-text-fill-color: black;-webkit-text-stroke: 0.5px;">
Steven Coats<br>
English, University of Oulu, Finland<br>
<a href="mailto:steven.coats@oulu.fi">steven.coats@oulu.fi</a><br>
Love Data Week, Université Toulouse – Jean Jaurès<br>
February 12th, 2026<br>
</p>

---
layout: true

<div class="my-header"><img border="0" alt="University of Oulu logo" src="https://cc.oulu.fi/~scoats/oululogonewEng.png" width="80" height="80"></div>
<div class="my-footer"><span>Steven Coats                            Framework for Stream Analysis | Formation données langagières</span></div>

---

### Outline

1. Background
  - Video streaming as an increasingly popular CMC modality
  - Study of multimodal content: ASR transcript + chat stream (+ video)
2. VoD Toolkit: Pipeline components
3. Use cases
  - Chat density
  - Sentiment
4. 
Workshop in Colab/Jupyter

.footnote[Slides for the presentation are on my homepage at https://cc.oulu.fi/~scoats]

---

### Background

- Increasing popularity of streaming
  - Twitch (mostly gaming), YouTube Live, Instagram Live, Facebook Live, X Livestream, Kick, and others
- Increasing importance as an economic activity <span class="small">(Zhou et al. 2019; Johnson & Woodcock 2019; Yu et al. 2018)</span>
- Recorded streams contain multiple levels of communication <span class="small">(Sjöblom et al. 2019; Recktenwald 2017)</span>
  - Speech of the streamer (and potentially of others)
  - Text and graphical image content (emoji, emotes) of chat participants
  - Text and graphical content of system messages (e.g. bots showing tips to the streamer)
  - Secondary visual content (and text and speech) of the video output (e.g. embedded windows showing gameplay)
- Most corpus-based analyses have focused on live chat content <span class="small">(Olejniczak 2015; Kim et al.
2022)</span>
- Few studies consider multiple levels

---

### VoD Toolkit (https://shorturl.at/TF3Kn)

A script pipeline to generate a structured, time-aligned transcript that combines the stream speech transcript with chat contributions

- Jupyter Notebook / Google Colab
- Generates output that can be analyzed with corpus methods
- Can also be used to capture video for multimodal analysis

---

### Pipeline components

- [yt-dlp](https://github.com/yt-dlp/yt-dlp)
- [TwitchDownloaderCLI](https://github.com/lay295/TwitchDownloader)
- [faster-whisper](https://github.com/SYSTRAN/faster-whisper)

The toolkit's output is an HTML file

---

### Component: [yt-dlp](https://github.com/yt-dlp/yt-dlp)

.pull-left[

]
.pull-right[
- Fork of youtube-dl
- Can access any content streamed with the DASH or HLS protocols
- Can also retrieve the video itself
]

---

### Component: [TwitchDownloaderCLI](https://github.com/lay295/TwitchDownloader)

.pull-left[

]
.pull-right[
- Command-line interface for retrieving Twitch videos and chats
]

---

### Component: [faster-whisper](https://github.com/SYSTRAN/faster-whisper)

.pull-left[

]
.pull-right[
Library based on OpenAI's Whisper providing automatic speech recognition (ASR)

- Word-level timestamps
- Faster than Whisper, especially with a GPU
]

---

### Workflow



---

### Example: YouTube stream

<iframe width="800" height="450" src="https://a3s.fi/swift/v1/Toulouse_Workshop/PewDiePie_videoClip.mp4" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" scrolling="no" sandbox allowfullscreen></iframe>

---

### Example: YouTube output

<div class="container">
<iframe src="https://a3s.fi/swift/v1/Toulouse_Workshop/PewDiePie26_mini.html" style="width: 100%; height: 450px; max-width: 100%;" sandbox="allow-same-origin allow-scripts" scrolling="yes" seamless="seamless" frameborder="0" align="middle"></iframe>
</div>

---

### Example: Twitch stream

<iframe width="800" height="450"
src="https://a3s.fi/swift/v1/Toulouse_Workshop/Anyme.mp4" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" scrolling="no" sandbox allowfullscreen></iframe>

---

### Example: Twitch output

<div class="container">
<iframe src="https://a3s.fi/swift/v1/Toulouse_Workshop/Anyme.html" style="width: 100%; height: 450px; max-width: 100%;" sandbox="allow-same-origin allow-scripts" scrolling="yes" seamless="seamless" frameborder="0" align="middle"></iframe>
</div>

---

### Use cases: Chat density



- Chat density can be compared and correlated with streamer utterances

---

### Use cases: Sentiment



- How does sentiment evolve over the course of a stream?

---

### Potential use case: Automated analysis of video streams

- Retrieve video with yt-dlp
- Add cells to the VoD Toolkit to import (e.g.) [X-CLIP](https://huggingface.co/microsoft/xclip-base-patch32) <span class="small">(Ni et al. 2022)</span>, [LLaVA-NeXT-Video-7B-hf](https://huggingface.co/llava-hf/LLaVA-NeXT-Video-7B-hf) <span class="small">(Zhang et al. 2024)</span>, or other libraries
- Automatically generate text describing what is going on in different parts of the video
  - Who is chatting about what parts of the video?
  - Is chat about (for example) video content, other chat, or speech content?

---
exclude: true

### Potential use cases: Acoustic analysis



- Acoustic features of particular streamers, or of streams with different topics/from different locations etc., can be analyzed <span class="small">(cf. Coats 2025, 2023; Méli et al.
2023)</span>

https://colab.research.google.com/github/stcoats/phonetics_pipeline/blob/main/phonetics_pipeline_v3.ipynb

---
exclude: true

### Google Colab

- Google Colaboratory is a hosted online service for running Python or R code in a notebook environment
- You need a Google account to use Colab
- Advantages include access to GPU/TPU, collaborative editing, cloud-based execution, and integration with code on GitHub/GitLab



---

### CMC-Corpora Conference

https://cmc2026.org

Submissions portal open! Deadline: 15 April

---

### Workshop

1. Package installs
2. Collect some YouTube content
3. Collect some Twitch content
4. Preliminary analyses of YouTube content

#### Allons-y!

VoD Toolkit (https://shorturl.at/TF3Kn)

---

### References

.verysmall[
.hangingindent[
Coats, S. (2025). [An automatic pipeline for processing streamed content: New horizons for corpus linguistics and phonetics](https://doi.org/10.1515/9783111434018-011). In L. Cotgrove, L. Herzberg, & H. Lüngen (eds.), *Exploring digitally-mediated communication with corpora: Methods, analyses, and corpus construction*, 257–274. Berlin: De Gruyter Brill.

Coats, S. (2023). [A pipeline for the large-scale acoustic analysis of streamed content](https://doi.org/10.14618/1z5k-pb25). In L. Cotgrove, L. Herzberg, H. Lüngen, & I. Pisetta (eds.), *Proceedings of the 10th International Conference on CMC and Social Media Corpora for the Humanities (CMC-Corpora 2023)*, 51–54. Mannheim: Leibniz-Institut für Deutsche Sprache.

Herring, S. (1999). [Interactional coherence in CMC](https://doi.org/10.1111/j.1083-6101.1999.tb00106.x). *Journal of Computer-Mediated Communication*, 4(4).

Johnson, M. R., & Woodcock, J. (2019). [The impacts of live streaming and Twitch.tv on the video game industry](https://doi.org/10.1177/0163443718818363). *Media, Culture & Society*, 41(5), 670–688.

Kim, J., Wohn, D. Y., & Cha, M. (2022). [Understanding and identifying the use of emotes in toxic chat on Twitch](https://doi.org/10.1016/j.osnem.2021.100180).
*Online Social Networks and Media*, 27.

Ni, B., Peng, H., Chen, M., Zhang, S., Meng, G., Fu, J., Xiang, S., & Ling, H. (2022). [Expanding language-image pretrained models for general video recognition](https://doi.org/10.48550/arXiv.2208.02816). *arXiv*, cs.CV, 2208.02816.

Olejniczak, J. (2015). A linguistic study of language variety used on twitch.tv: Descriptive and corpus-based approaches. In *Proceedings of RCIC’15: Redefining Community in Intercultural Context, Brasov, 21–23 May 2015*, 329–334.

Recktenwald, D. (2017). Toward a transcription and analysis of live streaming on Twitch. *Journal of Pragmatics*, 115, 68–81.

Robert, A. J. (2025). Modelling the interaction space of Twitch: A multimodal framework for corpus structuring and analysis. In A. Fabián & I. Trost (eds.), [*Impulses and Approaches to Computer-Mediated Communication: Proceedings of the 12th International Conference on Computer Mediated Communication and Social Media Corpora for the Humanities*](https://doi.org/10.15495/EPub_UBT_00008705), 94–98. Bayreuth: University of Bayreuth.

Sjöblom, M., Törhönen, M., Hamari, J., & Macey, J. (2019). The ingredients of Twitch streaming: Affordances of game streams. *Computers in Human Behavior*, 92, 20–28.

Yu, E., Jung, C., Kim, H., & Jung, J. (2018). Impact of viewer engagement on gift-giving in live video streaming. *Telematics and Informatics*, 35(5), 1450–1460.

Zhang, Y., Li, B., Liu, H., Lee, Y. G., Gui, L., Fu, D., Feng, J., Liu, Z., & Li, C. (2024). [LLaVA-NeXT: A strong zero-shot video understanding model](https://llava-vl.github.io/blog/2024-04-30-llava-next-video/).

Zhou, J., Zhou, J., Ding, Y., & Wang, H. (2019). The magic of danmaku: A social interaction perspective of gift sending on live streaming platforms. *Electronic Commerce Research and Applications*, 34, 100815.
]]
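---

### Appendix: Chat density sketch

The chat-density use case can be sketched in a few lines of pure Python: bin chat timestamps into fixed-width windows and count messages per window. This is an illustrative sketch, not VoD Toolkit code; the function name and toy timestamps are invented for the example.

```python
from collections import Counter

def chat_density(timestamps, window=60.0):
    """Count chat messages per fixed-width time window (in seconds)."""
    counts = Counter(int(t // window) for t in timestamps)
    n_windows = int(max(timestamps) // window) + 1 if timestamps else 0
    # Return a dense list, so windows with no chat show up as zeros
    return [counts.get(i, 0) for i in range(n_windows)]

# Toy timestamps: seconds from stream start
msgs = [3.2, 15.1, 58.9, 61.0, 62.5, 190.0]
print(chat_density(msgs))  # → [3, 2, 0, 1]
```

The resulting per-window counts can then be correlated with, for example, the streamer's utterance rate in the same windows.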
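---

### Appendix: Time-alignment sketch

Conceptually, the combined transcript is an interleaving of ASR segments and chat messages by timestamp. A minimal sketch in pure Python, assuming invented input tuples; the toolkit's actual data structures and HTML output differ.

```python
def merge_transcript(asr_segments, chat_messages):
    """Interleave ASR segments and chat messages in temporal order.

    asr_segments:  list of (start_seconds, text) from the ASR step
    chat_messages: list of (seconds, username, text) from the chat download
    """
    events = [(t, "SPEECH", "STREAMER", text) for t, text in asr_segments]
    events += [(t, "CHAT", user, text) for t, user, text in chat_messages]
    lines = []
    for t, kind, who, text in sorted(events):
        stamp = f"{int(t) // 60:02d}:{int(t) % 60:02d}"  # mm:ss
        lines.append(f"[{stamp}] {kind} {who}: {text}")
    return lines

asr = [(0.5, "welcome back everyone"), (12.0, "let's check the chat")]
chat = [(5.2, "viewer1", "hi!"), (13.4, "viewer2", "PogChamp")]
for line in merge_transcript(asr, chat):
    print(line)  # e.g. "[00:05] CHAT viewer1: hi!"
```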
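---

### Appendix: Sentiment-over-time sketch

Tracking how sentiment evolves over a stream reduces to scoring each chat message and averaging the scores per time window. A toy sketch with an invented five-word lexicon; a real analysis would substitute a proper sentiment resource (e.g. VADER).

```python
# Toy lexicon, invented for this example only
LEXICON = {"love": 1.0, "great": 0.8, "pog": 0.6, "bad": -0.8, "hate": -1.0}

def message_score(text):
    """Mean lexicon score of the words in one message (0.0 if none match)."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def sentiment_by_window(chat, window=60.0):
    """Average chat sentiment per time window; None for empty windows.

    chat: list of (seconds, text) pairs
    """
    if not chat:
        return []
    n = int(max(t for t, _ in chat) // window) + 1
    sums, counts = [0.0] * n, [0] * n
    for t, text in chat:
        i = int(t // window)
        sums[i] += message_score(text)
        counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```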