The accuracy of auto-generated subtitles of four Indonesian English-speaking youtubers

Devika, Saviera Christina (2022) The accuracy of auto-generated subtitles of four Indonesian English-speaking youtubers. Undergraduate thesis, Widya Mandala Surabaya Catholic University.


Abstract

YouTube is widely known, and almost 95% of internet users use it today. Using its features, users are able to upload, watch, and share videos freely. Because it is so easy to access, video content must be delivered well. YouTube content creators (YouTubers) sometimes enjoy sharing their thoughts in a language other than their mother tongue, for instance, Indonesians using English. YouTube provides auto-generated subtitles of varying quality, and Indonesian YouTubers, as non-native speakers, might make mistakes. This research aims to identify the inaccuracies in the subtitles of Indonesian YouTubers speaking English by addressing two research questions: (i) Do the auto-generated subtitles represent the speakers' pronunciation accurately? (ii) What are the reasons underlying the inaccuracies, if there are any? Are there any sounds that are systematically misinterpreted by the auto-generated subtitle engine?

This research is descriptive and qualitative. The subjects were chosen based on two criteria: (1) being an Indonesian YouTuber and (2) speaking English in their videos. To collect the data, the researcher took the transcripts of two chosen videos from each of four YouTube channels. The analysis was carried out in three steps: accuracy identification, explanation of the inaccuracies, and classification of the inaccuracies. The result of the analysis was written in narrative form.

The results show that the inaccuracies arose from the system, from the YouTubers, and from unknown reasons. The researcher classified the reasons for the inaccuracies into 16 sub-reasons: (1) weak or inappropriate pronunciation, (2) weak voiced-voiceless sounds, (3) weak plosive sounds, (4) weak vowel sounds, (5) unclear alveolar sounds, (6) unfamiliar names of people, places, events, or brands, (7) homophones, (8) using a foreign language, (9) mixing Indonesian and English, (10) mixing Korean and English, (11) bad video-audio editing, (12) background noise, (13) harsh words, (14) speaking too fast, unclearly, or too softly, (15) simultaneous talk, and (16) failure of the automatic speech recognition (ASR) system to interpret the speech. The researcher then consolidated the sub-reasons into five main reasons: (1) phonological reasons, (2) morphological reasons, (3) code switching, (4) technical reasons, and (5) miscellaneous reasons. Based on the findings, further research can be conducted on the more systematic inaccuracies related to phonology.

Item Type: Thesis (Undergraduate)
Department: S1 (Undergraduate) - English Language Education
Contributors:
Thesis advisor: Yumarnamto, Mateus; NIDN / NIDK: NIDN0723017201; Email: mateus@ukwms.ac.id
Uncontrolled Keywords: YouTube, YouTuber, Indonesian YouTuber, pronunciation, auto-generated subtitle, inaccuracies in generating subtitles.
Subjects: English Education
Divisions: Faculty of Teacher Training and Education > English Education Study Program
Date Deposited: 17 May 2022 06:45
Last Modified: 17 May 2022 06:45
URI: http://repository.ukwms.ac.id/id/eprint/29934
