Using term clouds to represent segment-level semantic
content of podcasts
Fuller, Marguerite, Tsagkias, Manos, Newman, Eamonn (ORCID: 0000-0002-0310-0539), Besser, Jana, Larson, Martha, Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365) and de Rijke, Maarten
(2008)
Using term clouds to represent segment-level semantic
content of podcasts.
In: the Workshop on Searching Spontaneous Conversational Speech at Thirty-First Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), 24 July 2008, Singapore.
Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface that provides semantically annotated jump points to signal to the user where to listen in. Creating time-aligned metadata with human annotators is prohibitively expensive, which motivates the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. The quality of segment-level term clouds is measured quantitatively, and their utility is investigated in a small-scale user study based on human-labeled segment boundaries. Because the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech so that segments can be generated as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the automatically generated segment boundaries are comparable with human-selected ones.
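As a rough illustration of the idea of a segment-level term cloud, the sketch below weights the terms of each transcript segment and keeps the highest-weighted ones. The abstract does not state how terms are selected or weighted, so the smoothed TF-IDF weighting, the tiny stopword list, and the names `tokenize` and `term_clouds` are assumptions made for this example, not the authors' method.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "we", "this"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def term_clouds(segments, top_k=5):
    """Return, for each segment, up to top_k (term, weight) pairs.

    Weights are term frequency times a smoothed inverse segment frequency,
    so terms that occur in every segment receive a weight of zero.
    """
    tokenized = [tokenize(s) for s in segments]
    n = len(tokenized)
    # Segment frequency: in how many segments does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    clouds = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = {t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()}
        # Sort by descending weight, then alphabetically for a stable order.
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        clouds.append(ranked[:top_k])
    return clouds


if __name__ == "__main__":
    # Made-up stand-ins for the ASR transcripts of two sub-episode segments.
    demo = [
        "budget vote budget tax",
        "football match football score",
    ]
    for i, cloud in enumerate(term_clouds(demo, top_k=3)):
        print(i, [term for term, _ in cloud])
```

In an interface, the weight of each term could drive its font size in the cloud; ASR errors would enter through the segment text itself, which this sketch does not model.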