About: Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes    Values
rdf:type
Description
  • In this paper we study the effect of incorporating automatic transcriptions into the speaker diarization process. We aim to improve both the diarization accuracy, as evaluated by standard objective measures, and the quality of the diarization output from the user’s perspective. Although the presented approach relies on the output of an automatic speech recognizer, it makes no use of lexical information. Instead, we use information about word boundaries and the classification of non-speech events occurring in the processed stream. The former is used as a constraint on speaker change-point candidates; the latter makes it possible to disregard various vocal noises that carry no speaker-specific information (given the representation of the signal by cepstral features) and thus harm the speaker’s representation. The experimental evaluation of the presented approach was carried out on the COST278 multilingual broadcast news database. We demonstrate that the approach improves both speaker diarization and segmentation performance measures. Furthermore, we show that the number of change-points detected within words (rather than at their boundaries) is significantly reduced. (en)
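The two uses of ASR output described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Token` format, the noise labels, the nearest-boundary snapping rule, and all function names are assumptions made for illustration. The abstract only states that word boundaries constrain change-point candidates and that non-speech events are disregarded.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One ASR output item: a word or a non-speech event (hypothetical format)."""
    start: float  # seconds
    end: float    # seconds
    label: str


# Hypothetical labels for non-speech events; a real recognizer defines its own.
NOISE_LABELS = {"[breath]", "[cough]", "[laugh]", "[noise]"}


def is_noise(token):
    return token.label in NOISE_LABELS


def boundary_times(tokens):
    """Sorted, de-duplicated start and end times of all tokens."""
    return sorted({t for tok in tokens for t in (tok.start, tok.end)})


def snap_to_boundaries(candidates, boundaries):
    """Move each candidate change-point to the nearest token boundary.

    Snapping is one possible way to enforce the constraint (an assumption).
    """
    if not boundaries:
        return sorted(candidates)
    snapped = set()
    for c in candidates:
        i = bisect_left(boundaries, c)
        neighbours = boundaries[max(i - 1, 0):i + 1]
        snapped.add(min(neighbours, key=lambda b: abs(b - c)))
    return sorted(snapped)


def speech_frame_mask(n_frames, frame_shift, tokens):
    """True for frames to keep, False for frames inside a non-speech event."""
    mask = [True] * n_frames
    for tok in tokens:
        if is_noise(tok):
            first = max(round(tok.start / frame_shift), 0)
            last = min(round(tok.end / frame_shift), n_frames)
            for i in range(first, last):
                mask[i] = False
    return mask


def count_within_words(change_points, tokens, tol=1e-6):
    """Count change-points that fall strictly inside a word (not a noise event)."""
    return sum(
        any(tok.start + tol < cp < tok.end - tol
            for tok in tokens if not is_noise(tok))
        for cp in change_points
    )
```

In this sketch, `speech_frame_mask` would be applied to the cepstral feature frames before speaker models are estimated, and `count_within_words` corresponds loosely to the within-word change-point count mentioned at the end of the abstract. Comparing its value before and after `snap_to_boundaries` shows the effect of the constraint on a given set of candidates.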
Title
  • Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams
skos:prefLabel
  • Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams
skos:notation
  • RIV/46747885:24220/12:#0002003!RIV13-TA0-24220___
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(TA01011204)
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 141097
http://linked.open...ai/riv/idVysledku
  • RIV/46747885:24220/12:#0002003
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • speaker diarization; broadcast transcription (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...ontrolniKodProRIV
  • [50D591EC1322]
http://linked.open...v/mistoKonaniAkce
  • Banff, Canada
http://linked.open...i/riv/mistoVydani
  • Canada
http://linked.open...i/riv/nazevZdroje
  • Proc. of IEEE conf. on Multimedia Signal Processing (MMSP)
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...iv/tvurceVysledku
  • Nouza, Jan
  • Pražák, Jan
  • Silovský, Jan
  • Červa, Petr
  • Žďánský, Jindřich
http://linked.open...vavai/riv/typAkce
http://linked.open...ain/vavai/riv/wos
  • 000312670200021
http://linked.open.../riv/zahajeniAkce
number of pages
http://purl.org/ne...btex#hasPublisher
  • Not specified
https://schema.org/isbn
  • 978-1-4673-4572-9
http://localhost/t...ganizacniJednotka
  • 24220