About: Multimodal Phoneme Recognition of Meeting Data

An Entity of Type: http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space: linked.opendata.cz, associated with source document(s)

Attributes    Values
rdf:type
Description
  • Phoneme recognition from meeting data using audio-visual features (cs, translated)
  • This paper describes experiments in automatic recognition of context-independent phoneme strings from meeting data using audio-visual features. Visual features are known to improve the accuracy and noise robustness of automatic speech recognizers. However, many problems appear when the data provided is not "visually clean", that is, when variation in the speaker's frontal pose, lighting conditions, background, etc. is not limited. The goal of this work was to test whether visual information can help neural-network phoneme recognition. While the audio part is fixed and uses standard Mel filter-bank energies, different features describing the video were tested: average brightness, DCT coefficients extracted from the region of interest (ROI), optical flow analysis, and lip-position features. The recognition was evaluated on a subset of IDIAP meeting room data. We saw a small improvement over audio-only recognition, but further work needs to be done especially concerning the d (en)
Title
  • Multimodal Phoneme Recognition of Meeting Data (en)
  • Multimodální rozpoznávání fonémů na meeting datech (cs)
skos:prefLabel
  • Multimodal Phoneme Recognition of Meeting Data (en)
  • Multimodální rozpoznávání fonémů na meeting datech (cs)
skos:notation
  • RIV/00216305:26230/04:PU49308!RIV06-GA0-26230___
http://linked.open.../vavai/riv/strany
  • 379-384
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(GA102/02/0124), P(GP102/02/D108), Z(MSM 262200012)
http://linked.open...iv/cisloPeriodika
  • 3206
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 575027
http://linked.open...ai/riv/idVysledku
  • RIV/00216305:26230/04:PU49308
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • speech processing, audio-video processing, feature extraction, pattern recognition (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...odStatuVydavatele
  • DE - Federal Republic of Germany
http://linked.open...ontrolniKodProRIV
  • [22284505B5A2]
http://linked.open...i/riv/nazevZdroje
  • Lecture Notes in Computer Science (IF 0.513)
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...v/svazekPeriodika
  • 2004
http://linked.open...iv/tvurceVysledku
  • Černocký, Jan
  • Motlíček, Petr
http://linked.open...n/vavai/riv/zamer
issn
  • 0302-9743
number of pages
http://localhost/t...ganizacniJednotka
  • 26230