About: Recovery of Rare Words in Lecture Speech


An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
Description	The vocabulary used in speech usually consists of two types of words: a limited set of common words, shared across multiple documents, and a virtually unlimited set of rare words, each of which might appear only a few times in particular documents. In most documents, however, these rare words are not seen at all. The first type of words is typically included in the language model of an automatic speech recognizer (ASR) and is thus widely referred to as in-vocabulary (IV). Words of the second type are missing from the language model and are thus called out-of-vocabulary (OOV). However, these words usually carry important information. We use a hybrid word/sub-word recognizer to detect OOV words occurring in English talks and describe them as sequences of sub-words. We detected about one third of all OOV words, and were able to recover the correct spelling for 26.2% of all detections by using a phoneme-to-grapheme (P2G) conversion trained on the recognition dictionary. By omitting detections corresponding to (en)
Title	Recovery of Rare Words in Lecture Speech (en)
skos:prefLabel	Recovery of Rare Words in Lecture Speech (en)
skos:notation	RIV/00216305:26230/10:PU89608!RIV11-GA0-26230___
http://linked.open...avai/riv/aktivita	P S Z
http://linked.open...avai/riv/aktivity	P(GA102/08/0707), S, Z(MSM0021630528)
http://linked.open...vai/riv/dodaniDat	2011
http://linked.open...aciTvurceVysledku	Burget, Lukáš; Heřmanský, Hynek; Kombrink, Stefan; Hannemann, Mirko
http://linked.open.../riv/druhVysledku	D - Article in proceedings
http://linked.open...iv/duvernostUdaju	S - Complete and truthful data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Vysoké učení technické v Brně / Fakulta informačních technologií
http://linked.open...dnocenehoVysledku	284237
http://linked.open...ai/riv/idVysledku	RIV/00216305:26230/10:PU89608
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	speech, rare words, recognizer, detect OOV words, sub-words, lectures (en)
http://linked.open.../riv/klicoveSlovo	recognizer; lectures; speech; detect OOV words; rare words; sub-words
http://linked.open...ontrolniKodProRIV	[D085C7340192]
http://linked.open...v/mistoKonaniAkce	Brno
http://linked.open...i/riv/mistoVydani	Brno
http://linked.open...i/riv/nazevZdroje	Proc. Text, Speech and Dialogue 2010
http://linked.open...in/vavai/riv/obor	JC
http://linked.open...ichTvurcuVysledku	4 (xsd:int)
http://linked.open...cetTvurcuVysledku	4 (xsd:int)
http://linked.open...vavai/riv/projekt	Speech Recognition under Real-World Conditions
http://linked.open...UplatneniVysledku	2010
http://linked.open...iv/tvurceVysledku	Burget, Lukáš; Kombrink, Stefan; Hannemann, Mirko; Heřmanský, Hynek
http://linked.open...vavai/riv/typAkce	WRD - Worldwide
http://linked.open.../riv/zahajeniAkce	2010-09-06 (xsd:date)
http://linked.open...n/vavai/riv/zamer	Research of information technologies from the security point of view
number of pages	8 (xsd:int)
http://purl.org/ne...btex#hasPublisher	Springer-Verlag
https://schema.org/isbn	978-3-642-15759-2
http://localhost/t...ganizacniJednotka	26230
is http://linked.open...avai/riv/vysledek of	Recovery of Rare Words in Lecture Speech
