HTML Microdata document

This HTML5 document contains 52 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix	IRI
dcterms	http://purl.org/dc/terms/
n15	http://localhost/temp/predkladatel/
n8	http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n5	http://linked.opendata.cz/resource/domain/vavai/projekt/
n13	http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F49777513%3A23520%2F05%3A00000284%21RIV07-MSM-23520___/
n17	http://linked.opendata.cz/ontology/domain/vavai/
n18	http://linked.opendata.cz/resource/domain/vavai/zamer/
s	http://schema.org/
skos	http://www.w3.org/2004/02/skos/core#
n3	http://linked.opendata.cz/ontology/domain/vavai/riv/
n2	http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
n6	http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n19	http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh	http://www.w3.org/2001/XMLSchema#
n11	http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n7	http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n14	http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n4	http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n16	http://reference.data.gov.uk/id/gregorian-year/
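The prefix table above is what turns the abbreviated names in the statement list into full IRIs. A minimal sketch of that expansion follows, assuming only a subset of the table is loaded; the `PREFIXES` dictionary and the `expand` helper are hypothetical names introduced for illustration and are not part of the original page.

```python
# Subset of the prefix table above; remaining rows would be added the same way.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n3": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "n16": "http://reference.data.gov.uk/id/gregorian-year/",
}


def expand(curie: str) -> str:
    """Expand a 'prefix:local' name to a full IRI (hypothetical helper)."""
    # Split on the first colon only, so colons in the local part survive.
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


print(expand("n3:strany"))
print(expand("n16:2005"))
```

Names whose prefix is not in the table raise `KeyError` rather than being passed through, which makes a missing table row easy to spot.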

Statements

Subject Item: n2:RIV%2F49777513%3A23520%2F05%3A00000284%21RIV07-MSM-23520___
rdf:type: skos:Concept n17:Vysledek
dcterms:description: This paper describes a Czech spontaneous speech corpus consisting of radio talk show recordings. As the first complete non-English MDE corpus, it has been annotated with structural metadata information beyond the words that is critical to both increasing transcript readability and allowing application of downstream NLP methods. Metadata annotation involves partitioning verbatim transcripts into syntactic/semantic units (SUs) that function to express a complete idea; and identifying fillers and edit disfluencies. Annotation guidelines for English metadata developed by the Linguistic Data Consortium were taken as the starting point, with changes applied to accommodate specific phenomena of Czech. In addition to the necessary language-dependent modifications, we further propose some language-independent modifications including limited prosodic labeling at SU boundaries. (two identical values); Tento článek popisuje český korpus spontánní řeči skládající se z nahrávek rozhlasových diskusních pořadů. Jako první kompletní neanglický MDE korpus byl anotován strukturálními metadaty, která zvyšují čitelnost přepisů člověkem a umožňují i další automatické zpracování. Anotace zahrnuje rozdělení přepisů do syntakticko-sémantických jednotek a identifikace výplní a neplynulostí. Mimo modifikací nutných pouze pro češtinu také navrhujeme některé modifikace nezávislé na jazyku, jako je například limitované prozodické značkování na hranicích syntakticko-sémantických jednotek.
dcterms:title: Czech spontaneous speech corpus with structural metadata (two identical values); Český korpus spontánní řeči s anotací strukturálních metadat
skos:prefLabel: Český korpus spontánní řeči s anotací strukturálních metadat; Czech spontaneous speech corpus with structural metadata (two identical values)
skos:notation: RIV/49777513:23520/05:00000284!RIV07-MSM-23520___
n3:strany: 1165
n3:aktivita: n7:P n7:Z
n3:aktivity: P(LC536), Z(MSM 235200004)
n3:cisloPeriodika: 0
n3:dodaniDat: n16:2007
n3:domaciTvurceVysledku: n8:8780943 n8:2328372 n8:6579760
n3:druhVysledku: n4:J
n3:duvernostUdaju: n19:S
n3:entitaPredkladatele: n13:predkladatel
n3:idSjednocenehoVysledku: 516830
n3:idVysledku: RIV/49777513:23520/05:00000284
n3:jazykVysledku: n11:eng
n3:klicovaSlova: SUs; structural metadata; spontaneous speech; disfluencies; fillers
n3:klicoveSlovo: n6:structural%20metadata n6:spontaneous%20speech n6:fillers n6:disfluencies n6:SUs
n3:kodStatuVydavatele: PT - Portuguese Republic
n3:kontrolniKodProRIV: [98A7D988FAA6]
n3:nazevZdroje: Eurospeech
n3:obor: n14:JD
n3:pocetDomacichTvurcuVysledku: 3
n3:pocetTvurcuVysledku: 6
n3:projekt: n5:LC536
n3:rokUplatneniVysledku: n16:2005
n3:svazekPeriodika: 2005
n3:tvurceVysledku: Walker, Christopher; Švec, Jan; Kolář, Jáchym; Kozlíková, Dagmar; Psutka, Josef; Strassel, Stephanie
n3:zamer: n18:MSM%20235200004
s:issn: 1018-4074
s:numberOfPages: 4
n15:organizacniJednotka: 23520
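
Several values in the statements above are percent-encoded forms of other values in the same record: the local part of the subject IRI encodes the `skos:notation` string, and the `n3:klicoveSlovo` IRIs encode the entries of `n3:klicovaSlova`. The sketch below checks that correspondence with Python's standard `urllib.parse.unquote`. The variable names are assumptions made for illustration; the string values are copied from the statements.

```python
from urllib.parse import unquote

# Values copied from the statement list; variable names are illustrative only.
SUBJECT_LOCAL = "RIV%2F49777513%3A23520%2F05%3A00000284%21RIV07-MSM-23520___"
NOTATION = "RIV/49777513:23520/05:00000284!RIV07-MSM-23520___"
ID_VYSLEDKU = "RIV/49777513:23520/05:00000284"

KEYWORD_LITERAL = "SUs; structural metadata; spontaneous speech; disfluencies; fillers"
KEYWORD_LOCALS = [
    "structural%20metadata",
    "spontaneous%20speech",
    "fillers",
    "disfluencies",
    "SUs",
]

# Decoding the subject's local name should give back the skos:notation value.
decoded_subject = unquote(SUBJECT_LOCAL)
print(decoded_subject == NOTATION)

# The notation is the result identifier followed by '!' and a delivery suffix.
print(NOTATION.startswith(ID_VYSLEDKU + "!"))

# The keyword IRIs and the semicolon-separated literal name the same keywords.
from_literal = {kw.strip() for kw in KEYWORD_LITERAL.split(";")}
from_iris = {unquote(local) for local in KEYWORD_LOCALS}
print(from_literal == from_iris)
```

Comparing the keywords as sets ignores their order, which differs between the literal and the IRI list.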