About: Database Framework for a Distributed Spoken Data Collection Project

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
Description	The chapter describes the main features of the Mluvka ("chatterbox") database system, which the Czech National Corpus uses to collect recordings and transcriptions of authentic spoken Czech in informal situations. To capture the widest possible variety of speakers, the material is collected across the whole country through a network of local collaborators. The system serves as central data storage that reflects the distributed character of the project and facilitates its organisation in several ways. In particular, it ensures formal conformance of all submissions, supports several levels of read/write access rights based on the collection areas, and enables continuous balancing of the collected material. Mluvka is a well-attested system that underlies both recently published corpora of authentic spoken Czech, ORAL2006 and ORAL2008. Their total size is 2 650 000 tokens including punctuation; ORAL2008 is balanced in selected sociolinguistic categories of speakers. (en)
Title	Database Framework for a Distributed Spoken Data Collection Project (en)
skos:prefLabel	Database Framework for a Distributed Spoken Data Collection Project (en)
skos:notation	RIV/00216208:11210/11:10103866!RIV12-MSM-11210___
http://linked.open...avai/predkladatel	Filozofická fakulta
http://linked.open...avai/riv/aktivita	Z
http://linked.open...avai/riv/aktivity	Z(MSM0021620823)
http://linked.open...vai/riv/dodaniDat	2012
http://linked.open...aciTvurceVysledku	Křen, Michal; Waclawičová, Martina
http://linked.open.../riv/druhVysledku	O - Other results that cannot be classified under any of the result types listed above
http://linked.open...iv/duvernostUdaju	S - Complete and truthful data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Univerzita Karlova v Praze / Filozofická fakulta
http://linked.open...dnocenehoVysledku	192935
http://linked.open...ai/riv/idVysledku	RIV/00216208:11210/11:10103866
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	transcription; informal spoken Czech; representativeness; sociolinguistics; regional coverage; data collection; corpus design; spoken corpora; language resources; Czech language (en)
http://linked.open.../riv/klicoveSlovo	corpus design; informal spoken Czech; language resources; regional coverage; representativeness; sociolinguistics; spoken corpora; transcription; Czech language - declension - foreign oikonyms - SYN2005 and SYN2010 corpora; data collection
http://linked.open...ontrolniKodProRIV	[3F9CC99079D0]
http://linked.open...in/vavai/riv/obor	AI
http://linked.open...ichTvurcuVysledku	2 (xsd:int)
http://linked.open...cetTvurcuVysledku	2 (xsd:int)
http://linked.open...UplatneniVysledku	2011
http://linked.open...iv/tvurceVysledku	Křen, Michal; Waclawičová, Martina
http://linked.open...n/vavai/riv/zamer	Czech National Corpus and Corpora of Other Languages
http://localhost/t...ganizacniJednotka	11210
