About: Database Framework for a Distributed Spoken Data Collection Project

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes   Values
rdf:type
Description
  • The chapter describes the main features of Mluvka ("chatterbox"), the database system used in the Czech National Corpus for collecting recordings and transcriptions of authentic spoken Czech used in informal situations. To record the widest possible variety of speakers, the material is collected across the whole country through a network of local collaborators. The system serves as central data storage that reflects the distributed character of the project and facilitates its organisation in several ways. In particular, it ensures the formal conformance of all submissions, supports several levels of read/write access rights based on collection areas, and enables continuous balancing of the collected material. Mluvka is a well-attested system behind both recently published corpora of authentic spoken Czech, ORAL2006 and ORAL2008. Their total size is 2 650 000 tokens including punctuation; ORAL2008 is balanced in selected sociolinguistic categories of speakers. (en)
Title
  • Database Framework for a Distributed Spoken Data Collection Project (en)
skos:prefLabel
  • Database Framework for a Distributed Spoken Data Collection Project (en)
skos:notation
  • RIV/00216208:11210/11:10103866!RIV12-MSM-11210___
http://linked.open...avai/predkladatel
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • Z(MSM0021620823)
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 192935
http://linked.open...ai/riv/idVysledku
  • RIV/00216208:11210/11:10103866
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • transcription; informal spoken Czech; representativeness; sociolinguistics; regional coverage; data collection; corpus design; spoken corpora; language resources; Czech language (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...ontrolniKodProRIV
  • [3F9CC99079D0]
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...UplatneniVysledku
http://linked.open...iv/tvurceVysledku
  • Křen, Michal
  • Waclawičová, Martina
http://linked.open...n/vavai/riv/zamer
http://localhost/t...ganizacniJednotka
  • 11210