About: The SYN-series corpora of written Czech     Goto   Sponge   NotDistinct   Permalink

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

AttributesValues
rdf:type
rdfs:seeAlso
Description
  • The paper overviews the SYN series of synchronic corpora of written Czech compiled within the framework of the Czech National Corpus project. It describes their design and processing with a focus on the annotation, i.e. lemmatization and morphological tagging. The paper also introduces SYN2013PUB, a new 935-million newspaper corpus of Czech published in 2013 as the most recent addition to the SYN series before planned revision of its architecture. SYN2013PUB can be seen as a completion of the series in terms of titles and publication dates of major Czech newspapers that are now covered by complete volumes in comparable proportions. All SYN-series corpora can be characterized as traditional, with emphasis on cleared copyright issues, well-defined composition, reliable metadata and high-quality data processing; their overall size currently exceeds 2.2 billion running words.
  • The paper overviews the SYN series of synchronic corpora of written Czech compiled within the framework of the Czech National Corpus project. It describes their design and processing with a focus on the annotation, i.e. lemmatization and morphological tagging. The paper also introduces SYN2013PUB, a new 935-million newspaper corpus of Czech published in 2013 as the most recent addition to the SYN series before planned revision of its architecture. SYN2013PUB can be seen as a completion of the series in terms of titles and publication dates of major Czech newspapers that are now covered by complete volumes in comparable proportions. All SYN-series corpora can be characterized as traditional, with emphasis on cleared copyright issues, well-defined composition, reliable metadata and high-quality data processing; their overall size currently exceeds 2.2 billion running words. (en)
Title
  • The SYN-series corpora of written Czech
  • The SYN-series corpora of written Czech (en)
skos:prefLabel
  • The SYN-series corpora of written Czech
  • The SYN-series corpora of written Czech (en)
skos:notation
  • RIV/00216208:11210/14:10285597!RIV15-MSM-11210___
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(LM2011023)
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 48935
http://linked.open...ai/riv/idVysledku
  • RIV/00216208:11210/14:10285597
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • Czech; language corpus; written language (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...ontrolniKodProRIV
  • [28746911E6BE]
http://linked.open...v/mistoKonaniAkce
  • Reykjavík, Island
http://linked.open...i/riv/mistoVydani
  • Reykjavík, Island
http://linked.open...i/riv/nazevZdroje
  • Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...iv/tvurceVysledku
  • Křen, Michal
  • Procházka, Pavel
  • Hnátková, Milena
  • Skoumalová, Hana
http://linked.open...vavai/riv/typAkce
http://linked.open.../riv/zahajeniAkce
number of pages
http://purl.org/ne...btex#hasPublisher
  • European Language Resources Association (ELRA)
https://schema.org/isbn
  • 978-2-9517408-8-4
http://localhost/t...ganizacniJednotka
  • 11210
Faceted Search & Find service v1.16.118 as of Jun 21 2024


Alternative Linked Data Documents: ODE     Content Formats:   [cxml] [csv]     RDF   [text] [turtle] [ld+json] [rdf+json] [rdf+xml]     ODATA   [atom+xml] [odata+json]     Microdata   [microdata+json] [html]    About   
This material is Open Knowledge   W3C Semantic Web Technology [RDF Data] Valid XHTML + RDFa
OpenLink Virtuoso version 07.20.3240 as of Jun 21 2024, on Linux (x86_64-pc-linux-gnu), Single-Server Edition (126 GB total memory, 58 GB memory in use)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2024 OpenLink Software