This HTML5 document contains 48 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix    IRI
n12       http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms   http://purl.org/dc/terms/
n16       http://localhost/temp/predkladatel/
n10       http://purl.org/net/nknouf/ns/bibtex#
n22       http://linked.opendata.cz/resource/domain/vavai/projekt/
n8        http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n19       http://linked.opendata.cz/resource/domain/vavai/subjekt/
n18       http://linked.opendata.cz/ontology/domain/vavai/
n5        https://schema.org/
s         http://schema.org/
skos      http://www.w3.org/2004/02/skos/core#
n3        http://linked.opendata.cz/ontology/domain/vavai/riv/
n7        http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216224%3A14330%2F11%3A00054044%21RIV12-MSM-14330___/
n2        http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf       http://www.w3.org/1999/02/22-rdf-syntax-ns#
n4        http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n20       http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh      http://www.w3.org/2001/XMLSchema#
n17       http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n11       http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n21       http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n13       http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n6        http://reference.data.gov.uk/id/gregorian-year/
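
The prefixed names used in the statements below are shorthand for full IRIs: the prefix is replaced by its IRI from the table and the local part is appended unchanged. A minimal sketch of that expansion follows, using a subset of the prefixes listed here. The function name `expand` and the dictionary `PREFIXES` are illustrative assumptions, not part of this document or of any Microdata tooling.

```python
# Subset of the prefix table, copied verbatim from this document.
PREFIXES = {
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n3": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "n6": "http://reference.data.gov.uk/id/gregorian-year/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(name, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full IRI by plain concatenation (hypothetical helper)."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


print(expand("n6:2012"))
print(expand("skos:prefLabel"))
```

Splitting on the first colon only keeps any later colons in the local part intact.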

Statements

Subject Item
n2:RIV%2F00216224%3A14330%2F11%3A00054044%21RIV12-MSM-14330___
rdf:type
skos:Concept n18:Vysledek
dcterms:description
Paper presents by far the largest available computer corpus of Tajik Language of the size of more than 50 million words. To obtain the texts for the corpus two different approaches were used. The paper brings a description of both of them, discusses their advantages and disadvantages and shows some statistics of the two respective partial corpora. Then the paper characterizes the resulting joined corpus and finally discusses some possible future improvements.
dcterms:title
Building a 50M Corpus of Tajik Language
skos:prefLabel
Building a 50M Corpus of Tajik Language
skos:notation
RIV/00216224:14330/11:00054044!RIV12-MSM-14330___
n18:predkladatel
n19:orjk%3A14330
n3:aktivita
n17:S n17:P
n3:aktivity
P(LC536), S
n3:dodaniDat
n6:2012
n3:domaciTvurceVysledku
n8:1322451 n8:8884439 Dovudov, Gulshan n8:4980190
n3:druhVysledku
n13:D
n3:duvernostUdaju
n20:S
n3:entitaPredkladatele
n7:predkladatel
n3:idSjednocenehoVysledku
188873
n3:idVysledku
RIV/00216224:14330/11:00054044
n3:jazykVysledku
n11:eng
n3:klicovaSlova
language corpora; corpus; corpus building; tajik
n3:klicoveSlovo
n4:corpus n4:language%20corpora n4:tajik n4:corpus%20building
n3:kontrolniKodProRIV
[7648BA1D309B]
n3:mistoKonaniAkce
Karlova Studánka
n3:mistoVydani
Brno
n3:nazevZdroje
Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2011
n3:obor
n21:AI
n3:pocetDomacichTvurcuVysledku
4
n3:pocetTvurcuVysledku
4
n3:projekt
n22:LC536
n3:rokUplatneniVysledku
n6:2011
n3:tvurceVysledku
Pomikálek, Jan; Suchomel, Vít; Šmerk, Pavel; Dovudov, Gulshan
n3:typAkce
n12:CST
n3:zahajeniAkce
2011-01-01+01:00
s:numberOfPages
7
n10:hasPublisher
Tribun EU
n5:isbn
978-80-263-0077-9
n16:organizacniJednotka
14330
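
The local part of the subject item is the percent-encoded form of the skos:notation value, and the n3:idVysledku value is the portion of that notation before the "!". The sketch below checks this for the record above using Python's standard `urllib.parse`. The variable names are illustrative, and the relationship is an observation about this one record, not a documented rule of the dataset.

```python
from urllib.parse import quote, unquote

# Values copied verbatim from the statements above.
local_name = "RIV%2F00216224%3A14330%2F11%3A00054044%21RIV12-MSM-14330___"
notation = "RIV/00216224:14330/11:00054044!RIV12-MSM-14330___"
id_vysledku = "RIV/00216224:14330/11:00054044"

# Percent-decoding the subject's local name yields the notation.
decoded = unquote(local_name)
print(decoded == notation)

# Encoding with no characters marked safe reproduces the local name.
print(quote(notation, safe="") == local_name)

# In this record, the result identifier is the text before the "!".
print(decoded.split("!", 1)[0] == id_vysledku)
```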