This HTML5 document contains 42 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
n7       http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms  http://purl.org/dc/terms/
n10      http://localhost/temp/predkladatel/
n8       http://purl.org/net/nknouf/ns/bibtex#
n21      http://linked.opendata.cz/resource/domain/vavai/projekt/
n9       http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n19      http://linked.opendata.cz/resource/domain/vavai/subjekt/
n14      http://linked.opendata.cz/ontology/domain/vavai/
n22      https://schema.org/
s        http://schema.org/
skos     http://www.w3.org/2004/02/skos/core#
n3       http://linked.opendata.cz/ontology/domain/vavai/riv/
n6       http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216224%3A14330%2F13%3A00070316%21RIV14-MSM-14330___/
n2       http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n17      http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n4       http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh     http://www.w3.org/2001/XMLSchema#
n18      http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n13      http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n16      http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n15      http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n12      http://reference.data.gov.uk/id/gregorian-year/
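
Each prefixed name in the statements that follow (for example n3:obor or n12:2013) abbreviates a full IRI: the namespace IRI from the table above with the local part appended. The following is a minimal sketch of that expansion, not part of the original page. The helper name expand and the PREFIXES dictionary are hypothetical, and only a few of the prefixes listed above are copied in for illustration.

```python
# Hypothetical illustration: expand a prefixed name into a full IRI
# using a subset of the namespace prefixes listed on this page.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n3": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "n12": "http://reference.data.gov.uk/id/gregorian-year/",
}


def expand(name, prefixes=PREFIXES):
    """Return the full IRI for a 'prefix:local' name.

    Raises KeyError if the name has no colon or the prefix is unknown.
    """
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    # The local part is appended as-is; percent-escapes are left untouched.
    return prefixes[prefix] + local


if __name__ == "__main__":
    for name in ("n3:obor", "n12:2013", "dcterms:title"):
        print(name, "->", expand(name))
```

The local part is concatenated verbatim, so percent-encoded subjects such as the n2: identifier of this record keep their escapes when expanded.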

Statements

Subject Item
n2:RIV%2F00216224%3A14330%2F13%3A00070316%21RIV14-MSM-14330___
rdf:type
skos:Concept n14:Vysledek
dcterms:description
Since there are only very few techniques for quantitative and systematic comparison of text corpora we proposed and implemented several novel methods. The procedures were applied to comparing two very large web based Czech text corpora: czTenTen12 and Hector with more than 4.47 and 2.65 billion words, respectively. All methods are fully automatic and some of them are even language independent. We released some of them so they can be used instantly for comparison of other corpora.
dcterms:title
Intrinsic Methods for Comparison of Corpora
skos:prefLabel
Intrinsic Methods for Comparison of Corpora
skos:notation
RIV/00216224:14330/13:00070316!RIV14-MSM-14330___
n14:predkladatel
n19:orjk%3A14330
n3:aktivita
n13:S n13:P
n3:aktivity
P(LM2010013), S
n3:dodaniDat
n12:2014
n3:domaciTvurceVysledku
n9:8884439 n9:9652353
n3:druhVysledku
n16:D
n3:duvernostUdaju
n4:S
n3:entitaPredkladatele
n6:predkladatel
n3:idSjednocenehoVysledku
80969
n3:idVysledku
RIV/00216224:14330/13:00070316
n3:jazykVysledku
n18:eng
n3:klicovaSlova
text corpus; corpora comparison
n3:klicoveSlovo
n17:corpora%20comparison n17:text%20corpus
n3:kontrolniKodProRIV
[AC486683F094]
n3:mistoKonaniAkce
Karlova studánka, Česká republika
n3:mistoVydani
Brno
n3:nazevZdroje
RASLAN 2013 Recent Advances in Slavonic Natural Language Processing
n3:obor
n15:IN
n3:pocetDomacichTvurcuVysledku
2
n3:pocetTvurcuVysledku
2
n3:projekt
n21:LM2010013
n3:rokUplatneniVysledku
n12:2013
n3:tvurceVysledku
Suchomel, Vít; Baisa, Vít
n3:typAkce
n7:EUR
n3:zahajeniAkce
2013-12-06+01:00
s:numberOfPages
8
n8:hasPublisher
Tribun EU
n22:isbn
9788026305200
n10:organizacniJednotka
14330