This HTML5 document contains 49 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
n11      http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms  http://purl.org/dc/terms/
n10      http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216224%3A14330%2F14%3A00075387%21RIV15-MSM-14330___/
n18      http://purl.org/net/nknouf/ns/bibtex#
n14      http://localhost/temp/predkladatel/
n8       http://linked.opendata.cz/resource/domain/vavai/projekt/
n7       http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n16      http://linked.opendata.cz/ontology/domain/vavai/
n22      https://schema.org/
s        http://schema.org/
skos     http://www.w3.org/2004/02/skos/core#
rdfs     http://www.w3.org/2000/01/rdf-schema#
n3       http://linked.opendata.cz/ontology/domain/vavai/riv/
n2       http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n6       http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n9       http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh     http://www.w3.org/2001/XMLSchema#
n19      http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n13      http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n21      http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n20      http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n15      http://reference.data.gov.uk/id/gregorian-year/

Statements

Subject Item
n2:RIV%2F00216224%3A14330%2F14%3A00075387%21RIV15-MSM-14330___
rdf:type
n16:Vysledek skos:Concept
rdfs:seeAlso
http://aclweb.org/anthology/E/E14/E14-2014.pdf
dcterms:description
Term candidates for a domain, in a language, can be found by: taking a corpus for the domain, and a reference corpus for the language; identifying the grammatical shape of a term in the language; tokenising, lemmatising and POS-tagging both corpora; identifying (and counting) the items in each corpus which match the grammatical shape; and, for each item in the domain corpus, comparing its frequency with its frequency in the reference corpus. Then, the items with the highest frequency in the domain corpus in comparison to the reference corpus will be the top term candidates. None of the steps above are unusual or innovative for NLP (see, e.g., (Aker et al., 2013), (Gojun et al., 2012)). However, it is far from trivial to implement them all, for numerous languages, in an environment that makes it easy for non-programmers to find the terms in a domain. This is what we have done in the Sketch Engine (Kilgarriff et al., 2004), and will demonstrate.
dcterms:title
Finding Terms in Corpora for Many Languages with the Sketch Engine
skos:prefLabel
Finding Terms in Corpora for Many Languages with the Sketch Engine
skos:notation
RIV/00216224:14330/14:00075387!RIV15-MSM-14330___
n3:aktivita
n13:S n13:P
n3:aktivity
P(LM2010013), S
n3:dodaniDat
n15:2015
n3:domaciTvurceVysledku
n7:5837189 n7:6217850 n7:8884439 n7:6616844
n3:druhVysledku
n21:D
n3:duvernostUdaju
n9:S
n3:entitaPredkladatele
n10:predkladatel
n3:idSjednocenehoVysledku
16860
n3:idVysledku
RIV/00216224:14330/14:00075387
n3:jazykVysledku
n19:eng
n3:klicovaSlova
terminology; terms; corpora; sketch engine
n3:klicoveSlovo
n6:corpora n6:terminology n6:sketch%20engine n6:terms
n3:kontrolniKodProRIV
[7749E15EF4C1]
n3:mistoKonaniAkce
Gothenburg, Sweden
n3:mistoVydani
Gothenburg, Sweden
n3:nazevZdroje
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics
n3:obor
n20:IN
n3:pocetDomacichTvurcuVysledku
4
n3:pocetTvurcuVysledku
5
n3:projekt
n8:LM2010013
n3:rokUplatneniVysledku
n15:2014
n3:tvurceVysledku
Kilgarriff, Adam; Rychlý, Pavel; Jakubíček, Miloš; Suchomel, Vít; Kovář, Vojtěch
n3:typAkce
n11:WRD
n3:zahajeniAkce
2014-01-01+01:00
s:numberOfPages
4
n18:hasPublisher
The Association for Computational Linguistics
n22:isbn
9781937284756
n14:organizacniJednotka
14330
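
Illustrative sketch

The dcterms:description above outlines a frequency comparison between a domain corpus and a reference corpus. The Python sketch below illustrates only that final comparison step, under stated assumptions: the candidate items are taken as already extracted (tokenising, lemmatising, POS-tagging and matching the grammatical shape are assumed to have happened upstream), and the scoring formula, a smoothed ratio of per-million frequencies, is an illustrative choice that the record itself does not specify. The function name and its parameters are hypothetical and do not reflect the Sketch Engine API.

```python
from collections import Counter


def rank_term_candidates(domain_items, reference_items, smoothing=1.0, top_n=10):
    """Rank candidate items by relative frequency in the domain corpus.

    domain_items / reference_items are lists of already-extracted candidate
    items (hypothetical input; the upstream NLP steps are assumed done).
    The smoothed-ratio score is an assumption made for illustration.
    """
    domain_counts = Counter(domain_items)
    reference_counts = Counter(reference_items)
    # Guard against empty corpora so the normalisation never divides by zero.
    domain_total = sum(domain_counts.values()) or 1
    reference_total = sum(reference_counts.values()) or 1

    scores = {}
    for item, count in domain_counts.items():
        # Normalise raw counts to frequency per million items.
        domain_fpm = count * 1_000_000 / domain_total
        reference_fpm = reference_counts[item] * 1_000_000 / reference_total
        # Smoothing keeps the ratio finite for items absent from the reference.
        scores[item] = (domain_fpm + smoothing) / (reference_fpm + smoothing)

    # Highest ratio first: most characteristic of the domain corpus.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy data, invented for this example only.
domain = ["term extraction"] * 3 + ["last year"]
reference = ["last year"] * 5 + ["other thing"] * 5
ranked = rank_term_candidates(domain, reference)
```

With this toy input, an item that is frequent in the domain list and absent from the reference list ranks above an item that is common in both, which is the behaviour the description calls for.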