This HTML5 document contains 44 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
n17      http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms  http://purl.org/dc/terms/
n16      http://purl.org/net/nknouf/ns/bibtex#
n11      http://localhost/temp/predkladatel/
n15      http://linked.opendata.cz/resource/domain/vavai/projekt/
n4       http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n18      http://linked.opendata.cz/ontology/domain/vavai/
n19      https://schema.org/
s        http://schema.org/
skos     http://www.w3.org/2004/02/skos/core#
n3       http://linked.opendata.cz/ontology/domain/vavai/riv/
n2       http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n6       http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216224%3A14330%2F08%3A00024204%21RIV11-MSM-14330___/
n10      http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n20      http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh     http://www.w3.org/2001/XMLSchema#
n21      http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n13      http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n14      http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n8       http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n7       http://reference.data.gov.uk/id/gregorian-year/
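
The prefixed names used in the statements are shorthand for full IRIs: the prefix is replaced by the IRI listed for it in the table above, and the local part is appended. The following is a minimal sketch of that expansion, not part of the dataset or its tooling. The `PREFIXES` dictionary copies only a few rows of the table, and the function name `expand` is a hypothetical helper chosen for illustration.

```python
# A subset of the prefix table above; the remaining rows can be added the same way.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "n3": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "n7": "http://reference.data.gov.uk/id/gregorian-year/",
    "n14": "http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'n7:2008' into a full IRI.

    Hypothetical helper: splits at the first colon and concatenates the
    namespace IRI with the local part.
    """
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


print(expand("n3:druhVysledku"))
print(expand("n14:D"))
print(expand("n7:2008"))
```

Plain literal values (titles, keywords, page counts) carry no prefix and are not meant to be passed through such a function; only the property names and the resource-valued objects are prefixed names.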

Statements

Subject Item
n2:RIV%2F00216224%3A14330%2F08%3A00024204%21RIV11-MSM-14330___
rdf:type
skos:Concept n18:Vysledek
dcterms:description
In this paper we deal with a recently developed large Czech MWE database containing at the moment 160 000 MWEs (treated as lexical units). We describe the structure of the database and give basic types of MWEs according to domains they belong to. We compare the built MWEs database with the corpus data from Czech National Corpus (approx. 100 mil. tokens) and present results of this comparison in the paper. To obtain a more complete list of MWEs we propose and use a technique exploiting the Word Sketch Engine, which allows us to work with statistical parameters such as frequency of MWEs and their components as well as with the salience for the whole MWEs. We also discuss exploitation of the database for working out a more adequate tagging and lemmatization. The final goal is to be able to recognize MWEs in corpus text and lemmatize them as complete lexical units, i. e. to make tagging and lemmatization more adequate.
dcterms:title
Czech MWE Database
skos:prefLabel
Czech MWE Database
skos:notation
RIV/00216224:14330/08:00024204!RIV11-MSM-14330___
n3:aktivita
n21:P
n3:aktivity
P(1ET200610406), P(2C06009), P(LC536)
n3:dodaniDat
n7:2011
n3:domaciTvurceVysledku
n4:7167660 n4:1322451 n4:6076939
n3:druhVysledku
n14:D
n3:duvernostUdaju
n20:S
n3:entitaPredkladatele
n6:predkladatel
n3:idSjednocenehoVysledku
361940
n3:idVysledku
RIV/00216224:14330/08:00024204
n3:jazykVysledku
n13:eng
n3:klicovaSlova
multiword expressions; word sketch engine
n3:klicoveSlovo
n10:word%20sketch%20engine n10:multiword%20expressions
n3:kontrolniKodProRIV
[675D2A335645]
n3:mistoKonaniAkce
Marrakech, Morocco
n3:mistoVydani
Marrakech, Morocco
n3:nazevZdroje
Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC '08)
n3:obor
n8:IN
n3:pocetDomacichTvurcuVysledku
3
n3:pocetTvurcuVysledku
3
n3:projekt
n15:LC536 n15:2C06009 n15:1ET200610406
n3:rokUplatneniVysledku
n7:2008
n3:tvurceVysledku
Svoboda, Lukáš; Pala, Karel; Šmerk, Pavel
n3:typAkce
n17:WRD
n3:zahajeniAkce
2008-05-28+02:00
s:numberOfPages
5
n16:hasPublisher
European Language Resources Association (ELRA)
n19:isbn
2-9517408-4-0
n11:organizacniJednotka
14330
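
The local part of the subject IRI is the percent-encoded form of the skos:notation value listed above. The following sketch, using only Python's standard `urllib.parse` module, decodes one into the other. The variable names are illustrative. The split at `!` reflects only what can be observed in this record, where the part before `!` equals the n3:idVysledku value; it is not taken from a documented identifier format.

```python
from urllib.parse import quote, unquote

# Local part of the subject IRI, as it appears under "Subject Item".
encoded = "RIV%2F00216224%3A14330%2F08%3A00024204%21RIV11-MSM-14330___"

# Decoding the percent-escapes yields the skos:notation value.
notation = unquote(encoded)
print(notation)

# Assumption based on this record only: the text before "!" matches n3:idVysledku.
result_id, _, suffix = notation.partition("!")
print(result_id)

# Re-encoding with safe="" escapes "/" as well, reproducing the IRI form.
reencoded = quote(notation, safe="")
print(reencoded == encoded)
```

Passing `safe=""` matters here: by default `quote` leaves `/` unescaped, which would not reproduce the form used in the subject IRI.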