About: Extrinsic Corpus Evaluation with a Collocation Dictionary Task


An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
rdfs:seeAlso	http://www.lrec-conf.org/proceedings/lrec2014/summaries/52.html
Description	The NLP researcher or application-builder often wonders "what corpus should I use, or should I build one of my own? If I build one of my own, how will I know if I have done a good job?" Currently there is very little help available for them. They are in need of a framework for evaluating corpora. We develop such a framework, in relation to corpora which aim for good coverage of 'general language'. The task we set is automatic creation of a publication-quality collocations dictionary. For a sample of 100 headwords of Czech and 100 of English, we identify a gold standard dataset of (ideally) all the collocations that should appear for these headwords in such a dictionary. The datasets are being made available alongside this paper. We then use them to determine precision and recall for a range of corpora, with a range of parameters. (en)
Title	Extrinsic Corpus Evaluation with a Collocation Dictionary Task (en)
skos:prefLabel	Extrinsic Corpus Evaluation with a Collocation Dictionary Task (en)
skos:notation	RIV/00216224:14330/14:00073227!RIV15-MV0-14330___
http://linked.open...avai/riv/aktivita	P S
http://linked.open...avai/riv/aktivity	P(LM2010013), P(VF20102014003), S
http://linked.open...vai/riv/dodaniDat	2015
http://linked.open...aciTvurceVysledku	Jakubíček, Miloš; Kovář, Vojtěch; Baisa, Vít; Rychlý, Pavel; Kocincová, Lucia
http://linked.open.../riv/druhVysledku	D - Article in proceedings
http://linked.open...iv/duvernostUdaju	S - Complete and truthful data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Masarykova univerzita / Fakulta informatiky
http://linked.open...dnocenehoVysledku	16265
http://linked.open...ai/riv/idVysledku	RIV/00216224:14330/14:00073227
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	corpus; evaluation; collocation (en)
http://linked.open.../riv/klicoveSlovo	collocation; corpus; evaluation
http://linked.open...ontrolniKodProRIV	[83EE300B0AE8]
http://linked.open...v/mistoKonaniAkce	Reykjavik, Iceland
http://linked.open...i/riv/mistoVydani	Reykjavik, Iceland
http://linked.open...i/riv/nazevZdroje	Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
http://linked.open...in/vavai/riv/obor	IN
http://linked.open...ichTvurcuVysledku	5 (xsd:int)
http://linked.open...cetTvurcuVysledku	6 (xsd:int)
http://linked.open...vavai/riv/projekt	LINDAT-CLARIN: Institute for analysis, processing and distribution of linguistic data; Natural Language Analysis in the Internet Environment
http://linked.open...UplatneniVysledku	2014
http://linked.open...iv/tvurceVysledku	Baisa, Vít; Jakubíček, Miloš; Kovář, Vojtěch; Kilgarriff, Adam; Rychlý, Pavel; Kocincová, Lucia
http://linked.open...vavai/riv/typAkce	WRD - Worldwide
http://linked.open.../riv/zahajeniAkce	2014-01-01 (xsd:date)
number of pages	8 (xsd:int)
http://purl.org/ne...btex#hasPublisher	European Language Resources Association (ELRA)
https://schema.org/isbn	9782951740884
http://localhost/t...ganizacniJednotka	14330
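
The Description above says the gold standard collocation sets are used to determine precision and recall for a range of corpora. A minimal sketch of that scoring step is shown here, assuming collocations are represented as (headword, collocate) pairs. The function name `precision_recall` and the toy data are hypothetical illustrations, not taken from the paper or its datasets.

```python
def precision_recall(candidates, gold):
    """Score a candidate collocation set against a gold standard set.

    Both arguments are iterables of (headword, collocate) pairs.
    Returns (precision, recall); an empty set yields 0.0 for its measure.
    """
    candidates, gold = set(candidates), set(gold)
    true_positives = len(candidates & gold)
    precision = true_positives / len(candidates) if candidates else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, invented for illustration only.
gold = {("tea", "strong"), ("tea", "green"), ("tea", "cup"), ("tea", "drink")}
candidates = {("tea", "strong"), ("tea", "green"), ("tea", "the")}

p, r = precision_recall(candidates, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.67 recall=0.50"
```

Running the same function over candidate lists extracted from different corpora, or with different extraction parameters, gives directly comparable scores per corpus.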
