. "Jakub\u00ED\u010Dek, Milo\u0161" . . "[83EE300B0AE8]" . . . . . . . . "Kov\u00E1\u0159, Vojt\u011Bch" . . . "RIV/00216224:14330/14:00073227!RIV15-MV0-14330___" . . . "http://www.lrec-conf.org/proceedings/lrec2014/summaries/52.html" . "Reykjavik, Iceland" . "Reykjavik, Iceland" . "8"^^ . "Extrinsic Corpus Evaluation with a Collocation Dictionary Task" . "corpus; evaluation; collocation"@en . "Extrinsic Corpus Evaluation with a Collocation Dictionary Task"@en . "European Language Resources Association (ELRA)" . "The NLP researcher or application-builder often wonders ``what corpus should I use, or should I build one of my own? If I build one of my own, how will I know if I have done a good job?'' Currently there is very little help available for them. They are in need of a framework for evaluating corpora. We develop such a framework, in relation to corpora which aim for good coverage of `general language'. The task we set is automatic creation of a publication-quality collocations dictionary. For a sample of 100 headwords of Czech and 100 of English, we identify a gold standard dataset of (ideally) all the collocations that should appear for these headwords in such a dictionary. The datasets are being made available alongside this paper. We then use them to determine precision and recall for a range of corpora, with a range of parameters." . "2014-01-01+01:00"^^ . "6"^^ . "Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)" . "5"^^ . . "Extrinsic Corpus Evaluation with a Collocation Dictionary Task"@en . "16265" . "Kilgarriff, Adam" . . "Baisa, V\u00EDt" . . . "The NLP researcher or application-builder often wonders ``what corpus should I use, or should I build one of my own? If I build one of my own, how will I know if I have done a good job?'' Currently there is very little help available for them. They are in need of a framework for evaluating corpora. We develop such a framework, in relation to corpora which aim for good coverage of `general language'. The task we set is automatic creation of a publication-quality collocations dictionary. For a sample of 100 headwords of Czech and 100 of English, we identify a gold standard dataset of (ideally) all the collocations that should appear for these headwords in such a dictionary. The datasets are being made available alongside this paper. We then use them to determine precision and recall for a range of corpora, with a range of parameters."@en . "Rychl\u00FD, Pavel" . . . "P(LM2010013), P(VF20102014003), S" . "Kocincov\u00E1, Lucia" . . "14330" . . "9782951740884" . "RIV/00216224:14330/14:00073227" . "Extrinsic Corpus Evaluation with a Collocation Dictionary Task" . .