This HTML5 document contains 55 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
n21      http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms  http://purl.org/dc/terms/
n8       http://localhost/temp/predkladatel/
n4       http://purl.org/net/nknouf/ns/bibtex#
n6       http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n5       http://linked.opendata.cz/resource/domain/vavai/projekt/
n10      http://linked.opendata.cz/resource/domain/vavai/subjekt/
n9       http://linked.opendata.cz/ontology/domain/vavai/
n17      https://schema.org/
s        http://schema.org/
skos     http://www.w3.org/2004/02/skos/core#
rdfs     http://www.w3.org/2000/01/rdf-schema#
n3       http://linked.opendata.cz/ontology/domain/vavai/riv/
n2       http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n13      http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n19      http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh     http://www.w3.org/2001/XMLSchema#
n20      http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n23      http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n15      http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216208%3A11320%2F12%3A10130100%21RIV13-GA0-11320___/
n11      http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n7       http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n12      http://reference.data.gov.uk/id/gregorian-year/
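
A prefixed name in the statements that follow stands for the prefix's IRI with the local part appended. The Python sketch below illustrates that expansion. The dictionary `PREFIXES` copies only three rows of the prefix table, and the helper name `expand` is hypothetical, chosen for illustration; neither is part of the published data.

```python
# Three rows copied from the prefix table; extend as needed.
# PREFIXES and expand() are illustrative names, not part of the dataset.
PREFIXES = {
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n3": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full IRI.

    Raises KeyError when the prefix is not in the mapping.
    """
    # Split on the first colon only, so local parts may contain colons.
    prefix, local = prefixed_name.split(":", 1)
    return prefixes[prefix] + local


print(expand("dcterms:title"))
print(expand("n3:jazykVysledku"))
```

Splitting on the first colon only is a deliberate choice here: it keeps the helper usable for local parts that themselves contain a colon.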

Statements

Subject Item
n2:RIV%2F00216208%3A11320%2F12%3A10130100%21RIV13-GA0-11320___
rdf:type
n9:Vysledek skos:Concept
rdfs:seeAlso
http://www.aclweb.org/anthology/W12-3148
dcterms:description
We provide a few insights on data selection for machine translation. We evaluate the quality of the new CzEng 1.0, a parallel data source used in WMT12. We describe a simple technique for reducing out-of-vocabulary rate after phrase extraction. We discuss the benefits of tuning towards multiple reference translations for English-Czech language pair. We introduce a novel approach to data selection by full-text indexing and search: we select sentences similar to the test set from a large monolingual corpus and explore several options of incorporating them in a machine translation system. We show that this method can improve translation quality. Finally, we describe our submitted system CU-TAMCH-BOJ.
dcterms:title
Selecting Data for English-to-Czech Machine Translation
skos:prefLabel
Selecting Data for English-to-Czech Machine Translation
skos:notation
RIV/00216208:11320/12:10130100!RIV13-GA0-11320___
n9:predkladatel
n10:orjk%3A11320
n3:aktivita
n20:P
n3:aktivity
P(7E09003), P(7E11051), P(GAP406/11/1499), P(GPP406/10/P259)
n3:dodaniDat
n12:2013
n3:domaciTvurceVysledku
n6:2630176; Stanojević, Miloš; Kamran, Amir; n6:3528839; n6:5657512
n3:druhVysledku
n7:D
n3:duvernostUdaju
n19:S
n3:entitaPredkladatele
n15:predkladatel
n3:idSjednocenehoVysledku
167306
n3:idVysledku
RIV/00216208:11320/12:10130100
n3:jazykVysledku
n23:eng
n3:klicovaSlova
translation; machine; czech; english; data; selecting
n3:klicoveSlovo
n13:czech n13:english n13:translation n13:machine n13:data n13:selecting
n3:kontrolniKodProRIV
[FC0C9002DFCF]
n3:mistoKonaniAkce
Montréal, Canada
n3:mistoVydani
Montréal, Canada
n3:nazevZdroje
Proceedings of the Seventh Workshop on Statistical Machine Translation
n3:obor
n11:IN
n3:pocetDomacichTvurcuVysledku
5
n3:pocetTvurcuVysledku
5
n3:projekt
n5:7E11051 n5:GPP406%2F10%2FP259 n5:GAP406%2F11%2F1499 n5:7E09003
n3:rokUplatneniVysledku
n12:2012
n3:tvurceVysledku
Bojar, Ondřej; Galuščáková, Petra; Kamran, Amir; Tamchyna, Aleš; Stanojević, Miloš
n3:typAkce
n21:CST
n3:zahajeniAkce
2012-06-07+02:00
s:numberOfPages
8
n4:hasPublisher
Association for Computational Linguistics
n17:isbn
978-1-937284-20-6
n8:organizacniJednotka
11320
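
The local part of the subject item is the percent-encoded form of the skos:notation value, and n3:klicovaSlova packs the key words into one semicolon-separated string. The Python sketch below, using only the standard library, shows how both could be decoded. The variable names are illustrative, and the two input strings are copied from the statements above.

```python
from urllib.parse import unquote

# Local part of the subject item, copied from the statement list.
local_name = "RIV%2F00216208%3A11320%2F12%3A10130100%21RIV13-GA0-11320___"

# Percent-decoding turns %2F into "/", %3A into ":" and %21 into "!".
notation = unquote(local_name)
print(notation)

# The n3:klicovaSlova literal, copied from the statement list.
raw_keywords = "translation; machine; czech; english; data; selecting"

# Split on ";" and trim the surrounding whitespace of each key word.
keywords = [word.strip() for word in raw_keywords.split(";")]
print(keywords)
```

The decoded string can be compared against the skos:notation value as a quick consistency check on the record.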