This HTML5 document contains 47 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
n17      http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms  http://purl.org/dc/terms/
n18      http://localhost/temp/predkladatel/
n12      http://purl.org/net/nknouf/ns/bibtex#
n8       http://linked.opendata.cz/resource/domain/vavai/projekt/
n7       http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n3       http://linked.opendata.cz/ontology/domain/vavai/
n14      https://schema.org/
s        http://schema.org/
rdfs     http://www.w3.org/2000/01/rdf-schema#
skos     http://www.w3.org/2004/02/skos/core#
n5       http://linked.opendata.cz/ontology/domain/vavai/riv/
n2       http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n6       http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n9       http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh     http://www.w3.org/2001/XMLSchema#
n16      http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n15      http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n21      http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n22      http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216224%3A14330%2F14%3A00076251%21RIV15-MSM-14330___/
n19      http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n20      http://reference.data.gov.uk/id/gregorian-year/
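
Each prefixed name in the Statements section stands for a full IRI, formed by joining the prefix's IRI with the local part after the colon. A minimal Python sketch of that expansion follows. The PREFIXES dictionary copies only a few rows of the prefix table, and the helper name expand is hypothetical, not part of any library.

```python
# A few rows copied from the prefix table (deliberately not the full table).
PREFIXES = {
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n5": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(name: str) -> str:
    """Expand 'prefix:local' to a full IRI by concatenation (hypothetical helper)."""
    # Split on the first colon only; the local part is kept exactly as written.
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {name!r}")
    return PREFIXES[prefix] + local


print(expand("dcterms:title"))
print(expand("n5:obor"))
```

The same call applied to the subject item, expand("n2:RIV%2F00216224%3A14330%2F14%3A00076251%21RIV15-MSM-14330___"), yields the full IRI of the described result.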

Statements

Subject Item
n2:RIV%2F00216224%3A14330%2F14%3A00076251%21RIV15-MSM-14330___
rdf:type
n3:Vysledek skos:Concept
rdfs:seeAlso
http://www.lrec-conf.org/proceedings/lrec2014/summaries/835.html
dcterms:description
We present HindEnCorp, a parallel corpus of Hindi and English, and HindMonoCorp, a monolingual corpus of Hindi in their release version 0.5. Both corpora were collected from web sources and preprocessed primarily for the training of statistical machine translation systems. HindEnCorp consists of 274k parallel sentences (3.9 million Hindi and 3.8 million English tokens). HindMonoCorp amounts to 787 million tokens in 44 million sentences. Both the corpora are freely available for non-commercial research and their preliminary release has been used by numerous participants of the WMT 2014 shared translation task.
dcterms:title
HindEnCorp – Hindi-English and Hindi-only Corpus for Machine Translation
skos:prefLabel
HindEnCorp – Hindi-English and Hindi-only Corpus for Machine Translation
skos:notation
RIV/00216224:14330/14:00076251!RIV15-MSM-14330___
n5:aktivita
n15:P
n5:aktivity
P(LM2010013)
n5:dodaniDat
n20:2015
n5:domaciTvurceVysledku
n7:8884439 n7:6616844
n5:druhVysledku
n19:D
n5:duvernostUdaju
n9:S
n5:entitaPredkladatele
n22:predkladatel
n5:idSjednocenehoVysledku
19186
n5:idVysledku
RIV/00216224:14330/14:00076251
n5:jazykVysledku
n16:eng
n5:klicovaSlova
Machine Translation; SpeechToSpeech Translation; Metadata
n5:klicoveSlovo
n6:Machine%20Translation n6:SpeechToSpeech%20Translation n6:Metadata
n5:kontrolniKodProRIV
[1475EBE9759F]
n5:mistoKonaniAkce
Reykjavik, Iceland
n5:mistoVydani
Reykjavik, Iceland
n5:nazevZdroje
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
n5:obor
n21:IN
n5:pocetDomacichTvurcuVysledku
2
n5:pocetTvurcuVysledku
7
n5:projekt
n8:LM2010013
n5:rokUplatneniVysledku
n20:2014
n5:tvurceVysledku
Tamchyna, Aleš; Diatka, Vojtěch; Zeman, Daniel; Bojar, Ondřej; Suchomel, Vít; Straňák, Pavel; Rychlý, Pavel
n5:typAkce
n17:WRD
n5:zahajeniAkce
2014-05-26+02:00
s:numberOfPages
6
n12:hasPublisher
European Language Resources Association (ELRA)
n14:isbn
9782951740884
n18:organizacniJednotka
14330
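
The local name of the subject item is the percent-encoded form of the skos:notation value, and the n5:klicoveSlovo objects are percent-encoded forms of the entries in n5:klicovaSlova. The Python sketch below checks this for the values in this record using the standard urllib.parse module. The variable names are illustrative, and the relation to n5:idVysledku is an observation about this record only, not a documented rule of the dataset.

```python
from urllib.parse import quote, unquote

# Values copied from the statements above.
local_name = "RIV%2F00216224%3A14330%2F14%3A00076251%21RIV15-MSM-14330___"
notation = "RIV/00216224:14330/14:00076251!RIV15-MSM-14330___"
id_vysledku = "RIV/00216224:14330/14:00076251"

# Decoding the local name recovers the notation (%2F is '/', %3A is ':', %21 is '!').
decoded = unquote(local_name)
print(decoded == notation)

# Encoding with no extra safe characters reproduces the local name.
print(quote(notation, safe="") == local_name)

# In this record, the part of the notation before '!' equals n5:idVysledku.
print(notation.split("!")[0] == id_vysledku)

# Keyword IRIs decode the same way.
print(unquote("Machine%20Translation"))
```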