About: A Dataset Comparison for an Indonesian-English Statistical Machine Translation System

An Entity of Type: http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space: linked.opendata.cz, associated with source document(s)

Attributes    Values
rdf:type
rdfs:seeAlso
Description
  • In this paper, we study the effect of incorporating morphological information into an Indonesian (id) to English (en) Statistical Machine Translation (SMT) system as part of a preprocessing module. The linguistic phenomenon addressed is Indonesian cliticized words. The approach transforms the text by separating the correct clitics from each cliticized word in order to simplify word alignment. We also study the effect of applying this preprocessing to different SMT systems trained on different kinds of text, such as spoken language text. The system is built using the state-of-the-art SMT tool MOSES, and the Indonesian morphological information is provided by MorphInd. Overall, the preprocessing improves translation quality, especially for Indonesian spoken language text, where it yields a gain of 1.78 BLEU points. (en)
Title
  • A Dataset Comparison for an Indonesian-English Statistical Machine Translation System
skos:prefLabel
  • A Dataset Comparison for an Indonesian-English Statistical Machine Translation System
skos:notation
  • RIV/00216208:11320/12:10130074!RIV13-MSM-11320___
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(LC536), P(LM2010013)
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
  • Larasati, Septina Dian
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 120152
http://linked.open...ai/riv/idVysledku
  • RIV/00216208:11320/12:10130074
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • system; translation; machine; statistical; english; indonesian; comparison; dataset (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...ontrolniKodProRIV
  • [5BDBC1C5BDA8]
http://linked.open...v/mistoKonaniAkce
  • Bali, Indonesia
http://linked.open...i/riv/mistoVydani
  • Bali, Indonesia
http://linked.open...i/riv/nazevZdroje
  • Proceedings of the 26th Pacific Asia Conference on Language, Information and Computation
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...iv/tvurceVysledku
  • Larasati, Septina Dian
http://linked.open...vavai/riv/typAkce
http://linked.open.../riv/zahajeniAkce
number of pages
http://purl.org/ne...btex#hasPublisher
  • Faculty of Computer Science, Universitas Indonesia
https://schema.org/isbn
  • 978-979-1421-17-1
http://localhost/t...ganizacniJednotka
  • 11320