About: Large Corpora for Turkic Languages and Unsupervised Morphological Analysis

An Entity of Type: http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space: linked.opendata.cz, associated with source document(s)

Attributes   Values
rdf:type
Description
  • In this article we describe six new web corpora for the Turkish, Azerbaijani, Kazakh, Turkmen, Kyrgyz and Uzbek languages. The data for these corpora were crawled automatically from the web by SpiderLing. Only minimal knowledge of these languages was required to obtain the data in raw form. The corpora are only tokenized, since morphological analyzers and disambiguators are not available for these languages (except for Turkish). A subsequent experiment with unsupervised morphological segmentation was carried out on the Turkish corpus, and it yielded encouraging results. For evaluation, we used the data provided for the MorphoChallenge competition. (en)
Title
  • Large Corpora for Turkic Languages and Unsupervised Morphological Analysis (en)
skos:prefLabel
  • Large Corpora for Turkic Languages and Unsupervised Morphological Analysis (en)
skos:notation
  • RIV/00216224:14330/12:00059944!RIV13-MSM-14330___
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(LM2010013), S
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 146446
http://linked.open...ai/riv/idVysledku
  • RIV/00216224:14330/12:00059944
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • corpus; turkic languages; unsupervised morphological analysis (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...ontrolniKodProRIV
  • [DEB8C8F697F5]
http://linked.open...v/mistoKonaniAkce
  • Istanbul
http://linked.open...i/riv/mistoVydani
  • Istanbul, Turkey
http://linked.open...i/riv/nazevZdroje
  • Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...iv/tvurceVysledku
  • Baisa, Vít
  • Suchomel, Vít
http://linked.open...vavai/riv/typAkce
http://linked.open.../riv/zahajeniAkce
number of pages
http://purl.org/ne...btex#hasPublisher
  • European Language Resources Association (ELRA)
https://schema.org/isbn
  • 9782951740877
http://localhost/t...ganizacniJednotka
  • 14330