About: Investigation of Latent Semantic Analysis for Clustering of Czech News Articles

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes   Values
rdf:type
Description
  • This paper studies the use of Latent Semantic Analysis (LSA) for automatic clustering of Czech news articles. We show that LSA is capable of yielding good results in this task, as it allows us to reduce the problem of synonymy. This is a particularly important factor for Czech, which belongs to a group of highly inflective and morphologically rich languages. The experimental evaluation of our clustering scheme and the investigation of LSA are performed on query- and category-based test sets. The obtained results demonstrate that the automatic system yields Rand index values that are 20% lower, in absolute terms, than the accuracy of human cluster annotations. We also show which similarity metric should be used for cluster merging and how dimension reduction affects clustering accuracy. (en)
Title
  • Investigation of Latent Semantic Analysis for Clustering of Czech News Articles (en)
skos:prefLabel
  • Investigation of Latent Semantic Analysis for Clustering of Czech News Articles (en)
skos:notation
  • RIV/46747885:24220/14:#0002973!RIV15-TA0-24220___
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(TA01011204)
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 22737
http://linked.open...ai/riv/idVysledku
  • RIV/46747885:24220/14:#0002973
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • latent semantic analysis; speech processing (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...ontrolniKodProRIV
  • [B6C0765EE797]
http://linked.open...v/mistoKonaniAkce
  • Munich, Germany
http://linked.open...i/riv/mistoVydani
  • Germany
http://linked.open...i/riv/nazevZdroje
  • Proc. of the 25th International Workshop on Database and Expert Systems Applications (DEXA), 2014
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...iv/tvurceVysledku
  • Červa, Petr
  • Rott, Michal
http://linked.open...vavai/riv/typAkce
http://linked.open.../riv/zahajeniAkce
number of pages
http://bibframe.org/vocab/doi
  • 10.1109/DEXA.2014.54
http://purl.org/ne...btex#hasPublisher
  • IEEE
https://schema.org/isbn
  • 978-1-4799-5721-7
http://localhost/t...ganizacniJednotka
  • 24220