About: Parallel Processing of Very Many Textual Customers' Reviews Freely Written Down in Natural Languages


An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
Description	Text mining of hundreds of thousands or millions of documents written in a natural language is limited by computational complexity (time and memory) and by computer performance. Many applications can use only standard personal computers. In this case, the whole data set has to be divided into smaller subsets that can be processed in parallel. This article deals with the problem of how to divide the original data set, which represents a typical collection containing two million customer reviews written in English. The main goal is to mine information whose quality is comparable to that of information obtained from the whole set, even though the mining is carried out on subsets of the original large data set. The article suggests a method of dividing the set into subsets, including a way of evaluating the mining results by comparing the unified outputs of the individual subsets with the original set. The suggested method is illustrated with a task that searches for significant words expressing customers' opinions on hotel services. It is shown that there is always a certain lower bound below which the subset sizes cannot fall, and how to find this bound experimentally. (en)
Title	Parallel Processing of Very Many Textual Customers' Reviews Freely Written Down in Natural Languages (en)
skos:prefLabel	Parallel Processing of Very Many Textual Customers' Reviews Freely Written Down in Natural Languages (en)
skos:notation	RIV/62156489:43110/12:00191656!RIV13-MSM-43110___
http://linked.open...avai/riv/aktivita	Z
http://linked.open...avai/riv/aktivity	Z(MSM6215648904)
http://linked.open...vai/riv/dodaniDat	2013
http://linked.open...aciTvurceVysledku	Dařena, František; Žižka, Jan
http://linked.open.../riv/druhVysledku	D - Article in proceedings
http://linked.open...iv/duvernostUdaju	S - Complete and true data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Mendelova univerzita v Brně / Provozně ekonomická fakulta
http://linked.open...dnocenehoVysledku	157762
http://linked.open...ai/riv/idVysledku	RIV/62156489:43110/12:00191656
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	natural language; decision tree; computational complexity; parallel processing; text mining; data subset size (en)
http://linked.open.../riv/klicoveSlovo	natural language; computational complexity; text mining; parallel processing; decision tree; data subset size
http://linked.open...ontrolniKodProRIV	[7393F7721EE3]
http://linked.open...v/mistoKonaniAkce	Venice, Italy
http://linked.open...i/riv/nazevZdroje	IMMM 2012: The Second International Conference on Advances in Information Mining and Management
http://linked.open...in/vavai/riv/obor	IN
http://linked.open...ichTvurcuVysledku	2 (xsd:int)
http://linked.open...cetTvurcuVysledku	2 (xsd:int)
http://linked.open...UplatneniVysledku	2012
http://linked.open...iv/tvurceVysledku	Žižka, Jan; Dařena, František
http://linked.open...vavai/riv/typAkce	WRD - Worldwide
http://linked.open.../riv/zahajeniAkce	2012-01-01 (xsd:date)
http://linked.open...n/vavai/riv/zamer	The Czech Economy in the Process of Integration and Globalisation, and the Development of Agricultural Sector and the Sector of Services under the New Conditions of the Integrated European Market
number of pages	7 (xsd:int)
http://purl.org/ne...btex#hasPublisher	Not specified
https://schema.org/isbn	978-1-61208-227-1
http://localhost/t...ganizacniJednotka	43110
