About: Filtering Very Similar Text Documents: A Case Study

An entity of type http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within the data space linked.opendata.cz

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
Description	This paper describes problems with the classification and filtration of similar relevant and irrelevant real medical documents from one very specific domain, obtained from Internet resources. Besides being similar, the documents are often unbalanced: there is a lack of irrelevant documents for training. A definition of similarity is suggested. Six classification algorithms are tested from the document similarity point of view. The best results are provided by a back-propagation-based neural network and by a support vector machine with radial basis functions. (en)
Title	Filtering Very Similar Text Documents: A Case Study (en)
skos:prefLabel	Filtering Very Similar Text Documents: A Case Study (en)
skos:notation	RIV/00216224:14330/04:00009948!RIV08-MSM-14330___
http://linked.open.../vavai/riv/strany	511-520
http://linked.open...avai/riv/aktivita	Z
http://linked.open...avai/riv/aktivity	Z(MSM 143300003)
http://linked.open...vai/riv/dodaniDat	2008
http://linked.open...aciTvurceVysledku	Hroza, Jiří; Bourek, Aleš; Žižka, Jan
http://linked.open.../riv/druhVysledku	D - Article in proceedings
http://linked.open...iv/duvernostUdaju	S - Complete and accurate data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Masarykova univerzita / Fakulta informatiky
http://linked.open...dnocenehoVysledku	564379
http://linked.open...ai/riv/idVysledku	RIV/00216224:14330/04:00009948
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	machine learning; text categorization; text filtration; text similarity (en)
http://linked.open.../riv/klicoveSlovo	text categorization; text similarity; text filtration; machine learning
http://linked.open...ontrolniKodProRIV	[45505C6B08EC]
http://linked.open...v/mistoKonaniAkce	Seoul, Korea
http://linked.open...i/riv/mistoVydani	Germany
http://linked.open...i/riv/nazevZdroje	Computational linguistics and Intelligent Text Processing
http://linked.open...in/vavai/riv/obor	IN
http://linked.open...ichTvurcuVysledku	3 (xsd:int)
http://linked.open...cetTvurcuVysledku	3 (xsd:int)
http://linked.open...UplatneniVysledku	2004
http://linked.open...iv/tvurceVysledku	Žižka, Jan; Bourek, Aleš; Hroza, Jiří
http://linked.open...vavai/riv/typAkce	WRD - Worldwide
http://linked.open.../riv/zahajeniAkce	2004-02-15 (xsd:date)
http://linked.open...n/vavai/riv/zamer	http://linked.opendata.cz/resource/domain/vavai/zamer/MSM%20143300003
number of pages	10 (xsd:int)
http://purl.org/ne...btex#hasPublisher	Springer-Verlag. (Berlin; Heidelberg)
https://schema.org/isbn	3-540-21006-7
http://localhost/t...ganizacniJednotka	14330
