This HTML5 document contains 46 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix   IRI
n22      http://linked.opendata.cz/ontology/domain/vavai/riv/typAkce/
dcterms  http://purl.org/dc/terms/
n16      http://purl.org/net/nknouf/ns/bibtex#
n12      http://localhost/temp/predkladatel/
n15      http://linked.opendata.cz/resource/domain/vavai/projekt/
n11      http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n20      http://linked.opendata.cz/resource/domain/vavai/subjekt/
n10      http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216208%3A11320%2F13%3A10192339%21RIV14-TA0-11320___/
n8       http://linked.opendata.cz/ontology/domain/vavai/
n19      https://schema.org/
s        http://schema.org/
skos     http://www.w3.org/2004/02/skos/core#
n4       http://linked.opendata.cz/ontology/domain/vavai/riv/
n2       http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n18      http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n21      http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh     http://www.w3.org/2001/XMLSchema#
n9       http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n5       http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n13      http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n6       http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n17      http://reference.data.gov.uk/id/gregorian-year/
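
The statements in this document use compact names of the form prefix:local, which stand for the prefix IRI listed above followed by the local part. A minimal Python sketch of that expansion follows. The dictionary copies only a few of the listed prefixes, and the names PREFIXES and expand are illustrative assumptions, not part of the source data or of any library.

```python
# A hand-copied subset of the prefix listing; extend as needed.
PREFIXES = {
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n4": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "n17": "http://reference.data.gov.uk/id/gregorian-year/",
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(curie: str) -> str:
    """Expand a compact name such as 'n17:2014' into a full IRI.

    Hypothetical helper: splits at the first colon and concatenates
    the prefix IRI with the local part, without further validation.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    print(expand("n17:2014"))
    print(expand("dcterms:title"))
```

With this mapping, n17:2014 expands to an IRI under http://reference.data.gov.uk/id/gregorian-year/, and an unlisted prefix raises KeyError rather than being passed through silently.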

Statements

Subject Item
n2:RIV%2F00216208%3A11320%2F13%3A10192339%21RIV14-TA0-11320___
rdf:type
skos:Concept n8:Vysledek
dcterms:description
In this paper we introduce Strigil, a framework for automated data extraction. It represents an easily configurable tool that enables one to retrieve a data from textual or weak structured documents. The paper contains description of the framework architecture and its important components. Additionally, we propose a scraping language inspired by the XSL transformations designed to extract data from different kinds of documents. Although there are many different approaches focused on various aspects of data scraping, they are usually very specialized to a concrete domain or a data source. We compare these solutions and discuss their advantages and disadvantages. Our scraping language is designed to work with an ontology to map scraped data directly to classes and attributes.
dcterms:title
Strigil: A Framework for Data Extraction in Semi-Structured Web Documents
skos:prefLabel
Strigil: A Framework for Data Extraction in Semi-Structured Web Documents
skos:notation
RIV/00216208:11320/13:10192339!RIV14-TA0-11320___
n8:predkladatel
n20:orjk%3A11320
n4:aktivita
n5:P
n4:aktivity
P(TA02010182)
n4:dodaniDat
n17:2014
n4:domaciTvurceVysledku
n11:1718452 n11:5412595 n11:7205090
n4:druhVysledku
n13:D
n4:duvernostUdaju
n21:S
n4:entitaPredkladatele
n10:predkladatel
n4:idSjednocenehoVysledku
108265
n4:idVysledku
RIV/00216208:11320/13:10192339
n4:jazykVysledku
n9:eng
n4:klicovaSlova
Web; Semi-Structured Data; Data Extraction; Framework; Strigil
n4:klicoveSlovo
n18:Strigil n18:Web n18:Semi-Structured%20Data n18:Data%20Extraction n18:Framework
n4:kontrolniKodProRIV
[4484F86780FE]
n4:mistoKonaniAkce
Vienna, Austria
n4:mistoVydani
ACM Press
n4:nazevZdroje
Proceedings of the 15th International Conference on Information Integration and Web-based Applications & Services
n4:obor
n6:IN
n4:pocetDomacichTvurcuVysledku
3
n4:pocetTvurcuVysledku
3
n4:projekt
n15:TA02010182
n4:rokUplatneniVysledku
n17:2013
n4:tvurceVysledku
Stárka, Jakub; Holubová, Irena; Nečaský, Martin
n4:typAkce
n22:WRD
n4:zahajeniAkce
2013-12-02+01:00
s:numberOfPages
10
n16:hasPublisher
ACM Press
n19:isbn
978-1-4503-2113-6
n12:organizacniJednotka
11320
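
The local part of the subject IRI is percent-encoded, and decoding it yields the skos:notation value of this record. The Python sketch below shows that decoding with the standard library. The helper names are illustrative assumptions, and the observation that n4:idVysledku equals the notation up to the "!" character is read off this one record, not taken from a documented rule of the dataset.

```python
from urllib.parse import unquote

# Local part of the subject IRI, copied from the Subject Item above.
SUBJECT_LOCAL_NAME = "RIV%2F00216208%3A11320%2F13%3A10192339%21RIV14-TA0-11320___"


def decode_local_name(local_name: str) -> str:
    """Percent-decode an IRI local part (e.g. %2F -> '/', %3A -> ':')."""
    return unquote(local_name)


def result_id(notation: str) -> str:
    """Return the part of the notation before the first '!'.

    Assumption: in this record that part matches n4:idVysledku.
    """
    return notation.split("!", 1)[0]


notation = decode_local_name(SUBJECT_LOCAL_NAME)

if __name__ == "__main__":
    print(notation)
    print(result_id(notation))
    # Keyword IRIs are encoded the same way.
    print(decode_local_name("Semi-Structured%20Data"))
```

The same decoding applies to the n18 keyword IRIs, where %20 stands for a space.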