"In this paper we introduce Strigil, a framework for automated data extraction. It represents an easily configurable tool that enables one to retrieve data from textual or weakly structured documents. The paper contains a description of the framework architecture and its important components. Additionally, we propose a scraping language inspired by the XSL transformations designed to extract data from different kinds of documents. Although there are many different approaches focused on various aspects of data scraping, they are usually very specialized to a concrete domain or a data source. We compare these solutions and discuss their advantages and disadvantages. Our scraping language is designed to work with an ontology to map scraped data directly to classes and attributes."@en . . "In this paper we introduce Strigil, a framework for automated data extraction. It represents an easily configurable tool that enables one to retrieve data from textual or weakly structured documents. The paper contains a description of the framework architecture and its important components. Additionally, we propose a scraping language inspired by the XSL transformations designed to extract data from different kinds of documents. Although there are many different approaches focused on various aspects of data scraping, they are usually very specialized to a concrete domain or a data source. We compare these solutions and discuss their advantages and disadvantages. Our scraping language is designed to work with an ontology to map scraped data directly to classes and attributes." . . . "St\u00E1rka, Jakub" . . "Holubov\u00E1, Irena" . . . "Proceedings of the 15th International Conference on Information Integration and Web-based Applications & Services" . . "Strigil: A Framework for Data Extraction in Semi-Structured Web Documents" . "11320" . "3"^^ . . "P(TA02010182)" . . "Strigil: A Framework for Data Extraction in Semi-Structured Web Documents" . "10"^^ . . . 
"Web; Semi-Structured Data; Data Extraction; Framework; Strigil"@en . "Strigil: A Framework for Data Extraction in Semi-Structured Web Documents"@en . "Ne\u010Dask\u00FD, Martin" . "3"^^ . "ACM Press" . . . . "978-1-4503-2113-6" . "108265" . . . . . "Strigil: A Framework for Data Extraction in Semi-Structured Web Documents"@en . . "RIV/00216208:11320/13:10192339" . "[4484F86780FE]" . "RIV/00216208:11320/13:10192339!RIV14-TA0-11320___" . . . "2013-12-02+01:00"^^ . "ACM Press" . "Vienna, Austria" .