. "V\u00FDsledek byl implementov\u00E1n v programovac\u00EDm jazyce Python nad datab\u00E1zov\u00FDm syst\u00E9mem MySQL. Pro z\u00EDsk\u00E1n\u00ED licence kontaktujte: Jan \u0160vec, Katedra kybernetiky, Z\u010CU v Plzni, tel. 2557, v\u00EDce na http://www.kky.zcu.cz/cs/sw/sk-import" . "Softwarov\u00FD modul pro import dat ze slovensk\u00FDch webov\u00FDch port\u00E1l\u016F"@cs . . "language modelling; text cleaning"@en . "http://www.kky.zcu.cz/cs/sw/sk-import" . "RIV/49777513:23520/12:43918029!RIV13-MSM-23520___" . "2"^^ . "RIV/49777513:23520/12:43918029" . . "V\u00FDsledek vznikl na z\u00E1klad\u011B Smlouvy o d\u00EDlo uzav\u0159en\u00E9 mezi SpeechTech, s.r.o. a Z\u010CU v Plzni dne 15.10.2012, reg. \u010D. SML/5200/0055/12. Cena d\u00EDle je 100.000 K\u010D. V\u00FDsledek umo\u017E\u0148uje automatick\u00E9 zpracov\u00E1n\u00ED vstupn\u00EDch jazykov\u00FDch dat za \u00FA\u010Delem tvorby jazykov\u00E9ho modelu pro sloven\u0161tinu. V\u00EDce na http://www.kky.zcu.cz/cs/sw/sk-import" . . . . . "2"^^ . . . . "[9BCAA2A332FE]" . . "The software module implements data import algorithms tailored to Slovak web portals. It also converts the source HTML page and cleans it into clean text in a given encoding. The text-cleaning algorithms are adapted using training data. The trained classifier assigns each fragment of an HTML page to one of two classes - clean text of the page or \"other\". A subsequent post-processing step keeps only the clean text. The module also includes a tool for automated downloading of RSS channels. This tool simplifies automated processing of new data."@en . "Softwarov\u00FD modul pro import dat ze slovensk\u00FDch webov\u00FDch port\u00E1l\u016F" . "N" . "Software module for importing data from Slovak web portals"@en . . "23520" . . 
"Softwarov\u00FD modul realizuje import dat ze slovensk\u00FDch webov\u00FDch port\u00E1l\u016F, jejich p\u0159evod a filtraci z form\u00E1tu HTML do \u010Dist\u00E9ho textu v zadan\u00E9m k\u00F3dov\u00E1n\u00ED. Algoritmy pro filtraci do \u010Dist\u00E9ho textu jsou adaptov\u00E1ny na z\u00E1klad\u011B tr\u00E9novac\u00EDch dat. Natr\u00E9novan\u00FD klasifik\u00E1tor ka\u017Ed\u00FD fragment HTML str\u00E1nky za\u0159ad\u00ED do jedn\u00E9 ze dvou t\u0159\u00EDd - \u010Dist\u00FD text \u010Dl\u00E1nku nebo ostatn\u00ED. N\u00E1sledn\u011B je ponech\u00E1n pouze \u010Dist\u00FD text. Sou\u010D\u00E1st\u00ED softwarov\u00E9ho modulu jsou i n\u00E1stroj pro automatick\u00E9 sledov\u00E1n\u00ED RSS kan\u00E1l\u016F. Tento n\u00E1stroj usnad\u0148uje automatizovan\u00E9 zpracov\u00E1n\u00ED nov\u00FDch dat." . "\u0160vec, Jan" . "Software module for importing data from Slovak web portals"@en . . . "169054" . . . "Softwarov\u00FD modul realizuje import dat ze slovensk\u00FDch webov\u00FDch port\u00E1l\u016F, jejich p\u0159evod a filtraci z form\u00E1tu HTML do \u010Dist\u00E9ho textu v zadan\u00E9m k\u00F3dov\u00E1n\u00ED. Algoritmy pro filtraci do \u010Dist\u00E9ho textu jsou adaptov\u00E1ny na z\u00E1klad\u011B tr\u00E9novac\u00EDch dat. Natr\u00E9novan\u00FD klasifik\u00E1tor ka\u017Ed\u00FD fragment HTML str\u00E1nky za\u0159ad\u00ED do jedn\u00E9 ze dvou t\u0159\u00EDd - \u010Dist\u00FD text \u010Dl\u00E1nku nebo ostatn\u00ED. N\u00E1sledn\u011B je ponech\u00E1n pouze \u010Dist\u00FD text. Sou\u010D\u00E1st\u00ED softwarov\u00E9ho modulu jsou i n\u00E1stroj pro automatick\u00E9 sledov\u00E1n\u00ED RSS kan\u00E1l\u016F. Tento n\u00E1stroj usnad\u0148uje automatizovan\u00E9 zpracov\u00E1n\u00ED nov\u00FDch dat."@cs . "Softwarov\u00FD modul pro import dat ze slovensk\u00FDch webov\u00FDch port\u00E1l\u016F"@cs . "SK-Import-2012" . "Softwarov\u00FD modul pro import dat ze slovensk\u00FDch webov\u00FDch port\u00E1l\u016F" . "Vavru\u0161ka, Jan" . .