This HTML5 document contains 38 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix    IRI
dcterms   http://purl.org/dc/terms/
n9        http://localhost/temp/predkladatel/
n19       http://linked.opendata.cz/resource/domain/vavai/riv/tvurce/
n16       http://linked.opendata.cz/resource/domain/vavai/subjekt/
n8        http://linked.opendata.cz/ontology/domain/vavai/
skos      http://www.w3.org/2004/02/skos/core#
rdfs      http://www.w3.org/2000/01/rdf-schema#
n3        http://linked.opendata.cz/ontology/domain/vavai/riv/
n12       http://linked.opendata.cz/ontology/domain/vavai/riv/licencniPoplatek/
n10       http://linked.opendata.cz/resource/domain/vavai/vysledek/RIV%2F00216275%3A25410%2F13%3A39896075%21RIV14-MSM-25410___/
n2        http://linked.opendata.cz/resource/domain/vavai/vysledek/
rdf       http://www.w3.org/1999/02/22-rdf-syntax-ns#
n15       http://linked.opendata.cz/ontology/domain/vavai/riv/vyuzitiJinymSubjektem/
n6        http://linked.opendata.cz/ontology/domain/vavai/riv/klicoveSlovo/
n18       http://linked.opendata.cz/ontology/domain/vavai/riv/duvernostUdaju/
xsdh      http://www.w3.org/2001/XMLSchema#
n20       http://linked.opendata.cz/ontology/domain/vavai/riv/jazykVysledku/
n11       http://linked.opendata.cz/ontology/domain/vavai/riv/aktivita/
n17       http://linked.opendata.cz/ontology/domain/vavai/riv/druhVysledku/
n7        http://linked.opendata.cz/ontology/domain/vavai/riv/obor/
n13       http://reference.data.gov.uk/id/gregorian-year/
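
The prefixed names used in the statements of this document are shorthand for full IRIs: the prefix is replaced by its IRI from the table above and the local part is appended. A minimal sketch of that expansion follows, assuming plain string concatenation; the `PREFIXES` dictionary holds only a subset of the table, and the `expand` helper is a hypothetical name introduced for illustration.

```python
# Subset of the prefix table above; IRIs copied verbatim from the listing.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "n2": "http://linked.opendata.cz/resource/domain/vavai/vysledek/",
    "n3": "http://linked.opendata.cz/ontology/domain/vavai/riv/",
    "n7": "http://linked.opendata.cz/ontology/domain/vavai/riv/obor/",
}


def expand(curie: str) -> str:
    """Expand 'prefix:local' into a full IRI (hypothetical helper)."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    # The local part is appended unchanged, including any percent-encoding.
    return PREFIXES[prefix] + local


print(expand("n3:obor"))
print(expand("n7:IN"))
```

With this reading, the predicate `n3:obor` and its value `n7:IN` both resolve to IRIs under the `linked.opendata.cz` ontology namespace.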

Statements

Subject Item
n2:RIV%2F00216275%3A25410%2F13%3A39896075%21RIV14-MSM-25410___
rdf:type
skos:Concept n8:Vysledek
rdfs:seeAlso
http://projekty-usii.upce.cz/soubory/vytvoreny_software/hovad/2013/realtimeWebmining.zip
dcterms:description
The source code is the backend of a web application. It is written in Python 2.7; a GUI is not necessary because the script is run automatically at a specified time interval by a phone or PC. The front end can be built separately (JS, PHP, MySQL). For example: http://space-walk.info/phd/pages/cz/realmining.php.

The main goal is to analyze big data volumes from websites in real time and visualise them through the selected API. In this case, the Plot.ly and Google services are used along with MySQL, PHP, and JavaScript to handle processing and visualisation.

The code includes basic classes to handle HTML structure:

MLStripper(HTMLParser):
- clears the HTML structure (tags, JS, etc.)

ParseIt:
- analyses the target websites
- uses the Counter collection and the BeautifulSoup library for easier transformation of HTML into classes, which allows elegant attribute handling
- saves data into associative arrays (dictionaries), sometimes in a multidimensional structure
- filters words using the bad-word dictionaries

Badwords:
- manipulates the bad-word dictionaries; usage is optional, and stopwords.txt is usually good enough

PublishResults:
- uses the Plot.ly service as an API to visualize graphs
- requires app_cfg.py to be set up for access to MySQL and the API account

SpecificAnalyzes:
- searches for the contexts of top words, based on a parametric value (distance)

Crimes:
- searches through the set of words associated with a specific crime
- on a positive occurrence, scans the JSON dictionary of towns and returns the matching town (only for towns with more than 5,000 inhabitants)
- JSON is used because of the complicated structure of the Czech language
dcterms:title
Real-time big data webmining and data processing
skos:prefLabel
Real-time big data webmining and data processing
skos:notation
RIV/00216275:25410/13:39896075!RIV14-MSM-25410___
n8:predkladatel
n16:orjk%3A25410
n3:aktivita
n11:S
n3:aktivity
S
n3:dodaniDat
n13:2014
n3:domaciTvurceVysledku
n19:4013778
n3:druhVysledku
n17:R
n3:duvernostUdaju
n18:S
n3:ekonomickeParametry
Fast retrieval of the needed information from a large volume of data; multidisciplinary
n3:entitaPredkladatele
n10:predkladatel
n3:idSjednocenehoVysledku
101630
n3:idVysledku
RIV/00216275:25410/13:39896075
n3:interniIdentifikace
0.8
n3:jazykVysledku
n20:eng
n3:klicovaSlova
python, webmining, big data
n3:klicoveSlovo
n6:big%20data n6:webmining n6:python
n3:kontrolniKodProRIV
[AE92380473DE]
n3:licencniPoplatek
n12:Z
n3:obor
n7:IN
n3:pocetDomacichTvurcuVysledku
1
n3:pocetTvurcuVysledku
1
n3:rokUplatneniVysledku
n13:2013
n3:technickeParametry
python 2.7, beautiful soup
n3:tvurceVysledku
Hovad, Jan
n3:vlastnik
n10:vlastnikVysledku
n3:vyuzitiJinymSubjektem
n15:P
n9:organizacniJednotka
25410
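
The dcterms:description value above outlines a pipeline that strips HTML markup, counts words with a Counter, and filters them against a stop-word list. The sketch below illustrates that idea only and is not the author's code. It targets Python 3 and the standard library alone, whereas the original is described as Python 2.7 with BeautifulSoup. The class name `MLStripper` comes from the description, but its body, the `top_words` helper, and the sample HTML are assumptions made for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class MLStripper(HTMLParser):
    """Collects text nodes and skips <script>/<style> content (assumed behavior)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not inside a script or style element.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        return " ".join(self._chunks)


def top_words(html, stopwords, n=3):
    """Return the n most frequent non-stop-words in the page (hypothetical helper)."""
    stripper = MLStripper()
    stripper.feed(html)
    stripper.close()
    words = re.findall(r"\w+", stripper.text().lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(n)


sample = (
    "<html><script>var x = 1;</script>"
    "<p>Big data and big plans</p><p>Data wins</p></html>"
)
print(top_words(sample, stopwords={"and"}))
```

In a setup like the one described, the stop-word set would presumably be loaded from a file such as stopwords.txt rather than passed inline.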