About: Diverse Queries and Feature Type Selection for Plagiarism Discovery

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes   Values
rdf:type
Description
  • This paper describes our approaches to the Plagiarism Detection task of PAN 2013. We present a modified three-way search methodology for the source retrieval subtask. We introduce a new query type: paragraph-based queries, whose purpose is to check selected parts of the suspicious text in more depth. The other two query types are keyword-based queries, which retrieve documents concerning the same theme, and intrinsic-plagiarism-based queries, which retrieve sources containing text detected as differing in writing style from other parts of the suspicious document. Query execution was controlled by the query type and by preliminary similarities discovered during the searches. We discuss a 2-tuple snippet similarity measurement for deciding whether to download a search result; it indicates how many neighbouring word pairs co-occur in the snippet and in the suspicious document. Our tests indicate the advantages of setting a snippet similarity threshold. (en)
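The 2-tuple snippet similarity mentioned in the description can be illustrated with a minimal sketch. The record does not give the tokenization, any normalisation, or the threshold value the authors used, so the regular-expression tokenizer, the raw pair count, the default threshold, and the function names below are all assumptions made for illustration only.

```python
import re


def word_pairs(text):
    """Return the set of neighbouring word pairs (2-tuples) in the text.

    Assumption: words are lowercase alphanumeric runs; the record does
    not say how the original work tokenizes text.
    """
    words = re.findall(r"\w+", text.lower())
    return set(zip(words, words[1:]))


def snippet_similarity(snippet, suspicious_document):
    """Count neighbouring word pairs present in both texts."""
    return len(word_pairs(snippet) & word_pairs(suspicious_document))


def should_download(snippet, suspicious_document, threshold=2):
    """Decide whether a search result is worth downloading.

    The threshold value is hypothetical; the description only reports
    that setting such a threshold was found advantageous.
    """
    return snippet_similarity(snippet, suspicious_document) >= threshold


if __name__ == "__main__":
    document = "The quick brown fox jumps over the lazy dog"
    snippet = "a quick brown fox ran over the hill"
    print(snippet_similarity(snippet, document))
    print(should_download(snippet, document))
```

Counting shared pairs rather than shared single words makes the measure sensitive to word order, which is presumably why neighbouring pairs were chosen; a result whose snippet shares too few pairs with the suspicious document would be skipped instead of downloaded.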
Title
  • Diverse Queries and Feature Type Selection for Plagiarism Discovery (en)
skos:prefLabel
  • Diverse Queries and Feature Type Selection for Plagiarism Discovery (en)
skos:notation
  • RIV/00216224:14330/13:00070216!RIV14-MSM-14330___
http://linked.open...avai/predkladatel
http://linked.open...avai/riv/aktivita
http://linked.open...avai/riv/aktivity
  • P(LG13010)
http://linked.open...iv/cisloPeriodika
  • September
http://linked.open...vai/riv/dodaniDat
http://linked.open...aciTvurceVysledku
http://linked.open.../riv/druhVysledku
http://linked.open...iv/duvernostUdaju
http://linked.open...titaPredkladatele
http://linked.open...dnocenehoVysledku
  • 70038
http://linked.open...ai/riv/idVysledku
  • RIV/00216224:14330/13:00070216
http://linked.open...riv/jazykVysledku
http://linked.open.../riv/klicovaSlova
  • suspicious document; plagiarism detection; search engine; source retrieval; stop word; text alignment; contextual n gram; word n gram; representative sentence; overlapping detection; snippet similarity; global postprocessing (en)
http://linked.open.../riv/klicoveSlovo
http://linked.open...odStatuVydavatele
  • ES - Kingdom of Spain
http://linked.open...ontrolniKodProRIV
  • [A21688D84A0F]
http://linked.open...i/riv/nazevZdroje
  • CLEF 2013 Evaluation Labs and Workshop
http://linked.open...in/vavai/riv/obor
http://linked.open...ichTvurcuVysledku
http://linked.open...cetTvurcuVysledku
http://linked.open...vavai/riv/projekt
http://linked.open...UplatneniVysledku
http://linked.open...v/svazekPeriodika
  • 2013
http://linked.open...iv/tvurceVysledku
  • Brandejs, Michal
  • Kasprzak, Jan
  • Suchomel, Šimon
issn
  • 2038-4963
number of pages
http://localhost/t...ganizacniJednotka
  • 14330