About: From the corpus as an open source for investigation to commercial products

An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
Description	The article sketches the development of corpora, from large collections of untagged texts through tagged corpora to tools that operate on tagged corpora and produce data presented as data about language, such as Word Sketches (TM). It points out that every corpus is merely a representation of texts and that the quality of that representation must be examined. An unavoidable question in research is how the corpus was built and how, on what principles, the service software operates. Where we explore a corpus with distortions, in which texts appear in a form nobody ever wrote (digits and their surroundings are often phenomena of this kind), or where we are not allowed to look "under the bonnet" or to change working parameters, we can hardly speak of doing scholarly research. (en)
Title	From the corpus as an open source for investigation to commercial products (en) Od korpusu jako otevřeného zdroje pro bádání ke komerčním produktům (cs)
skos:prefLabel	From the corpus as an open source for investigation to commercial products (en) Od korpusu jako otevřeného zdroje pro bádání ke komerčním produktům (cs)
skos:notation	RIV/68378092:_____/07:00097207!RIV08-AV0-68378092
http://linked.open.../vavai/riv/strany	243-249
http://linked.open...avai/riv/aktivita	P Z
http://linked.open...avai/riv/aktivity	P(GA405/03/0377), Z(AV0Z90610518)
http://linked.open...vai/riv/dodaniDat	2008
http://linked.open...aciTvurceVysledku	Šimandl, Josef
http://linked.open.../riv/druhVysledku	D - Article in proceedings
http://linked.open...iv/duvernostUdaju	S - Complete and true data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Ústav pro jazyk český AV ČR, v. v. i.
http://linked.open...dnocenehoVysledku	422676
http://linked.open...ai/riv/idVysledku	RIV/68378092:_____/07:00097207
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	corpus linguistics; linguistic corpus; tagging (en)
http://linked.open.../riv/klicoveSlovo	corpus linguistics; tagging; linguistic corpus
http://linked.open...ontrolniKodProRIV	[54C7771944D2]
http://linked.open...v/mistoKonaniAkce	Praha
http://linked.open...i/riv/mistoVydani	Praha
http://linked.open...i/riv/nazevZdroje	Gramatika a korpus 2005
http://linked.open...in/vavai/riv/obor	AI
http://linked.open...ichTvurcuVysledku	1 (xsd:int)
http://linked.open...cetTvurcuVysledku	1 (xsd:int)
http://linked.open...vavai/riv/projekt	Exploring the Core and Limits of Czech Grammar as seen through the Czech National Corpus
http://linked.open...UplatneniVysledku	2007
http://linked.open...iv/tvurceVysledku	Šimandl, Josef
http://linked.open...vavai/riv/typAkce	WRD - Worldwide
http://linked.open.../riv/zahajeniAkce	2005-11-23 (xsd:date)
http://linked.open...n/vavai/riv/zamer	Integrated research of the Czech language and its varieties
number of pages	7 (xsd:int)
http://purl.org/ne...btex#hasPublisher	Ústav pro jazyk český AV ČR
https://schema.org/isbn	80-86496-32-5
