About: From the Corpus as Open Source for Investigation to Commercial Products


An Entity of Type : http://linked.opendata.cz/ontology/domain/vavai/Vysledek, within Data Space : linked.opendata.cz associated with source document(s)

Attributes	Values
rdf:type	skos:Concept http://linked.opendata.cz/ontology/domain/vavai/Vysledek
Description	Corpora have developed from pure texts into sophisticated tagged tools. They provide quick and seemingly indisputable answers to complicated questions about language. This paper proposes reasons why those answers do not necessarily describe the language, using examples from the Czech representative corpus SYN2000: (1) Texts are often represented in a form in which nobody has written or published them (or could have). (2) The tagging is far from plausible as a labelling of language phenomena, both in conception and in implementation. (3) Even if statistical data were proof enough, the explanation of language cannot be derived from the data themselves; it is obtained over the data, by interpreting them. (4) Linguistic inquiry is impeded rather than supported by the development of corpus-based tools whose settings a researcher cannot modify or test for appropriateness in each single case (as with WordSketches). Thus, the more sophisticated the corpus tools, the lower the guarantee of scientifically plausible results, or rather the more effort it takes to make the results plausible. What so-called corpus linguistics offers, ... (en)
Title	From the Corpus as Open Source for Investigation to Commercial Products (en) Od korpusu jako otevřeného zdroje pro bádání ke komerčním produktům (cs)
skos:prefLabel	From the Corpus as Open Source for Investigation to Commercial Products (en) Od korpusu jako otevřeného zdroje pro bádání ke komerčním produktům (cs)
skos:notation	RIV/00216208:11260/07:00003319!RIV08-GA0-11260___
http://linked.open.../vavai/riv/strany	243;249
http://linked.open...avai/riv/aktivita	P
http://linked.open...avai/riv/aktivity	P(GA405/06/1057)
http://linked.open...vai/riv/dodaniDat	2008
http://linked.open...aciTvurceVysledku	Šimandl, Josef
http://linked.open.../riv/druhVysledku	D - Article in proceedings
http://linked.open...iv/duvernostUdaju	S - Complete and true data not subject to protection under special legal regulations
http://linked.open...titaPredkladatele	Univerzita Karlova v Praze / Katolická teologická fakulta
http://linked.open...dnocenehoVysledku	422677
http://linked.open...ai/riv/idVysledku	RIV/00216208:11260/07:00003319
http://linked.open...riv/jazykVysledku	eng - English
http://linked.open.../riv/klicovaSlova	Corpus; Source; Investigation; Commercial; Products (en)
http://linked.open.../riv/klicoveSlovo	Products; Source; Commercial; Corpus; Investigation
http://linked.open...ontrolniKodProRIV	[136C4F4628B2]
http://linked.open...i/riv/mistoVydani	Praha
http://linked.open...i/riv/nazevZdroje	Gramatika a korpus 2005 = Grammar & Corpora 2005
http://linked.open...in/vavai/riv/obor	AI
http://linked.open...ichTvurcuVysledku	1 (xsd:int)
http://linked.open...cetTvurcuVysledku	1 (xsd:int)
http://linked.open...vavai/riv/projekt	Studies in Czech Grammar
http://linked.open...UplatneniVysledku	2007
http://linked.open...iv/tvurceVysledku	Šimandl, Josef
number of pages	7 (xsd:int)
http://purl.org/ne...btex#hasPublisher	Ústav pro jazyk český AV ČR
https://schema.org/isbn	978-80-86496-32-0
http://localhost/t...ganizacniJednotka	11260
