Digital corpora and grammatical words

Authors

  • Mar Garachana, Universitat de Barcelona
  • Esther Artigas, Universitat de Barcelona

Abstract

This paper examines the advantages and disadvantages of using electronic corpora to study the grammaticalization of verbal constructions in the history of Spanish. Our data are drawn from the corpora compiled by the Real Academia Española, namely CORDE and CREA. These corpora are neither lemmatized nor tagged, so concordance searches for a given expression often return an unwieldy number of irrelevant occurrences. A further difficulty, especially with CORDE, is that the editions included are all too often not contemporaneous with the original manuscripts, which seriously compromises the conclusions of studies based on these corpora. We illustrate the problem with an analysis of the first attestations of the verbal periphrasis tener + (de/a) + infinitive.

Keywords

Corpus Linguistics, Grammaticalization, Verbal Periphrases

Published

15-10-2012
