Theory-driven corpus research: using corpora to inform aspect theory.

This article maintains that linguistics is the study of language as reflected in both our knowledge and our use of language, contrary to Chomskyan linguists' assertion that performance data cannot be the subject of linguistics. A comparison of the three types of data used in linguistics, namely introspective data, elicited data and corpus data, shows that corpus data is more reliable than the other two types, as a corpus can provide data that is attested, contextualized, and quantitative. The corpus-based approach also achieves improved reliability because it attaches importance to empirical data without going to the extreme of rejecting intuition. Corpora can be used to verify and revise existing linguistic theories, or to provide what intuition alone cannot discern and so form the basis for entirely new linguistic theories; even so, the sharp distinction drawn in the literature between the corpus-based and corpus-driven approaches is overstated. This article also presents a case study of aspect which demonstrates that a marriage between theory-driven and corpus-based approaches to linguistics can lead to more accurate linguistic descriptions and hence theories.

