Recent literature in corpus linguistics (e.g., McEnery & Ostler 2000) and language documentation (e.g., Johnson 2004) suggests that the two disciplines may share natural points of interaction, having in common an interest in the construction and use of permanent collections of diverse linguistic data. Although considerable benefit might be anticipated from close collaboration between these two areas, divergences in their respective purposes, practices, and products may make such interaction more difficult to foster than might initially be expected. This paper considers points of commonality and difference between corpus linguistics and language documentation in four specific areas of practice, drawing upon examples from ongoing corpus construction and language documentation efforts centered on Mennonite Plautdietsch in Canada. Given the results of this comparison, this study proposes viewing corpora as descriptive applications of language documentation, to be built directly upon the permanent documentary record. By founding corpora upon documentary materials, such an approach opens language documentation more readily to the analytical and methodological contributions of corpus linguistics, while providing a solid empirical basis for future corpus construction.

Series: Language and Computers
Cox, C. (2011). Corpus linguistics and language documentation: Challenges for collaboration. In Language and Computers.