This is a list of relevant publications. Please contribute if you know of any publications that touch upon this issue! (Don't worry about formatting to fit a certain citation style; the content is more important.)
- Abney, Steven, and Steven Bird. 2010. The Human Language Project: building a universal corpus of the world's languages. In Proceedings of the 48th annual meeting of the association for computational linguistics. Uppsala, Sweden.
- Abney, Steven, and Steven Bird. 2011. Towards a data model for the Universal Corpus. In Proceedings of the 4th workshop on building and using comparable corpora: Comparable corpora and the web. Portland, Oregon.
- Bender, Emily M., and Jeff Good. 2010. A grand challenge for linguistics: Scaling up and integrating models. White paper contributed to NSF’s SBE 2020.
- Bird, Steven, and David Chiang. 2012. Machine Translation for Language Preservation. In Proceedings of the 24th International Conference on Computational Linguistics (COLING): Poster. Mumbai, India.
- Bird, Steven, Lauren Gawne, Katie Gelbart, and Isaac McAlister. 2014. Collecting Bilingual Audio in Remote Indigenous Communities. In Proceedings of the 25th International Conference on Computational Linguistics (COLING). Dublin, Ireland.
- Good, Jeff. 2010. Data and language documentation. Peter Austin and Julia Sallabank (eds.), Handbook of Endangered Languages. Cambridge: Cambridge University Press. 212–234.
- Hanke, Florian R., and Steven Bird. 2013. Large-scale text collection for unwritten languages. In Proceedings of the 6th International Joint Conference on Natural Language Processing. Nagoya, Japan.
- Holton, Gary. 2010. The role of information technology in supporting small and endangered languages. The Cambridge Handbook of Endangered Languages, ed. by Peter K. Austin & Julia Sallabank. 371-99. Cambridge University Press.