Starting out with DDL: How do students use their own do-it-yourself corpus?

Maggie Charles, Oxford University

Research on the uses of data-driven learning (DDL) has recently centred on the ways in which students consult corpora (e.g. Pérez-Paredes et al., 2013; Yoon, 2016). This research typically provides students with access to one or more large general corpora, and student searches are prompted by the demands of writing an assignment (Yoon, 2016) or carrying out a given grammar task (Pérez-Paredes et al., 2013). In this paper I take a slightly different approach by focusing on the first steps that students take as independent users of corpora for their own DDL purposes. The students in question were participants in a course entitled ‘Editing your Thesis with Corpora’, during which they built two do-it-yourself corpora consisting of 1) research articles in their own field and 2) draft chapters of their own thesis. I start by giving an overview of this course and of the practice material on concordancing with AntConc (Anthony, 2014), which was given to the students in the first session. In the second session, students constructed a quick-and-dirty corpus of research articles, using AntFileConverter (Anthony, 2015) to create plain text files from PDFs in their own bibliography. They were then asked to deal with five of their own editing queries using the corpora they had just built. In class, they recorded their searches and outcomes on worksheets, which provide the data for this study. Data are available for 63 students (all with L2 English) and indicate that even at this early stage and with relatively small corpora, students can answer most of their queries (84%) to their own satisfaction. This paper reports in more detail on these queries and their outcomes and discusses their implications for facilitating the early stages of corpus consultation. It argues that the examination of such query-outcome sequences has a role to play in devising effective tasks for novice corpus users in academic writing.


Anthony, L. (2014). AntConc (Version 3.4.4) [Computer Software]. Tokyo, Japan: Waseda University.

Anthony, L. (2015). AntFileConverter (Version 1.2.0) [Computer Software]. Tokyo, Japan: Waseda University.

Pérez-Paredes, P., Sánchez-Tornel, M., & Alcaraz Calero, J. (2013). Learners’ search patterns during corpus-based focus-on-form activities. International Journal of Corpus Linguistics, 17(4), 482–515.

Yoon, C. (2016). Concordancers and dictionaries as problem-solving tools for ESL academic writing. Language Learning & Technology, 20(1), 209–229.

Slides: BAAL Corpus Sig Charles