Concordancing for the English Language Teacher and Learner
Robert L Fielding
The case for using concordancing software in the classroom.
With advances in computer software, and in concordancing software in particular, and with the huge corpora of language that such software is able to process, aspects of language that were once invisible are now visible. Dave Willis, in his book The Lexical Syllabus (1993), and the Collins Cobuild team working from the University of Birmingham have outlined these aspects in detail. Willis states that ‘the fact that a lexical description depends on more powerful generalisations means that learners will have more evidence on which to base useful generalisations about the language’ (Willis 1993). He goes on to report the work of Selinker and Corder, both experts in Interlanguage Theory, who describe language learning as a process of continually forming, editing and revising hypotheses about the grammar of the language, and who state that learners need a lot of evidence, in the form of exposure to the language, before they are able to reach stable conclusions about the grammar.
The main drift of Willis’ argument is that a more productive syllabus would be formed with reference to lexis rather than structure, instead of being based upon transformational grammar. He points to the way structural items such as the passive, conditionals and reported speech are traditionally taught, and contrasts this with the ‘reality’ of English in vivo: actual language as used by native users, rather than English contrived to illustrate a grammatical structure. While this is not the place to go deeply into such a debate, it would seem, from the findings of those with access to corpora of language in use, that “a new, thorough-going description of the English language, and one which is not based on the introspection of its authors, but which recorded their observations of linguistic behaviour as revealed in naturally occurring text” (Renouf 1987) would help students make the generalisations they need to make far more successfully than recourse to the contrivances of the grammar-based course-books at present in use in our profession.
As teachers of English, we have been, and still are, deluged with ‘theories’: of learning, of teaching, and of what our subject matter consists of and how it works. Of these, only the last can claim to rest on evidence; it presents us with data-driven information relating to authentic language in vivo, and bears out the position of Willis outlined above.
Like all revelations in any field, this one challenges our accepted thinking on what language consists of and how it is used by people using it for authentic reasons.
Changes in our ways of thinking about our subject matter should and will change how we teach it. We can either accept this, go along with it, use it and get used to it, or we can bury our heads in the sand like latter-day educational ostriches.
Introduction to concordancing: What is a concordancer?
“The most valuable contribution a computer can make to language learning is in supplying on demand and in an organised fashion, masses and masses of authentic language. The most powerful of these is a concordancer.” (Higgins 1991)
If we are to teach English to people who are going to communicate in that language, through whatever medium, teaching has to utilise what we refer to as authentic language. To do otherwise would be to teach something that is ultimately useless, or in fact worse than useless, since using language that is not authentic is tantamount to using language that is not appropriate, and misunderstandings arise from it. Dell Hymes and others have coined the term ‘communicative competence’ to refer to ways of using language that are appropriate for the time, the place, and the people being addressed.
Given then that authentic language for authentic situations is what we are in the business of teaching, it follows that we should take advantage of the most up to date, comprehensive data on that language, and then use it to inform what we teach and how we teach it.
A concordancer is a program that searches large amounts of text and lists every occurrence of a chosen word or phrase together with its immediate context. A concordance itself is nothing new, of course, but an electronic one is a relatively new device.
Coupled with advances in this technology is the bringing together, in machine-readable form, of huge collections (corpora) of language from a variety of sources. The corpora accumulated and used by the Collins Cobuild Project at Birmingham University, and available via a concordancer on-line, consist of millions upon millions of words taken from authentic sources such as newspapers, magazines, journals, and the spoken voice. Together they form a corpus of real language in vivo in our present time. Of course, there are many thousands of such corpora on-line, and these come from all kinds of discourse communities, scientific and otherwise. “An extensive linguistic corpus is a gold mine of authentic language use that through KWIC (Key Word In Context) concordances can provide students with multiple contexts from which to learn new vocabulary.” Now, while it is true that a dedicated teacher can pull out words from texts and write them on the board for his students, an electronic concordancer processing a corpus of language can not only pull out the words we find interesting and worthy of teaching, but can also pull out the immediate contexts in which such vocabulary operates.
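To make the idea of a KWIC concordance concrete, here is a minimal sketch in Python. It is not the method used by the Cobuild concordancer or by any particular product; the function name kwic, the sample sentences and the context width are all assumptions chosen for illustration. It simply finds each occurrence of a key word and prints it in a fixed column with a few characters of context on either side.

```python
import re

def kwic(text, keyword, width=30):
    """Return one line per occurrence of `keyword`, with up to `width`
    characters of context on each side (a simple KWIC display)."""
    # Collapse line breaks and repeated spaces so contexts read as running text.
    flat = " ".join(text.split())
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-align the left context so the key words line up in a column.
        lines.append(f"{left:>{width}} {m.group(0)} {right:<{width}}")
    return lines

# Hypothetical sample text, invented for this illustration.
sample = (
    "You can take a bus to the station. She will take the money to the bank. "
    "Take a look at this before you take a decision."
)

for line in kwic(sample, "take", width=25):
    print(line)
```

Because the key word always appears in the same column, the words immediately to its left and right can be scanned by eye, which is what allows patterns to become visible to teacher and student alike.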
A concordancer can isolate and render visible patterns of language that occur often enough to make them worthy of our attention as teachers of English. Perhaps more importantly, a concordancer can do the same for students of English. While I would not pretend that such software is simple to use at first, it becomes sufficiently easy after a short period of perseverance and diligence; it is ‘user-friendly’.
Below, I quote some of the benefits that this software offers learners.
1. With a concordancer, the teacher chooses the right corpus for particular learners. He provides his students with tasks according to his teaching objectives. The students work on their own, or in pairs, on authentic material or any other kind of teaching material.
2. The students can draw their own conclusions about the use of the given words by focusing on certain points in the contexts in which these words appear.
3. This provides learners with an opportunity to develop strategies which they can build on once the language class is finished.
4. It opens language classes to the use and interpretation of up-to-date and often authentic language even at lower levels.
5. They bring cognitive and analytic skills in students to bear on the manipulation of comprehensive databases for the purpose of solving real-language problems.
6. Why use concordancing in language learning?
i) It interjects authenticity (of purpose and activity) into the learning process.
ii) Learners assume control of that process.
iii) The predominant metaphor for learning becomes the research metaphor as embodied in the concept of data-driven learning (DDL).
7. It gives access to many important language patterns in texts.
8. They are the only way of providing students with a lot of authentic textual data.
9. Concordances are the only way to expose students to large numbers of collocations in authentic texts.
10. Helps the inductive approach to language learning, or the deductive approach.
11. “An extensive linguistic corpus is a gold mine of authentic language use that through KWIC concordances can provide students with multiple contexts from which to learn new vocabulary.”
12. “If the top 200 or so most frequent words in English are systematically taught in all of their forms and in well-structured materials, they will carry with them most of the grammatical and discourse detail that the second and foreign language learners are ever likely to need.”
13. Louw has used concordances to study progressive delexicalization, the phenomenon by which words tend to lose their ‘dictionary meaning’; e.g. ‘take the money’, ‘take a bus’, ‘take a look’.
14. The most valuable contribution a computer can make to language learning is in supplying, on demand and in an organised fashion, masses and masses of authentic language. The most powerful of these tools is the concordancer. What the concordancer does is make the invisible visible.
15. Requires the notion that language learners can benefit from teaching materials promoting inductivity, authenticity and learner responsibility for learning. Particularly where technology is involved, there is much ignorance, misunderstanding and ‘indifference’ to putting into practice new approaches to language teaching while operating new skills in operating complex hardware and software. Although text manipulation is conveniently implemented and consistent with current language learning pedagogy, its benefits are difficult to intuit; hence the genre is easily misunderstood.
All the quotes above were downloaded from on-line sources that are listed in the References below.
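The quote on delexicalization above can be illustrated with a short sketch in Python. This is not Louw’s procedure; the function right_collocates, the decision to skip articles, and the sample sentences are assumptions made purely for illustration. The sketch counts the first word that follows each occurrence of a node word such as ‘take’, which is the kind of evidence a student would look at to see how far the word has drifted from its dictionary meaning.

```python
import re
from collections import Counter

def right_collocates(text, node, skip=("a", "an", "the")):
    """Count the first word following each occurrence of `node`,
    skipping articles so that 'take a bus' is counted under 'bus'."""
    # Very rough tokenisation: lower-case runs of letters and apostrophes.
    # Sentence boundaries are ignored in this sketch.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        j = i + 1
        while j < len(tokens) and tokens[j] in skip:
            j += 1
        if j < len(tokens):
            counts[tokens[j]] += 1
    return counts

# Hypothetical sample text, invented for this illustration.
sample = (
    "Take the money and run. We take a bus to work. Take a look at this. "
    "They take the bus home. Let me take a look."
)

for word, freq in right_collocates(sample, "take").most_common():
    print(word, freq)
```

With a real corpus in place of the invented sample, a list of this kind would show at a glance which uses of a common verb are literal and which are delexicalised.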
The difference between, and different uses of, on-line concordancers (free sites) and downloadable software such as ‘Concordance’ (often pay sites).
On-line concordancers are plentiful and usually free to use. Of these, the one offered by Collins Cobuild is worth mentioning. Concordances of any word, or string of words, are available almost instantaneously. Some examples of these can be found below in the Appendices. The concordancer uses several corpora of written and spoken language of a ‘general’ nature, which is to say that they are not subject-specific. For example, ‘oxidised’ is the term for a scientific process, and so a ‘technical’ word, although it is occasionally used in ‘normal’ discourse, in the press for instance. A concordance of it drawn from a general corpus would perhaps not be adequate for a university term paper outlining the process of oxidisation. Nevertheless, much of what we teach is English of the common or garden variety, and so information on how more ordinary words behave, and affect one another in the contexts in which they are used, would be very useful to both teachers and their students.
For more genre-specific language use, one has to go to genre-specific corpora, of which there are many on-line. The problem then is how to process them, and that is where downloaded concordancing software becomes necessary and invaluable.
With one’s own software, one is able to process whatever one wishes, be it from on-line sources, scanned documents or students’ own efforts at writing English. Everything, however, has to be presented to the concordancer in a form it can manage. All this requires is the simple expedient of saving the file as TEXT ONLY, and then saving it to a drive on the computer as you would any file.
Now, the advantages and uses of such sources of language readily available for processing at one’s fingertips would be enormous.
Material from on-line sources could be chosen to represent the variety of language to be taught. Similarly, with scanned documents, one could scan current journals into readable forms, and with documents from students’ own efforts, one could use such data to inform a new syllabus, correct an old one, or offer remedial courses for students with more fundamental recurring difficulties expressing themselves.
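The requirement that material be saved as text only can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any particular concordancer reads its input: the function names load_corpus and count_words, the file names and the sample sentences are all hypothetical. The sketch gathers every plain-text file in a folder into one small corpus and reports how many words each file contributes.

```python
import tempfile
from pathlib import Path

def load_corpus(folder, encoding="utf-8"):
    """Read every .txt file in `folder` into a dict of file name -> text."""
    corpus = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps going if a file has a stray bad character.
        corpus[path.name] = path.read_text(encoding=encoding, errors="replace")
    return corpus

def count_words(corpus):
    """Return the number of whitespace-separated words in each file."""
    return {name: len(text.split()) for name, text in corpus.items()}

# Demonstration with a temporary folder standing in for a folder of
# students' writing saved as text only (file names and contents invented).
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "essay1.txt").write_text(
        "I am agree with this opinion.", encoding="utf-8")
    Path(folder, "essay2.txt").write_text(
        "She take the bus every day.", encoding="utf-8")
    corpus = load_corpus(folder)
    print(count_words(corpus))
```

In practice the temporary folder would be replaced by a real one containing downloaded texts, scanned journals or student essays, each saved as plain text.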
Details of one such downloadable, affordable concordancing program appear in the Appendices below, and are linked into the PowerPoint presentation of which this essay is also a part.
Here are some of the things that can be done with it.
• Make wordlists, word frequency lists, and indexes
• Make full concordances to texts of any size, limited only by available disk space and memory
• Make fast concordances, picking your selection of words from text
• Use multiple input files
• Make concordances straight from text in other Windows programs
• Make Web Concordances: turn your concordance into linked HTML files, ready for publishing on the Web, with a single click
• View a full wordlist, a concordance, and your original text simultaneously
• Browse through the original text and click on any word to see the concordance for that word
• Edit and re-arrange a wordlist by drag and drop
• See the collocation counts for every word, up to four words left and right
• Lemmatise a wordlist - group together any words you choose.
• Support for many different languages and character sets
• User-definable alphabet
• User-definable reference system
• User-definable contexts
• Very flexible search, selection, and sorting criteria
• Statistics on your text
• Stop Lists let you specify words to be omitted from your concordance
• Word length chart
• Full print preview and printing, with control over page size, margins, headers, footers, fonts etc.
• Can save concordances as plain text, as a single HTML file, or as a Web Concordance
• Built-in file viewer can display files of unlimited size
• Built-in editor allows fast editing of files up to 16MB
• Tools supplied for converting from OEM to ANSI character sets and from Unix to PC files
• Easy user interface with modern Windows features
• Context-sensitive help system with over 200 topics
• Runs fast - can pick 15000 occurrences of a word from a 1.5MB text in under 4 seconds on a 600MHz Pentium III
• Entirely native 32-bit code for speed and stability
CONCORDANCE COPYRIGHT © 1999, 2000, 2002 R.J.C. WATT ALL RIGHTS RESERVED
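Three of the features listed above (word frequency lists, stop lists, and collocation counts up to four words left and right) can be sketched in Python as follows. This is not the code of the Concordance program, whose internals are not described here; the function names, the tokenisation rule and the sample text are assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lower-case word tokens (unaccented letters and
    apostrophes only; a real tool would handle other alphabets)."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(tokens, stop_list=()):
    """Count how often each word occurs, omitting words in the stop list."""
    stop = set(stop_list)
    return Counter(t for t in tokens if t not in stop)

def collocation_counts(tokens, node, span=4):
    """Count the words found within `span` positions to the left and to
    the right of each occurrence of `node`."""
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left.update(tokens[max(0, i - span):i])
            right.update(tokens[i + 1:i + 1 + span])
    return left, right

# Hypothetical sample text, invented for this illustration.
text = "The cat sat on the mat. The dog sat on the cat."
tokens = tokenize(text)

print(frequency_list(tokens, stop_list=["the", "on"]).most_common())
left, right = collocation_counts(tokens, "sat", span=4)
print("left of 'sat': ", dict(left))
print("right of 'sat':", dict(right))
```

The stop list removes very common grammatical words so that content words rise to the top of the frequency list, while the left and right counts show which words habitually keep company with the node word.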
Vocabulary, as has been said here, is important in allowing students to formulate, edit and revise generalisations about the language they are learning (Selinker and Corder, ibid). We also constantly wish to move learning strategies away from mere rote learning and towards research-based, experiential learning, built on evidence which students themselves are responsible for finding, with guidance from teachers. Working in this way, we as teachers would become more in tune with what should be our proper role in the language classroom: language consultants, facilitators and helpers, rather than the sources of all knowledge on whom students sometimes rely too heavily. Having concordancing software, together with corpora of authentic language associated with the subject areas of students’ intended majors, and personnel trained in its use and able to pass this skill on to students, would greatly enhance the learning and teaching of the English language.
Robert L Fielding
References
Corder, S.P. (1967). The Significance of Learners’ Errors. IRAL.
Godwin-Jones, B. (2001). Emerging Technologies. Virginia Commonwealth University.
Hill, J. & Lewis, M. (1999). LTP Dictionary of Selected Collocations. LTP, Hove, England.
Renouf, A. (1987). Corpus Development. Collins.
Selinker, L. (1972). Interlanguage. IRAL.
Vance’s ESL Home Text Analysis Inc.
Willis, D. (1993). The Lexical Syllabus. Harper Collins, London.