A corpus is a large collection of written or spoken texts, held as a database that can be searched to show all the instances of a particular word and the contexts in which it is used.
90% of the British National Corpus (BNC) is written language
The written part is made up of:
10% of the BNC is spoken language
The spoken part is made up of:
As lexicographers we would hate to be without a large, well-balanced corpus. It gives us an invaluable picture of the way words are really used today. We use the BNC to confirm our intuitions and also to tell us things we didn't already know, or may not have thought about. We can find out exactly what a word means, rather than what we think it means. We can see how it behaves grammatically and which words it collocates with. We use all this information when writing our learners’ dictionaries.
For example, look at this extract from the BNC in which ‘bent’ and ‘on’ have been searched for together.
The concordance lines tell us that ‘bent on’ can be followed by a noun or noun phrase, or by the -ing form of a verb. They also show clearly that many of the things somebody is bent on, or bent on doing, have something in common. Can you see what it is?
Answer: They are often negative (destroying, destruction, creating hell on earth).
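For readers curious about how a search like this works, here is a minimal sketch in Python of a concordance lookup for a two-word phrase. It is an illustration only, not the software used to search the BNC. The sample sentences are invented for the example and are not taken from the BNC, and the names `SAMPLE_TEXTS`, `tokenize`, `concordance` and `next_words` are hypothetical.

```python
import re

# Invented sample sentences for illustration only.
# These are NOT taken from the BNC.
SAMPLE_TEXTS = [
    "The invaders seemed bent on destruction of the old town.",
    "He was bent on destroying everything she had built.",
    "She bent down to pick up the coin.",
    "They are bent on winning the cup this year.",
]


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())


def concordance(texts, phrase, width=3):
    """Return (left context, match, right context) for each occurrence of phrase.

    `width` is the number of words of context kept on each side.
    """
    target = tokenize(phrase)
    n = len(target)
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + n:i + n + width])
                lines.append((left, " ".join(target), right))
    return lines


def next_words(lines):
    """Collect the first word that follows each match."""
    return [right.split()[0] for _, _, right in lines if right]


hits = concordance(SAMPLE_TEXTS, "bent on")
for left, match, right in hits:
    # Right-align the left context so the matches line up in a column.
    print(f"{left:>30}  [{match}]  {right}")
```

Lining the matches up in a column is what makes patterns easy to spot: `next_words(hits)` lists the word immediately after each match, which is where the noun or -ing form appears. A real corpus search runs over millions of words and usually relies on an index rather than scanning every text in turn.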
Here is the entry in the Oxford Advanced Learner’s Dictionary that uses this information: