Abstract

Zipf's law is the observation that a few very frequent words make up a very large portion of any text, while most words occur relatively rarely. For anyone learning a language, this means there is a core vocabulary that gives a learner access to the network of the language. Language instructors can use this feature to craft materials that focus on the vocabulary items giving the greatest return on investment. This holds for all language learners, including those preparing for academic study or a specialized field. Mandelbrot added parameters to Zipf's formula to account for the top and bottom of the distribution, which deviate from the roughly straight line that word frequencies make on a log–log graph. Rather than being theoretically inconvenient, these deviations hint at a small-world network structure that maximizes entropy. The lexicon of a language seems to be poised at a critical point between the need to maximize information transfer and the need to minimize the cognitive cost of retrieving or recognizing words. The network of language is self‐organizing, dynamically preserving this pattern despite historical changes in morphology, semantics, and syntax. As corpus linguists compile larger corpora of different languages and of different types of speech and writing, this power law distribution seems to be universal. It also appears to hold for the distribution of galaxies in the universe and for the energy states of subatomic particles. This suggests that more general principles relate this pattern to other biological and physical phenomena.
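
The abstract does not state the formula itself. As an illustration only, the commonly cited Zipf–Mandelbrot form makes the frequency of the word at rank r proportional to 1 / (r + q)^s, and plain Zipf's law is the special case q = 0 with s close to 1. The short Python sketch below uses that form to estimate how much of a text the most frequent words would cover. The function names, the vocabulary size of 10,000, and the values s = 1.0 and q = 2.7 are assumptions chosen for the example, not figures from the article.

```python
def zipf_mandelbrot(num_ranks, s=1.0, q=0.0):
    """Normalized frequencies proportional to 1 / (rank + q) ** s.

    With q = 0 this reduces to plain Zipf's law. The parameters s and q
    are illustrative assumptions, not values taken from the article.
    """
    weights = [1.0 / (rank + q) ** s for rank in range(1, num_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def coverage(freqs, top_k):
    """Share of all word tokens accounted for by the top_k most frequent words."""
    return sum(freqs[:top_k])


# Hypothetical vocabulary of 10,000 word types.
zipf = zipf_mandelbrot(10_000)
mandelbrot = zipf_mandelbrot(10_000, s=1.0, q=2.7)

for k in (10, 100, 1000):
    print(f"top {k:>4} words: Zipf {coverage(zipf, k):.1%}, "
          f"Zipf-Mandelbrot {coverage(mandelbrot, k):.1%}")
```

Under these assumed parameters, the 100 most frequent words account for roughly half of all tokens in the plain Zipf case, which is the kind of return on investment the abstract describes for a core vocabulary. A positive q flattens the top of the curve, so the very highest ranks carry a smaller share than a straight line on a log–log graph would predict.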

Keywords

language words network distribution pattern
