KOJIMA Masumi's Website

P_Lex (Meara & Bell, 2001)

Introduction

P_Lex is a lexical richness measure developed by Meara and Bell (2001) that draws on word frequency lists. Meara and Bell argue that P_Lex is mathematically more sophisticated than the Lexical Frequency Profile (LFP; Laufer and Nation, 1995) and that the data it produces are easier to work with. They also claim that P_Lex works much better than the LFP with shorter texts. P_Lex assumes that low-frequency words occur rarely in a text, so that the number of such words in a given stretch of text approximates a Poisson distribution. The P_Lex score is called λ (lambda); it determines the overall shape of the Poisson curve that is fitted to the observed data.
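
For reference, the Poisson distribution mentioned above can be written in its standard textbook form. The symbol k (the number of low-frequency words observed in a stretch of text) is introduced here for illustration and does not appear in the original description.

```latex
P(k;\lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
```

Here λ is also the mean of the distribution, so a larger λ shifts the curve towards higher counts of low-frequency words.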

How P_Lex works

P_Lex works as follows. First, it divides the text into a set of 10-word segments. Second, each word is categorised as ‘easy’ (the 1,000 most frequent words, proper nouns, and numbers) or ‘difficult’ (all other words). Then, P_Lex counts the number of segments containing zero difficult words, the number containing one difficult word, and so on. Plotting these counts as proportions of the total number of segments produces a curve such as the one illustrated in Figure 1. In the illustrated case, the proportion of segments containing zero difficult words is 0.4, the proportion containing one difficult word is 0.4, and the proportion containing two difficult words is 0.2. The observed curve is then matched against a set of theoretical Poisson curves, each of which has its own lambda (λ) value, and the λ of the best-matching curve is taken as the score. In this case, the data match a theoretical curve with lambda = 0.92, and therefore the P_Lex score for this text is 0.92. The authors state that higher scores correspond to a higher proportion of infrequent words in a text and thus to a lexically richer text.
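
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the handling of a trailing incomplete segment (dropped here), and the least-squares grid search used to pick the best-matching curve are all choices made for this sketch. The `easy_words` set is assumed to be supplied by the caller and to include any proper nouns, since the sketch does not detect them.

```python
import math
from collections import Counter

SEGMENT_LENGTH = 10  # words per segment, as in the description above


def difficult_counts(tokens, easy_words, segment_length=SEGMENT_LENGTH):
    """Return the number of difficult words in each complete segment.

    Assumption: a trailing segment shorter than segment_length is dropped.
    Assumption: easy_words is lower-cased and already contains proper nouns.
    """
    counts = []
    for start in range(0, len(tokens) - segment_length + 1, segment_length):
        segment = tokens[start:start + segment_length]
        counts.append(sum(
            1 for word in segment
            if word.lower() not in easy_words and not word.isdigit()
        ))
    return counts


def observed_proportions(counts, segment_length=SEGMENT_LENGTH):
    """Proportion of segments with 0, 1, ..., segment_length difficult words."""
    if not counts:
        raise ValueError("the text is shorter than one segment")
    tally = Counter(counts)
    return [tally.get(k, 0) / len(counts) for k in range(segment_length + 1)]


def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def fit_lambda(proportions, step=0.01, max_lambda=5.0):
    """Pick the lambda whose Poisson curve is closest to the observed curve.

    Assumption: closeness is the sum of squared differences, searched over
    a grid; max_lambda is an arbitrary upper bound chosen for this sketch.
    """
    best_lam, best_err = None, float("inf")
    for i in range(1, int(round(max_lambda / step)) + 1):
        lam = round(i * step, 2)
        err = sum((p - poisson_pmf(k, lam)) ** 2
                  for k, p in enumerate(proportions))
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam


# Proportions from the illustrated case: 0.4, 0.4, 0.2, then zeros.
example = [0.4, 0.4, 0.2] + [0.0] * 8
print(fit_lambda(example))
```

With the proportions from the illustrated case, this search should land close to the reported value of 0.92, although the exact result depends on the fitting criterion assumed here.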

References

  • Laufer, B., & Nation, P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics, 16(3), 307–322.
  • Meara, P., & Bell, H. (2001). P_Lex: A simple and effective way of describing the lexical characteristics of short L2 texts. Prospect, 16(3), 5–19.