How can we show that a formal language is not context free. One important tool for this is the pumping lemma for context-free languages.

The pumping lemma states that if a language \(L\) is context-free. then there is a string-length \(l\) such that for any strings \(s\) in the language with length greater than or equal to \(l\) (i.e., \(|s| \geq l\)), \(s\) can be written as \(uvwxy\) (i.e., \(s=uvwxy\)) such that \(|vx| \geq 1\) and \(|vwx| \leq l\) and that all strings \(uv^n wx^n y\) for all \(n\) are also in \(L\).

The intuition behind the pumping lemma is that if a string is long enough, it must have been generated by recursion and this recursion can lead to a center-embedding. But if this recursion happened once, it can happen an unbounded number of times, so whatever material it adds into the string, it can add into the string some unbounded number of times.

Note that the pumping lemma is not an abstract characterization since it only provides a necessary, but not a sufficient condition for membership in the class of context-free languages.


48 Parameter Estimation for CFGs 50 Natural Languages and Context-Free Languages