<<Up     Contents

Chomsky hierarchy

The Chomsky hierarchy is a containment hierarchy of classes of formal grammars that generate formal languages. This hierarchy was described by Noam Chomsky in 1956.

Formal grammars

A formal grammar consists of a finite set of terminal symbols (the letters of the words in the formal language), a finite set of nonterminal symbols, a set of production rules with a left- and a right-hand side consisting of a word of these symbols, and a start symbol. A rule may be applied to a word by replacing the left-hand side by the right-hand side. A derivation is a sequence of rule applications. Such a grammar defines the formal language of all words consisting solely of terminal symbols that can be reached by a derivation from the start symbol.

Nonterminals are usually represented by uppercase letters, terminals by lowercase letters, and the start symbol by an "S". For example, the grammar with terminals {a, b}, nonterminals {S, A, B}, production rules

S -> ABS
S -> ε (with ε the empty string)
BA -> AB
BS -> b
Bb -> bb
Ab -> ab
Aa -> aa
and start symbol S, defines the language of all words of the form anbn. (ie: n copies of a followed by n copies of b)(See formal grammar for a more elaborate explanation.)

The hierarchy

The Chomsky hierarchy consists of the following levels:

Every regular language is context-free, every context-free language is context-sensitive and every context-sensitive language is recursively enumerable. These are all proper inclusions, meaning that there exist recursively enumerable languages which are not context-sensitive, context-sensitive languages which are not context-free and context-free languages which are not regular.

The following table summarizes each of Chomsky's four types of grammars, the class of languages it generates, the type of automaton that recognizes it, and the form its rules must have.

Grammar Languages Automaton Production rules
Type-0 Recursively enumerable Turing machine No restrictions
Type-1 Context-sensitive Linear-bounded non-deterministic Turing machine αAβ -> αγβ
Type-2 Context-free Non-deterministic pushdown automaton A -> γ
Type-3 Regular Finite state automaton A -> aB
A -> a

References

wikipedia.org dumped 2003-03-17 with terodump