| 1 | +<!DOCTYPE html> |
| 2 | +<!-- Academia (pandoc HTML5 template) |
| 3 | + designer: soimort |
| 4 | + last updated: 2016-05-07 --> |
| 5 | +<html> |
| 6 | + <head> |
| 7 | + <meta charset="utf-8"> |
| 8 | + <meta name="generator" content="pandoc"> |
| 9 | + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes"> |
| 10 | + <meta name="author" content="Mort Yao"> |
| 11 | + <meta name="dcterms.date" content="2017-04-11"> |
| 12 | + <title>Context-Free Languages</title> |
| 13 | + <link rel="canonical" href="https://wiki.soimort.org/comp/language/context-free"> |
| 14 | + <style type="text/css">code { white-space: pre; }</style> |
| 15 | + <link rel="stylesheet" href="//cdn.soimort.org/normalize/5.0.0/normalize.min.css"> |
| 16 | + <link rel="stylesheet" href="//cdn.soimort.org/mathsvg/latest/mathsvg.min.css"> |
| 17 | + <link rel="stylesheet" href="//cdn.soimort.org/fonts/latest/Latin-Modern-Roman.css"> |
| 18 | + <link rel="stylesheet" href="//cdn.soimort.org/fonts/latest/Latin-Modern-Mono.css"> |
| 19 | + <link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css"> |
| 20 | + <link rel="stylesheet" href="/__/css/style.css"> |
| 21 | + <link rel="stylesheet" href="/__/css/pygments.css"> |
| 22 | + <script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_CHTML-full" type="text/javascript"></script> |
| 23 | + <!--[if lt IE 9]> |
| 24 | + <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script> |
| 25 | + <![endif]--> |
| 26 | + <script src="//cdn.soimort.org/jk/20160504/jk.min.js"></script> |
| 27 | + <script src="//cdn.soimort.org/mathsvg/latest/mathsvg.min.js"></script> |
| 28 | + <script src="/__/js/jk-minibar.js"></script> |
| 29 | + <link rel="icon" href="/favicon.png"> |
| 30 | + <link rel="apple-touch-icon" href="/favicon.png"> |
| 31 | + </head> |
| 32 | + <body> |
| 33 | + <main><article> |
| 34 | + <header> |
| 35 | + <h1 class="title">Context-Free Languages</h1> |
| 36 | + <address class="author">Mort Yao</address> |
| 37 | + <!-- h3 class="date">2017-04-11</h3 --> |
| 38 | + </header> |
| 39 | + <div id="content"> |
| 40 | +<p><strong>Context-free grammar (CFG).</strong> A <em>context-free grammar</em> <span class="math inline">\(G\)</span> is a 4-tuple <span class="math inline">\((V, \Sigma, R, S)\)</span>, where</p> |
| 41 | +<ol type="1"> |
| 42 | +<li><span class="math inline">\(V\)</span> is a finite set called the <em>variables</em>,</li> |
| 43 | +<li><span class="math inline">\(\Sigma\)</span> is a finite set called the <em>terminals</em>, disjoint from the variables (<span class="math inline">\(\Sigma \cap V = \emptyset\)</span>),</li> |
| 44 | +<li><span class="math inline">\(R \subseteq V \times (V \cup \Sigma)^*\)</span> is a finite set of <em>substitution rules</em>,</li> |
| 45 | +<li><span class="math inline">\(S \in V\)</span> is the <em>start variable</em>.</li> |
| 46 | +</ol> |
| 47 | +<p>Given <span class="math inline">\(u, v, w \in (V \cup \Sigma)^*\)</span> and a substitution rule <span class="math inline">\(A \mapsto w\)</span> in <span class="math inline">\(R\)</span>, we say that <span class="math inline">\(uAv\)</span> <em>yields</em> <span class="math inline">\(uwv\)</span>, denoted as <span class="math inline">\(uAv \Rightarrow uwv\)</span>.</p> |
| 48 | +<p>Moreover, given <span class="math inline">\(w_0, w \in (V \cup \Sigma)^*\)</span>, if <span class="math inline">\(w_0 = w\)</span> or if there exists a sequence <span class="math inline">\(w_1, \dots, w_k\)</span> (where <span class="math inline">\(k \geq 0\)</span>, and <span class="math inline">\(\forall 1 \leq i \leq k : w_i \in (V \cup \Sigma)^*\)</span>) such that <span class="math inline">\(w_0 \Rightarrow w_1 \Rightarrow \dots \Rightarrow w_k \Rightarrow w\)</span>, we say that <span class="math inline">\(w_0\)</span> <em>derives</em> <span class="math inline">\(w\)</span>, denoted as <span class="math inline">\(w_0 \Rightarrow^* w\)</span>; the sequence <span class="math inline">\(w_0, w_1, \dots, w_k, w\)</span> is called a <em>derivation</em>. A derivation is a <em>leftmost derivation</em> if at every step the leftmost remaining variable is replaced. Each leftmost derivation of a string corresponds to the pre-order traversal of one of its <em>parse trees</em>.</p> |
| 49 | +<p><span class="math inline">\(L\)</span> is the <em>language of grammar</em> <span class="math inline">\(G\)</span>, denoted as <span class="math inline">\(\mathcal{L}(G) = L\)</span>, if and only if <span class="math inline">\(L = \{ w \in \Sigma^*\ |\ S \Rightarrow^* w \}\)</span>. We say that the grammar <span class="math inline">\(G\)</span> <em>generates</em> the language <span class="math inline">\(L\)</span>.</p> |
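|  | +<p><strong>Example.</strong> As an illustrative sketch (the grammar below is an assumption chosen for this example), let <span class="math inline">\(G = (\{S\}, \{a, b\}, R, S)\)</span> with <span class="math inline">\(R = \{ S \mapsto aSb,\ S \mapsto \varepsilon \}\)</span>. One derivation of <span class="math inline">\(aabb\)</span> is <span class="math display">\[S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aabb,\]</span> so <span class="math inline">\(S \Rightarrow^* aabb\)</span>. By induction on the number of applications of <span class="math inline">\(S \mapsto aSb\)</span>, one can check that <span class="math inline">\(\mathcal{L}(G) = \{ a^n b^n\ |\ n \geq 0 \}\)</span>.</p> |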
| 50 | +<p><strong>Context-free language (CFL).</strong> A language <span class="math inline">\(L\)</span> is called a <em>context-free language</em> if there exists a CFG <span class="math inline">\(G\)</span> that generates <span class="math inline">\(L\)</span>.</p> |
| 51 | +<p>A grammar <span class="math inline">\(G\)</span> is said to be <em>ambiguous</em> if there exists a string <span class="math inline">\(w \in \mathcal{L}(G)\)</span> with two or more distinct leftmost derivations (equivalently, two or more distinct parse trees).</p> |
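|  | +<p><strong>Example.</strong> As an illustrative sketch (the grammar is an assumption chosen for this example), take the variable <span class="math inline">\(E\)</span>, the terminals <span class="math inline">\(\{a, +\}\)</span>, and the rules <span class="math inline">\(E \mapsto E + E\)</span> and <span class="math inline">\(E \mapsto a\)</span>. The string <span class="math inline">\(a + a + a\)</span> has two distinct leftmost derivations: <span class="math display">\[\begin{aligned} E &\Rightarrow E + E \Rightarrow a + E \Rightarrow a + E + E \Rightarrow a + a + E \Rightarrow a + a + a \\ E &\Rightarrow E + E \Rightarrow E + E + E \Rightarrow a + E + E \Rightarrow a + a + E \Rightarrow a + a + a \end{aligned}\]</span> The first groups the string as <span class="math inline">\(a + (a + a)\)</span> and the second as <span class="math inline">\((a + a) + a\)</span>, so this grammar is ambiguous.</p> |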
| 52 | +<p><strong>Chomsky normal form (CNF).</strong> A CFG is in <em>Chomsky normal form</em> if every substitution rule has one of the forms <span class="math display">\[\begin{aligned} |
| 53 | +A &\mapsto BC \\ |
| 54 | +A &\mapsto a \\ |
| 55 | +S &\mapsto \varepsilon |
| 56 | +\end{aligned}\]</span> where <span class="math inline">\(A, B, C \in V\)</span>, <span class="math inline">\(B \neq S\)</span>, <span class="math inline">\(C \neq S\)</span>, and <span class="math inline">\(a \in \Sigma\)</span>.</p> |
| 57 | +<p><strong>Theorem 1.</strong> Every context-free language is generated by a CFG in Chomsky normal form. (Equivalently: any CFG can be converted into a CFG in Chomsky normal form that generates the same language.)</p> |
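|  | +<p><strong>Example.</strong> A sketch of such a conversion, assuming the illustrative grammar with rules <span class="math inline">\(S \mapsto aSb\)</span> and <span class="math inline">\(S \mapsto \varepsilon\)</span> (the variables <span class="math inline">\(S_0, A, B, T\)</span> are fresh names introduced here). Adding a new start variable <span class="math inline">\(S_0\)</span> with <span class="math inline">\(S_0 \mapsto S\)</span>, eliminating <span class="math inline">\(S \mapsto \varepsilon\)</span>, eliminating the unit rule <span class="math inline">\(S_0 \mapsto S\)</span>, and finally replacing terminals and splitting long right-hand sides gives <span class="math display">\[\begin{aligned} S_0 &\mapsto AT \mid AB \mid \varepsilon \\ S &\mapsto AT \mid AB \\ T &\mapsto SB \\ A &\mapsto a \\ B &\mapsto b \end{aligned}\]</span> Here <span class="math inline">\(S_0\)</span> plays the role of the start variable and never appears on a right-hand side. For instance, <span class="math inline">\(S_0 \Rightarrow AT \Rightarrow aT \Rightarrow aSB \Rightarrow aABB \Rightarrow aaBB \Rightarrow aabB \Rightarrow aabb\)</span>.</p> |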
| 58 | +<p><strong>Theorem 2. (Closure properties)</strong> The class of context-free languages is closed under regular operations (i.e., union, concatenation and Kleene star).</p> |
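|  | +<p><strong>Proof sketch.</strong> One standard construction, with notation introduced here for illustration: let <span class="math inline">\(G_1 = (V_1, \Sigma, R_1, S_1)\)</span> and <span class="math inline">\(G_2 = (V_2, \Sigma, R_2, S_2)\)</span> be CFGs with <span class="math inline">\(V_1 \cap V_2 = \emptyset\)</span>, and let <span class="math inline">\(S\)</span> be a fresh variable. Adding the following rules for the new start variable <span class="math inline">\(S\)</span> to <span class="math inline">\(R_1 \cup R_2\)</span> (or to <span class="math inline">\(R_1\)</span> alone for the star) yields grammars for the three operations: <span class="math display">\[\begin{aligned} \mathcal{L}(G_1) \cup \mathcal{L}(G_2) &: \quad S \mapsto S_1,\ S \mapsto S_2 \\ \mathcal{L}(G_1) \mathcal{L}(G_2) &: \quad S \mapsto S_1 S_2 \\ \mathcal{L}(G_1)^* &: \quad S \mapsto S_1 S,\ S \mapsto \varepsilon \end{aligned}\]</span></p> |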
| 59 | +<p><strong>Nondeterministic pushdown automaton (PDA).</strong> A <em>nondeterministic pushdown automaton</em> <span class="math inline">\(M\)</span> is a 6-tuple <span class="math inline">\((Q, \Sigma, \Gamma, \delta, q_0, F)\)</span>, where</p> |
| 60 | +<ol type="1"> |
| 61 | +<li><span class="math inline">\(Q\)</span> is a finite set called the <em>states</em>,</li> |
| 62 | +<li><span class="math inline">\(\Sigma\)</span> is a finite set called the <em>input alphabet</em>,</li> |
| 63 | +<li><span class="math inline">\(\Gamma\)</span> is a finite set called the <em>stack alphabet</em>,</li> |
| 64 | +<li><span class="math inline">\(\delta : Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon \to \mathcal{P}(Q \times \Gamma_\varepsilon)\)</span> is the <em>transition function</em>, (where <span class="math inline">\(\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}\)</span> and <span class="math inline">\(\Gamma_\varepsilon = \Gamma \cup \{\varepsilon\}\)</span>)</li> |
| 65 | +<li><span class="math inline">\(q_0 \in Q\)</span> is the <em>start state</em>,</li> |
| 66 | +<li><span class="math inline">\(F \subseteq Q\)</span> is the set of <em>accept states</em>.</li> |
| 67 | +</ol> |
| 68 | +<p>We say that the PDA <span class="math inline">\(M = (Q, \Sigma, \Gamma, \delta, q_0, F)\)</span> <em>accepts</em> a string <span class="math inline">\(w\)</span> if <span class="math inline">\(w\)</span> may be written as <span class="math inline">\(w = a_1 \cdots a_n\)</span> (where each <span class="math inline">\(a_i \in \Sigma_\varepsilon\)</span>), and there exists a sequence of states <span class="math inline">\(r_0, \dots, r_n\)</span> (where each <span class="math inline">\(r_i \in Q\)</span>) and a sequence of strings <span class="math inline">\(s_0, \dots, s_n\)</span> (where each <span class="math inline">\(s_i \in \Gamma^*\)</span>), such that</p> |
| 69 | +<ol type="1"> |
| 70 | +<li><span class="math inline">\(r_0 = q_0\)</span> and <span class="math inline">\(s_0 = \varepsilon\)</span>,</li> |
| 71 | +<li>For every <span class="math inline">\(0 \leq i < n\)</span>, <span class="math inline">\((r_{i+1}, \beta) \in \delta(r_i, a_{i+1}, \alpha)\)</span>, where <span class="math inline">\(s_i = \alpha t\)</span> and <span class="math inline">\(s_{i+1} = \beta t\)</span> for some <span class="math inline">\(\alpha, \beta \in \Gamma_\varepsilon\)</span> and <span class="math inline">\(t \in \Gamma^*\)</span>,</li> |
| 72 | +<li><span class="math inline">\(r_n \in F\)</span>.</li> |
| 73 | +</ol> |
| 74 | +<p>Otherwise, we say that the PDA <span class="math inline">\(M\)</span> <em>rejects</em> the string <span class="math inline">\(w\)</span>.</p> |
| 75 | +<p><span class="math inline">\(L\)</span> is the language of PDA <span class="math inline">\(M\)</span>, denoted as <span class="math inline">\(\mathcal{L}(M) = L\)</span>, if and only if <span class="math inline">\(L = \{w\ |\ w \text{ is a string accepted by } M\}\)</span>. We say that the PDA <span class="math inline">\(M\)</span> <em>recognizes</em> the language <span class="math inline">\(L\)</span>.</p> |
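|  | +<p><strong>Example.</strong> As an illustrative sketch (the automaton below, including the stack symbols <span class="math inline">\(X\)</span> and <span class="math inline">\(Z\)</span>, is an assumption chosen for this example), let <span class="math inline">\(M = (\{q_0, q_1, q_2, q_3\}, \{a, b\}, \{X, Z\}, \delta, q_0, \{q_0, q_3\})\)</span>, where the only nonempty values of <span class="math inline">\(\delta\)</span> are <span class="math display">\[\begin{aligned} \delta(q_0, \varepsilon, \varepsilon) &= \{(q_1, Z)\} \\ \delta(q_1, a, \varepsilon) &= \{(q_1, X)\} \\ \delta(q_1, b, X) &= \{(q_2, \varepsilon)\} \\ \delta(q_2, b, X) &= \{(q_2, \varepsilon)\} \\ \delta(q_2, \varepsilon, Z) &= \{(q_3, \varepsilon)\} \end{aligned}\]</span> To see that <span class="math inline">\(M\)</span> accepts <span class="math inline">\(aabb\)</span>, write <span class="math inline">\(aabb = \varepsilon\, a\, a\, b\, b\, \varepsilon\)</span> (so <span class="math inline">\(n = 6\)</span>) and take the state sequence <span class="math inline">\(q_0, q_1, q_1, q_1, q_2, q_2, q_3\)</span> with the stack contents <span class="math inline">\(\varepsilon,\ Z,\ XZ,\ XXZ,\ XZ,\ Z,\ \varepsilon\)</span> (top of the stack on the left). Each step matches one of the transitions above, and <span class="math inline">\(q_3 \in F\)</span>. The marker <span class="math inline">\(Z\)</span> lets <span class="math inline">\(M\)</span> detect an empty stack, and one can check that <span class="math inline">\(\mathcal{L}(M) = \{ a^n b^n\ |\ n \geq 0 \}\)</span>.</p> |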
| 76 | +<p><strong>Theorem 3.</strong> A language <span class="math inline">\(L\)</span> is context free if and only if there exists a PDA <span class="math inline">\(M\)</span> that recognizes <span class="math inline">\(L\)</span>.</p> |
| 77 | +<p><strong>Theorem 4.</strong> Every regular language is context free.</p> |
| 78 | +<p><strong>Theorem 5. (Pumping lemma)</strong> If <span class="math inline">\(L\)</span> is a context-free language, then there is a number <span class="math inline">\(p\)</span> (called the <em>pumping length</em>) such that if <span class="math inline">\(w \in L\)</span> and <span class="math inline">\(|w| \geq p\)</span>, then <span class="math inline">\(w\)</span> may be written as <span class="math inline">\(w = uvxyz\)</span> such that the following conditions hold:</p> |
| 79 | +<ol type="1"> |
| 80 | +<li>For every <span class="math inline">\(i \geq 0\)</span>, <span class="math inline">\(uv^ixy^iz \in L\)</span>,</li> |
| 81 | +<li><span class="math inline">\(|vy| > 0\)</span>,</li> |
| 82 | +<li><span class="math inline">\(|vxy| \leq p\)</span>.</li> |
| 83 | +</ol> |
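|  | +<p><strong>Example.</strong> A sketch of the usual way the lemma is applied, using the illustrative language <span class="math inline">\(L = \{ a^n b^n c^n\ |\ n \geq 0 \}\)</span> (chosen here as an assumption for the example). Suppose <span class="math inline">\(L\)</span> were context free with pumping length <span class="math inline">\(p\)</span>, and take <span class="math inline">\(w = a^p b^p c^p\)</span>. For any decomposition <span class="math inline">\(w = uvxyz\)</span> with <span class="math inline">\(|vxy| \leq p\)</span>, the substring <span class="math inline">\(vxy\)</span> cannot contain both an <span class="math inline">\(a\)</span> and a <span class="math inline">\(c\)</span>, because <span class="math inline">\(p\)</span> copies of <span class="math inline">\(b\)</span> lie between them. If <span class="math inline">\(vxy\)</span> contains no <span class="math inline">\(c\)</span>, then <span class="math inline">\(uv^2xy^2z\)</span> still has exactly <span class="math inline">\(p\)</span> copies of <span class="math inline">\(c\)</span>, while its <span class="math inline">\(a\)</span>'s and <span class="math inline">\(b\)</span>'s together number <span class="math inline">\(2p + |vy|\)</span>, which exceeds <span class="math inline">\(2p\)</span> by condition 2; the case with no <span class="math inline">\(a\)</span> is symmetric. Either way <span class="math inline">\(uv^2xy^2z \notin L\)</span>, contradicting condition 1, so <span class="math inline">\(L\)</span> is not context free.</p> |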
| 84 | + </div> |
| 85 | + <footer> |
| 86 | + <!-- TO BE MODIFIED BY NEED --> |
| 87 | + <a title="Keyboard shortcut: q" |
| 88 | + href=".."> |
| 89 | + <i class="fa fa-angle-double-left" aria-hidden="true"></i> |
| 90 | + <code>Parent</code> |
| 91 | + </a> | |
| 92 | + <a class="raw" accesskey="r" |
| 93 | + title="Keyboard shortcut: R" |
| 94 | + href="https://wiki.soimort.org/comp/language/context-free/src.md"> |
| 95 | + <i class="fa fa-code" aria-hidden="true"></i> |
| 96 | + <code>Raw</code> |
| 97 | + </a> | |
| 98 | + <a class="history" accesskey="h" |
| 99 | + title="Keyboard shortcut: H" |
| 100 | + href="https://github.com/soimort/wiki/commits/gh-pages/comp/language/context-free/src.md"> |
| 101 | + <i class="fa fa-history" aria-hidden="true"></i> |
| 102 | + <code>History</code> |
| 103 | + </a> | |
| 104 | + <a class="edit" accesskey="e" |
| 105 | + title="Keyboard shortcut: E" |
| 106 | + href="https://github.com/soimort/wiki/edit/gh-pages/comp/language/context-free/src.md"> |
| 107 | + <i class="fa fa-code-fork" aria-hidden="true"></i> |
| 108 | + <code>Edit</code> |
| 109 | + </a> | |
| 110 | + <a title="Keyboard shortcut: p" |
| 111 | + href="javascript:window.print();"> |
| 112 | + <i class="fa fa-print" aria-hidden="true"></i> |
| 113 | + <code>Print</code> |
| 114 | + </a> | |
| 115 | + <a title="Keyboard shortcut: ." |
| 116 | + href="https://wiki.soimort.org/comp/language/context-free"> |
| 117 | + <i class="fa fa-anchor" aria-hidden="true"></i> |
| 118 | + <code>Permalink</code> |
| 119 | + </a> | |
| 120 | + Last updated: <span id="update-time">2017-04-11</span> |
| 121 | + </footer> |
| 122 | + </article></main> |
| 123 | + </body> |
| 124 | +</html> |