Homework #7 1. Show that the language L = {a^i b^j c^k: i < j < k} is not a context-free language. Solution: If L were context free, then the pumping lemma should hold. Let z = a^n b^{n+1} c^{n+2}. Given this string and knowing that |z| >= n, we want to define z as uvwxy such that |vwx| <= n, |vx| >= 1. Because |vwx| <= n, there are five possible descriptions of uvwxy: 1. vwx is a^p for some p<=n, p>=1 2. vwx is a^p b^q for some p+q<=n, p+q>=1 3. vwx is b^p for some p<=n, p>=1 4. vwx is b^p c^q for some p+q<=n, p+q>=1 5. vwx is c^q for some i<=n, i>=1 Note that because |vwx| <= n, vwx cannot contain both "a"s and "c". For all of these cases, u v^i w x^i y, i>=0, should be in the language. In case 1, if i=2 we will be adding an a to the string, making the number of "a"s n+1 and thus the string is not in the language. The same argument holds for case 3 in which the number of "b"s will be equal to the number of "c"s. A similar argument holds in case 5. In case 5 if i=0 then the number of "c"s will be less than or equal to the number of "b"s. In case 2, when i=2 either the number of "a"s will be greater than the number of "b"s or the number of "b"s will be greater than the number of "c"s (depending on the distribution of v and x). In case 4, when i=0 either the number of "b"s will be less than or equal to number of "a"s or the number of "c"s will be less than or equal to the number of "b"s (depending on the distribution of v and x). 2. Show that the language L = {a^i: is is a prime} is not a context-free language. Solution: If L were context free, then the pumping lemma should hold. We pick a string z equal to 0^m, where m is prime and >= n + 2. 0^m = uvwxy = 0^|u| 0^|v| 0^|w| 0^|x| 0^|y|, |vwx| <= n, |vx| >= 1. If we pick i = |u| + |w| + |y|, then uv^iwx^iy = 0^|u| 0^{|v|(|u|+|w|+|y|)} 0^|w| 0^{|x|(|u|+|w|+|y|)} 0^|y| = 0^{(|v|+|x|+1)(|u|+|w|+|y|)}. Because |v| + |x| + 1 >= 2 and |u| + |y| >= 2 (this has to be true because |vwx| <= n), the string is not in L. Thus the language is not context free. 3. Is the language L = {w w^R w: w is in {a,b}*} a context-free language? Prove or disprove your answer. Solution: L is not a context-free language. Remember that CFL are closed under union. We know that because L' = a*b*a*b* is a regular language it is also a context-free language. However, L union L' is L'' = a^i b^j a^i b^{j/2}, which is not a context-free language. If L'' were context-free than the pumping lemma should hold. Let z = a^n b^n a^n b^{n/2}. Because z = uvwxy and |vwx| <= n, |vw| >= 1, vwx cannot span the first and second set of "a"s (the string of "b"s between the two substrings is of length n). Thus if vwx contains any "a"s then when i=0 the resulting string will not be in L''. If vwx does not contain any "a"s it must contain all "b"s. However, vwx cannot span the first and second set of "b"s (the string of "a"s between the two substrings if of length n). Thus if vwx contains "b"s from the first group, when i=0 the number of "b"s in the first group will be less than twice the number of "b"s in the second group. Similarly, if vwx contains "b"s from the second group, when i=0 the number of "b"s in the second group will be less than half the number of "b"s in the first group. Because the pumped strings are not in L'', L is not context free. 4. Show that CFLs are closed under the REVERSE operation, where REVERSE(L) = {x: x^R is in L}. Solution: Let G = (V, T, P, S) be a CFG in Chomsky Normal Form such that L = L(G). We construct a CFG G' such that L(G') = REVERSE(L(G)). Note that every rule in G is of the form A -> BC or A -> a for nonterminals B,C and terminal a. Let G' = (V, T, P', S) where P' is defined as follows: - Every rule in P of the form A -> a is also contained in P' - For every rule in P of the form A -> BC, the rule A -> CB is contained in P'. For example, if P contains the following rules S -> aSb | ab (a^i b^i: i >= 1), in Chomsky Normal Form these rules become S -> AC | AB C -> SB A -> a B -> b G' contains the rules S -> CA | BA C -> BS A -> a B -> b which generates strings {ba, bbaa, bbbaaa, etc., or b^i a^i: i >= 1). 5. How would you modify the CYK algorithm (or the dynamic programming algorithm presented in the book, choose either one) to produce an actual derivation of x in G? Solution: Using the CYK algorithm we can store along with each symbol in the box, the two boxes higher in the table that were used to produce this symbol. For example, if S were stored in location (i=1, j=6) because A is stored in location (i=1, j=2), B is stored in location (i=3, j=4), and S -> AB is in the set of rules, then we store the two contributing locations along with the rule and symbol S in location (i=1, j=6). To produce a derivation of x, use the CYK algorithm to decide if x is in G. If x is in G, use the stored information to generate the derivation in the following way. Starting with the last box, output the stored rule(s) for S. For each symbol on the right hand side of the rule, visit the box that corresponds to the symbol and expand the derived string according to the rule stored there. Repeat this operation until no nonterminals are left in the string. Consider the application of CYK to the string baaaa in G, where the rules are S -> AB | AC, A -> BC | a, B -> CB | b, C -> AA | b 1 2 3 4 5 +-------+-------+-------+-------+-------+ | B,C | A | A | A | A | 1 | B->b | A->a | A->a | A->a | A->a | | C->b | | | | | +-------+-------+-------+-------+-------+ | {} | C | C | C | 2 | | C->AA | C->AA | C->AA | | | (2,1) | (3,1) | (4,1) | | | (3,1) | (4,1) | (5,1) | +-------+-------+-------+-------+ | A | S | S | 3 | A->BC | S->AC | S->AC | | (1,1) | (2,1) | (3,1) | | (2,2) | (3,2) | (4,2) | +-------+-------+-------+ | C | {} | 4 | C->AA | | | (1,3) | | | (4,1) | | +-------+-------+ | S | 5 | S->AC | | (1,3) | | (4,2) | +-------+ S -> A(1,3) C(4,2) -> B(1,1) C(2,2) C(4,2) -> b C(2,2) C(4,2) -> b A(2,1) A(3,1) C(4,2) -> ba A(3,1) C(4,2) -> baa C(4,2) -> baa A(4,1) A(5,1) -> baaa A(5,1) -> baaaa The rules and box locations can be stored in the dynamic programmin algorithm described in the textbook to generate the derivation in a similar fashion. 6. Use the CYK algorithm for the grammar G below to determine whether a) aaaaa b) aabbab are in the language generated by grammar G. G = ({S, A, B, C, a, b}, {a, b}, R, S) R = {S -> AB | BC A -> BA | a B -> CC | b C -> AB | a} Solution: a) position 1 2 3 4 5 +---+---+---+---+---+ l 1 |A,C|A,C|A,C|A,C|A,C| e +---+---+---+---+---+ n 2 | B | B | B | B | g +---+---+---+---+ t 3 |S,A|S,A|S,A| h +---+---+---+ 4 |{} |{} | +---+---+ 5 |S,A| S is in the box (1,5), so aaaaa is in the language. +---+ b) position 1 2 3 4 5 6 +-----+---+---+---+---+---+ l 1 |A,C |A,C| B | B |A,C| B | e +-----+---+---+---+---+---+ n 2 | B |S,C|{} |S,A|S,C| g +-----+---+---+---+---+ t 3 | B | B | A |S,C| h +-----+---+---+---+ 4 |S,C | A |S,C| +-----+---+---+ 5 |A,B |A,B| +-----+---+ 6 |S,B,C| S is in the box (1,6), so aabbab is in the language. +-----+