```Homework #7

1.  Show that the language L = {a^i b^j c^k: i < j < k} is not a context-free
language.

Solution:  If L were context free, then the pumping lemma should hold.
Let z = a^n b^{n+1} c^{n+2}.  Given this string and knowing that |z| >= n,
we want to define z as uvwxy such that |vwx| <= n, |vx| >= 1.
Because |vwx| <= n, there are five possible descriptions of uvwxy:

1.  vwx is a^p for some p<=n, p>=1
2.  vwx is a^p b^q for some p+q<=n, p+q>=1
3.  vwx is b^p for some p<=n, p>=1
4.  vwx is b^p c^q for some p+q<=n, p+q>=1
5.  vwx is c^q for some i<=n, i>=1

Note that because |vwx| <= n, vwx cannot contain both "a"s and "c".
For all of these cases, u v^i w x^i y, i>=0, should be in the language.

In case 1, if i=2 we will be adding an a to the string, making the
number of "a"s n+1 and thus the string is not in the language.  The
same argument holds for case 3 in which the number of "b"s will be
equal to the number of "c"s.  A similar argument holds in case 5.
In case 5 if i=0 then the number of "c"s will be less than or equal to
the number of "b"s.

In case 2, when i=2 either the number of "a"s will be greater than the
number of "b"s or the number of "b"s will be greater than the number of
"c"s (depending on the distribution of v and x).

In case 4, when i=0 either the number of "b"s will be less than or equal
to number of "a"s or the number of "c"s will be less than or equal to
the number of "b"s (depending on the distribution of v and x).

2.  Show that the language L = {a^i: is is a prime} is not a context-free
language.

Solution:  If L were context free, then the pumping lemma should hold.
We pick a string z equal to 0^m, where m is prime and >= n + 2.
0^m = uvwxy = 0^|u| 0^|v| 0^|w| 0^|x| 0^|y|, |vwx| <= n, |vx| >= 1.
If we pick i = |u| + |w| + |y|, then uv^iwx^iy =
0^|u| 0^{|v|(|u|+|w|+|y|)} 0^|w| 0^{|x|(|u|+|w|+|y|)} 0^|y|
= 0^{(|v|+|x|+1)(|u|+|w|+|y|)}.  Because |v| + |x| + 1 >= 2 and
|u| + |y| >= 2 (this has to be true because |vwx| <= n), the string
is not in L.  Thus the language is not context free.

3.  Is the language L = {w w^R w: w is in {a,b}*} a context-free language?

Solution:  L is not a context-free language.  Remember that CFL are closed
under union.  We know that because L' = a*b*a*b* is a regular language it is
also a context-free language.  However, L union L' is
L'' = a^i b^j a^i b^{j/2}, which is not a context-free language.

If L'' were context-free than the pumping lemma should hold.  Let
z = a^n b^n a^n b^{n/2}.  Because z = uvwxy and |vwx| <= n, |vw| >= 1,
vwx cannot span the first and second set of "a"s (the string of "b"s
between the two substrings is of length n).  Thus if vwx contains any "a"s
then when i=0 the resulting string will not be in L''.

If vwx does not contain any "a"s it must contain all "b"s.  However, vwx
cannot span the first and second set of "b"s (the string of "a"s between
the two substrings if of length n).  Thus if vwx contains "b"s from the
first group, when i=0 the number of "b"s in the first group will be less
than twice the number of "b"s in the second group.  Similarly, if vwx
contains "b"s from the second group, when i=0 the number of "b"s in the
second group will be less than half the number of "b"s in the first group.
Because the pumped strings are not in L'', L is not context free.

4.  Show that CFLs are closed under the REVERSE operation, where
REVERSE(L) = {x: x^R is in L}.

Solution:  Let G = (V, T, P, S) be a CFG in Chomsky Normal Form
such that L = L(G).  We construct a CFG G' such that L(G') = REVERSE(L(G)).
Note that every rule in G is of the form
A -> BC or
A -> a     for nonterminals B,C and terminal a.

Let G' = (V, T, P', S) where P' is defined as follows:

- Every rule in P of the form A -> a is also contained in P'
- For every rule in P of the form A -> BC, the rule
A -> CB is contained in P'.

For example, if P contains the following rules

S -> aSb | ab    (a^i b^i:  i >= 1),

in Chomsky Normal Form these rules become

S -> AC | AB
C -> SB
A -> a
B -> b

G' contains the rules

S -> CA | BA
C -> BS
A -> a
B -> b

which generates strings {ba, bbaa, bbbaaa, etc., or b^i a^i: i >= 1).

5.  How would you modify the CYK algorithm (or the dynamic programming
algorithm presented in the book, choose either one) to produce an actual
derivation of x in G?

Solution:  Using the CYK algorithm we can store along with each symbol in
the box, the two boxes higher in the table that were used to produce this
symbol.  For example, if S were stored in location (i=1, j=6) because A is
stored in location (i=1, j=2), B is stored in location (i=3, j=4), and
S -> AB is in the set of rules, then we store the two contributing locations
along with the rule and symbol S in location (i=1, j=6).  To produce a
derivation of x, use the CYK algorithm to decide if x is in G.  If x is in
G, use the stored information to generate the derivation in the following
way.  Starting with the last box, output the stored rule(s) for S.  For each
symbol on the right hand side of the rule, visit the box that corresponds to
the symbol and expand the derived string according to the rule stored there.
Repeat this operation until no nonterminals are left in the string.

Consider the application of CYK to the string baaaa in G, where the
rules are   S -> AB | AC, A -> BC | a, B -> CB | b, C -> AA | b

1       2       3       4       5
+-------+-------+-------+-------+-------+
|  B,C  |   A   |   A   |   A   |   A   |
1 | B->b  | A->a  | A->a  | A->a  | A->a  |
| C->b  |       |       |       |       |
+-------+-------+-------+-------+-------+
|  {}   |   C   |   C   |   C   |
2 |       | C->AA | C->AA | C->AA |
|       | (2,1) | (3,1) | (4,1) |
|       | (3,1) | (4,1) | (5,1) |
+-------+-------+-------+-------+
|   A   |   S   |   S   |
3 | A->BC | S->AC | S->AC |
| (1,1) | (2,1) | (3,1) |
| (2,2) | (3,2) | (4,2) |
+-------+-------+-------+
|   C   |  {}   |
4 | C->AA |       |
| (1,3) |       |
| (4,1) |       |
+-------+-------+
|   S   |
5 | S->AC |
| (1,3) |
| (4,2) |
+-------+

S -> A(1,3) C(4,2) -> B(1,1) C(2,2) C(4,2) -> b C(2,2) C(4,2) ->
b A(2,1) A(3,1) C(4,2) -> ba A(3,1) C(4,2) -> baa C(4,2) ->
baa A(4,1) A(5,1) -> baaa A(5,1) -> baaaa

The rules and box locations can be stored in the dynamic programmin
algorithm described in the textbook to generate the derivation in a
similar fashion.

6.  Use the CYK algorithm for the grammar G below to determine whether
a) aaaaa
b) aabbab

are in the language generated by grammar G.

G = ({S, A, B, C, a, b}, {a, b}, R, S)
R = {S -> AB | BC
A -> BA | a
B -> CC | b
C -> AB | a}

Solution:

a)
position
1   2   3   4   5
+---+---+---+---+---+
l   1 |A,C|A,C|A,C|A,C|A,C|
e     +---+---+---+---+---+
n   2 | B | B | B | B |
g     +---+---+---+---+
t   3 |S,A|S,A|S,A|
h     +---+---+---+
4 |{} |{} |
+---+---+
5 |S,A|       S is in the box (1,5), so aaaaa is in the language.
+---+

b)
position
1     2   3   4   5   6
+-----+---+---+---+---+---+
l   1 |A,C  |A,C| B | B |A,C| B |
e     +-----+---+---+---+---+---+
n   2 | B   |S,C|{} |S,A|S,C|
g     +-----+---+---+---+---+
t   3 | B   | B | A |S,C|
h     +-----+---+---+---+
4 |S,C  | A |S,C|
+-----+---+---+
5 |A,B  |A,B|
+-----+---+
6 |S,B,C|       S is in the box (1,6), so aabbab is in the language.
+-----+
```