Homework #6


1.  Construct a pushdown automata for strings over {a,b} with
exactly twice as many "a"s as "b"s.

   Solution:

   This machine is similar to the one given in the textbook to accept strings
   over {a,b} with exactly as many "a"s as "b".  In this case, for each a we
   push an A or pop a B, depending on what is on the top of the stack.
   For each b we push BB or pop AA, depending on what is on the top of the
   stack.  Because we cannot pop AA in one move, we pop A and change state.
   In this new state, we an A is on the top of the stack we can pop the A
   without processing any input and move back to the old state.

   M = ({s, p, q, f}, {a, b}, {A,B,C}, delta, s, {f})

   delta(s, e, e) = (s, C)
   delta(a, b, C) = (a, BBC)	delta(a, b, B) = (a, BBB)
   delta(a, a, C) = (a, AC)	delta(a, a, B) = (a, e)
   delta(a, a, A) = (a, AA)	delta(a, b, A) = (b, e)
   delta(a, e, C) = (f, e)
   delta(b, b, C) = (b, BBC)	delta(b, b, B) = (b, BBB)
   delta(b, a, C) = (b, AC)	delta(b, a, B) = (b, e)
   delta(b, e, A) = (a, e)


2.  Construct a PDA for the language {a^i b^j:  i neq j}

    Solution:

    M = ({q0, q1, q2, q3}, {a,b}, {A,B}, delta, q0, {q2, q3})

    delta(q0, a, e) = (q0, A)	delta(q0, a, A) = (q0, AA)
    delta(q0, b, e) = (q1, B)	delta(q0, b, A) = (q1, e)
    delta(q1, b, A) = (q1, e)	delta(q1, b, e) = (q2, e)
    delta(q1, e, A) = (q3, e)	delta(q2, b, e) = (q2, e)


3.  Use the procedure outlined in class to generate a PDA from a CFG to
generate a PDA for the CFG G below.  Trace the operation of the PDA on the
input string (()()).

    G = ({S, (, )}, {(, )}, R, S)
    R = {S -> e,
	 S -> SS,
	 S -> (S)}

    Solution:

       M = ({p,q}, {(,)}, {S, (, )}, Delta, p, {q})
       Delta(p,e,e) = (q,S)
       Delta(q,e,S) = (q,e)
       Delta(q,e,S) = (q,SS)
       Delta(q,e,S) = (q,(S))
       Delta(q,(,() = (q,e)
       Delta(q,),)) = (q,e)

       (p, (()()), e) |- (q, (()()), S) |- (q, (()()), (S)) |-
	  (q, ()()), S)) |- (q, ()()), SS)) |- (q, ()()), (S)S)) |-
	  (q, )()), S)S)) |- (q, )()), )S)) |- (q, ()), S)) |- (q, ()), (S))) |-
	  (q, )), S))) |- (q, )), ))) |- (q, ), )) |- (q, e, e)



4.  Convert grammar G into an equivalent Chomsky Normal Form grammar.

    G = (V, Sigma, R, E)
    V =     {+, *, (, ), id, T, F, E}
    Sigma = {+, *, (, ), id}
    R =     {E -> E + T | T
	     T -> T * F | F
	     F -> ( E ) | id

    Solution:

       * There are no e-productions.

       * There are 2 unit productions:
	 E -> T
	 T -> F

	 After replacement grammar is

	 E -> E + T | T * F | ( E ) | id
	 T -> T * F | ( E ) | id
	 F -> ( E ) | id

      * There are no useless symbols.

      * Replace terminals on right sides of rules (with 2 or more symbols) with
	nonterminals

	After replacement grammar is

	E -> E X1 T | T X2 F | X3 E X4 | id
	T -> T X2 F | X3 E X4 | id
	F -> X3 E X4 | id
	X1 -> +
	X2 -> *
	X3 -> (
	X4 -> )

      * Make each rule have two nonterminals or one terminal on the right side.

	After replacement is

	E -> E D1
	D1 -> X1 T
	E -> T D2
	D2 -> X2 F
	E -> X3 D3
	D3 -> E X4
	E -> id
	T -> T D2
	T -> X3 D3
	T -> id
	F -> X3 D3
	F -> id
	X1 -> +
	X2 -> *
	X3 -> (
	X4 -> )


5.  Let G be a grammar in Chomsky normal form.  If w has length n and is an
element of L(G), what is the maximum possible depth of any derivation
tree for w?  What is the minimum possible depth of any derivation tree for w?
Provide precise bounds, and prove that your answers are the best possible.
Note that the depth of a tree is the maximum number of edges on any simple path
from the root to a leaf.

   Solution:

      The depth of the derivation tree depends on the number of variables in the
      grammar.  We show in class that if the number of variables is k, the
      maximum-length path for w (and thus the depth of the derivation tree)
      is k.  The proof of this bound is provided as part of the proof of the
      pumping lemma for context free languages.

      For the minimum depth, note that CNF rules are of the form
	 A -> BC
	   or
	 A -> a
      (B,C nonterminals, a terminal).

      Let n = |w|, w = a1 a2 ... an.  Each character a1 ... an will be derived
      by a separate rule of the form Xi -> ai.  However, no rule will contain
      more than two of the Xi nonterminals on the right hand side.  Thus the
      shortest possible derivation will only be able to contain one Xi on the
      right hand side of each rule used, except for the bottom rule in the tree
      which will contain two Xi nonterminals.  The form of this derivation will
      be

	 S      -> X1 D1
	 D1     -> X2 D2
	 D2     -> X3 D3
	 ...
	 D{n-1} -> X{n-1} Xn

      The shortest derivation will apply all of these n rules in term plus
      the rule Xn -> an (other terminals can be generated earlier in the tree).
      The total derivation length of this shortest derivation, and thus the
      minimum length of the derivation tree, is n + 1.


5.  From Homework #5:  Lex and Yacc.

File "l"

letter   [a-zA-Z]
delim    [ \n\t]
ws       {delim}+
digit    [0-9]

%%

{ws}           {}
"program"      { fprintf(stderr,"Parsed PROGRAM token\n"); return(TPROGRAM); }
"begin"        { fprintf(stderr,"Parsed BEGIN token\n"); return(TBEGIN); }
"end"          { fprintf(stderr,"Parsed END token\n"); return(TEND); }
"("            { fprintf(stderr,"Parsed ( token\n"); return(TLPAREN); }
")"            { fprintf(stderr,"Parsed ) token\n"); return(TRPAREN); }
";"            { fprintf(stderr,"Parsed ; token\n"); return(TSEMI); }
"."            { fprintf(stderr,"Parsed . token\n"); return(TDOT); }
","            { fprintf(stderr,"Parsed , token\n"); return(TCOMMA); }
\"{letter}({letter}|{digit})*\"          { pident(yytext); return(TIDENT); }

%%

pident(lexeme)
   char *lexeme;
{
   fprintf(stderr,"Parsed ident %s\n",lexeme);
}


File "y"

%token TPROGRAM TLPAREN TRPAREN TSEMI TDOT TCOMMA TBEGIN TEND TIDENT

%%

pascal_program   : TPROGRAM ident
                   TLPAREN program_heading TRPAREN TSEMI block TDOT
                      {fprintf(stderr,"Done\n");}

program_heading  : ident
                 | program_heading TCOMMA ident
                      {fprintf(stderr,"Parsed program heading\n");}

block            : TBEGIN TEND

ident            : TIDENT

%%

#include "lex.yy.c"
#include 

main()
{
   return(yyparse());
}

yyerror(s)
   char *s;
{
   fprintf(stderr,"%s\n",s);
}