Homework #5

1.  Construct a context-free grammar for strings over {a,b} with
exactly twice as many "a"s as "b"s.


2.  Consider the grammar G = ({S,A,a,b}, {a,b}, R, S)
       R = S-> AA,
	   A -> AA | AAA | a | bA | Ab

    a) Which strings of L(G) can be produced by derivations of four or fewer
       steps?

    b) Give at least four distinct derivations for the string babbab


3.  Minimize the DFA defined by the machine M.
    M = ({a,b,c,d,e,f,g,h}, {0,1}, delta, a, {d})
    delta(a,0) = b	delta(a,1) = g
    delta(b,0) = g	delta(b,1) = c
    delta(c,0) = d	delta(c,1) = b
    delta(d,0) = d	delta(d,1) = d
    delta(e,0) = d	delta(e,1) = f
    delta(f,0) = a	delta(f,1) = e
    delta(g,0) = f	delta(g,1) = a
    delta(h,0) = g	delta(h,1) = d


4.  Given the grammar G = ({A,B,a,b}, {a,b}, R, S)
    R = {S -> aB | bA,
	 A -> a
	 B -> aBB | bS | b}

    a) Show a leftmost derivation of the string aaabbabbba.

    b) Show a rightmost derivation of the string aaabbabbba.

    c) Show the parse tree corresponding to a derivation of aaabbabbba.

    d) Is this grammar unambiguous?  Why or why not?


5.  Lex and Yacc.

    For this problem you must write a short parser using Lex and Yacc.
For more information on how to use Lex and Yacc, look at the "lex" and "yacc"
home pages.

   The grammar you are to parse is defined as:

--------------------------------------------------------------------------------

   pascal_program -> PROGRAM ident ( program_heading ) ; block .

   program_heading -> ident | program_heading, ident

   block -> BEGIN END

--------------------------------------------------------------------------------

   You can use Lex to parse the characters comprising "program", "(", ")", ".",
"begin", "end", and the identifier which is defined by the DAFs below.


                            letter, digit
                                 +-+
                                 | |
                                 v |
           +-+   .    +-+ letter +-+    .   +===+
   ident:  | |------->| |------->| |------->|| ||
           +-+        +-+        +-+        +===+


           +-+ a-z,A-Z+===+
   letter: | |------->|| ||
           +-+        +===+


           +-+  0-9   +===+
   digit:  | |------->|| ||
           +-+        +===+



   Parts of the code are provided below.  The idea is to give you an opportunity
to use a real lexical analyzer (Lex) and a context-free grammar parser (Yacc).


----------------------- File "l" containing lex code ---------------------------

/* Definitions first */
/* Allow for white space of any kind (blanks, tabs, newlines) between tokens */

letter   [a-zA-Z]
delim    [ \n\t]
ws       {delim}+

%%

/* Now rules */

{ws}           {}
"program"      { fprintf(stderr,"Parsed PROGRAM token\n"); return(TPROGRAM); }

%%

/* Finally user routines */

pident(lexeme)
   char *lexeme;
{
   fprintf(stderr,"Parsed ident %s\n",lexeme);
}


--------------------- File "y" containing yacc code ----------------------------

%token TPROGRAM TLPAREN TRPAREN TSEMI TDOT TCOMMA TBEGIN TEND TIDENT

%%

block            : TBEGIN TEND

%%

#include "lex.yy.c"
#include 

main()
{
   return(yyparse());
}

yyerror(s)
   char *s;
{
   fprintf(stderr,"%s\n",s);
}

--------------------------------------------------------------------------------

To put these programs together, execute the following statements.

* Run the lex program on file "l" by typing "lex l".
  This creates the file lex.yy.c.

* Run the yacc program on file "y" by typing "yacc y".
  This creates the file y.tab.c.

* Compile the program by typing "cc y.tab.c -ly -ll".
  This creates the file a.out.

* Create a file called "in1" with the following input:
  program "Test" ("id1", "id2", "id3");
  begin
  end.

* Create a file called "in2" with the following input:
  program "simple" ("name"); begin end.

* Run the program on input "in1" and output to the file "out1" by typing
  "a.out & out1".

* Run the program on input "in2" and output to the file "out2".

* Turn in your lex file, your yacc file, and the two output files.
  The output for the data "in1" should look like:

Parsed PROGRAM token
Parsed ident "Test"
Parsed ( token
Parsed ident "id1"
Parsed , token
Parsed ident "id2"
Parsed , token
Parsed ident "id3"
Parsed ) token
Parsed ; token
Parsed BEGIN token
Parsed END token
Parsed . token
Done

   The output for the data "in2" should look like:

Parsed PROGRAM token
Parsed ident "simple"
Parsed ( token
Parsed ident "name"
Parsed ) token
Parsed ; token
Parsed BEGIN token
Parsed END token
Parsed . token
Done