PROGRAM #5
Fall 2010
Due: December 6, 2010, 5pm on Angel
General Guidelines:
There are two problems in this assignment. It is expected that this assignment will use a combination of hash tables and union find data structures. You are also free to use other data structures covered so far in this class. Obviously, the idea is to come up with the implementation that can guarantee the best performance for large input sizes, and the grading will factor that in.
Group Cities Connected by Air Routes:
You are given a database of flight connections, where an entry <A,B> means there are direct flights from both A to B and B to A.. Two cities are said to be “connected” if there is either a direct connection or a series of connections between them. A typical passenger is interested in answering the following two questions:
Q1) List out all city groups, such that each
group contains a unique set of cities that are all connected (by above
definition) by flight from one to another, and to no city outside the group.
Q2) Given any two arbitrary cities,
X and Y, is there a flight connection between them?
Since the above question is of the form of computing equivalence classes, the best way to implement the above functionalities is to use the union-find data structure. However, the union-find data structure has many kinds (as discussed in the lecture) and your objective for this assignment is to implement FOUR separate programs using the following four different kinds:
i) simple union + simple find
ii) union-by-rank + simple find
iii) simple union + find using path compression
iv) union-by-rank + find using path compression
Note: the code for all these four kinds of union-find data structure is in the book (also discussed in class) and you can directly use that source code. Your job is to write the application code that lets any user answer the above two questions efficiently, and as per the specifications below. You may have to bring in other data structures (in addition to union-find). Also use class inheritance or template functionalities effectively to minimize any code duplication.
Specifications:
You can assume that the input flight database is of the form of a comma-delimited text file, in which every line lists two cities that have a direct flight between them.
Here is an example input file.
City group #1 List
all cities connected in this class. City group #2 List
all cities connected in this class. …
The output for Q1 should
look something like:
The output to Q2 is a simple YES or NO.
Class CityConnector { public: CityConnector(string InputFileName); void ReportConnectedCities(); boolean
CheckConnection(string city_1, string city_2); // any other member functions and private data
stores that you may need …
The program’s API should look like the following:
// This function should allow the user to load an input text file
// This function should compute the eq. classes and output the groups
// In addition, this function should output the sum total of the number of
// steps all find()
operations take to climb to find its current root.
// As an example, if at any
given point if your union-find forest looks like
// what is shown in slide #27
of the lecture notes on union-find, then find(7)
// will take 2 steps away
from the root, and so it will contribute 2 to the
// final count.
// Also output the number of
find operations (i.e., # calls to find() from
union)
// You will need this number
to compare the four implementations.
// this function checks whether two cities are connected
}
Testing: Test your code with the input file provided above. For this you will have to write a main() function. In this function, first call ReportConnectedCities() once and print its output. After that, make a few calls to CheckConnection() doing arbitrary checks to test out if your code is working. You have to do this, once for each of your four implementations.
Store the output of testing into four corresponding text files (or as one file) for submission.
Written Report:
In a separate 1-page document (Word or PDF), compile the following two sections:
· Briefly explain your design. A design flowchart will also be sufficient.
· Populate the following table:
Implementation |
Total number of find() operation calls made from the union() function |
Total number of steps climbed over all find() operations |
simple union + simple find |
|
|
union-by-rank + simple find |
|
|
simple union + find using path compression |
|
|
union-by-rank + find using path compression |
|
|
Provide a very brief justification for these numbers – i.e., do these meet your expectations and why?
FINAL CHECKLIST FOR SUBMISSION:
___ Cover sheet
___ Source code
___ Output files (4 files, 1 for each implementation)
___ Report.doc or Report.pdf
___ All the above zipped into another folder archive called Program5<YourLastName>.zip for submission on Angel.