CptS/EE 562 Literature Survey
Assignment
Spring, 2002
10% of final grade
(Updated February 15, 2002)
Bottom Line
Assigned: Thursday, February 7, 2002
Due: Thursday, March 7, 2002, in class (firm deadline). Hand in a hardcopy in
class, and also email the TA (David McKinnon, mckinnon@eecs.wsu.edu) a copy
(.pdf, .doc, or .dvi preferred). The reports will be put on the class web site
for others to read, so include your personal URL if you would like it linked
there. Be sure to include your paper title and personal URL in your email to
David so that he can easily copy and paste them into the web page.
Caveat: Start Early!
You are hereby warned…
Overview
In this assignment you will apply what you have learned in
class to analyze 4-5 research papers (“primary papers”) in a related area.
(Only 3 are acceptable when the topic is a key area for which only 3 papers
are available; in that case you must explain that there are only 3.) You will
also skim 3-5 papers (“secondary papers”) cited in these primary papers to get
a better feel for the topic. The basic idea is to analyze and summarize
the state of the art in this field.
In all places in your report, you are expected to use
technical terms carefully, as discussed in class and in Dependable
Computing and Fault Tolerance: Concepts and Terminology. You should also take
care to put your citations in a standard citation format, and include URLs
when they are available.
Organization of Your Report
Your report should be organized as follows:
- Abstract: a 150-200 word summary of the technology/issues/problems (e.g.,
Byzantine quorum systems, high-performance multicast, etc.) and of
the state of the art in that area.
The first words of your abstract should be: “This report summarizes
the state of the art in XXX as represented by papers [X,Y,Z]”,
where XXX is the technology/issue/problem and [X], [Y], and [Z] are
entries in your References section.
- Overview of Technical Issues: a half or full page explaining the technology
and the issues involved. A figure is appropriate but optional (probably half
of the students’ reports can use one, half not).
- Paper summaries: discuss the results, limitations, and strengths of each
paper. 1 to 3 pages per paper.
- Analysis: 1 to 3 pages comparing and contrasting the papers reviewed. You
will be expected to apply good judgement and your knowledge of fault
tolerance, not just rehash what the papers say. For example, give your
judgement about whether the papers are making good points or are arguing
about a “distinction without a difference”.
- Conclusions: 100-300 words summarizing the report.
- References: all papers mentioned, in standard citation form. I prefer the
format [JKZ93] for something with authors’ last names starting with ‘J’, ‘K’,
and ‘Z’ and year of publication 1993, not the [1] format (I have a horrible
memory so I need the help remembering which paper you are citing when reading
your reports). If there are more than 3 authors, do something like [JKZ+93].
The citation of course has the full author list, but the “label” for it
(e.g., “[JKZ+93]”) only has the first 3 authors’ initials, so the References
section looks aesthetically pleasing!
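The label rule above can be sketched in a few lines of Python. This is only an
illustration, not a required tool: the function name citation_label and the
author names in the usage lines are hypothetical, and two details the rule
does not spell out are assumptions here (papers with fewer than 3 authors use
all of their initials, and the year is always written as two digits, so 2002
becomes “02”).

```python
def citation_label(last_names, year):
    """Build a label such as [JKZ93] or [JKZ+93] from author last names.

    Assumptions (not stated in the assignment): fewer than 3 authors
    simply yields fewer initials, and the year is zero-padded to 2 digits.
    """
    # Take the first letter of up to the first three authors' last names.
    initials = "".join(name[0].upper() for name in last_names[:3])
    # A '+' marks that the paper has more than three authors.
    plus = "+" if len(last_names) > 3 else ""
    return f"[{initials}{plus}{year % 100:02d}]"


# Hypothetical author lists, for illustration only.
print(citation_label(["Jones", "Kim", "Zhou"], 1993))
print(citation_label(["Jones", "Kim", "Zhou", "Adams"], 1993))
```

The full citation entry still lists every author; only the label is
shortened.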
You can cite the textbook too if it provides useful background on a specific
issue, but include the section number to be helpful.
If you cite a web page as a source and no author is listed, use the name of
the organization (BBN Corp, UCSB for UC Santa Barbara, etc.), then the title
of the web page, then its URL. Assume the date of publication is 2002, unless
the page says it was last updated in 2001 or something like that.
A sample format will be emailed to the class in .doc format
at least 2 weeks before the due date.
Paper Sources
The sources below have two kinds of links: one for the
conference or workshop itself (which usually contains the program, including
the paper titles), and also the source for the conference papers (ACM or
IEEE). Some conferences don’t have online
sources (e.g., expensive publishers like Springer and Wiley), but you can
almost always find the paper online with Google or by
looking at the author’s web pages.
The WSU library has subscriptions for the online
publications for the IEEE and ACM. They
will not likely include most conferences and workshops in the last 6-12 months,
so you will have to search for the paper like a Wiley or Springer paper.
(In general, if you cannot find a paper in your library or
online, it is standard practice to email the author, who is obliged as a
matter of professional courtesy to send you an electronic or paper copy. Few
authors mind this, and zero of the grad student authors! If that fails, then
the WSU library’s inter-library loan system can probably get you a copy, but
I do not know how long it will take.)
We are focusing on papers published in the last 2-3 years,
but if there is a need to include in your paper set a paper from as early as
1995, that is OK. Below are sources for
the last 2-3 years only. Most
conferences and workshops are held every year at about the same time (within a
week), but some are held every 18 or 24 months.
Most conferences are IEEE, and their overall paper web page
is here. You can look at tables of contents, abstracts,
etc., but you need the IEEE library configuration/password to access the full
papers. Also, note that the conferences
often have associated one-day or half-day workshops or “Fast Abstracts” or
“Works in Progress” sessions that have shorter papers online that may be of
interest (but these do not count as full papers toward the number of papers
you must review).
Before I go into the list, note that a HUGE source for
finding related work is CiteSeer. You probably won’t need it for this
project, but it can tell you who has cited a given paper, which might give you
an idea of related work.
Top-Tier Conferences and Workshops Completely or Mostly on Fault Tolerance
The International Conference on Dependable Systems and
Networks (DSN), IEEE/IFIP (Note: FTCS and DCCA merged to become DSN a few years
back.)
- DSN-2002 (FYI, the program will probably not be out before the due date)
- DSN-2001 (conference site, paper site at IEEE) (Textbook co-author Paulo
was program co-chair.)
- DSN-2000 (conference site, paper site at IEEE)
- FTCS proceedings are linked here (caveat: I think “ftcs.org” used to be a
site for this, but I just tried it and it’s a porno site!!). I have a DVD
with the papers from all past FTCSs, so if you cannot find your paper then I
can get David to email you the .pdf from it. FTCS-29 (1999) papers are here.
Bill Sanders (who I teamed up with on AQuA and who visited here in 2000) was
program co-chair!
Some Middle-Tier or Other Conferences and Workshops Completely or Mostly on Fault Tolerance
Symposium on Reliable Distributed Systems (SRDS), IEEE
Workshop on Object-Oriented Real-Time Dependable Systems (WORDS), IEEE
Top-Tier Conferences and Workshops Often with Some Fault Tolerance
International Conference on Distributed Computing Systems (ICDCS), IEEE
International Conference on Distributed Systems Platforms (Middleware), IFIP/ACM
- Middleware 2001 (conference web site; papers are not online, but I have the
hardcopy if you cannot find them on the authors’ web pages. Hey, my advisor’s
(Schlichting) paper got the best paper award!)
- Middleware 2000 (conference web site; papers are not online, but I have the
hardcopy if you cannot find them on the authors’ web pages)
- Middleware 1998 (conference web site; papers are not online, but I think I
have the hardcopy if you cannot find them on the authors’ web pages)
Some Middle-Tier or Other Conferences and Workshops with Some Fault Tolerance
International Symposium on Distributed Objects & Applications
Journals
There are really no journals dedicated to fault tolerance,
other than the IEEE Transactions on Reliability, which has almost nothing to
do with fault-tolerant distributed systems.
Further, unlike many other fields, experimental “systems software”
researchers do not usually bother with journal publications, because of their
long lead times (1-3 years), so conferences are by far the preferred way to
have “impact” with your research. To say
that for experimental systems programmers journals are “almost a joke” is a bit
too strong, and would offend some from other fields (or those that run the
journals), so I won’t! They do have the
one strong virtue of no (hard) page limitations.
Still, some top-tier journals that occasionally have fault
tolerance papers in them are:
Some Example Paper Sets
Below are some paper sets that would be good choices. These are only the
primary papers that have to be read and summarized; the few that will also
have to be skimmed are not included here.
But you can come up with your own choice. Just find a topic from the last
year in the above workshops and conferences that interests you. Find one or
two papers on it, then choose 2-3 more papers from the papers they cite. Voila!
You have your own paper set, on a topic that interests you!
But here are some of the possibilities:
- Byzantine fault-tolerant multicast: Reiter’s Rampart, MIT “Practical”
Byzantine fault tolerance, SecureRing, plus an accepted DSN-02 paper,
“Quantifying the Cost of Providing Intrusion Tolerance in Group
Communication Systems”, which gives full citations of the previous 2. I have
this paper and can give it to the person who takes this set. Also
DSN-2001 Session 10A paper #3; SRDS-2001 Session 4 paper #3; DOA-2001 paper
“Transparent Dynamic Reconfiguration for CORBA” [Joey does, maybe room for
one more]
[Note: this topic has too many papers, so you could take a subset or two
people could do it.]
- Replication and (re)configuration: DSN-2001 Session 3B papers 2,3; SRDS-2001
Session 1 paper #3; WORDS-2002 paper “Asynchronous Leasing”; WORDS-2001
paper “Reconfiguration of Resources in Middleware”; ICDCS-2000 Session 8C
paper #2; DOA-2001 paper “Coordinating the Simultaneous Upgrade of Multiple
CORBA Application Objects”
[Note: this topic has too many papers, so you could take a subset or two
people could do it.]
- Replication and security: DSN-2001 Session 4B papers 1,3,4; find Mike
Reiter’s home page (oops, it’s here; he left ATT research for CMU. This has
lots of great links to research involving the intersection of fault tolerance
and security.)
- Wireless/mobile and fault tolerance: DSN-2001 Session 5A paper 1; SRDS-2001
Session 6 papers 1,6; WORDS-2002 paper “Scalable Group Membership Service for
Mobile Internet”
- Real-time fault tolerance: DSN-2001 Session 7A papers 2,4; get at least one
recent paper from Prof. Hermann Kopetz (Google will find him, or search in
recent DSNs) [Wes does]
- Group communication (non-Byzantine): DSN-2001 Session 9A paper 2; [Kevin]
- Byzantine diffusion: SRDS-2001 Session 3 paper 2 (trace its related work
for the rest of the paper set); [Ioanna does]
- Gossip-style multicast: start with the SpinGlass project at Cornell (Birman
et al.), search for “gossip” in the online DSN and SRDS proceedings, and look
at others if there are not enough papers. [Ty does]
- Byzantine-tolerant replication: SRDS-2001 Session 5 paper #2, DSN-2002
to-appear paper on ITDOS (Prof. Bakken has); follow some references (if any)
[Likely Sudipto]
- FT CORBAs: (Eternal, AQuA, DOORS, IRL – Prof. Bakken has); WORDS-2001 paper
“Using Semantic Knowledge of Distributed Objects to Increase Reliability
and Availability”; DOA-2001 paper “Lightweight Fault-Tolerance in CORBA”
- Inexact voting: (2 cited in book chap 7); ICDCS-2001 Session 4A paper #1;
[Andy does]
- Quorum systems: ICDCS-2001 Session 4A paper #3; ICDCS-2000 Session 8C paper
#1
- Optimistic fault tolerance: SRDS-2000 Session 2 paper #2; ICDCS-2001
Session 5A paper #1 (“Optimistic Active Replication”); follow a few of their
citations for a few more.
- You can choose your own topic. Remember: look at the papers on the
conference links to see what researchers have been grappling with in the last
few years! Come up with a set of somehow related papers, then go for it!