18-845 Individual Project (IP)
Assigned: Wed, Jan 16, 2019
Due: 11:59pm, Wed, Feb 13, 2019
Note: in the following, $coursedir refers to /afs/ece/class/ece845.
Intro
For this project, you will design and implement your own protocol for
serving dynamic Web content. The purpose is to give you some
practical context when we study the research issues.
Description
The project has three parts:
- Part I: Implement a baseline concurrent Web server in C.
- Part II: Design an efficient protocol for serving dynamic
content, and then implement an optimized version of the
baseline concurrent server that uses your protocol.
- Part III: Evaluate the performance of your baseline and optimized servers,
characterizing the performance improvement of your new server.
Part I: Baseline concurrent server
Here are the requirements for the baseline server:
- Implements HTTP/1.0 GET requests for static and dynamic content.
- Assumes one connection per request (no persistent connections).
- Uses the CGI protocol (as implemented by the Tiny server) to serve dynamic content.
- Serves HTML (.html), image (.gif and .jpg), and text (.txt) files.
- Accepts a single command-line argument: the port to listen on.
- Implements concurrency using either threads or I/O multiplexing.
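As a rough illustration of the request-handling requirements above, here is a minimal sketch in C of three small helpers a baseline server might use. The function names are hypothetical, the fallback to text/plain for unknown extensions is an assumption, and treating any URI containing "cgi-bin" as dynamic follows the convention used by the Tiny server rather than anything required by this handout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Map a filename to a Content-Type for the file types the baseline serves.
   Falling back to text/plain for anything else is an assumption. */
const char *content_type_for(const char *filename)
{
    assert(filename != NULL);
    const char *dot = strrchr(filename, '.');
    if (dot == NULL)               return "text/plain";
    if (strcmp(dot, ".html") == 0) return "text/html";
    if (strcmp(dot, ".gif") == 0)  return "image/gif";
    if (strcmp(dot, ".jpg") == 0)  return "image/jpeg";
    return "text/plain";           /* covers .txt and unknown extensions */
}

/* Split a request line such as "GET /index.html HTTP/1.0" into its three
   fields. Each output buffer must hold at least 256 bytes.
   Returns 0 on success, -1 if the line does not have three fields. */
int parse_request_line(const char *line, char *method, char *uri, char *version)
{
    assert(line != NULL);
    if (sscanf(line, "%255s %255s %255s", method, uri, version) != 3)
        return -1;
    return 0;
}

/* Classify a URI as dynamic (1) or static (0) by looking for "cgi-bin". */
int is_dynamic(const char *uri)
{
    assert(uri != NULL);
    return strstr(uri, "cgi-bin") != NULL;
}
```

A server loop would typically call parse_request_line on the first line read from the connection, reject anything whose method is not GET, and then branch on is_dynamic.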
Part II: Optimized concurrent server
The idea here is to improve the performance of your baseline server by
replacing the standard CGI protocol with a protocol of your own
design. This is entirely open-ended. Anything goes.
There are a number of existing standards for this kind of thing, such
as ISAPI (Microsoft), NSAPI (Netscape), and
fast-cgi. However, I would encourage you to forget about
these and start from first principles. Design something that is
simple and fast. A good design is likely to include some combination
of dynamic linking, pre-threading, and code caching.
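One way to picture the code-caching idea is a small table that maps a program name to an already-loaded handler function, so the load cost is paid once per program instead of once per request. The sketch below is only an illustration under stated assumptions: the handler signature, the table sizes, and every name in it are hypothetical. The loader is supplied by the caller to keep the example self-contained; in a real design it would likely wrap dlopen() and dlsym(). The table is not thread-safe, so a threaded server would need to protect it with a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HANDLERS 16
#define MAX_NAME 64

/* Hypothetical handler type: writes a response body into out and
   returns its length, or -1 if it does not fit. */
typedef int (*handler_fn)(const char *query, char *out, size_t cap);

/* Hypothetical loader type: resolves a program name to a handler.
   A real loader would likely use dlopen()/dlsym(). */
typedef handler_fn (*loader_fn)(const char *name);

struct cache_entry { char name[MAX_NAME]; handler_fn fn; };
static struct cache_entry cache[MAX_HANDLERS];
static int cache_len = 0;

/* Return the handler for name, loading and caching it on first use. */
handler_fn get_handler(const char *name, loader_fn load)
{
    assert(name != NULL && load != NULL);
    for (int i = 0; i < cache_len; i++)
        if (strcmp(cache[i].name, name) == 0)
            return cache[i].fn;              /* hit: no load cost */

    handler_fn fn = load(name);              /* miss: pay the load cost once */
    if (fn != NULL && cache_len < MAX_HANDLERS && strlen(name) < MAX_NAME) {
        strcpy(cache[cache_len].name, name);
        cache[cache_len].fn = fn;
        cache_len++;
    }
    return fn;
}

/* Demo handler and loader, for illustration only. */
static int demo_loads = 0;

static int demo_hello(const char *query, char *out, size_t cap)
{
    (void)query;
    const char *body = "hello";
    size_t n = strlen(body);
    if (n >= cap) return -1;
    memcpy(out, body, n + 1);
    return (int)n;
}

static handler_fn demo_loader(const char *name)
{
    demo_loads++;                            /* counts how often we "load" */
    return strcmp(name, "hello") == 0 ? demo_hello : NULL;
}
```

Calling get_handler("hello", demo_loader) twice invokes the loader only once; the second call is served from the table.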
Part III: Evaluation of baseline and optimized servers
In this part, you will evaluate how well your baseline and optimized
servers can serve dynamic content. Some options are to compare
request throughput as measured on the server and/or
latencies as measured from the client. The goal is to convince your TA
that your approach is significantly faster than CGI. Be prepared to defend
the metrics you use to evaluate performance.
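To make the two kinds of metric concrete, the following sketch computes a throughput figure and a latency percentile from raw measurements. The function names are hypothetical, and the nearest-rank percentile definition is one assumption among several reasonable choices; the benchmarking tools listed below may define their statistics differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Requests completed per second over a measurement window. */
double throughput_rps(long completed, double elapsed_sec)
{
    assert(elapsed_sec > 0.0);
    return (double)completed / elapsed_sec;
}

/* Three-way comparison of doubles for qsort. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile (pct in 1..100) of n latency samples.
   Note: sorts the samples array in place. */
double latency_percentile(double *samples, size_t n, int pct)
{
    assert(samples != NULL && n > 0 && pct >= 1 && pct <= 100);
    qsort(samples, n, sizeof samples[0], cmp_double);
    size_t rank = (n * (size_t)pct + 99) / 100;   /* ceil(n * pct / 100) */
    return samples[rank - 1];
}
```

Reporting a high percentile (for example the 95th) alongside the median is often more informative than a mean alone, since a few slow requests can dominate what clients experience.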
Some tools for benchmarking server performance, in rough order of
quality, based on reports from 18-845 students:
- Gatling and Siege. Both get
favorable comments from 18-845 students.
- ApacheBench (ab). Note: the ab
program is bundled with the Apache Web server. You can download this
from the Apache site. A standalone version and documentation are
also available.
- httperf. Note: This tool hasn't been updated for a while and tends
to fail on Ubuntu
and Debian systems. However, students report that it works reliably on
Red Hat systems, such as the Andrew Linux machines
(linux.andrew.cmu.edu and ghcX.ghc.andrew.cmu.edu
machines). Feel free to use these. Another solution is to apply
for a free Red Hat AWS micro instance and run httperf there.
- autobench. Note: this is a wrapper around httperf.
Handin instructions
Tar up the directory containing your solution in a file called
"ANDREWID.tar", where ANDREWID is your Andrew login name, and copy it
to $coursedir/ip/handin.
You have list and insert privileges only in this directory. If you
need to hand in twice, put a number after later handins, e.g.,
ANDREWID-1.tar, ANDREWID-2.tar, and so on.
IP evaluation
Evaluation of the IP will be done by a live demo with the TA.
Please arrange your demo time with the TA. The projects are
open-ended and so is the evaluation. Here are the rough guidelines:
- Baseline server (35%). The goal here is just to get it working
serving multiple clients concurrently using standard CGI (as
implemented by the Tiny web server).
- Optimized server (35%). The idea here is to come up with a design
that attacks the biggest overheads associated with running CGI
programs: fork and exec.
- Evaluation (30%). The idea here is to develop a testing
infrastructure and workloads that will allow you to compare the
performance of the baseline server against the optimized server. You
will be demonstrating this testing infrastructure to the TA.
It's your responsibility to come up with a convincing evaluation
methodology and testing infrastructure.
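For context on the fork and exec overheads named in the guidelines, here is a minimal sketch of the per-request work a CGI-style server performs. It is an illustration, not the Tiny server's code: the function name is hypothetical, and the child's output is captured through a pipe so the example stands alone, whereas a server would more likely redirect the child's stdout straight to the client socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the program at path as a CGI child and capture up to cap-1 bytes of
   its stdout into out. Returns the byte count, or -1 on pipe/fork error. */
ssize_t run_cgi(const char *path, const char *query, char *out, size_t cap)
{
    assert(path != NULL && query != NULL && out != NULL && cap > 0);
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();                      /* overhead 1: create a process */
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                          /* child */
        char *argv[] = { (char *)path, NULL };
        close(fds[0]);
        setenv("QUERY_STRING", query, 1);    /* CGI passes arguments via env */
        dup2(fds[1], STDOUT_FILENO);         /* stdout -> pipe */
        close(fds[1]);
        execve(path, argv, environ);         /* overhead 2: load a new program */
        _exit(127);                          /* reached only if execve fails */
    }

    close(fds[1]);                           /* parent keeps the read end */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);                   /* reap the child */
    return (ssize_t)total;
}
```

Every request pays for a new process and a fresh program load, which is the cost an optimized design tries to avoid or amortize.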
Sources of information
- Students in a graduate class are expected to debug their own
programs. Your instructors are delighted to discuss design issues with
you, but please don't ask them to debug your programs.
- The 15-213 textbook, known as the CS:APP book, contains all of the
programming information that you need to complete the project, covering dynamic
linking, process control, Unix signals, Unix I/O, network programming, CGI protocol,
and application-level concurrency and synchronization. The E&S library
in Wean Hall has multiple copies on reserve.
- Numerous code examples, including the csapp.c and
csapp.h files, and the Tiny Web server, are available from
the CS:APP Student Web Site.
- Refer to the HTTP/1.0
specification for questions about HTTP.
- Volume 1 of Stevens is also an excellent reference for
advanced topics in sockets programming.
- Consider using tools such as curl, wget, nc
(netcat), or telnet as the client to do basic debugging of
your server.