Do I need to support any HTTP versions other than 1.1?
No, you can assume that during marking all messages will be HTTP/1.1. However, a more robust proxy would send a 505 HTTP Version Not Supported response if it encounters a request with an unsupported version.
Do I need to support any methods other than GET, HEAD, POST, and CONNECT?
No, you can assume that during marking these are the only methods that will be tested. However, supporting other methods should be trivial, as would sending a 501 Not Implemented response, either of which would make the proxy more functional.
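The version and method checks described in the two answers above could be sketched as follows. This is only an illustration: the function name `check_request_line` is hypothetical, the request line is assumed to have already had its line terminator removed, and the 400 response for a malformed line is an added assumption rather than something required here.

```python
SUPPORTED_METHODS = {"GET", "HEAD", "POST", "CONNECT"}


def check_request_line(line: bytes):
    """Return None if the request line is acceptable, else (status, reason)."""
    parts = line.decode("ascii", errors="replace").split(" ")
    if len(parts) != 3:
        # Assumption: treat anything that is not "method target version" as malformed.
        return (400, "Bad Request")
    method, _target, version = parts
    if version != "HTTP/1.1":
        return (505, "HTTP Version Not Supported")
    if method not in SUPPORTED_METHODS:
        return (501, "Not Implemented")
    return None
```

A caller would forward the request when the result is None, and otherwise build an error response from the returned status and reason.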
How will headers be encoded?
You can assume all headers will use the ASCII character set.
Will all headers be line terminated with CRLF?
Yes, you can assume this to be the case during marking. However, a more robust message parser may recognise a single LF as a line terminator and ignore any preceding CR.
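As a rough sketch of the more tolerant parsing mentioned above, the following splits on LF and drops a single CR immediately before it. The function name `split_lines` is hypothetical, and the input is assumed to be a complete header section already held in memory.

```python
def split_lines(data: bytes) -> list:
    """Split on LF, ignoring one CR immediately preceding each LF."""
    lines = data.split(b"\n")
    # A trailing LF leaves an empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return [ln[:-1] if ln.endswith(b"\r") else ln for ln in lines]
```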
What should I do if there's no Host header field?
Nothing special, just forward the request on as usual.
How many header fields could there be? How big can a header field be? How big can a header section be?
HTTP does not place a predefined limit on the length of a header field or on the total size of the header section. It is up to implementations to define their own constraints. For our purposes, you can assume no single header line (including the start line) will be greater than 1024 bytes, and the header section in total will be no greater than 8192 bytes (including the CRLF indicating the end of the header section). Within those constraints, there could be any number of header fields.
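One way to apply these limits while accumulating received bytes is sketched below. The names `find_header_end`, `MAX_LINE` and `MAX_SECTION` are hypothetical, and the sketch assumes the 1024-byte limit is measured without the trailing CRLF, which the answer above does not specify.

```python
MAX_LINE = 1024      # assumed to exclude the line's CRLF
MAX_SECTION = 8192   # includes the blank line ending the header section


def find_header_end(buf: bytes):
    """Return the index just past the header section, or None if more data is needed.

    Raises ValueError if the limits above are exceeded.
    """
    end = buf.find(b"\r\n\r\n")
    if end == -1:
        if len(buf) >= MAX_SECTION:
            raise ValueError("header section too large")
        return None
    end += 4  # step past the terminating CRLF CRLF
    if end > MAX_SECTION:
        raise ValueError("header section too large")
    for line in buf[:end].split(b"\r\n"):
        if len(line) > MAX_LINE:
            raise ValueError("header line too long")
    return end
```

Any bytes in the buffer beyond the returned index belong to the message body.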
How big can a message be?
There is no fixed limit. You can assume that the proxy has sufficient main memory to buffer each message entirely, and that any valid Content-Length value can fit within a 4-byte signed integer. However, you should not assume that each message can fit within the proxy's send/receive buffers.
What content formats/encodings should the proxy be able to handle?
Your proxy does not need to process content or modify content encoding; it should just forward data as-is. So it should be able to handle any kind of data, including binary data. As such, data may include null bytes, so you should be cautious about using any C-style functions like strlen() that would misinterpret such characters as end-of-string markers.
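Because a body may not arrive in a single receive call and may contain null bytes, a proxy would typically loop until the number of bytes given by Content-Length has been read. The sketch below is a minimal illustration in Python, where `bytes` objects carry their own length; the function name `read_exact`, its `already` parameter (bytes received together with the header section) and the 4096-byte chunk size are all assumptions.

```python
import socket


def read_exact(sock: socket.socket, n: int, already: bytes = b"") -> bytes:
    """Read until n body bytes are available, starting from any bytes already received."""
    chunks = [already]
    remaining = n - len(already)
    while remaining > 0:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            # recv() returning b"" means the peer closed the connection.
            raise ConnectionError("peer closed before the full body arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```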
Will responses always include a reason phrase?
No, not necessarily. Reason phrases are optional, so your proxy should be able to handle such a case.
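A status-line parser that tolerates a missing reason phrase might look like the sketch below. The function name `parse_status_line` is hypothetical, and the line is assumed to have had its terminator removed.

```python
def parse_status_line(line: bytes):
    """Return (version, status_code, reason); reason is '' when absent."""
    parts = line.decode("ascii").split(" ", 2)
    if len(parts) < 2:
        raise ValueError("malformed status line")
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) == 3 else ""
    return version, int(code), reason
```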
Do I need to support pipelining?
No. While pipelining was introduced in HTTP/1.1, many servers do not support it correctly, so most clients wouldn't typically use it. During marking, all requests over a single connection will be sequential (non-pipelined).
What should I do if I get a port/address already in use error?
This usually means another process is already bound to the port you're trying to use and will typically come about because either:
SIGINT (keyboard interrupt). But otherwise, you could either choose a different port, wait a minute or two for the operating system to clean up after you, or use setsockopt() to set SO_REUSEADDR and allow reuse.
Will my proxy be abruptly terminated during marking?
No.
If my proxy crashes during marking, will it be restarted?
Yes, but you are urged to test your proxy thoroughly and ensure it's robust enough to survive marking. Obviously if your proxy crashes, it will have failed some test and inherently lost marks.
How can I configure Firefox to use my proxy?
Go to Settings → Network Settings → Manual proxy configuration, enter localhost or 127.0.0.1 and your proxy’s port under HTTP Proxy. Once your proxy supports the CONNECT method, check the option "Also use this proxy for HTTPS".
Why does my browser keep using HTTPS even though I type in HTTP?
Some browsers enforce HTTPS via HSTS (HTTP Strict Transport Security), upgrade connections automatically, or default to HTTPS-first mode. Make sure you are explicitly typing http:// in the address bar, and if the https:// site has previously loaded, try clicking on the lock icon next to the URL in the address bar and selecting "Clear cookies and site data...".
Why is my browser making connections/requests that I don't recognise?
Browsers make background requests for various reasons, such as:
Hence, it's a good idea to focus on simpler user agents early in your development process.
How is my browser loading pages without sending a request?
It could be using a cached response instead of making a network request. Try disabling caching (e.g., Ctrl/Cmd + Shift + R for a hard refresh in most browsers, or opening Developer Tools and selecting "Disable cache" on the "Network" tab).
Will any GET request include a body?
No, a GET request should not and will not have a body.
Will any HEAD request include a body?
No. The HEAD method is identical to GET except that the server must not send content in the response.
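Since a response to HEAD carries no content even when it includes a Content-Length header, a proxy needs to know the request method when deciding whether to read a response body. The sketch below follows the general HTTP/1.1 rules (no content for HEAD responses, 1xx, 204 and 304 responses, or successful CONNECT responses); the function name `response_has_body` is hypothetical.

```python
def response_has_body(request_method: str, status: int) -> bool:
    """Decide whether a response may be followed by content."""
    if request_method == "HEAD":
        return False
    if 100 <= status < 200 or status in (204, 304):
        return False
    if request_method == "CONNECT" and 200 <= status < 300:
        # A successful CONNECT switches the connection to tunnel mode.
        return False
    return True
```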
Will any POST request include a body?
Typically. The POST method is usually used this way.
Will any CONNECT request include a body?
No, a CONNECT request should not and will not have a body.
How many clients/connections should I be able to support?
By default, Firefox is configured to establish no more than 32 persistent proxy connections. If your proxy supports concurrency and you have imposed a fixed limit on the number of client connections, it is advisable that the limit be no less than 32.
Does the order of header lines matter?
In principle, yes, but you can assume that during marking this will not be the case.
Can a request have multiple header lines with the same field name?
In principle, yes, but you can assume that during marking this will not be the case.
Can a single header field value span multiple lines?
In principle, yes, but you can assume that during marking this will not be the case.
Can a header line have extraneous whitespace?
Yes, each header line consists of a case-insensitive field name followed by a colon (:), then either a field line value or a comma-separated list of field line values, where any value may have leading and/or trailing whitespace.
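A header-line parser that accounts for the case-insensitive name and optional whitespace could be sketched as below. The function name `parse_header_line` is hypothetical, and lower-casing the name is just one possible normalisation choice.

```python
def parse_header_line(line: bytes):
    """Return (lower-cased field name, value stripped of surrounding spaces/tabs)."""
    name, sep, value = line.partition(b":")
    if not sep or not name:
        raise ValueError("malformed header line")
    return name.decode("ascii").lower(), value.decode("ascii").strip(" \t")
```

Splitting at the first colon only means values that themselves contain colons, such as a host with a port, are kept intact.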
Will my proxy be chained with other proxies?
Not intentionally, but any chaining would be transparent to your proxy, meaning your proxy does not need to alter its behaviour. The one exception to this is that it may need to update, rather than insert, the Via header.
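The update-versus-insert distinction could be sketched as follows. This assumes headers are held in a dict keyed by field name and that the Via entry takes the general HTTP form "1.1 pseudonym"; the function name `add_via` and the pseudonym "myproxy" are hypothetical, and the exact value required should be taken from the specification.

```python
def add_via(headers: dict, pseudonym: str) -> None:
    """Append to an existing Via header, or insert one if none is present."""
    entry = f"1.1 {pseudonym}"
    for name in headers:
        if name.lower() == "via":
            # An upstream proxy already added Via: append our entry to its list.
            headers[name] = f"{headers[name]}, {entry}"
            return
    headers["Via"] = entry
```

For example, calling `add_via(headers, "myproxy")` on a message that already carries `Via: 1.1 upstream` would leave `Via: 1.1 upstream, 1.1 myproxy`.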
Which interface should my proxy bind to?
By default, binding to 0.0.0.0 (all interfaces) allows connections from any device on the network, while 127.0.0.1 restricts it to local-only access. Choose based on your preference, but during marking binding only to 127.0.0.1 will be sufficient.
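A minimal sketch of creating the listening socket is shown below. It binds to 127.0.0.1 and also sets SO_REUSEADDR, the option mentioned in the address-in-use answer earlier. The function name `make_listener` and the backlog value are assumptions.

```python
import socket


def make_listener(port: int, host: str = "127.0.0.1", backlog: int = 32) -> socket.socket:
    """Create a TCP listening socket bound to host:port with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() to avoid "address already in use" after a restart.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock
```

Passing "0.0.0.0" as the host would accept connections on all interfaces instead.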
Do I need to check that the port number in a request target is valid, i.e. an integer from 0 to 65535?
No. While it would be a good idea, this won't be tested.
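Although it won't be tested, a port check is short enough to sketch. The function name `parse_port` is hypothetical.

```python
def parse_port(text: str) -> int:
    """Return the port as an int, or raise ValueError if it is not in 0..65535."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"port is not a decimal integer: {text!r}")
    port = int(text)
    if port > 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```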
HTTP uses "percent-encoding" for certain URL characters. Do I need to worry about this?
No, your proxy does not need to decode or normalise percent-encoded characters. You won't be penalised for any potential cache misses that may result, and the URL components can be forwarded in whatever form they were received.
Can I change the log format?
No, you should adhere to the specification.
What should I log if a client disconnects before making a request?
Nothing.
Do I need to handle incomplete requests/responses?
Your proxy should naturally handle most scenarios by ensuring:
Most other scenarios should naturally be handled by the client or the server. For example, the server may detect an incomplete request and send an error response, or the client may detect an incomplete response and retry the request.
Can my proxy produce other output?
No, not during normal operation. But you might find it helpful to add an optional command-line argument (e.g. -d) that, when given, enables some debugging output.
Do I need to validate the command-line arguments?
While it's good programming practice to validate any user input, we won't be testing such error conditions during marking.
I haven't implemented the timeout or caching, should I still accept the command-line arguments?
Yes. It's important that we can execute your program as expected. If you haven't implemented these features, you can simply ignore the values that are passed in.
Can I write my proxy in a language other than C, Python or Java?
No, you must use one of these languages, and it must be able to compile and run within the CSE server environment.
Where can I get help on writing a Makefile?
There's a guide linked from the Sample Client-Server Programs and Networking Programming Resources.
What are some useful curl options?
See man 1 curl for full details, but here is a summary of perhaps the most relevant:
-d, --data <data>
    Sends the specified data in a POST request to the HTTP server, in the
    same way that a browser does when a user has filled in an HTML form and
    presses the submit button.
-I, --head
    Fetch the headers only! HTTP-servers feature the command HEAD which this
    uses to get nothing but the header of a document.
-H, --header <header/@file>
    Extra header to include in information sent. When used within an HTTP
    request, it is added to the regular request headers.
--http1.1
    Tells curl to use HTTP version 1.1 (usually the default).
-i, --include
    Include the HTTP response headers in the output.
--json <data>
    Sends the specified JSON data in a POST request to the HTTP server.
    --json works as a shortcut for passing on these three options:
        --data [arg]
        --header "Content-Type: application/json"
        --header "Accept: application/json"
-m, --max-time <fractional seconds>
    Maximum time in seconds that you allow each transfer to take.
-o, --output <file>
    Write output to <file> instead of stdout.
-Z, --parallel
    Makes curl perform its transfers in parallel as compared to the regular
    serial manner.
-x, --proxy [protocol://]host[:port]
    Use the specified proxy.
--rate <max request rate>
    Specify the maximum transfer frequency you allow curl to use - in number
    of transfer starts per time unit (sometimes called request rate).
-X, --request <method>
    Specifies a custom request method to use when communicating with the
    HTTP server. The specified request method will be used instead of the
    method otherwise used (which defaults to GET).
-v, --verbose
    Makes curl verbose during the operation. Useful for debugging and seeing
    what's going on "under the hood". A line starting with '>' means "header
    data" sent by curl, '<' means "header data" received by curl that is
    hidden in normal cases, and a line starting with '*' means additional
    info provided by curl.