Frequently Asked Questions

Do I need to support any HTTP versions other than 1.1?

No, you can assume that during marking all messages will be HTTP/1.1. However, a more robust proxy would send a 505 HTTP Version Not Supported response if it encounters a request with an unsupported version.

Do I need to support any methods other than GET, HEAD, POST, and CONNECT?

No, you can assume that during marking these are the only methods that will be tested. However, supporting other methods should be trivial, as would sending a 501 Not Implemented response, either of which would make the proxy more functional.
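
As a rough illustration of the more robust behaviour mentioned above, the sketch below checks a request line and builds a 505 or 501 response where appropriate. The function names and the plain-text error body are hypothetical choices, not something the specification prescribes.

```python
SUPPORTED_METHODS = {"GET", "HEAD", "POST", "CONNECT"}


def error_response(code, reason):
    """Build a minimal error response (the body format is an assumption)."""
    body = f"{code} {reason}\n".encode("ascii")
    head = (
        f"HTTP/1.1 {code} {reason}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def check_request_line(request_line):
    """Return None if the request line is acceptable, else error response bytes."""
    parts = request_line.split(" ")
    if len(parts) != 3:
        return error_response(400, "Bad Request")
    method, _target, version = parts
    if version != "HTTP/1.1":
        return error_response(505, "HTTP Version Not Supported")
    if method not in SUPPORTED_METHODS:
        return error_response(501, "Not Implemented")
    return None
```

A caller would send the returned bytes to the client and skip forwarding whenever the result is not None.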

How will headers be encoded?

You can assume all headers will use the ASCII character set.

Will all headers be line terminated with CRLF?

Yes, you can assume this to be the case during marking. However, a more robust message parser may recognise a single LF as a line terminator and ignore any preceding CR.
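
If you want the more tolerant parsing described above, one possible approach is to split on LF and drop a single preceding CR from each line. This is only a sketch; split_header_lines is a hypothetical helper name.

```python
def split_header_lines(header_section):
    """Split a header section into lines, accepting CRLF or a bare LF."""
    lines = []
    for raw in header_section.split(b"\n"):
        # Ignore one CR immediately before the LF, if present.
        if raw.endswith(b"\r"):
            raw = raw[:-1]
        lines.append(raw)
    # Discard the empty entries produced by the terminating blank line.
    while lines and lines[-1] == b"":
        lines.pop()
    return lines
```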

What should I do if there's no Host header field?

Nothing special, just forward the request on as usual.

How many header fields could there be? How big can a header field be? How big can a header section be?

HTTP does not place a predefined limit on the length of a header field or on the total size of the header section. It is up to implementations to define their own constraints. For our purposes, you can assume no single header (including the start line) will be greater than 1024 bytes, and the header section in total will be no greater than 8192 bytes (including the CRLF indicating the end of the header section). Within those constraints, there could be any number of header fields.
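
One way to apply these limits is to check them while accumulating received bytes. The sketch below is an assumption-laden example: the names are hypothetical, and it measures each line without its CRLF, which is the more lenient reading of the 1024-byte figure.

```python
MAX_LINE = 1024            # per-line limit, measured here without the CRLF
MAX_HEADER_SECTION = 8192  # includes the blank line that ends the section


class HeaderTooLarge(Exception):
    """Raised when the header section exceeds the assumed limits."""


def extract_header_section(buffer):
    """Return (header_section, rest), or None if more data is needed."""
    end = buffer.find(b"\r\n\r\n")
    if end == -1:
        if len(buffer) > MAX_HEADER_SECTION:
            raise HeaderTooLarge("no end of header section within the limit")
        return None
    section_len = end + 4
    if section_len > MAX_HEADER_SECTION:
        raise HeaderTooLarge("header section too large")
    for line in buffer[:end].split(b"\r\n"):
        if len(line) > MAX_LINE:
            raise HeaderTooLarge("header line too long")
    return buffer[:section_len], buffer[section_len:]
```

Anything returned as rest is the start of the message body (or of the next message) and should not be discarded.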

How big can a message be?

There is no fixed limit. You can assume that the proxy has sufficient main memory to buffer each message entirely, and that any valid Content-Length value can fit within a 4-byte signed integer. However, you should not assume that each message can fit within the proxy's send/receive buffers.
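
Because a single recv() call may return fewer bytes than requested, a body usually has to be read in a loop. The following is a minimal sketch of that idea; recv_exactly and its chunk_size parameter are hypothetical names.

```python
import socket


def recv_exactly(sock, length, chunk_size=4096):
    """Read exactly `length` bytes from `sock`, looping over partial reads."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = sock.recv(min(remaining, chunk_size))
        if not chunk:
            # An empty read means the peer closed the connection early.
            raise ConnectionError("peer closed before the full body arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

On the sending side, Python's socket.sendall() already loops until all the data has been handed to the operating system, whereas socket.send() may send only part of it.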

What content formats/encodings should the proxy be able to handle?

Your proxy does not need to process content or modify content encoding—just forward data as-is. So it should be able to handle any kind of data, including binary data. As such, data may include null bytes, so you should be cautious about using any C-style functions like strlen() that would misinterpret such characters as end-of-string markers.

Will responses always include a reason phrase?

No, not necessarily. Reason phrases are optional, so your proxy should be able to handle such a case.
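
For example, a status line parser could split on at most two spaces so that a missing reason phrase simply yields an empty string. This is a sketch only, with a hypothetical function name.

```python
def parse_status_line(line):
    """Split a status line into (version, status_code, reason)."""
    parts = line.split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"malformed status line: {line!r}")
    version = parts[0]
    status_code = int(parts[1])
    # The reason phrase is optional and may itself contain spaces.
    reason = parts[2] if len(parts) == 3 else ""
    return version, status_code, reason
```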

Do I need to support pipelining?

No. While pipelining was introduced in HTTP/1.1, many servers do not support it correctly, so most clients wouldn't typically use it. During marking, all requests over a single connection will be sequential (non-pipelined).

What should I do if I get a port/address already in use error?

This usually means another process is already bound to the port you're trying to use and will typically come about because either:

Another student is using that port. You should try a different port from the dynamic port range, 49152–65535.
You were previously using that port but the socket wasn't properly closed. Ideally your proxy would ensure this isn't the case, by never crashing and by closing the socket properly upon a SIGINT (keyboard interrupt). But otherwise, you could either choose a different port, wait a minute or two for the operating system to clean up after you, or use setsockopt() to set SO_REUSEADDR and allow reuse.
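
Setting SO_REUSEADDR might look like the sketch below. The make_listener name is hypothetical, and the default port of 0 (which asks the operating system to pick a free port) is used only to keep the example self-contained; your proxy would pass the port it was given.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket with SO_REUSEADDR enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() to have any effect on this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```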

Will my proxy be abruptly terminated during marking?

No.

If my proxy crashes during marking, will it be restarted?

Yes, but you are urged to test your proxy thoroughly and ensure it's robust enough to survive marking. Obviously if your proxy crashes, it will have failed some test and inherently lost marks.

How can I configure Firefox to use my proxy?

Go to Settings → Network Settings → Manual proxy configuration, enter localhost or 127.0.0.1 and your proxy’s port under HTTP Proxy. Once your proxy supports the CONNECT method, check the option to Also use this proxy for HTTPS.

Why does my browser keep using HTTPS even though I type in HTTP?

Some browsers enforce HTTPS via HSTS (HTTP Strict Transport Security), upgrade connections automatically, or default to HTTPS-first mode. Make sure you are explicitly typing http:// in the address bar, and if the https:// site has previously loaded, try clicking on the lock icon next to the URL in the address bar and selecting "Clear cookies and site data...".

Why is my browser making connections/requests that I don't recognise?

Browsers make background requests for various reasons, such as:

Preloading websites (speculative connections).
Fetching favicons, updates, or security lists.
Checking for captive portals when connecting to a new network.
Loading scripts and assets from previously visited pages.

Hence, it's a good idea to focus on simpler user agents early in your development process.

How is my browser loading pages without sending a request?

It could be using a cached response instead of making a network request. Try disabling caching (e.g., Ctrl/Cmd + Shift + R for a hard refresh in most browsers, or opening Developer Tools and selecting "Disable cache" on the "Network" tab).

Will any GET request include a body?

No, a GET request should not and will not have a body.

Will any HEAD request include a body?

No. The HEAD method is identical to GET except that the server must not send content in the response.

Will any POST request include a body?

Typically. The POST method is usually used this way.

Will any CONNECT request include a body?

No, a CONNECT request should not and will not have a body.

How many clients/connections should I be able to support?

By default, Firefox is configured to establish no more than 32 persistent proxy connections. If your proxy supports concurrency and you have imposed a fixed limit on the number of client connections, that limit should be at least 32.

Does the order of header lines matter?

In principle, yes, but you can assume that during marking this will not be the case.

Can a request have multiple header lines with the same field name?

In principle, yes, but you can assume that during marking this will not be the case.

Can a single header field value span multiple lines?

In principle, yes, but you can assume that during marking this will not be the case.

Can a header line have extraneous whitespace?

Yes, each header line consists of a case-insensitive field name followed by a colon (:), then either a field line value or a comma-separated list of field line values, where any value may have leading and/or trailing whitespace.
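
A header line parser along these lines could normalise the field name and trim the surrounding whitespace from the value. This is a sketch under the assumption that each line has already been decoded to a string; parse_header_line is a hypothetical name.

```python
def parse_header_line(line):
    """Return (lowercased field name, value without surrounding whitespace)."""
    name, sep, value = line.partition(":")
    # Reject lines with no colon, an empty name, or whitespace around the name.
    if not sep or not name or name != name.strip():
        raise ValueError(f"malformed header line: {line!r}")
    # Only spaces and horizontal tabs count as optional whitespace here.
    return name.lower(), value.strip(" \t")
```

Lowercasing the name is just one way to make later lookups case-insensitive; if you forward headers unchanged, keep the original line as well.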

Will my proxy be chained with other proxies?

Not intentionally, but any chaining would be transparent to your proxy, meaning your proxy does not need to alter its behaviour. The one exception to this is that it may need to update, rather than insert, the Via header.
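
The update-versus-insert distinction can be sketched as follows, assuming headers are held as a list of (name, value) pairs. The add_via name and the "1.1 name" entry format are illustrative assumptions; use whatever Via value the specification requires.

```python
def add_via(headers, pseudonym):
    """Append this proxy to an existing Via header, or insert a new one."""
    entry = f"1.1 {pseudonym}"
    via_indexes = [i for i, (name, _) in enumerate(headers) if name.lower() == "via"]
    updated = list(headers)
    if via_indexes:
        # Extend the last Via field so the proxy order is preserved.
        i = via_indexes[-1]
        name, value = updated[i]
        updated[i] = (name, f"{value}, {entry}")
    else:
        updated.append(("Via", entry))
    return updated
```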

Which interface should my proxy bind to?

Binding to 0.0.0.0 (all interfaces) allows connections from any device on the network, while binding to 127.0.0.1 restricts the proxy to local-only access. Choose based on your preference, but during marking binding only to 127.0.0.1 will be sufficient.

Do I need to check that the port number in a request target is valid, i.e. an integer from 0 to 65535?

No. While it would be a good idea, this won't be tested.
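
If you do want to validate the port, a check could be as small as the sketch below. The parse_port name and the default of 80 for an absent port are assumptions made for illustration.

```python
def parse_port(text, default=80):
    """Convert the port part of a request target to an int in 0..65535."""
    if text == "":
        return default
    # isdigit() alone would accept non-ASCII digits, so check isascii() too.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid port: {text!r}")
    port = int(text)
    if port > 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```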

HTTP uses "percent-encoding" for certain URL characters. Do I need to worry about this?

No, your proxy does not need to decode or normalise percent-encoded characters. You won't be penalised for any potential cache misses that may result, and the URL components can be forwarded in whatever form they were received.

Can I change the log format?

No, you should adhere to the specification.

What should I log if a client disconnects before making a request?

Nothing.

Do I need to handle incomplete requests/responses?

Your proxy should naturally handle most scenarios by ensuring:

It can handle unexpected connection closure, and
It times out idle connections.

Most other scenarios should naturally be handled by the client or the server. For example, the server may detect an incomplete request and send an error response, or the client may detect an incomplete response and retry the request.
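
Both points above can be handled around a single receive call, as in this sketch. The recv_or_timeout name and its return convention (None for a timeout, empty bytes for closure) are hypothetical.

```python
import socket


def recv_or_timeout(sock, timeout_seconds):
    """Return received bytes, b"" if the peer closed, or None on timeout."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(4096)
    except socket.timeout:
        # Nothing arrived in time: the caller can treat the connection as idle.
        return None
    except ConnectionError:
        # A reset is treated the same as an orderly close.
        return b""
```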

Can my proxy produce other output?

No, not during normal operation. But you might find it helpful to add an optional command-line argument (e.g. -d) that, when given, enables some debugging output.

Do I need to validate the command-line arguments?

While it's good programming practice to validate any user input, we won't be testing such error conditions during marking.

I haven't implemented the timeout or caching, should I still accept the command-line arguments?

Yes. It's important that we can execute your program as expected. If you haven't implemented these features, you can simply ignore the values that are passed in.

Can I write my proxy in a language other than C, Python or Java?

No, you must use one of these languages, and your proxy must compile and run within the CSE server environment.

Where can I get help on writing a Makefile?

There's a guide linked from the Sample Client-Server Programs and Networking Programming Resources.

What are some useful curl options?

See man 1 curl for full details, but here is a summary of perhaps the most relevant:

-d, --data <data>
        Sends the specified data in a POST request to the HTTP server, in the 
        same way that a browser does when a user has filled in an HTML form and 
        presses the submit button.

-I, --head
        Fetch the headers only! HTTP-servers feature the command HEAD which this 
        uses to get nothing but the header of a document.

-H, --header <header/@file>
        Extra header to include in information sent. When used within an HTTP 
        request, it is added to the regular request headers.

--http1.1
        Tells curl to use HTTP version 1.1 (usually the default).

-i, --include
        Include the HTTP response headers in the output.

--json <data>
        Sends the specified JSON data in a POST request to the HTTP server. 
        --json works as a shortcut for passing on these three options:

        --data [arg]
        --header "Content-Type: application/json"
        --header "Accept: application/json"

-m, --max-time <fractional seconds>
        Maximum time in seconds that you allow each transfer to take.

-o, --output <file>
        Write output to <file> instead of stdout. 

-Z, --parallel
        Makes curl perform its transfers in parallel as compared to the regular 
        serial manner.

-x, --proxy [protocol://]host[:port]
        Use the specified proxy.

--rate <max request rate>
        Specify the maximum transfer frequency you allow curl to use - in number 
        of transfer starts per time unit (sometimes called request rate).

-X, --request <method>
        Specifies a custom request method to use when communicating with the 
        HTTP server. The specified request method will be used instead of the 
        method otherwise used (which defaults to GET).

-v, --verbose
        Makes curl verbose during the operation. Useful for debugging and seeing 
        what's going on "under the hood". A line starting with '>' means "header 
        data" sent by curl, '<' means "header data" received by curl that is
        hidden in normal cases, and a line starting with '*' means additional 
        info provided by curl.