Relay HTTP

1 Abstract

This document describes a protocol for tunnelling HTTP traffic over HTTP, with the goal of providing portable, general, securable access to the World Wide Web for programs running in restricted environments, including Javascript programs running in browsers.

The defined protocol is similar to the widely used HTTP proxying protocol, but differs in that the proxied traffic is carried over an ordinary HTTP connection; the special syntax used by an HTTP proxy is avoided here.


2 Introduction

Javascript programs running in a browser have severely restricted access to the network, to avoid cross-site scripting attacks. This makes developing browser-based applications that interact with multiple services on the web difficult, leading to problematic workarounds such as dynamic script tags for loading JSON data from third-party services. Many applications have therefore introduced methods of relaying HTTP requests to other HTTP servers on an ad-hoc, case-by-case basis.

This specification defines instead a general mechanism for tunnelling outbound HTTP requests over an HTTP transport, which, if implemented by the site hosting a browser-based application, gives the Javascript component of the application full (but controllable) access to the entirety of the web in a standard way.

2.1 Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

3 Definitions

3.1 Relay Endpoints

The HTTP server that hosts a web application should expose one or more Relay Endpoints. These are URLs, under the control of and served by the HTTP server, that implement the protocol described below for relaying HTTP requests to other HTTP servers.

The Javascript part of the hosted web application should be configured with the URLs of one or more Relay Endpoints that it can use.

3.2 Pipelines, Envelopes, and Payloads

RFC 2616 defines the application/http MIME type for representing pipelines of HTTP requests or responses, and the message/http MIME type for single HTTP requests or responses. In order to carry HTTP over HTTP, we define a simple enveloping protocol for these pipelines and messages.

In cases where there is no resulting ambiguity, the term Pipeline is used to mean either a Pipeline or a Message in the text below.

4 Running Example

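In the following sections, we will be tracking the progress of a web application that is served from http://10.11.12.13/, has been configured with the Relay Endpoint URL http://10.11.12.13/myrelay/, and wishes to retrieve the resource /service from www.example.com.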

5 Making a request

Instead of the application making a direct HTTP request to the service desired, it constructs an HTTP request Pipeline, embeds it in an Envelope Request, and sends the Envelope Request to the Relay Endpoint.

The application MUST use the HTTP POST method when sending the Envelope Request to the Relay Endpoint, even when the method(s) in the Embedded Pipeline are not POST.

The application MUST append the host and port number that the Relay Endpoint should contact to the Relay Endpoint URL it has been given. The string appended to the Relay Endpoint URL MUST be of the form "hostname:portnumber". The ":portnumber" part MAY be omitted, in which case the port number used by the Relay Endpoint when making the outbound HTTP connection SHOULD be the standard IANA-registered http port number, 80.

For our running example, this makes the URL to which the Envelope Request is POSTed http://10.11.12.13/myrelay/www.example.com. If the port had been number 8080, then the URL would have been http://10.11.12.13/myrelay/www.example.com:8080.
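
The URL construction above can be sketched as follows. This is a minimal illustration rather than a normative algorithm: the function name envelope_url is hypothetical, the Relay Endpoint URL is assumed to act as a path prefix, and the hostname check is deliberately simple (it rejects IPv6 literals, which this document does not discuss).

```python
from typing import Optional


def envelope_url(relay_endpoint: str, hostname: str, port: Optional[int] = None) -> str:
    """Append "hostname" or "hostname:portnumber" to a Relay Endpoint URL."""
    # Simplified validation (an assumption of this sketch, not a rule of the spec).
    if not hostname or any(c in hostname for c in "/:?# "):
        raise ValueError("invalid hostname: %r" % hostname)
    if port is not None and not (0 < port < 65536):
        raise ValueError("invalid port: %r" % port)
    # Assumption: exactly one "/" separates the endpoint from the appended target.
    base = relay_endpoint.rstrip("/") + "/"
    target = hostname if port is None else "%s:%d" % (hostname, port)
    return base + target


print(envelope_url("http://10.11.12.13/myrelay/", "www.example.com"))
print(envelope_url("http://10.11.12.13/myrelay/", "www.example.com", 8080))
```

The two calls print the two URLs given for the running example.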

The body of the POSTed Envelope Request sent to the Relay Endpoint MUST be a properly-formatted HTTP request Pipeline or Message, as defined by RFC 2616 or RFC 1945.

If the Embedded Pipeline only contains a single message (making it, in fact, an Embedded Message), then the Content-Type header of the Envelope Request SHOULD be message/http. If it contains more than one message (making it a true Embedded Pipeline), then the Content-Type header of the Envelope Request MUST be application/http.
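
As an illustration of the Content-Type rule, the sketch below chooses the header value from the number of embedded requests. The helper name build_envelope_body is hypothetical, and each item passed to it is assumed to be one complete, already serialized HTTP request.

```python
from typing import List, Tuple


def build_envelope_body(messages: List[bytes]) -> Tuple[str, bytes]:
    """Return (content_type, body) for an Envelope Request.

    Each item in `messages` is assumed to be one complete serialized HTTP
    request: request line, headers, blank line, and any body.
    """
    if not messages:
        raise ValueError("at least one embedded request is required")
    # One request is an Embedded Message; more than one is a true Pipeline.
    content_type = "message/http" if len(messages) == 1 else "application/http"
    # The requests are concatenated without any extra framing.
    return content_type, b"".join(messages)


request = b"GET /service HTTP/1.0\r\nHost: www.example.com\r\n\r\n"
print(build_envelope_body([request])[0])
print(build_envelope_body([request, request])[0])
```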

Rationale: Requiring clear indication of the number of pipelined requests contained in an Envelope Request makes it possible to implement a RelayHttp service that does not parse the Embedded Pipelines it sends back and forth at all, simply relaying them verbatim to and from the remote server.

6 Processing a request

When the Relay Endpoint receives an Envelope Request as described above, it MUST extract the hostname and port number encoded in the Envelope Request's URL. If the hostname and port number are not valid, the Embedded Pipeline MUST NOT be relayed, a status code of 400 MUST be sent in the response to the Envelope Request, and processing of the Envelope Request MUST be terminated.
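
A Relay Endpoint might extract and validate the target along the following lines. This is a sketch under stated assumptions: the function name parse_relay_target, the endpoint_prefix parameter, and the particular notion of a "valid" hostname (ASCII letters, digits, dots and hyphens) are illustrative choices, since this document does not define validity precisely.

```python
import re
from typing import Optional, Tuple

DEFAULT_HTTP_PORT = 80  # used when ":portnumber" is omitted


def parse_relay_target(path: str, endpoint_prefix: str) -> Optional[Tuple[str, int]]:
    """Return (hostname, port), or None if the caller should reply with 400."""
    if not path.startswith(endpoint_prefix):
        return None
    target = path[len(endpoint_prefix):]
    hostname, sep, port_text = target.partition(":")
    # Assumed validity rule: a non-empty run of letters, digits, "." and "-".
    if not re.fullmatch(r"[A-Za-z0-9.-]+", hostname):
        return None
    if not sep:
        return hostname, DEFAULT_HTTP_PORT
    if not (port_text.isascii() and port_text.isdigit()):
        return None
    port = int(port_text)
    if not 0 < port < 65536:
        return None
    return hostname, port


print(parse_relay_target("/myrelay/www.example.com", "/myrelay/"))
print(parse_relay_target("/myrelay/www.example.com:8080", "/myrelay/"))
print(parse_relay_target("/myrelay/www.example.com:http", "/myrelay/"))
```

A return value of None corresponds to the 400 case above: the Embedded Pipeline is not relayed and processing stops.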

A Relay Endpoint MUST support Embedded Messages, that is, Envelope Requests with Content-Type of message/http. It MAY also support Embedded Pipelines proper, with Content-Type of application/http. If it does not support Embedded Pipelines, then it MUST reply to any application/http Envelope Request with a status code of 415, and MUST NOT relay the Embedded Pipeline. If an Envelope Request has an unrecognised Content-Type (such as application/x-www-form-urlencoded), the Relay Endpoint SHOULD treat it as if it were message/http.
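
The Content-Type handling described above amounts to a small decision function. The sketch below is one possible reading; the name classify_envelope and the supports_pipelines flag are hypothetical, and ignoring media type parameters (anything after ";") is an assumption of the sketch.

```python
from typing import Optional, Tuple


def classify_envelope(content_type: str, supports_pipelines: bool) -> Tuple[Optional[int], Optional[str]]:
    """Return (error_status, mode).

    error_status is None when relaying may proceed; mode is then the
    Content-Type under which the Envelope Request is to be handled.
    """
    # Assumption: parameters such as "; charset=..." are ignored.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/http":
        if not supports_pipelines:
            return 415, None  # Pipelines unsupported: do not relay
        return None, "application/http"
    # message/http, and any unrecognised type, is treated as message/http.
    return None, "message/http"


print(classify_envelope("message/http", False))
print(classify_envelope("application/http", False))
print(classify_envelope("application/x-www-form-urlencoded", False))
```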

Rationale: Naive RelayHttp services that do not parse Embedded Pipelines cannot in general tell when to stop collecting a response pipeline for a given request pipeline. Therefore, support for multiple-requests-per-Embedded-Pipeline is made optional.

The Relay Endpoint MAY examine the Embedded Pipeline, in combination with the relay parameter (the hostname and port number appended to the Relay Endpoint URL), in order to decide whether to perform the actual relaying or not. If it decides not to proceed with the relay after such an examination, because of security policy configuration or otherwise, it MAY reply to the Envelope Request (NB: not the Embedded Request(s)) with a status code of 401 or 403, as appropriate.

Otherwise, the Relay Endpoint opens an outbound HTTP connection to the hostname and port number specified as part of the Envelope Request's URL and sends the Pipeline that it received, unmodified. The response(s) from the remote server are recorded verbatim by the Relay Endpoint.
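
For a naive Relay Endpoint that does not parse the Pipeline, the relaying step can be sketched with a plain TCP socket. The function name relay_pipeline and the timeout default are hypothetical. The sketch assumes the remote server closes the connection once it has finished responding (as an HTTP/1.0 server normally does); a server that keeps the connection open would leave this loop waiting until the timeout expires.

```python
import socket


def relay_pipeline(hostname: str, port: int, pipeline: bytes, timeout: float = 10.0) -> bytes:
    """Send `pipeline` unmodified and return the raw bytes received in reply."""
    chunks = []
    with socket.create_connection((hostname, port), timeout=timeout) as conn:
        # The Embedded Pipeline is forwarded byte-for-byte.
        conn.sendall(pipeline)
        # Record the response verbatim until the remote server closes.
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Any socket error raised here (refused connection, timeout, and so on) means no response was collected from the remote server.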

7 Responding to a request

If the Relay Endpoint fails before it reaches the point where it has collected a response from the remote server, it SHOULD respond to the original Envelope Request with a 5xx-series status code.

Otherwise, it should reply to the Envelope Request with the collected response Pipeline, unmodified, embedded in an Envelope Response with a status code of 200.

If the Envelope Request was of type application/http, a true Pipeline, then the Envelope Response MUST be given a Content-Type of application/http, and SHOULD carry the same number of HTTP responses as there were HTTP requests in the Envelope Request's Pipeline, but MAY carry fewer. In the case where fewer responses than expected are sent back by the Relay Endpoint, the sender of the Envelope Request should treat the missing responses as it would for a dropped TCP connection if it were accessing the remote server directly.

If the Envelope Request was of type message/http (or was treated as if this were the case), then the Envelope Response MUST be given a Content-Type of message/http, and MUST contain exactly one HTTP response message.
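
The response rules of this section can be summarised in a short sketch. The name build_envelope_response is hypothetical, and the choice of 502 for the failure case is an assumption: the text above only calls for a 5xx-series status code.

```python
from typing import Dict, Optional, Tuple


def build_envelope_response(mode: str, collected: Optional[bytes]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for the Envelope Response.

    `mode` is the Content-Type under which the Envelope Request was handled
    ("message/http" or "application/http"); `collected` is the verbatim
    response from the remote server, or None if relaying failed.
    """
    if collected is None:
        # Assumption: 502 is used here; any 5xx-series code fits the text.
        return 502, {"Content-Length": "0"}, b""
    headers = {
        "Content-Type": mode,  # mirrors the type of the Envelope Request
        "Content-Length": str(len(collected)),
    }
    # The collected response is embedded unmodified, with status 200.
    return 200, headers, collected


status, headers, body = build_envelope_response(
    "message/http", b"HTTP/1.0 404 Not found\r\nContent-Length: 0\r\n\r\n"
)
print(status, headers["Content-Type"])
```

Note that the Envelope Response carries 200 even when the embedded response is an error such as 404; the two status codes describe different exchanges.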

8 Security Considerations

Normal HTTP access-control mechanisms can be used to restrict access to Relay Endpoints, including HTTP authentication and HTTPS.

A Relay Endpoint is not required to relay every request; it may choose to reject requests to certain blacklisted hosts, for instance, or may only allow requests to a set of whitelisted destinations. Mechanisms for controlling such policies are outside the scope of this document.
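
Although policy mechanisms are out of scope, a simple host-based check might look like the sketch below. The function name relay_allowed and its parameters are hypothetical; real deployments would likely need more than exact hostname matching, for example to guard against hostnames that resolve to internal addresses.

```python
from typing import Iterable, Optional


def relay_allowed(
    hostname: str,
    allowed: Optional[Iterable[str]] = None,
    blocked: Iterable[str] = (),
) -> bool:
    """Decide whether a request to `hostname` may be relayed.

    `blocked` lists hosts that are always rejected. If `allowed` is given,
    only hosts named in it are accepted; if it is None, every host that is
    not blocked is accepted.
    """
    host = hostname.lower()
    if host in {h.lower() for h in blocked}:
        return False
    if allowed is None:
        return True
    return host in {h.lower() for h in allowed}


print(relay_allowed("www.example.com", allowed=["www.example.com"]))
print(relay_allowed("internal.example.com", allowed=["www.example.com"]))
```

When the check fails, the Relay Endpoint can reply to the Envelope Request with 403, as described in section 6.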

9 Normative References

10 Author's Address

Tony Garnock-Jones
tonygarnockjones@gmail.com
http://homepages.kcbbs.gen.nz/tonyg/

11 Copyright Statement

Copyright © 2009, 2010 Tony Garnock-Jones tonygarnockjones@gmail.com
Copyright © 2009 LShift Ltd. query@lshift.net

Permission is hereby granted, free of charge, to any person obtaining a copy of this documentation (the "Documentation"), to deal in the Documentation without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Documentation, and to permit persons to whom the Documentation is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Documentation.

THE DOCUMENTATION IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE DOCUMENTATION OR THE USE OR OTHER DEALINGS IN THE DOCUMENTATION.

12 Appendix: Example messages

The following sections provide example message/http Envelope Requests and Responses that correspond to the running example given in the main body of the text.

12.1 Envelope Request (message/http)

The client connects to IP address 10.11.12.13, on port 80, and sends the following HTTP message:

POST /myrelay/www.example.com HTTP/1.0
Host: 10.11.12.13
Content-Type: message/http
Content-Length: 48

GET /service HTTP/1.0
Host: www.example.com

12.2 Envelope Response (message/http)

We imagine that the service requested from www.example.com is not actually available. The Relay Endpoint then responds with the following HTTP message:

HTTP/1.0 200 Ok
Server: RelayHttp/1.0
Content-Type: message/http
Content-Length: 146

HTTP/1.0 404 Not found
Server: SomeExampleServer/2.3.4
Content-Type: text/plain
Content-Length: 41

The requested service is not available.
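
The Content-Length values in the examples above can be checked mechanically. The sketch below assumes that every line on the wire ends in CRLF and that the text/plain body ends with a final CRLF; under those assumptions it reproduces the figures 48, 41 and 146. The helper name wire_length is hypothetical.

```python
CRLF = "\r\n"


def wire_length(lines):
    """Length in bytes of `lines` when each one is terminated by CRLF."""
    return len((CRLF.join(lines) + CRLF).encode("ascii"))


# Embedded request from section 12.1; the empty string is the blank line
# that ends the headers.
embedded_request = ["GET /service HTTP/1.0", "Host: www.example.com", ""]

# Body and full embedded response from section 12.2.
response_body = ["The requested service is not available."]
embedded_response = [
    "HTTP/1.0 404 Not found",
    "Server: SomeExampleServer/2.3.4",
    "Content-Type: text/plain",
    "Content-Length: 41",
    "",
] + response_body

print(wire_length(embedded_request))   # 48
print(wire_length(response_body))      # 41
print(wire_length(embedded_response))  # 146
```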