Today’s Web interactions are frequently short, which leads to an increasing number of responses carrying only control information and no data. Currently, browsers use HTTP, which relies on client-initiated TCP for all Web interactions. TCP is not always well suited for short interactions.
Client-initiated TCP also handicaps the deployment of interception caches in the network, because connections can be disrupted when some client packets bypass the cache on their way to the server. This report explains a new transfer protocol for Web traffic, called Dual-transport HTTP (DHTTP), which splits Web traffic between UDP and TCP channels. The protocol was proposed by Michael Rabinovich and Hua Wang. When the TCP channel is chosen, it is the server that opens the connection back to the client. Important aspects of DHTTP include:
- Adapting to bottleneck shifts between the server and the network, and coping with the unreliable nature of UDP.
- Eliminating, through server-initiated TCP, the possibility of disrupted TCP connections in the presence of interception caches, which allows unrestricted caching within backbones.
2.1 ABOUT HTTP
Currently all the browsers use HTTP, which was conceived as essentially a protocol for transferring files. It is an application layer protocol. It was designed on top of a connection-oriented transport protocol such as TCP.
The HTTP protocol consists of two fairly distinct items:
1) The set of requests from browser to server
2) The set of responses from server to browser
The following is a list of some built-in HTTP request methods:
a) GET – Request to read a Web page
b) PUT – Request to store a Web page
c) HEAD – Request to read a Web page’s header
d) POST – Append to a named resource
e) DELETE – Remove the Web page
f) LINK – Connects two existing resources
g) UNLINK – Breaks an existing connection between two resources.
2.2 PROBLEMS IN HTTP:
The current Web workload exhibits a large number of short page transfers, so a growing share of interactions serve control purposes rather than data transfer. The median size of responses that carry data was found to be about 3,450 bytes. Establishing a TCP connection for such a short response leads to unnecessary overhead. It also increases the number of TCP connection setups and the number of open connections at the server.
To address these overheads, later versions of HTTP use persistent connections and pipelining. Persistent connections allow a client to fetch multiple pages from the same server over the same TCP connection, which reduces the TCP set-up overhead. Pipelining allows a client to send multiple requests over the same connection without waiting for the response to an earlier request. The server then sends a stream of responses back.
Although these features reduce client latency and network traffic, they do not eliminate all overheads of TCP. In fact, they introduce new performance penalties, especially when the bottleneck is at the server.
• Persistent connections increase the number of open connections at the server, which may decrease the server throughput.
• Pipelining has a limitation that servers must send responses in their entirety and in the same order as the order of the requests in the pipeline. This constraint causes head of line delays when a slow response holds up all other responses in the pipeline.
To avoid head of line delays, browsers often open multiple simultaneous connections to the same server, which further increases the number of open connections and degrades the throughput of a busy server. To limit the number of open connections, servers close connections that remain idle for a persistent connection timeout period. Busy sites often use short connection timeouts, thereby limiting the benefits of persistent connections. Moreover, the persistent connections that servers do maintain are often underutilized, which wastes server resources and affects the connection’s ability to transmit at a proper rate, since well-behaved TCP implementations shut down the transmission windows of idle connections.
3. FEATURES OF DHTTP
• The Dual-transport HTTP protocol (DHTTP) splits Web traffic between the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).
• The client normally sends all requests over UDP. The server then sends its response over UDP or TCP, depending on the size of the response and the network conditions.
• By using UDP for short responses, DHTTP reduces both the number of TCP connection set-ups and the number of open connections at the server.
• The utilization of the remaining TCP connections increases because they are reserved for larger objects.
• DHTTP removes ordering constraints of pipelining.
• When choosing TCP, a DHTTP server establishes the connection back to the client, in a sense reversing the client/server roles in the interaction. While it has some implications for firewalls, this role reversal brings the following benefits.
1. It avoids an increase (compared to the current HTTP) in the number of message round-trips before the client starts receiving the data over the TCP channel.
2. It removes a bottleneck at the server that accepts all TCP connections.
3. It allows unconstrained deployment of interception caches in the network.
4. DHTTP PROTOCOL
In DHTTP, both Web clients and servers listen on two ports, a UDP port and a TCP port. Thus, two communication channels exist between a client and a server:
- UDP channel
- TCP channel.
The client usually sends its requests over UDP. Only when uploading a large amount of data (e.g., using a PUT request) may the client use TCP. By default, a request below 1460 bytes, which this report takes as one Ethernet maximum transmission unit (MTU), is sent over UDP. This value is called the size threshold. Virtually all HTTP requests fall into this category. For conceptual cleanness, the client itself initiates the TCP connections it uses to send requests instead of reusing connections initiated by the server for data transfer.
Fig 4.1 Message Exchange for Web interaction
When the server receives the request, it chooses between the UDP and TCP channels for its response. It sends control messages, which consist of only a response and no data, as well as short data messages (below 1460 bytes, one Ethernet MTU, by default) over UDP, even if there is an open TCP connection to the client. This avoids the overhead of enforcing unnecessary response ordering at the TCP layer. A UDP response is sent in a single UDP packet, since the default size threshold practically ensures that the packet will not be fragmented.
For long data messages (over threshold value), the server opens a TCP connection to the client, or re-uses an open one if available. Figure 4.1 shows the message exchange of a current HTTP interaction and a DHTTP interaction with the response sent over UDP and TCP. It is important that even when choosing TCP, DHTTP does not introduce any extra round-trip delays compared to the current Web interactions. While it may appear counter-intuitive because in DHTTP, TCP establishment is preceded by an “extra” UDP request, the comparison of Figure 1a and 1c shows that data start arriving at the client after two roundtrip times (RTTs) in both cases. In fact, a possible significant advantage of DHTTP over current HTTP in this case is that the server may overlap page generation with TCP connection establishment.
4.2 MESSAGE FORMAT:
Fig 4.2 a shows the format of a DHTTP request message, and Fig 4.2 b shows the format of the response sent by the server.
Responses to the requests may arrive on different channels and out of order with respect to requests, so the client must be able to match requests with responses. Consequently, a client assigns a randomly chosen request ID to each request. The request ID is reflected by the server in the response and allows the client to assign the response to the proper request. The request ID is unique for a given client and only across the outstanding requests that await their responses.
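The matching of responses to requests described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name PendingRequests and its methods are hypothetical, and the report does not say how a client draws its random IDs, so the standard secrets module is used here.

```python
import secrets


class PendingRequests:
    """Tracks outstanding requests by a randomly chosen 8-byte request ID.

    Hypothetical helper: the name and interface are assumptions made for
    illustration, not part of the DHTTP specification.
    """

    def __init__(self):
        self._pending = {}  # request ID -> description of the request

    def register(self, request):
        # Draw random 64-bit IDs until one is unused among the outstanding
        # requests; uniqueness is only needed across those requests.
        while True:
            request_id = secrets.randbits(64)
            if request_id not in self._pending:
                self._pending[request_id] = request
                return request_id

    def match(self, request_id):
        # Return the request that this response answers and stop tracking it.
        # None means the ID is not outstanding (for example, a duplicate).
        return self._pending.pop(request_id, None)
```

Because the lookup uses only the reflected ID, it does not matter whether a response arrived over UDP or TCP, or in which order the responses arrive.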
The following table shows the fields of the DHTTP request message format and their sizes in bytes.
Field           Size (bytes)
Request ID      8
Port Number     2
Resent flag     1
HTTP Request    –
A) Request ID: Eight bytes are allocated for the request ID, which is enough to safely assume that collisions will not occur.
B) Port Number: Two bytes are allocated for the client to specify a port number. When sending the response, the server must know both client port numbers, that is, the TCP port and the UDP port. To save overhead, DHTTP uses the fact that the source port number of the channel used by the request is already carried in the transport (UDP or TCP) header of the packet. So a DHTTP request needs to contain only the port number of the other channel. Thus the port number field of a DHTTP request contains the client’s TCP port number if the request is sent over UDP, and its UDP port number if the request is sent over TCP.
C) Resent flag: A one-byte flag, currently used to indicate a duplicate request. Whenever the client retransmits a request, it sends it with the resent flag set. When the server receives such a request, it processes it accordingly.
D) HTTP Request: This field is the same as a current HTTP request. When sending the response, the server adds the 8-byte request ID from the request to the normal HTTP response, as shown in Figure 4.2 b.
The DHTTP request format adds eleven bytes to every request compared to the standard HTTP request format. Similarly, a DHTTP response, which includes the request ID, adds only 8 bytes.
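To make the eleven-byte overhead concrete, the request layout can be sketched with Python's struct module. The field order follows the table above; the network byte order and the function names pack_request and unpack_request are assumptions made for this sketch, since the report does not fix them.

```python
import struct

# Request ID (8 bytes), port of the other channel (2 bytes), resent flag (1 byte).
# "!" selects network byte order with no padding (an assumption of this sketch).
REQUEST_HEADER = struct.Struct("!QHB")


def pack_request(request_id, other_port, resent, http_request):
    """Prefix an ordinary HTTP request (bytes) with the DHTTP fields."""
    header = REQUEST_HEADER.pack(request_id, other_port, 1 if resent else 0)
    return header + http_request


def unpack_request(message):
    """Split a DHTTP request into its three fields and the HTTP request."""
    request_id, other_port, resent = REQUEST_HEADER.unpack_from(message)
    return request_id, other_port, bool(resent), message[REQUEST_HEADER.size:]
```

REQUEST_HEADER.size evaluates to 11, which matches the per-request overhead stated above.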
UDP is not as reliable a protocol as TCP. On the Internet, we must therefore provide a reliability mechanism for transmission over UDP. The easiest way to provide reliability would be to make clients acknowledge every UDP packet received and make servers resend unacknowledged UDP packets. This, however, would increase bandwidth consumption for acknowledgements and server overhead for storing unacknowledged UDP packets and for managing per-packet timeouts and retransmissions. These overheads would be paid whether or not a packet loss occurs.
This approach to providing reliability is never optimal. When packet loss is low, it imposes unnecessary overheads. When packet loss is high, TCP is preferable to such an approach over UDP. So, instead of trying to build reliability into the UDP channel, the DHTTP protocol simply stipulates that a client may resend a UDP request, with the resent flag set, if the response does not arrive within a timeout period. By using a large request timeout (between 5 and 10 seconds) with a limited number of resends, DHTTP can ensure that clients do not overwhelm a server with repeated resends. Clients could use more sophisticated strategies, such as smaller initial timeouts followed by exponential backoff.
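The client-side resend rule can be sketched as follows. This is a minimal illustration, not the client used by the protocol's authors: the function names, the backoff factor of 2 and the callback interface (send and wait_for_response) are all assumptions.

```python
def resend_schedule(initial_timeout, max_resends, backoff=2.0):
    """Timeouts (seconds) to wait after the first send and after each resend."""
    timeouts = []
    timeout = initial_timeout
    for _ in range(max_resends + 1):
        timeouts.append(timeout)
        timeout *= backoff  # backoff=1.0 gives a fixed timeout
    return timeouts


def request_with_resends(send, wait_for_response, timeouts):
    """Send a UDP request, resending with the resent flag set after each timeout.

    send(resent=...) transmits the request; wait_for_response(timeout) returns
    the response, or None if the timeout expired. Both are hypothetical hooks.
    """
    for attempt, timeout in enumerate(timeouts):
        send(resent=attempt > 0)  # only retransmissions carry the resent flag
        response = wait_for_response(timeout)
        if response is not None:
            return response
    return None  # give up after the limited number of resends
```

A fixed large timeout corresponds to resend_schedule(5.0, 2, backoff=1.0), while resend_schedule(1.0, 3) gives a smaller initial timeout followed by exponential backoff.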
It is the server’s responsibility to deal efficiently with resent requests, especially when there is a bottleneck. The server can achieve this by re-generating responses, or by caching UDP responses in buffers so that they can be re-transmitted quickly. However, DHTTP stipulates that a response to a resent request be sent over TCP for flow control.
4.4 NON-IDEMPOTENT REQUESTS:
Non-idempotent requests are requests that should not be re-executed. Examples include some e-commerce transactions, such as an order to buy or sell stocks. Currently DHTTP deals with non-idempotent requests by delegating them to the TCP transport instead of providing special support at the application level. In this method, the protocol portion of a URL prescribes the transport protocol to be used by clients.
For instance, we can adopt a convention that URLs starting with “dhttpt:” must be requested over the TCP channel, while URLs that start with “dhttp:” can use UDP. Then, all non-idempotent URLs would be given the “dhttpt:” prefix.
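The URL convention can be illustrated with a short Python sketch. The function name request_channel is hypothetical, and the sketch ignores the separate rule that large requests go over TCP regardless of the prefix.

```python
def request_channel(url):
    """Pick the request channel from the protocol portion of a URL.

    Illustrative only: "dhttpt:" forces TCP (non-idempotent resources),
    while "dhttp:" allows the usual UDP request.
    """
    scheme = url.split(":", 1)[0].lower()
    if scheme == "dhttpt":
        return "TCP"
    if scheme == "dhttp":
        return "UDP"
    raise ValueError("not a DHTTP URL: " + url)
```

For example, a hypothetical order form at dhttpt://example.com/buy would always be requested over TCP, so a lost packet is retransmitted by TCP and the request is never re-executed by a client resend.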
5. CHANNEL SELECTION
5.1 CRITERIA FOR CHANNEL SELECTION:
The server must choose between TCP and UDP based on the response size and network conditions. When the network is not congested and packet loss is low, then the best strategy for the server would be to maintain no state for sent responses. This strategy optimizes for the common case of no packet loss, at the expense of having to re-generate the response after a loss does occur.
However, when the network is congested, this strategy is extremely poor. Not only do the UDP responses have to be regenerated and re-transmitted often, but even TCP responses may arrive at clients so slowly that clients send duplicate requests for them. The result is that the server sends many duplicate responses, which will further increase the network congestion. The same situation may occur with compute-intensive responses, which may take a long time to reach the client.
To address this issue, the DHTTP server must maintain some information about the current network conditions. For this purpose the server maintains two counters, a “fresh requests counter” and a “resent requests counter”.
Fresh Request Counter:
This counter is incremented any time the server sends a response by UDP to a request whose resent flag is not set.
Resent Request Counter:
It counts the number of resent requests received by the server.
The server then chooses a channel using two parameters, a loss threshold parameter (L) and a size threshold parameter (s).
• All responses exceeding the size threshold are sent over TCP. Responses to resent requests are also sent over TCP.
• The choice for the remaining responses depends on the loss threshold L and the ratio of the resent requests counter to the fresh requests counter. If this ratio is below L, these responses use UDP. A ratio above L indicates high packet loss and would suggest sending all responses by TCP. However, the server must still send a small number of responses over UDP to monitor the loss rate, since the TCP layer masks losses in the TCP channel. Therefore, in the high loss condition the server arbitrarily sends a (1-L) fraction of small responses (99% when L is 1%) over TCP and the remaining small responses over UDP.
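The selection rule above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' server code: the class name ChannelSelector, the default L of 0.01, the treatment of a response of exactly s bytes as short, and the absence of any counter reset are all choices made for the sketch.

```python
import random


class ChannelSelector:
    """Chooses UDP or TCP for a response from the two counters, L and s."""

    def __init__(self, loss_threshold=0.01, size_threshold=1460, rng=random.random):
        self.loss_threshold = loss_threshold  # L
        self.size_threshold = size_threshold  # s, in bytes
        self.fresh_requests = 0   # UDP responses sent to non-resent requests
        self.resent_requests = 0  # resent requests received
        self._rng = rng           # injectable so the sketch can be tested

    def choose(self, response_size, resent):
        if resent:
            self.resent_requests += 1
            return "TCP"  # responses to resent requests always use TCP
        if response_size > self.size_threshold:
            return "TCP"  # long responses always use TCP
        high_loss = (
            self.fresh_requests > 0
            and self.resent_requests / self.fresh_requests > self.loss_threshold
        )
        # Under high loss, a (1 - L) fraction of short responses goes to TCP;
        # the remaining L fraction stays on UDP to keep probing the loss rate.
        if high_loss and self._rng() >= self.loss_threshold:
            return "TCP"
        self.fresh_requests += 1
        return "UDP"
```

Only UDP responses to fresh requests increment the fresh requests counter, so the ratio approximates the fraction of UDP responses that had to be requested again.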
5.2 HANDLING RESENT TCP REQUESTS:
There is still a race condition, where a client may time out and resend a request before the TCP response to this request arrives. To address this race condition, our server maintains a circular buffer of global request IDs that have been responded to by TCP. Global request ID is a combination of the client IP address and request ID from a request.
The buffer has room for 10000 global request IDs. When a resent request arrives, the master process ignores it if it is found in the buffer, since the response was already sent by TCP that has reliable delivery.
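A circular buffer of this kind can be sketched in Python as below. The class and method names are hypothetical, and the auxiliary dictionary used for fast lookup is an implementation choice of this sketch; the report only specifies the buffer and its capacity.

```python
class TcpResponseHistory:
    """Circular buffer of global request IDs already answered over TCP."""

    def __init__(self, capacity=10000):
        self._slots = [None] * capacity
        self._next = 0     # slot that the next record will overwrite
        self._counts = {}  # global request ID -> number of slots holding it

    def record(self, client_ip, request_id):
        key = (client_ip, request_id)  # global request ID
        old = self._slots[self._next]
        if old is not None:
            # The oldest entry is overwritten and forgotten.
            self._counts[old] -= 1
            if self._counts[old] == 0:
                del self._counts[old]
        self._slots[self._next] = key
        self._counts[key] = self._counts.get(key, 0) + 1
        self._next = (self._next + 1) % len(self._slots)

    def already_answered(self, client_ip, request_id):
        # True means a resent request with this ID can be ignored.
        return (client_ip, request_id) in self._counts
```

With a fixed capacity the memory use stays bounded; the price is that a very late resend, arriving after its entry has been overwritten, is answered again.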
A potential limitation of the above algorithm is that the server maintains aggregate packet loss statistics. While the aggregate statistics reflect well the congestion of the server link to the Internet, congestion of network paths to individual clients may vary. Thus, enough clients with congested links can make the server use TCP even for clients with non-congested links. Conversely, the server may use UDP for an occasional congested client if its connection to the rest of the clients is uncongested. If a UDP response to the congested client is lost, the client will resend its request with the resent flag set, forcing the server to use TCP for this interaction.
5.3 CHOOSING SIZE THRESHOLD VALUE:
Choosing a size threshold presents another interesting tradeoff as far as performance is concerned. A large value will reduce the number of TCP connections by sending more responses over UDP; however, if it exceeds one MTU (1460 bytes), some responses in the current version of DHTTP will be fragmented. Fragmentation degrades router performance and entails resending the entire response upon the loss of any fragment. Thus, in a high loss environment such as the Internet, s should be limited to one MTU.
6. DHTTP SERVER
The current implementation of the DHTTP server was made by modifying the Apache 1.3.6 server, which is today’s most popular Web server. Figure 9.1 shows the structure of the DHTTP server. The server has two important sections, the master process and the worker processes.
Master Process:
It is a process that accepts incoming requests. The master process has three threads.
- Read Request Thread, which reads incoming requests from the UDP port and copies them into a global buffer.
- Pipe Request Thread, which reads the requests from the global buffer and pipes them to worker processes for execution.
- Maintenance Thread, which wakes up every second and checks the status of the worker processes. If it finds too few idle worker processes, it forks some new ones. If there are too many idle worker processes, it kills some, so that the number of worker processes scales with the request rate.
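The decision taken by the maintenance thread on each wake-up can be sketched as a small Python function. The bounds min_idle and max_idle and the policy of restoring the pool exactly to those bounds are assumptions; the report only says that the thread forks "some" processes or kills "some".

```python
def spare_worker_adjustment(idle_workers, min_idle, max_idle):
    """Return how many workers to fork (positive) or kill (negative).

    Hypothetical policy: bring the number of idle workers back inside
    the range [min_idle, max_idle]; 0 means leave the pool alone.
    """
    if idle_workers < min_idle:
        return min_idle - idle_workers
    if idle_workers > max_idle:
        return max_idle - idle_workers
    return 0
```

Calling this once per second with the current count of idle workers lets the pool grow under load and shrink again when the request rate drops.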
The global buffer is used to move requests out of the UDP port buffer as quickly as possible, so that incoming requests do not fill up the UDP port buffer and get dropped. A better alternative, which has not yet been implemented, would be to increase the size of the UDP port buffer and eliminate the extra copying of requests between the two buffers.
Worker Process:
It is a process that executes individual requests and sends the responses back to clients. A worker process reads an HTTP request from its pipe, generates the HTTP response, chooses between the UDP and TCP channels and sends the response to the client. If the worker process chooses TCP and it already has an open TCP connection to the client, it reuses this connection; otherwise it opens a new connection to the client. Thus, a worker process can have at most one TCP connection to a client, although it may have several concurrent connections to different clients.
Each worker process also contains a Timeout thread that is invoked periodically to close any TCP connections created by this process that have been idle for more than a timeout period. This timeout parameter is equivalent to persistent connection timeout in HTTP servers; we varied it from 1 to 15 seconds.
Worker processes share with the master a data structure that describes open TCP connections. The data structure includes a pending request counter for each connection, which is an upper bound on the number of requests that may be responded to over this connection. When the Pipe Request thread chooses a worker process and the latter has an open connection to the client, the Pipe Request thread increments the pending request counter (even though it does not know whether the response will use the connection or be sent over UDP).
When a worker completes sending the response, it decrements the pending request counter for this client. When choosing a worker process for a new request, the Pipe Request thread tries to efficiently reuse available TCP connections and, at the same time, distribute the incoming requests uniformly among all worker processes according to the following procedure:
1) If there is an open idle TCP connection available to this client and to the TCP port specified in the request message, and the connection’s pending request counter is zero, choose the worker process that has created this TCP connection.
2) If the number of open TCP connections to this client and the total number of open connections are below their respective limits, choose any idle worker process; if all processes are busy choose the one with the smallest total number of pending requests.
3) If either the number of TCP connections to this client or the total number of open connections reached their limits, choose a process that has a TCP connection to this client with the smallest pending request counter; if there are no connections to this client, choose a process with the minimal overall number of pending requests.
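The three-step procedure can be sketched in Python as follows. The Worker record and the function choose_worker are hypothetical stand-ins for the shared data structure, and tie-breaking between equally good candidates is left to list order, which the procedure above does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    busy: bool = False
    pending: int = 0  # total pending requests assigned to this worker
    # (client IP, client TCP port) -> pending request counter of the connection
    connections: dict = field(default_factory=dict)


def choose_worker(workers, client_ip, tcp_port, per_client_limit, total_limit):
    # Step 1: reuse an idle connection to this client and TCP port.
    for worker in workers:
        if worker.connections.get((client_ip, tcp_port)) == 0:
            return worker

    client_conns = [
        (count, worker)
        for worker in workers
        for (ip, _port), count in worker.connections.items()
        if ip == client_ip
    ]
    total_conns = sum(len(worker.connections) for worker in workers)

    # Step 2: below both limits, prefer an idle worker, else the least loaded.
    if len(client_conns) < per_client_limit and total_conns < total_limit:
        idle = [worker for worker in workers if not worker.busy]
        if idle:
            return idle[0]
        return min(workers, key=lambda worker: worker.pending)

    # Step 3: a limit is reached, so reuse the least loaded connection to
    # this client, or fall back to the least loaded worker overall.
    if client_conns:
        return min(client_conns, key=lambda pair: pair[0])[1]
    return min(workers, key=lambda worker: worker.pending)
```

Step 1 favors connection reuse, while steps 2 and 3 spread load; the two limits keep one client, or all clients together, from exhausting the server's connections.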
7. FUTURE ENHANCEMENT:
There are a number of ways in which DHTTP could potentially be improved. These include a native support for non-idempotent requests, a more sophisticated channel selection algorithm and connection management at the server.
1) Native Support for Non-Idempotent Requests
The method for supporting non-idempotent resources described in Section 4.4 is simple but excludes non-idempotent resources from performance gains of DHTTP. Instead, one could explore ways to add support for these requests to DHTTP. As one possibility, whenever the server generates a non-idempotent response, it logs it. For efficiency, it also maintains an index of logged responses that allows a quick check whether the request with a given request ID has been processed. Upon receiving a non-idempotent request with the resent flag set, the server will look it up in the index. If this request is not in the index, the server never received the original request. So, the server will generate the response. If the request is found in the index, the server will re-send the corresponding response from the log without re-executing the request.
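The logging idea can be sketched in Python as below. This is only one possible reading of the proposal: the class name NonIdempotentLog is hypothetical, a dictionary serves as both log and index, entries are keyed by client IP address and request ID, and the sketch never expires old entries.

```python
class NonIdempotentLog:
    """Replays logged responses instead of re-executing resent requests."""

    def __init__(self, execute):
        self._execute = execute  # callable that performs the request once
        self._responses = {}     # (client IP, request ID) -> logged response

    def handle(self, client_ip, request_id, resent, request):
        key = (client_ip, request_id)
        if resent and key in self._responses:
            # The original request was processed: resend the logged response.
            return self._responses[key]
        # Fresh request, or a resend whose original never arrived: execute it.
        response = self._execute(request)
        self._responses[key] = response
        return response
```

With such a log in place, a resent stock order would return the original confirmation rather than placing a second order.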
2) Channel Selection Algorithm
Another avenue for improvement is to refine the channel selection algorithm.
One option is to consider finer-grain statistics on resent requests and factor the client subnet into channel selection.
Another option is to change the UDP size threshold dynamically based on the observed packet loss. When the loss rate is low, the size threshold could be doubled to two Ethernet MTUs without relaxing TCP’s congestion control, for further performance gains. This would require sending a UDP response in several packets to avoid packet fragmentation.
3) Connection Management
Since the server in DHTTP can decide how many connections to open to a client, one can capitalize on this capability to improve performance.
For example, if the server has few open connections overall, it may decide to open many TCP connections to the client to parallelize sending multiple embedded objects. If the server already has a large number of connections, it may send these objects sequentially over the same connection, or even use individual short-lived connections in succession.
8. COMPARISON WITH HTTP
- When choosing the TCP channel, DHTTP requires the same number of round-trip times (RTTs) as HTTP.
- Both DHTTP and HTTP impose similar requirements on firewalls, except that DHTTP requires the firewall to open a hole for packets from any source IP address.
- Both DHTTP and HTTP are susceptible to Denial of Service (DoS) attacks.
- HTTP uses TCP as its transport layer protocol, while DHTTP uses both TCP and UDP.
- By using UDP for small responses, DHTTP enables better utilization of TCP connections for large objects.
- The use of persistent connections in HTTP may increase the number of open TCP connections at the server, which may decrease the server’s throughput, especially when the bottleneck is at the server.
- Pipelining in HTTP imposes the additional overhead of response ordering. DHTTP does not have ordering constraints.
- DHTTP uses server-initiated connections, whereas HTTP uses client-initiated connections.
- In DHTTP, an interception cache passes TCP requests through to the server, while in HTTP there is no such restriction on which requests an interception cache handles.
- In HTTP, an interception cache uses the server’s IP address in the response and hence violates the end-to-end semantics of the Internet. In DHTTP, an interception cache uses its own IP address in the response.
This seminar report describes a protocol for Web traffic called Dual HTTP. The protocol splits the traffic between UDP and TCP channels based on the size of responses and network conditions. When TCP is preferable, DHTTP opens connections from Web servers to clients rather than the other way around.
DHTTP retains the end-to-end principle of the Internet in the presence of interception Web proxies, allowing their unconstrained deployment throughout the Internet without the possibility of disrupting connections.
From the performance perspective, with existing HTTP, busy servers face a dilemma of either limiting the number of Web accesses that benefit from persistent connections or having to deal with a large number of simultaneous open connections. By performing short downloads over UDP, DHTTP reduces both the number of TCP connection set-ups and the number of open TCP connections at the server. At the same time, the TCP connections that DHTTP does open are reserved for transferring larger objects, which increases connection utilization. Also, by opening TCP connections back to the clients, the server no longer has a bottleneck process that receives all TCP connection requests from the clients.
Performance analysis shows that when the network is not congested, DHTTP significantly improves the performance of Web servers, reduces user latency, and increases utilization of remaining TCP connections, improving their ability to discover available bandwidth.
The DHTTP server successfully detects network congestion and uses TCP for almost all traffic under these conditions. Also, a number of further optimizations to the protocol are possible.
2) ‘DHTTP’ at www.ieee.infocom.org/2001/papers by Hua Wang
3) ‘Internetworking with TCP/IP’ by Douglas Comer.
4) ‘Computer Networks’ by Andrew Tanenbaum.