fanf

TCP narg/rant

29th Apr 2008 | 17:07

During a discussion on a work mailing list about TCP performance across the Atlantic, one of my colleagues made some comments that were ridiculous even by his standards. (He's notorious for declaring ex cathedra that things are extremely difficult and no-one understands them any more except a few giants (like himself) from the age when dinosaurs ruled the machine room.) I felt compelled to correct his most egregiously wrong statements, and I'm posting my comments here since they seem to be of wider interest than just to University IT staff. In the quoted sections below, when he talks about "products" he means software that games or breaks TCP's fairness and congestion control algorithms.

I went to a very interesting talk by the only TCP/IP experts I have ever met who explained that the only reason the Internet works is that such products were NOT in widespread use.

The importance of congestion control has been common knowledge since the Internet's congestion collapse in the 1980s and Van Jacobson's subsequent TCP algorithm fixes. (It's taught to all CS undergrads, e.g. look at the Digital Communications 1 syllabus in CST part 1B.) However software that games TCP's fairness model *is* in widespread use: practically all P2P software uses multiple TCP connections to get better throughput.
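To make the arithmetic concrete, here is a toy sketch (my own illustration, with made-up numbers) of why per-connection fairness rewards opening extra connections:

    # Idealised TCP fairness: long-lived connections sharing a bottleneck
    # each get roughly an equal slice of its capacity.
    def share_of_bottleneck(my_conns, other_conns):
        return my_conns / (my_conns + other_conns)

    # One ordinary host with a single connection vs. a P2P client with 20:
    print(share_of_bottleneck(1, 20))   # ~0.048 - the ordinary host's share
    print(share_of_bottleneck(20, 1))   # ~0.952 - the P2P client's share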

In the mid 1990s the Internet was going to melt down because of all the web browsers making lots of connections that were too short for TCP's congestion control algorithms to be effective. So we got HTTP/1.1 with persistent connections and a limit on the concurrency that browsers use. However these TCP and HTTP measures only work because practically all software on end-hosts co-operates in congestion control: there's nothing to enforce this co-operation.
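To illustrate the fix (a sketch; the host and paths are placeholders), a persistent HTTP/1.1 connection pays TCP's slow-start cost once rather than per request:

    import http.client

    # Several requests over one persistent HTTP/1.1 connection;
    # example.org and the paths are placeholders.
    conn = http.client.HTTPConnection("example.org")
    for path in ("/", "/style.css", "/logo.png"):
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()   # drain the body before reusing the connection
        print(path, resp.status)
    conn.close()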

The rise of P2P means that ISPs are increasingly interested in enforcing limits on bandwidth hogs, hence the arguments about "network neutrality" as the ISPs point fingers at YouTube and the BBC and the file sharers. They are also deploying traffic shapers in the network which manipulate TCP streams with varying degrees of cleverness. (The most sophisticated manipulate the stream of ACKs going back to the sender which causes it to send at the network's desired rate; the stupid ones like Comcast's just kill connections they don't like.)
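The principle behind the blunter shapers can be sketched as a token bucket (a minimal illustration of rate enforcement; real middleboxes, especially the ACK-manipulating kind, are far more involved):

    import time

    class TokenBucket:
        # Forward a packet only when enough byte-credit has accumulated
        # at the configured rate; otherwise queue or drop it.
        def __init__(self, rate_bytes_per_s, burst_bytes):
            self.rate = rate_bytes_per_s
            self.capacity = burst_bytes
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def allow(self, packet_bytes):
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return True
            return False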

However, a better approach is to use a more realistic idea of fairness as the basis for congestion control instead of equal bandwidth per TCP connection. e.g. for ISPs, balance per subscriber. Bob Briscoe's recent position paper makes this argument in readable fashion. RFC 2140 groped in the same direction 10 years ago, but was never adopted. Either way, enforcement at the network's edge (instead of trusting the hosts) is likely to be the future whatever model of fairness is used.
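A toy model of the difference (my own sketch, after Briscoe's argument): split capacity per subscriber first, then among each subscriber's connections, so opening extra connections buys nothing:

    # Capacity is shared equally per subscriber, then among each
    # subscriber's own connections. Units are arbitrary.
    def per_subscriber_shares(capacity, conns_per_sub):
        per_sub = capacity / len(conns_per_sub)
        return {sub: per_sub / n for sub, n in conns_per_sub.items()}

    # Subscriber "a" runs 20 connections, "b" runs 1; each subscriber
    # still gets 50 units overall, unlike the per-connection model.
    print(per_subscriber_shares(100.0, {"a": 20, "b": 1}))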

Also interesting is the research on adapting TCP to high-bandwidth networks. There are many examples of what Briscoe argues against: great efforts of design and analysis to preserve TCP-style fairness in the new protocols. The GÉANT2 high-speed TCP page has lots of relevant links.

Luckily, there are very few people who know enough about TCP/IP and related protocols to write an effective product because, as I implied, there are probably no more than half-a-dozen TCP/IP experts in the world still working (and may even be none).

Tosh. There's a substantial market out there, both for managing congestion and for exploiting weaknesses in congestion control.


Comments (10)

from: gareth_rees
date: 29th Apr 2008 16:41 (UTC)

no-one understands them any more except a few giants (like himself) from the age when dinosaurs ruled the machine room

I have a feeling I know just who this is.

from: Tony Finch (fanf)
date: 29th Apr 2008 19:44 (UTC)

Not exactly a big secret :-)

from: dwmalone
date: 29th Apr 2008 17:50 (UTC)

There are quite a few people where I work interested in this future-of-congestion-control stuff. One interesting point is that, because of small sockbufs, TCP's congestion control has been ineffective in many situations: the limiting factor has been a combination of the sender's sockbuf and the advertised receiver window. As Linux, BSD and Windows get adaptive buffer sizing, it will be interesting to see if this becomes a problem.
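(To illustrate the cap with made-up but typical numbers: a fixed window bounds throughput at window/RTT, whatever the congestion window does.)

    # Throughput ceiling imposed by a fixed window: window / RTT.
    def max_throughput_bps(window_bytes, rtt_s):
        return window_bytes * 8 / rtt_s

    # A classic 64 KB window across a 100 ms transatlantic path:
    print(max_throughput_bps(64 * 1024, 0.100) / 1e6)  # ~5.2 Mbit/s, however fat the pipe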

It also may not become a problem in an obvious way - for example, many "broadband" offerings come with huge buffering at the far end of the link, which TCP with no sockbuf restriction will keep moderately full. I tried playing with this, and had no problem generating an RTT of 4s on my DSL link. More buffering is not always a good thing!
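(The arithmetic behind an RTT like that, with hypothetical but plausible numbers: the added delay is simply buffer size over link rate.)

    # Queueing delay added by a full buffer: buffer size / link rate.
    def queue_delay_s(buffer_bytes, link_bits_per_s):
        return buffer_bytes * 8 / link_bits_per_s

    # e.g. a 256 KB buffer in front of a 512 kbit/s DSL uplink:
    print(queue_delay_s(256 * 1024, 512000))  # ~4.1 s of added RTT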

I've been vaguely following Bob's stuff - I'm not sure that I fully buy his view (which I consider a bit ISP-centric), but he does raise issues that are worth thinking about. I did like this paper (which I probably linked to before); it talks about the problem that in the Internet there is no way to say you don't want traffic.

from: Tony Finch (fanf)
date: 29th Apr 2008 18:04 (UTC)

Yes, buffering in home broadband devices is very broken - see for example http://www.greenend.org.uk/rjk/2004/tc.html

Er, why did I think Bob Briscoe is called Ted? The general problem of traffic authorization is fundamental, and I wish there were ways of tackling it in a general way. ISPs have walled garden arrangements for infected hosts, content providers have DDoS mitigation devices, mail servers have DNS blacklists...

from: gareth_rees
date: 29th Apr 2008 18:13 (UTC)

why did I think Bob Briscoe is called Ted?

Confusion with ejb1?

from: Tony Finch (fanf)
date: 29th Apr 2008 19:44 (UTC)

Almost certainly!

from: Tony Finch (fanf)
date: 29th Apr 2008 21:54 (UTC)

That paper is interesting. I'd like to see more discussion of what might be lost by limiting the architecture to symmetrical flows. Aren't some streaming flows highly asymmetrical? It would restrict syslog to a LAN-only protocol (but in practice you're a fool to run it very far). Also I wonder why they chose to delay traffic instead of dropping it.

from: Pete (pjc50)
date: 30th Apr 2008 08:49 (UTC)

This is the number of packets in both directions, not the size of the packets?

I'd be concerned about the effects on streaming video, VoIP (depending on the timeframe over which you average) and online games - basically anything where latency is critical and the application therefore has an anti-Nagle algorithm for sending packets.

from: Tony Finch (fanf)
date: 30th Apr 2008 10:31 (UTC)

Yes, packets not bytes. (TCP works because of the ACKs, for example.) The paper argues that most protocols are, or can be made, reasonably symmetrical - e.g. unidirectional TCP has an asymmetry factor of 2. I don't know enough about streaming protocols to say how symmetrical they are, but they were the first thing I worried about. Games are an interesting example I hadn't thought of, though I'd expect them to be fairly symmetrical since they need to pass data in both directions.
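(A toy illustration of the packet-count metric: with delayed ACKs a bulk TCP receiver sends roughly one ACK per two data segments, hence the factor of 2.)

    # Packet-count asymmetry of a flow: larger direction over smaller.
    def asymmetry_factor(pkts_out, pkts_back):
        return max(pkts_out, pkts_back) / min(pkts_out, pkts_back)

    print(asymmetry_factor(1000, 500))   # 2.0 - bulk TCP with delayed ACKs
    print(asymmetry_factor(1000, 1000))  # 1.0 - a symmetric request/response flow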

from: Tony Finch (fanf)
date: 30th Apr 2008 10:54 (UTC)

So I sent a couple of questions to the authors, and I got a very helpful reply from Jon Crowcroft, who said:

note that while some streaming apps are asymmetric, compliant RTP applications send RTCP reports with a frequency sufficient to keep within the rules... and many in the IETF think that media streaming should use DCCP not UDP/RTP, in which case there are compelling reasons for reverse path traffic with a high enough frequency to do TCP-friendly rate adaption (TFRC or whatever)

it turns out feedback is an architectural principle of e2e protocols... and control theory and information theory arguments about rate adaption and integrity argue that the packet rate needs to be "symmetric" on the order of an RTT's worth (i.e. 1/n ratio where n is the one-way packet "window") so that "damage" can be repaired reasonably quickly by human or net measures - this then means that misbehaviour on the order of asymmetric packet rate is inherently a bad design or mischief...

at least this was part of our thinking...
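(An aside of mine, not part of Jon's reply: the TCP-friendly rate adaption he mentions, TFRC, is based on the throughput equation in RFC 3448; here is a sketch with illustrative parameters.)

    from math import sqrt

    # TCP-friendly rate from RFC 3448: s = segment size in bytes,
    # rtt in seconds, p = loss event rate, b = packets per ACK.
    def tfrc_rate_bps(s, rtt, p, b=1):
        t_rto = 4 * rtt  # the RFC's recommended simplification
        denom = (rtt * sqrt(2 * b * p / 3)
                 + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p))
        return s * 8 / denom

    # 1460-byte segments, 100 ms RTT, 1% loss events:
    print(tfrc_rate_bps(1460, 0.100, 0.01) / 1e6)  # ~1.3 Mbit/s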
