

Why I am not a fan of the locator / identifier split

25th May 2011 | 01:04

A scalable system's size and complexity is not constrained by the capacity of its components.

By this definition the Internet is not scalable, since every router must have a large enough routing table to handle every network address prefix. Since the Internet has scaled impressively over the last 20 years, this might seem to be a theoretical problem, but I think it is a very practical problem and we have largely become inured to its consequences.

Consider your laptop or smartphone. It has three or four network interfaces - wifi, Ethernet, cellular, Bluetooth - but it can't roam from one uplink interface to another without resetting its network stack, nor balance the network load across interfaces, and its ability to provide connectivity to other devices is restricted. You can make these features work with special hackery (NAT, mobile IP, HIP) but they aren't supported by the architecture in the way that first-class multihoming is. The Internet cannot cope with a billion small and highly-mobile networks.

The cause of the scalability problem is the addressing scheme. Each packet contains a globally unique destination address that indicates where the packet should be delivered, without any information about how to get there. Therefore every router must maintain a map of the entire address space so it knows where to forward each packet. Routing tables must be very fast as well as very large - the packet arrival interval in 10Gbit Ethernet is as little as 75ns - so backbone routers require expensive custom silicon. Maintaining the routing tables requires all significant connectivity changes to be communicated to all routers over BGP, so communication overheads scale badly as well as routing table sizes.
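The forwarding decision described above is longest-prefix match: among all table entries that cover the destination address, the most specific wins. A minimal sketch using Python's standard `ipaddress` module (the table, interface names, and linear scan are illustrative only; real backbone routers do this lookup in hardware with TCAMs or compressed tries, which is exactly why large tables are expensive):

```python
import ipaddress

# Hypothetical forwarding table: prefix -> outgoing interface.
table = {
    ipaddress.ip_network("10.0.0.0/8"): "if0",
    ipaddress.ip_network("10.1.0.0/16"): "if1",
    ipaddress.ip_network("0.0.0.0/0"): "uplink",   # default route
}

def next_hop(dst):
    """Longest-prefix match: the most specific matching route wins."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in table if addr in net]
    return table[max(matches, key=lambda net: net.prefixlen)]

print(next_hop("10.1.2.3"))   # /16 entry beats the /8
print(next_hop("10.9.9.9"))   # falls back to the /8
print(next_hop("192.0.2.1"))  # default route
```

Note that every address that might appear in a packet must be covered by some entry, which is why the table has to map the whole address space.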

The current strategy for controlling this problem relies on aggregation: the Internet registries try to allocate numerically adjacent address blocks to topologically adjacent networks, and hope that distant routers can use a single large routing table entry to span several blocks. But the need for multihoming and traffic engineering encourages de-aggregation despite the external costs this imposes. The regional Internet registries try to use their allocation policies to minimise the harm, but this has the side effect of slowing the growth of the Internet.
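Aggregation works when adjacent allocations share a common prefix, so a distant router can carry one route instead of two. A quick illustration with the standard `ipaddress` module (documentation prefixes, not real allocations):

```python
import ipaddress

# Two numerically adjacent /24s allocated to topologically adjacent
# networks can be advertised to distant routers as a single /23.
blocks = [ipaddress.ip_network("198.51.100.0/24"),
          ipaddress.ip_network("198.51.101.0/24")]
summary = list(ipaddress.collapse_addresses(blocks))
print(summary)  # one /23 covering both blocks
```

If one of those /24s later multihomes through a different provider, it must be announced separately again, undoing the saving for every router in the world: this is the de-aggregation pressure described above.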

All network engineers are aware of this problem, and there has been a lot of work on developing better routing schemes. At the academic end there have been many "Future Internet" research programmes, which have funded a lot of projects that ignore any need for backwards compatibility, often with interesting results. At the more practical end has been the work in the IETF and IRTF, especially the Routing Research Group. RFC 6115 has an overview of their work.

The most notable proposals are HIP and LISP, which both follow the locator / identifier split plan. The idea is to add a topology-independent overlay which endpoints use to identify each other. For example, identifiers are used instead of IP addresses in TCP connection tuples. Multi-homing etc. are handled by a mapping layer that translates between identifiers and locators. The locators can then be allocated much more strictly along topological lines than is currently reasonable. HIP requires upgraded endpoints with little support from the network, whereas LISP supports existing endpoints with an upgraded network. HIP puts an extra end-to-end header in packets whereas LISP is closer to the 8+8 scheme of GSE and ILNP.
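The essence of the split can be sketched in a few lines. This is a toy model, not the HIP or LISP wire format: the identifier strings, mapping table, and locator-selection policy are all invented for illustration. The point is that transport state is keyed by stable identifiers while packets are addressed by locators, so roaming or renumbering only touches the mapping:

```python
# identifier -> list of current locators (topological addresses)
mapping = {
    "id:alice-laptop": ["192.0.2.10", "203.0.113.7"],  # multihomed host
}

# Transport sessions are keyed by identifiers, never by locators,
# so they survive any change of attachment point.
sessions = {("id:me", "id:alice-laptop", 443): "established"}

def send(remote_id, port, payload):
    """Resolve the identifier to a locator at transmission time."""
    locators = mapping[remote_id]
    locator = locators[0]            # policy choice: failover, load balance...
    return (locator, port, payload)  # the packet carries the locator

# Roaming: only the mapping entry changes; session state is untouched.
mapping["id:alice-laptop"] = ["198.51.100.99"]
print(send("id:alice-laptop", 443, b"hello"))
```

The cost the article is sceptical about lives in that `mapping` dict: at Internet scale it becomes another global, distributed lookup system that has to be fast, consistent, and secure.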

I am not convinced that the ID/loc split will be a great success. It keeps the same unscalable globally unique address scheme, but duplicated, with a complicated mapping layer in between. It may, at great expense, make widespread site multihoming feasible, but I doubt that will ever be a feature of consumer connectivity let alone edge device connections.

I believe a properly scalable replacement for the Internet should be cheaper and have more features than the current Internet. However it looks like it takes at least 20 years to go from prototype system to a plausible replacement for the old network - see for example the replacement of traditional telephony with VOIP, and IPv4 with IPv6. So we are unlikely in the foreseeable future to see anything come and sweep away the accumulating ugly complication of the Internet and replace it with an elegant clean slate.


Comments (6)


handled by a mapping layer that translates between identifiers and locators

from: ewx
date: 25th May 2011 07:56 (UTC)

Sounds suspiciously like DNS but pushed down a level or two.


Tony Finch

Re: handled by a mapping layer that translates between identifiers and locators

from: fanf
date: 25th May 2011 09:10 (UTC)

Exactly how the mapping layer works is still the subject of argument, but putting it in the DNS has been one of the options...



from: pjc50
date: 25th May 2011 09:11 (UTC)

The challenge for development of plausible replacements is to prevent the insertion of antifeatures - billing, selective service degradation, censorship.


Tony Finch

from: fanf
date: 25th May 2011 10:49 (UTC)

At the moment billing happens at contractual boundaries (customer/provider, inter-AS) and it isn't applied transitively - I can't separately bill my customers' customers. (I'm talking about the network layer here; the billing structure at the application layer is a separate matter.) I think this is a reasonable and fairly unavoidable state of affairs. If transitive billing is possible, that implies the network fails to enable privacy, which I think is the important underlying feature.

Selective service degradation is also a sign of deeper security problems. Deep packet inspection only makes sense if you don't have a uniform data encryption layer. The other attack that has this effect is based on traffic analysis. I'm not sure it's possible to completely prevent this attack at a reasonable cost. Onion routing can defeat it selectively.

Censorship is a hard one. It is an attack on resource discovery, and it succeeds to the extent that we rely on gatekeepers to mediate access to services, and the extent to which the gatekeepers can be coerced into denying service. The less centralized the network, the easier it is to route around a co-opted mediator. This is mostly a layer 9 matter: if the regulatory environment encourages lively competition then it's harder to get all the providers to implement censorship. But the network architecture has to be decentralized and support a wide variety of contractual relationships in order to make the desirable layer 9 policies possible. There are similar considerations in several parts of the architecture: inter-AS routing and service naming are fundamental, but application protocols have to keep it in mind as well.

One of the ways the current architecture fails at contractual flexibility is the difficulty of changing service providers - it usually implies that you have to renumber your network. This is one of the things that makes NAT a win. The ID/loc split is supposed to address this problem - in fact it's a necessary consequence of support for multihoming.


from: thegameiam.wordpress.com
date: 25th May 2011 22:29 (UTC)

A rarely acknowledged problem with loc/ID schemes is the security and traceability problem - once I can uniquely identify a given host, I get extra insight into and around the various privacy approaches. Cookies and LSOs are downright pleasant when compared to these problems.


Tony Finch

from: fanf
date: 26th May 2011 18:02 (UTC)

I recently found the presentation below which has some interesting points to make along these lines. It certainly made me think a bit. Ben Laurie has written a lot on his blog about linkability of identities which I should probably go and re-read.


There's an asymmetry between clients and servers, in that servers must have a persistent identity and a public location in order to perform their function. Neither is necessary for clients; a client may need to have a persistent identity in order to make good use of a service, but probably not the same identity for all services. The current Internet generally has a mindset (especially amongst end-to-end fundamentalists) that every endpoint may be providing services and so is a public entity. NAT has undermined that considerably.

When you are considering a future internet in which mobile devices are first-class entities, everyone will be wandering around with several services in their pocket - think of all the LAN sharing features that desktop computers have. But these are not public services, so you need a way to publish the service's current location to just the people entitled to use it. You need a distributed private naming service. That way you can get something like the security & privacy benefits of NAT without preventing incoming connections from friendly clients.
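One conceivable shape for such a distributed private naming service - purely a sketch, with invented names, and using per-client shared secrets where a real design would want public-key cryptography and a distributed store - is a rendezvous directory where the service's current locator is only resolvable by clients who hold the right key:

```python
import hashlib
import hmac

# Hypothetical rendezvous directory: (service id, client token) -> locator.
# In a real system this would be a distributed store, not one dict.
directory = {}

def publish(service_id, locator, client_keys):
    """The service registers its current locator, once per entitled client."""
    for key in client_keys.values():
        token = hmac.new(key, service_id.encode(), hashlib.sha256).hexdigest()
        directory[(service_id, token)] = locator

def resolve(service_id, key):
    """Only a client holding a registered key can learn the locator."""
    token = hmac.new(key, service_id.encode(), hashlib.sha256).hexdigest()
    return directory.get((service_id, token))

publish("svc:my-printer", "198.51.100.7:631", {"bob": b"bob-secret"})
print(resolve("svc:my-printer", b"bob-secret"))  # entitled client succeeds
print(resolve("svc:my-printer", b"guessing"))    # anyone else learns nothing
```

When the device roams it simply re-publishes under a new locator; strangers can neither resolve the name nor observe the change, which is the NAT-like privacy property without blocking incoming connections from friends.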
