Network Working Group R. Bush Internet-Draft Internet Initiative Japan Intended status: Standards Track O. Maennel Expires: September 10, 2009 Deutsche Telekom Laboratories J. Zorz go6.si S. Bellovin Columbia University L. Cittadini Universita' Roma Tre March 9, 2009 The A+P Approach to the IPv4 Address Shortage draft-ymbk-aplusp-03 Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on September 10, 2009. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of Bush, et al. Expires September 10, 2009 [Page 1] Internet-Draft A+P Addressing Extension March 2009 publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Abstract We are facing the exhaustion of the IANA IPv4 free IP address pool. 
Unfortunately, IPv6 is not yet deployed widely enough to fully replace IPv4, and it is unrealistic to expect that this is going to change before we run out of IPv4 addresses. Letting hosts seamlessly communicate in an IPv4 world without assigning a unique globally routable IPv4 address to each of them is a challenging problem.

This draft discusses the possibility of address sharing by treating some of the port number bits as part of an extended IPv4 address (Address plus Port, or A+P). Instead of assigning a single IPv4 address to a device, we propose to extend the address by "stealing" bits from the port number in the TCP/UDP header, leaving the applications a reduced range of ports. This means assigning the same IP address to different clients (e.g., CPEs, mobile phones), each with its own port range. In the face of IPv4 address exhaustion, the need for addresses is stronger than the need to be able to address thousands of applications on a single host. If address translation is needed, the end-user should be in control of the translation process, not some smart boxes in the core.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1. Introduction
      1.1. Why Large-Scale-NATs are Harmful
   2. Design Constraints and Assumptions
      2.1. Design constraints
      2.2. Terminology
   3. Overview of the A+P Solution
      3.1. Signaling
      3.2. Address realm
      3.3. Reasons for allowing multiple A+P gateways
   4. Deployment Scenarios
      4.1. A+P for Broadband Providers
      4.2. A+P for Mobile Providers
      4.3. A+P from provider networks perspective
      4.4. Dynamic allocation of port ranges
      4.5. Example of A+P-forwarded packets
      4.6. Forwarding of standard packets
      4.7. Handling ICMP
      4.8. Limitations of the A+P approach
   5. IANA Considerations
   6. Security Considerations
   7. Acknowledgments
   8. References
      8.1. Normative References
      8.2. Informative References
   Authors' Addresses

1. Introduction

This document addresses the imminent IPv4 address space exhaustion. Very soon there will not be enough IPv4 address space to allocate to customers of broadband or mobile providers, while IPv6 is not deployed widely enough to migrate to an IPv6-only world.

Many large Internet Service Providers (ISPs) face the problem that their networks' customer edges are so large that it will soon no longer be possible to provide each customer with its own IPv4 address. Therefore ISPs have to devise something more ingenious. Although undesirable, address sharing is inevitable.

To allow end-to-end connectivity between IPv4-speaking applications we propose to "steal" some bits from the UDP/TCP header and use them for addressing devices.
Assuming we could limit the applications' port addressing to 8 (or 4) bits, we can increase the effective size of an IPv4 address by 8 (or 12) additional bits. In this scenario, 256 (or 4096) customers could be multiplexed on the same IPv4 address, while allowing each of them a fixed range of 256 (or 16) ports. Customers that require larger port ranges could dynamically request additional blocks, depending on their contract. We call this "extended addressing" or "A+P" (Address Plus Port) addressing.

The main advantage of A+P is that it preserves the Internet "end-to-end" paradigm by not translating (at least some ports of) an IP address. With NAT this end-to-end connectivity is broken. As long as the NAT runs on the customer's premises, this is a choice the customer makes. That choice disappears in the face of the looming IPv4 address exhaustion, where so-called Large-Scale-NATs (LSNs) might be deployed within the provider's network, outside the control of the customer.

1.1. Why Large-Scale-NATs are Harmful

Various forms of NATs will be installed at various levels and places in the IPv4 Internet to achieve the necessary address compression. This document argues that end-customers must not be locked behind a walled-garden shrine without any control over the translation, and that it is therefore essential to create mechanisms to "bypass" a NAT and keep control with the end-user.

"Carrier grade" is a euphemism for centralized. More semantics move to the core of the network. This is bad in and of itself. Net-heads call it "telco-think" because it is the telco model of smarts in the core as opposed to the Internet model of a simple, just-forward-packets core with smart edges. It also puts the provider in a position where the user is trapped behind unchangeable applications and policies. This is the opposite of the "end-to-end" model of the Internet.

With the smarts at the edges, one can easily field new protocols between consenting end-points by "just" tweaking the NATs at the corresponding Customer Premises Equipment (CPE), even adding application layer gateways (ALGs) if they are needed. However, LSNs do not build an Internet walled garden at the edges; they build it by restricting the core.

With LSNs in the core, customers wanting new application protocols which require cooperation from the NAT have to beg help from the broadband providers' engineers and lawyers, and from all other users of the large-scale NATs. It is feared that all new application protocols have to go through the carrier-loving lawyers to be allowed to be handled by the NATs in their core.

Today's NATs are typically mitigated by ALGs over which the customer has some degree of control, e.g., port forwarding or UPnP. However, this is not expected to work anymore with LSNs. LSN proposals admit that applications that require specific port assignment or port mapping from the NAT box are not expected to keep working [I-D.durand-softwire-dual-stack-lite]. This is the ultimate horror the NAT-haters fear, and, in this case, they are not all that wrong. We believe this is not an option and that the end-user must have the ability to control its own ALGs.

So, if someone wants to deploy a new application, they can talk to the broadband providers' lawyers or run new disruptive technology over HTTP; we can pick our poison. And if the NAT is not where the customer can directly control it, i.e., it is anywhere back in the provider's network, then the provider controls what the user can control, i.e., it is not really under user control. We do not wish to deal with the case where the provider has to decide whether to allow Skype v42 when they themselves provide a competing VoIP product.

And remember that as IPv6 deploys, if we want to have one Internet, i.e.
IPv4 nodes talking freely with IPv6 nodes, then translation must be done somewhere. The challenge is to figure out a scheme that does this for these large networks. We believe it should be done at the customer edge, not in the core.

Another issue with LSNs is scalability. ISPs face a tension: placing LSNs deep within their network aggregates as many customers as possible, but too much aggregation creates a massive state problem. To reduce the state, the placement ends up somewhere closer to the edge, where the benefits are somewhat limited. It is not clear how an LSN should maintain per-session state in a scalable manner. State for improperly terminated sessions could remain stale for some time. The LSN hence trades scalability for the amount of state that needs to be kept, which makes optimally placing an LSN a hard engineering problem.

In addition, NATs frequently need to initiate translation for secondary port numbers. This may be a decision based on packet inspection (e.g., looking for PORT commands in FTP [RFC0959] sessions), or it may rely on explicit signaling from the end host via protocols such as UPnP. Either way, LSNs pose a security threat and/or an administrative nightmare. The issue is proper authentication of such requests. Most UPnP devices do not implement appropriate security features. Even if they did, there would be no way to administer the security mechanism. Every end-user device would have to have a secret corresponding to some authentication field in the LSN. End users will not set these up properly; providers do not want to maintain such a database. Decisions made based on packet inspection are just as problematic. A request from one customer could easily ask to open a port for another customer's addresses, similar to the Java-based attack described by Martin et al. in [Martin-Java].
Furthermore, with LSNs, tracing hackers, spammers and other criminals will be impossible unless all the connection-based mapping information is recorded and stored. This would not only cause concern for law enforcement services, but also for privacy advocates.

2. Design Constraints and Assumptions

The problem of address space shortage is first felt by providers with a very large end-user customer base, such as broadband providers and mobile-service providers. Though the cases and requirements are slightly different, they share many commonalities. In the following we develop a set of overall design constraints.

2.1. Design constraints

We regard several constraints as important for our design:

1) End-to-end is under customer control. Customers shall have the possibility to send and receive packets unmodified and to deploy new application protocols at will. IPv4 address exhaustion does not justify breaking the Internet's end-to-end paradigm.

2) End-to-end transparency through multiple intermediate devices. Multiple gateways should be able to operate in sequence along one data path without interfering with each other.

3) Incremental deployability and backward compatibility. The approaches shall be transparent to unaware users. Devices or existing applications shall be able to work without modification. Emergence of new applications shall not be limited.

4) Automatic configuration/administration. There should be no need for customers to call the ISP and tell them that they are operating their own A+P-gateway devices. Customers and mobile phone users are NOT supposed to look up assigned ports manually on websites and then configure them on devices or applications.

5) "Double-NAT" shall be avoided. Based on Constraint 2, multiple gateway devices might be present in a path, and once one has done some translation, those packets should not be re-translated.

6) Legal traceability.
ISPs must be able to provide the identity of a customer from the knowledge of the IPv4 public address and the port. This should have the lowest possible impact on storage and on the ISP. We assume that NATs on customer premises do not pose much of a problem, while provider NATs need to keep additional logs.

7) IPv6 deployment should be encouraged. While we acknowledge that A+P works in an IPv4-only environment (e.g., [I-D.boucadair-port-range]), we strongly believe that IPv6 is the long-term solution to the problem, and that A+P should be considered only as an intermediate hack towards an IPv6-only world. We therefore assume in Constraint 7 that the ISP has migrated to a dual-stack core and that A+P can use IPv6 as a transport inside the network. This ensures that A+P will not be a hindrance to the introduction of IPv6.

Constraints 2 and 5 are important: while many techniques have been deployed to allow applications to work through a NAT, traversing cascaded NATs is crucial if NATs are being deployed in the core of a provider network.

2.2. Terminology

The A+P idea can be split into three distinct functionalities: encaps/decaps, NAT, and signaling.

Encaps/decaps functionality: is used to forward port-restricted A+P packets over intermediate legacy devices. The encapsulation functionality takes an IPv4 packet, looks up the IP and TCP/UDP headers, and puts the packet into the appropriate tunnel. The state needed to perform this action is comparable to a forwarding table. The decapsulation device SHOULD check if the source address and port of packets coming out of the tunnel are legitimate (e.g., see [BCP38]). Based on the result of such a check, the packet MAY be forwarded untranslated, MAY be discarded, or MAY be NATed.

Network Address Translation (NAT) functionality: is used to connect legacy end-hosts.
Unless upgraded, end-hosts or end-systems are not aware of A+P restrictions and therefore assume a full IP address. The NAT functionality performs any address or port translation, including application-level gateways (ALGs). The state that has to be kept to implement this functionality is the mapping between external addresses and ports and internal addresses and ports.

Signaling functionality: is used to let A+P-aware devices learn which ports are assigned to be passed through untranslated and what will happen to packets outside the assigned port range (e.g., they could be NATed or discarded). In addition, the signaling functionality is used to dynamically increase or decrease the requested port range.

A+P address realm: a public routable IPv4 address that is port restricted (A+P). Forwarding of packets is done based on the IPv4 address and the TCP/UDP port numbers. When this draft talks about "A+P packets" it is assumed that those packets pass untranslated.

Private address realm: IPv4 addresses that are not globally routed. Ideally they should be taken from the [RFC1918] range. However, this draft does not make such an assumption. We regard as private address space any IPv4 address which needs to be translated in order to gain global connectivity, irrespective of whether it falls in [RFC1918] space or not.

3. Overview of the A+P Solution

The core architectural elements of the A+P solution are three separate and independent functionalities: the NAT functionality, the encaps/decaps functionality, and the signaling functionality.

The NAT functionality is similar to a NAT as we know it today: it performs a translation between two different address realms. When the external realm is public IPv4 address space, we assume that the translation is many-to-one, in order to multiplex many customers on a single public IPv4 address.
The only difference with a traditional NAT (Figure 1) is that the translator might only be able to use a restricted range of ports when mapping multiple internal addresses onto an external one, e.g., the external address realm might be port-restricted.

         "internal-side"        "external-side"
                        +-----+
            internal    |  N  |    external
            address <---|  A  |---> address
             realm      |  T  |     realm
                        +-----+

                    Traditional NAT

                        Figure 1

The encaps/decaps functionality, on the other hand, consists of the capability of establishing a tunnel with another endpoint providing the same functionality. This implies some form of signaling to establish a tunnel. This can be viewed as integrated with DHCP or as a separate service. Section 3.1 discusses the constraints of this signaling function. The established tunnel can be an IPv6 encapsulation, a layer-2 tunnel, or some other form of softwire. Note that the presence of a tunnel allows for intermediate legacy devices between the two endpoints.

Two or more devices which provide the encaps/decaps functionality and are linked by tunnels form an A+P subsystem. The function of each gateway is to encapsulate and decapsulate, respectively. Figure 2 depicts the simplest possible A+P subsystem, that is, two devices providing the encaps/decaps functionality.

                   +------------------------------------+
   port-restricted | +----------+  tunnel  +----------+ |  external
   address realm --|-| gateway  |==========| gateway  |-|-- address
                   | +----------+          +----------+ |   realm
                   +------------------------------------+
                               A+P subsystem

                          A simple A+P subsystem

                                 Figure 2

Within an A+P subsystem, the external address realm is extended by "stealing" bits from the port number. Each device is assigned one address from the external realm and a range of port numbers. Hence, devices which are part of an A+P subsystem can communicate with the external realm without the need for address translation (i.e., preserving end-to-end packet integrity): an A+P packet originated from within the A+P subsystem can be simply forwarded over tunnels up to the endpoint, where it gets decapsulated and routed in the external realm.

On the other hand, packets that originate from outside the A+P subsystem need to be translated, since they belong to different realms. For this reason, at least one of the two edges of the A+P subsystem MUST provide the NAT functionality. It is up to the provider to trade off the placement of the NAT functionality. Hence, the design of A+P is deliberately agnostic to where packets in transit will be translated, provided that the translation happens exactly once (Constraint 5).

3.1. Signaling

The following information needs to be available on all the gateways in the A+P subsystem. We propose to deploy a signaling protocol such as [I-D.boucadair-dhc-port-range] or [I-D.bajko-v6ops-port-restricted-ipaddr-assign]. The information that needs to be shared is the following:

   o a set of public IPv4 addresses,

   o for each IPv4 address, a set of allocated port ranges (port-set),

   o the tunneling technology to be used (e.g., "IPv6-encapsulation"),

   o addresses of the tunnel endpoints (e.g., IPv6 address of tunnel endpoints),

   o whether or not NAT functionality is provided by the gateway,

   o a device identification number and some authentication mechanisms,

   o a version number and some reserved bits for future use.

Note that the functions of encapsulation and decapsulation have been separated from the NATing functionality. However, to accommodate legacy hosts, NATing must be provided at some point in the path; therefore the availability or absence of NATing must be communicated in the signaling, as A+P is agnostic about NAT placement.

3.2. Address realm

Each gateway within the A+P subsystem manages a certain portion of A+P address space, that is, a portion of IPv4 space which is extended by borrowing bits from the port number. This address space may be a single, port-restricted IPv4 address. The gateway MAY use its managed A+P address space for several purposes:

   o Allocate a sub-portion of the A+P address space to other authenticated A+P gateways in the A+P subsystem (referred to as delegation). We call the allocated sub-portion delegated address space.

   o Exchange (untranslated) packets with the external address realm. For this to work, such packets MUST use a source address and port belonging to the non-delegated address space.

Note that if the gateway is also capable of performing the NAT functionality, it MAY translate packets arriving on an internal interface which are outside of its managed A+P address space into non-delegated address space.

An A+P gateway ("A") accepts incoming connections from other A+P gateways ("B"). Upon connection establishment (provided appropriate authentication), B would "ask" A for delegation of an A+P address. In turn, A will inform B about its public IPv4 address, and will delegate a portion of its port range to B. In addition, A will also negotiate the encaps/decaps functionality with B (e.g., let B know the address of the decaps device, i.e., the other endpoint of the tunnel). This could be implemented, for example, via a DHCP-like solution. In general the following rule applies: a sub-portion of the managed A+P address space is delegated as long as devices below ask for it; otherwise private IPv4 address space is provided to support legacy hosts.

   private      +-----+          +-----+      public
   address   ---|  B  |==========|  A  |---  Internet
   realm        +-----+          +-----+

   Address space realm of A:
      public IPv4 address = 12.0.0.1
      port range = 0-65535

   Address space realm of B:
      public IPv4 address = 12.0.0.1
      port range = 2560-3071

                        Figure 3

Figure 3 illustrates a sample configuration. Note that A could actually consist of three different devices: one that handles signaling requests from B; one that performs encapsulation and decapsulation; and, if provided, one that performs the NATing functionality (e.g., an LSN). Packet forwarding is assumed to work as follows.

In the "outbound" case, a packet arrives from the private address realm at B. As stated above, B has two options: it can either apply or not apply the NAT function. The decision depends upon the specific configuration and/or the capabilities of A and B. Note that NAT functionality is required to support legacy hosts; however, this can be done at either of the two devices A or B. The term NAT refers to translating the packet into the managed A+P address (B has address 12.0.0.1 and ports 2560-3071 in the example above). We then have two options:

1) B NATs the packet. The translated packet is then tunneled to A. A recognizes that the packet has already been translated, because the source address and port match the allocated information. A decapsulates the packet and releases it into the public Internet.

2) B does not NAT the packet. The untranslated packet is then tunneled to A. A recognizes that the packet has not been translated, so A forwards the packet to a co-located NATing device, which translates the packet and routes it in the public Internet. This device, e.g., an LSN, has to store the mapping between the source port used to NAT and the tunnel where the packet came from, in order to correctly route the reply. Note that A cannot use a port number from the range that has been delegated to B.
As a consequence, A has to assign a part of its non-delegated address space to the NATing functionality.

"Inbound" packets are handled in the following way: a packet from the public realm arrives at A. A analyzes the destination port number to understand whether the packet needs to be NATed or not.

1) If the destination port number belongs to the range that A delegated to B, then A tunnels the packet to B. B can now NAT the packet using its stored mapping and forward the translated packet in the private domain.

2) If the destination port number is from the address space of the LSN, then A passes the packet on to the co-located LSN, which uses the stored mapping to NAT the packet into the private address realm of B. The appropriate tunnel is stored in that mapping as well. The LSN then encapsulates the packet to B, which decapsulates it and routes it normally within its private realm.

3) Finally, if the destination port number falls neither in a delegated range nor in the address range of the LSN, A discards the packet. If the packet is passed to the LSN but no mapping can be found, the LSN discards the packet.

3.3. Reasons for allowing multiple A+P gateways

Since each device in the A+P subsystem provides the encaps/decaps functionality, new devices can establish tunnels and in turn become part of the A+P subsystem. As noted above, being part of the A+P subsystem implies the capability of talking to the external address realm without any translation. In particular, as described in the previous section, a device X in the A+P subsystem can be reached from the external domain by simply using the public IPv4 address and a port which has been delegated to X. Figure 4 shows an example where three devices are connected in a chain.
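
The inbound handling rules of Section 3.2 amount to a dispatch on the destination port. The following Python fragment is a minimal, non-normative sketch of that dispatch at gateway A. The class `Gateway`, its field names, and the returned action strings are hypothetical names introduced here for illustration only; real LSN state would also hold the internal address and port, which is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

PortRange = Tuple[int, int]  # inclusive (low, high)


@dataclass
class Gateway:
    """Hypothetical model of gateway A in Figure 3 (illustration only)."""
    delegated: Dict[str, PortRange]   # tunnel name -> port range delegated over it
    lsn_range: Optional[PortRange]    # non-delegated ports reserved for a co-located LSN
    lsn_mappings: Dict[int, str]      # external port -> tunnel (simplified LSN state)

    def dispatch_inbound(self, dst_port: int) -> str:
        # Case 1: the port was delegated downstream, so the packet is
        # tunneled untranslated to the gateway that owns the range.
        for tunnel, (low, high) in self.delegated.items():
            if low <= dst_port <= high:
                return f"tunnel:{tunnel}"
        # Case 2: the port belongs to the LSN, which translates the packet
        # if (and only if) it holds a mapping for it.
        if self.lsn_range and self.lsn_range[0] <= dst_port <= self.lsn_range[1]:
            tunnel = self.lsn_mappings.get(dst_port)
            if tunnel is None:
                return "discard"
            return f"nat-then-tunnel:{tunnel}"
        # Case 3: neither delegated nor owned by the LSN.
        return "discard"


# Delegated range 2560-3071 is taken from Figure 3; the LSN range
# 40000-49999 and the mapping for port 40001 are arbitrary assumptions.
gw = Gateway(
    delegated={"B": (2560, 3071)},
    lsn_range=(40000, 49999),
    lsn_mappings={40001: "B"},
)
print(gw.dispatch_inbound(2600))   # tunneled untranslated to B
print(gw.dispatch_inbound(40001))  # translated by the LSN, then tunneled to B
print(gw.dispatch_inbound(80))     # discarded
```

In this sketch the delegated ranges are checked first, which mirrors the requirement that A must not use ports delegated to B for its own NAT function.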
In other words, A+P signaling can be used to extend end-to-end connectivity to the devices which are in the A+P subsystem. This allows A+P-aware applications (or OSes) running on end hosts to enter the A+P subsystem and exploit untranslated connectivity.

There are two modes for end-hosts to gain end-to-end connectivity. The first one is having end-hosts perform the NAT function (along with the encaps/decaps function which is required to join the A+P subsystem). This option works in a similar way to the NAT-in-the-host trick employed by virtualization software such as VMware, where the guest operating system is connected via a NAT to the host operating system. The second mode is for applications to autonomously ask for an A+P address and use it to join the A+P subsystem. This capability is necessary for some applications that require end-to-end connectivity (e.g., applications that need to be contacted from outside).

            +---------+      +---------+      +---------+
   internal | gateway |      | gateway |      | gateway | external
   realm  --|    1    |======|    2    |======|    3    |-- realm
            +---------+      +---------+      +---------+

              An A+P subsystem with multiple devices

                            Figure 4

Whatever the reasons might be, the Internet was built on the paradigm that end-to-end connectivity is possible. A+P keeps this possible at a time when address shortage forces ISPs to use NATs at various levels. In this sense, A+P can be regarded as a way to bypass NATs.

   +---+              (customer2)
   |A+P|-*               +---+
   +---+  \           NAT|A+P|-*
           \             +---+ |
            \                  |    forward if in-range
   +---+     \+---+          +---+ /
   |A+P|------|A+P|----------|A+P|----
   +---+     /+---+          +---+ \
            /                       NAT if necessary
           / (cust1)        (prov.  (e.g., provider NAT)
   +---+  /                 router)
   |A+P|-*
   +---+

                  A complex A+P subsystem

                         Figure 5

Figure 5 depicts a complex scenario, where the A+P subsystem is composed of multiple devices organized in a hierarchy.
Each A+P gateway decapsulates the packet and then re-encapsulates it into the next tunnel.

A packet can be NATed either when it enters the A+P subsystem, or at intermediate devices, or when it exits the A+P subsystem. The exit point could be, for example, a gateway installed within the provider's network, together with an LSN (a large-scale NAT operated by the provider). Each customer then operates its own CPE. However, behind the CPE, applications might also be A+P-aware and run their own A+P gateways, which enables them to have end-to-end connectivity.

One limitation applies if "delayed translation" is used (e.g., translation at the LSN instead of the CPE). If devices using "delayed translation" want to talk to each other they SHOULD use A+P addresses or out-of-band addressing.

4. Deployment Scenarios

4.1. A+P for Broadband Providers

Large broadband providers do not have enough IPv4 address space to provide every customer with its own IP address. The natural solution is sharing a single IP address among many customers. Multiplexing customers is usually accomplished by allocating different port numbers to different customers somewhere within the network of the provider.

In this document we use the following terms and assumptions:

   1. Customer Premises Equipment (CPE), i.e., cable/DSL modem.

   2. Provider Edge Router (PE), also known as customer aggregation router.

   3. Port Range Router (PRR), the edge behind which A+P addresses are used.

   4. Provider Border Router (BR), the provider's edge to other providers.

   5. Network Core Routers (Core), provider routers which are not at the edge.

It is expected that the CPE can be upgraded or replaced to support the A+P encaps/decaps functionality. Ideally the CPE also provides NATing functionality. Further, it is expected that at least one other component in the ISP network provides the same functionality, and hence is able to establish an A+P subsystem with the CPE.
This device is referred to as the A+P border router or port-range router (PRR), and could be located close to the PE router. The core of the network MUST support the tunneling protocol (which SHOULD be IPv6, as per Constraint 7). In addition, we do not want to restrict any initiative of customers who might want to run an A+P-capable network behind their CPE. To satisfy both Constraints 1 and 3, unmodified legacy hosts should keep working seamlessly, while upgraded or new end-systems should be given the opportunity to exploit enhanced features.

4.2. A+P for Mobile Providers

In the case of a mobile service provider the situation is slightly different. The A+P border is assumed to be the gateway (e.g., GGSN/PDN GW of 3GPP, or ASN GW of WiMAX). The need to extend the address is not within the provider network, but on the edge between the mobile phone devices and the base station. While desirable, IPv6 connectivity may or may not be providable.

For mobile providers we use the following terms and assumptions:

   1. Provider Network (PN)

   2. Gateway (GW)

   3. Mobile Phone device (phone)

   4. Devices behind the phone, e.g., a laptop computer connecting via the phone to the Internet.

We expect that the gateway has many IPv4 addresses and is always in the data path of the packets. Transport between the gateway and phone devices is assumed to be an end-to-end layer-2 tunnel. We assume that the phone as well as the gateway can be upgraded to support A+P. However, some applications running on the phone or on devices behind the phone (such as laptop computers connecting via the phone) are not necessarily expected to be upgraded. Again, while we do not expect that devices behind the phone will be A+P-aware or upgraded, we also do not want to hinder their evolution.
In this sense the mobile phone is comparable to the CPE in the broadband provider case, and the gateway to the PRR/LSN box in the network of the broadband provider.

4.3. A+P from provider networks perspective

ISPs suffering from IPv4 address space exhaustion are interested in achieving a high address space compression ratio. In this respect, an A+P subsystem allows much more flexibility than traditional NATs: the NAT can be placed at the customer, in the provider network, or both. In addition, hosts or applications can request ports and thus have untranslated end-to-end connectivity.

             +------------------------------+
   private   | +------+  A+P-in  +--------+ | dual-stacked
 (RFC1918) --|-| CPE  |==-IPv6-==|  PRR   |-|-- network
    space    | +------+  tunnel  +--------+ | (public addresses)
             |    ^              +--------+ |
             |    | IPv6-only    |  LSN   | |
             |    | network      +--------+ |
             +----+----------------- ^ -----+
                  |                  |
             on customer       within provider
         premises and control      network

                A simple A+P subsystem example

                           Figure 6

Consider the deployment scenario in Figure 6, where an A+P subsystem is formed between the CPE and a port-range router (PRR) within the ISP core network. The PRR is placed somewhere within the ISP's network, preferably close to the customer edge, and forms the border from which packets are forwarded based on address and port. The provider MAY deploy an LSN co-located with the PRR: in this case packets that have not been translated by the CPE will be handed to the LSN and NATed. In such a configuration, the ISP allows the customer to freely decide whether the translation is done at the CPE or at the LSN. In order to establish the A+P subsystem, the CPE will be configured automatically (e.g., via a signaling protocol that conforms to the requirements stated above).

Note that the CPE in the example above is only provisioned with an IPv6 address on the external interface.

   +------------ IPv6-only transport ------------+
   | +---------------+ |              |          |
   | |A+P-application| |  +--------+  |  +-----+ | dual-stacked
   | | on end-host   |=|==| CPE w/ |==|==| PRR |-|-- network
   | +---------------+ |  +--------+  |  +-----+ | (public addresses)
   +---------------+   |  +--------+  |  +-----+ |
     private IPv4 <-*--+->|  NAT   |  |  | LSN | |
     address space   \ |  +--------+  |  +-----+ |
     for legacy       +|--------------|----------+
     hosts             |              |
                       |              |
     end-host with     |  CPE device  |  provider
     upgraded          |  on customer |  network
     application       |  premises    |

   An extended A+P subsystem with end-host running A+P-aware
                          applications

                           Figure 7

Figure 7 shows an example of how an upgraded application running on a legacy end-host can connect. The legacy host is provisioned with a private IPv4 address allocated by the CPE. Any packet sent from the legacy host will be NATed either at the CPE (if configured to do so), or at the LSN (if available).

An A+P-aware application running on the end-host MAY use the signaling described in Section 3.1 to connect to the A+P subsystem. The application will then be delegated some space in the A+P address realm, and will be able to contact the external realm (i.e., the public Internet) without the need for translation.

Note that, as part of A+P signaling, the NATs are optional. However, if neither the CPE nor the PRR provides NATing functionality, then it will not be possible to connect legacy end-hosts.

4.4. Dynamic allocation of port ranges

Allocating the same fixed-size range of ports to every CPE may exhaust the ports that the NAT in a CPE needs in order to operate: when a customer has several hosts behind the CPE and uses NAT to communicate with the Internet, any given restricted range of allocated ports might become exhausted. This is a perfect recipe for upsetting the more demanding customer.
A mechanism for dynamic allocation of port ranges allows the ISP to achieve two goals: a better compression ratio (number of customers per IPv4 address) and, at the same time, no limit on the communication of the more demanding customers to and from the Internet.

The following mechanism applies to the NAT functionality in the CPE only. If a customer has an arrangement with the ISP for well-known ports (WKP), and the PRR allocates the WKP range to this CPE, this range is used for end-to-end communication with a server behind the CPE that has a public IP address or, if the customer configures it so, for inbound NAT (1:1 or port forwarding). This function has a fixed range of ports and is not considered in the dynamic allocation mechanism.

On the other hand, if the customer configures the NAT function to access the Internet from a private address pool behind the CPE, this mechanism is applied automatically. The NAT already keeps track of its translation tables, so only a small "daemon" needs to be developed and implemented by the CPE manufacturer to keep track of the allocated port ranges and how many ports are in use. When usage reaches 90%, the dynamic allocation daemon signals to the PRR the need for additional ports.

A downside of this mechanism is that the port allocation of a CPE might grow quite large without an additional mechanism that returns unused port ranges to the PRR's pool. This may be fixed by forcing the NAT to allocate ports for translation sequentially and to reuse released ports for new requests. The use of ports is then controlled, and unfragmented ranges can be returned to the pool. Another, not so pretty, way is to reset the additional allocations to 0 every 24 hours and leave only the first allocation. Additional allocations would be requested again by the mechanism within a very short time, so the customer is unlikely to notice the event.
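The CPE-side bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions, not part of the specification: the class name `PortPool`, its methods, and the use of inclusive `(first, last)` tuples are hypothetical, and the actual signaling toward the PRR is left out. Only the 90% trigger and the idea of sequential allocation with reuse of released ports come from the text.

```python
from dataclasses import dataclass, field

USAGE_THRESHOLD = 0.90  # from the text: ask for more ports at 90% usage


@dataclass
class PortPool:
    """Hypothetical bookkeeping for one CPE: delegated ranges and ports in use."""
    ranges: list = field(default_factory=list)  # (first, last), inclusive; initial range first
    in_use: set = field(default_factory=set)

    def capacity(self) -> int:
        return sum(last - first + 1 for first, last in self.ranges)

    def usage(self) -> float:
        cap = self.capacity()
        return len(self.in_use) / cap if cap else 1.0

    def needs_more_ports(self) -> bool:
        """True when the daemon should signal the PRR for an additional range."""
        return self.usage() >= USAGE_THRESHOLD

    def allocate(self):
        """Sequential allocation: take the first free port, oldest range first.

        Released ports are reused before later ranges are touched, which
        keeps additional ranges empty for as long as possible.
        """
        for first, last in self.ranges:
            for port in range(first, last + 1):
                if port not in self.in_use:
                    self.in_use.add(port)
                    return port
        return None  # pool exhausted

    def release(self, port: int) -> None:
        self.in_use.discard(port)

    def returnable_ranges(self) -> list:
        """Additional ranges (all but the initial one) with no port in use."""
        return [
            (first, last)
            for first, last in self.ranges[1:]
            if not any(first <= p <= last for p in self.in_use)
        ]
```

A daemon built on such a structure would poll `needs_more_ports()` after each new NAT binding and periodically hand the result of `returnable_ranges()` back to the PRR. The linear scan in `allocate()` is chosen for clarity only.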
The mechanism prefers to allocate additional port ranges from the same IP address as the initial allocation. If it is not possible to allocate an additional port range from the same IP address, then the mechanism can allocate a port range from another IP address within the same subnet. With every additional port range allocation, the PRR updates its routing table and sends packets arriving for the allocated ports on that IP address into the tunnel that ends at the CPE which requested and was allocated that additional port range.

The mechanism for allocating additional port ranges may be part of the normal signaling that is used to authenticate the CPE to the ISP.

The ISP controls the dynamic allocation of port ranges by the PRR by setting the initial allocation size and the maximum number of allocations per CPE, or the maximum number of allocations per subscription, depending on the subscription level. There is a general observation that the more demanding customer uses around 1024 ports when communicating heavily. So, for example, a first suggestion would be 512 ports initially, followed by dynamic allocations of ranges of 512 ports, up to a maximum of 6 additional allocations. The maximum number of allocations should prevent a single customer from acting in a destructive manner, for instance if they become infected. The maximum number of allocations can also be fine-tuned with a parameter that limits how many allocations a user can request per time frame. If this is used, misbehaving applications are limited in the damage they can do; for example, one additional allocation per minute would considerably slow down a storm of port requests.

Note that there is no minimum request size. This is because A+P-aware applications running on end-hosts MAY request a single port (or a few ports) for the CPE to be contacted on (e.g., VoIP clients register a public IP address and a single delegated port from the CPE, and accept incoming calls on that port).
The implementation on the CPE or PRR will dictate how to handle such requests for smaller blocks. For example, half of the available blocks might be used for "block allocations", 1/6 for single-port requests, and the rest for NATing.

4.5. Example of A+P-forwarded packets

This section provides a detailed example of A+P setup and configuration, and of how packets flow from an end-host behind an A+P-upgraded provider to any host in the IPv4 Internet and how the return packets flow back. The following example discusses the situation of an A+P-unaware end-host, where the NATing is done at the CPE.

Figure 8 illustrates how the CPE receives an IPv4 packet from the end-user device. We first describe the case where the CPE has been configured to provide the NAT functionality (e.g., by the customer through a website, or via automatic signaling). In the following, we call a packet which is translated at the CPE an A+P-forwarded packet, in analogy with the port-forwarding function employed in today's CPEs. Upon receiving a packet from the internal interface, the CPE NATs it and forwards it to the PRR. The NAT on the CPE is assumed to store the 5-tuple (source_IPv4, source_port, destination_IPv4, destination_port, tunnel-interface).

When the PRR receives the A+P-forwarded packet, it decapsulates the inner IPv4 packet and checks the source address and port. If the source address and port match the CPE's A+P address, then the PRR simply routes the inner IPv4 packet unmodified. This is always the case for A+P-forwarded packets. Otherwise, the PRR assumes that the packet is not A+P-forwarded and passes it to the LSN function, which in turn NATs the packet and then releases it into the Internet. Figure 8 shows the packet flow for an outgoing A+P-forwarded packet.

                    +-----------+
                    |   Host    |
                    +-----+-----+
                       |  |  10.0.0.2
       IPv4 datagram 1 |  |
                       |  |
                       v  |  10.0.0.1
                +---------|---------+
                |CPE      |         |
                +--------|||--------+
                       | |||  a::2
                       | |||  12.0.0.3 (100-200)
        IPv6 datagram 2| |||
                       | |||<-IPv4-in-IPv6
                       | |||
                  -----|-|||-------
                /      | |||        \
               | ISP access network  |
                \      | |||        /
                  -----|-|||-------
                       | |||
                       v |||  a::1
                +--------|||--------+
                |PRR     |||        |
                +---------|---------+
                       |  |  12.0.0.1
       IPv4 datagram 3 |  |
                  -----|--|--------
                /      |  |         \
               |    ISP network /    |
                \     Internet      /
                  -----|--|--------
                       |  |
                       v  |  128.0.0.1
                    +-----+-----+
                    | IPv4 Host |
                    +-----------+

      Figure 8: Forwarding of Outgoing A+P-forwarded Packets

+-----------------+--------------+-----------------------------+
| Datagram        | Header field | Contents                    |
+-----------------+--------------+-----------------------------+
| IPv4 datagram 1 | IPv4 Dst     | 128.0.0.1                   |
|                 | IPv4 Src     | 10.0.0.2                    |
|                 | TCP Dst      | 80                          |
|                 | TCP Src      | 8000                        |
| --------------- | ------------ | --------------------------- |
| IPv6 datagram 2 | IPv6 Dst     | a::1                        |
|                 | IPv6 Src     | a::2                        |
|                 | IPv4 Dst     | 128.0.0.1                   |
|                 | IPv4 Src     | 12.0.0.3                    |
|                 | TCP Dst      | 80                          |
|                 | TCP Src      | 100                         |
| --------------- | ------------ | --------------------------- |
| IPv4 datagram 3 | IPv4 Dst     | 128.0.0.1                   |
|                 | IPv4 Src     | 12.0.0.3                    |
|                 | TCP Dst      | 80                          |
|                 | TCP Src      | 100                         |
+-----------------+--------------+-----------------------------+

                   Datagram header contents

An incoming packet undergoes the reverse process. When the PRR receives an IPv4 packet on an external interface, it first checks whether the destination address and port number fall in a delegated range. If they do, the PRR tunnels the packet unmodified. If they do not, the packet is handed to the LSN, which checks whether a mapping is available.
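The PRR decisions described in this section, for both directions, can be sketched as below. This is a minimal illustration, not a definitive implementation: the names `Delegation`, `PortRangeRouter`, `classify_outgoing` and `classify_incoming` are hypothetical, and checking the tunnel source against the delegation is an assumption about how "the CPE's A+P address" is determined. The addresses and the port range 100-200 are the ones used in the example figures.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    """One A+P delegation: public IPv4 address, port range, CPE tunnel endpoint."""
    address: IPv4Address
    first_port: int
    last_port: int
    tunnel_endpoint: IPv6Address

    def covers(self, address: IPv4Address, port: int) -> bool:
        return address == self.address and self.first_port <= port <= self.last_port


class PortRangeRouter:
    """Hypothetical PRR lookup table; the LSN itself is not modeled."""

    def __init__(self, delegations):
        self.delegations = list(delegations)

    def lookup(self, address: IPv4Address, port: int) -> Optional[Delegation]:
        for delegation in self.delegations:
            if delegation.covers(address, port):
                return delegation
        return None

    def classify_outgoing(self, tunnel_src, src_addr, src_port) -> str:
        """Inner packet just decapsulated from a CPE tunnel."""
        delegation = self.lookup(src_addr, src_port)
        # Assumption: the source must match the range delegated to *this* CPE.
        if delegation is not None and delegation.tunnel_endpoint == tunnel_src:
            return "route"  # A+P-forwarded: route the inner packet unmodified
        return "lsn"        # not translated by the CPE: hand to the LSN

    def classify_incoming(self, dst_addr, dst_port):
        """Packet arriving from the Internet on an external interface."""
        delegation = self.lookup(dst_addr, dst_port)
        if delegation is not None:
            return ("tunnel", delegation.tunnel_endpoint)  # tunnel unmodified
        return ("lsn", None)  # let the LSN look for a mapping


# Values taken from the example: 12.0.0.3, ports 100-200, CPE tunnel end a::2.
prr = PortRangeRouter(
    [Delegation(IPv4Address("12.0.0.3"), 100, 200, IPv6Address("a::2"))]
)
```

With this table, the translated source 12.0.0.3:100 arriving from a::2 is routed as is, while an untranslated source such as 10.0.0.2:8000 is handed to the LSN. A production PRR would use a longest-match or hashed structure rather than a linear scan.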
Figure 9 shows how an incoming packet is forwarded, under the assumption that the port number matches the port range which was delegated to the CPE.

                    +-----------+
                    |   Host    |
                    +-----+-----+
                       ^  |  10.0.0.2
       IPv4 datagram 3 |  |
                       |  |
                       |  |  10.0.0.1
                +---------|---------+
                |CPE      |         |
                +--------|||--------+
                       ^ |||  a::2
                       | |||  12.0.0.3 (100-200)
        IPv6 datagram 2| |||
                       | |||<-IPv4-in-IPv6
                       | |||
                  -----|-|||-------
                /      | |||        \
               | ISP access network  |
                \      | |||        /
                  -----|-|||-------
                       | |||
                       | |||  a::1
                +--------|||--------+
                |PRR     |||        |
                +---------|---------+
                       ^  |  12.0.0.1
       IPv4 datagram 1 |  |
                  -----|--|--------
                /      |  |         \
               |    ISP network /    |
                \     Internet      /
                  -----|--|--------
                       |  |
                       |  |  128.0.0.1
                    +-----+-----+
                    | IPv4 Host |
                    +-----------+

      Figure 9: Forwarding of Incoming A+P-forwarded Packets

+-----------------+--------------+-----------------------------+
| Datagram        | Header field | Contents                    |
+-----------------+--------------+-----------------------------+
| IPv4 datagram 1 | IPv4 Dst     | 12.0.0.3                    |
|                 | IPv4 Src     | 128.0.0.1                   |
|                 | TCP Dst      | 100                         |
|                 | TCP Src      | 80                          |
| --------------- | ------------ | --------------------------- |
| IPv6 datagram 2 | IPv6 Dst     | a::2                        |
|                 | IPv6 Src     | a::1                        |
|                 | IPv4 Dst     | 12.0.0.3                    |
|                 | IPv4 Src     | 128.0.0.1                   |
|                 | TCP Dst      | 100                         |
|                 | TCP Src      | 80                          |
| --------------- | ------------ | --------------------------- |
| IPv4 datagram 3 | IPv4 Dst     | 10.0.0.2                    |
|                 | IPv4 Src     | 128.0.0.1                   |
|                 | TCP Dst      | 8000                        |
|                 | TCP Src      | 80                          |
+-----------------+--------------+-----------------------------+

                   Datagram header contents

Note that datagram 1 travels untranslated up to the CPE; thus the customer has the same control over the translation as today, with a home gateway offering customizable port forwarding.

4.6. Forwarding of standard packets

Packets for which the CPE does not have a corresponding port-forwarding rule are tunneled to the PRR, which provides the LSN function. We underline that the LSN MUST NOT use the delegated space for NATing. See [I-D.durand-softwire-dual-stack-lite] for network diagrams which illustrate the packet flow in this case.

4.7. Handling ICMP

ICMP is problematic for all NATs, because it lacks port numbers. A+P routing exacerbates the problem.

Most ICMP messages fall into one of two categories: error reports, or ECHO/ECHO reply (commonly known as "ping"). For error reports, the offending packet header is embedded within the ICMP packet; NAT devices can then rewrite that portion and route the packet to the actual destination host. This functionality will remain the same with A+P; however, the PRR will need to examine the embedded header to extract the port number, while the A+P gateway will do the necessary rewriting.

ECHO and ECHO reply are more problematic. For ECHO, the A+P gateway device must rewrite the "Identifier" and perhaps the "Sequence Number" fields in the ICMP request, treating them as if they were port numbers. This way, the BR can build the correct A+P address for the returning ECHO replies, so they can be correctly routed back to the appropriate host in the same way as TCP/UDP packets. (Pings originated from an external domain or the legacy Internet towards an A+P device are not supported.)

4.8. Limitations of the A+P approach

One limitation that A+P shares with any other IP address-sharing mechanism is the availability of well-known ports. Services run by customers that share the same IP address are distinguished by the port number. As a consequence, it is impossible for two customers who share the same IP address to run services on the same port (e.g., port 80).
Unfortunately, working around this limitation usually implies application-specific hacks (e.g., HTTP and HTTPS virtual hosting), discussion of which is out of the scope of this document. Of course, a provider might charge more for giving a customer the well-known port range, 0..1023, thus allowing the customer to provide externally available services. Many applications require the availability of well-known ports. Those applications are not expected to work in an A+P environment unless they can adapt to work with different ports. However, such applications do not work behind today's NATs either.

Another problem which is common to all kinds of NATs is the coexistence with IPsec. In fact, a NAT which also translates port numbers prevents AH and ESP from functioning properly, both in tunnel and in transport mode. In this respect, we stress that, since an A+P subsystem exhibits the same external behavior as a NAT, well-known workarounds (such as [RFC3715]) can be employed.

Port randomization is also somewhat compromised in the A+P solution. Since a CPE can randomize ports only within the port range that is allocated to it, randomness is more limited than in the scenario where the full range of ports is available for randomization. We can assume that the CPE gets its port range either from the ephemeral range (49152-65535) or from the "registered ports" range (1024-49151). Both ranges can be used for randomization; see [I-D.ietf-tsvwg-port-randomization] for more details.

5. IANA Considerations

This document makes no request of IANA.

Note to RFC Editor: this section may be removed on publication as an RFC.

6. Security Considerations

7. Acknowledgments

The authors wish to thank especially (in alphabetical order) Gabor Bajko, Remi Despres, Alain Durand, Pierre Levis, and Teemu Savolainen for their close collaboration on the development of the A+P approach.
We also thank David Ward for review, constructive criticism, and interminable questions; Cullen Jennings for discussion and review of fragmentation; and Dave Thaler for useful criticism on "stackable" A+P gateways. We would also like to thank the following persons for their feedback on earlier versions of this work: Bernhard Ager, Rob Austein, Gert Doering, Dino Farinacci, Russ Housley, and Ruediger Volk.

8. References

8.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2. Informative References

[BCP38]  Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, May 2000.

[I-D.bajko-v6ops-port-restricted-ipaddr-assign]  Bajko, G. and T. Savolainen, "Port Restricted IP Address Assignment", draft-bajko-v6ops-port-restricted-ipaddr-assign-02 (work in progress), November 2008.

[I-D.boucadair-dhc-port-range]  Boucadair, M., Grimault, J., Levis, P., and A. Villefranque, "DHCP Options for Conveying Port Mask and Port Range Router IP Address", draft-boucadair-dhc-port-range-01 (work in progress), October 2008.

[I-D.boucadair-port-range]  Boucadair, M., Levis, P., Bajko, G., and T. Savolainen, "IPv4 Connectivity Access in the Context of IPv4 Address Exhaustion", draft-boucadair-port-range-01 (work in progress), January 2009.

[I-D.durand-softwire-dual-stack-lite]  Durand, A., Droms, R., Haberman, B., and J. Woodyatt, "Dual-stack lite broadband deployments post IPv4 exhaustion", draft-durand-softwire-dual-stack-lite-01 (work in progress), November 2008.

[I-D.ietf-tsvwg-port-randomization]  Larsen, M. and F. Gont, "Port Randomization", draft-ietf-tsvwg-port-randomization-02 (work in progress), August 2008.

[Martin-Java]  Martin, D., Rajagopalan, S., and A. Rubin, "Blocking Java Applets at the Firewall", Proceedings of the Internet Society Symposium on Network and Distributed System Security, pp. 16-26, 1997.

[RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol", STD 9, RFC 959, October 1985.

[RFC1918]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996.

[RFC3715]  Aboba, B. and W. Dixon, "IPsec-Network Address Translation (NAT) Compatibility Requirements", RFC 3715, March 2004.

Authors' Addresses

Randy Bush
Internet Initiative Japan
5147 Crystal Springs
Bainbridge Island, Washington 98110
US

Phone: +1 206 780 0431 x1
Email: randy@psg.com

Olaf Maennel
Deutsche Telekom Laboratories
Ernst-Reuter-Platz 7
Berlin 10587
Germany

Phone: +3727120686
Email: o@maennel.net

Jan Zorz
go6.si
Frankovo naselje 165
Skofja Loka 4220
Slovenia

Phone: +38659042000
Email: jan@go6.si

Steven M. Bellovin
Columbia University
1214 Amsterdam Avenue
MC 0401
New York, NY 10027
US

Phone: +1 212 939 7149
Email: bellovin@acm.org

Luca Cittadini
Universita' Roma Tre
via della Vasca Navale, 79
Rome, 00146
Italy

Phone: +39 06 5733 3215
Email: luca.cittadini@gmail.com