David M. Goldschlag (goldschlag@itd.nrl.navy.mil)
Michael G. Reed (reed@itd.nrl.navy.mil)
Paul F. Syverson (syverson@itd.nrl.navy.mil)
Naval Research Laboratory
Center For High Assurance Computer Systems
Washington, D.C. 20375-5337
USA
+1 202.767.2389 (voice)
+1 202.404.7942 (fax)
The World Wide Web is rapidly becoming an important tool for modern day communication and electronic commerce. But electronic messages sent over the Internet can be easily snooped and tracked, revealing who is talking to whom and what they are talking about. Is privacy important, and how can it be guaranteed? This paper describes how a freely available system, onion routing, can be used to provide privacy for a wide variety of Internet services, including Virtual Private Networks, Web browsing, e-mail, remote login, and electronic cash.
Keywords: Anonymity, Internet, mixes, privacy, security, traffic analysis.
The World Wide Web (Web) is rapidly becoming an important tool for modern day communication and electronic commerce. But is Internet communication private? Most security concerns focus on eavesdropping [10] to prevent outsiders from listening in on electronic conversations. But encrypted messages can still be tracked, revealing who is talking to whom. This tracking is called traffic analysis and may reveal sensitive information. For example, the existence of inter-company collaboration may be confidential. Similarly, e-mail users may not wish to reveal whom they are communicating with to the rest of the world. In certain cases anonymity may be desirable also: anonymous e-cash is not very anonymous if delivered with a return address. Web based shopping or browsing of public databases should not require revealing one's identity.
This paper describes how a freely available system, onion routing, can be used to protect a variety of Internet services against both eavesdropping and traffic analysis attacks from both the network and observers. The focus here is on configurations of onion routing networks and applications of onion routing, including Virtual Private Networks (VPN), Web browsing, e-mail, remote login, and electronic cash. For the purposes of this paper, onion routing is treated as a black box that provides anonymous connections. (We open this black box slightly in section 3.) Anonymous connections are bi-directional and real-time communication channels that do not implicitly convey identifying information about the connected parties. Any identifying information must be carried in the data stream over the anonymous connection. The goal of onion routing is anonymous connections, not anonymous communication. Section 3 provides a brief overview of onion routing; other papers describe onion routing in greater detail [12, 11, 8].
This paper is organized in the following way: Section 2 defines the threats of eavesdropping and traffic analysis. Section 3 provides a brief overview of the onion routing system. Section 4 describes how onion routing networks may be configured and how varying the configuration changes the privacy characteristics of the network. Section 5 describes how onion routing may be used in a wide variety of Internet applications. Section 6 describes related work, and presents concluding remarks.
Letters sent through the Post Office are usually in an envelope marked with the sender's and recipient's addresses. We trust that the Post Office does not peek inside the envelope, because we consider the contents private. We also trust that the Post Office does not monitor who sends mail to whom, because that information is also considered private.
These two types of sensitive information, the contents of an envelope and its address, apply equally well to electronic communication over the Internet and the Web. As the Web becomes an important part of modern day communication and electronic commerce, protecting the privacy of electronic messages becomes increasingly important. Just like mail, electronic messages travel in electronic envelopes. Protecting the privacy of electronic messages requires both safeguarding the contents of their envelopes and hiding the addresses on their envelopes. Although communicating parties usually identify themselves to one another, there is no reason that the use of a public network like the Internet ought to reveal to others who is talking to whom and what they are talking about. The former concern is traffic analysis; the latter is eavesdropping.
By making both eavesdropping and traffic analysis hard, the privacy of communication is protected. But what about anonymity? Can two parties communicate if one or both do not want to be identified to the other? If a Web surfer wants to buy something using the electronic equivalent of (untraceable) cash [13], how could that e-cash be moved through the Web without identifying the purchaser?
If an electronic envelope keeps its contents private, and the address on the envelope is also hidden, then any identifying information can only be inside the envelope! So for anonymous communication, we also remove identifying information from the contents of an envelope. This may be called anonymizing a private envelope.
These goals may appear to be incompatible: Can the contents of an envelope really be kept private? How can a letter reach its destination if its address is hidden? Can two parties communicate without revealing their identities to one another? Can all this be done without trusting third parties (the Post Office, for example) not to remember addresses or to open envelopes?
The next sections briefly describe the onion routing system, how the anonymous connections that it provides are secure against both eavesdropping and traffic analysis, and how they may be used for anonymous communication too.
Traffic analysis can be used to infer who is talking to whom over a public network. For example, in a packet switched network like the Internet, packets have a header used for routing, and a payload that carries the data. The header, which must be visible to the network (and to observers of the network), reveals the source and destination of the packet. Even if the header were obscured in some way, the packet could still be tracked as it moves through the network. Encrypting the payload is similarly ineffective, because the goal of traffic analysis is to identify who is talking to whom and not (to identify directly) the content of that conversation.
Onion routing protects against traffic analysis attacks from both the network and observers. Onion routing works in the following way: The initiating application, instead of making a connection directly to a responding server, makes a connection to the appropriate onion routing proxy on some remote machine. That onion routing proxy builds an anonymous connection through several other onion routers to the destination. Each onion router can only identify adjacent onion routers along the route. When the connection is broken, even this limited information about the connection is cleared at each onion router. Data passed along the anonymous connection appears different at and to each onion router, so data cannot be tracked en route and compromised onion routers cannot cooperate. An onion routing network can exist in several configurations that permit efficient use by both large institutions and individuals.
The onion routing proxy defines a route through the onion routing network by constructing a layered data structure called an onion and sending that onion through the onion routing network. Each layer of the onion defines the next hop in a route. An onion router that receives an onion peels off its layer, reads from that layer the name of the next hop and the cryptographic information associated with its hop in the anonymous connection, pads the embedded onion to some constant size, and sends the padded onion to the next onion router.
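The layered structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the XOR keystream is a toy stand-in for real cryptography and is not secure, and the names ONION_SIZE, conn_seed, build_onion, and peel, as well as the JSON layer header, are hypothetical choices made for this sketch rather than the format used by the onion routing prototype.

```python
import hashlib
import json
import os

ONION_SIZE = 512  # hypothetical constant size, in bytes, of every onion on the wire


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def pad(onion: bytes) -> bytes:
    """Pad an onion with random bytes up to the constant size."""
    return onion + os.urandom(ONION_SIZE - len(onion))


def build_onion(route):
    """route is a list of (router_name, layer_key) pairs, first hop first."""
    onion = b""
    next_hop = None  # the innermost layer has no next hop
    for name, key in reversed(route):
        # Each layer names the next hop and carries per-hop key material.
        header = json.dumps({"next_hop": next_hop,
                             "conn_seed": os.urandom(8).hex()}).encode()
        layer = (len(header).to_bytes(2, "big") + header
                 + len(onion).to_bytes(2, "big") + onion)
        onion = keystream_xor(key, layer)
        next_hop = name
    return pad(onion)


def peel(key: bytes, onion: bytes):
    """Remove one layer; return (next_hop, conn_seed, padded inner onion)."""
    layer = keystream_xor(key, onion)
    header_len = int.from_bytes(layer[:2], "big")
    header = json.loads(layer[2:2 + header_len])
    start = 2 + header_len
    inner_len = int.from_bytes(layer[start:start + 2], "big")
    inner = layer[start + 2:start + 2 + inner_len]
    return header["next_hop"], header["conn_seed"], pad(inner)
```

Because every router pads the embedded onion back to ONION_SIZE before forwarding it, the size of an onion does not reveal how many hops remain.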
Before sending data over an anonymous connection, the initiator's onion routing proxy adds a layer of encryption for each onion router in the route. As data moves through the anonymous connection, each onion router removes one layer of encryption, so it finally arrives as plaintext. This layering occurs in the reverse order for data moving back to the initiator: each onion router adds a layer of encryption as the data passes through it. So data that has passed backward through the anonymous connection must be repeatedly decrypted by the initiator's proxy to obtain the plaintext.
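The following Python sketch illustrates the layering of data in both directions. It assumes one shared key per onion router and uses a toy XOR keystream that is not secure; a real system would use proper ciphers and would not reuse a keystream across directions. All function names are hypothetical.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def initiator_wrap(route_keys, plaintext: bytes) -> bytes:
    """Forward direction: add one layer per router, first router outermost."""
    data = plaintext
    for key in reversed(route_keys):
        data = keystream_xor(key, data)
    return data


def router_forward(key: bytes, data: bytes) -> bytes:
    """A router removes its own layer from data moving toward the responder."""
    return keystream_xor(key, data)


def router_backward(key: bytes, data: bytes) -> bytes:
    """A router adds its own layer to data moving back toward the initiator."""
    return keystream_xor(key, data)


def initiator_unwrap(route_keys, data: bytes) -> bytes:
    """Backward direction: the initiator's proxy removes every router's layer."""
    for key in route_keys:
        data = keystream_xor(key, data)
    return data
```

With XOR the order in which layers are removed does not matter; with a real block or stream cipher per hop, the order shown here (first router's layer outermost) is what makes each hop able to remove exactly its own layer.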
The last onion router forwards data to another type of proxy on the same machine, called the responder's proxy, whose job is to pass data between the onion network and the responding server.
For instructions on how to use our onion routing prototype, please visit the onion routing web site.
In one basic onion routing network configuration, an onion router might sit on the firewall of a protected site. This onion router serves as an interface between machines behind the firewall and the rest of the network. To complicate tracking of traffic originating or terminating within the protected site, this onion router should also route data between other onion routers.
There are four important features of this basic configuration:
In the basic configuration, the first onion router (or the first onion routing proxy) is the most trusted one. It may be desirable to move that trust closer to the user.
For example, an Internet Service Provider (ISP) may run an onion router that accepts connections from onion routing proxies running on subscribers' machines. In this configuration, users generate onions specifying a path through the ISP's onion router to the destination. Although the ISP knows who initiates the connection, the ISP would not know with whom the customer is communicating. So the customer need not trust the ISP to maintain his privacy. Furthermore, the ISP becomes a common carrier, who carries data for its customers. This may relieve the ISP of responsibility both for whom users are communicating with and for the content of those conversations.
We first describe how to use anonymous connections in VPNs and anonymous chatting services. We then describe onion routing proxies for three Internet services: Web browsing, e-mail, and remote login. These three onion routing proxies have been implemented. Anonymizing versions of these proxies that remove the identifying information that may be present in the headers of these services' data streams have been implemented as well.
If two sites want to collaborate, they could establish a long term tunnel that would multiplex many socket connections over a single anonymous connection. This would effectively hide who is collaborating with whom and what they are working on, without the expense of constructing many individual anonymous connections. Such long term anonymous connections between enclaves provide the analog of a leased line over a public network. The privacy provided by tunneling through an anonymous connection is superior to IPSEC tunnels or to firewall to firewall encryption. In those tunnels, only the identities of the endpoint machines are hidden; here, the existence of communication between the enclaves is hidden too.
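One way to multiplex many socket connections over a single long term anonymous connection is to prefix each chunk of data with a small frame header. The sketch below is an assumption made for illustration: the paper does not specify a framing format, and the two-field header (stream id, payload length) and the names frame and deframe are hypothetical.

```python
import struct

# Hypothetical frame header: 16-bit stream id, 16-bit payload length.
HEADER = struct.Struct("!HH")


def frame(stream_id: int, payload: bytes) -> bytes:
    """Tag a chunk of one socket's data so it can share the tunnel."""
    return HEADER.pack(stream_id, len(payload)) + payload


def deframe(buffer: bytes):
    """Split a tunnel buffer into complete frames.

    Returns (frames, remainder), where frames is a list of
    (stream_id, payload) pairs and remainder holds any partial frame.
    """
    frames = []
    offset = 0
    while offset + HEADER.size <= len(buffer):
        stream_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the rest of this frame has not arrived yet
        frames.append((stream_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

An observer of the tunnel sees only one anonymous connection; which socket each frame belongs to is visible only to the two endpoints.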
Anonymous connections can be used in a service similar to IRC, where many parties meet to chat at some central server. The chat server may mate several anonymous connections carrying matching tokens. Each party defines the part of the connection leading back to itself, so no party has to trust the other to maintain its privacy. If the communicating parties layer end-to-end encryption over the mated anonymous connections, they also prevent the central server from listening in on the conversation.
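The mating of anonymous connections by token might be sketched as follows. This is a minimal illustration, assuming that each arriving connection presents its token as its first message; the ChatServer class and its methods are hypothetical, and connections are treated as opaque handles.

```python
from collections import defaultdict


class ChatServer:
    """Groups anonymous connections that present the same token."""

    def __init__(self):
        # token -> list of connection handles that presented that token
        self.rooms = defaultdict(list)

    def join(self, token: str, connection) -> None:
        """Mate a newly arrived anonymous connection with its group."""
        self.rooms[token].append(connection)

    def relay(self, token: str, sender, message: bytes):
        """Return (connection, message) pairs for every other group member."""
        return [(c, message) for c in self.rooms[token] if c != sender]
```

The server only ever sees the ends of anonymous connections, so it learns which connections share a token but not who is behind them. If the message is encrypted end to end, the server relays it without being able to read it.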
Certain forms of e-cash are designed to be anonymous and untraceable, unless they are double spent or otherwise misused. However, if a customer cannot contact a vendor without identifying himself, the anonymity of e-cash is undermined. For transactions where both payment and product can be conveyed electronically, anonymous connections can be used to hide the identities of the parties from one another.
How can the customer be prevented from taking his purchase without paying for it (e.g., by closing the connection early) or the vendor be prevented from taking the customer's e-cash without completing the transaction? This problem is outside the scope of this paper, and is addressed in [6, 2]. In the case of a well known vendor, a non-technical solution requires customers to pay first. Such a vendor is unlikely to deliberately cheat its customers since it may be caught in an audit.
We proxy remote login requests by taking advantage of the optional -l username argument to rlogin. The usual rlogin command is of the form:
rlogin -l username server
To use rlogin through an onion routing proxy, one would type
rlogin -l username@server proxy
where proxy refers to the onion routing proxy to be used and both username and server are the same as specified above. A normal rlogin request is transmitted from a privileged port on the client to the well known port for rlogin (513) on the server as:
\0
username on client \0
username on server \0
terminal type \0
where username on client is the username of the individual invoking the command on the client machine, username on server is either the -l field (if specified) or the username of the individual invoking the command on the client machine (if no -l is specified), and the terminal type is a standard termcap/linespeed specification. The server responds with a single zero byte if it will accept the connection or breaks the socket connection if an error has occurred or the connection is rejected. Our normal rlogin proxy therefore receives the initial request:
\0
username on client \0
username@server \0
terminal type \0
The proxy creates an anonymous connection to the RLOGIN port on the server machine and proceeds to send it a massaged request of the form:
\0
username \0
username \0
terminal type \0
Once this request is transmitted to the server, the proxy blindly forwards data in both directions between the client and server until the socket is broken by either side.
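The massaging of the rlogin request can be sketched in Python as follows. The function name and the error handling are hypothetical; the field layout follows the request formats shown above.

```python
def massage_rlogin_request(request: bytes):
    """Turn the request received by the proxy into the one sent to the server.

    Input:  \\0 client-user \\0 user@server \\0 terminal-type \\0
    Output: (server, \\0 user \\0 user \\0 terminal-type \\0)
    """
    if not request.startswith(b"\0"):
        raise ValueError("rlogin request must begin with a zero byte")
    fields = request[1:].split(b"\0")
    # A well formed request yields three fields plus an empty trailing piece.
    if len(fields) != 4 or fields[3] != b"":
        raise ValueError("malformed rlogin request")
    _client_user, target, terminal, _ = fields
    username, sep, server = target.rpartition(b"@")
    if not sep or not username or not server:
        raise ValueError("expected username@server in the -l field")
    # The client-side username is dropped; the server-side name fills both slots.
    massaged = b"\0" + username + b"\0" + username + b"\0" + terminal + b"\0"
    return server.decode("ascii"), massaged
```

For example, a request carrying client user alice and the -l field bob@host.example.com yields the server name host.example.com and a request in which bob appears in both username fields.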
Notice that the onion router does not send the server the username used on the client machine, so communication is anonymous unless the data stream subsequently reveals more information.
Proxying HTTP requests follows the IETF HTTP V1.1 Draft Specification [5]. An HTTP request from a client through an HTTP proxy is of the form:
GET http://www.server.com/file.html HTTP/1.0
followed by optional fields. Notice that an HTTP request from a client to a server is of the form:
GET /file.html HTTP/1.0
also followed by optional fields. The server name and protocol type are missing, because the connection is made directly to the server.
As an example, a complete request from Netscape Navigator to an onion router HTTP proxy may look like:
GET http://www.server.com/file.html HTTP/1.0
Referer: http://www.server.com/index.html
Proxy-Connection: Keep-Alive
User-Agent: Mozilla/3.0 (X11; I; SunOS 5.4 sun4m)
Host: www.server.com
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg
The proxy must create an anonymous connection to www.server.com, and issue a request as if it were a client. Therefore, the request must be massaged to remove the server name and protocol type, and transmitted to www.server.com over the anonymous connection. Once this request is transmitted to the server, the proxy blindly forwards data in both directions between the client and server until the socket is broken by either side.
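A minimal Python sketch of this massaging step is given below, assuming the request line has exactly three space-separated parts. The function name is hypothetical; urlsplit is from the Python standard library.

```python
from urllib.parse import urlsplit


def massage_request_line(line: str):
    """Split a proxy-style request line into a destination and a direct request.

    Returns (host, port, request_line_for_server).
    """
    method, url, version = line.split()
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("expected an absolute http:// URL")
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    port = parts.port or 80  # default HTTP port when none is given
    return parts.hostname, port, f"{method} {path} {version}"
```

The host and port tell the proxy where the anonymous connection should terminate; the rewritten request line is what is sent over that connection.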
For the anonymizing proxy of HTTP, the proxy proceeds as outlined above but also sanitizes the optional fields that follow the GET command because they may contain identifying information. For example, we filter out cookies. Furthermore, the data stream during a connection must be monitored, to sanitize additional headers that might occur during the connection.
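Header sanitization might look like the following Python sketch. Only the filtering of cookies is stated in the text; the other header names in the IDENTIFYING set are assumptions chosen for illustration, and the function name is hypothetical.

```python
# Header names treated as identifying. Only "cookie" comes from the text;
# the remaining entries are assumptions made for this sketch.
IDENTIFYING = {"cookie", "referer", "user-agent", "from", "proxy-connection"}


def sanitize_headers(header_lines):
    """Drop optional request fields whose names are in IDENTIFYING."""
    kept = []
    for line in header_lines:
        name, sep, _value = line.partition(":")
        if sep and name.strip().lower() in IDENTIFYING:
            continue  # identifying field: do not forward it
        kept.append(line)
    return kept
```

A deny list like this one is easy to extend; a stricter design would forward only an explicit allow list of headers.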
The Anonymizer [1] also provides anonymous Web browsing. Users can connect to servers through the Anonymizer and it strips off identifying headers. This is essentially what our anonymizing HTTP onion routing proxy does. But data sent through the Anonymizer can still be tracked and monitored. The Anonymizer could be used as a front end to the onion routing network to provide both anonymous Web browsing and effective protection against traffic analysis.
Electronic mail is proxied by utilizing the user%host@proxy form of email address instead of the normal user@host form. This form should work with most current and older mail systems. Under this form, the client contacts the proxy server's well known SMTP port (25). Instead of the normal mail daemon listening to that port, the proxy listens and interprets what it receives following a strict state machine: wait for a valid HELO command, wait for a valid MAIL From: command, and then wait for a valid RCPT To: command. Each command argument is temporarily buffered. Once the RCPT To: command has been received, the proxy proceeds to create an anonymous connection to the destination server and relays the HELO and MAIL From: commands exactly as received. The RCPT To: command is massaged and forwarded. Once this request is transmitted to the server, the proxy forwards data in both directions between the client and server.
An example of email from joe@sender.com on the machine sender.com to mary@recipient.com via the onion.com onion router is given below. Joe types mail mary%recipient.com@onion.com. First the communication from the client on sender.com to the onion router SMTP proxy on onion.com is given, followed by the communication from the responder's proxy to recipient.com:
220 onion.com SMTP Onion Routing Network.
HELO sender.com
250 onion.com -- Connection from sender.com (2.0.0.1).
MAIL From: joe@sender.com
250 Sender is joe@sender.com.
RCPT To: mary%recipient.com@onion.com
The proxy massages the RCPT To: line to make the address mary@recipient.com and makes an anonymous connection to recipient.com. It then replays the massaged protocol to recipient.com:
220 recipient.com Sendmail 4.1/SMI-4.1 ready at Wed, 28 Aug 96 15:15:00 EDT
HELO Onion.Routing.Network
250 recipient.com Hello Onion.Routing.Network [2.0.0.5], pleased to meet you
MAIL From: joe@sender.com
250 joe@sender.com... Sender ok
RCPT To: mary@recipient.com
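The envelope handling can be sketched in Python as follows. This is a simplified illustration: it raises an error on an out-of-order command, where a real proxy would presumably answer with an SMTP error code and keep waiting, and the function names are hypothetical.

```python
# The strict order in which the proxy expects the envelope commands.
STATES = ("HELO", "MAIL FROM:", "RCPT TO:")


def buffer_envelope(lines):
    """Consume client commands in strict order and buffer their arguments."""
    buffered = {}
    state = 0
    for line in lines:
        verb = STATES[state]
        if not line.upper().startswith(verb):
            raise ValueError(f"expected {verb!r}, got {line!r}")
        buffered[verb] = line[len(verb):].strip()
        state += 1
        if state == len(STATES):
            return buffered  # RCPT To: received; the proxy can now connect
    raise ValueError("input ended before RCPT To: was received")


def massage_rcpt(address: str, proxy_host: str):
    """Rewrite user%host@proxy to user@host; return (host, new_address)."""
    local, sep, domain = address.rpartition("@")
    if not sep or domain.lower() != proxy_host.lower():
        raise ValueError("address is not directed at this proxy")
    user, sep, host = local.rpartition("%")
    if not sep or not user or not host:
        raise ValueError("expected the user%host@proxy form")
    return host, f"{user}@{host}"
```

Applied to the example above, the buffered RCPT To: argument mary%recipient.com@onion.com becomes the address mary@recipient.com, and recipient.com is the host to which the anonymous connection is made.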
At this point, the proxy forwards data in both directions, until a line containing only a period is sent from the sender to the recipient:
250 mary@recipient.com... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
This is a note
.
The proxy forwards the line containing only a period to the recipient, and forwards the recipient's response to the sender. At that point, the proxy sends QUIT to the recipient, reads the response and closes the connection to the recipient. The proxy then waits for a command from the sender; if that command is QUIT, the proxy sends a response and closes its connection to the sender:
250 Mail accepted
QUIT
221 onion.com Service closing transmission channel.
For the anonymizing proxy of electronic mail, the proxy proceeds as outlined above with a few changes. It is now necessary to sanitize both the MAIL From: command and the header portion of the actual message body. Sanitization of the MAIL From: command is trivial with a simple substitution of anonymous for joe@sender.com. For the header sanitization, we have taken the conservative approach of deleting all headers, but this may be modified in the future to only remove identifying information and leave the remaining header information intact.
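The two sanitization steps might be sketched in Python as follows. The function names are hypothetical, and the rule used to recognize header lines (a field name followed by a colon, plus folded continuation lines) is an assumption made for this sketch.

```python
import re

# A header field line: printable non-space, non-colon characters, then a colon.
HEADER_RE = re.compile(r"^[!-9;-~]+:")


def sanitize_mail_from(_command: str) -> str:
    """Replace the sender address with the fixed string 'anonymous'."""
    return "MAIL From: anonymous"


def strip_headers(message_lines):
    """Conservatively delete every leading header line from a message."""
    i = 0
    while i < len(message_lines):
        line = message_lines[i]
        is_header = HEADER_RE.match(line) is not None
        # A folded continuation line starts with a space or tab and can
        # only follow an earlier header line.
        is_continuation = i > 0 and line[:1] in (" ", "\t")
        if not (is_header or is_continuation):
            break
        i += 1
    # Skip the blank line that separates the headers from the body.
    if i < len(message_lines) and message_lines[i] == "":
        i += 1
    return message_lines[i:]
```

This errs on the side of deletion: a body whose first line happens to look like a header field would lose that line, which matches the conservative approach described above.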
A new primitive, the anonymous connection, was introduced in [8, 11, 7, 12]. Anonymous connections are strongly resistant to both eavesdropping and traffic analysis. Anonymous connections do not reveal to the network or to observers of the network who is talking to whom. If the data stream does not contain identifying information, communication is anonymous also. This paper demonstrates the versatility of anonymous connections by exploring their use in a variety of Internet applications. These applications include standard Internet services like Web browsing, remote login, and electronic mail. Anonymous connections can also be used to support Virtual Private Networks with connections that are resistant to traffic analysis.
The onion routing network supporting anonymous connections can be configured in several ways, including a Customer-ISP model that moves trust and privacy to the user's computer. This may relieve the carrier of responsibility for the user's connections, because the carrier cannot learn to whom the user is connecting and cannot read the data stream.
Onion routers are based on Chaum mixes [3]. Other Internet uses of Chaum mixes have been very application specific, focusing on one-way store and forward applications like anonymous remailers [4, 9]. Onion routing modularizes the private communications infrastructure by moving the mixes beneath the application level. This provides bi-directional and real-time communication channels that can be easily used by a variety of applications and services.
Privacy on the Internet
The translation was initiated by David Goldschlag on Thu Apr 17 10:29:56 EDT 1997