Privacy on the Internet

David M. Goldschlag (goldschlag@itd.nrl.navy.mil)
Michael G. Reed (reed@itd.nrl.navy.mil)
Paul F. Syverson (syverson@itd.nrl.navy.mil)

Naval Research Laboratory
Center For High Assurance Computer Systems
Washington, D.C. 20375-5337
USA
+1 202.767.2389 (voice)
+1 202.404.7942 (fax)

Abstract

The World Wide Web is rapidly becoming an important tool for modern day communication and electronic commerce. But electronic messages sent over the Internet can be easily snooped and tracked, revealing who is talking to whom and what they are talking about. Is privacy important, and how can it be guaranteed? This paper describes how a freely available system, onion routing, can be used to provide privacy for a wide variety of Internet services, including Virtual Private Networks, Web browsing, e-mail, remote login, and electronic cash.

Keywords: Anonymity, Internet, mixes, privacy, security, traffic analysis.

Introduction
The Problem
Onion Routing
Network Configurations
- A Basic Configuration
- The Customer-ISP Model
Applications
Conclusion
References
About this document ...

Introduction

The World Wide Web (Web) is rapidly becoming an important tool for modern day communication and electronic commerce. But is Internet communication private? Most security efforts focus on eavesdropping [10], that is, on preventing outsiders from listening in on electronic conversations. But encrypted messages can still be tracked, revealing who is talking to whom. This tracking is called traffic analysis and may reveal sensitive information. For example, the existence of inter-company collaboration may be confidential. Similarly, e-mail users may not wish to reveal to the rest of the world whom they are communicating with. In certain cases anonymity may also be desirable: anonymous e-cash is not very anonymous if delivered with a return address. Web based shopping or browsing of public databases should not require revealing one's identity.

This paper describes how a freely available system, onion routing, can be used to protect a variety of Internet services against both eavesdropping and traffic analysis attacks from both the network and observers. The focus here is on configurations of onion routing networks and applications of onion routing, including Virtual Private Networks (VPN), Web browsing, e-mail, remote login, and electronic cash. For the purposes of this paper, onion routing is treated as a black box that provides anonymous connections. (We open this black box slightly in section 3.) Anonymous connections are bi-directional and real-time communication channels that do not implicitly convey identifying information about the connected parties. Any identifying information must be carried in the data stream over the anonymous connection. The goal of onion routing is anonymous connections, not anonymous communication. Section 3 provides a brief overview of onion routing; other papers describe onion routing in greater detail [12, 11, 8].

This paper is organized in the following way: Section 2 defines the threats of eavesdropping and traffic analysis. Section 3 provides a brief overview of the onion routing system. Section 4 describes how onion routing networks may be configured and how varying the configuration changes the privacy characteristics of the network. Section 5 describes how onion routing may be used in a wide variety of Internet applications. Section 6 describes related work, and presents concluding remarks.

The Problem

Letters sent through the Post Office are usually in an envelope marked with the sender's and recipient's addresses. We trust that the Post Office does not peek inside the envelope, because we consider the contents private. We also trust that the Post Office does not monitor who sends mail to whom, because that information is also considered private.

These two types of sensitive information, the contents of an envelope and its address, apply equally well to electronic communication over the Internet and the Web. As the Web becomes an important part of modern day communication and electronic commerce, protecting the privacy of electronic messages becomes increasingly important. Just like mail, electronic messages travel in electronic envelopes. Protecting the privacy of electronic messages requires both safeguarding the contents of their envelopes and hiding the addresses on their envelopes. Although communicating parties usually identify themselves to one another, there is no reason that the use of a public network like the Internet ought to reveal to others who is talking to whom and what they are talking about. The former concern is traffic analysis; the latter is eavesdropping.

By making both eavesdropping and traffic analysis hard, the privacy of communication is protected. But what about anonymity? Can two parties communicate, if one or both do not want to be identified to the other? If a Web surfer wants to buy something using the electronic equivalent of (untraceable) cash [13] how could that e-cash be moved through the Web without identifying the purchaser?

If an electronic envelope keeps its contents private, and the address on the envelope is also hidden, then any identifying information can only be inside the envelope! So for anonymous communication, we also remove identifying information from the contents of an envelope. This may be called anonymizing a private envelope.

These goals may appear to be incompatible: Can the contents of an envelope really be kept private? How can a letter reach its destination if its address is hidden? Can two parties communicate without revealing their identities to one another? Can all this be done without trusting third parties (the Post Office, for example) not to remember addresses or to open envelopes?

The next sections briefly describe the onion routing system, how the anonymous connections that it provides are secure against both eavesdropping and traffic analysis, and how they may be used for anonymous communication too.

Onion Routing

Traffic analysis can be used to infer who is talking to whom over a public network. For example, in a packet switched network like the Internet, packets have a header used for routing, and a payload that carries the data. The header, which must be visible to the network (and to observers of the network), reveals the source and destination of the packet. Even if the header were obscured in some way, the packet could still be tracked as it moves through the network. Encrypting the payload is similarly ineffective, because the goal of traffic analysis is to identify who is talking to whom and not (to identify directly) the content of that conversation.

Onion routing protects against traffic analysis attacks from both the network and observers. Onion routing works in the following way: The initiating application, instead of making a connection directly to a responding server, makes a connection to the appropriate onion routing proxy on some remote machine. That onion routing proxy builds an anonymous connection through several other onion routers to the destination. Each onion router can only identify adjacent onion routers along the route. When the connection is broken, even this limited information about the connection is cleared at each onion router. Data passed along the anonymous connection appears different at and to each onion router, so data cannot be tracked en route and compromised onion routers cannot cooperate. An onion routing network can exist in several configurations that permit efficient use by both large institutions and individuals.

The onion routing proxy defines a route through the onion routing network by constructing a layered data structure called an onion and sending that onion through the onion routing network. Each layer of the onion defines the next hop in a route. An onion router that receives an onion peels off its layer, reads from that layer the name of the next hop and the cryptographic information associated with its hop in the anonymous connection, pads the embedded onion to some constant size, and sends the padded onion to the next onion router.
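
The construction and peeling of an onion described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the keyed SHA-256 keystream is a toy stand-in for the real cryptography, each router is assumed to share a symmetric key with the proxy, and the names build_onion, peel, and ONION_SIZE are hypothetical and do not come from the prototype.

```python
import hashlib
import json

ONION_SIZE = 512  # assumed constant size of an onion in transit


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Toy cipher (illustration only): XOR with a SHA-256 counter keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def pad(onion: bytes) -> bytes:
    """Pad an onion to the constant size so its length reveals nothing."""
    return onion.ljust(ONION_SIZE, b"\0")


def build_onion(route):
    """route: list of (router_name, router_key, session_key), first hop first."""
    onion = b""
    next_hop = None  # the innermost layer names no further hop
    for name, router_key, session_key in reversed(route):
        header = json.dumps(
            {"next": next_hop, "session_key": session_key.hex()}
        ).encode()
        layer = len(header).to_bytes(2, "big") + header + onion
        onion = xor_crypt(router_key, layer)
        next_hop = name
    return pad(onion)


def peel(router_key: bytes, onion: bytes):
    """Remove one layer: return this hop's header and the padded embedded onion."""
    plain = xor_crypt(router_key, onion)
    header_len = int.from_bytes(plain[:2], "big")
    header = json.loads(plain[2:2 + header_len])
    return header, pad(plain[2 + header_len:])
```

Each router learns only the next hop named in its own layer, and every onion it forwards has the same length, which is the property the padding step is meant to provide.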

Before sending data over an anonymous connection, the initiator's onion routing proxy adds a layer of encryption for each onion router in the route. As data moves through the anonymous connection, each onion router removes one layer of encryption, so it finally arrives as plaintext. This layering occurs in the reverse order for data moving back to the initiator: each onion router adds a layer of encryption, so data that has passed backward through the anonymous connection must be repeatedly decrypted by the initiator's proxy to obtain the plaintext.
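
The layering of data in both directions can be sketched as follows. This is a toy model, not the prototype's cryptography: one symmetric key per hop is assumed, and the XOR keystream is a stand-in chosen only because it needs nothing beyond the standard library. With XOR the layers happen to commute and the same key is reused in both directions, neither of which a real design would rely on; the loops keep the order described in the text. All function names are hypothetical.

```python
import hashlib


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Toy cipher (illustration only): XOR with a SHA-256 counter keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def proxy_wrap_forward(hop_keys, plaintext: bytes) -> bytes:
    """Initiator's proxy: add one layer per router, last router's layer innermost."""
    data = plaintext
    for key in reversed(hop_keys):
        data = xor_crypt(key, data)
    return data


def router_forward(key: bytes, data: bytes) -> bytes:
    """An onion router removes its layer from data moving toward the responder."""
    return xor_crypt(key, data)


def router_backward(key: bytes, data: bytes) -> bytes:
    """An onion router adds its layer to data moving back to the initiator."""
    return xor_crypt(key, data)


def proxy_unwrap_backward(hop_keys, data: bytes) -> bytes:
    """Initiator's proxy: remove every router's layer to recover the plaintext."""
    for key in hop_keys:
        data = xor_crypt(key, data)
    return data
```

A request wrapped by proxy_wrap_forward and passed through router_forward at each hop in route order emerges as plaintext; a reply passed through router_backward at each hop in reverse order is recovered by proxy_unwrap_backward.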

The last onion router forwards data to another type of proxy on the same machine, called the responder's proxy, whose job is to pass data between the onion network and the responding server.

For instructions on how to use our onion routing prototype, please visit the onion routing web site.

Network Configurations

A Basic Configuration

In one basic onion routing network configuration, an onion router might sit on the firewall of a protected site. This onion router serves as an interface between machines behind the firewall and the rest of the network. To complicate tracking of traffic originating or terminating within the protected site, this onion router should also route data between other onion routers.

There are four important features of this basic configuration:

1. The onion router at the originating protected site knows both the source and destination of a connection.
2. Software on client machines does not need to be modified.
3. Connections between machines behind onion routers are protected against both eavesdropping and traffic analysis. Since the data stream never appears in the clear on the public network, this data may carry identifying information, but communication is still private. (This feature is used in section 5.1.)
4. If the responding server is not behind an onion router (and end-to-end encryption is not layered over the anonymous connection) communication must be anonymous. That is, the data stream must not identify the initiator. (We call this anonymizing the anonymous connection.) Otherwise, an attacker could listen in on the final segment of the connection and identify the initiator.

The Customer-ISP Model

In the basic configuration, the first onion router (or the first onion routing proxy) is the most trusted one. It may be desirable to move that trust closer to the user.

For example, an Internet Service Provider (ISP) may run an onion router that accepts connections from onion routing proxies running on subscribers' machines. In this configuration, users generate onions specifying a path through the ISP's onion router to the destination. Although the ISP knows who initiates the connection, the ISP would not know with whom the customer is communicating. So the customer need not trust the ISP to maintain his privacy. Furthermore, the ISP becomes a common carrier, who carries data for its customers. This may relieve the ISP of responsibility both for whom users are communicating with and for the content of those conversations.

Applications

We first describe how to use anonymous connections in VPNs and anonymous chatting services. We then describe onion routing proxies for three Internet services: Web browsing, e-mail, and remote login. These three onion routing proxies have been implemented. Anonymizing versions of these proxies, which remove the identifying information that may be present in the headers of these services' data streams, have been implemented as well.

Virtual Private Networks

If two sites want to collaborate, they could establish a long term tunnel that would multiplex many socket connections over a single anonymous connection. This would effectively hide who is collaborating with whom and what they are working on, without the expense of constructing many individual anonymous connections. Such long term anonymous connections between enclaves provide the analog of a leased line over a public network. The privacy provided by tunneling through an anonymous connection is superior to IPSEC tunnels or to firewall to firewall encryption. In those tunnels, only the identities of the endpoint machines are hidden; here, the existence of communication between the enclaves is hidden too.
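
One way to multiplex many socket connections over a single anonymous connection is to prefix each chunk of data with a small frame header. The sketch below is an assumption made for illustration, not the format used by the prototype: the header layout (a 16-bit stream identifier and a 16-bit payload length) and the names encode_frame and decode_frames are hypothetical.

```python
import struct

# Assumed frame header: 16-bit stream id, 16-bit payload length, network byte order.
HEADER = struct.Struct("!HH")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Wrap one chunk of a socket's data for transmission over the tunnel."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split received bytes into (stream_id, payload) frames.

    Returns the complete frames and any leftover bytes of a partial frame.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        stream_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the rest of this frame has not arrived yet
        frames.append((stream_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

An observer of the tunnel sees one long-lived anonymous connection; only the two enclaves can tell which frames belong to which socket.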

Anonymous Chatting

Anonymous connections can be used in a service similar to IRC, where many parties meet to chat at some central server. The chat server may mate several anonymous connections carrying matching tokens. Each party defines the part of the connection leading back to itself, so no party has to trust the other to maintain its privacy. If the communicating parties layer end-to-end encryption over the mated anonymous connections, they also prevent the central server from listening in on the conversation.
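
The mating of connections by token can be pictured with a very small model. The class and method names below are hypothetical, and connections are represented by opaque identifiers rather than real sockets; the point is only that the server matches on the token and never needs to know who is behind a connection.

```python
from collections import defaultdict


class ChatServer:
    """Toy model of a chat server that mates anonymous connections by token."""

    def __init__(self):
        # token -> identifiers of the anonymous connections presenting it
        self.rooms = defaultdict(list)

    def join(self, token: str, conn_id: int) -> None:
        """Register an incoming anonymous connection under its token."""
        self.rooms[token].append(conn_id)

    def relay(self, token: str, sender: int, message: bytes):
        """Return (recipient, message) pairs for every other mated connection."""
        return [(conn, message) for conn in self.rooms[token] if conn != sender]
```

If the parties encrypt end to end, the message bytes relayed here are opaque to the server as well.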

Anonymous Cash

Certain forms of e-cash are designed to be anonymous and untraceable, unless they are double spent or otherwise misused. However, if a customer cannot contact a vendor without identifying himself, the anonymity of e-cash is undermined. For transactions where both payment and product can be conveyed electronically, anonymous connections can be used to hide the identities of the parties from one another.

How can the customer be prevented from taking his purchase without paying for it (e.g., by closing the connection early) or the vendor be prevented from taking the customer's e-cash without completing the transaction? This problem is outside the scope of this paper, and is addressed in [6, 2]. In the case of a well known vendor, a non-technical solution requires customers to pay first. Such a vendor is unlikely to deliberately cheat its customers since it may be caught in an audit.

Remote Login

We proxy remote login requests by taking advantage of the optional -l username to rlogin. The usual rlogin command is of the form:

rlogin -l username server

To use rlogin through an onion routing proxy, one would type

rlogin -l username@server proxy

where proxy refers to the onion routing proxy to be used and both username and server are the same as specified above. A normal rlogin request is transmitted from a privileged port on the client to the well known port for rlogin (513) on the server as:

\0 username on client \0 username on server \0 terminal type \0

where username on client is the username of the individual invoking the command on the client machine, username on server is either the -l field (if specified) or the username of the individual invoking the command on the client machine (if no -l is specified), and the terminal type is a standard termcap/linespeed specification. The server responds with a single zero byte if it will accept the connection or breaks the socket connection if an error has occurred or the connection is rejected. Our normal rlogin proxy therefore receives the initial request:

\0 username on client \0 username@server \0 terminal type \0

The proxy creates an anonymous connection to the RLOGIN port on the server machine and proceeds to send it a massaged request of the form:

\0 username \0 username \0 terminal type \0

Once this request is transmitted to the server, the proxy blindly forwards data in both directions between the client and server until the socket is broken by either side.

Notice that the proxy never sends the server the username used on the client machine, so communication is anonymous unless the data stream subsequently reveals more information.
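
The massaging of the initial rlogin request can be sketched as a single function. The function name is hypothetical and error handling is minimal; the sketch assumes the request arrives in one piece in exactly the format shown above.

```python
def massage_rlogin_request(request: bytes):
    """Parse b'\\0client_user\\0user@server\\0term\\0'.

    Returns (server, massaged_request), where the massaged request carries
    the server-side username in both username fields.
    """
    parts = request.split(b"\0")
    # Expected pieces: b'', client_user, b'user@server', term, b''
    if len(parts) != 5 or parts[0] != b"" or parts[4] != b"":
        raise ValueError("malformed rlogin request")
    _client_user, target, term = parts[1:4]
    user, sep, server = target.rpartition(b"@")
    if not sep or not user or not server:
        raise ValueError("expected username@server in the -l field")
    # The client-side username is dropped and never reaches the server.
    massaged = b"\0" + user + b"\0" + user + b"\0" + term + b"\0"
    return server.decode("ascii"), massaged
```

The proxy would then open an anonymous connection to the returned server name and send the massaged bytes as the first data on that connection.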

Web Browsing

Proxying HTTP requests follows the IETF HTTP V1.1 Draft Specification [5]. An HTTP request from a client through an HTTP proxy is of the form:

GET http://www.server.com/file.html HTTP/1.0

followed by optional fields. Notice that an HTTP request from a client to a server is of the form:

GET /file.html HTTP/1.0

also followed by optional fields. The server name and protocol type are missing, because the connection is made directly to the server.

As an example, a complete request from Netscape Navigator to an onion router HTTP proxy may look like:

GET http://www.server.com/file.html HTTP/1.0
Referer: http://www.server.com/index.html
Proxy-Connection: Keep-Alive
User-Agent: Mozilla/3.0 (X11; I; SunOS 5.4 sun4m)
Host: www.server.com
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg

The proxy must create an anonymous connection to www.server.com, and issue a request as if it were a client. Therefore, the request must be massaged to remove the server name and protocol type, and transmitted to www.server.com over the anonymous connection. Once this request is transmitted to the server, the proxy blindly forwards data in both directions between the client and server until the socket is broken by either side.
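
The rewriting of the request line can be sketched with the standard urllib.parse module. The function name is hypothetical, and the sketch handles only plain http URLs. It emits the path with a leading slash, which is what HTTP servers expect on a direct connection.

```python
from urllib.parse import urlsplit


def massage_request_line(line: str):
    """Turn a proxy-style request line into a direct request line.

    Returns (host, port, request_line_for_server).
    """
    method, url, version = line.split()
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("expected an absolute http URL")
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Port 80 is the default when the URL names none.
    return parts.hostname, parts.port or 80, f"{method} {path} {version}"
```

The returned host and port tell the proxy where the anonymous connection should terminate; the returned request line is what is sent over it.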

For the anonymizing proxy of HTTP, the proxy proceeds as outlined above but also sanitizes the optional fields that follow the GET command because they may contain identifying information. For example, we filter out cookies. Furthermore, the data stream during a connection must be monitored, to sanitize additional headers that might occur during the connection.
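
A header filter of this kind might look like the sketch below. Only the removal of cookies is stated in this paper; the rest of the blocked set is an assumption chosen for illustration, as are the names IDENTIFYING_HEADERS and sanitize_headers. A more conservative design would keep an explicit list of allowed headers and drop everything else.

```python
# Assumed set of headers treated as identifying (only Cookie is from the text).
IDENTIFYING_HEADERS = {
    "cookie",
    "referer",
    "user-agent",
    "from",
    "proxy-connection",
}


def sanitize_headers(header_lines):
    """Drop header lines whose field name is in the identifying set."""
    kept = []
    for line in header_lines:
        name, _, _ = line.partition(":")
        if name.strip().lower() not in IDENTIFYING_HEADERS:
            kept.append(line)
    return kept
```

Applied to the Netscape Navigator request shown earlier, this filter would keep the Host and Accept lines and drop Referer, Proxy-Connection, and User-Agent.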

The Anonymizer [1] also provides anonymous Web browsing. Users can connect to servers through the Anonymizer and it strips off identifying headers. This is essentially what our anonymizing HTTP onion routing proxy does. But data sent through the Anonymizer can still be tracked and monitored. The Anonymizer could be used as a front end to the onion routing network to provide both anonymous Web browsing and effective protection against traffic analysis.

Electronic Mail

Electronic mail is proxied by utilizing the user%host@proxy form of email address instead of the normal user@host form. This form should work with most current and older mail systems. Under this form, the client contacts the proxy server's well known SMTP port (25). Instead of the normal mail daemon listening to that port, the proxy listens and interprets what it receives following a strict state machine: wait for a valid HELO command, wait for a valid MAIL From: command, and then wait for a valid RCPT To: command. Each command argument is temporarily buffered. Once the RCPT To: command has been received, the proxy proceeds to create an anonymous connection to the destination server and relays the HELO and MAIL From: commands exactly as received. The RCPT To: command is massaged and forwarded. Once this request is transmitted to the server, the proxy forwards data in both directions between the client and server.

An example of email from joe@sender.com on the machine sender.com to mary@recipient.com via the onion.com onion router is given below. Joe types mail mary%recipient.com@onion.com. First the communication from the client on sender.com to the onion router SMTP proxy on onion.com is given, followed by the communication from the responder's proxy to recipient.com:

220 onion.com SMTP Onion Routing Network.
HELO sender.com
250 onion.com -- Connection from sender.com (2.0.0.1).
MAIL From: joe@sender.com
250 Sender is joe@sender.com.
RCPT To: mary%recipient.com@onion.com

The proxy massages the RCPT To: line to make the address mary@recipient.com and makes an anonymous connection to recipient.com. It then replays the massaged protocol to recipient.com:

220 recipient.com Sendmail 4.1/SMI-4.1 ready at Wed, 28 Aug 96 15:15:00 EDT
HELO Onion.Routing.Network
250 recipient.com Hello Onion.Routing.Network [2.0.0.5], pleased to meet you
MAIL From: joe@sender.com
250 joe@sender.com... Sender ok
RCPT To: mary@recipient.com

At this point, the proxy forwards data in both directions, until a line containing only a period is sent from the sender to the recipient:

250 mary@recipient.com... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
This is a note
.

The proxy forwards the line containing only a period to the recipient, and forwards the recipient's response to the sender. At that point, the proxy sends QUIT to the recipient, reads the response and closes the connection to the recipient. The proxy then waits for a command from the sender; if that command is QUIT, the proxy sends a response and closes its connection to the sender:

250 Mail accepted
QUIT
221 onion.com Service closing transmission channel.
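
The opening state machine and the massaging of the RCPT To: line in the exchange above can be sketched as follows. The class, function, and pattern names are hypothetical, the validity checks are much looser than a real SMTP parser's, and networking is omitted entirely.

```python
import re

# user%host@proxy, optionally wrapped in angle brackets
RCPT_RE = re.compile(
    r"^RCPT To:\s*<?([^%@<>\s]+)%([^%@<>\s]+)@([^<>\s]+)>?\s*$",
    re.IGNORECASE,
)


def massage_rcpt(line: str):
    """Rewrite 'RCPT To: user%host@proxy' as 'RCPT To: user@host'.

    Returns (destination_host, massaged_line).
    """
    match = RCPT_RE.match(line)
    if not match:
        raise ValueError("expected RCPT To: user%host@proxy")
    user, host, _proxy = match.groups()
    return host, f"RCPT To: {user}@{host}"


class SmtpProxyState:
    """Strict state machine: HELO, then MAIL From:, then RCPT To:."""

    ORDER = ("HELO", "MAIL FROM:", "RCPT TO:")

    def __init__(self):
        self.buffered = []  # commands held until the destination is known

    def feed(self, line: str) -> bool:
        """Buffer one command; return True once RCPT To: has been received."""
        if len(self.buffered) == len(self.ORDER):
            raise ValueError("opening commands already complete")
        expected = self.ORDER[len(self.buffered)]
        if not line.upper().startswith(expected):
            raise ValueError(f"expected {expected}")
        self.buffered.append(line)
        return len(self.buffered) == len(self.ORDER)
```

Only after feed returns True does the proxy know the destination host, which is why the earlier commands must be buffered rather than forwarded immediately.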

For the anonymizing proxy of electronic mail, the proxy proceeds as outlined above with a few changes. It is now necessary to sanitize both the MAIL From: command and the header portion of the actual message body. Sanitizing the MAIL From: command is trivial: anonymous is simply substituted for joe@sender.com. For the header sanitization, we have taken the conservative approach of deleting all headers, but this may be modified in the future to remove only identifying information and leave the remaining header information intact.
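
Both sanitization steps can be sketched briefly. The function names are hypothetical, and the header test is a simplification: a line counts as a header if it begins with a field name followed by a colon, or if it is an indented continuation of the previous header.

```python
import re

# A field name (printable ASCII without space or colon) followed by a colon.
HEADER_RE = re.compile(r"^[!-9;-~]+:")


def sanitize_mail_from(_line: str) -> str:
    """Replace the sender address with the fixed string 'anonymous'."""
    return "MAIL From: anonymous"


def strip_headers(message_lines):
    """Delete the whole header block and return only the body lines."""
    i = 0
    while i < len(message_lines):
        line = message_lines[i]
        is_header = bool(HEADER_RE.match(line))
        is_continuation = i > 0 and line.startswith((" ", "\t"))
        if not (is_header or is_continuation):
            break
        i += 1
    if i < len(message_lines) and message_lines[i] == "":
        i += 1  # drop the blank line that separates headers from body
    return message_lines[i:]
```

Because every header is deleted, a body whose first line happens to look like a header would lose that line; this errs on the side of anonymity, in keeping with the conservative approach described above.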

Conclusion

A new primitive, the anonymous connection, was introduced in [8, 11, 7, 12]. Anonymous connections are strongly resistant to both eavesdropping and traffic analysis. Anonymous connections do not reveal to the network or to observers of the network who is talking to whom. If the data stream does not contain identifying information, communication is anonymous also. This paper demonstrates the versatility of anonymous connections by exploring their use in a variety of Internet applications. These applications include standard Internet services like Web browsing, remote login, and electronic mail. Anonymous connections can also be used to support Virtual Private Networks with connections that are resistant to traffic analysis.

The onion routing network supporting anonymous connections can be configured in several ways, including a Customer-ISP model that moves trust and privacy to the user's computer. This may relieve the carrier of responsibility for the user's connections, because the carrier cannot learn to whom the user is connecting and cannot read the data stream.

Onion routers are based on Chaum mixes [3]. Other Internet uses of Chaum mixes have been very application specific, focusing on one-way store and forward applications like anonymous remailers [4, 9]. Onion routing modularizes the private communications infrastructure by moving the mixes beneath the application level. This provides bi-directional and real-time communication channels that can be easily used by a variety of applications and services.

References

1: The Anonymizer. http://www.anonymizer.com
2: L. J. Camp, M. Harkavy, B. Yee, J. D. Tygar, ``Anonymous Atomic Transactions'', Second USENIX Workshop on Electronic Commerce, 1996.
3: D. Chaum. Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms, Communications of the ACM, v. 24, n. 2, Feb. 1981, pages 84-88.
4: L. Cottrell. Mixmaster and Remailer Attacks, http://obscura.obscura.com/~loki/remailer/remailer-essay.html
5: R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee. Hypertext Transfer Protocol - HTTP/1.1, ftp://ds.internic.net/rfc/rfc2068.txt
6: M. Franklin and M. Reiter, ``Fair Exchange with a Semi-Trusted Third Party'', Fourth ACM Conference on Computer and Communications Security, Zurich, April 1997.
7: D. Goldschlag, M. Reed, and P. Syverson. Protocols using Anonymous Connections: Mobile Applications, 1997 Security Protocols Workshop, Paris, France, April 1997.
8: D. Goldschlag, M. Reed, and P. Syverson. Hiding Routing Information, Information Hiding, R. Anderson (editor), Springer-Verlag LNCS 1174, 1996, pages 137-150.
9: C. Gülcü and G. Tsudik. Mixing Email with Babel, 1996 Symposium on Network and Distributed System Security, San Diego, February 1996.
10: Internet Engineering Task Force. http://www.ietf.org/
11: M. Reed, P. Syverson, D. Goldschlag. Proxies for Anonymous Routing, 12th Annual Computer Security Applications Conference, San Diego, CA, December, 1996.
12: P. Syverson, D. Goldschlag, and M. Reed. Anonymous Connections and Onion Routing, Proceedings of the Symposium on Security and Privacy, Oakland, CA, May 1997.
13: Peter Wayner. Digital Cash: Commerce on the Net, AP Professional, Chestnut Hill, Mass., 1996

About this document ...

The translation was initiated by David Goldschlag on Thu Apr 17 10:29:56 EDT 1997