Dan Bricklin's Web Site: www.bricklin.com
Thoughts on Peer-to-Peer
The success of Napster makes thinking about P2P important. Here are some issues.
In my essay "The Cornucopia of the Commons: How to get volunteer labor" I discuss some of the reasons I believe Napster has been successful. Using Peer-to-Peer is not one of them. Because of Napster's success, though, many people are trying to learn more about P2P and to see how and where it could be helpful. In this essay, I discuss some of those issues.
Peer-to-Peer (P2P) is a term used a lot these days. It refers to the topology and architecture of the computers in a system. Depending on where you come from, it can have a variety of interpretations. For this essay, let's look at properties of the P2P popularized by Napster:
The "Peers" are the normal PCs used by regular people for email, web browsing, word processing, and other personal applications.
Many of these PCs are on home connections, usually dialup. They do not, in general, have stable IP addresses, and are not ready to respond to incoming Internet requests 7x24.
Users are not just computer enthusiasts. They care more about various forms of content than IP addresses and protocols.
A sophisticated, professionally created and maintained server system is used to organize requests and do content management functions such as searching.
Files are shared, with the original content being provided by the users themselves, either by personal creation or copying from elsewhere.
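The arrangement in the list above, a central server that organizes searching while the files themselves stay on users' PCs, can be sketched in a few lines. This is only an illustration, not Napster's actual protocol or code: the class name CentralIndex, its methods, and the peer and file names are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central directory: knows who has which file, stores no files."""

    def __init__(self):
        # file title -> set of peer names currently offering it
        self._holders = defaultdict(set)

    def register(self, peer, titles):
        """A peer comes online and announces the files it is sharing."""
        for title in titles:
            self._holders[title].add(peer)

    def unregister(self, peer):
        """A peer goes offline (for example, its dialup connection drops)."""
        for peers in self._holders.values():
            peers.discard(peer)

    def search(self, text):
        """Return {title: sorted peer names} for titles containing the text."""
        text = text.lower()
        return {
            title: sorted(peers)
            for title, peers in self._holders.items()
            if peers and text in title.lower()
        }


# Made-up peers and file names, for illustration only.
index = CentralIndex()
index.register("peer-a", ["song-one.mp3"])
index.register("peer-b", ["song-one.mp3", "song-two.mp3"])
index.unregister("peer-a")
results = index.search("one")
```

The server only answers "who has it?"; the download itself would then go directly from one PC to another, which is the part that is actually peer-to-peer.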
Where this works:
Simple implementations that usually work for regular people. Complex instructions will cut usage. For example, setting up Napster is easy, while Gnutella (which doesn't use a centralized organizer) may need lots of understanding of IP addresses, routing, and firewalls. The implementations that must be simple include how files are uploaded from the sharing computer. The fact that the shared copy goes to another random PC rather than a centralized server has little bearing on how easy it is. The fact that Napster has a special client-based program for uploading, like many photo web sites, is what makes it easy, not that it's P2P. Using separate FTP programs or browser primitives for uploading is not usually simple and gives uploading a bad name.
The same data on many different PCs. If only one PC has the data, access to it could be unreliable.
The files are static: the information being downloaded never changes. The files shared with Napster are not news feeds -- they are more likely the works of dead musicians.
Data such that you don't mind trusting the person sharing it. If a music file for personal use was converted from CD to MP3 poorly, many people don't care. If the file being downloaded is destined for broadcast or other commercial purposes, having an appropriate trust relationship with the source may be important; that complicates things and may not be practical.
Lots of college students with desktops connected to local ethernets.
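The point about having the same data on many different PCs can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that each PC holding a copy is online independently with the same probability; real peers are not that tidy, and the 20% figure is invented.

```python
def chance_reachable(p_online, copies):
    """Probability that at least one of `copies` independent peers is online."""
    if not 0.0 <= p_online <= 1.0:
        raise ValueError("p_online must be between 0 and 1")
    if copies < 0:
        raise ValueError("copies must be non-negative")
    # All copies are unreachable only if every holder is offline at once.
    return 1.0 - (1.0 - p_online) ** copies


# Assumed figure: each PC is online 20% of the time.
for n in (1, 5, 20):
    print(n, round(chance_reachable(0.2, n), 3))
```

Under those assumptions a file held by one PC is reachable 20% of the time, a file held by five PCs about 67% of the time, and a file held by twenty PCs about 99% of the time.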
Where it probably doesn't work as well:
Unique content on each PC where reliability or constant availability is important. When I want your pictures I don't want to have to call you up and have you boot up your PC. (Like when you have one phone line and an old fax: call first, then tell me to connect the fax -- if I'm home -- and then call back...). Many of us live with laptops which are only connected during the day and infrequently (hopefully) at night.
Content that constantly changes. This makes the copies people recently downloaded obsolete, and effectively gets back to only one copy on one specific PC, except you may not know it.
Content that requires a trust relationship with the source. Because of the "search everywhere" nature and simple signup of Napster-style P2P, caveat emptor.
Reliable connection speed. The data you want may be on a low-speed link. Your T1 doesn't help much.
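The connection-speed point comes down to a bottleneck: a transfer can go no faster than the slower of the sender's upstream link and the receiver's downstream link. Here is a rough sketch; the function name is hypothetical, the modem rate is an assumed figure, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes, sender_up_bps, receiver_down_bps):
    """Estimated time to move a file, limited by the slower of the two links."""
    if sender_up_bps <= 0 or receiver_down_bps <= 0:
        raise ValueError("link speeds must be positive")
    bottleneck_bps = min(sender_up_bps, receiver_down_bps)
    return size_bytes * 8 / bottleneck_bps


T1_BPS = 1_544_000      # nominal T1 line rate, bits per second
DIALUP_UP_BPS = 33_600  # assumed modem upstream rate, for illustration

four_megabytes = 4 * 1_000_000
slow = transfer_seconds(four_megabytes, DIALUP_UP_BPS, T1_BPS)  # dialup sender
fast = transfer_seconds(four_megabytes, T1_BPS, T1_BPS)         # T1 at both ends
```

With these assumed numbers, a 4 MB file takes roughly 16 minutes when the sender is on dialup and about 21 seconds when both ends have a T1, no matter how fast the downloader's own line is.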
Some of these problems can be solved by clever servers and complex protocols, but is it worth it? A lot of applications moved to centralized servers because the ease of implementation led to fewer bugs, faster deployment, and more frequent upgrading. The cost of communications is going to near zero in comparison to other costs like development and time-to-market. Look at Lotus Notes in the "old" days and all the effort that went into synchronizing dispersed computers when high-speed intercomputer communication was a luxury. Today, a simple web server anywhere on the Internet is an easier way to implement many of the simple uses of those systems at a fraction of the cost and complexity.
Imagine eBay having all its data stored like Napster. When you go to check on a type of item being auctioned, you never know if you get them all, or why one you saw a while ago disappeared.
The normal web and search engines, though, are kind of like Napster, but you hope the search result URL you want (a) was up when the spidering occurred and (b) is up and fast when you try to connect. If it's unreliable both ways, you'd go on to something better. Hence, stable servers win.
The crux of what I'm trying to say is that Napster's use of P2P is just plumbing that doesn't get in its way. The problems with Napster's architecture, while real, are tolerated by users because of the nature of the data being shared: Music that many people wanted to have identical copies of with tolerance for variations in quality, and where if what you asked for wasn't available you just look for something else. The benefits of other architectures were outweighed by the legal issues and setup costs, since there was no initial or recurring income to pay for licenses, servers, storage and bandwidth. The fact that users had no alternative higher-throughput systems at acceptable prices helped, too, in getting it started. Finally, many users were at educational institutions where 7x24 high bandwidth connections were available along with large CD collections to prime the pump.
Some systems, such as Freenet, use specially crafted Peer-to-Peer architectures for the explicit purpose of implementing something without centralized components, even though that is a much more complex implementation for its simple file sharing. Freenet implements an alternative way of sharing that can be used to move information in a way that inhibits censorship. Peer-to-Peer turned out to be directly related to its whole reason to be. Static-file sharing doesn't inherently need P2P.
So, just like any other technology, the choice of system topology needs to be thought through very carefully in light of specific application needs and requirements. Like the hyped technologies of the recent past, Java and "Push", P2P is not the answer to everything, just some things.
My co-worker Peter Levin points out another issue specific to the facts of Napster and its audience: There is something attractive about the defiance or avoidance of authority. It's an especially good fit with kids and music -- just listen to the lyrics of many songs.
To quote from Gnutella's posted History:
This is from the people that actually created Gnutella. This documentation was found in version 0.2 of Gnutella and is quite outdated. It is here for historic purposes...you don't have to tolerate ads or corporate dogma...Distributed nature of servant makes it pretty damned tough for college administrators to block access to the gnutella service...Ability to change the port you listen on makes it even harder for those college administrators to block access...Ability to define your own internal network with a single exit point to the rest of the internet makes it almost ****ing impossible for college sysadmins to block the free uninhibited transfer of information.
I don't know how much this fits with various "business models" where companies want an image of getting paid by their users.
-Dan Bricklin, August 10, 2000
© Copyright 1999-2010 by Daniel Bricklin
All Rights Reserved.