Like many people in their twenties, Abbi spends a significant portion of her day dealing with what others say about her penis size. "It's amazing," she says. "I must get 10 or 15 offers a day to help me enlarge it. No surgery, no chemicals, no problem." She stops, and barely manages a laugh. "I never really worried about it before, but now, if it wasn't so irritating, I'd probably feel inadequate."
What Abbi is referring to are some of the dozens of e-mails that arrive in her in-box every day. Offering discounts on prescription medication, breast and penis enlargement strategies, sales on computer goods and insurance and copious amounts of pornography, the e-mails arrive unbidden. Known as spam, the electronic letters have become a problem that ranges from an irritation for users like Abbi (who asked that we not use her full name, since it is associated with her e-mail address), to a multi-million dollar expense for businesses that rely on e-mail to connect their employees and customers to one another. It's become such a hot topic, it may even emerge as an issue in the 2004 presidential election.
According to the Radicati Group, a research firm in Palo Alto, Calif., nearly half a trillion messages in 2003 will qualify as spam; that's 45 percent of all e-mail worldwide for the year.
"At least half of my e-mails are spam -- probably 10 out of 20 each day," notes another casual e-mail user. "If I let it go a few days, I have dozens. It's increased in the last two or three months from one or two to over half, and the only thing that's frustrating is that it clogs my in-box. Fortunately, I spend maybe one minute sorting through the mail. It's easy to delete everything."
For large companies, however, those minutes begin to add up. Some analysts estimate that approximately 2 percent of e-mail users spend more than an hour each day sorting and eliminating spam from their regular mail. For every 100 employees, that comes to at least 10 hours every week, or 13 work weeks each year.
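The arithmetic behind those figures can be checked with a short sketch. It assumes a five-day, 40-hour work week and 52 working weeks a year, none of which the analysts' estimate spells out, and it treats "more than an hour" as exactly one hour, so the results are lower bounds.

```python
# Rough check of the lost-time estimate. Values marked "assumed" are not
# stated in the article.
SHARE_HEAVY_SORTERS = 0.02   # 2 percent of e-mail users (from the estimate)
HOURS_PER_DAY = 1            # "more than an hour", taken as exactly one
WORKDAYS_PER_WEEK = 5        # assumed
HOURS_PER_WORK_WEEK = 40     # assumed
WEEKS_PER_YEAR = 52          # assumed


def hours_lost_per_week(employees: int) -> float:
    """Hours per week spent sorting spam by the heaviest-hit users."""
    return employees * SHARE_HEAVY_SORTERS * HOURS_PER_DAY * WORKDAYS_PER_WEEK


def work_weeks_lost_per_year(employees: int) -> float:
    """The same loss expressed in 40-hour work weeks per year."""
    return hours_lost_per_week(employees) * WEEKS_PER_YEAR / HOURS_PER_WORK_WEEK


print(round(hours_lost_per_week(100)))       # 10
print(round(work_weeks_lost_per_year(100)))  # 13
```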
Even if that weren't enough of a price for recipients to pay, the substructure underlying the entire Internet bears the burden of even more spam than reaches users' desks. When sending out tens of thousands of e-mails, a spammer is occupying "bandwidth," the channel by which an Internet service provider (ISP) grants access to the Internet. A spammer's disproportionate output consumes a significant portion of the bandwidth that is meant to be shared among the many different Internet users that the ISP serves. In order to keep the Internet service at an acceptable speed for their other users, ISPs are then forced to buy more bandwidth. In a recent case, ISP Earthlink officials estimated that one notorious spammer used bandwidth costing them more than $1 million.
"There are a number of aspects that stem far beyond the inconvenience of receiving vast quantities of unsolicited e-mail," states Kevin Doerr, business manager of Microsoft's anti-spam technology group. "There are real costs involved for both the providers of infrastructure, and storage and processing. So Microsoft, in the position of being an in-box provider through Hotmail, or a real ISP through MSN, incurs real hard costs in processing this mail. The other costs are more difficult to come by, but certainly there is a vast expense of productivity that affects consumers and workers. And then there is the issue of mail with unwanted or offensive images, or deceptive advertising or offerings -- it's very, very harmful to consumers and costly."
Almost none of the costs are shared by the senders of spam. The amount of unsolicited advertising being sent by e-mail would be unimaginable if the messages were sent via normal mail. Where spam is concerned, however, not only is it possible for a single computer based in someone's home to send thousands of e-mails every hour, but they can be sent through free e-mail accounts provided by companies like Microsoft and Yahoo. And with marketing costs to the sender almost infinitesimal for each e-mail, generating a large number of responses isn't crucial. According to a study conducted by the Wall Street Journal last November, a return rate of 0.001 percent -- only 10 sales for every one-million e-mails -- can be profitable.
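A small sketch makes the economics concrete. Only the 0.001 percent response rate and the one-million-message mailing come from the study cited above; the profit per sale and the two per-message costs are hypothetical numbers chosen purely for illustration.

```python
def expected_sales(messages_sent: int, response_rate_percent: float) -> float:
    """Sales expected from a mailing at a given response rate."""
    return messages_sent * response_rate_percent / 100


def profit(messages_sent: int, response_rate_percent: float,
           profit_per_sale: float, cost_per_message: float) -> float:
    """Revenue from expected sales minus the cost of sending the mailing."""
    revenue = expected_sales(messages_sent, response_rate_percent) * profit_per_sale
    return revenue - messages_sent * cost_per_message


MESSAGES = 1_000_000
RATE_PERCENT = 0.001       # from the study cited in the article
PROFIT_PER_SALE = 20.0     # hypothetical, in dollars
EMAIL_COST = 0.00001       # hypothetical cost per e-mail, in dollars
POSTAL_COST = 0.30         # hypothetical cost per posted letter, in dollars

print(round(expected_sales(MESSAGES, RATE_PERCENT)))  # 10

# Under these assumptions the e-mail campaign ends in the black,
# while the same campaign by post loses money.
print(f"{profit(MESSAGES, RATE_PERCENT, PROFIT_PER_SALE, EMAIL_COST):.2f}")
print(f"{profit(MESSAGES, RATE_PERCENT, PROFIT_PER_SALE, POSTAL_COST):.2f}")
```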
"Unfortunately, I think it's going to get much worse in the next few years before it's going to get better," observes Vincent Schiavone, president and CEO of the ePrivacy Group, which manufactures an anti-spam product called SpamSquelcher. "There aren't many ways to fundamentally change the economics that it does not cost any more to send a million messages than a hundred-thousand messages than 10 messages. So as long as that economy's in place, it's going to get worse until our technology catches up with it."
Ingredients of Spam -- So what, if anything, can be done about spam? That all depends, as Bill Clinton once said, "on what your definition of 'is' is." Deciding what spam is, for legal, technological and everyday purposes, is part of the problem. Conversely, defining spam could lead to a large part of the solution. But like anything that depends on personal standards, the definition of spam is as diverse as the people who receive it.
"There seems to be a tremendous disagreement over the exact definition," says Microsoft's Doerr. "The most common words that are used in association with spam is the phrase 'unsolicited commercial e-mail.' Occasionally words like 'bulk' or 'automated' come into play, but it's unclear to what extent they'll impact the scope of the definition of spam -- whether they make it too broad or too narrow."
One software engineer who has worked on the spam problem has described the definition of spam in two parts: "hard-and-fast, and soft-and-squishy." Since spam is delivered to a user via computer, and computers are incredibly efficient at repeating exact procedures, the hard-and-fast approach to combating spam would be to identify spam based on a set of rules. Once these rules were determined, a computer could easily sort spam from legitimate e-mail. Many e-mail programs currently give users the option of writing rules that send mail from specific senders, or mail containing certain words, to a junk-mail folder. The problem comes when a user decides, for example, that any mail containing the word "sex" is spam. Now, while the software will happily eliminate any pornography advertisements that contain the word, it will also trash letters from friends, family or automated lists that feature "sex." And a current trend among spammers suggests that they've already found a way around such rules. Misspellings such as "se x" and "sexx" make their message perfectly clear to a reader, and pass a computer's rigorous filters without difficulty.
Consequently, the soft-and-squishy approach aims to customize spam-filters based on human feedback. "One approach is to take the customer's perspective," points out Doerr, "and to simply state that spam is any mail that the customer gets that they did not want. Allowing customers to be the ultimate arbiter of what is spam is perhaps one of the most effective tools to be able to determine what is spam."
"Shaw," who uses e-mail extensively at home and work, wasn't worried about spam until he got a message recently "that offered to sell me pornography of American soldiers [allegedly] raping Iraqi women. I was disgusted; and I wanted them to take me off of whatever list they were using for sending these letters. I almost wrote back."
But as Shaw might have discovered, a response could have been more valuable to the sender of the letter than any sale of pornography. By not replying, Shaw wasn't giving the sender of the message any indication that it had been received by an actual person; but by submitting his e-mail address, he would have given a clear indication that someone was on the receiving end. With so many e-mail addresses in the world, and users changing them frequently, lists of working addresses are potential assets for bulk e-mailers. Knowing the ones that work allows spammers to be even smarter about their targeting.
And that may be one of the most insidious aspects of spam. It no longer is simply a blunt tool that is used to bludgeon potential consumers with advertising. It has become a harvesting mechanism, designed to turn the users themselves into products to be sold to other advertisers.
There Oughta Be a Law -- Clearly, stopping spam at its source would eliminate the greatest burden on the Internet's substructure and users in general. If spammers could be eliminated, there would be no need to worry about protecting servers and personal in-boxes. Most recently, Microsoft filed 15 lawsuits against alleged spammers in the United States and the United Kingdom. But identifying the senders of the messages was a high-tech game of cat-and-mouse.
"Essentially, the way we identified these spammers was by setting up thousands of what we call trap-accounts, or drop-box accounts that have never solicited anything" explains Microsoft Senior Corporate Attorney Tim Cranton. "So we just opened up these accounts and then watched what came into the in-box. And then, once we had millions and millions of messages, we put them together and started looking at the different characteristics of the messages to see who were the worst offenders, who were the highest-volume spammers that were coming into these drop-boxes, and start profiling kind of the nature of the e-mails that we received."
While 15 alleged spammers might not be a significant number, any legal decisions rendered in the cases could provide valuable precedent for ongoing legislation. During the past few years, legislation has been regularly proposed on a national level to combat spam, but has failed to pass due to the difficulties involved in defining and regulating spam.
For several years, Sens. Conrad Burns (R-Mont.) and Ron Wyden (D-Ore.) have been proposing basic legislation that identifies spam on a national level and sets broad standards for what constitutes abuse. In short, the Burns-Wyden measure, known as "CAN-SPAM," would require that all unsolicited marketing e-mail contain a valid return address and a subject heading that accurately reflects the contents of the mail. It also states that senders must provide recipients with a means to remove their name from the sender's mailing list.
A more aggressive initiative recently proposed by Sen. Charles Schumer (D-N.Y.) is a "no-spam" registry. Similar to the "do-not-call" list that was recently launched to wild popularity by the Federal Trade Commission, the registry would allow users to specify that they not receive spam, and make it a crime for spammers to target addresses on that list.
But identifying spam can be much easier than identifying a spammer who has violated any anti-spam laws. Because of the protocol that computers have for identifying e-mail, a sender can lie completely about their identity with no way for the recipient to easily determine the truth. Like the many spam messages that users get from "Anita" or "Chriss," a spammer can send mail claiming it is from any source, imaginary or real.
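The point is easiest to see from the recipient's side. In the basic Internet mail format, the "From" line is ordinary text supplied by the sender, and reading it involves no check of who actually sent the message. The following is a minimal sketch using Python's standard email package; the message, name and addresses are made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# A raw message as it might arrive. Every header line, including "From",
# was typed in by whoever sent it.
RAW_MESSAGE = (
    "From: Anita <anita@example.com>\n"
    "To: reader@example.org\n"
    "Subject: Hello\n"
    "\n"
    "Body text.\n"
)

message = message_from_string(RAW_MESSAGE)
display_name, address = parseaddr(message["From"])

# The mail program can only report what the header claims.
print(display_name)  # Anita
print(address)       # anita@example.com
```

Nothing in this step distinguishes a real "Anita" from an invented one, which is why tracing a message back to a person takes more than reading its headers.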
Rep. Zoe Lofgren (D-Calif.) has suggested another approach to this problem, proposing a measure that would authorize the FTC to pay 20 percent of all fines collected in anti-spam prosecution to the individuals who track down violators -- essentially placing an open bounty on the heads of spammers.
Catch Them If You Can -- It's almost as daunting to track down the physical locations of spammers as it is to determine their identity. While e-mail must ultimately be sent from a specific server -- and this is indicated in the sent mail -- it is possible for spammers to hack into the servers of legitimate businesses and use them to send spam. That's what happened to the information systems manager for a small business who asked that neither he nor his business be named in this story. He's in charge of e-mail for about two dozen workers, each of whom accesses the Internet and e-mail. "Essentially what happened was that our relay became open," he reports, "and that allows the spammer to actually go in and use the open port."
He estimates that tens of thousands of spam messages were sent through his company's server over a weekend -- none of them originating in employees' offices. "Our server was actually blacklisted because when that sort of activity goes on, it tends to get reported to the blacklisting services. And a lot of e-mail servers pick that information up regularly -- they keep a database of all the servers that are blacklisted. And eventually our e-mail wasn't getting through to legitimate clients, people that we wanted to send our e-mail to. They would just read the header and it would come right back to us because it came from our server and our server had been blacklisted."
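The bounce he describes can be sketched as a simple lookup. This assumes the receiving server checks only the sending server's address against its copy of the blacklist; the addresses and the function name are hypothetical, and real blacklisting services differ in their details.

```python
# Hypothetical blacklist. 192.0.2.0/24 is a range reserved for documentation.
BLACKLISTED_SERVERS = {"192.0.2.10"}


def handle_incoming(sending_server: str, subject: str) -> str:
    """Bounce anything from a blacklisted server, whatever it contains."""
    if sending_server in BLACKLISTED_SERVERS:
        return f"bounced: {subject}"
    return f"delivered: {subject}"


print(handle_incoming("192.0.2.10", "Cheap pills"))           # bounced
print(handle_incoming("192.0.2.10", "Your invoice for May"))  # bounced too
print(handle_incoming("192.0.2.77", "Your invoice for May"))  # delivered
```

Because the check never looks at the message itself, legitimate mail from a hijacked server is turned away along with the spam.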
Ed Wu, a product manager for Microsoft's Exchange server, recognizes that problem, but says that right now it's difficult to combat. "If I am a legitimate sender, and somehow I am falsely identified as a spammer, the first thing that I'm going to do is go contact those black-hole lists and validate myself," he says. "So there may be, in the future, some process or improved process to validate who you are and that you have a legitimate business reason to send e-mail. But it will still be difficult from an IT administrator's viewpoint to really know for sure whether you're supposed to be getting e-mail or not."
To facilitate this solution, Microsoft recently partnered with other Internet industry leaders, including AOL, Yahoo! and Earthlink, to combat spam. That these arch-competitors should join together is an indication of the scope of the problem. However, even if legislation is passed to curb spam at the sender level, and even if software companies and ISPs continue to develop measures stemming the flow of spam online, individual users will still want a way to protect their personal in-boxes from any unwanted mail that slips through.
Marc Olson, with Microsoft's Outlook -- a popular e-mail program -- is working on new ways for users to eliminate spam from their daily routine. Outlook's current solution, which will be available in the next version of the product due out later this year, is a filter that automatically permits mail from colleagues and people who are listed in a user's database of contacts. After that, the program uses a new system that analyzes e-mail for potential spam. Anything that is considered to be spam is saved for users to review, allowing them to identify the types of mail that they want to receive in the future.
"The new system came out of work done in Microsoft Research around advanced classification techniques, to examine some unknown content and decide whether it's like something I've seen before or not," Olson says. "In the case of junk e-mail, what the core anti-spam team at Microsoft has done is taken this technique from our research group, and then harvested a very large number of messages that are junk mail, and a corresponding number of messages that are good mail. So humans have evaluated both groups and said, 'Yes these are clearly spam, and these are not.' Given that very large data set of good and bad mail, the algorithm doesn't just look at one attribute and decide that it tips it into the bad camp or the good camp; it looks at all aspects of the message."
No one is certain, however, how spam will adapt as strategies for eliminating it improve. And so Microsoft plans to adapt with the problem.
"We're in it for the long haul," declares Olson. "Just by putting these products out there for the first time, we're going to change the playing field. The people who send spam today are largely unfettered by any resistance on a technology or legislative level. And as we shift that, they'll respond. And it will be our challenge to keep up. So you can anticipate seeing an ongoing effort outside of what you might consider to be normal product releases, to deal with the problem of anti-spam, like we do with security and anti-virus. We track how the product is behaving in the field, how are customers are reacting to it, and how the character of spam changes."
Cleaning Up Your Mailbox
* Check with your ISP to see if they provide anti-spam software. If you think they aren't doing enough to stop a significant amount of spam from reaching you, find a provider that does.
* Never reply to spam -- that only lets spammers know that they have a "live" address, and you'll probably be flooded with even more junk e-mail.
* Don't post your e-mail address online. Many spammers use harvester programs to search the Internet for e-mail addresses, which are then deluged with spam. If you must post your address, list it in plain language, like "john doe at isp dot com," or modify it, so that it reads email@example.com, then leave a note alerting readers to remove the "-spam." Studies have shown that this eliminates an enormous amount of spam over time.
* Thoroughly read every Web site at which you submit your e-mail address. Many contain provisions allowing the company to send you unsolicited e-mail or to sell your address. Find out how to opt out of this at the Web site.
* Consider getting multiple e-mail accounts. There are many providers of free mailboxes, and if you're uncertain about giving your main address to a Web site, offer the free account instead. You can usually ask the sender of good e-mail to change the address that they're using if you want to keep getting the mail. And if the free account becomes flooded with spam, you can simply cancel it and find a new one.
* Investigate nonprofit anti-spam groups, such as spamcon.org and cauce.org. They support anti-spam legislation and direct e-mail users toward anti-spam causes.
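The earlier tip about posting an address in plain language can be sketched in a few lines. The harvester here is a hypothetical one that looks only for the usual name@domain pattern; real harvesters may be smarter, so this illustrates the idea rather than guaranteeing protection.

```python
import re

# Hypothetical naive harvester: finds anything shaped like name@domain.tld.
HARVESTER_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def spell_out(address: str) -> str:
    """Rewrite an address in plain language, e.g. 'name at isp dot com'."""
    local_part, domain = address.split("@", 1)
    return f"{local_part} at {domain.replace('.', ' dot ')}"


def harvest(page_text: str) -> list:
    """Return every address the naive harvester would collect from a page."""
    return HARVESTER_PATTERN.findall(page_text)


plain = "Write to johndoe@isp.com for details."
spelled = f"Write to {spell_out('johndoe@isp.com')} for details."

print(harvest(plain))    # ['johndoe@isp.com']
print(spelled)           # Write to johndoe at isp dot com for details.
print(harvest(spelled))  # []
```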
Comments? Send them to firstname.lastname@example.org -- but please, no spam!