Eric Allman - interview

LXF Interview Eric Allman

The mail man

This month we hear from Eric Allman, one of the fathers of email, on the past and present challenges for a system of communication that many now claim is a basic human right...

As a student at Berkeley in the early seventies, Eric Allman hacked away on Unix code until he came up with a winner: a Mail Transport Agent that could span ARPANET. That MTA code evolved into Sendmail in 1981, which is still the most widely-used email engine in the world. In 1998, Allman founded Sendmail Inc, to help fund development of the code and increase its usefulness to the world of business. He still lives in California, but recently took a trip to the UK where he exchanged signals with Nick Veitch.

Linux Format: What are the main problems surrounding email today?

Eric Allman: If you ignore the obvious ones - spam, phishing, viruses - and go for the ones with a little more subtlety, I think in some sense email doesn't get used enough, and gets used too much, at the same time. A lot of people use it in the wrong way - email can be really a horrible timewaster if you don't use it properly. I'll give you an example: if you get a message going back and forth between two people arguing about a design decision or something like that, often a five-minute phone call could save them half a day. That's definitely the 'too much' bit.

One of the nice things about email - or not nice things, depending on your situation - is that email can leave effectively a paper trail. So for example, one good way to make decisions is to make them by getting together or whatever, and then one person emails the other about their understanding of what the decision is. Get 15 people together in a room to discuss a design decision and you can pretty much guarantee that nothing is going to get done.

This is actually not new - it's been true of all media for all time as near as I can tell - but email certainly suffers from it. Another thing today, though, that's perhaps more topical, is that as email embodies more and more of the intellectual property of the company, companies need to be paying more and more attention to it. So much can leak out through email, not because somebody is intentionally trying to do something bad, but because they misaddress something, or they just don't think - they forward a message to somebody without saying that the information contains...

LXF: Sure - I believe that in Kevin Mitnick's book on social engineering, for example, he was citing examples of how really very secretive information can easily be obtained through the use of emails, because when people see an email they believe it to have come from someone trustworthy.

EA: Phishing is like magicians doing sleight of hand: they distract you over here so that they can do something [over there]. And some phishers are getting to be very, very good.

LXF: The latest figures are that 70% or more of all email traffic is spam; then there are obviously various phishing attacks and all sorts of other things going on with email. Isn't it now slowly reaching the point where it's almost more trouble than it's worth? Is the system fundamentally broken?

EA: Well, I've got two answers. One of them is, "Yes, we have a real problem" and there is no doubt in my mind that spam, viruses, phishing and so forth have diminished the value of email. But is it hopelessly broken? No, I really don't think so. Spam is one of the reasons why I'm working on something called DKIM, DomainKeys Identified Mail, which is an email authentication technique, and one of the reasons for it is that there's a lot you can do if you know who the mail is actually from. Right now everything we do is content scanning: "Let's look at what's in the message and try and guess whether that's spam or not." And the problem is you can only take it so far. You can do a lot - some spam is so obviously spam you can just get rid of it.

DKIM enables kind of identity-based filtering. The combination of better and better content filters with the ability to filter based on identity I think will get us there, [but spam is a problem] that will never go away. I also get ads in my physical mailbox that don't go away, but at least there isn't a truck backing up to my door every day and dumping all this stuff out - which is more or less what happens with email.

LXF: Do you think ultimately the burden of tackling the problems with emails lies with the recipients, or should there be more action from ISPs and other organisations?

EA: Frankly it's just good business to spam filter. And [the ISPs] have a vested interest in filtering out spam, because they have to store all this stuff if they leave it up to the consumer. Ideally you'd get rid of all the obvious spam as quickly as possible.

I think when you look at the enterprise it's a somewhat different situation. If you will, the major number of email users are in ISPs or ESPs - the Yahoos, the Gmails, the Hotmails of the world - but the money is in the enterprise, because for them it's real business-critical time. So they've got a slightly different equation: their employees are not going to quit because they get too much spam in their mailbox, but on the other hand the employees are all going to waste their time, so they have a vested interest as well in filtering it.

In some sense you'd like to push the burden as close to the centre as absolutely possible. Now that's hard to do. There have been discussions of using e-postage and that kind of thing, and there are political as well as technical reasons why that's probably not going to happen any time soon. There are lots of ways to push costs back. An example is DKIM. This is not why DKIM was designed the way that it is, but since you have to compute a cryptographic function over the mail, that actually does require more computer power on the sender's part.
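To give a rough sense of the work a DKIM sender takes on, the following is a minimal Python sketch of one piece of it: hashing the message body. It is an illustration under stated assumptions, not Sendmail's or any DKIM library's implementation. The function name `dkim_body_hash` is hypothetical, the canonicalisation only approximates DKIM's "simple" body rule (the line-ending normalisation is an added assumption), and the public-key signing of the headers, which is the costlier step, is left out.

```python
import base64
import hashlib


def dkim_body_hash(body: bytes) -> str:
    """Return a base64-encoded SHA-256 digest of a message body.

    Hypothetical helper: approximates DKIM 'simple' body canonicalisation
    by dropping trailing empty lines and ending the body with one CRLF.
    """
    # Assumption for this sketch: treat bare CR or LF as line breaks too.
    normalised = body.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    lines = normalised.split(b"\n")

    # Remove every empty line at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()

    # Rejoin with CRLF and terminate with exactly one CRLF
    # (an empty body therefore canonicalises to a single CRLF).
    canonical = b"\r\n".join(lines) + b"\r\n"

    digest = hashlib.sha256(canonical).digest()
    return base64.b64encode(digest).decode("ascii")
```

The sender has to run a computation like this over every outgoing message, which is the extra cost Allman refers to; a receiver only needs to repeat it for mail it chooses to verify.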

LXF: Do you think DKIM is going to be universally accepted?

EA: If you include DomainKeys, the predecessor, the standard hasn't even stabilised yet, and Google are signing all of their mail with DK, Yahoo are signing all of their mail with DK, eBay are signing all of their email with DK. The business is huge. I think I've heard that Amazon is going to be using it, and I know a lot of financial institutions have shown a lot of interest.

LXF: So if you were starting again to create the whole MTA infrastructure, presumably it would be substantially different from the way it was originally. If you had the chance to forget email ever existed and start again, what things would you implement - if we could have an email 2.0?

EA: If I were redesigning it today I would make authentication required just as a start. Don't try and build trust in later, just design it in to begin with. That didn't happen initially because - and people don't realise this - the RSA algorithm that we use as the basis for public key encryption wasn't even invented when SMTP was designed. And it's a little hard to require something that hasn't been invented yet! So people say, "Well, isn't that a flaw in the email?" Certainly we'd do it differently today, but we would have had to wait an extra 15 or 20 years to get email to be consistent on that.

There are some details in the protocol I'd do differently. It could be more efficient; there's too many round trip times - you know, little geeky things like that. I would certainly specify what the various header fields mean better, 'reply to' being an example. Nobody can agree on what 'reply to' means...

LXF: Are you surprised that so many years on, Sendmail still commands the place in the infrastructure that it does?

EA: In a sense I'm surprised, not because I think there's something wrong with Sendmail but because there's not a lot of 25-year-old technology out there that's still running.

Obviously Sendmail has been updated over the years, but it is still pretty impressive that it is that resilient. Apache's newer than that... you can go through the list of open source stuff and most of it's newer than that. There's the Berkeley TCP/IP stack, which is about the same age and which is used in a lot of systems. Though not Linux, I believe - Linux folks did their own, goodness knows why, but they did. Maybe it was raining that weekend!

LXF: One thing I've noticed is the frequency of updates to Sendmail now. Is that because you're really on the ball with security patches? Compared with something like Exim, for example, there are probably two or three times as many updates and patches released for Sendmail.

EA: There's actually been some debate at Sendmail Inc. We have gotten very sensitive about security issues. So things that other groups might not classify as security issues, we do. So we've done Sendmail security patches because of bugs in OpenSSL. It had nothing to do with Sendmail whatsoever, but from the point of view of the end user, they don't care. It's a security bug and something on port 25 can trigger it, ergo, it's our problem. I think a lot of other places take the attitude of "it's not our code, ergo, it's not our problem." We have taken maybe an extreme position on it but I want to do the best thing possible for the people out there.

LXF: I guess at least if you're supplying the patches, whether people feel the update is warranted or not is left up to them.

EA: True. You know, frankly, a fair amount of it is that in the past Sendmail has had a patchy security record. I'm not going to try and sweep that under the rug. Quite some time ago though, about v8.8 I think it was, we went through the code. We put in checks so that before it would open up a file it would check to make sure that you had very strict permissions on that file. It's probably the most paranoid program out there - way more paranoid than Apache, where if you configure your system wrong people can go walk all over your filesystem. And people do do this - but nobody blames it on Apache, they say, "You configured it wrong." If you configure Sendmail wrong they blame it on Sendmail. It's just not fair!
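The kind of check described here can be sketched in a few lines of Python. This is not Sendmail's code (Sendmail is written in C and its real checks cover more cases, such as the directories leading to the file); the function name `open_if_safe` and the exact rule used, refusing any file that is not a regular file or that is writable by group or others, are assumptions made for illustration.

```python
import os
import stat


def open_if_safe(path):
    """Open `path` for reading only if its permissions look strict.

    Hypothetical helper: rejects anything that is not a regular file,
    and any file that is group-writable or world-writable.
    """
    f = open(path, "rb")
    try:
        # Inspect the file we actually opened, not the path, so the
        # check cannot be fooled by the path being swapped afterwards.
        info = os.fstat(f.fileno())

        if not stat.S_ISREG(info.st_mode):
            raise PermissionError(f"{path}: not a regular file")

        if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            raise PermissionError(f"{path}: writable by group or others")
    except Exception:
        # Do not leak the file handle when the check fails.
        f.close()
        raise
    return f
```

A program built this way refuses to run on a loosely configured system rather than running insecurely, which is the trade-off Allman is describing: the misconfiguration shows up as a complaint from the program instead of as a security hole.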

LXF: To be fair though, configuring Sendmail is an art in itself, isn't it?

EA: Configuring Apache is an art too. The configuration file for Apache is easier to read, but it has just about as many options as Sendmail does, and a lot of them are pretty hard to understand. If you want a serious website, you need somebody who is an Apache guru. If you've got a serious email site and you're running Sendmail, you need someone who is a Sendmail guru. This doesn't seem like too high a price to pay, to me. And by the way, that's not to bash Apache. Apache's a powerful program and it needs the ability to configure in order to be as flexible as it is.

LXF: In some ways, though, if properly configuring Sendmail were easier there'd be fewer poorly-configured Sendmail servers.

EA: I think that's absolutely true. I converted Sendmail to use the M4 configuration technique some time ago largely to counter that. People still persist in tweaking the config file directly. I consider it a binary file. I will not go in and tweak it - occasionally if I want to do a very quick patch to test something, I will [slaps wrist], but you should be working off the .mc file all the time. That makes it a lot easier to configure. I won't say it's easy, but easier.

LXF: Personally, do you feel a great responsibility to all of these people who have been using Sendmail?

EA: I remember there was a time when a light bulb went on for me and I said, "I can't optimise Sendmail so that it's good for a particular system of a sender or receiver. Because the system on the other end is going to be Sendmail, and, if I optimise it better for this system then it's going to degrade that system. I have to optimise for the whole internet, and that means making some decisions so that from an individual site it may appear to be sub-optimal." But you're right. There is a responsibility to do the right thing. When so many people are running your code you do have to think about that.

LXF: GPL version 3 - good or not good?

EA: I have not reviewed version 3 in great detail, I have skimmed over it. It looks to me like a substantial improvement over v2. I was very iffy about GPL2, and I might actually consider using GPL3. The legal language, from a quick skim, seems to have been tightened up a lot. One of the reasons I have not liked GPL, and I still do not like about v3, is that a licence should be a legal document, not a political polemic - it's just the wrong place for it. Not that I disagree with the politics...

LXF: Does it ever worry you that, if open source did have an Achilles heel, it would be the IP question, just because of the way that open source is? Do you ever worry that code which shouldn't exist could sneak in to Sendmail?

EA: I don't think I'd say yes to that question exactly but I would say yes to a very close question: that somebody either might have or actually produce a patent on technology in Sendmail and then come back and say, "By the way, we own you!" That's one of the problems with software patents. The way you have to justify patents, somebody actually has to have published what they've done. Some of these things... Patents have been given out on things that are so obvious that nobody ever thought about publishing them, because they were so obvious, and then somebody gets a patent and goes back - you're not necessarily dead, but you have to hire very expensive lawyers.

LXF: But that seems to be the way it works. That's one of the vulnerabilities of open source - the people who generally produce the software don't have huge bags of cash to pay the lawyers.

EA: Exactly. Which is part of why I kind of understand on the SUSE or Novell side why they'd want to get this assurance from Microsoft [regarding the 'Microvell' deal]. If you're a corporation, you dare not indemnify your users until you're sure that you yourself are not going to get sued, so I do understand the Novell position on that. I'm not sure it's a good idea, but I understand.

LXF: You mentioned DKIM - what other exciting things are in the pipeline today for Sendmail? I'm sure there must be some things you're working towards.

EA: I actually have some projects that I'm personally doing. You know, when I started Sendmail Inc, one of the big reasons was that I was spending too much time supporting Sendmail and I wanted to go in and be able to code again, to do innovative new stuff etc. It turns out that founding your own company is not the way to get back to coding. That was a big mistake on my part! But I've actually been able to carve out a bit of time recently to start working on some projects.

Whether or not the IETF approves it, it's very obvious that China is going to start using native characters in messages. You can do that in the body now, but in Sendmail, because of the way headers were encoded internally, it was not completely 8-bit clean - well, it is now. Like I said, not exactly things that people will get up and cheer about, but in my opinion it was very, very important to do that and do it before it became a crisis.

Most of the real innovation in email, though, right now is going in to filtering. Version 8.14 has come up with a new version of the milter interface that's more powerful so it can do more stuff in the milter. I think that's where you'll see the major stuff. Now, will that be innovations in Sendmail? Technically no - pretty much 95% of our commercial effort goes into the milter itself, which is as it should be.

LXF: Finally, am I right in thinking you wrote the original Unix Trek?

EA: You're absolutely correct.

LXF: Well, it's crying out for an update!

EA: Do you know, that had not even occurred to me!

LXF: I've wasted so many hours on that!

EA: So did I! You know, some day, maybe I'd go back for the sheer fun of it, go back and look at it, but it's text-based - people want graphics, they want flashing lights, they want shiny things.