Kim Polese and Murugan Pal (spikesource) - interview

From LXF Wiki



There's money in stacks. Apparently. Certified, integrated open source stacks. If enterprise is to adopt Linux it will need lots - and SpikeSource, led by Kim Polese and Murugan Pal, intends to deliver them.

Imagine you're at a dinner party, and find yourself sitting next to a Mike Saunders. After dispensing with the family and work preliminaries, Mike makes a throwaway comment about Firefox, and it turns out you're both passionate about open source - much to the bemusement of your fellow guests. Mike tells you how far the technology has come recently, and suggests that a modern Linux stack could save your business thousands. But the reason why you're just a hobbyist and haven't used it in your business yet is this: who's going to integrate all the software and configure it? Who will test the stack and certify it for you? The people at MySQL or Apache won't, after all. And who'll support the software for your business, and maintain it?

What Mike would tell you, if his attention hadn't been grabbed by the Nintendo console in the host's front room, is that you need a company like SpikeSource. This Redwood City, California-based startup sells "business-ready" solutions to ease enterprise into open source, and has attracted interest not only for offering an important and intriguing service, but for being endorsed by big names like Brian Behlendorf and Tim O'Reilly. Does its business model stand up? Will it give back to the community? And what's so important about testing anyway? Paul Hudson spoke to its CEO Kim Polese (the Kim Polese who may or may not have coined the name Java) and its CTO and co-founder Murugan Pal (seated, above centre) to find out.

Linux Format: So, SpikeSource. What's it all about?

Kim Polese: Murugan and Ray [Lane, SpikeSource co-founder] saw in Spring of 2003 a growing need, which was that as open source became more and more popular, the challenges of managing it - particularly in interoperability - were becoming a real cost for companies. Murugan realised that to solve this problem you really need a new technology: an automated test framework that enables you to be vendor-neutral, enables the company to be vendor-neutral and enables the customer to have choice and flexibility. The goal was to provide customers with the breadth of choice across operating systems, database, application server, language runtimes and the many dozens of components that were starting to emerge.

So Murugan set out to create the team and the framework that team built, and today we're running 26,000 tests across six operating systems, six language runtimes, nearly 80 components, 189 configuration files, 273 parameters... This is a massive problem at scale, and the technology approach is one that requires automation.

That was the first revelation. The second was: not only is automation required, but participation... you can't build a test framework or a test environment that can cover all the combinatorial possibilities if you're one company or one entity. You need to tap into the collective knowledge of the community. So this is a two-way pipe - this test framework was designed not only to automate everything from spidering bug databases to doing the build test certification, delivery and updates, but also aggregating test results from the community, so you'd be continually expanding the knowledge base of configuration.
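To make the "combinatorial possibilities" concrete, here is a minimal Python sketch of how a configuration matrix grows as axes are added. It is an illustration only, not SpikeSource's framework: the interview supplies the counts of six operating systems and six language runtimes, while the placeholder names and the database and application server counts are assumptions.

```python
from itertools import product

# Placeholder values: the interview gives counts (six operating systems,
# six language runtimes), not the actual names.
operating_systems = [f"os-{i}" for i in range(1, 7)]
runtimes = [f"runtime-{i}" for i in range(1, 7)]
databases = ["db-a", "db-b", "db-c"]      # assumed count, for illustration
app_servers = ["server-a", "server-b"]    # assumed count, for illustration


def configurations(*axes):
    """Yield every combination that picks one value from each axis."""
    return product(*axes)


matrix = list(configurations(operating_systems, runtimes, databases, app_servers))
print(len(matrix))  # 6 * 6 * 3 * 2 = 216 configurations
```

Every extra axis multiplies the total, which is why four small axes already produce 216 stacks to certify before component versions are even considered.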

LXF: In the last three years we haven't heard very much from SpikeSource at all. I guess you've been busy doing the legwork.

MP: Exactly. From May 2003 until Kim came on board - she came on board in August 2004 - we were working within [VC firm] Kleiner Perkins' environment, and many people didn't even know what we were doing. We were putting together this test framework, and then assembled the team of architects or founders of other companies.

LXF: You're pitching yourselves as the leader in open source testing, which seems sudden!

KP: I don't know if we're pitching ourselves that way. We can't announce ourselves as the leader in open source testing, because this takes the participation of the committers, open source developers and IT developers who are actually gathering and aggregating the knowledge for granted.

MP: We call it participatory testing. That's the only way to scale. This is basically the derivation of the concept of architectural participation that Tim O'Reilly has been preaching. We want people to come along and test with us because we cannot do that. How can we go and replicate the 30 years' worth of work and the 15,000 employees' worth of work that has been going on in Microsoft? The only way is to work with the community. For example, yesterday a couple of our Perl test committers came to us and said: "Hey, I have 900 tests, can I contribute it to you guys and can you run it for us?" Another committer came back and said, "I have a Perl test harness I have written. Whenever you are going to share these technologies with the customers, can you integrate this piece?" That is what we want.

KP: We're not going around pronouncing ourselves the leaders...

LXF: It's on the website actually.

MP: I think the goal is to be participatory.

KP: We're not... we're very sensitive about...

LXF: Being called the leader? Or calling yourselves the leader?

KP: Claiming that we're the leader. We've innovated I think, in a unique way, this automated test framework, but at the same time we're open to learning and to aggregating knowledge for the community.

LXF: One of the slides from a presentation of Kim's I attended actually confused me. You showed this timeframe for releases - I think it was Red Hat, Apache, MySQL, Ajax and other things, and I thought, "That doesn't make sense." Ajax isn't really a product that gets released; Red Hat already does Apache and MySQL management for you. So I'm wondering where you fit into the equation, because no one really cares when MySQL releases, they care when Red Hat releases.

KP: The point of that slide was to say that there is great variation in release trends with all these different components, and companies are left to deal with the integration, constantly, of new releases and patches. There isn't a formalised structure the way that you have with Microsoft, where they aggregate all those patches and updates and then present them.

LXF: Well, there's Red Hat Enterprise Linux: surely you pay for that and it gives you all the products and all the updated patches for seven years?

MP: Actually the key here is that Red Hat supports certain products - like Apache - whereas the customers want different variations of software, they want Geronimo to be working within their environment, they want JBoss to be working within their environment. Red Hat doesn't support that; not in an integrated fashion.

The second thing is that there are class libraries such as Struts and Hibernate and Spring microkernel, and today there is not an official way of promoting that environment, but the customers are already using it. There are a couple of firms we know that are using 38 components of these class libraries, and interestingly, our Spike Asset Manager, which is a simple Python script that inventories all the open source components that run on your machine, when we ran it on these things, we found redundant reproduction of dynamic libraries six times within these different components.

This is the DLL hell problem. You do not even know what is inside - even in some of the versions that are officially supported - so we go through and remove these redundant reproductions. The other thing is that you talked about Ajax or Red Hat, and that's the beauty of the whole model, because the flexibility is there, and people want to use different things on their own.
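The kind of inventory described here, finding the same library bundled several times, can be sketched in a few lines of Python. This is not the Spike Asset Manager itself, whose implementation the interview does not describe; the function names and the list of file suffixes below are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict

# Assumed set of library file types; adjust for the platform being scanned.
LIBRARY_SUFFIXES = (".so", ".dll", ".jar")


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_libraries(root):
    """Map each content digest to the paths sharing it, keeping only duplicates."""
    by_digest = defaultdict(list)
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(LIBRARY_SUFFIXES):
                path = os.path.join(directory, name)
                by_digest[file_digest(path)].append(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicate_libraries("/opt")`, for example, would return one entry per library whose identical bytes appear in more than one place. Comparing by content rather than by file name also catches copies that were renamed when they were bundled.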

LXF: Red Hat might say that the reason why they don't support things like Geronimo is that it's quite niche. Do you really think you can compete with them on the major products, things like MySQL?

MP: First of all it's not a matter of competing. This is all about coexistence, not replacement. For us, Red Hat is an important partner; there are enough Red Hat servers running, they have a market leadership, it is very important for us to be partners with them.

At the same time, MySQL and Red Hat... if they have an existing relationship actually that's good for us, because the customers eventually want, you know, Apache, and then JK2, Tomcat, Connector/J and MySQL all working together. And they don't want different bug databases to file their bugs, they want one person to call. If the customers already have an established relationship with Red Hat and MySQL, and maybe SpikeSource for all these 'long-tail' components, they will call SpikeSource, and we will still leverage the existing support relationship with Red Hat that MySQL would have, and we would work with them at the back-end to resolve the issues.

LXF: What if one of your customers went to Oracle and had them supply him with Oracle, Red Hat Enterprise Linux and support for the lot, and then the customer rings them up and says, "Hello Oracle, my server's not working. I installed Geronimo from SpikeSource, what can you do about it?" Oracle could say, "Call SpikeSource, don't call us." That's probably what a lot of these companies don't want to be hearing, I would have thought.

MP: You see, coming from Oracle I have seen this kind of thing. Let's forget about open source, and Geronimo and Tomcat. Let's roll back to 1996 and what we had then. We had the Solaris operating system, Netscape enterprise server, and then NSAPI plugins, Oracle PL/SQL cartridge, Oracle database connectivity, a database running on Solaris or HP-UX. If this chain breaks, there is no single place for you to call. This interoperability is still a problem whether it's commercial or open source. But where open source actually enables and empowers people is: if the customer's engineer is more powerful or smarter than our engineers, guess what? They're going to fix it. They will contribute it to us and we will test it, and we will validate it. So that is why this participatory testing is important.

LXF: Which parts of your core stack are actually open source?

KP: The whole thing.

LXF: Is Java part of your core software stack?

KP: We certify across Java as well as Perl, Python, PHP...

LXF: Is it part of your professional core?

MP: If you install our core stack today, for the Java VM alone there is a separate licence, because we had to comply with Sun. But all other pieces as Kim mentioned, we do a passthrough licence model. That means that if it's Apache Software License (ASL), or BSD, or GPL, we pass through that intact.

LXF: Don't your customers have to get a lawyer to look through every individual licence to make sure it complies?

KP: There are companies now, Black Duck and others, that are starting to offload that task. It's not our business, but we're partnering those companies. For example, we have an asset manager tool that we've integrated with Black Duck's licence calculator.

LXF: You're trying to introduce yourself to the community. What sort of support are you looking for back from them?

KP: Participatory testing. We're looking for test contributions.

LXF: So you want people to give their results back to SpikeSource?

MP: Actually, both ways. As Kim says, it's always bi-directional. If you look at testing as a process, there is a concept called code coverage testing. That means you will run some tests and find out how many lines of code or branches or paths have been tested. There are tools that are available for C, C++ and Java. But there was no such tool that existed for PHP. So one of the engineers [at SpikeSource] implemented a PHP coverage tool and open sourced it; it is on SourceForge. We touched base with Andi Gutmans and Zeev Suraski, and they really like it.

Now, once that tool has been done, we actually work on system implementation with our testing services partners. There are a couple of companies who we work with. They have actually tested phpMyAdmin and phpPgAdmin, which are database administrator tools written in PHP that have never been tested. We've generated 1,900 tests to test these things and we are contributing back to the projects. And this Spike PHPCoverage tool, or the asset manager tool we wrote... any contributions from our side will be open sourced, as Kim says.
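The idea behind code coverage testing, running tests and recording which lines actually executed, can be shown with a small Python sketch built on the standard `sys.settrace` hook. This is not how Spike PHPCoverage works, which the interview does not detail; the helper `measure_line_coverage` and the sample function `classify` are hypothetical names used only for illustration.

```python
import sys


def measure_line_coverage(func, *args):
    """Run func(*args) and return the line offsets executed inside func.

    Offsets are relative to the function's `def` line, which is offset 0.
    """
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the function under test.
        if frame.f_code is not code:
            return None
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


covered_by_one_test = measure_line_coverage(classify, 5)
covered_by_both = covered_by_one_test | measure_line_coverage(classify, -1)
```

A single test with a positive number never reaches the `return "negative"` line, so `covered_by_one_test` is missing that offset; adding the second test closes the gap. A real coverage tool reports the same thing at project scale, and may track branches as well as lines.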

LXF: Your advisory board of directors is like a Who's Who list: Brian Behlendorf, Mitchell Baker, Tim O'Reilly... What does each of them bring to the company?

KP: Not only are they well known and highly respected in the industry but they are actively participating in helping us shape this company. Because they are excited about what's possible and they know like we know that there's no sort of guidebook or rulebook to follow. This is not just an advisory board - you know, let's have a masthead, or a list of people that looks good on a website. Each one of them is truly excited about what we're building here.

MP: Kim was very instrumental. The advisory board was thought up by her; she got it done very quickly when she came on board, and that was very instrumental in our success. But one thing that we said to them was: "We want you to use SpikeSource to practice your passion to accelerate and promote open source ­ however you want." Larry [Rosen] might come from a licensing point of view, Steve Weber purely from a sociology point of view. Our goal is to help them to practice their passion through this company.

LXF: I think people are seeing venture capitalists, and big names coming along to a dotcom company, and words like 'productising' and are thinking, "Hmm, this is a buzzword, this feels like the dotcom boom all over again." What would you say to allay their fears?

KP: I would have the same suspicion if there weren't a real problem that we had some tools to solve. In other words, a lot of the dotcommers in the late nineties were creating products that weren't really solving critical problems. In this case, Murugan and Ray identified this problem and realised that a new approach was needed, so from a technology standpoint and the way you work with the community and build a company truly based on the architecture of participation... that's what we set out to do.

MP: Actually the proof is simple examples like this. Sleepycat of Berkeley DB came to us and they gave us 9,000 tests. And we are spending our electricity, network bandwidth to validate on multiple platforms, to guarantee interoperability. That's the key. Our vindication, our conviction is to help people in this process, unlike the dotcom companies, and our mode of software delivered as a service - [so that] people can download this integrated software.

Previously, for people who do not know anything about this it might take three weeks to integrate it. For people who know this and who have gone through this, it might take anything from eight hours to 24 hours to integrate their core stack. Whereas in this case [it takes] 15 minutes. They're ready to go, and then they can look at what they can work on.

KP: If we can save companies money, and give them flexibility and all the benefits of open source without the overhead and risk; and if we can work with the community to help projects like JetSpeed and Struts and Postgres and others get broader certification across multiple platforms; and also publish tests to the whole community so everybody benefits, that to me is a definition of a successful company. We'll be solving a real problem, from a business standpoint, and also hopefully benefiting the community, to create a much richer environment for everybody.

LXF: So what you're thinking ideally is: someone buys Red Hat Enterprise Linux then they go along to SpikeSource and get all the other things they need to complement it. You say you're vendor-neutral, but I wonder how interested you are in promoting a wholly open stack? I notice that about half of your machines are running Windows. Are you happy to say, "Apache, MySQL, PHP, Windows?" Or are you looking for something open source across the whole stack?

KP: We're truly vendor-neutral, in that if one of our customers wants to use, say, SugarCRM on Windows, that's fine. If they want to integrate Apache and JBoss with Oracle, that's fine too. So we'll certify whatever the combination of components is that's most popular and convenient to use in the enterprise and the open source world.

MP: But at the same time internally, as an ideal, our engineering runs on Linux desktops. We run on Samba, Postfix and the Courier email server, and that's a cost-saving thing. Because at the end of the day, it's about customer flexibility but also cost saving.

LXF: And how do you see the 'productisation' of open source in the future? Apart from SpikeSource being really big!

KP: What do we mean by productisation?

LXF: More like, how do you see open source continuing to be productised in the future?

KP: For one thing, we see an endless abundance of open source components, continuing to be generated by the community and adopted by the enterprise: it's this long tail thing, it's not going to go away by any means. And that means that the complexity will continue, so the challenges around integration and interoperability will need a new approach. So that's one thing we see.

We also see open source going everywhere. If you look at the new generation of open source applications, like Sugar, JasperReports, Alfresco... every day, almost, there's another one. In my opinion just about every application you can imagine will be available in some form as open source software. So there will be a choice. We see open source not only penetrating the enterprise; small-to-medium businesses will also benefit, as will, I think, developing economies. We'll see open source going into embedded devices, equipment... everywhere software can run, open source will be there, in my view. LXF