Kapor on Wikipedia at SIMS

Host: Since the room is full, I'm going to take the prerogative of introducing our guest right now. Those of you who aren't here yet will just have to find a space. I'm really delighted to welcome you to the final lesson in our Distinguished Lectures Series this semester. We have a really special guest, Mitch Kapor, who comes to us first as a teacher. He's been teaching a course on open source software this semester, which is a very popular idea. Mitch is currently the president and chair of the Open Source Applications Foundation, which, for those of you that know, is a nonprofit that promotes the development and application of high-quality applications and software using open source methods. But he may be the best known, he's widely known for founding a company called Lotus Development Corporation and actually designing Lotus 1-2-3, which was of course the killer app of this era, and really did sort of transform the PC industry and bring it to businesses around the world. He has all sorts of other distinctions that I would read to you, but it would take up all of his time! He was a DJ at WHCN FM in Hartford, Connecticut, a commercial progressive rock station, in 1970. He taught TM in 1977 in Cambridge. He founded Lotus in 1982. I'm just highlighting things that my eyes stopped on. In 1990, he founded the Electronic Frontier Foundation with John Perry Barlow and was its first chairman. In 2003, he became the founding chair of the Mozilla Foundation; many of us know that very well. And you've probably read a lot of his work in many different places, newspapers, magazines, journals. We're really lucky that he now lives in the Bay Area in San Francisco, and we're hoping to see more of him here in South Hall. And without any further ado I'm going to introduce Mitch!
Mitchell Kapor: Thank you. So, not only do I teach on campus, I teach in this very room on Monday afternoons, and some of the students who are genuinely masochistic from that course are here this afternoon, as if they haven't gotten enough of me, so thank you. I have several apologies to make. The first is, as you can probably hear, I'm hopefully in the end stage of a cold, and I'm somewhat hoarse. I actually haven't spoken for more than five minutes continuously in over a week, so my voice is not coming with a warranty. We're just going to see how this works out. And also, we'll apologize in advance to Larry Lessig for shamelessly stealing his presentation format, which is becoming increasingly popular among people who do presentations. That may actually be as big a legacy as Creative Commons, and everything else Lessig has done is sort of changing how we do PowerPoint and KeyNote and things like that.



Let me ask you a question, since Jimmy Wales, the founder of the Wikipedia project was here last week and spoke, could I see the hands of people who heard his talk? Okay, so less than a quarter. If you folks will bear with me, I will do the minimum amount of necessary repetition to get everybody on the same page. I don't know whether it's going to be virtually everybody or not, but my talk builds on his and thanks to Joe, I was able to actually, couldn't get here to hear his talk because I got stuck in traffic on the Bay Bridge, but I was able to hear the recording. How many people use the Wikipedia? If I could see hands here? Oh, good. All right, we won't have to deal with the basic introduction. And how many people have ever contributed an article or edited anything on the Wikipedia, could I see? Okay, so about half. Pretty sophisticated crowd, that's great, that means we can have some fun.



So, I've called my talk "13 Ways to Look at the Wikipedia" and my final apologies to Wallace Stevens for borrowing his title. Number one: the Wikipedia can't possibly work. When I describe it to people who have never seen it or used it, and I basically say it's a completely open encyclopedia written entirely by its volunteers, eight times out of ten, this is what I hear. But it does, it does work. It's really amazing. It is now one of the top 40 websites in the world. A few months ago it was 50, Jimmy said in a good week, it's number 30. I think it's heading towards the top 10 -- of any website. That's eBay, Amazon, Yahoo, ... It's continuing to grow very, very rapidly. I could give you more statistics, but in the interest of time and the fact that you all use it, I think we can take for granted that at some level, it works. But it can't possibly work. It's so counter-intuitive. When I started using it, I understood perfectly that it was open and anybody could contribute, and that it wasn't centralized, but I became interested in it precisely because it did work despite the fact that something in me said that it shouldn't. And so, really, it doesn't look like it should work, but it does. And we go around and around on that, and around and around, but there's really a Zen sort of koan aspect to it. When you're given a paradox to wrestle with, you can either kind of go down with it because it's insoluble in its own terms, or you can transcend it, the paradox, because you find out you've had some limiting assumptions that you didn't know you had, and it's only an apparent paradox. So this talk is my effort to recount how I've wrestled with some of the paradoxes of Wikipedia and share that with you. And I promise I will leave time for questions, and we'll do something interactive.



Anyone can edit any article at any time. Not only is this approximately true, it is literally true, which is one of the most striking things. And when you tell people this, it sets them off, and I love to give the uninitiated demos, and I say, "Pick a topic." And we find the article on it. There are 800,000 articles in the English language edition of the Wikipedia at the moment, 300,000 in the German, several French, and I forget which others with over 100,000, and probably 30 different language editions with over 10,000 articles. So it's a very serious global project. And anyone can edit any article at any time; it's kind of just like the Internet. In the early days, I understood the Internet as really being driven by the two-word phrase "Anyone can." Anyone could set up a site, anyone could get connected, ... It has that quality of openness to it. And that really appealed to me, has always appealed to me when I found something like that. It's the sense that there are very low barriers to entry.



It probably has something to do with the fact that I was just a total social outcast as a child. My parents -- I will not give you the sad story of my childhood here, you're not interested. But I will say I was two years younger than everybody else in school, and that was not a big help in terms of social inclusiveness, especially being a boy and maturing more slowly. So no matter how many standard deviations from the norm you looked out, I was not in that in-group. And so the idea that you could actually participate in a system because there really weren't any barriers -- or nobody to say no to you or exclude you -- has always been very appealing from the early days of the personal computer industry up to the present.



But if anyone can edit any article at any time, and you tell that to an intelligent but naive person, they're going to say, "But what about vandalism? Do you mean somebody could come in right now and mess up your entry?" And I say, "Yes, absolutely, and that happens sometimes." -- "Does that mean that someone could intentionally put in false or misleading information, and you wouldn't know?" And I say, "Yes, that also happens some of the time." -- "Does that mean you could have very opinionated people with no actual command of the facts being highly participatory and spreading their nonsense all over it?" -- "Yes, that happens." All of those things happen, but despite that, the quality of the Wikipedia, to put it as conservatively as I would put it, is actually pretty good. Sometimes it's terrific, and there are areas with problems. And that's really the paradox. How does it manage to be somewhere between pretty good and fantastic despite all of those things happening? Well, we're going to get to that very shortly. But what people assume is someone has to be in charge if it's going to be any good. And I love talking to people about the Wikipedia who don't know about it because it helps people find their deep-seated unexamined belief that authority is a necessary component of all working social systems.



Having grown up in the Sixties and kind of having problems with authority, I love this because it's a great counter-example. It's no longer theoretical. In a conventional sense, nobody is in charge. We'll talk about governance and Jimmy Wales and the subtleties of it, but by any standard that people would recognize, is there a boss, is there a hierarchy? No. Someone does not have to be in charge, it only seems to us that everything has to work that way. I heard something last week that I didn't know that testified even to how deeply embedded this is in me. Jimmy was talking about the fact that Wikipedia now -- I forget how many servers there are, 30 or so, they're on several continents, some number, billions of page views per month. So they have to have system administrators. You know, occasionally machines go down and stuff happens and so on. It's all done on a totally volunteer basis. There is no schedule of it. There is nobody assigning sys admins. There's a pool of -- I think it's actually about a hundred, he didn't say that -- of people with sys admin privileges, and they just sort of assume that at any given time about a dozen people are actually going to be online and watching. And in fact, it has worked so far. It has scaled. He did admit that they have each other's phone numbers, if something goes wrong, they can call. But I just assumed, "Well yeah, Wikipedia is very decentralized, but of course they have scheduled a sys admin because you can't let the site go down if something happens..." No, I was wrong again.



Second big thing where people have an unexamined belief is that without experts, you really can't trust the information. "You've got to have experts. How are you going to know if it's any good?" So it's similar to the other assumption. We don't even know that we have this. And people usually go, "Well, okay, experts, yeah, they add value, and in certain circumstances, I really want to have an opinion from an expert if it's a life-threatening issue." That's all well and good, but what I think the Wikipedia reveals when you get into discourse with people about it is they just have trouble accepting the fact that you can have good quality information without having experts. And experts, of course, have to be certified and registered, so the idea of expertise and authority kind of go together in people's minds.



When you begin to examine this, this is a little bit odd because I thought we had learned about not giving our trust over simply to people because they are experts and because they are authorities. And I thought we learned this but evidently not. So we're still wrestling with these issues in our society, and again, that's why the Wikipedia is such a useful kind of actual thing, it's not theoretical, it's there, you can go play with it, you can go, and everything is transparent. It's not necessarily easy to find where everything goes on, but the whole process by which it's managed is conducted in the Wikipedia itself, so if you know which page to go to, you can actually see it. And everything that doesn't happen on the Wiki happens in IRC, and all those channels are public, too, so it's very, very transparent.



So I've become a big Wikipedia evangelist. My user ID is Mkapor. You can see which articles I've contributed because you can see which articles everybody has contributed. And when I meet an interesting person who I think brings some value to the world, I ask them if they have a Wikipedia entry, and if they don't, I tend to create one for them. So if you look... I mean, if it's going to be a repository of all the world's knowledge for all the people of the world in their own languages, someone has to -- I mean, I'm a volunteer.



So I met Freeman Hrabowski. How many people know who Freeman Hrabowski is? You should go look it up in the Wikipedia. He is the president of the University of Maryland Baltimore County. But he is perhaps the leading expert thinker and genius about educating African American Ph.D.s in the sciences and has an unbelievable track record what he's done at UMBC. And he's written a book or two, but people don't know. That's why there needs to be an entry on him. So after I met him when he was here on campus, I created an entry on him.



But what I hear when I give people the spiel about Wikipedia -- if I haven't lost them yet -- is, "Well, I'll never use it. I just can't trust it." What's even worse is when I go to gatekeepers like -- and you'll hear more on this -- people who run big nonprofits with lots of intellectual capital on their website, and I say, "Maybe it would be a good idea to get your people to take some of the stuff that's on your website that's interesting and valid and fits the criteria, and put it into the Wikipedia." -- "Ha, over my dead body." So there are still some legitimacy issues that I'm going to talk about -- what the barriers are, what some of the challenges are, and so on. Because those people who are enthusiastic about it get it pretty much, but it's still a small percentage.



So let's come back to our mantra: Anyone can edit any article at any time. I thought it would be unfair not to say something about -- in the light of the fact that there is vandalism and people write scurrilous things sometimes -- how some of the quality control is maintained. One assignment we did in the class I am teaching here, early on, was to have everybody in the class -- this is an open source class -- actually make a contribution to the Wikipedia of some kind, and then write a short paper about it. And it was really fascinating, and that practice has now been picked up in at least one other class. And I hope it will be picked up much more widely. So, one thing that people found was that some people were attracted -- you know, when it's an assignment, you kind of have to do it to get credit for the course, so I admit it's a slightly unnatural situation, but it did give people motivation, albeit perhaps slightly artificial. People reported sometimes having trouble figuring out what to do, but one pattern was that people said, "Well, I was reading articles, and I found problems. I happen to know about knitting or making acoustic guitars, and I read there was a problem, so I went in and I fixed it." And I said, "Aha! You see, that impulse times about a million is how the thing manages to get better." So the fact that it's wrong today is one consideration, but the question is, what are the dynamics of change? And there are a lot of dynamics for improvement because it's a lot easier to edit an entry in the Wikipedia than it is to write source code for an open source project. And many, many, many more people, by orders of magnitude, have the ability to do that, and they stumble on it. And still, there's kind of a learning curve, there are some barriers. If you say, "I want to edit stuff," I would say it's not exactly user-friendly, but despite that, there is this vector of things getting better. That is because people like to improve stuff, and they like to feel like they're making a contribution.



And another dynamic is it's very easy to revert an article back to a prior state. There is a complete history kept of every edit of every single article, which is easily visible. And there are people who, as volunteers -- a big part of what they do is to look at changes that have been made, notice vandalism and go back and fix it because it just takes a few seconds to revert it back. So there's a Wikipedia entry on me, you can look that up, it was not written by me but basically taken from my website, the biography. At some point over the summer, somebody humorously edited it to change probably about a dozen things. So they said I was married to Grace Murray Hopper and -- it was clearly somebody with a sense of humor, and it was clearly done as a demonstration of how easy it was to change. And sure enough, within -- somebody can look up here, I forget how long it was, but it was a brief period of time -- somebody came in and just reverted it back. But you can see the humorous version, and that happens a lot. In fact, there are some volunteers who consider it to be their civic duty to go and clean up after vandals. And they do that, and it's kind of a game that they play. "Who can get to the stuff fastest?" So those are dynamics that really help a lot. In the interest of time, let me go on. Yes?
Audience member: You might want to mention the watch list.
Mitchell Kapor: Sure. There's another mechanism -- because they don't have modern stuff like RSS feeds, which we'll talk about in a minute -- called the watch list. If you are interested in an article, you can put it on your watch list. And what that means is anytime that article changes, you can go to your page of watched articles. There's an entry there so you can easily track changes to anything you've written, anything anybody else has written. And people do that because they sort of take responsibility for some area and they watch what's going on, and they're regular participants in it. I should say these few mechanisms I mentioned are just the tip of the iceberg, there are probably dozens of these things that people do. So, all right, that's just the first of 13, the others are shorter.
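
The two mechanisms just described, a complete edit history that makes reverting cheap and a watch list that flags changes, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the Wikipedia software is actually implemented; the names `Article`, `edit`, `revert_to`, `Watchlist`, `watch`, and `changed` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Toy article (hypothetical) that keeps every revision ever made."""
    title: str
    revisions: list = field(default_factory=list)  # list of (author, text)

    @property
    def text(self) -> str:
        # The current text is simply the most recent revision.
        return self.revisions[-1][1] if self.revisions else ""

    def edit(self, author: str, new_text: str) -> None:
        # Anyone can edit; nothing is overwritten, a revision is appended.
        self.revisions.append((author, new_text))

    def revert_to(self, index: int, author: str) -> None:
        # A revert is just another edit that restores an earlier text.
        self.edit(author, self.revisions[index][1])


@dataclass
class Watchlist:
    """Remembers how many revisions of each watched article were seen."""
    seen: dict = field(default_factory=dict)  # title -> revision count

    def watch(self, article: Article) -> None:
        self.seen[article.title] = len(article.revisions)

    def changed(self, articles: list) -> list:
        # Titles of watched articles that have revisions newer than last seen.
        return [a.title for a in articles
                if a.title in self.seen
                and len(a.revisions) > self.seen[a.title]]


page = Article("Mitch Kapor")
page.edit("biographer", "Founder of Lotus Development Corporation.")

watcher = Watchlist()
watcher.watch(page)

page.edit("prankster", "Married to Grace Murray Hopper.")
changed_titles = watcher.changed([page])  # the watcher notices the change
page.revert_to(0, "volunteer")            # restore the first revision's text
```

Because the revert is stored as a third revision rather than deleting the second, the humorous version stays visible in the history, which matches the behavior described in the talk.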



In my opinion, Wikipedia is the next big thing, and I have a kind of history -- for those of you who know -- of being involved pretty early with next big things, so... In '78, I thought the PC was the next big thing. I had my little Apple II. I used to hang around MIT. I wasn't a student, and nobody there took personal computers seriously. They were toys. This was before Clay Christensen had written about disruptive innovation, explaining it all. So I just became convinced that we were going to have to make something out of these PCs to convince people. 1982, I thought a five hundred dollar spreadsheet was the next big thing. Fortunately, there was at least one other person who believed me, which was a venture capitalist. That was Lotus. To tell you where expectations were when I wrote the Lotus business plan in '82, we forecast first-year sales of 3 million dollars. Not unreasonable, because the biggest company in the whole industry at that point had just had a 10 or 12 million dollar year. So to come out of the box and be 25 percent of the biggest guy would be pretty good. In '83, we actually had 53 million in sales, so that was a 1700 percent forecasting error. Fortunately, the variance was favorable. The next year, sales tripled again to 150 million. I didn't know it was going to come out like that, but I just felt it's the next big thing, and people are going to want to have PCs, and they're going to want to have tools on them. And that was Lotus.



1992, I made an investment in a company called UUNET, one of the first ISPs, which returned about 300 to 1. I was too chicken to put in a lot of money, so again, kind of knowing what the next big thing is one thing, and sort of knowing how to take advantage of it is a slightly different thing. But I tried to get John Doerr, a premier venture capitalist, to take a look at this company, and he said, "What?" And he wouldn't take the meeting because it was just too soon, the idea that -- I mean, remember 1992, and you people are old enough to remember 1992, unlike many of the students here.



All right, we won't talk about streaming media in 1995, but to me, the Wikipedia is that order of next big thing. But the question is how it's going to play out. Who are going to be the winners, who are going to be the losers, what's the time frame, is it the Wikipedia itself, is it something like the Wikipedia but different, ... There's no way to know that, I mean, I think theoretically, there's no way to know that. I don't believe in precognition, but what we can do is to sort of inform our intuition better by diving into what currently is in order to figure out what might be. And that's about all you can do.



So what is it that drives the Wikipedia? Number three. It's community. Jimmy said it last week. It's not technology, and I think he is absolutely right. He said, "Look, it's not an ant hill. It's a community." It's people actually relating to each other. The quality of the articles does not emerge through some random stochastic process. It's not like a market; it has nothing to do with Adam Smith. It's a community. It's an Internet-enabled community of a particular type, but the magic is that it's, in a sense, not magical. It has lots of things that other successful social organizations have. It has a vision, a free encyclopedia of the world's knowledge for the world's people in their own languages. The vision is the goal. It sort of states what the whole project is heading to. It's very helpful because it provides a kind of an orienting field for a million questions large and small. Because we ask is this thing that we're thinking about doing, is it going to get us to the vision, or is it going to take us further away? The Wikipedia has always had a very clear vision. Also, it has a clear mission. The mission of the people in the project is to make this thing happen. So they have a kind of a purpose. So it's not just a goal in the abstract, but it's a goal and a direction. And it has an active sense of identity for its members, they call themselves Wikipedians.



Last summer, I went to Hamburg, Germany, to go specifically to conference called Wikimania, which was the first time the Wikipedians have ever gathered together on a worldwide basis. Being a globally distributed community and being a volunteer community, there would be no occasion to get together. The logistics are pretty hard unless they made a specific point of it. And I always figured if you really want to understand what's really going on, you've got to go and meet the people and talk to them face to face. And these were people with deep connections to each other who had been working with each other for a number of years, who were actually meeting each other face to face for the first time. It was quite remarkable in many respects. As a community, they have values. There are things that they believe in together, and we'll get to that in a minute. But values are part of what holds the community together. And as a community, they have practices, things they do. Cultural practices. Ways of doing things that are consistent with the values that advance behavior in accordance with the mission towards the vision.



So, I know nothing about social theory, nothing formal. But it just seemed to me that these simple concepts of community, participants, values, practices and -- oh, it has a leader, and that's Jimmy Wales, a really interesting model of leadership. I don't know if there could be leaderless communities, but the successful open source and open content communities all -- except for maybe the Apache community, but we can get into that if you want in the Q&A -- by and large they all have leaders who exemplify a very different style of leadership than the stereotypical model of a corporate leader sort of directing the action and commanding the troops to march forward. The Wikipedia style of leadership actually, in some sense, is imposed by the fact that, look, there are no employees. Well, there are now two employees. We can get to -- there was one, now there are two, but... 99 point something percent of the people who work on it are volunteers. You can't tell them what to do. Because if they don't like it, they don't have to do it. So you can't command. And also, there's no money. There could be money, a lot of money. We can talk about that, too, but the whole thing has basically been run on, not nothing, but maybe half a million dollars to date of donated services and goods, and more now because of all the bandwidth and the servers and so on, but remarkably little. It should go into the category of things money can't buy. So there's a) love -- I didn't think to make a slide for this -- but b) a Wikipedia. If you had had a billion dollars five years ago and you could pick the smartest entrepreneur in the world and say, "Mister or Miss Smart Entrepreneur, here's a billion dollars. I would like you, in five years, to build this encyclopedia with it." And the statistics, and there's a billion words in the English version now. A top 30 website in the world -- I don't think it could be done. You can't spend money to do this sort of thing. People have to find their own motivations, and it has to be this aggregation of really widely distributed decentralized volunteer work. And the leadership task is to figure out how to enable that. And, I think, even more than that: how not to disable it. We'll talk a little bit more about that. But it's community. I don't think it's a big mystery.



I think there's kind of a mantra to dive down to the next level about how the Wikipedia works. Let a thousand users edit. So it turns out that -- and here, I think, we're talking about the English version, but I believe similar things are true for the other versions -- the bulk of the edits to the Wikipedia are made by a fairly small distinct group of users numbering not more than a thousand. You know, fifty, sixty, seventy percent. Now that's not necessarily the number of words because it might be that these thousand users do a huge number of little edits. If you actually go and look at the edit history, there are people who do writing and adding stuff, and then there's people who clean it up, put the links in, make sure the syntax is right. They're editors. But one of the reasons -- and I think this is both the good news and the bad news, and it really speaks against the ant hill theory -- this is a coherent community of about a thousand, where not everybody has regular contact with everybody else, but routinely, many of these people have contact with each other online and know each other and have personal regard for each other and have an appreciation of strengths and weaknesses and how they work and when to leave them alone and when to push them... I think what's remarkable is that you can get something like Wikipedia coordinated by a large number of volunteers. The first thing is that you can get volunteers at all. The second is that they can self-coordinate. Because nobody is saying, "Go do this, go do that." Everything is self-designed. I mean, it really is, and that's part of the miracle of it. And it's also, if you think about it, a fairly small number of people. None of these people do it full-time. I shouldn't say none, but they all presumably have jobs or are students or...



But I'll tell you, we don't really know the answer to "What do these thousand people do with themselves?" There are so many interesting research questions here, and it's such a great subject of research. I'm always promoting the idea that people should do more research about it. So not only is it a community, but it's a real virtual community, by which I mean a community does not have to be physically real to be metaphysically real. People actually have regard for each other; they care about each other as people with respect to their contributions to the project. And that matters a lot. So a great example is that people have reputations. Now, these are not like eBay reputations, which are a kind of mechanical sort of thing that is quite subject to gaming, although quite useful. These are actual reputations based on actual behavior, so less tidy but more nuanced. So if there's a discussion on a subject -- and every article also has a talk page for discussion -- even if ten people have one opinion, if the eleventh person is someone who is highly respected in the community or sub-community, more often than not, that's where the conversation will stop. So it's not like, in that case, voting ten to one. It works kind of like the real world, a good part of the real world. And that's interesting. Again, it's so non-magical that it's -- no, it's magical.



So let's talk about the values a little bit. And again, it's probably a list of ten or fifteen core values. I just picked out two. This is so profound. But no fooling, if you go to a Wikipedian and you go, "Is 'be nice' a value?" They're going to go, "Absolutely, it's number one or number two." It's actually a value there, and they mean it. This is sort of like -- the last time this stuff got taken as seriously was in kindergarten. But by consensus, it's a be nice sort of place, you're just nice to people. It is not okay to go and tee off on people just because you're in a bad mood. So it is just so opposite most of the rest of what's published on the Net. And it holds things together. People like being there because people are nice to each other. Duh! The only remarkable thing is that it actually exists and works as well as it does. Be nice. So, what it does, though, is it undercuts one of these hidden assumptions that we have. That's why it's interesting, and that's why I put it up here.



There's an unconscious expectation a lot of us have basically because we've ingested the dominant culture, which is a culture of business. It's the bottom line of our society, to use a business metaphor. And there's a kind of ethos in business that justifies cruelty based on necessity, you know, nature, red in tooth and claw. And it's so internalized, we may not even know that we have it, but it's kind of the zero-sum game thing. The justification for all sorts of petty cruelty in the business world comes from thinking that the competitive scarce resource aspect of reality is the only significant one that matters in the carrying out of activities. The Wikipedia is a fabulous counter-example, to the extent that you can actually have a large-scale, high-impact thing that explicitly does not share that premise. That's very hopeful because it leads me to ask, "Well heck, if they can do it, can anybody else do it? Is this generalizable, or is this just one of a kind?" And we'll talk about that. I think it's interesting.



Jimmy Wales describes himself over and over again as someone with a libertarian political philosophy. In fact, he says he was a kind of an objectivist, you know, Ayn Rand. And I found that just totally striking because the net dot libertarians that I know have a tropism towards the emotional development of teenagers and a kind of self-absorption and selfishness to stereotype, I'm stereotyping here. That is just not the way things are run on the Wikipedia. So I probably have had a much too narrow a kind of personal take on that. I've been assuming that if you call yourself a libertarian or whatever, you're really a jerk. It's probably because in the early days at the Electronic Frontier Foundation, I just got harassed terribly by people. I don't want to go into a long sad story, so you can ask me more about that if you want. But I said to Jimmy, "I don't understand this. Because you describe yourself as a libertarian and an objectivist, but as far as I can tell, you don't do anything any different than what the Dalai Lama would do if he were in your shoes. Trust me, he's not libertarian." It may just be that our conventional labels don't apply very well, whatever they are, with this type of phenomenon. And that, I think, is the deeper point.



NPOV is probably the top acronym on Wikipedia -- neutral point of view. So it is kind of a tenet, a sort of value. It just says, "Articles should be written from a neutral point of view." But there's an enormous amount of depth to that and lots of discussion about it. And it has served things very well because it is the basis of trying to form agreement between people who do not otherwise agree. Is there some set of facts about this contentious subject -- whether it's North Korea or abortion or Palestine -- that people can agree on even though their opinions and values differ? Not always possible, but having a neutral point of view in the content of the Wikipedia has been one of the main drivers towards convergence rather than divergence. It's very interesting; I think it's true at a meta level.



So I asked at one point, "So is there an obligation or responsibility on the part of Wikipedians to ensure that the content of it is inclusive in a comprehensive sense?" Now, why is this an issue? Well, because in the early days, there were many more articles about Middle Earth than there were about Africa. No longer true, because it reflected the interests of the people writing things. But the answer I got was, "Yes, there is that responsibility, so if we determine that we're really not representative, we Wikipedians of the world, we have an affirmative responsibility to go and bring people in who are going -- who in sum will be more reflective of the diversity of the world itself. And in that way we will not implicitly be saying Middle Earth is a lot more important than Africa." So I thought that was pretty good that there's a real willingness to walk the walk, not just talk the talk.



Practices. This is one of my favorite ones: "Don't criticize; improve." You can literally find that written down. It says, look, if you find something that's wrong, don't bitch about it; go fix it. It's a community value. You get points for doing that. You get a finger wag if you don't. And it's practical and it's like grade school stuff, but it's being applied and it works, which is really great.



Another practice, I hadn't heard the term till last week: real-time peer review. I'll just tell one story. So, there is a recent changes log in the Wiki of all the changes that are made, and people used to watch it because that's one big way to monitor quality. It got to be so many changes that a human could not keep up with it. What they were actually doing is they were piping the recent changes into an IRC channel, and things were moving too quickly. So what somebody did is to write a tool in Java that ingested the IRC feed and let you put filters on it, so you could watch for new articles or articles being created by someone at a particular IP address. And particularly you could get a feed of all of the edits being made by anonymous users. About 20 percent of the contributions are made by unregistered users just from an IP address. Some of them are people who just forget to log in. But of course it's very thoughtful of the vandals to be anonymous because then they're easy to find. And that's what people do, and it's somewhat competitive to watch the filtered feed of the anonymous contributions because that's a good place to find people messing with the system. So that's real-time peer review.
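
The filtering idea can be sketched in a few lines. This is only an illustration, not the actual tool (which was written in Java and read an IRC feed): the `Change` record, its fields, and the sample feed are all hypothetical, and an edit is assumed to be anonymous when its user field parses as an IP address.

```python
from dataclasses import dataclass
import ipaddress


@dataclass
class Change:
    """Hypothetical record for one entry in a recent-changes feed."""
    title: str
    user: str      # account name, or an IP address string for unregistered edits
    is_new: bool   # True when the edit creates a new article


def is_anonymous(change):
    """Assumption: an edit is anonymous when the user field is an IP address."""
    try:
        ipaddress.ip_address(change.user)
        return True
    except ValueError:
        return False


def filter_feed(changes, predicate):
    """Yield only the changes that the predicate accepts."""
    return (c for c in changes if predicate(c))


# Made-up sample feed, for illustration only.
feed = [
    Change("Africa", "Alice", False),
    Change("Middle Earth", "192.0.2.17", False),
    Change("South Hall", "192.0.2.44", True),
]

anonymous_titles = [c.title for c in filter_feed(feed, is_anonymous)]
new_titles = [c.title for c in filter_feed(feed, lambda c: c.is_new)]
```

Each filter is just a predicate over one change, so watching for new articles, for a particular IP address, or for all anonymous edits is the same loop with a different test.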



Okay, I've got to pick up the pace a little bit because I want to try to go another 15 minutes and then leave time for questions. So this is going to be about the future. So how big is a fully-grown Wikipedia? I don't know, but I'm going to speculate and I hope that somebody challenges me or it starts a discussion. We call it an encyclopedia, but if you actually look at it, it's more like a nascent reference library. It has lots of different kinds of volumes in it of specialized reference works like biographical dictionaries and different subject matter. So if it's a reference library, then how big is that if it were fully done? So I'm guessing a big reference library might be 50,000 volumes, not a million, not a thousand. There might be 10,000 entries in each, to pick an order of magnitude, not a hundred thousand, not a thousand. Okay, I'm going to do some multiplication. 500 million articles, that's how big that would be, it's just straight multiplication. That's a lot. If there are a little less than a million right now, then 0.2 percent of the thing is done, and that's just of the reference library, not of other kinds of stuff. So the point is if it were filled out, on this account, it would be pretty big, a lot bigger than it is now. Fortunately a lot smaller than the web, and you could actually index the whole thing nicely using existing technology and it isn't going to break anything.
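
The multiplication can be checked directly. The figures are the guesses from the talk, with the current article count rounded up to one million; the variable names are only for illustration.

```python
# Order-of-magnitude guesses from the talk, not measured values.
volumes = 50_000              # volumes in a big reference library
entries_per_volume = 10_000   # entries per volume
current_articles = 1_000_000  # "a little less than a million", rounded up

full_size = volumes * entries_per_volume
fraction_done = current_articles / full_size

print(f"{full_size:,} articles when fully built out")
print(f"{fraction_done:.1%} done so far")
```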



So is it likely to get that big? That's really complicated. I believe that one of the drivers I've called the eBay effect here, but it's just increasing returns. As it becomes better known for being better known, more and more people go to it, they have less incentive to go anywhere else. This is why eBay got all the auction action and the general market and killed Yahoo and everybody else. Well, it's the same thing. If people just go to the Wikipedia, because that's what they know, if you're contributing, where are you going to contribute? Are you going to go to the Wikipedia or the something else? No, you're going to go to the Wikipedia. That doesn't dominate everything, but I think it is kind of this winner take all sort of thing that plays out. We'll talk about some challenges to it, but in my mind, I don't think that it's impossible.



The Wikipedia, assuming it doesn't fall over, could be 500 million or a billion articles. It's certainly on a path to going. Another perspective. The Wikipedia is fundamentally not about technology. Technologists find this very upsetting; they don't want to hear this. They assume otherwise. It's another one of these deep-seated beliefs that somehow, the magic of Wikipedia -- I mean, there's a bunch of technology if you're doing 40 million page views a month, but is this the sort of cutting-edge, innovative, is it the secret sauce? This is what I found out when I went to Hamburg. There's little technology leverage that's there compared to the potential of what your average geek like me can imagine them doing. Most things are actually done by the electronic equivalent of passing around little slips of paper, which I find just completely troubling, but again, this is my bias. Hey, it works! So then I'm forced to go, "Well, wait a minute, maybe I think that new technology is more important than it is." You can't deny the fact that it works. So the example is this "votes for deletion" page. They've changed the name recently to "articles for deletion". There's a process in the community by which people can nominate new articles to be considered for deletion, being expunged. Why? Well, for one thing, there's a kind of a notability criterion. It says, "To be in the Wikipedia, the fact or the information should be notable." This is why not every single high school in the US should have its own entry; at least that's what the argument is about. But also, there are other reasons. There are 20 different reasons -- you can look them up, there's a page there -- for why things can be deleted. And what they actually do is they have a Wiki-based discussion. It's not a bulletin board system, there's no forum, there's no "reply to". The participating is you hit the edit button, and you put your two cents in at the bottom of the page. 
And then after a certain period of time, you "vote", which you go, you know, remove it or keep it. It's really a discussion. There's certainly no notification, there's no voting, and you don't press a button. It's like doing old school editing of Wiki pages in mark up. That's what I mean by not a lot of technology leverage.



Because you look at it and go, "There's 50 ways this could be better." How does the community think about it? Well -- and I hope this is a fair characterization -- they go, "If we automated this, we'd be losing something valuable." And I gave earlier the example that the last person weighing in on the discussion was to remove something, if that's a high reputation person, that can turn the whole tide. And I actually want to be counting votes; you want a sense of it. And I go, "Yeah, that's fine." But it's much harder than it ought to be, and it could be a lot easier and people could be more productive if the tools were better. But the value currently is that community trumps technology in the Wikipedia, and I think that's kind of an issue because there could be some problems. So this is the next time I see Jim. There's a little subtlety here, hang on.



Wikipedians are not technophiles, they're techno-skeptics. They're unwilling, I think for good reasons, to want to put community at risk for the sake of technology, and they're therefore willing to be very selective about technology upgrades. And that's what I mean, like the Amish. The Amish don't have phones in their houses, they don't think that's consistent with -- and I don't know which Amish we're talking about, forgive my ignorance, but some Amish, and I'll tell you how I know about this in a minute. But they have them out in the fields. They have cell phones out there. They made a decision through a community process that their lives would be enhanced when there's emergencies and problems and issues. And they actually have cell phones out in the field; they're still agricultural, not in the houses.



Kevin Kelly, the author of "Out of Control" and many other books is researching or writing a book, I think, called "What Technology Wants". And he actually went out and visited the Amish, and he's written about this, it's online. So the Wikipedians are actually a bit like the Amish in that they're being selective about it.



But I think it's not unfair to hold the point of view that, like the Amish, they may be missing out on some opportunities by being pretty conservative about stuff. So let me quickly talk about a challenge that involves that. It's all text in there. There's no XML, it's not structured, you can't get at it, there's no web services interface. There's a ton of quantitative data, but you can't extract it. The Wikipedia is not a database. It's not anything like a database, currently. And it's a significant limitation because it is disconnected from the entire world of the emerging semantic web and meta information and micro formats and web services, and anybody who is in that world can take five minutes looking at Wikipedia and say, "Wouldn't it be great if it could do A, B, C, X, Y and Z also?" And it can't. And I wouldn't say that there's no forward motion towards that, but there's very, very little. And that's a huge challenge.



Fortunately, there's also nothing preventing somebody from adding that, it's open, the content is all free. You can get a four gig dump of all the Wikipedia text and more if you want all the images, and put it on your machine. You could actually make that available to the public. And you could start hacking on it. You could develop a demonstration of what the Wikipedia could do if it also dealt with structured information. If there was a compelling demonstration, I think you could take it back to the community and convince them based on that that it's worth adopting. We can talk more about that, but time is short.



The one intriguing thing is if the Wikipedia was enabled in this fashion, we could -- in an almost perverse, bizarre and ironic way -- get to the big semantic web in which all of the information of the world is marked up. Usually when people talk about that, what they mean is they want people to produce information, whether it's a blog or a bibliography page, so anything that's both human-readable and machine-readable to begin with. There are whole courses here that go into that, multiple courses. And you'll probably get a degree in it, so I cannot possibly do justice to it.



The one thing I realized was if it was as easy to add semantic information or meta data to a page in the Wikipedia as it is to edit it, the Wikipedia community would mark the whole thing up in no time. It would be human-powered. You would never have to have all the machines do it. Over a period of time, every timeline, every fact, it would all get marked up by people because that's what Wikipedians do. And the value of it would be enhanced. So it's waiting for somebody to take on this challenge.



Eco-system collapse. I'm not talking about global climate change. If the Wikipedia becomes big and globally important, I mean look what's happening to Google, look at the search engine optimization business, look at the economics being invested in fooling Google to the benefit or advantage of somebody with a website. If the Wikipedia becomes the essential source for global information, can any organization, any faction, any person afford not for it to say what they want it to say about them? No. If that happens, then there will be massive organized attacks on it. It is kind of like -- usually, your immune system does a pretty good job at detecting self from not self and marshalling forces when it recognizes alien invaders. The antibodies come and they attack whatever it is. But in this scenario where Wikipedia is real popular, you've got to ask if it's a kind of a race, could it be overwhelmed? Just in the face of everybody trying to game it. Or will the immune system get better? I don't know.



So I think that there are some areas of application of the Wikipedia where that would be pretty profound. I would like all of the papers assigned in all of the graduate courses at every department in every school that ever get assigned more than once to have a Wikipedia entry, which you could start and read and said, it's a summary of the paper. It says, "This is what this paper is trying to say. It's collaboratively edited by the world's graduate students." It's a much longer discussion about why that would be a good thing. I hope -- intuitively I saw some body language, people were, "Oh, that would be pretty good." I should say there are also very obvious reasons why it's not in the interest of lots of people in the academy for that to happen. But there's nothing stopping people from doing that, and I'm hoping it starts small. It would really change things a lot.



Nonprofits. I mentioned this before. I had a conversation with the head of a big environmental organization in Washington, DC. They've got a ton of stuff on their website. And he was initially really enthusiastic about the idea of transferring their intellectual capital into the Wikipedia in an appropriate kind of way, and then very little happened. So I did a follow up. And you begin to see that the Wikipedia community is one thing where you select into it if you feel like you fit and want to do something, and an existing organization out here with a budget and priorities, even if the mission is to disseminate knowledge and information, it's not so easy. And this guy said to me, "Yeah, some of our twenty-somethings, the young staffers got all excited about this because they're really into the Wikipedia." But most people, they don't decide what to do with their own time in this organization, and so, what, they're going to do it in their spare time? And management up here, they don't really get it, they're going, "Why do you trust this information here?" And there's some real stickiness in organizational adoption. I've tested the idea a dozen times with people to try to see the pattern. The issue is going to be one in which people will recognize the value generally, but will they take what they know and put it in a useful form? Will they become participants or not? I think it's going to be a slow process.


But I will tell you a story about a conversation I had with David Kessler who is the dean of the medical school at UCSF and used to be the commissioner of the FDA with the tobacco suits. I met him. And we actually grew up in the same small town on Long Island, that's a strange coincidence. He's roughly my contemporary, and he'd never heard of the Wikipedia, but this was a year ago. I was explaining it, and he said, "You know, let me tell you about a conversation. I was just in" -- it was some part of Asia in the developing world -- "at a conference, and I was having a conversation with a very senior medical practitioner, and the conversation was, 'What is it that you most want from the developed world in terms of help with global health issues?'" And Kessler said, "So I thought it would be something involving money, you know, equipment or 'We need surgeons' or 'we need training' and they need all those things. And what he said was, 'Knowledge. If we knew what you know just now and we had that available to us, we could save thousands of lives a year. We just don't have the knowledge. We have Internet connections.'" And I said to Kessler, "So here is a great example." And we had a further conversation. So it turns out that if you're a medical resident, if you graduate there, and you do your internship in the hospital, there is an online service, it's published by Harvard, and it's the thing that you look at all the time when you're first actually treating patients clinically. It's a kind of a database organized by what it is that people have that has all of the latest stuff. Harvard charges a lot of money for this, they make a nice business. And I said, "So why don't you start the Wikipedia equivalent of this?" It's just information. People know what it is. The Wikipedia proves that you could get contributions from health practitioners on this, and keep it current, and then you could make it available for free, and you could be saving thousands of lives right now. 
And he said, "You know, you're absolutely right." That's a project that's waiting to be done.



So Wikipedia is not just about looking up the interesting stuff. If its potential can be manifested, and if we can carry it forward through our organizations, Wikipedia itself, a derivative, all open for debate, it could make a huge impact because information is what matters and counts. It's going to matter more and more, and it's just a wonderful, fabulous cultural invention. That's why I keep wrestling with it. Thank you.



Transcription by CastingWords