To huge fanfare, Google this week announced Knol. We really know little about it, other than what we might glean from a single blog post by Udi Manber, Google’s Vice President of Engineering, which hasn’t stopped breathless coverage in mainstream media and the blogosphere alike. My initial response was tepid at best, and the more thought I give to it, the less I like it. Here I’ll try to outline why it not only fails to fill me with enthusiasm, but gives me cause for concern.
Let’s work our way through the blog post announcing and describing Knol (since it’s about all we really know about it). It outlines Google’s motivation for the as yet to be released system, and then provides some details on what the system will do.
Manber starts by making some observations about two significant problems with the web.
[on the web] not everything is written nor is everything well organized to make it easily discoverable
which is certainly true, and something that many folks have been looking to address on both the authoring and indexing sides for no short time. The lack of rich semantics in HTML, and the generally poor use of such semantics as there are, make gleaning the meaning of web pages a very non-trivial problem to solve.
There are a number of possible solutions to this problem.
Natural language processing (in essence a kind of artificial intelligence) would enable much richer indexing of web page content, by essentially “understanding” that content. NLP is hard, and arguably not even possible, and certainly not a solution for today. If anyone has the resources, and the capacity to find the right folks to develop NLP for web pages, then it is surely Google.
The standardization of semantics for the web, and HTML in particular, is another potential solution. Microformats demonstrate that this is feasible, and practical at least on a certain scale. Will it scale up to the level of the web? Microformats depend on web authors or tools marking up content in a specific way, and so require education, and the adoption of new or upgraded tools to widen their use. They are also already used in roughly a hundred million pages.
The Semantic Web project is an ambitious attempt to bring much richer semantics to the web but, unlike microformats, at the expense of today’s web. Whereas microformats aim for a high level of compatibility with today’s web standards, technologies, tools and skills, the Semantic Web project requires us to largely abandon all that.
Google, with Knol, has chosen yet another approach: like the Semantic Web, to start all over again, but also to have much of the web in one format, developed using their tools, and hosted on their servers. While none of that is explicitly stated, a close reading of the blog post supports all those assertions, and they are the core of my concerns with Knol.
The second key problem Manber outlines in discussing the motivation for Knol is that
[t]here are millions of people who possess useful knowledge that they would love to share, and there are billions of people who can benefit from it. We believe that many do not share that knowledge today simply because it is not easy enough to do that
While true, it overlooks the fact that millions of people do share that knowledge today. They do it on Wikipedia, to be sure, but they do it on blogs, websites, forums, they do it all over the web. And while it is doubtless true that millions possess useful knowledge that they currently don’t share, because it is not easy enough to do, there are many potential ways in which to make that easier, while also harnessing the value of the content that already exists (AKA The Web).
So, let’s summarize the two problems Manber identifies as significant ones for Google to solve.
- Stuff on the web is hard to find
- Stuff on the web is hard to produce
Both, in and of themselves, pretty fair observations. But where do these observations lead us?
According to Google, Knol.
Knol is, from what little information we really have about it, a database system where experts contribute articles in their areas of expertise. Coupled with this is a set of community tools for rating and reviewing these articles. Think of it as a kind of Digg for encyclopedia articles. Reasonably cool, but if announced by anyone other than Google, I doubt I would have heard about it, let alone my mum.
In fact, if we take the concept apart piece by piece we see that while superficially like Wikipedia (which it is widely touted as a “killer” of) and Digg in parts, it in fact embodies little of the revolutionary natures of either of them. In this case the whole is less than the sum of its parts.
How so? Let’s start with Wikipedia, with which Knol is almost invariably bracketed by commentators (though any mention of it is curiously absent from the announcement, as is the word “encyclopedia”). Its revolutionary aspect is the nature and granularity of contribution - anyone can contribute to any article, down to the level of a single character. This might be as superficial as tidying up a stray comma, or as complex as an entire article on string theory.
Knol’s granularity is the article, by an author. Wikipedia articles arrive at some kind of dynamic consensus between potentially many different participating contributors, who may indeed have axes to grind, differing ideological perspectives and so on. A Knol, by contrast, is simply an article by an author.
Google’s argument in favour of this approach is as follows.
The key idea behind the knol project is to highlight authors. Books have authors’ names right on the cover, news articles have bylines, scientific articles always have authors — but somehow the web evolved without a strong standard to keep authors names highlighted. We believe that knowing who wrote what will significantly help users make better use of web content
In a sense we can read this as “Knol will do the opposite of what the web does”. If the web somehow “evolved without a strong standard to keep authors names highlighted” then hey, maybe the web knows something we don’t. And after all, Google has become a near half trillion dollar company by listening to what the web knows - PageRank anyone?
But it is also a somewhat disingenuous argument. After all, most blog posts, web pages and sites, and so on, do have authorship associated with them. The rise of the individual, well known, “blogger” is evidence of this. The one place where the observation is pointedly true is in fact Wikipedia - whose name, as we have observed, is not mentioned in the announcement of Knol.
In a sense, Knol versus Wikipedia is an excellent experiment - will “knowing who wrote what … significantly help users make better use of web content”? Will Knol become more useful than Wikipedia?
What’s also interesting in this sentence is that all of a sudden we are presented with another problem that Knol is trying to solve, in addition to the two we outlined earlier. Knol is trying to make web content more useful to users, by attaching a name to the creator of the content, by ranking reliability as a function of the author.
But none of this is really my concern with Knol - after all, testing the competing strategies of content development - a wiki approach as opposed to a community rated and discussed yet otherwise unedited approach with atomic authors - is an interesting experiment. So what exactly is my concern? I’ll get to that after addressing the issue of how Knol leaves out the revolutionary aspects of Digg, just as it does those of Wikipedia.
Fundamentally, Digg allows its users to rate, review and recommend web content from anywhere (I’m using “Digg” here as a shorthand for the many similar ranking type sites that have emerged over the last few years, it being the best known generally).
What will Knol’s “strong community tools” let users “rate” or “review”? (These are not shudder quotes, rather direct quotations from the Google post.)
They’ll let you rate stuff in Knol. And that’s it.
The revolutionary aspect of Digg-type systems, that they help tame the complexity of the web and “help users make better use of web content”, is essentially abandoned. Knol helps users make better use of Knol content.
But, asserts Manber, “[a]t the heart, a knol is just a web page”. Respectfully, no it’s not. It’s both more and less than just a web page. By privileging a certain kind of web content - Knol content - it diminishes the value of the “useful knowledge” which many many folks currently already do share. Google is in effect saying “contribute it again, within our system, or be ranked lower than those who contribute Knols”. While this is not explicitly stated, it’s a clear and fair implication from this announcement - otherwise, in essence, Knol is not required to “make [information] easily discoverable”.
With Knol, some web pages are more equal than others.
But is this that big a deal? Why not contribute? After all, Google assures us they “do not want to build a walled garden of content”, and Google, as we know, isn’t evil. Not for a moment am I attributing evil to Google’s intent, but the consequences of Knol for the web are far from good. It’s like Facebook for everything. While other search engines will be able to index the content, we all know it’s Google that matters. And Google has told us in effect that you’ll get better ranking by contributing to Knol than otherwise. Which will create a de facto walled garden.
But is this really really a problem? Consider an article which discusses the present Chinese government, in ways which might be considered critical. Manber assures us “Google will not serve as an editor in any way, and will not bless any content”. But will they censor content? Or, will they allow authors to post content under pseudonyms, or anonymously, should an author wish to not be publicly associated with their article? Given the prominence given to “authors” in the Knol system, one wonders.
So, clearly, while not for a moment doubting Google’s motivations or intentions, the proposed solution to the outlined challenges of making stuff easier to find on the web, and of making it easier to contribute your knowledge to the web, has significant downsides.
Manber tells us “we want to disseminate it as widely as possible” - which I am more than willing to take at face value. But that’s the heart of the problem of Knol as it stands. It will create de facto walled gardens, and will downplay existing knowledge, in essence diminishing the web as it is today. So how best to go about solving the stated problems?
I’d suggest that you can keep a whole lot of Knol, indeed essentially all of it, and make it much better. Here’s how.
Publish a standard set of semantics, like microformats, that can be used in any old HTML for what constitutes a “Knol”. I’d suggest reusing as many standard microformats and associated design patterns as possible in doing so, to leverage the microformatted content that already exists. Then let anyone publish “Knols” - whether on a Blogger, WordPress, or whatever other blogging system, on their static web site, or using specialized Knol publishing applications, one of which Google would provide. This competition would surely give rise to all kinds of systems making it easier for anyone who wants to contribute and share their knowledge to do so. After all, as Manber says in this very post, “Competition … is a good thing”. Google is free to aggregate these Knols no matter where they are published, and the standardized semantics would make for content that is “well organized to make it easily discoverable”. I have no doubt that a great many folks would retrofit the Knol semantics into their existing content, giving Knol an instant huge boost to the amount of content in the system, with the payback benefit that the web gets better semantics more generally.
Above all, rather than potentially undermining the open web, it would reinforce its importance. The open web is the source of Google’s extraordinary rise to prominence, wealth and influence, and I am sure Google is extremely keen to maintain its health.
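To make the suggestion above a little more concrete, here is a minimal sketch of how an aggregator might pick “Knols” out of any old HTML page. Everything specific in it is an assumption for illustration only: the root class name `hknol` is invented here (no such microformat exists), the field classes `entry-title`, `entry-content` and `fn` are borrowed from the existing hAtom and hCard patterns, the sample page and author name are made up, and the parser assumes reasonably well-formed markup. It says nothing about how Google would actually do this.

```python
from html.parser import HTMLParser

# Hypothetical root class for a "Knol"; invented for this sketch.
ROOT_CLASS = "hknol"
# Field classes reused from hAtom/hCard, mapped to the record keys used below.
FIELD_CLASSES = {"entry-title": "title", "fn": "author", "entry-content": "content"}
# Elements that never get a closing tag, so they must not be tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class KnolExtractor(HTMLParser):
    """Collects title, author and content for each element marked with ROOT_CLASS."""

    def __init__(self):
        super().__init__()
        self.knols = []      # finished records, one dict per Knol found
        self._open = []      # role of each open element: "knol", a field name, or None
        self._record = None  # the Knol currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        role = None
        if self._record is None:
            # Outside a Knol: only the root class is of interest.
            if ROOT_CLASS in classes:
                self._record = {name: "" for name in FIELD_CLASSES.values()}
                role = "knol"
        else:
            # Inside a Knol: look for one of the known field classes.
            for cls in classes:
                if cls in FIELD_CLASSES:
                    role = FIELD_CLASSES[cls]
                    break
        self._open.append(role)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        if self._open.pop() == "knol":
            # Closing the root element: normalise whitespace and store the record.
            cleaned = {k: " ".join(v.split()) for k, v in self._record.items()}
            self.knols.append(cleaned)
            self._record = None

    def handle_data(self, data):
        if self._record is None:
            return
        # Text belongs to the innermost enclosing field element, if there is one.
        for role in reversed(self._open):
            if role not in (None, "knol"):
                self._record[role] += data
                break


def extract_knols(html):
    """Return a list of dicts, one per hypothetical Knol found in the page."""
    parser = KnolExtractor()
    parser.feed(html)
    parser.close()
    return parser.knols


# Made-up sample page: one post marked up as a Knol, one ordinary post.
SAMPLE = """
<html><body>
<div class="post hknol">
  <h2 class="entry-title">Microformats</h2>
  <p class="author vcard">By <span class="fn">Jane Example</span></p>
  <div class="entry-content"><p>Semantics carried by <em>plain</em> HTML.</p></div>
</div>
<div class="post"><h2 class="entry-title">Not a knol</h2></div>
</body></html>
"""

if __name__ == "__main__":
    for knol in extract_knols(SAMPLE):
        print(knol)
```

The point of the sketch is only that nothing here depends on where the page is hosted or which tool produced it: any page that carries the agreed class names can be aggregated, and any page that doesn’t is simply left alone.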


{ 3 } Comments
In one sense this situation is not significantly different to a Google-controlled Blogger (’just’ another CMS/database with a particular spin), and indeed the value is largely the same in each; producing more pages with Google Ads.
One view could suggest that Google sees much of their traffic directed to ad-free Wikipedia and wants in on the action. Perhaps that sounds a little uncharitable, but at the end of the day, advertising *is* their business.
While your microformats-based vision is more open, and compatible with a ‘healthy’ web, it wouldn’t guarantee Google ads, so I suspect they’d be less enthusiastic, and on the engineering side indexing is harder than hosting in the first place.
Their actions pull in opposite directions: ‘opening the closed’ with OpenSocial, while with Blogger, Orkut, and now Knol they are ‘closing the open’, or at least bringing it in-house.
I dunno, but the ‘e’ word isn’t seeming quite so laughable these days.
Of course a more pragmatic view might say that building your own database fixes the problem quicker (and there are problems with Wikipedia and its culture, usability etc.), and we’re just getting a whiff of their arrogance, but to stretch the hyperbole just that little further, we have to be careful it’s not a Faustian deal!
Hopefully this won’t gain too much traction, but it *is* something to be uneasy about.
…or I’m just being paranoid.
Yes, I agree that Google’s current approach adds nothing of interest to the web, and even lessens it (if a restaurant reviewer opened a restaurant, would we still expect his reviews of his neighbour to be unbiased?)
I *do* like your suggestion, and hope that Google is made aware of it to consider.
Adding to the HTML spec (something they are already familiar with) for how to ‘knolify’ a page so that they can more readily scrape its information.
Now that could be revolutionary, and in a good way!
Lea,
no need even to add to HTML - that’s the beauty of microformats - you can extend the semantics of HTML using the existing mechanisms of the language, with no need to alter it in any way.
john
{ 1 } Trackback
[…] The critical article on Microformatique.com, which also proposes an “emergent” alternative. […]