Welcome to Talis Platform News
Welcome to the latest issue of Talis Platform News. There has been quite a flurry of "Web 2.0 versus Semantic Web" commentary in recent weeks, with things further muddied by the various competing definitions of Web 3.0. Tim O'Reilly and John Battelle's latest Web 2.0 Summit kicks off in San Francisco next week, and with a Semantic Web session firmly established on the programme it will be interesting to talk with people on the ground there. So in this issue we try to shed some light on how and why 'Web 2.0' and the 'Semantic Web' are far from being in competition with one another.
Elsewhere in this issue, Talis CTO Ian Davis describes the benefits of hosting applications on a platform such as the Talis Platform, Danny Ayers provides an update on our Early Access Programme, we have an update on the Open Data Commons licensing activity, and Mills Davis of Project 10x provides some insight into his forthcoming Semantic Wave 2008 report.
I hope you find this issue of interest, and welcome suggestions for items you'd like to see covered in future issues. Don't forget to subscribe in order to receive an email reminder and highlights each month.
Paul Miller, Editor
Web 2.0 and the Semantic Web - a perfect match?
The latest Web 2.0 Summit kicks off in San Francisco next week, and Ian Davis and I will be making the trip for Talis once again.
In previous years, the event has attracted a lot of media buzz, and it has been the scene of not a few product launches and big announcements. Who knows what we'll see this year? It does appear, though, that the worst excesses of hype around 'Web 2.0' are receding as companies get down to building sustainable applications and businesses that just happen to benefit from aspects of the trends that O'Reilly and others have brought together under the Web 2.0 umbrella.
At a time when debate once again seems intent on polarising Web 2.0 and the Semantic Web as an 'either/or' proposition, it's refreshing to see a track devoted to pragmatic applications of semantic technologies at the conference. Participants include:
- Talis Platform Advisory Group member, podcast subject and Radar Networks CEO Nova Spivack;
- Danny Hillis of Metaweb (boss of Talis Platform Advisory Group member and podcast subject Jamie Taylor);
- Barney Pell of Powerset, Inc.
Our perspective at Talis is one that does not see Web 2.0 and the Semantic Web as in conflict. Instead, we see the best of both combining to deliver rich, stimulating and usable applications that harness the flows of explicit, implicit, intentional and attentional data to bring us ever closer to the Web of Intentions. Some of these ideas were explored in a presentation in Cambridge this week, and we'll be digging into them further over the next few months.
If any of you are in or around San Francisco next week and would like to explore these ideas further, please do get in touch.
Notes from Behind the Curtain
One of the advantages of working with a hosted platform is that your applications get better without you even having to lift a finger. Every little tweak, every improvement we make, benefits all the applications that rely on our platform services. Even while you're sleeping, your applications could be getting better.
For example, we have been working to improve our release process, the set of steps that we perform when we want to deliver new features to the public. So, in addition to our existing scaling lab, we introduced a new staging environment which closely mirrors our live platform infrastructure. Each month we call a halt to development and wrap up all the completed work into a new release candidate. This is handed over to Dave, who's in charge of our live services, and the developers get back to work on the next release. In the meantime, Dave deploys it to the staging environment for Matt, who's in charge of our performance testing, to subject it to all manner of cruel and unusual punishment. We call this our performance soak test. It runs for up to 24 hours and outputs pages upon pages of stats, charts and measurements of all the intimate details of the platform.
This month Matt turned up a nice shiny nugget: some significant performance improvements for the platform release. In terms of transactions per second, contentbox searching has improved by 25% and posting of changesets has improved by over 100% when compared to the previous release.
That means that every application using the platform can be 25% faster when searching and could double its update performance, without you needing to make any changes at all. That release went live a couple of weeks ago and instantly we saw the benefits in our own applications such as faster, snappier responses when saving records in Engage. That's the kind of news that really makes me grin :)
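As a back-of-envelope illustration of what those throughput figures mean for the time each operation takes, here is a minimal sketch. It assumes the same level of concurrency before and after the release; the function name is hypothetical and the inputs are simply the headline percentages quoted above, not raw soak-test data.

```python
def time_per_operation_change(throughput_gain):
    """Return the fractional change in time per operation.

    throughput_gain is fractional: 0.25 means 25% more transactions
    per second. Assumes concurrency is unchanged, so time per
    operation is inversely proportional to throughput.
    """
    return 1.0 / (1.0 + throughput_gain) - 1.0


# Headline figures from the release: +25% search, +100% changeset posting.
search_change = time_per_operation_change(0.25)
update_change = time_per_operation_change(1.00)

print(f"search: {search_change:+.0%} time per operation")
print(f"update: {update_change:+.0%} time per operation")
```

Under that assumption, 25% more searches per second works out to roughly 20% less time per search, and doubling changeset throughput halves the time per update.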
Mills Davis rides the Semantic Wave
This month Project10X is publishing a study on semantic technologies and their market impacts that we think should be required reading for investors and ICT companies, as well as the public. It's called “Semantic Wave 2008: Industry Roadmap to Web 3.0” (SW2008). The executive summary and prospectus are available to download.
Semantic Wave 2008 provides groundbreaking technology and market research. The report is 350+ pages and includes more than 250 figures and illustrations. It provides the first comprehensive industry study of web 3.0 and the semantic technology space.
The technology section of the report examines five strategic technology themes and shows how innovations in these areas are driving development of new categories of products, services, and solution capabilities. These themes include: executable knowledge, semantic user experience, semantic social computing, semantic applications, and semantic infrastructure. The study examines the role of semantic technologies in more than 100 application categories. An addendum to the report surveys more than 250 companies that are researching and developing semantic technology products and services.
The market section of the report examines the growth of supply and demand for products, services and solutions based on semantic technologies. Specifically, the report segments and discusses semantic wave markets from five perspectives: research and development, information and communication technology, consumer internet, enterprise horizontal, and industry verticals. Viewed as horizontal and vertical market sectors, each presents multi-billion dollar opportunities in the near- to mid-term. The study presents 150 case studies in 15 horizontal and vertical sectors that illustrate the scope of current market adoption.
In addition to the main report, there are two addenda: a supplier directory, and an annotated bibliography.
Semantic Wave 2008 tells the story of web 3.0
The semantic wave embraces four stages of internet growth. The first stage, Web 1.0, was about connecting information and getting on the net. Web 2.0 is about connecting people, putting the “I” in user interface and the “we” into a web of social participation. The next stage, Web 3.0, is starting now. It is about representing meanings, connecting knowledge, and putting them to work in ways that make our experience of the internet more relevant, useful, and enjoyable. Web 4.0 will come later. It is about connecting intelligences in a ubiquitous web where both people and things can reason and communicate together.

Open Data Commons update
I've written before about our interest in 'open data', and was pleased to bring that right up to date in recent issues of the newsletter when I was able to point to the new Open Data Commons license which we funded Jordan Hatcher and Charlotte Waelde to prepare.
As part of that drafting process, we - and they - were always keen to invite widespread consultation and engagement. This falls into two main areas, namely:
- explaining why 'open data' needs to be protected by a license at all, when surely the whole point is to make it as open and accessible as possible;
- assessing the legal rigour of the licenses as drafted.
On both counts, there has been a lot of discussion, and Jordan has been doing a great job of tracking conversations on numerous lists and blogs, and then abstracting common themes in order to address them more fully on his site in a series of posts.
This consultation process is drawing to a close, but there's still time to have your say on the license and what it's trying to do.
We're also hard at work on the next stage of the work: finding a good neutral home in which the finished license can grow and be nurtured over the long term. We'll certainly use the license, and support its upkeep, but we're probably not the best place to maintain it.
I note that Marc Canter will be leading a workshop on Open Data at next week's Web 2.0 Summit, and look forward to further discussion of the Open Data Commons licenses and their implications in that forum.
Talis Platform Early Access Programme Update
The fledgling Talis Platform "N-Squared" community can now be considered officially up and running in the form of the Early Access programme, with the 25 or so invitees having now received access credentials to their stores. But for reasons I'll come to shortly, my progress on support material such as tutorials and code snippets has been considerably slower than I'd have hoped.
As well as the stores, the resources available to members of the programme include a public mailing list (n2-dev) and Wiki, with space on a public-readable Subversion repository and hosting of demos being provided as requested. The IRC channel #talis (on irc.freenode.net) is a key communication channel of the Platform development team, and developers building on the Platform - the N-Squared group - are encouraged to join the chat.
Anyone interested in joining the Early Access programme, please drop me a line.
Encouraging Early Adopters
The aim of the Talis Platform is to make it considerably easier to develop and deploy innovative web-based systems, relieving the developer of much of their critical but often mundane infrastructure work. With sophisticated scalable storage, search and indexing built into the Platform, the developer is free to concentrate on their ideas. However, while site/service hosting of one form or another has been around since the dawn of the Web, and there's been a resurgence in creativity with "Web 2.0", there aren't any existing systems that combine these characteristics with the flexibility and data sophistication that Semantic Web technologies provide. The Talis Platform is breaking new ground in this area, and will be reliant on early adopters to recognise the benefits of this approach.
At present, there seem to be two prominent groups of candidates for early adoption (not including Talis' existing customers). The first are people with data problems that are difficult to solve using traditional approaches - for example, right now a lot of people in the science community need exactly the kind of facilities the Platform can offer. The second group are people already developing innovative applications on the Web. So the purpose of the early access programme is to reach out to these groups, and to foster a community of developers building on the Platform.
Outreach and Support
To demonstrate the benefits of the Platform to developers, it'll be desirable to have small yet reasonably realistic applications to show, alongside tutorial and reference material. The Platform is based on open standards, but clearly it's necessary to explain how the Platform uses those standards. Long-term, the evolution and maintenance of the supporting material the developer needs to get started with the Platform will hopefully be a natural by-product of the experience of early adopters. But some priming of the system is needed, and that brings me back to why my progress on producing support material such as tutorials and code snippets has been slow.
Too Much Fun
Here's the rub. I'm a Web enthusiast and developer myself, and have spent a lot of time in the past few years exploring the kind of things Semantic Web technologies enable. But in practice, a fairly significant proportion of this time has been spent setting up the basic infrastructure, while flitting from one storage option to another. Then there is the time spent getting to know the specific programming APIs of the particular system. When it comes to putting time in on maintenance of things I've put together, frankly my record is abysmal.
To be able to produce support material for the Talis Platform, clearly I have to get to know it a little. But the main (HTTP) interfaces I've been working against have been the same, no matter what the choice of programming language or libraries locally. This I've found not only tremendously liberating, but hugely distracting from my task of preparing support material. Let me give a couple of examples.
Working remotely, I need to keep track of my time, so I put together a little logging tool I could report into: a simple Java desktop client which posts data to my store. This is time-oriented material, so I couldn't resist hooking the contents of the store up to Timegrid, a new viewer from the SIMILE folks at MIT. For this I used a little SPARQL and PHP to produce the JSON data it required. While the internals of what I needed here were a little fiddly in parts, the core of the system was already in place - my store on the Talis Platform. (There's more on this in a blog post, and here's my live Timegrid.) I got to a point of reasonable satisfaction with that, but I'd already been playing with extracting the history of visited sites from my browser in a form which would make the data reusable, and soon I plan to hook this up to the Platform store as well, so it'll be possible to see which sites I was visiting alongside the activity I'd logged. In my defence, given my official role at Talis, this has led me into communication with the Attention Profiling Markup Language group, to whom I'm sure I can demonstrate the utility of the Platform in this context.
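To give a flavour of the SPARQL-to-JSON step, here is a minimal sketch in Python rather than the PHP actually used. It takes a result set in the standard SPARQL JSON results layout and reshapes it into a flat list of events. The variable names (`label`, `start`, `end`), the sample data and the `{"events": [...]}` output shape are all assumptions for illustration; they are not the real query, nor necessarily the exact format Timegrid expects.

```python
import json


def bindings_to_events(sparql_results):
    """Convert SPARQL JSON results into a list of simple event dicts."""
    events = []
    for row in sparql_results["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...}.
        events.append({
            "label": row["label"]["value"],
            "start": row["start"]["value"],
            "end": row["end"]["value"],
        })
    return events


# Hypothetical result set, as a store's SPARQL endpoint might return it.
sample = {
    "head": {"vars": ["label", "start", "end"]},
    "results": {"bindings": [
        {
            "label": {"type": "literal", "value": "Tutorial drafting"},
            "start": {"type": "literal", "value": "2007-10-10T09:00:00Z"},
            "end": {"type": "literal", "value": "2007-10-10T11:30:00Z"},
        }
    ]},
}

# Assumed output shape: a single "events" array for the viewer to load.
events_json = json.dumps({"events": bindings_to_events(sample)}, indent=2)
print(events_json)
```

Because the store is reached over plain HTTP and returns a standard result format, the same reshaping could be written in whichever language happens to be to hand.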
In a similar fashion, curiosity on how some of the data used by the Metalinker system for file description could be exposed on the Semantic Web using GRDDL led me into discussion with one of the developers over there - Ant Bryan, who's now been inducted into the N-Squared community.
A slightly less defensible bit of procrastination on the tutorial material came last week, when I was initially trying to come up with a demo of the Contentbox parts of the Platform stores. Simple photo sharing seemed suitable, but while looking for related resources I stumbled across the software for a motion detector to use with a webcam. A couple of hours later I had a little script hooked up, ready to post photos of a sleeping cat to the store whenever she woke up. One day I might hook this up to the Timegrid view as well (and/or Twitter), but that had really better wait until after I've finished the tutorial material.
To receive notification of new issues of the Talis Platform News please remember to SUBSCRIBE.

