Welcome to Talis Platform News
Hello, and welcome to this month's issue in which we report on additions to the team and some of the things that we and others have been getting up to in and around the Talis Platform. In their regular slots, Ian Davis looks Behind the Curtain and Danny Ayers updates us on the N2 Developer Community. Tom Heath talks about Linked Data, and Keith Alexander illustrates the ease with which one developer was able to tackle an issue of interest to him, thanks to the Talis Platform. Finally we offer an opportunity for you to attend one of the big events of the Spring at a discount, in what will hopefully be the first of many similar tie-ups with the events that are driving growth of this fledgling community.
I hope you find this issue of interest, and welcome suggestions for items you'd like to see covered in future issues.
For those too impatient to wait a month for their Semantic Web news, don't forget to check out Danny Ayers' This Week's Semantic Web post each week on Nodalities.
Don't forget to subscribe in order to receive an email reminder and highlights each month, and feel free to join us in IRC conversation over on the #talis channel hosted on freenode.net.
Paul Miller
Semantic Spring?
As Ian Davis mentioned in a blog post earlier this month, it really is going to be (with apologies to those from the southern hemisphere!) a Semantic Spring. One of the plethora of events at which we'll be presenting over the next few months is the Semantic Technology conference in San Jose, California. Talis is delighted to announce that we are supporting the event this year. More news will follow on exactly what this entails, but for now we're able to offer all readers of Talis Platform News a discount on registration at the event. Simply go ahead and register, using the special registration code of 'ST8TLS' to qualify for $200 off the full event, or $100 off the price of the tutorials.
A number of us will be travelling to San Jose for the event, so please do get in touch if you'd like to meet up there.
Tom Heath joins Talis
The Talis Platform team continues to grow, and this month we're delighted to announce that the latest addition is Tom Heath.
Tom has just completed his Ph.D thesis at the Open University's Knowledge Media Institute (KMi), and will be well known to the Semantic Web community as an active member of the Linked Data community, winner of last year's Semantic Web Challenge and the brains behind Revyu.com.
Tom's Ph.D research was focused on using the Semantic Web to support recommendation-seeking in social networks, and his understanding of this space will be put to good use in and around Talis.
Tom took part in one of our Talking with Talis podcasts last year, where he discussed his research and shared thoughts on the direction in which the Semantic Web is moving.
Notes from Behind the Curtain: Out of the Labs
The Semantic Web has been the subject of intense academic research since the late nineties. Some of the output of that effort has been standardised by the W3C to become things like RDF, OWL and SPARQL. Even those standardised technologies are the subject of further research into their structures, complexities and possible shortcomings. However, the difference between what is possible in theory and what is feasible in practice is often huge. This is especially true when dealing with very large amounts of data.
As we build out the Talis Platform we're very aware of this and we take care in what we select for inclusion. We study the research and pay attention to the discussions in the community. Then we pick the technologies that are stable and are well understood and supported by developers and existing software. I'm personally very involved in this process, but our entire development team are constantly researching and learning about the best developments in this space.
Wherever possible we adopt existing standards, such as using RSS for our search results. We find this makes our Platform services instantly usable by many existing software tools. Sometimes, though, we find that there is no accepted way to achieve something that we need, or the usual ways are not sufficient. One example is where we needed a way to update RDF held in a store. We looked at the existing protocols and standards and found that none were suitable for the simple update mechanism we wanted to provide to our users. In this case we invented our own Changeset protocol.
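To make the idea of a simple update mechanism concrete, here is a minimal conceptual sketch in Python. It is not the Changeset protocol's wire format or the Platform's implementation: the `Triple` and `ChangeSet` types and the `apply_changeset` function are hypothetical names invented for illustration, and the rule that an update is rejected when a triple to be removed is absent from the store is an assumption made for the example.

```python
from typing import FrozenSet, NamedTuple


class Triple(NamedTuple):
    """One RDF statement, with every term held as a plain string."""
    subject: str
    predicate: str
    obj: str


class ChangeSet(NamedTuple):
    """Hypothetical in-memory changeset: triples to remove and triples to add."""
    removals: FrozenSet[Triple]
    additions: FrozenSet[Triple]


def apply_changeset(store: FrozenSet[Triple], change: ChangeSet) -> FrozenSet[Triple]:
    """Return a new store with the removals taken out and the additions put in.

    Assumption for this sketch: if any triple listed for removal is not in
    the store, the whole update is rejected and the store is left unchanged.
    """
    missing = change.removals - store
    if missing:
        raise ValueError(f"cannot remove {len(missing)} triple(s) not present in the store")
    return (store - change.removals) | change.additions
```

Describing an update as a pair of removals and additions keeps each change small and self-describing, which is the property the paragraph above is after.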
In other cases we might decide that a particular technology is not mature enough to be included in the Platform. An example of this would be our decision to hold off from implementing support for OWL. While we believe OWL can be useful, it's not clear that it is feasible to operate over even moderate quantities of data at this time. Perhaps it will become feasible via some great breakthrough, so we'll wait and see what research is published in this field and possibly contribute some of our own.
The net result is that the Platform supports the best and most useful technologies drawn from a wide range of possibilities. We're thankful that there's a vibrant research community around the Semantic Web and we hope that increased commercial usage driven by companies like Talis will enable this research to continue at an even greater pace.
Why not join us in #talis, and discuss these and other ideas with those responsible for the Talis Platform?
Semantic Web Podcast news
We've continued to release episodes in our Semantic Web podcast series, with Mathieu d'Aquin, Andreas Harth and Inigo Surguy all published since January's newsletter.
Entirely unintentionally, the majority of our recent podcasts have been substantially concerned with the problem of search on the Semantic Web. Along with Mathieu and Andreas, Eyal Oren also discussed this. Through these conversations, we learn about pingthesemanticweb.com, Sindice, Swoogle, SWSE, Watson, and more. Each fulfils a slightly different purpose, and each is driving a body of research. But for the Semantic Web to go mainstream, do we need a Semantic Web search juggernaut that clearly lifts itself far above the competition, as Google did? Is it one of these, or is there space for a fresh and overwhelming new entrant?
We've got some great conversations to release over the next few weeks, and as usual they'll be blogged on Nodalities as they are published, and then added to the catalogue.
Technical Podcasts
To complement Paul's generally high-level Talking with Talis podcasts, Danny Ayers has commenced recording more technically-oriented conversations with leading experts on Web technologies. The first of these is with Benjamin Nowack, whose pioneering approach is to make Semantic Web capabilities available to Web developers who may only be familiar with more traditional tools. He recently released a completely revised version of his PHP toolkit for RDF, ARC.
Danny's latest column for IEEE Internet Computing (titled "Graph Farming") is now available from the Talis Platform publications page.
We're always looking for new podcast topics, so if you have a topic to discuss or someone you'd really like to hear please do get in touch.
'No more toy examples' - the Linking Open Data project, one year on
Which events do you see as landmarks in the development of the Semantic Web? The invention of the Web itself; the publication of key recommendations such as RDF, OWL, and more recently SPARQL; the stimulating, if not uncontroversial, Scientific American article in 2001; or the creation of companies explicitly aligned with the Semantic Web vision?
There is another event that I believe will sit alongside these when we look back on how the Semantic Web came into being: the creation of the 'Linking Open Data project'. In February 2007, Chris Bizer and Richard Cyganiak made an open invitation to members of the Web community: to join them in identifying sets of open data available on the Web, re-publishing these in RDF and interlinking them with others to create a Web of 'Linked Data'. The emphasis was not on talking about the Semantic Web, but actually building it. If the project had a motto it would be "no more toy examples".
The stimulus for the project came from the W3C SWEO Working Group’s call for Community Projects - unfunded (but morally supported) efforts driven by enthusiasts keen to demonstrate the potential of a Semantic Web. Proposals were solicited from the community, and after a deadline had passed the SWEO Working Group would select a number of projects to be officially endorsed. Even before this deadline was reached the Linking Open Data project had a momentum all of its own. Numerous enthusiasts from industry, academia and beyond had registered their commitment to taking part, a newly formed mailing list was buzzing with activity, and there were already practical outputs to report.
Key nodes in this emerging 'Web of Data', and lynchpins of the project from day one, were initiatives such as DBpedia and Geonames. DBpedia extracts RDF triples from the 'Infoboxes' commonly seen on the right hand side of Wikipedia entries, and makes these available on the Web in RDF to be crawled or queried with SPARQL. Geonames in turn provides RDF descriptions of millions of geographical locations worldwide. Many of the things in the world to which we want to refer can be classed as people, places or things. By providing URIs (and RDF descriptions) for many of these, DBpedia and Geonames serve as hubs to which other data sets can be connected. It is these kinds of connections that put the 'link' in Linking Open Data, and the 'Web' in Semantic Web.
One year on, the 'Linked Data' meme has moved from niche interest into the mainstream Semantic Web community, and the 'Web of Data' label has helped clarify Semantic Web ideas for a wider audience. The Linking Open Data project now counts nearly 30 data sets among its membership, covering domains as diverse as census information, photos, music, books, companies, reviews and human languages. Together these data sets comprise over 2 billion RDF triples interconnected by around 3 million RDF links.
These are impressive numbers, but there's always room for more. Perhaps you have a data set, large or small, that could be interlinked with others on the Web. The Linking Open Data project always welcomes new members, so how can you get involved? Firstly, have a look at the project home page, and join the mailing list. Secondly, think about how you might publish and interlink your data; there are many different approaches to doing this, including wrappers for relational databases, dedicated server products, and hosted solutions such as the Talis Platform. Lastly, Linking Open Data is all about a group of people with a shared goal, and we'd be delighted if you joined this community.
So, please sign up to the mailing list and say hello, or better still join us in person in April at WWW2008 for 'Linked Data on the Web 2008'.
N2 community update
The recently deployed SPARQL Query Demo (running on live data on the Platform) and Twitter review proof-of-concept have attracted several more Web enthusiasts to the N2 Platform developer community.
The generic nature of the Platform is reflected in the membership of the n2-dev e-mail group, which includes not only existing Semantic Web developers, but also individuals from the sciences, mainstream media and (of course) Web 2.0 developers wishing to expand their capabilities.
You can find more information about the Platform developer community on the N2 Wiki, and if you want to join the fun, simply email me with your details.
The Universality of SPARQL, SIOC and Widgets
Keith Alexander discusses the way in which the Talis Platform 'does the heavy lifting' in order to let one developer simply get on with addressing the problem at hand...
Earlier this month (as you might have seen on Danny’s blog) Alexandre Passant wrote a SIOC Widget which will display a list of blog titles retrieved from a SPARQL endpoint. He needed an endpoint to a store with SIOC data in it, and I pointed him to the Talisians store’s endpoint (which contains the aggregated blog posts by Talis people). His widget, however, wanted to retrieve JSON from the endpoint, and the Platform endpoints currently only provide XML output (though JSON support is coming soon). Luckily it was easy to extend our experimental RDF/SPARQL converter service into a proxy for platform stores to enable SPARQL/JSON and RDF/JSON output, so that he could test his widget against the Talis Platform.
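For readers curious what such a conversion involves, the sketch below turns a document in the W3C SPARQL Query Results XML Format into the corresponding JSON results structure. It is an illustration only, not the code behind our converter service: the function name `sparql_xml_to_json` and the sample data are invented for the example, and it emits typed literals in the later SPARQL 1.1 JSON style (type "literal" plus a "datatype" key).

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format, in ElementTree notation.
SPARQL_NS = "{http://www.w3.org/2005/sparql-results#}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def sparql_xml_to_json(xml_text: str) -> dict:
    """Convert SPARQL SELECT results from XML into the JSON results structure."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name")
                 for v in root.findall(f"{SPARQL_NS}head/{SPARQL_NS}variable")]
    bindings = []
    for result in root.findall(f"{SPARQL_NS}results/{SPARQL_NS}result"):
        row = {}
        for binding in result.findall(f"{SPARQL_NS}binding"):
            term = binding[0]                  # the <uri>, <literal> or <bnode> child
            kind = term.tag[len(SPARQL_NS):]   # strip the namespace prefix
            entry = {"type": kind, "value": term.text or ""}
            if kind == "literal":
                if term.get(XML_LANG):
                    entry["xml:lang"] = term.get(XML_LANG)
                if term.get("datatype"):
                    entry["datatype"] = term.get("datatype")
            row[binding.get("name")] = entry
        bindings.append(row)
    return {"head": {"vars": variables}, "results": {"bindings": bindings}}


# Invented sample input: one blog post with a title.
SAMPLE_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="post"/><variable name="title"/></head>
  <results>
    <result>
      <binding name="post"><uri>http://example.org/blog/1</uri></binding>
      <binding name="title"><literal xml:lang="en">Hello</literal></binding>
    </result>
  </results>
</sparql>"""
```

Passing the returned dictionary to `json.dumps` gives output that a JavaScript widget can consume directly.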
What is exciting about this is not so much the output of the widget itself (a list of blog titles and links), but the stack of technology standards that Alexandre so easily plugged together to produce it. He used the NetVibes Universal Widget API (which claims to allow widget authors to “write once, run anywhere”), and that widget queries the Talisians SPARQL endpoint (using the standard SPARQL protocol and query language) for data on blog posts (which are described using the SIOC vocabulary).
As Benjamin Nowack explained in his podcast with Danny Ayers, RDF allows developers to stop worrying about writing and maintaining complex relational database schemas. You just create your data as RDF and put it into the data-store. By the same token, developers such as Alexandre can query the data-store without needing to know very much about the structure of the data overall; with an RDF query language like SPARQL, you can just cherry pick the data that you want from the store, and popular community vocabularies like SIOC help enable this by providing a standard set of terms that the data can be described and retrieved with.
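As a rough illustration of that cherry picking, the sketch below builds a SPARQL query for blog post titles and the GET request URL that the SPARQL protocol defines for it. The endpoint address is a placeholder rather than a real Platform URL, `build_query_url` is a hypothetical helper, and the use of dc:title for post titles is an assumption about how the data is described.

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the SPARQL service URL of a real store.
ENDPOINT = "http://example.org/stores/blogs/services/sparql"

# Ask only for each post and its title, without knowing anything else about
# the store. Assumes titles are attached to sioc:Post resources with dc:title.
QUERY = """\
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?post ?title
WHERE {
  ?post a sioc:Post ;
        dc:title ?title .
}
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return the GET URL for a query, sent in the protocol's 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


request_url = build_query_url(ENDPOINT, QUERY)
```

Because the query names only the SIOC and Dublin Core terms it cares about, the same request should work against any store that describes its blog posts with those terms.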
Why not join the N2 community, request a Platform store of your own, and see at first hand the extent to which the Talis Platform can make it easier for you to build your own applications?
To receive notification of new issues of the Talis Platform News please remember to SUBSCRIBE.

