Welcome to the third issue of the Talis Library Platform News, the monthly touch point for innovation and discussion in the world of libraries and the technology they use.

Open Data - what licensing is needed to protect what from whom? It is not as clear-cut as you may assume. Joining the regular articles in this month's issue, Rob Styles explores some of the issues around the Open Data debate.

If you are new to the newsletter, sign up, pass it on to your colleagues, and join us in creating a community that shares, innovates and learns from each other. If you would like to contribute an article or offer comments, please email richard.wallis@talis.com.

Richard Wallis, Editor

Open Data

Rob Styles, Programme Manager, Data Services, Talis

For many months there has been a debate about the licensing and copyright issues that are holding back the opening up of library data. Much of this has centered on how much of library metadata, the MARC records, is fact and how much is creative work. Factual data cannot be protected by Copyright. Facts, titles, names, short phrases, single words: none of these can be Copyrighted in the sense of a Creative Work, but there is more to Copyright than that.

A few months back I was talking in Banff and then in Paris about the need to license data, not to keep it closed, but to keep it open. In that discussion I broke the world into three parts, Data, Metadata and Content. For the creative aspects there is Copyright protection, and various licences extend this in different ways, CC, GFDL and others.

But what about the data? The question isn't "can it be protected?" but "how does it need protecting and what from?". This question is one that the launch of the Open Library, hosted by the Internet Archive, has brought to the fore.

I trust the Internet Archive. They're probably the only people on the internet to have a wholly untarnished halo and that's a very good thing. But things can change. There are direct parallels between what The Open Library is doing and what CDDB did back in the 1990s. Fez has an overview of what happened to CDDB/Escient/Gracenote, but to summarize: a large community-generated database of music metadata got locked away by a corporate body. It didn't happen because CDDB planned all along to dupe their community; it didn't happen because anyone was 'evil'. It happened because a commercial organization needed to make money and the community had no protection from that.

An alternative service did spring up, which is what you want to happen in that circumstance. FreeDB.org set up using the CDDB software and someone in the Gracenote extended staff leaked them a copy of the database. With the correct licensing in place - a data equivalent of the GPL - FreeDB could simply have requested a copy. The community would have been protected.

I don't want what happened with CDDB to happen with The Open Library, or with other initiatives to open up library data. Preventing it requires a clear licence that protects the community from The Open Library as well as The Open Library from anyone else.

This is the area we developed the Talis Community Licence to cover (the name is a draft and will change). We've been using it to protect data contributions to the Talis Platform for over a year. It protects contributors from us, as it prevents us, or anyone else, from locking the community's data away at a later date. It's an Open Licence: anyone can use it to protect their users' contributions in the same way. A new draft of the licence will be opened up for public consultation in the next few weeks, so keep an eye on the Panlibus blog for announcements.

 

Ross Singer to join the Talis Team

Ross Singer is to join Talis from Georgia Tech Library. Ross, well known in library technology circles, was the winner of the Second OCLC Research Software Contest for his innovative work on the Umlaut OpenURL Link Resolver.

Ross announced the move on his blog Dilettante’s Ball, saying ...

I am really looking forward to Talis; not only do I think the work they’re doing is exciting and innovative, but, in my opinion, I think it’s the only way to push major ideas into libraries. Libraries are generally too risk-averse to look at the interesting things their peers are doing and adopt them. My work at Tech doesn’t show up in many places outside of Tech. It never will.

Featured Developer

Michael Stead, e-Resources Librarian, Bolton, Trafford and Wigan Libraries

I have a guilty secret: some people seem to think I know a lot of tech stuff. I know a bit: I did a little BASIC when I was a kid; I set up a wireless network at home; and I seem to provide a gratis 24-hour tech support hotline for my dad. But I don’t know that much.

Being young, confident and hopelessly optimistic, I thought I might like to try out the Talis APIs when I first read about them on Panlibus. One of the shortfalls of the online systems we use in libraries is that they’re inflexible; you can only search the OPAC by going to the OPAC. We all know that the OPAC is somewhat lacking, so I was interested in seeing if I could put a catalogue search in another environment. The Talis APIs opened that door for me.

The Silkworm Directory lists libraries whose holdings data are contributed to Talis Source, which is in turn copied to the Platform. I looked up Bolton Libraries and had a play with the example search box. I thought I might be able to copy and paste it onto the blog I had set up to support our old community information solution, Signpost. I was right: my Ctrl+C and Ctrl+V skills were unstoppable! Copying and pasting HTML is about as hard as your initial foray into APIs needs to get. I don’t even have an HTML editor at work, so I did all of this in Notepad and Blogger’s admin interface. It required a little thought and muttering to myself, but so do most aspects of librarianship.

Buoyed by this, I thought about applying the same principle to the community information solution itself. Signpost was very much a last-gen product: it looked dated and its search functionality was particularly lacking. I spent a morning looking at the HTML in Phil Bradley’s examples, the API-powered OPAC box in my blog and the way that Signpost builds URLs for searches, and I had a few search boxes built in time for lunch. I can’t remember what I had for lunch that day, but I’m pretty sure I’d earned it.

I was surprised a few months later when my boss came back from a day at the Talis offices, and told me that she’d seen a presentation that featured my work. It was cited as an encouraging, early use of the APIs. A few months after that, I was surprised again while reading Panlibus: Richard Wallis had uploaded the slides from one of his Library 2.0 presentations, which featured the same OPAC and Signpost searches.

That’s all there is to it. Honest. I went on to build a widget for the Council intranet using the code for the OPAC search box. I even got really fancy, and built a different version using Open URL Search to query the OPAC directly, rather than going via the Platform. In theory that’s a less stable option, because the Platform-hosted data store is backed up and stays operable if the OPAC itself falls over. But the advantage of this route is that it opens up more search options: the API is limited to author, title and ISBN. Open URL Search allows me to add keyword searches, which can provide more results.

Podcast of the Month

Although not strictly about library technology, this month's recommended Talking with Talis podcast is about something we all have an interest in: getting services to the users. The interviewee, Zoinul Abidin, is an Idea Store Manager for the London Borough of Tower Hamlets. Idea Stores are an innovative fusion of libraries, education and community engagement. This podcast provides an insight into the thinking behind Idea Stores, where the values still include 'books are core'.

Meet the API - Augmentation

This month we take a look at the Augmentation API for a Talis Platform Store. Like Item Query, the subject of last month’s Meet the API, Augmentation is a standard API available on every Platform store.

Augmentation is the process by which a set of query results, from another store or from somewhere external to the Platform, is passed through a store so that data from that store can be added to them. In practice it works as follows:

  • The URL of a search query is passed as a parameter to the Augment API.
  • The results of the query are parsed by the store to identify URIs that it recognizes. Dependent on the type(s) of data held in the store, each store will recognize different types of URI. For instance, a store containing book-jacket images would recognize an ISBN identifier URI, whereas a store containing Wikipedia extracts for authors would recognize Dublin Core creator URIs.
  • For each URI recognized, the store checks its contents and if it finds a match it adds its data to the result.
  • The only prerequisite is that the initial results are in RSS 1.0 format, the Platform’s default format for passing results.
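
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Platform's implementation: the sample results, the example.org URIs, the dictionary standing in for a store, and the choice of rdfs:seeAlso as the property used to attach the extra data are all assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical RSS 1.0 search results; real results would come from a query URL.
SAMPLE_RESULTS = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns="{RSS}" xmlns:dc="{DC}">
  <item rdf:about="http://example.org/items/1">
    <title>Example title one</title>
    <dc:creator rdf:resource="http://example.org/authors/a"/>
  </item>
  <item rdf:about="http://example.org/items/2">
    <title>Example title two</title>
    <dc:creator rdf:resource="http://example.org/authors/b"/>
  </item>
</rdf:RDF>"""

# Hypothetical stand-in for a store: creator URIs it recognizes -> data it holds.
STORE = {"http://example.org/authors/a": "http://example.org/abstracts/a"}


def augment(rss_xml, store):
    """Add the store's data to every result item that mentions a URI it knows."""
    root = ET.fromstring(rss_xml)
    for item in root.findall(f"{{{RSS}}}item"):
        # Copy the child list first, because matching items gain new children.
        for child in list(item):
            uri = child.get(f"{{{RDF}}}resource")
            if uri in store:
                # Assumption: the extra data is attached as an rdfs:seeAlso link.
                extra = ET.SubElement(item, f"{{{RDFS}}}seeAlso")
                extra.set(f"{{{RDF}}}resource", store[uri])
    return root


augmented = augment(SAMPLE_RESULTS, STORE)
print(ET.tostring(augmented, encoding="unicode"))
```

Only the first item gains an extra link, because the stand-in store recognizes only the first creator URI. Results that the store knows nothing about pass through unchanged.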

As always showing is much better than telling, so open up a web browser and off we go. I will take you through an example of augmenting a set of bibliographic results by adding links to Wikipedia abstracts about the author.

The Augmentation API for the Platform store containing Wikipedia abstracts can be found here: http://api.talis.com/stores/wikipedia/services/augment. As with all Platform APIs, if you do not give it any parameters it provides a friendly form for you to experiment with. As you will see from the screen, the main parameter it requires is the URL of a set of search results, so let's give it one. This one is the Harry Potter search we described in last month's Meet the API about Item Query: http://api.talis.com/stores/ukbib/items?query=Harry+Potter&max=10&offset=0. Copy that URL, paste it into the prompt and click the Augment button.
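
If you would rather script this step than use the form, the same request can be put together as a single URL. The sketch below is hedged: the two URLs are the ones given above, but the parameter name data-uri is an assumption on my part (check the form's own field name), and the helper function is purely illustrative. The important detail is that the search URL must be percent-encoded so that its own ? and & characters are not mistaken for part of the Augment request.

```python
from urllib.parse import urlencode

AUGMENT_SERVICE = "http://api.talis.com/stores/wikipedia/services/augment"
SEARCH_URL = "http://api.talis.com/stores/ukbib/items?query=Harry+Potter&max=10&offset=0"


def build_augment_url(service, results_url, param="data-uri"):
    """Return the service URL with the search URL attached as an encoded parameter.

    The default parameter name is an assumption, not taken from the article.
    """
    return service + "?" + urlencode({param: results_url})


print(build_augment_url(AUGMENT_SERVICE, SEARCH_URL))
```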

As described last month, you will be presented with your browser's attempt at nicely formatting an RSS feed. Using the View Source option will show you the underlying XML of the results; a quick search for wikipedia will take you to the individual results that have had links to Wikipedia articles added to them. Unfortunately your browser will not be expecting this type of information, so in this experimentation mode the only way to see it is to view the source. An application built on top of the Platform API, either directly or by using the XSL Stylesheet capability of the API, will have no problem using it for user display.
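
As a rough idea of what such an application might do, the sketch below performs the programmatic equivalent of "view source and search for wikipedia": it walks the XML and collects any attribute value or element text that mentions Wikipedia. The sample document is invented for illustration, including the rdfs:seeAlso element and the article URL, so the real augmented output may well be shaped differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical augmented result; the element carrying the link is an assumption.
AUGMENTED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <item rdf:about="http://example.org/items/1">
    <title>Example title</title>
    <rdfs:seeAlso rdf:resource="http://en.wikipedia.org/wiki/Example_Author"/>
  </item>
</rdf:RDF>"""


def find_links(xml_text, needle="wikipedia"):
    """Collect attribute values and element text containing the needle."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        values = list(element.attrib.values())
        if element.text:
            values.append(element.text.strip())
        for value in values:
            # Case-insensitive match; skip values already collected.
            if needle in value.lower() and value not in found:
                found.append(value)
    return found


print(find_links(AUGMENTED))
```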

Augmentation is not restricted to results obtained by searching other Platform Stores. The Talis Developer Network user guide article Bigfoot - an initial tour contains an example of taking a set of results from OCLC's xISBN service and augmenting them with bibliographic information.

Meet the Team

Amanda Gaynor, Developer, Talis

This month we meet Amanda Gaynor. After finishing her successful MSc placement at Talis, in which she delivered an innovative project that enabled libraries to dispose of stock using Amazon Marketplace, Amanda was offered a permanent job as a Developer in the research department, which later evolved into the Platform Team. Since then she has worked on the Silkworm directory component, and is now one of the team of developers implementing functionality for the storage components of the Talis Platform.

"I've worked as a software developer at a number of different companies before joining Talis and though the work was rewarding what is special about being here is the way in which we work. As developers we are given a lot of responsibility not just at the coding stage but at the requirements and design stage. A software design is not "handed down" to you but must be thought through yourself after understanding the requirements. Requirements are written as stories. These can be deliberately under-specified so that as developers we have to get involved in fleshing them out before we can start design and implementation. Being involved in the requirements process at this early stage gives you a more in depth understanding of what is needed and how things should be designed.

Working at Talis was the first time I'd practised test driven development and writing your tests first is a good way of ensuring that you understand the requirements and what the behaviour of the system that you are implementing should be.

During my time at Talis I have had the privilege to work with some very bright people and have learned a lot from them. It is also an environment where people are comfortable discussing ideas and learning from one another."

 

To receive notification of new issues of the Talis Library Platform News please remember to SUBSCRIBE.