Talis Connected Commons Frequently Asked Questions

The goal of this page is to answer questions concerning the Talis Connected Commons scheme offered by Talis. The programme provides a free hosting solution for public domain datasets. If you have a question that is not answered here, then please contact us at platform@talis.com.

What Is the Talis Connected Commons Scheme?

It is an offer made by Talis to the web community to freely host public domain datasets within the Talis Platform.

What is the Talis Platform?

The Talis Platform is a semantic web application platform developed by Talis Group Ltd. The Platform is offered as Software as a Service to developers wanting to build powerful data-rich applications.

Talis is already using the Platform to transform its existing products and to enable rapid development of a new generation of semantically rich applications and services. However the Platform is being developed as a separate product that can be used as a component within any application, either to drive application functionality or as a means to publish and share data.

What qualifies as Public Domain Data?

To qualify for inclusion in the scheme the data must be made available under a public domain license. We consider data placed into the public domain to be unconstrained by copyright and to be available for both non-commercial and commercial usage. We recommend the Open Data Commons Public Domain Dedication and License.

What license(s) can be used for Public Domain Data?

As noted above, we recommend the Open Data Commons Public Domain Dedication and License. The recently announced Creative Commons CC0 license can also be used to qualify for the scheme. As the development of licenses for data is still ongoing, we recognise that the community may eventually standardise around alternative licenses, and we are willing to host data so long as it is supported by a clear license that places it into the public domain for unrestricted use. However, we do reserve the right to refuse to host a dataset if we think there are any ambiguities around the license.

Why are Talis doing this?

Talis is keen to foster the continued growth of the Linked Data web and the emergence of a true web of data. While many organisations are able to publish and host datasets for themselves, we believe that organisations, communities, research groups and individuals would benefit from having a low cost way to sustainably publish data onto the web. By making the Talis Platform freely available for hosting public domain data we hope to further reduce the barrier to entry for taking part in the development of the web of data.

Are there any hidden costs?

No. We are committed to providing this service for free as long as is commercially feasible. Talis does offer commercial data hosting services as well as consultancy for data publishers and application developers who want to make use of the Talis Platform and semantic web technologies, but these are separate services and no-one who signs up to the Talis Connected Commons scheme will be expected to pay for any extra services that they don’t need.

What data formats can be hosted within the Platform?

The Platform is capable of hosting two kinds of data: unstructured data and structured metadata. The unstructured data storage can host any binary stream. The structured data storage is an RDF database (i.e. a triple store). The goal of this programme is to encourage the spread of RDF metadata on the web, so our primary goal is to offer a service to host this kind of data. Your existing data will need to be converted into RDF (specifically RDF/XML) in order to load it into the Platform. The Platform API can then be used to manage the data.
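As a rough illustration of the conversion step, the sketch below turns a single record into RDF/XML using only the Python standard library. The function name, the example subject URI and the choice of Dublin Core properties are assumptions made for illustration; they are not part of the Platform API, and a real dataset would normally use a vocabulary suited to its domain.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def record_to_rdfxml(subject_uri, title, creator):
    """Serialise one record as an RDF/XML document (hypothetical helper)."""
    # Register readable prefixes so the output uses rdf: and dc:.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # rdf:about names the resource that the statements describe.
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": subject_uri},
    )
    # Each child element contributes one triple about the subject.
    ET.SubElement(description, f"{{{DC_NS}}}title").text = title
    ET.SubElement(description, f"{{{DC_NS}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The URI below is a placeholder, not a real dataset identifier.
    print(record_to_rdfxml("http://example.org/id/book/1",
                           "Example Title", "Example Author"))
```

The resulting document contains two triples about one subject. How the file is then loaded is defined by the Platform API documentation rather than by this sketch.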

Can you crawl my data?

No. We are exploring whether the ability to do a limited kind of web crawling would be useful as an extra feature in the Platform, but this is not yet complete. At present the onus is on you to load data directly into the Platform.

Can you help convert my data into RDF?

If you don’t already have your data available as RDF, then we are willing and able to provide advice on how best to structure your data so that you, and others, can make the most of it when it is loaded into the Platform. However, Talis is not offering to provide developers to carry out this conversion for you. Our involvement will be limited to online community support, e.g. in the #talis IRC channel or on the n2-dev community mailing list. There is also a large and friendly community of semantic web developers who are able to provide advice and support, so you might also consider asking on, for example, the linking open data mailing list.

Talis does provide consultancy services to organisations who are interested in more hands-on help in publishing their data online using semantic web technologies and the Talis Platform. If you are interested in making use of these services then please contact us at platform@talis.com.

Why should I host my data in the Platform, rather than, say, Amazon?

In order to decide which environment is best for your data, it’s useful to review the relative benefits of using either service.

Amazon provide a service to host large public data sets so that they can easily be integrated into Amazon cloud based applications. This makes it easy to perform computation on large datasets without incurring massive costs in transferring the data to and from the Amazon services. However, the datasets are provided “as is” with no supporting services or infrastructure. In contrast, the Talis Connected Commons provides an environment for hosting data on the Talis Platform in order to tie that data into the Linked Data web. The Platform also provides a ready made set of services, including data management tools, full text search and querying, that can help make a dataset immediately useful. So by using the Talis Platform you immediately benefit from some value-added services.

Of course, these are not mutually exclusive options: it would be possible to publish data within the Talis Platform and also provide a dump of the whole dataset within Amazon. Similarly a dataset hosted by Amazon could be loaded into the Talis Platform assuming it was available as RDF. However Talis does not provide a facility to make this automatic.

In short there are benefits to be had in hosting your data in either Platform, and “multi-homing” — publishing the data through multiple environments — is always an option.

Can I host private data in the Platform?

Yes, but not under the terms of this programme. Hosting of datasets that are not in the public domain is a chargeable service. Contact us at platform@talis.com to discuss this option further.

How long will you host data under this programme?

We live in uncertain times and so we are cautious about making commitments that we might not be able to keep. We aim to offer this programme for as long as it is commercially feasible. The programme is essentially sponsored by the commercial services (hosting and consultancy) that we are building around the Platform. If we decide to end the programme we will ensure that there is plenty of notice provided and will support the community in migrating its data elsewhere. We are considering how best to put in place safe-guards to make sure that your data will remain on the web and are exploring options for archiving data to ensure its longevity. We are also considering open sourcing the Talis Platform API in order to ensure that data publishers can migrate away from the service in the event of a disaster or if the service must be discontinued.

How safe is my data?

While we have disaster recovery procedures in place for ensuring the continued operation of the Talis Platform, under this programme we are not making any strong commitments with respect to backing up your data. As the owner of the data it is your responsibility to take regular backups, using the “snapshot” feature provided in the Platform API.

What are my responsibilities?

The formal terms and conditions for the Talis Platform and the Talis Connected Commons scheme will spell out your (and our) responsibilities in full, but to summarize:

  • It is your responsibility to ensure that you have the rights to publish the data into the public domain, i.e. that it doesn’t infringe on anyone else’s rights or privacy
  • It is your responsibility to ensure that you take regular snapshots of your data
  • It is your responsibility to ensure that administrator access to the Platform is properly controlled, i.e. you cannot freely give out the administrator password and should take reasonable steps to ensure that the credentials are not stolen or misused.

Can I mirror an existing dataset already published as Linked Data?

Yes. We’re happy to provide a mirror for data that is already part of the Linked Data web. However, we are also keen to see other new and interesting datasets brought onto the web, so in the case where we have a number of people interested in taking advantage of this offer, we will prioritise access for new datasets over and above providing a mirror.

What kinds of data do you want to host?

Broadly, anything! We’re keen to see the Platform used to publish data in all kinds of different domains whether it is publishing, education, scientific research, local and national government, climate change research, travel, company and product listings, media archives. Anything.

How much data can we store on the Platform for free?

You can host up to 50 million RDF triples and up to 10 GB of other content.
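If your data is available as an N-Triples dump, one quick way to estimate whether it fits within the triple allowance is to count its statements, since N-Triples puts exactly one triple on each non-blank, non-comment line. The sketch below assumes such a dump; the function names are illustrative, and the count is an upper bound because a triple store holds duplicate statements only once.

```python
FREE_TRIPLE_LIMIT = 50_000_000  # the triple allowance stated above


def count_ntriples_statements(lines):
    """Count statements in an iterable of N-Triples lines."""
    count = 0
    for line in lines:
        stripped = line.strip()
        # Blank lines and '#' comment lines carry no triple.
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


def fits_free_allowance(lines, limit=FREE_TRIPLE_LIMIT):
    """Return True if the number of statements is within the limit."""
    return count_ntriples_statements(lines) <= limit


if __name__ == "__main__":
    sample = [
        "# a small sample dump",
        '<http://example.org/a> <http://purl.org/dc/elements/1.1/title> "A" .',
        "",
        '<http://example.org/b> <http://purl.org/dc/elements/1.1/title> "B" .',
    ]
    print(count_ntriples_statements(sample), fits_free_allowance(sample))
```

For a real dump, pass an open file object instead of a list so that the file is read one line at a time.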

Can I really host any kind of data in the Platform?

Within reasonable limits, yes. We do reserve the right to refuse any request for hosting, whether because we’re concerned about the associated licensing or provenance, or on other grounds such as infringement of privacy or unsavory content. Having given you access to the Platform to host data, as outlined in the terms and conditions, we also reserve the right to terminate the agreement at any time.

Our goal here is to foster growth of the linked data web, so we are expecting that interested parties will primarily be loading RDF data into the Platform, rather than making use of the unstructured storage. So for example, we’re interested in seeing datasets that include both photos and their metadata (as RDF), rather than just a large collection of photos.

What are the terms and conditions for the scheme?

We will shortly be publishing a full set of terms and conditions for using both the Talis Platform as a whole, and specific conditions that relate to this scheme.

Is there a service level agreement?

Again, this is an area we are working on, and we hope to shortly publish a limited service level agreement that applies to all free usage of the Platform API and facilities.

Are there any usage limits on accessing the data?

At present we’re offering a fair usage system for accessing data within the Platform, but reserve the right to change this at any time. We are considering how best to offer improved quality and range of services available to both data publishers and application developers. These will be chargeable extras. However we will always ensure that there is some level of free access to data freely hosted in the Platform under this agreement.

Are all of the Platform services freely available under the scheme?

The Platform API includes a number of different services including search, faceted searching, RDF data management, content storage, RSS feed augmentation and resource description. These services offer a rich set of options for publishing and manipulating linked data. All of these services, as well as access to a public SPARQL endpoint, are included within the Connected Commons scheme.

Some additional features of the Platform, e.g. the ability to store and query private datasets, are not part of the scheme. As Talis continues to develop the Platform as a product, additional services may also be made available. However, these services may not be available free of charge, even under the Connected Commons scheme. We will make sure that services that are not free to use are all clearly marked.
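To give a feel for what access to a public SPARQL endpoint involves, the sketch below builds the request URL for a query sent over HTTP GET, where the standard SPARQL protocol carries the query text in a `query` parameter. The endpoint address is a placeholder rather than a real Platform URL, and the helper name is hypothetical; the actual address for a store comes from the Platform documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder address: substitute the endpoint of your own store.
ENDPOINT = "http://example.org/stores/my-dataset/sparql"

# Ask for any ten triples, which is a handy smoke test after loading data.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_get_url(endpoint, query):
    """Return the GET URL for a SPARQL query (hypothetical helper)."""
    # urlencode percent-encodes the query text so that it is URL-safe.
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_sparql_get_url(ENDPOINT, QUERY))
```

The URL can be fetched with any HTTP client. No request is made here, so the sketch runs without network access.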

How do I sign up?

You can contact us at platform@talis.com and provide us with the following information:

  • A Title and Description of the dataset
  • An indication of its size
  • A name and contact email address for the primary contact/administrator of the dataset (this person will have the administrator privileges for the data)
  • The public domain license you will be using to publish the data, and some indication of your rights/ability to publish the data using this license.

We’ll then be in touch to set up access. We’ll provide you with a development store that can be used to carry out any test accesses, e.g. to test out data loading procedures, as well as a separate store for publishing the live dataset(s). A platform user account with rights to administer both of those stores will also be provided. If additional user accounts are required, then please let us know.

How can I find datasets that are being hosted under this scheme?

We have plans to improve the ability to browse and search for datasets on the Platform, and this includes providing support for automated dataset discovery through support for technologies like VoID. However, in the short term, once datasets are live on the Platform, we’ll be announcing them on the Nodalities blog.

We will also be encouraging dataset owners to register the data with CKAN, the Comprehensive Knowledge Archive Network. This will provide another route for open data hackers to find the data.
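For readers unfamiliar with VoID, the sketch below shows roughly what a minimal dataset description looks like, written out as Turtle. The helper name and all of the URIs in the usage example are illustrative assumptions; how (or whether) the Platform will consume such descriptions is not specified here.

```python
def void_description(dataset_uri, title, license_uri, sparql_endpoint):
    """Return a minimal VoID description as Turtle (hypothetical helper)."""
    # Escape backslashes and double quotes so the title is a valid literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{safe_title}" ;',
        f"    dcterms:license <{license_uri}> ;",
        f"    void:sparqlEndpoint <{sparql_endpoint}> .",
    ])


if __name__ == "__main__":
    # All URIs below are placeholders chosen for the example.
    print(void_description(
        "http://example.org/datasets/my-dataset",
        "My Example Dataset",
        "http://www.opendatacommons.org/licenses/pddl/1.0/",
        "http://example.org/stores/my-dataset/sparql",
    ))
```

A description like this states what the dataset is, which license applies and where it can be queried, which is the information an automated discovery tool would need.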