Linked Data was introduced by Tim Berners-Lee as a concept defined around four principles encouraging people to apply the basic tenets of the Web to data access:
- Use URIs as names for things
- Use HTTP URIs so that people can look up those names
- When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
- Include links to other URIs, so that they can discover more things
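The four principles can be illustrated with a small sketch. Here a Python dict stands in for the web of data: hypothetical HTTP URIs name things, a `lookup` function plays the role of an HTTP GET, and the data returned for one URI contains links to other URIs that a client can follow. All URIs and predicate names below are invented for illustration.

```python
# A toy "web of data": each HTTP URI maps to a set of RDF-style
# (subject, predicate, object) triples describing that resource.
# All URIs and predicates below are hypothetical examples.
DATA = {
    "http://example.org/people/alice": {
        ("http://example.org/people/alice", "name", "Alice"),
        # A link to another URI, so a client can discover more things.
        ("http://example.org/people/alice", "knows",
         "http://example.org/people/bob"),
    },
    "http://example.org/people/bob": {
        ("http://example.org/people/bob", "name", "Bob"),
    },
}

def lookup(uri):
    """Stand-in for an HTTP GET: dereference a URI, get useful data."""
    return DATA.get(uri, set())

# Principle 3: looking up a name yields a description of the thing.
alice = lookup("http://example.org/people/alice")

# Principle 4: follow links found in that description to discover more.
linked = [o for (s, p, o) in alice if p == "knows"]
bob = lookup(linked[0])
print(sorted(o for (s, p, o) in bob if p == "name"))  # prints ['Bob']
```

The point of the sketch is that the client needs no out-of-band knowledge: given one starting URI, everything else is reachable by dereferencing the links it finds.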
Linked Data uses RDF as the data model (not just as a format) and uses HTTP as the protocol, in a way similar to how the web is built using HTML, HTTP, and URLs.
Linked Data enjoys considerable success as a technology for publishing data on the World Wide Web. Large amounts of data are now available as Linked Data – DBpedia being a prominent example – and when the data is freely accessible it is referred to as Linked Open Data. But Linked Data can also be used as an architectural style for integrating applications, or for integrating data within the enterprise.
Linked Data's powerful distributed open graph data model makes it well suited to integrate data stored in various databases and file systems and to integrate applications around this data.
Some of the features that make Linked Data exceptionally well suited for integration include:
- A single interface – defined by the HTTP methods – that is universally understood and is constant across all applications. This is in contrast with the Remote Procedure Call (RPC) architecture where each application has a unique interface that has to be learned and coded to.
- A universal addressing scheme – provided by HTTP URLs – for both identifying and accessing all “entities”. This is in contrast with the RPC architecture where there is no uniform way to either identify or access data.
- A simple yet extensible data model – provided by RDF – for describing data about a resource in a way which doesn’t require prior knowledge of the vocabulary being used.
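The last point can be made concrete with a short sketch: because an RDF description is just a set of triples, two sources can describe the same resource independently, and a consumer can merge them with a plain set union, reading the predicates it understands and ignoring the rest. The URIs and predicate names are illustrative, not drawn from any real vocabulary.

```python
# Two independently produced RDF-style descriptions of the same
# resource. URIs and predicates are hypothetical.
hr_view = {
    ("http://example.org/emp/42", "name", "Dana"),
    ("http://example.org/emp/42", "dept", "Engineering"),
}
payroll_view = {
    ("http://example.org/emp/42", "grade", "7"),
    ("http://example.org/emp/42", "name", "Dana"),
}

# Merging needs no shared schema: RDF graphs combine by set union,
# and duplicate triples collapse automatically.
merged = hr_view | payroll_view

def value(graph, subject, predicate):
    """Read one predicate if present; unknown predicates are ignored."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None

print(value(merged, "http://example.org/emp/42", "dept"))   # Engineering
print(value(merged, "http://example.org/emp/42", "grade"))  # 7
```

Neither source had to know the other's vocabulary in advance, which is exactly the property that makes the model extensible.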
Linked Data builds on the existing World Wide Web infrastructure and inherits its key characteristics: it is distributed and it scales.
In December 2011, W3C hosted a workshop on Linked Enterprise Data Patterns, which resulted in a decision to launch a formal effort at the W3C. The workshop ended with unanimous agreement that "the W3C should create a Working Group to produce a W3C Recommendation which defines a Linked Data Platform [...], [expecting] this to be an enumeration of specs which constitute linked data, with some small additional specs to cover things like pagination, if necessary".
In March 2012, IBM submitted to W3C the Linked Data Basic Profile 1.0 specification to seed this effort. This was a joint submission with EMC, Oracle, Red Hat, DERI, SemanticWeb.com, as well as Siemens and Cambridge Semantics.
Linked Data Basic Profile is based on lessons learned from IBM's work on the Open Services for Lifecycle Collaboration (OSLC) initiative. It defines a set of best practices and a simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF. The specification builds on Tim Berners-Lee's four principles and provides some new rules as well as clarifications and extensions to achieve greater interoperability between Linked Data implementations.
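A minimal sketch of what "read-write Linked Data" means in practice, using an in-memory store in place of a real HTTP server: the four HTTP methods form the entire interface, with POST to a container minting new resource URIs. The class and URIs here are invented for illustration and are not part of the specification.

```python
# Minimal in-memory sketch of a read-write Linked Data container.
# GET/PUT/POST/DELETE are the whole interface; each resource's state
# is a set of RDF-style triples. Class name and URIs are hypothetical.
class Container:
    def __init__(self, base):
        self.base = base          # container URI, e.g. .../bugs/
        self.resources = {}       # resource URI -> set of triples
        self.next_id = 1

    def post(self, triples):
        """Create a new resource in the container; mint its URI."""
        uri = f"{self.base}{self.next_id}"
        self.next_id += 1
        self.resources[uri] = {(uri, p, o) for (_, p, o) in triples}
        return uri

    def get(self, uri):
        """Read a resource's current state."""
        return self.resources.get(uri)

    def put(self, uri, triples):
        """Replace a resource's state."""
        self.resources[uri] = set(triples)

    def delete(self, uri):
        """Remove a resource."""
        self.resources.pop(uri, None)

bugs = Container("http://example.org/bugs/")
uri = bugs.post({(None, "title", "crash on save")})
print(uri)  # prints http://example.org/bugs/1
```

Because every application speaks this same small interface, adding a new tool to the mix requires no new API to learn: it only needs to GET, PUT, POST, and DELETE resources.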
The proposed new W3C Working Group (WG), called Linked Data Platform, will be chartered to produce a W3C Recommendation for HTTP-based (RESTful) application integration patterns using read/write Linked Data, with IBM's submission serving as a starting point.
Linked Data technologies apply to a variety of use cases.
- Linked Data can be used to expose information on the Internet – public records, for example – in a machine-readable format addressable via URLs.
- Linked Data can be used for inferring new information from existing information, for example in pharmaceutical applications or IBM Watson.
- Linked Data can be used for integration. The IBM Rational team has been using Linked Data as an architectural model and implementation technology for application integration in the Application Lifecycle Management (ALM) domain, and Tivoli is now using it in the Integrated System Management domain.
RDF can model resources and their relationships, such that for ALM a change request becomes a resource exposed as RDF. The change request can be linked to the defect it is to address, and to the test that will validate the change made. With Linked Data the change management, defect management, and test management tools no longer connect to each other via specific interfaces but simply access the resources directly, following the Linked Data principles.
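As a sketch of that pattern, the tools' data can be pictured as one graph of triples that any tool traverses by following links, rather than by calling tool-specific interfaces. The predicate and URI names here are invented for illustration, not taken from any OSLC vocabulary.

```python
# One shared graph spanning three hypothetical tools: change
# management, defect management, and test management.
# All URIs and predicates are illustrative.
graph = {
    # The change request links to the defect it is to address...
    ("http://cm.example.org/change/7", "addresses",
     "http://dm.example.org/defect/99"),
    # ...and to the test that will validate the change made.
    ("http://cm.example.org/change/7", "validatedBy",
     "http://tm.example.org/test/3"),
    ("http://dm.example.org/defect/99", "status", "open"),
    ("http://tm.example.org/test/3", "status", "passing"),
}

def follow(graph, subject, predicate):
    """Traverse a link: no tool-specific API, just the shared graph."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]

change = "http://cm.example.org/change/7"
defect = follow(graph, change, "addresses")[0]
test = follow(graph, change, "validatedBy")[0]
print(follow(graph, defect, "status"))  # prints ['open']
print(follow(graph, test, "status"))    # prints ['passing']
```

The change management tool never calls a defect-tracker API here; it simply dereferences the defect's URI and reads its state, which is the essence of the Linked Data integration style.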
- Read further about Tim Berners-Lee's four principles.
- Learn more about Open Services for Lifecycle Collaboration (OSLC).
- Additional information about the December 2011 W3C Linked Enterprise Data Patterns workshop.
- Details about the Linked Data Basic Profile 1.0 submission.
- Details about the Linked Data Platform WG draft charter.
- Further information about Linked Data at W3C.
- Read W3C's collection of Semantic Web Case Studies and Use Cases.
- Further information about the Semantic Web at W3C.
Arnaud Le Hors, a member of the IBM software standards group, is responsible for driving the coordination of several IBM standards activities from a strategic and technical point of view. Arnaud has been working on open standards for 15 years, both as a staff member of the X Consortium and W3C and as a representative for IBM. He has been involved in every aspect of the standards development process, including technical, strategic, political, and legal aspects, both internal and external to an SDO and to a company like IBM. Arnaud was involved in the development of standards such as HTML and XML, and was one of the lead architects for Xerces, the XML parser developed by the Apache Software Foundation. Arnaud is currently IBM's Linked Data Standards Lead.