Level: Intermediate
Martin C Brown (questions@mcslp.com), Freelance Writer and Consultant, MCslp
13 Sep 2005
The semantic grid uses metadata to describe information in the grid. Turning information into something more than just a collection of data means understanding the context, format, and significance of the data. The semantic Web follows this model by providing additional metadata that helps describe the information being viewed on a Web page, thereby allowing browsers, applications, and users to make better decisions about how to deal with the data. The semantic grid applies similar principles to the information used in a grid environment. In this article, we take a closer look at what the semantic grid is, how to use it, and what this will mean to your future grid applications.
Understanding information
As humans, we have an amazing capacity to identify and work with a wide variety of information and understand how we can use and exploit that information for our needs. Ignoring the basics of the senses in the human body, simple data types like what we see (video), what we hear (audio), and what we read (textual) can easily be converted, translated, manipulated, combined, and reused by our brains.
Even more impressive is that we can read, watch, or listen to information with an incredibly high probability of correctly identifying the context and subject of that information, to the point of being able to mentally identify and file it along with other similar information.
For example, nobody thinks twice about regurgitating as speech something we have watched or listened to. Similarly, we can mentally combine the information from a number of sources (video, audio, and textual) and create a summary of that information in any one of those formats.
But for a computer, that process is much more difficult. Understanding the textual content of a Web page, for example, is quite complicated. Even using basic text analysis techniques may not be enough to identify the correct subject. Part of the problem is the vagaries of the language we use. For example, in English, some words, even complex technical ones, have multiple meanings. Just consider the term services, which has many meanings, even within the realm of IT and computing.
The problem is that as we use computers to store, manipulate, and interact with more and more information, it becomes increasingly difficult for a computer to organize and work with the information effectively. Searching for and finding resources and information are particularly challenging tasks. As we make computers more autonomous (take Google as a good example of a resource that updates itself), our ability to find and work with that information effectively must also increase.
The semantic Web
Google, of course, searches the Web for information, indexing the words on a page. However, there is very little context to the information indexed and searchable through Google. For example, if you are looking for services, you'll need to add extra qualifications to the search words you use to help Google identify what type of service you are referring to (see Figure 1). However, even with this information, Google still will only pick out pages that contain those qualifiers. To find what you are looking for, you may need to be more creative about the word combinations you use in your search.
Figure 1. Searching for "services" on Google
The current shortcomings of Web pages are in large part due to the fact that they are written in Hypertext Markup Language (HTML), a language primarily concerned with marking up the text for formatting, rather than identifying and tagging content. There have been extensions (meta tags) to the HTML standard to allow some additional information to be added to a page, but this is largely superfluous and used to tag the page as a whole, rather than the individual pieces of content on a given page.
The other aim of HTML was for a page to provide links to other pages and sources of information. However, like the textual content of the page itself, these links are provided without any context beyond that provided by the language that surrounds or describes them.
The semantic Web aims to resolve these issues by supplementing the human-readable information on a page with machine-readable metadata that applications can then use to categorize and organize that information.
It shouldn't be any surprise that XML forms a key part of the process, as do other technologies you may have heard of before. The key parts are:
-
XML is used to define the structure of the document and also (with RDF) to help describe the additional metadata for that document. Note, however, that the use of XML doesn't necessarily imply that the documents should be sourced in XML and translated on the fly to XHTML (for example), just that XML is a key base standard for the easy formatting and sharing of information.
-
Resource Description Framework (RDF) is a tool for describing metadata about objects and references. RDF is a recognized format for the distribution of metadata information in existing situations. For example, RDF is one of the formats used in syndication for distributing descriptions about stories within a Web page.
-
Web Ontology Language (OWL) is a markup language for describing ontological data about a given resource based on the principles of RDF. Ontological data is a structured description of the content within the realm of a specific subject. OWL provides not only structural and content information but also methods for describing the links between topics and subjects, and how the subjects relate (i.e., whether a subject is a subclass of a larger topic, and whether it has a direct or indirect relationship). The basic principle is similar to the classic animal, vegetable, mineral game and classification system often taught in school. For example, using an ontology, you could define an herbivore (vegetarian) as an entity that eats only vegetable matter, a carnivore as one that eats only other animals, and an omnivore as one that eats both. Check Resources for some examples of RDF and OWL-based classifications.
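To make the herbivore, carnivore, and omnivore example a little more concrete, here is a minimal sketch of the underlying idea in Python. It is not OWL or RDF syntax and it uses no semantic Web library; it simply stores subject-predicate-object statements as tuples, which is the shape RDF data takes. All of the names (Animal, Plant, Cow, Bear, the subClassOf and eats predicates) are hypothetical and chosen only for illustration, and the sketch does not capture the "eats only" restriction that a real ontology language could express.

```python
# Hypothetical vocabulary for illustration only; not taken from a published ontology.
# Each statement is a (subject, predicate, object) triple.
TRIPLES = {
    ("Herbivore", "subClassOf", "Animal"),
    ("Carnivore", "subClassOf", "Animal"),
    ("Omnivore", "subClassOf", "Animal"),
    ("Herbivore", "eats", "Plant"),
    ("Carnivore", "eats", "Animal"),
    ("Omnivore", "eats", "Plant"),
    ("Omnivore", "eats", "Animal"),
    ("Cow", "subClassOf", "Herbivore"),
    ("Bear", "subClassOf", "Omnivore"),
}


def objects(subject, predicate):
    """Return every object stated directly for a subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def ancestors(cls):
    """Return all classes reachable by following subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in objects(current, "subClassOf"):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def diet(cls):
    """Return what a class eats, including what it inherits from superclasses."""
    result = set(objects(cls, "eats"))
    for parent in ancestors(cls):
        result |= objects(parent, "eats")
    return result


if __name__ == "__main__":
    print(sorted(ancestors("Cow")))
    print(sorted(diet("Bear")))
```

Nothing states directly what a cow eats. That fact is inferred by following the subclass link to Herbivore, which is the kind of relationship an ontology is meant to expose to software.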
All three of these technologies are used together to help provide semantic information about a given resource and the resources to which it may be linked (see Figure 2). Resources are identified by a Uniform Resource Identifier (URI); the RDF and OWL data are then connected to the URI, describing not only its content but also the relationship between content items and their relationship with other URIs or content types.
Figure 2. Semantic document structure
Because the information about the URI is in a structured format, it can be parsed and manipulated by a computer to determine the links and relationships to other URIs by analysis. For example, through analysis, you could identify whether two Web pages covered similar topics by comparing the information in the RDF and OWL resources.
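As a rough illustration of that kind of comparison, the sketch below assumes the topic metadata for each URI has already been extracted into a set of topic labels. The URIs, the topic labels, and the choice of Jaccard overlap as the similarity measure are all assumptions made for this example; the semantic Web standards do not prescribe any particular measure.

```python
# Hypothetical, already-extracted topic metadata keyed by URI.
PAGE_TOPICS = {
    "http://example.org/page-a": {"grid-computing", "metadata", "rdf"},
    "http://example.org/page-b": {"rdf", "metadata", "ontology"},
    "http://example.org/page-c": {"whale-sharks", "photography"},
}


def topic_overlap(uri_a, uri_b):
    """Jaccard overlap of two pages' topics: shared topics / all topics."""
    topics_a = PAGE_TOPICS[uri_a]
    topics_b = PAGE_TOPICS[uri_b]
    union = topics_a | topics_b
    if not union:
        return 0.0
    return len(topics_a & topics_b) / len(union)


if __name__ == "__main__":
    print(topic_overlap("http://example.org/page-a", "http://example.org/page-b"))
    print(topic_overlap("http://example.org/page-a", "http://example.org/page-c"))
```

The comparison never looks at the words on either page. It works entirely from the structured descriptions attached to the two URIs.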
In a searching context (e.g., within the realms of Google), it would be possible to explicitly state the type of information you were looking for without having to rely on the page containing the words you specified. For example, you could look for cleaning services, even though the page returned may not actually contain either of those two words.
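The difference between the two styles of search can be sketched in a few lines of Python. The pages, their declared types, and the small class hierarchy below are all invented for this example, and this is in no way a description of how Google or any real search engine works.

```python
# Hypothetical class hierarchy: each class maps to its parent class.
SUBCLASS_OF = {
    "CleaningService": "Service",
    "WebService": "Service",
    "WindowCleaning": "CleaningService",
}

# Hypothetical pages, each with a declared type and some visible text.
PAGES = [
    {"uri": "http://example.org/sparkle",
     "type": "WindowCleaning",
     "text": "We make your windows shine."},
    {"uri": "http://example.org/soap-api",
     "type": "WebService",
     "text": "SOAP endpoints for cleaning services data."},
]


def is_a(cls, target):
    """True if cls is target or a descendant of target in the hierarchy."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def keyword_search(words):
    """Return pages whose visible text contains every search word."""
    return [p["uri"] for p in PAGES
            if all(w in p["text"].lower() for w in words)]


def semantic_search(target):
    """Return pages whose declared type falls under the target class."""
    return [p["uri"] for p in PAGES if is_a(p["type"], target)]


if __name__ == "__main__":
    print(keyword_search(["cleaning", "services"]))
    print(semantic_search("CleaningService"))
```

The keyword search returns the Web service page because it happens to contain both words, while the search by type returns the window cleaning page even though neither word appears in its text.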
The semantic grid
The semantic grid is the application of the principles of the semantic Web to the grid environment. I really can't put it any better than the quote from the semantic grid group, which describes the semantic grid as "an extension of the current grid in which information and services are given well-defined meaning, better enabling computers and people to work in cooperation."
In fact, the semantic grid is often viewed as the result of combining grids and semantic Web technology upon a graph of increasing integration and data computation, as shown in Figure 3.
Figure 3. Graph of increasing integration and data computation
There are really two aspects to the semantic grid: the discovery of available resources for processing the data and the ability to integrate the data. Let's take a quick closer look at how these two aspects affect the implementation of a semantic grid.
Discovery and reuse
The discovery side of the semantic grid is designed to make it easier for grids to be discovered across the Internet. This requires a more detailed definition of the capabilities required by a user or application to enable them to find and make use of the grid. This will help grid users reuse existing resources and technology for their grid requirements, instead of building new grids and applications for processing new data.
For example, a grid that provides computational resources for the scientific community can probably be used by a wide range of users and organizations. Rather than developing a single-use application for a grid environment, the existing grid infrastructure and grid application could be made available to other people.
The complexity is in the way the grid services and capabilities are described. This is where the nature of the semantic grid, with the heavy classification and description of abilities and facilities, will be employed to make it easier to determine what facilities a specific grid can provide.
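One way to picture discovery is as a match between the capabilities a user requires and the capabilities each grid advertises. The sketch below is only that, a picture: the grid names and capability labels are hypothetical, and a real semantic grid would describe capabilities with a shared ontology rather than plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridDescription:
    """A hypothetical, simplified description of what a grid offers."""
    name: str
    capabilities: frozenset


# Hypothetical registry of advertised grids.
GRIDS = [
    GridDescription("physics-grid", frozenset({"compute", "matrix-algebra", "mpi"})),
    GridDescription("archive-grid", frozenset({"storage", "replication"})),
    GridDescription("bio-grid", frozenset({"compute", "sequence-alignment"})),
]


def discover(required):
    """Return the names of grids that advertise every required capability."""
    needed = frozenset(required)
    return [grid.name for grid in GRIDS if needed <= grid.capabilities]


if __name__ == "__main__":
    print(discover({"compute"}))
    print(discover({"compute", "sequence-alignment"}))
```

The richer and more consistent the descriptions, the more likely it is that an existing grid can be found and reused rather than a new one being built.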
Data integration
As with information in the Web and semantic Web, the other half of the power with the semantic grid will be the way in which we can link and cooperate with the information stored and available within a grid.
For resource grids (those sharing disk and storage space, instead of providing CPU power), which is how many see the Web already acting, using grid technology (Web services, security, etc.) to provide links and connectivity between information will provide an efficient way of storing information. For example, with a semantic grid component that stores photos, combined with a semantic grid that stores video material, it should be possible to make connections and associations: you could search for photos of a subject, such as whale sharks, then also find related videos (Figure 4). This is a simplified example. More likely, we will see semantic grids used for storing and identifying complex data types, such as complex proteins and DNA.
Figure 4. Making connections and associations across the semantic grid
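The whale shark example comes down to two separate stores that describe their contents with the same subject vocabulary. A minimal sketch, assuming hypothetical file names and subject labels, might look like this:

```python
# Hypothetical contents of two separate grids. Each item is tagged with a
# subject drawn from a shared vocabulary (an assumption of this sketch).
PHOTO_GRID = [
    ("photo-001.jpg", "whale-shark"),
    ("photo-002.jpg", "manta-ray"),
]
VIDEO_GRID = [
    ("dive-2004.mpg", "whale-shark"),
    ("reef.mpg", "coral"),
]


def related_by_subject(subject):
    """Collect items about one subject from both grids."""
    return {
        "photos": [name for name, tag in PHOTO_GRID if tag == subject],
        "videos": [name for name, tag in VIDEO_GRID if tag == subject],
    }


if __name__ == "__main__":
    print(related_by_subject("whale-shark"))
```

The association works only because both grids use the same label for the same subject, which is exactly the agreement that shared semantic descriptions are meant to provide.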
In addition, the definition of the data stored and processed by a grid will enable users to string together multiple grids to provide complex calculations. A good example here does come from the science world -- that of DNA and proteins. Using a semantic grid, you could process DNA information using one grid, and identify and manipulate proteins related to that DNA structure using another grid, simply by feeding information generated by one grid into the other. The process would be possible because the data and structure of the data will be known and usable by both grid systems. Sharing the data between these two will simply be a case of parsing and understanding the semantic data -- the structure and format -- of the two grid systems (see Figure 5).
Figure 5. Sharing data and data structure between two grid systems
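The idea of stringing grids together can be sketched by giving each grid a declared input type and output type, then refusing to connect two grids whose types do not line up. Everything in the sketch is an assumption made for illustration: the service names, the type labels, the four-entry codon table, and the two toy processing functions, which stand in for whatever real work the grids would do.

```python
from dataclasses import dataclass
from typing import Callable

# Tiny, deliberately incomplete codon table; "*" marks a stop codon.
CODONS = {"AUG": "M", "UUU": "F", "GGC": "G", "UAA": "*"}


def transcribe(dna):
    """Toy stand-in for a DNA grid: coding-strand DNA to mRNA."""
    return dna.replace("T", "U")


def translate(mrna):
    """Toy stand-in for a protein grid: mRNA to a one-letter protein string."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODONS.get(mrna[i:i + 3], "?")  # "?" for codons not in the table
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


@dataclass
class GridService:
    """A hypothetical grid described by the data it consumes and produces."""
    name: str
    consumes: str
    produces: str
    run: Callable[[str], str]


DNA_GRID = GridService("dna-grid", "dna-sequence", "mrna-sequence", transcribe)
PROTEIN_GRID = GridService("protein-grid", "mrna-sequence", "protein-sequence", translate)


def chain(services, data, data_type):
    """Feed data through each service, checking declared types at every step."""
    for service in services:
        if service.consumes != data_type:
            raise ValueError(
                f"{service.name} expects {service.consumes}, got {data_type}")
        data = service.run(data)
        data_type = service.produces
    return data, data_type


if __name__ == "__main__":
    print(chain([DNA_GRID, PROTEIN_GRID], "ATGTTTGGCTAA", "dna-sequence"))
```

The type check is the important part. Because each grid publishes a description of its data, a mismatch can be caught before any work is submitted, rather than discovered from garbage output afterwards.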
Building a semantic grid
The standards and (dare I say) semantics of how the semantic Web and the semantic grid will work are still being discussed, but there is some work and thought that you can put into your existing grid environment to make it easier to migrate to the semantic grid standard once it has been agreed upon.
First and foremost, when designing your grid, give some consideration to how it might be developed or adapted in a way that would allow other people to make use of the resources you are looking to provide. For example, if you are building a computational grid, think about whether the computational resources or applications used to support them can be made flexible enough that others could use them.
Second, consider how you might describe and define the data that will be stored and used by your grid. Does the data used or generated by your grid have a structured format, and can it be described or defined in a way that could be used by other people? Or could the grid be used to process other people's data if defined in a suitable format?
Last, ensure that you are familiar with the core elements and practicalities of the different standards (XML, RDF, OWL) and how they can be used and applied to your own grids and applications.
Resources
Learn
-
SemanticGrid.org contains numerous white papers, examples and documents detailing the theories of the semantic grid.
-
The Global Grid Forum Semantic Grid Research Group is a focal point for discussing the semantic grid within the realms of the wider grid community and standards.
-
For examples of the OWL language and ontology for describing 'things' and their relationships, see the documentation for the Protege OWL Plugin.
-
Wikipedia has a number of good articles covering some of the main technologies used. A good starting point is the Semantic Grid wiki. And from the Semantic Web wiki, you can find links to:
-
Read "An introduction to RDF" to explore the standard for Web-based metadata.
-
Friend of a friend technology is a popular application of semantic Web principles. The description given here may help you to understand how information and resources could be linked.
-
The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, by Michael C. Daconta, Leo J. Obrst, and Kevin T. Smith (John Wiley & Sons, ISBN 0-471-43257-1), is an excellent guide to semantic Web development and will help you to understand the main principles.
-
Towards the Semantic Web: Ontology-Driven Knowledge Management, by John Davies, Dieter Fensel, and Frank van Harmelen (John Wiley & Sons, ISBN 0-470-84867-7), provides information on ontology -- the method with which information will be described within the semantic Web and semantic grids.
-
Read Adaptive Information: Improving Business through Semantic Interoperability, Grid Computing, and Enterprise Integration, by Jeffrey T. Pollock and Ralph Hodgson (John Wiley & Sons, ISBN 0-471-48854-2).
-
The World Wide Web Consortium provides pages on the Semantic Web and OWL Web Ontology Language.
-
The Semantic Web Community Portal provides links and information about other sites looking at the issue of the semantic Web.
About the author
Martin Brown has been a professional writer for over eight years. He is the author of numerous books and articles across a range of topics. His expertise spans myriad development languages and platforms -- Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows, Solaris, Linux, BeOS, Mac OS/X and more -- as well as Web programming, systems management and integration. Martin is a regular contributor to ServerWatch.com, LinuxToday.com and IBM developerWorks, and a regular blogger at Computerworld, The Apple Blog and other sites, as well as a Subject Matter Expert (SME) for Microsoft. He can be contacted through his Web site at http://www.mcslp.com.