IBM Mashup Center Need-to-know tips and tricks, Part 1: Work with feeds and build data mashups

Accelerating your use of IBM Mashup Center versions 2 and 3

IBM® Mashup Center is a powerful tool aimed at gaining insight and sharing information by rapidly assembling and visualizing data. This article shares some of the essential "need-to-know" tips we have collected in our experience with the product. The first article in the "IBM Mashup Center — Need-to-know tips and tricks" series shares some practical and valuable hints on working with feeds and building data mashups.


Aaron Kasman (akasman@us.ibm.com), Advisory Software Engineer, IBM

Aaron Kasman is an advisory software engineer in the Innovation Engineering team in the IBM CIO Office where he focuses on internal Platform-as-a-Service (PaaS) offerings, with an interest toward supporting innovators and situational application development. Prior to this role, he was part of the IBM Software Services for WebSphere team developing IBM.com’s e-commerce presence. His interests include WebSphere sMash, Platform and Software-as-a-Service technologies, content and community management with Drupal and CiviCRM, and visual design.



Klaus Roder, Solutions Architect, IBM Mashup Center, IBM

Klaus Roder is a solutions architect for IBM Mashup Center at the IBM Silicon Valley Lab. His current focus is developing and deploying enterprise mashups with customers using IBM Mashup Center. Prior to his current role, he worked on the IBM Web Interface for Content Management (WEBI) project and the WebSphere Information Integrator Content Edition (IICE) development team. He holds a master's degree in computer science from the University of Applied Science in Würzburg and is a member of the San Francisco Bay Area ACM chapter.



07 April 2011

Overview

IBM Mashup Center is a powerful tool for quickly combining data sources to gain insight and for sharing the results as feeds and visualizations. Its uses run a wide gamut. At one end, you can develop rich web interfaces without programming by reusing flexible components. At the other, you can customize the product extensively by developing components for specific scenarios. This article leans toward the former use case, helping you get deep into Mashup Center quickly by reusing existing components. We assume you have some basic familiarity with the product. For links to some introductory material, check out the Resources section.

In this first article in the "IBM Mashup Center — Need-to-know tips and tricks" series, we'll review the components of IBM Mashup Center and some terminology, then provide nine practical tips for working with feeds and building data mashups. In Part 2, we cover the part of the product where most of your end users will use your mashups: the mashup builder.


Mashup Center components and terminology

What about InfoSphere MashupHub and Lotus Mashup Builder?

You may have heard of InfoSphere® MashupHub and Lotus® Mashup Builder. Sometimes, the feed, catalog, and data mashup capabilities are grouped under the name InfoSphere MashupHub. Mashup Builder is also known as Lotus Mashup Builder.

Mashup Center enables you to connect, catalog, combine, and expose data in innovative ways. The data can come from a range of source types within your enterprise, from your team or department, the web, and beyond. Let's review what is offered by each of the main product components, illustrated in Figure 1:

  • The feed generator lets you quickly and securely generate feeds from a wide variety of data sources. Feeds can be created from enterprise, departmental, and personal sources (such as Microsoft® Excel® spreadsheets) or from the web. You can also expose existing external REST feeds as feeds within Mashup Center. Feeds that have been cataloged can be used elsewhere within your instance of Mashup Center, and they can also be accessed directly over HTTP from outside the product. In both cases, all access control to the feeds you create depends on the settings you put in place.
  • The catalog lists feeds, widgets, and pages you have created or that you are authorized to view. From the catalog, you can adjust, run, and edit your existing feeds and data mashups. You can also add cataloged widgets and pages to the mashup builder.
  • The data mashup builder lets you remix and transform data from existing feeds into new ones.
  • The mashup builder is the front end of your mashup, where you design your data presentation pages without coding. It's the place where you create and host pages that contain widgets that display data you wish to present.
Figure 1. Mashup Center components
Image shows Mashup Center components: feed generator, data mashup builder, catalog, and mashup builder

Now that we've reviewed the core components, let's dive deeper into some tips for making the most of working with feeds and data mashups.


Tips for working with feeds and building data mashups

1. Using cataloged sources makes sense

The data mashup function in Mashup Center provides a rich facility to combine, transform, and otherwise manipulate data from a variety of sources. Mashup Center's catalog lets you organize data from those sources, and data mashups let you combine them effectively.

All data mashups contain at least one Source operator, which allows you to pick an input of data to manipulate. Your Source operator can reference an external feed directly, or you can reference a feed you have already cataloged in Mashup Center. For a number of reasons, we recommend referencing cataloged feeds. The easiest way to do this is shown in the Source operator dialog, depicted in Figure 2.

There are a few advantages to taking this approach. First, it will enable you to reference feeds that require authentication. Also, cataloging assists with the management of your feeds, as the catalog shows the dependency relationships between data mashups and feeds. Figure 3 shows that in the catalog entry for a data mashup, you can see what cataloged sources the mashup depends on and what, if any, downstream mashups depend on it. Also, if you rename a cataloged feed, for example, mashups that depend on it will use the new name and shouldn't break. Yet another advantage is that Mashup Center can be configured to capture statistics for cataloged feeds. And perhaps most importantly, using this technique will let you define other settings such as parameters on that source feed that you may wish to reuse across multiple mashups.

Figure 2. Selecting a source from the catalog
Image shows selecting a source from the catalog
Figure 3. Data mashup dependencies shown on Details page
Image shows a sample list of items

2. Relating operators and functions to elements

A central point to keep in mind is that data mashups work based on models of XML operations, with an assumption that the input XML generally consists of one or more repeating entries. In Atom feeds, the repeating elements are marked as <entry>, and we'll use that convention to explain this concept. The trick with building data mashups is keeping in mind how operators and functions relate to these entries.

Operators, the basic building blocks of data mashups, include Filter, Transform, Group, Extract — everything you see in the toolbox palette in the data mashup builder. The operators are shown in Figure 4. These operators each work across the set of entries provided to the given operator. In other words: When you use the Filter, Extract, or For Each operators, the operators will perform their operations against each <entry> in the list. You could use the Filter operator across each <entry> in the sample XML in Listing 1, for example, so you only get the expense record for employee xy24.

Also, you could sort all of the entries within your list, but you couldn't use the Sort operator to directly sort child elements within the entries; you would need to extract those elements into their own entries first. Note that the sample is meant to represent Atom, but we've removed some of the namespace syntax for clarity.

Figure 4. Data mashup operators
Image shows data mashup operators
Listing 1. Sample XML of employee expense data
<feed>
	<entry>
	  <content>
		<expense-record>
			<employeeId>xy24</employeeId>
			<expense>
				<type>auto</type>
				<cost>115</cost>
			</expense>
			<expense>
				<type>hotel</type>
				<cost>1025</cost>
			</expense>
		</expense-record>
	  </content>
	</entry>
	<entry>
	  <content>
		<expense-record>
			<employeeId>sy86</employeeId>
			<expense>
				<type>auto</type>
				<cost>238</cost>
			</expense>
			<expense>
				<type>hotel</type>
				<cost>1227</cost>
			</expense>
		</expense-record>
	  </content>
	</entry>
</feed>
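
To make the per-entry behavior concrete, here is a rough analogy in Python. This is not Mashup Center code; it only mimics what a Filter operator does with the Listing 1 data, using the standard xml.etree.ElementTree module. The helper name filter_entries is our own invention for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Listing 1 data (namespaces omitted, as in the listing).
FEED_XML = """
<feed>
  <entry><content><expense-record>
    <employeeId>xy24</employeeId>
    <expense><type>auto</type><cost>115</cost></expense>
    <expense><type>hotel</type><cost>1025</cost></expense>
  </expense-record></content></entry>
  <entry><content><expense-record>
    <employeeId>sy86</employeeId>
    <expense><type>auto</type><cost>238</cost></expense>
    <expense><type>hotel</type><cost>1227</cost></expense>
  </expense-record></content></entry>
</feed>
"""


def filter_entries(feed, employee_id):
    """Hypothetical helper: keep only the <entry> elements for one employee.

    The condition is evaluated once per <entry>, which is how an operator
    such as Filter walks the list of repeating elements.
    """
    path = "content/expense-record/employeeId"
    return [e for e in feed.findall("entry") if e.findtext(path) == employee_id]


feed = ET.fromstring(FEED_XML)
matches = filter_entries(feed, "xy24")
print(len(matches))  # 1
```

The point of the sketch is the shape of the loop: the operator sees a list of entries and makes one decision per entry.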

Now let's contrast functions with operators. Functions are available in some operators, such as Transform and Sort, and they all operate within <entry> blocks. Figure 5 shows a few of the functions available. Say you want to add a set of numbers within an <entry>. The sum function sums over the elements inside each <entry> in the list. Referring again to Listing 1, we could use the sum function to directly sum the expenses within each employee's expense record <entry>, as this is what the function is designed to do. However, we couldn't use the sum function to directly sum the expenses across all employee <entry>s, because those expenses live in different <entry>s; we would first need to use an Extract or Group operator to gather all of the expenses together.

Figure 5. Selecting a function
Image shows data mashup functions
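
The difference between summing within an entry and summing across entries can be sketched in Python. Again, this is only an analogy using the standard library, not the product's implementation, and the two function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Listing 1 data (namespaces omitted, as in the listing).
FEED_XML = """
<feed>
  <entry><content><expense-record>
    <employeeId>xy24</employeeId>
    <expense><type>auto</type><cost>115</cost></expense>
    <expense><type>hotel</type><cost>1025</cost></expense>
  </expense-record></content></entry>
  <entry><content><expense-record>
    <employeeId>sy86</employeeId>
    <expense><type>auto</type><cost>238</cost></expense>
    <expense><type>hotel</type><cost>1227</cost></expense>
  </expense-record></content></entry>
</feed>
"""


def sum_per_entry(feed):
    """What a function does: work inside each <entry>, one result per entry."""
    totals = {}
    for entry in feed.findall("entry"):
        record = entry.find("content/expense-record")
        costs = [int(c.text) for c in record.findall("expense/cost")]
        totals[record.findtext("employeeId")] = sum(costs)
    return totals


def sum_across_entries(feed):
    """Gather every <cost> into one list first (the extract or group step),
    and only then apply the sum."""
    path = "entry/content/expense-record/expense/cost"
    return sum(int(c.text) for c in feed.findall(path))


feed = ET.fromstring(FEED_XML)
print(sum_per_entry(feed))       # {'xy24': 1140, 'sy86': 1465}
print(sum_across_entries(feed))  # 2605
```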

3. Setting the repeating element

As noted in the previous section, data mashups work in terms of repeating elements, which are commonly <entry> blocks when the feed is formatted as Atom. In fact, in Mashup Center, database and spreadsheet feeds are automatically generated in Atom format. Some sample Atom output generated from a database feed is shown in Listing 2. In this case, the repeating element is well known: it is the set of <entry> elements.

Listing 2. Snippet of Atom output from a database feed
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://helium.svl.ibm.com:9081/mashuphub/client/plugin/generate/ent
  <title type="text">Sample DB Customer data</title>
  <link href="http://helium.svl.ibm.com:9081/mashuphub/cllient/plugin/gene
  <updated>2011-03-18T05:28:54.331Z</updated>
  <subtitle></subtitle>
  <generator>InfoSphere MashupHub</generator>

<entry xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Item 1</title>
  <id>urn:uuid:1</id>
  <updated>2011-03-18T05:28:54.334Z</updated>
  <summary type="text">Atom Feed entry 1</summary>
  <content type="application/xml">
    <row xmlns="http://www.ibm.com/xmlns/atom/content/datarow/1.0">
      <custid>C10000</custid>
      <fname>Tony</fname>
      <lname>Kolbeck</lname>
      <address>91400 harry st</address>
      <city>Bradford</city>
      <state>Florida</state>
   </row>
  </content>
</entry>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Item 2</title>
  <id>urn:uuid:2</id>
  <updated>2011-03-18T05:28:54.334Z</updated>
  <summary type="text">Atom Feed entry 1</summary>
  <content type="application/xml">
    <row xmlns="http://www.ibm.com/xmlns/atom/content/datarow/1.0">
      <custid>C10001</custid>
      <fname>Shawn</fname>
      <lname>Nuriddin</lname>
      <address>67357 creek st</address>
      <city>Taylor</city>
      <state>Florida</state>
   </row>
  </content>
</entry>

If the input to your data mashup is not Atom and is instead custom XML, you may need to specify what you wish your repeating element to be. For example, see some sample input to a data mashup in Listing 3. For our data mashup, we want to use <content> elements as our repeating element.

Listing 3. Sample XML with repeating element
<root>
	<content>
		<vegetable>carrot</vegetable>
		<color>orange</color>
	</content>
	<content>
		<vegetable>beet</vegetable>
		<color>purple</color>
	</content>
</root>

Mashup Center automatically tries to detect the repeating element in your source, but in some cases it needs your help. To check that the Source operator has detected the correct repeating element, select the advanced tab and check the repeating element field. If needed, change the field so that it points to the element that repeats in your list. For the code in Listing 3, we'll want to set the repeating element to /root/content. You can see a screenshot of this setting in Figure 6.

Figure 6. Specifying a repeating element
Image shows specifying a repeating element
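
To see why the choice of repeating element matters, here is a small Python sketch that splits the Listing 3 data into records using a path such as /root/content. The function rows_for is a hypothetical helper written for this illustration; it is not how Mashup Center parses feeds internally.

```python
import xml.etree.ElementTree as ET

# Compact copy of the Listing 3 data.
ROOT_XML = """
<root>
  <content><vegetable>carrot</vegetable><color>orange</color></content>
  <content><vegetable>beet</vegetable><color>purple</color></content>
</root>
"""


def rows_for(xml_text, repeating_path):
    """Hypothetical helper: one dict per element matched by the repeating path.

    The path is absolute (for example /root/content); its first step must
    name the document's root element.
    """
    root = ET.fromstring(xml_text)
    steps = repeating_path.strip("/").split("/")
    if steps[0] != root.tag:
        return []
    relative = "/".join(steps[1:])
    elements = root.findall(relative) if relative else [root]
    return [{child.tag: child.text for child in el} for el in elements]


print(rows_for(ROOT_XML, "/root/content"))
# [{'vegetable': 'carrot', 'color': 'orange'}, {'vegetable': 'beet', 'color': 'purple'}]
print(len(rows_for(ROOT_XML, "/root")))  # 1
```

With /root/content, each vegetable becomes its own record. With /root, everything collapses into a single record, which is the kind of result you get when the wrong repeating element is detected.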

4. Variables — Passing arguments into data mashups

Data mashups are powerful and let you combine data from multiple feed sources. For example, you may want to combine data from a relational database with data from a web service or from an Excel spreadsheet. Often, however, you want the feed output to depend on user input, such as a customer ID.

To add a parameter to a data mashup, we use the "Variables" dialog. The three screenshots below show how to specify a variable in Source, Filter, and Merge operators, respectively. The dialog allows you to specify the variable name, a default value, and a description. The variable name will be the parameter name that will be added to the feed.

Figure 7. Define a variable in a source operator
Image shows global variable in source operator
Figure 8. Define a variable in a Filter operator
Image shows defining a variable in a Filter operator
Figure 9. Define a variable in a Merge operator
Image shows defining a variable in a Merge operator

All of the options presented above will open the following Variables dialog.

Figure 10. Variables dialog to define or select variables for a data mashup
Image shows variables dialog to define or select variables for a data mashup

One thing to remember in the Variables dialog is that all variables defined in the given data mashup will be shown. Therefore, if you want to create additional variables, you need to click the + (plus) button highlighted in Figure 10. If you do not click the "+" button you will overwrite an existing variable.

Also, keep in mind that the row highlighted in dark gray in the Variables dialog is the variable value that will be used for that operator. Be careful to not accidentally select an unintended variable.

The defined variables will be added as parameters to the feed URL and available as input parameters to the data mashup. In our example above, we created two variables: customerID and productID. Both variables appear as parameters in the URL after the data mashup is saved, as shown in Figure 11.

Figure 11. Data mashup variables appear as parameters in the feed URL
Image shows data mashup variables appear as parameters in the feed URL
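
Because variables surface as ordinary URL parameters, any HTTP client can supply them. The Python sketch below builds such a URL with the standard library. The base URL and the parameter values are placeholders we made up; copy the real feed URL from your own catalog entry.

```python
from urllib.parse import urlencode, urlsplit

# Placeholder URL for illustration only; it is not a real Mashup Center address.
FEED_URL = "http://mashup.example.com/feeds/customer-orders"


def feed_url_with_variables(base_url, **variables):
    """Append data mashup variables to a feed URL as query parameters."""
    separator = "&" if urlsplit(base_url).query else "?"
    return base_url + separator + urlencode(variables)


url = feed_url_with_variables(FEED_URL, customerID="C10000", productID="P-42")
print(url)
# http://mashup.example.com/feeds/customer-orders?customerID=C10000&productID=P-42
```

Using urlencode rather than string concatenation keeps values with spaces or special characters safely escaped.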

5. Creating flexible database feeds

In the Enterprise Database (JDBC) feed generator, you can also use a variable directly in your SQL statement. The notation is ':variable' for non-integer values and :variable for integer values, as shown in Figure 12.

Figure 12. Variables in the Enterprise Database (JDBC) feed generator
Image shows variable in Enterprise Database (JDBC) feed generator

This makes the variable (in this case, customerID) available as a parameter in your feed URL, as shown in Figure 13, and lets you pass a value directly to your SQL statement.

Figure 13. Parameter in database feed
Image shows parameter in database feed
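
The :variable notation will look familiar if you have used named parameters in other database APIs. As a loose analogy only, here is the same idea with Python's built-in sqlite3 module and an in-memory table whose rows are borrowed from Listing 2. Note one difference we are assuming away: sqlite3 binds the value itself, so no quotes are written around the placeholder, whereas the Mashup Center notation described above quotes non-integer variables.

```python
import sqlite3


def customers_by_id(conn, customer_id):
    """Look up a customer using a named placeholder (:customerID)."""
    sql = "SELECT custid, fname, lname FROM customers WHERE custid = :customerID"
    return conn.execute(sql, {"customerID": customer_id}).fetchall()


# Throwaway in-memory database standing in for the enterprise source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (custid TEXT, fname TEXT, lname TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("C10000", "Tony", "Kolbeck"), ("C10001", "Shawn", "Nuriddin")],
)

print(customers_by_id(conn, "C10001"))  # [('C10001', 'Shawn', 'Nuriddin')]
```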

6. The trick with the Merge operator

One common scenario for data mashups is to merge two feeds based on a common element. You can, for example, merge a feed with customer data from a database with a feed with customer data from an Excel file based on the customer ID, as shown in Figure 14.

Figure 14. Merged feeds
Image shows merged feeds

Since, by default, both feeds have a content element, you will end up with two content elements in your output feed, making it difficult to distinguish which data comes from which source feed, as illustrated in Figure 15.

Figure 15. Merged feed has two content elements
Image shows that merged feed has two content elements

The trick to avoid this, as shown in Figure 16, is to add a Transform operator after one of the Source operators and create an element that has a name different from "content."

Figure 16. Transform operator to create distinct content elements
Image shows Transform operator to create distinct content elements

This trick will create two distinct elements in the merged feed, allowing you to easily distinguish the content of the two feeds and transform them further.
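
The rename-then-merge idea can be sketched in Python as follows. The two tiny feeds, the discount field, and the element name excel-content are all made up for illustration, and the helpers only imitate what the Transform and Merge operators do.

```python
import xml.etree.ElementTree as ET

# Two made-up feeds that share a custid element.
DB_FEED = """<feed>
  <entry><content><custid>C10000</custid><city>Bradford</city></content></entry>
  <entry><content><custid>C10001</custid><city>Taylor</city></content></entry>
</feed>"""

EXCEL_FEED = """<feed>
  <entry><content><custid>C10000</custid><discount>5</discount></content></entry>
  <entry><content><custid>C10001</custid><discount>10</discount></content></entry>
</feed>"""


def rename_content(feed, new_tag):
    """Imitates the Transform step: give <content> a distinct name."""
    for content in list(feed.iter("content")):
        content.tag = new_tag
    return feed


def merge_on_custid(left, right, right_tag):
    """Imitates the Merge step: attach the matching right-hand element
    to each left-hand entry, joined on custid."""
    by_id = {
        e.findtext(f"{right_tag}/custid"): e.find(right_tag)
        for e in right.findall("entry")
    }
    for entry in left.findall("entry"):
        match = by_id.get(entry.findtext("content/custid"))
        if match is not None:
            entry.append(match)
    return left


excel = rename_content(ET.fromstring(EXCEL_FEED), "excel-content")
merged = merge_on_custid(ET.fromstring(DB_FEED), excel, "excel-content")
first = merged.find("entry")
print([child.tag for child in first])  # ['content', 'excel-content']
```

After the rename, a later step can address excel-content/discount and content/city without any ambiguity about which source each value came from.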


7. Group operator with a twist

The Group operator is invaluable for grouping together elements that share a common attribute. Say we have a list of elements representing people, each with a country element inside. The Group operator lets us create groupings by country. We can use the associated data field to specify what we want to include in the group from each element. Figure 17 shows an example of an input list of people, each with certain sub-elements, including a country. Figure 18 shows how to configure the Group operator to group elements into countries. Notice how we use the associated data settings to pull the list of people in each country into the country groups. Figure 19 shows the output preview of using the Group operator. If we didn't want all the data associated with each person but simply wanted their names, for example, we could set the associated data to be the path to the name element. That's the basic group functionality in a nutshell.

Figure 17. Input feed to Group operator
Image shows input feed to Group operator
Figure 18. Configure the Group operator to group by country
Image shows configuring Group operator to group by country
Figure 19. Output preview of group by country
Image shows output preview of group by country

There is another, almost distinct use of the Group operator. When you think of grouping, you normally specify a grouping criterion, like country in the example above. But sometimes, you might want to group common elements into a single entry. In our example, you might want to move everyone's names from individual entries into a single entry. You might be doing this to perform a function on the entry, as discussed earlier. To do this, specify the data you want to lump into one entry, but leave the group expression field blank. Figure 20 shows the configuration, and Figure 21 shows the output.

Figure 20. Configure the Group operator to collect names into a single element
Image shows configuring the Group operator to collect names into a single element
Figure 21. Output preview of group by country
Output preview of group by country
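
Both uses of the Group operator can be imitated with a few lines of Python. The sample people and the group function below are hypothetical and only model the behavior described above: a group expression plus associated data, and the twist of leaving the group expression blank.

```python
from collections import defaultdict

# Made-up sample data standing in for the input feed.
PEOPLE = [
    {"name": "Ana", "country": "Brazil"},
    {"name": "Ben", "country": "Canada"},
    {"name": "Caio", "country": "Brazil"},
]


def group(entries, group_key=None, associated=None):
    """Hypothetical model of the Group operator.

    group_key  -- field to group by; None models a blank group expression
    associated -- field to carry into each group; None keeps the whole entry
    """
    groups = defaultdict(list)
    for entry in entries:
        key = entry[group_key] if group_key else None
        groups[key].append(entry[associated] if associated else entry)
    return dict(groups)


print(group(PEOPLE, group_key="country", associated="name"))
# {'Brazil': ['Ana', 'Caio'], 'Canada': ['Ben']}
print(group(PEOPLE, associated="name"))
# {None: ['Ana', 'Ben', 'Caio']}
```

With no group key, everything lands in one bucket, which is exactly what you want before applying a function such as sum or count to the combined entry.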

8. Set data mashup feed type to Atom

Most feed generators return an Atom feed. Atom feeds work well out of the box, so the data can be displayed directly in the supplied widgets, such as the Data Viewer, as shown in Figure 22.

Figure 22. Feed generators Atom output
Image shows feed generators Atom output

After creating a feed from a data source, the next logical step is often to modify the feed via a data mashup. A simple sample data mashup is shown in Figure 23.

Figure 23. Data mashup example
Image shows second feed generators Atom output

Attempting to display the output of your data mashup in a Data Viewer widget often leads to the unexpected error shown in Figure 24, indicating that the specified data source is not valid.

Figure 24. Data viewer error: The specified data source is not valid
Image shows Data Viewer error: The specified data source is not valid

The reason for this is that the default feed type in the data mashup Publish operator is XML, but the Data Viewer is expecting data formatted as Atom. Changing the publish feed type to Atom in the Publish operator, as shown in Figure 25, is a quick fix and will make the data appear in the Data Viewer. We generally recommend outputting data mashups as Atom since most of the out-of-the-box widgets support the Atom format. Of course, there are good reasons why other output formats, such as XML or JSON, might be most appropriate in different circumstances.

Figure 25. Changing the Publish output to Atom
Image shows changing the Publish output to Atom
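
If you are unsure what a published data mashup returns, a quick check of the root element tells you. The Python sketch below tests whether a document's root is an Atom feed element. It is only a convenience check of our own, not the validation the Data Viewer performs, and the two sample documents are made up.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def looks_like_atom(xml_text):
    """Return True if the document root is <feed> in the Atom namespace."""
    root = ET.fromstring(xml_text)
    return root.tag == f"{{{ATOM_NS}}}feed"


atom_doc = '<feed xmlns="http://www.w3.org/2005/Atom"><title>Sample</title></feed>'
plain_doc = "<root><content><custid>C10000</custid></content></root>"

print(looks_like_atom(atom_doc))   # True
print(looks_like_atom(plain_doc))  # False
```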

9. Caching

In any data-rich application, caching is key to good performance. You will also want to ensure that your mashups don't overload the sources they call. For example, if you catalog a DB2® feed, you may want to enable caching on that feed so the source database isn't queried each time the corresponding mashup is requested. How long you choose to retain your cache will depend on your application. Depending on your needs, an hour might be suitable, or a full day might even be fine.

There are two types of caches that can be enabled in Mashup Center. First, you can enable the cache on a cataloged feed. To do this, from the Hub catalog, select Edit details for the feed of your choice. In the details section, expand the caching section. There, you will be able to enable caching for the endpoint, as well as specify an expiry time for that cache. Figure 26 shows where you put in these details for any feed.

Figure 26. Enabling feed cache
Image shows enabling feed cache

The second type of cache is specific to data mashups. All data mashups comprise at least one Source operator. You may decide to cache the source you specify to improve the performance of your overall data flow. By default, Source operators are set to cache for an hour, and you can adjust these settings as required. To adjust the caching setting for a Source operator, click on the advanced tab. Figure 27 illustrates where these settings can be adjusted in the Source operator.

Figure 27. Enabling Source operator caching
Image shows enabling Source operator caching

In using the caching features, we've come across a couple of tips. To keep things easier to debug when using a feed in a mashup, we recommend using either a feed cache or a Source operator cache, but not both. In other words, if you enable source caching for a feed in a data mashup, avoid also enabling caching on the cataloged feed used in that Source operator. We tend to use feed caching when we want to optimize the output of a feed for use by a mashup page, or by a system or user outside of Mashup Center. We recommend source caching to improve the performance of a particular data mashup.

Note that in Mashup Center 2.x, if you have a feed that has cache enabled and you load that feed into a Source operator, the Source operator will not use the feed cache; it will always trigger the source data. You can, however, specify caching in the Source operator, as described in Figure 27. In that case, Mashup Center will not exercise the source directly and will leverage the Source operator's cache. In Mashup Center 3.x, the behavior has changed. If the feed registered in the Source operator exists in the local catalog and has caching enabled, the Source operator will use the feed's cache. Source caching is also available in version 3.x. Remember that for a given input, we recommend not combining source and feed caching.
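
One way to see why we avoid stacking the two caches is to model them. The Python sketch below is a simplified, hypothetical time-to-live cache, not Mashup Center's implementation. It chains a "feed cache" and a "source cache", each with a one-hour expiry, and uses a fake clock so the timing is easy to follow.

```python
import time


class TTLCache:
    """Minimal time-to-live cache around a fetch function (illustrative only)."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


now = [0.0]   # fake clock, in seconds
calls = []    # times at which the "database" was actually queried


def query_database():
    calls.append(now[0])
    return f"data as of t={now[0]:.0f}"


# Feed cache in front of the database, Source operator cache in front of the feed.
feed_cache = TTLCache(query_database, 3600, clock=lambda: now[0])
source_cache = TTLCache(feed_cache.get, 3600, clock=lambda: now[0])

feed_cache.get()           # t=0: feed cache is filled from the database
now[0] = 3599
source_cache.get()         # source cache is filled from the nearly expired feed cache
now[0] = 7198
print(source_cache.get())  # data as of t=0
```

In this model, two one-hour caches in a row can serve data that is almost two hours old, and it is not obvious from either setting alone. That is the debugging headache the recommendation above is meant to avoid.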


Wrapping up

Mashup Center's robust toolkit lets you develop data-driven applications in hours or days rather than the months often required by conventional development. We hope that these tips will help you get going in Mashup Center even more quickly. Stay tuned for Part 2, where we cover tips for using the Mashup Builder part of the application. Happy mashing!

Resources

Learn

Get products and technologies

Discuss
