IBM Mashup Center is a powerful tool for quickly combining data sources to gain insight and for sharing the results as feeds and visualizations. Its uses run a wide gamut. It can be used to develop rich web interfaces without programming through the reuse of flexible components. On the other hand, it can be customized extensively through the development of components for specific scenarios. This article leans toward the former use case, helping you quickly get deep into Mashup Center by reusing existing components. We'll assume you have some basic familiarity with the product. For links to some introductory material, check out the Resources section.
In this first article in the "IBM Mashup Center — Need-to-know tips and tricks" series, we'll review the components of IBM Mashup Center and some terminology, then provide nine practical tips for working with feeds and building data mashups. In Part 2, we cover the part of the product where most of your end users will use your mashups: the mashup builder.
Mashup Center components and terminology
Mashup Center enables you to connect, catalog, combine, and expose data in innovative ways. The data can come from a range of source types within your enterprise, from your team or department, the web, and beyond. Let's review what is offered by each of the main product components, illustrated in Figure 1:
- The feed generator lets you quickly and securely generate feeds from a wide variety of data sources. Feeds can be created from enterprise, departmental, personal sources like Microsoft® Excel®, or the web. You can also expose existing external REST feeds as feeds within Mashup Center. Feeds that have been cataloged can be used elsewhere within your instance of Mashup Center, and they can also be accessed over HTTP directly from outside of the scope of the product. In both cases, all access control to the feeds you create depends on the settings you put in place.
- The catalog lists feeds, widgets, and pages you have created or that you are authorized to view. From the catalog, you can adjust, run, and edit your existing feeds and data mashups. You can also add cataloged widgets and pages to the mashup builder.
- The data mashup builder lets you remix and transform data from existing feeds into new ones.
- The mashup builder is the front end of your mashup, where you design your data presentation pages without coding. It's the place where you create and host pages that contain widgets that display data you wish to present.
Figure 1. Mashup Center components
Now that we've reviewed the core components, let's dive deeper into some tips for making the most of working with feeds and data mashups.
Tips for working with feeds and building data mashups
1. Using cataloged sources makes sense
The data mashup function in Mashup Center provides a rich facility to combine, transform, and otherwise manipulate data from a variety of sources. Mashup Center's catalog allows you to organize your data from a variety of sources and data mashups let you combine them effectively.
All data mashups contain at least one Source operator, which allows you to pick an input of data to manipulate. Your Source operator can reference an external feed directly, or you can reference a feed you have already cataloged in Mashup Center. For a number of reasons, we recommend referencing cataloged feeds. The easiest way to do this is shown in the Source operator dialog, depicted in Figure 2.
There are a few advantages to taking this approach. First, it will enable you to reference feeds that require authentication. Also, cataloging assists with the management of your feeds, as the catalog shows the dependency relationships between data mashups and feeds. Figure 3 shows that in the catalog entry for a data mashup, you can see what cataloged sources the mashup depends on and what, if any, downstream mashups depend on it. Also, if you rename a cataloged feed, for example, mashups that depend on it will use the new name and shouldn't break. Yet another advantage is that Mashup Center can be configured to capture statistics for cataloged feeds. And perhaps most importantly, using this technique will let you define other settings such as parameters on that source feed that you may wish to reuse across multiple mashups.
Figure 2. Selecting a source from the catalog
Figure 3. Data mashup dependencies shown on Details page
2. Relating operators and functions to elements
A central point to keep in mind is that data mashups work based on models of XML operations, with an assumption that input XML generally consists of one or more repeating entries. In Atom feeds, the repeating elements are <entry>, and we'll use that convention to explain this concept. The trick with building data mashups is keeping in mind how operators and functions relate to these <entry> elements.
Operators, the basic building blocks of data mashups, include Filter, Transform, Group, and Extract: everything you see in the toolbox palette in the data mashup builder. The operators are shown in Figure 4. These operators each work across the set of <entry> elements provided to the given operator. In other words, when you use an operator such as For Each, it performs its operation against each <entry> in the list. You could use the Filter operator across each <entry> in the sample XML in Listing 1, for example, so you only get the expense record for employee xy24.
Also, you could sort all of the entries within your list, but you couldn't use sort to directly sort child elements within the entries; you would need to extract elements into their own entries first to sort them. Note that the sample intends to represent Atom code, but we've removed some of the namespace syntax for clarity.
Figure 4. Data mashup operators
Listing 1. Sample XML of employee expense data
<feed>
  <entry>
    <content>
      <expense-record>
        <employeeId>xy24</employeeId>
        <expense>
          <type>auto</type>
          <cost>115</cost>
        </expense>
        <expense>
          <type>hotel</type>
          <cost>1025</cost>
        </expense>
      </expense-record>
    </content>
  </entry>
  <entry>
    <content>
      <expense-record>
        <employeeId>sy86</employeeId>
        <expense>
          <type>auto</type>
          <cost>238</cost>
        </expense>
        <expense>
          <type>hotel</type>
          <cost>1227</cost>
        </expense>
      </expense-record>
    </content>
  </entry>
</feed>
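To make the "operators work across entries" idea concrete, here is a minimal Python sketch of what a Filter over Listing 1 amounts to. It is only an analogy using Python's standard xml.etree.ElementTree module, not how Mashup Center is implemented, and the helper name filter_entries is our own invention.

```python
import xml.etree.ElementTree as ET

# Listing 1, compacted; Atom namespaces are omitted as in the article.
FEED_XML = """
<feed>
  <entry><content><expense-record>
    <employeeId>xy24</employeeId>
    <expense><type>auto</type><cost>115</cost></expense>
    <expense><type>hotel</type><cost>1025</cost></expense>
  </expense-record></content></entry>
  <entry><content><expense-record>
    <employeeId>sy86</employeeId>
    <expense><type>auto</type><cost>238</cost></expense>
    <expense><type>hotel</type><cost>1227</cost></expense>
  </expense-record></content></entry>
</feed>
"""


def filter_entries(feed, employee_id):
    """Hypothetical helper: keep the entries whose employeeId matches."""
    return [
        entry
        for entry in feed.findall("entry")  # the test runs once per <entry>
        if entry.findtext("content/expense-record/employeeId") == employee_id
    ]


feed = ET.fromstring(FEED_XML.strip())
matches = filter_entries(feed, "xy24")
print(len(matches))  # 1
```

The condition is evaluated once per <entry>, which is the same unit of work the operators use.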
Now let's contrast functions with operators. Functions are available in some operators, such as Transform and Sort. Functions all operate within <entry> blocks. Figure 5 shows a few of the functions available. Say you want to add a set of numbers within an <entry>. The sum function will help you sum over the elements inside each <entry> in the list. Again, referring to Listing 1, we could use the sum function to directly sum the expenses within each employee expense record <entry>, as this is what the function is designed to do. At the same time, we couldn't use the sum function to directly sum the expenses across all employee <entry>s without first using an Extract or Group operator to gather all of the expense types together, because the employee expenses are in different <entry> blocks.
Figure 5. Selecting a function
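The difference between summing within each <entry> and summing across entries can be sketched in a few lines of Python. This is an illustration of the concept with the Listing 1 data, assuming nothing about Mashup Center internals; the variable names are ours.

```python
import xml.etree.ElementTree as ET

# Listing 1, compacted; Atom namespaces are omitted as in the article.
FEED_XML = """
<feed>
  <entry><content><expense-record>
    <employeeId>xy24</employeeId>
    <expense><type>auto</type><cost>115</cost></expense>
    <expense><type>hotel</type><cost>1025</cost></expense>
  </expense-record></content></entry>
  <entry><content><expense-record>
    <employeeId>sy86</employeeId>
    <expense><type>auto</type><cost>238</cost></expense>
    <expense><type>hotel</type><cost>1227</cost></expense>
  </expense-record></content></entry>
</feed>
"""

feed = ET.fromstring(FEED_XML.strip())

# Function-style sum: evaluated separately inside each <entry>.
per_entry = {
    entry.findtext("content/expense-record/employeeId"): sum(
        int(cost.text) for cost in entry.iter("cost")
    )
    for entry in feed.findall("entry")
}

# To total across entries, first gather every <cost> into one collection
# (the role Extract or Group plays), and only then sum.
all_costs = [int(cost.text) for cost in feed.iter("cost")]
grand_total = sum(all_costs)

print(per_entry)    # {'xy24': 1140, 'sy86': 1465}
print(grand_total)  # 2605
```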
3. Setting the repeating element
As noted in the previous section, data mashups work in terms of repeating elements, which are commonly <entry> blocks when the feed is formatted as Atom. In fact, in Mashup Center, database and spreadsheet feeds are automatically generated in Atom format. Some sample Atom output generated from a database feed is shown in Listing 2. In this case, the repeating elements are well known to be the set of <entry> elements.
Listing 2. Snippet of Atom output from a database feed
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://helium.svl.ibm.com:9081/mashuphub/client/plugin/generate/ent
  <title type="text">Sample DB Customer data</title>
  <link href="http://helium.svl.ibm.com:9081/mashuphub/cllient/plugin/gene
  <updated>2011-03-18T05:28:54.331Z</updated>
  <subtitle></subtitle>
  <generator>InfoSphere MashupHub</generator>
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title type="text">Item 1</title>
    <id>urn:uuid:1</id>
    <updated>2011-03-18T05:28:54.334Z</updated>
    <summary type="text">Atom Feed entry 1</summary>
    <content type="application/xml">
      <row xmlns="http://www.ibm.com/xmlns/atom/content/datarow/1.0">
        <custid>C10000</custid>
        <fname>Tony</fname>
        <lname>Kolbeck</lname>
        <address>91400 harry st</address>
        <city>Bradford</city>
        <state>Florida</state>
      </row>
    </content>
  </entry>
  <entry xmlns="http://www.w3.org/2005/Atom">
    <title type="text">Item 2</title>
    <id>urn:uuid:2</id>
    <updated>2011-03-18T05:28:54.334Z</updated>
    <summary type="text">Atom Feed entry 1</summary>
    <content type="application/xml">
      <row xmlns="http://www.ibm.com/xmlns/atom/content/datarow/1.0">
        <custid>C10001</custid>
        <fname>Shawn</fname>
        <lname>Nuriddin</lname>
        <address>67357 creek st</address>
        <city>Taylor</city>
        <state>Florida</state>
      </row>
    </content>
  </entry>
If the input to your data mashup is not Atom and is instead custom XML, you may need to specify what you wish your repeating element to be. For example, see some sample input to a data mashup in Listing 3. For our data mashup, we want to use <content> elements as our repeating element.
Listing 3. Sample XML with repeating element
<root>
  <content>
    <vegetable>carrot</vegetable>
    <color>orange</color>
  </content>
  <content>
    <vegetable>beet</vegetable>
    <color>purple</color>
  </content>
</root>
Mashup Center automatically tries to figure out the repeating element in your source, but in some cases, it needs your help to determine the elements you consider to be the repeating element in your source. To ensure that the Source operator has detected the correct element as a repeating element, select the advanced tab and check the repeating element field. If needed, change the field that represents the repeating element in your list. For the code in Listing 3, we'll want to set the repeating element to /root/content. You can see a screenshot of this setting in Figure 6.
Figure 6. Specifying a repeating element
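If it helps to see what choosing /root/content as the repeating element means, the following Python sketch selects those elements from the Listing 3 XML and treats each one as an entry. The function repeating_entries is a hypothetical helper of ours, and the path handling is deliberately simplistic (absolute paths of plain element names only).

```python
import xml.etree.ElementTree as ET

# Listing 3, compacted.
XML = """<root>
  <content><vegetable>carrot</vegetable><color>orange</color></content>
  <content><vegetable>beet</vegetable><color>purple</color></content>
</root>"""


def repeating_entries(xml_text, path):
    """Hypothetical helper: return the elements matched by an absolute
    path such as /root/content."""
    root = ET.fromstring(xml_text)
    parts = path.strip("/").split("/")
    if parts[0] != root.tag:
        return []  # the path does not start at this document's root
    if len(parts) == 1:
        return [root]
    # ElementTree paths are relative to the root element.
    return root.findall("/".join(parts[1:]))


# Each repeating element becomes one entry for later operators.
rows = [
    {child.tag: child.text for child in entry}
    for entry in repeating_entries(XML, "/root/content")
]
print(rows)
```

With a different repeating element, for example /root, the whole document would be treated as a single entry, and per-entry operators such as Filter would have only one thing to work on.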
4. Variables — Passing arguments into data mashups
Data mashups are powerful and let you combine data from multiple feed sources. For example, you may want to combine data from a relational database with data from a web service or from an Excel spreadsheet. Often, however, you want to take user input (e.g., a customer ID) to make the feed output dependent on the user input.
To add a parameter to a data mashup, we use the "Variables" dialog. The three screenshots below show how to specify a variable in Source, Filter, and Merge operators, respectively. The dialog allows you to specify the variable name, a default value, and a description. The variable name will be the parameter name that will be added to the feed.
Figure 7. Define a variable in a source operator
Figure 8. Define a variable in a Filter operator
Figure 9. Define a variable in a Merge operator
All of the options presented above will open the following Variables dialog.
Figure 10. Variables dialog to define or select variables for a data mashup
One thing to remember in the Variables dialog is that all variables defined in the given data mashup will be shown. Therefore, if you want to create additional variables, you need to click the + (plus) button highlighted in Figure 10. If you do not click the "+" button you will overwrite an existing variable.
Also, keep in mind that the row highlighted in dark gray in the Variables dialog is the variable value that will be used for that operator. Be careful to not accidentally select an unintended variable.
The defined variables will be added as parameters to the feed URL and available as input parameters to the data mashup. In our example above, we created two variables: customerID and productID. Both variables appear as parameters in the URL after the data mashup is saved, as shown in Figure 11.
Figure 11. Data mashup variables appear as parameters in the feed URL
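Because the variables surface as ordinary URL query parameters, a client can supply values by building the query string. The Python sketch below shows one way to do that with the standard urllib.parse module. The base URL and the sample values are placeholders we made up; only the parameter names customerID and productID come from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(feed_url, **params):
    """Return feed_url with the given query parameters added or replaced."""
    parts = urlsplit(feed_url)
    query = dict(parse_qsl(parts.query))  # keep any existing parameters
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical feed URL; substitute the URL shown for your own data mashup.
BASE_URL = "http://mashup.example.com/feeds/customer-orders"

url = with_params(BASE_URL, customerID="C10000", productID="P42")
print(url)
# http://mashup.example.com/feeds/customer-orders?customerID=C10000&productID=P42
```

Using urlencode rather than string concatenation means values containing spaces or other reserved characters are escaped correctly.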
5. Creating flexible database feeds
In the Enterprise Database (JDBC) feed generator, you can also use a variable directly in your SQL statement. The notation is ':variable' for non-integer values and :variable for integer values, as shown in Figure 12.
Figure 12. Variables in the Enterprise Database (JDBC) feed generator
This will make the variable (in this case, customerID) available as a parameter in your feed URL, as shown in Figure 13, and allows you to pass a variable directly to your SQL statement.
Figure 13. Parameter in database feed
6. The trick with the Merge operator
One common scenario for data mashups is to merge two feeds together based on a common element. You can, for example, merge a feed with customer data from a database with a feed with customer data from an Excel file based on the customer ID, as shown in Figure 14.
Figure 14. Merged feeds
Since, by default, both feeds have a content element, you will end up with two content elements in your output feed, making it difficult to distinguish which data comes from which source feed, as illustrated in Figure 15.
Figure 15. Merged feed has two content elements
The trick to avoid this, as shown in Figure 16, is to add a Transform operator after one of the Source operators and create an element that has a name different from "content."
Figure 16. Transform operator to create distinct content elements
This trick will create two distinct elements in the merged feed, allowing you to easily distinguish the content of the two feeds and transform them further.
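The effect of renaming before merging can be sketched in Python. This is a conceptual illustration only: the two small feeds, the region field, the element name excelContent, and both helper functions are assumptions of ours, and the code does not reproduce the Merge operator's actual behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified feeds (namespaces omitted).
DB_FEED = """<feed>
  <entry><content><custid>C10000</custid><lname>Kolbeck</lname></content></entry>
  <entry><content><custid>C10001</custid><lname>Nuriddin</lname></content></entry>
</feed>"""

XLS_FEED = """<feed>
  <entry><content><custid>C10000</custid><region>East</region></content></entry>
  <entry><content><custid>C10001</custid><region>West</region></content></entry>
</feed>"""


def rename_content(feed, new_tag):
    """Transform step: give each <content> element a distinct name."""
    for content in feed.iter("content"):
        content.tag = new_tag
    return feed


def merge_on(left, right, left_key, right_key):
    """Merge step: join entries whose key values match."""
    index = {e.findtext(right_key): e for e in right.findall("entry")}
    merged = ET.Element("feed")
    for entry in left.findall("entry"):
        match = index.get(entry.findtext(left_key))
        if match is None:
            continue  # no partner in the right-hand feed
        out = ET.SubElement(merged, "entry")
        out.extend(list(entry))  # children from the left entry
        out.extend(list(match))  # children from the right entry
    return merged


db = ET.fromstring(DB_FEED)
xls = rename_content(ET.fromstring(XLS_FEED), "excelContent")
merged = merge_on(db, xls, "content/custid", "excelContent/custid")

first = merged.find("entry")
print([child.tag for child in first])  # ['content', 'excelContent']
```

Without the rename, each merged entry would hold two children both named content, and a path such as content/custid would no longer say which source it refers to.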
7. Group operator with a twist
The Group operator is invaluable for grouping together elements that share a common attribute. Say we have a list of elements representing people, each with a country element inside. The Group operator lets us create groupings by country. We can use the associated data field to specify what from each element we want to include in the group. Figure 17 shows an example of an input list of people, each with certain sub-elements, including a country. Figure 18 shows how to configure the Group operator to group elements by country. Notice how we use the associated data settings to pull the list of people in each country into the country groups. Figure 19 shows the output preview of using the Group operator. If we didn't want all the data associated with each person but simply wanted their names, for example, we could set the associated data to be the path to the name element. That's the basic group functionality in a nutshell.
Figure 17. Input feed to Group operator
Figure 18. Configure the Group operator to group by country
Figure 19. Output preview of group by country
There is another, almost distinct, use of the Group operator. When you think of grouping, you normally specify a grouping criterion, like country in the example above. But sometimes, you might want to gather common elements into a single entry. In our example, you might want to move everyone's names from individual entries into a single entry. You might do this to apply a function to the entry, as discussed earlier. To do this, specify the data you want to lump into one entry, but leave the group expression field blank. Figure 20 shows the configuration, and Figure 21 shows the output.
Figure 20. Configure the Group operator to move country names to a single element
Figure 21. Output preview of group by country
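Both uses of grouping can be mimicked in a few lines of Python. The sketch below is an analogy only: the sample people, the group function, and its key and associated parameters are hypothetical stand-ins for the group expression and associated data fields.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the input feed's entries.
PEOPLE = [
    {"name": "Ana", "country": "Brazil"},
    {"name": "Chloe", "country": "Canada"},
    {"name": "Bruno", "country": "Brazil"},
]


def group(entries, key=None, associated=None):
    """Group entries by `key`. With key=None (a blank group expression),
    everything lands in a single group. `associated` picks which field of
    each entry is carried into the group; None carries the whole entry."""
    groups = defaultdict(list)
    for entry in entries:
        group_key = entry[key] if key is not None else None
        groups[group_key].append(entry[associated] if associated else entry)
    return dict(groups)


by_country = group(PEOPLE, key="country", associated="name")
single_entry = group(PEOPLE, associated="name")

print(by_country)    # {'Brazil': ['Ana', 'Bruno'], 'Canada': ['Chloe']}
print(single_entry)  # {None: ['Ana', 'Chloe', 'Bruno']}
```

The second call is the "twist": with no grouping key, all names end up in one collection, which is the shape a per-entry function needs in order to work across what used to be separate entries.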
8. Set data mashup feed type to Atom
Most feed generators return an Atom feed. Atom feeds work well out of the box, so the data can be displayed directly in out-of-the-box widgets, such as the Data Viewer, as shown in Figure 22.
Figure 22. Feed generators Atom output
After creating a feed from a data source, the next logical step is often to modify the feed via a data mashup. A simple sample data mashup is shown in Figure 23.
Figure 23. Data mashup example
Attempting to display the output of your data mashup in a Data Viewer widget often leads to the unexpected error shown in Figure 24, indicating that the specified data source is not valid.
Figure 24. Data viewer error: The specified data source is not valid
The reason for this is that the default feed type in the data mashup Publish operator is XML, but the Data Viewer is expecting data formatted as Atom. Changing the publish feed type to Atom in the Publish operator, as shown in Figure 25, is a quick fix and will make the data appear in the Data Viewer. We generally recommend outputting data mashups as Atom since most of the out-of-the-box widgets support the Atom format. Of course, there are good reasons why other output formats, such as XML or JSON, might be most appropriate in different circumstances.
Figure 25. Changing the Publish output to Atom
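When you are unsure which format a feed is actually producing, a quick check of the root element can save some guesswork. The Python sketch below tests whether a document's root is an Atom <feed> element, using the Atom namespace that appears in Listing 2. It is a troubleshooting aid of our own, not a description of how the Data Viewer validates its input, and both sample documents are made up.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # namespace used in Listing 2


def is_atom_feed(xml_text):
    """Return True if the document's root element is an Atom <feed>."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as {namespace}localname.
    return root.tag == f"{{{ATOM_NS}}}feed"


# Hypothetical outputs: one published as Atom, one as plain XML.
ATOM_OUTPUT = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>Customers</title>"
    "<entry><title>Item 1</title></entry>"
    "</feed>"
)
PLAIN_OUTPUT = "<root><content><custid>C10000</custid></content></root>"

print(is_atom_feed(ATOM_OUTPUT))   # True
print(is_atom_feed(PLAIN_OUTPUT))  # False
```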
9. Caching feeds and sources
In any data-rich application, caching is key to good performance. You will also want to ensure that your mashups aren't overloading the sources they call. For example, if you catalog a DB2® feed, you may want to enable caching on that feed so the source database isn't queried each time the corresponding mashup is requested. How long you retain your cache will depend on your application: an hour might be suitable, or a full day might even be fine.
There are two types of caches that can be enabled in Mashup Center. First, you can enable the cache on a cataloged feed. To do this, from the Hub catalog, select Edit details for the feed of your choice. In the details section, expand the caching section. There, you will be able to enable caching for the endpoint, as well as specify an expiry time for that cache. Figure 26 shows where you put in these details for any feed.
Figure 26. Enabling feed cache
The second type of cache is specific to data mashups. All data mashups contain at least one Source operator. You may decide to cache the source you specify to enhance the performance of your overall data flow. By default, Source operators are set to cache for an hour, and you can adjust these settings as required. To adjust the caching setting for a Source operator, click on the advanced tab. Figure 27 illustrates where these settings can be adjusted in the Source operator.
Figure 27. Enabling Source operator caching
In using the caching features, we've come across a couple of tips. To keep things easier to debug, when you use a feed in a mashup, we recommend using either a feed cache or a Source operator cache, but not both. In other words, if you enable source caching for a feed in a data mashup, avoid also enabling caching on the cataloged feed used in that Source operator. We tend to use feed caching when we want to optimize the output of a feed for use by a mashup page, or by a system or user outside of Mashup Center. We recommend source caching to improve the performance of a particular data mashup.
Note that in Mashup Center 2.x, if you have a feed that has caching enabled and you load that feed into a Source operator, the Source operator will not use the feed cache; it will always go back to the source data. You can, however, specify caching in the Source operator, as shown in Figure 27. In that case, Mashup Center will not query the source directly and will use the Source operator's cache instead. In Mashup Center 3.x, the behavior has changed. If the feed registered in the Source operator exists in the local catalog and has caching enabled, the Source operator will use the feed's cache. Source caching is also available in version 3.x. Remember that for a given input, we recommend not combining source and feed caching.
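One reason stacked caches are harder to debug is that their expiry times compound. The toy Python model below makes this visible with two time-based caches layered the way a Source operator cache would sit on top of a feed cache in the 3.x behavior described above. The TTLCache class, the fake clock, and the one-hour expiry values are our own illustration and not Mashup Center code.

```python
import time


class TTLCache:
    """Toy cache: remembers one fetched value until ttl_seconds elapse."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing is cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# A fake clock (in seconds) so the example runs instantly.
now = [0]
calls = []


def query_database():
    calls.append(now[0])
    return f"rows as of t={now[0]}"


feed_cache = TTLCache(query_database, 3600, clock=lambda: now[0])
source_cache = TTLCache(feed_cache.get, 3600, clock=lambda: now[0])

feed_cache.get()             # t=0: some consumer warms the feed cache
now[0] = 3599
source_cache.get()           # fills from the feed cache, not the database
now[0] = 7198
stale = source_cache.get()   # still served from the Source-level cache

print(stale)  # rows as of t=0
print(calls)  # [0]
```

In this model, data fetched at t=0 is still being served almost two hours later even though each cache is set to one hour, which is the kind of surprise that using a single cache layer per input avoids.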
Mashup Center's robust toolkit lets you develop data-driven applications in hours or days, rather than the months often required by conventional development. We hope that these tips will help you get going in Mashup Center even more quickly. Stay tuned for Part 2, where we cover tips for using the mashup builder part of the application. Happy mashing!
Resources
- Read "Introduction to creating mashups using IBM Mashup Center," where you will learn the end-to-end process of using IBM Mashup Center to create a real-world mashup and publish it to the catalog for others to use. You will learn how to turn data from a spreadsheet into a format you can use in your mashups and then display the data in a widget.
- Check out "Creating a feed from an enterprise database (JDBC)" to learn how to work with relational feeds.
- Read "Solution development using DB2 and InfoSphere MashupHub" to learn how to create various feeds for use with databases.
- Check out demos and videos at the ItsMashtastic YouTube channel.
- Visit the IBM Mashup Center wiki to learn more about specific product features and check out community-written articles for IBM Mashup Center.
- Visit the IBM Mashup Center 3.0 Information Center, where you can find information about administering and using IBM Mashup Center 3.0.
- Visit the IBM Mashup Center 2.0 Information Center for information about administering and using IBM Mashup Center 2.0.
- Learn more about Information Management at the developerWorks Information Management zone. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Follow developerWorks on Twitter.
Get products and technologies
- Try IBM Mashup Center on Lotus Greenhouse.
- Visit the IBM Mashup Center on Amazon EC2 and run your own instance of IBM Mashup Center on Amazon EC2.
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Check out the developerWorks blogs and get involved in the developerWorks community.