IBM Mashup Center comes with feed generators that can access and generate XML feeds directly from many enterprise data sources. At the same time, given the diversity of data stores and software, there will be data sources that cannot be accessed by these built-in generators. To allow you to augment the capability of the IBM Mashup Center, the feed generation capability can be extended by the addition of plug-ins.
This article is a follow-on to the article "Extend the reach of data for IBM Mashup Center" and is based on V1.1 of the software. It assumes that you are already familiar with the basics of writing an IBM Mashup Center plug-in. In particular, you should know how to program in Java™, JSP, and JavaScript. The article shows you how to develop a plug-in that converts HTML into XML, and uses this example to illustrate the writing of a more complex plug-in. As a side benefit, once the HTML is in XML format, it can be read into the Feed Mashup Editor, permitting data extraction.
Tools for converting HTML to XML
We will need a Java package that can convert HTML to XML. There are a number of such Java packages. This project uses JTidy, a Java port of HTML Tidy from W3C. HTML Tidy started as an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML. In addition, it supports the generation of XML. You'll find a link in the Resources section to download and install JTidy.
We will be using the Tidy.jar in the build folder. To make the JAR self-contained, it contains an older version of the W3C Document Object Model (DOM) classes. Since more recent versions of the W3C DOM classes are now included with the JDK, the classes from the package org.w3c.dom should be removed from Tidy.jar.
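One way to remove the bundled DOM classes is to copy the JAR entry by entry and skip everything under org/w3c/dom/. The following is a minimal sketch using only java.util.zip. The class name DomClassStripper is made up for illustration, and the sketch assumes the JAR is unsigned and that dropping every entry below that prefix is acceptable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Copies a JAR while dropping the bundled org.w3c.dom classes (illustrative helper). */
final class DomClassStripper {

    // JAR entry names use '/' as the separator, whatever the platform.
    static final String DOM_PREFIX = "org/w3c/dom/";

    /** Writes a filtered copy of sourceJar to targetJar and returns the number of skipped entries. */
    static int strip(Path sourceJar, Path targetJar) throws IOException {
        int skipped = 0;
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(sourceJar));
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(targetJar))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().startsWith(DOM_PREFIX)) {
                    skipped++;          // leave the old DOM classes (and their folder) out
                    continue;
                }
                out.putNextEntry(new ZipEntry(entry.getName()));
                in.transferTo(out);     // reads only up to the end of the current entry
                out.closeEntry();
            }
        }
        return skipped;
    }
}
```

Calling DomClassStripper.strip(Path.of("Tidy.jar"), Path.of("Tidy-nodom.jar")) would leave the original untouched and produce a copy suitable for WEB-INF/lib. Any archive tool that can delete a folder from a ZIP file achieves the same result.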
As described in Section 6.1 of the Application Programming Interface Reference, Version 1.0, the feed generation framework automatically searches for ZIP files containing third party plug-ins placed in the special folder <WebApplication>/WEB-INF/plugins. The ZIP archive must have the folder structure specified below:
- /client/plugins/PLUGIN_DIR -- Contains files for browsers, like images, JavaScript files and so on.
- /server/plugins/PLUGIN_DIR -- Contains files used by the plug-in to render itself (HTML files, JSP pages). Additional folders for plug-in files can be included.
- /WEB-INF/classes -- Contains the plug-in Java classes. This can be a hierarchy of folders. The classes will be copied to <WebApplication>/WEB-INF/classes.
- /WEB-INF/lib -- Contains JAR files used by the plug-in (third-party).
To simplify the final build and packaging of the plug-in, you might want to create a project using your favorite IDE having the same directory structure as required by the final ZIP archive. Figure 1 shows the layout of the Eclipse project created by the author.
Figure 1. Eclipse project
Note that the PLUGIN_DIR must be unique and the same as the package name of the plug-in. For this example, I used the package name sample.mashupcenter.tidyhtml. Observe also that we have placed Tidy.jar in the WEB-INF/lib folder and created a folder named lib_noship containing two additional JARs which are provided by the Mashup Center server during runtime but unavailable in the development environment. These JARs should not be included in the final deployment ZIP file. (In fact, loading of the plug-in ZIP file would fail if they were included by mistake.)
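Because a stray entry such as a lib_noship JAR can make the plug-in fail to load, it can help to check the entry names before shipping the ZIP file. The sketch below is an assumption about how such a check might look; it is not part of the Mashup Center API. The class name PluginZipLayout and the entry lib_noship/example.jar are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Checks ZIP entry names against the folder layout described above (illustrative helper). */
final class PluginZipLayout {

    /** Returns the entries that fall outside the four documented folders. */
    static List<String> misplaced(String pluginDir, List<String> entryNames) {
        // ZIP entry names carry no leading '/', unlike the paths shown in the article.
        List<String> allowed = List.of(
                "client/plugins/" + pluginDir + "/",
                "server/plugins/" + pluginDir + "/",
                "WEB-INF/classes/",
                "WEB-INF/lib/");
        List<String> bad = new ArrayList<>();
        for (String name : entryNames) {
            boolean ok = false;
            for (String prefix : allowed) {
                boolean inside = name.startsWith(prefix);
                // Directory entries such as "WEB-INF/" are parents of an allowed folder.
                boolean parentDir = name.endsWith("/") && prefix.startsWith(name);
                if (inside || parentDir) {
                    ok = true;
                    break;
                }
            }
            if (!ok) {
                bad.add(name);
            }
        }
        return bad;
    }
}
```

The entry names can be collected with java.util.zip.ZipFile; an empty result means every entry sits in one of the expected folders.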
Every Mashup Center plug-in has two major operations: an editor to collect creation parameters, and a generator to create the feed. (Feed normally refers to an XML document conforming to the RSS or Atom specification. Note that in this case, the generated XML will originate from HTML and will not conform to the RSS or Atom specification.) The plug-in framework finds out which Java classes implement the two key operations by reading the package.xml file, which must be placed in the server/plugins/PLUGIN_DIR folder. Listing 1 shows the package.xml file for this plug-in.
Listing 1. Package XML file
<plugin>
  <name>Tidy Html</name>
  <author>L. Mau</author>
  <version>1.0</version>
  <category>departmental</category>
  <editor>Html2XmlEditorPlugin</editor>
  <generator>Html2XmlGeneratorPlugin</generator>
  <description>convert any Html page into xhtml</description>
  <icon16path>/plugins/sample.mashupcenter.tidyhtml/icons/btn16_hello.gif</icon16path>
  <icon32path>/plugins/sample.mashupcenter.tidyhtml/icons/btn32_hello.gif</icon32path>
  <icon64path>/plugins/sample.mashupcenter.tidyhtml/icons/btn64_hello.gif</icon64path>
  <objectType>feed</objectType>
</plugin>
It is worth mentioning that the name, description, and version elements are for the benefit of the creator and are not used by the IBM Mashup Center plug-in framework. The plug-in framework uses the plugin.uiname property inside the ui.properties file as the name of the plug-in when presenting the list of options after users select New Feed.
The ui.properties file resides in the /server/plugins/PLUGIN_DIR/nls folder and is loaded using the standard Java resource bundle loading convention. For each supported language, place the translated plug-in name in a properties file with the locale appended to "ui". For example, the Japanese version of the file should be named ui_ja.properties.
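To see which file names the standard resource bundle convention produces for a given locale, the JDK's ResourceBundle.Control can be asked directly. The sketch below uses the plain base name "ui" for brevity; the class name BundleNames is made up, and the list covers only the candidates for the requested locale (the real lookup may also fall back to the server's default locale).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

/** Lists the properties files the JDK would look for, most specific first. */
final class BundleNames {

    static List<String> candidates(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            // e.g. base name "ui" and locale "ja" give the bundle name "ui_ja"
            String bundleName = control.toBundleName(baseName, candidate);
            names.add(control.toResourceName(bundleName, "properties"));
        }
        return names;
    }
}
```

For Locale.JAPANESE this yields ui_ja.properties followed by ui.properties, which is why an untranslated ui.properties acts as the fallback.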
The Html2XmlEditorPlugin extends BaseEditorPlugin, a base class that requires us to implement the renderEditor method.
Listing 2. Html2XmlEditorPlugin class
package sample.mashupcenter.tidyhtml;

// import statements omitted from listing

/**
 * This plugin uses JTidy to convert Html to Xml.
 */
public class Html2XmlEditorPlugin extends BaseEditorPlugin {
    private static final Log log = LogFactory.getLog(Html2XmlEditorPlugin.class);

    public static final String I18N_RESFILE = Html2XmlConstants.PLUGIN_NAME
            + ".nls.tidyhtml";
    public static final String HTTP_BASEURL = "plugins/"
            + Html2XmlConstants.PLUGIN_NAME + "/";
    public static final String RES_BASEURL = "/" + HTTP_BASEURL;
    public static final String HELPPATH = HTTP_BASEURL + "help/tidyhtml.htm";
The first statement in the class creates a static log instance from the Apache Commons Logging package. This is the same logging infrastructure used by the feed generation framework. Log messages will be interleaved with those from the feed generation framework and will be written to the following file: <WebApplication>/META-INF/logs/javamashuphub.log. By default, only messages at level WARN and above (for example ERROR) will be written to the log file.
The next statement defines the base name of the resource bundle for this plug-in. Even if there is no need to translate the strings into different languages, it is still good programming practice to keep all text strings displayed by the user interface in a separate resource file. Note that the file tidyhtml.properties is placed in the same directory as the ui.properties file described earlier.
The last few string constants define paths to various resources needed by the plug-in. Note that the runtime location of those resources mimics the structure of the plug-in ZIP file.
The renderEditor method is called by the framework when users choose to create a new feed using this plug-in, or when users edit an existing feed previously created by this plug-in. It takes two parameters, of types RequestData and Entry. These two parameters are actually common to all methods invoked by the framework in response to user actions. RequestData contains information sent from the browser, and Entry contains all the information maintained by the framework for this feed instance.
Listing 3. renderEditor method body
public ViewBean renderEditor(RequestData rdata, Entry entry)
{
    ResourceBundle i18n = ResourceBundle.getBundle(I18N_RESFILE, rdata.getLocale());
    String pluginId = this.getId();

    Html2XmlUrlViewBean htViewBean = new Html2XmlUrlViewBean();
    htViewBean.setEntry(entry);
    htViewBean.setHtmlUrl( entry.getAttribute(Html2XmlUrlViewBean.PARAM_HTMLURL ) );
    htViewBean.setSnapshot( entry.getAttribute(Html2XmlUrlViewBean.PARAM_SNAPSHOT ) );

    FormViewBean form = new FormViewBean();
    form.setSuffix( htViewBean.getSuffix() );
    form.addComponent( htViewBean );
    form.setOnsubmit( PluginHelper.getClientMe(pluginId, entry.getObjectId()) +
            ".invokeServer('displayHtmlPage'," +
            PluginHelper.getClientId(pluginId, entry.getObjectId()) +
            "_" + form.getSuffix() + ");");
    form.setEntry(entry);

    FrameViewBean frame = new FrameViewBean();
    frame.addComponent(form);
    frame.setLabel(entry.getTitle());
    frame.setTitle(i18n.getString("frame.urltitle"));
    frame.setEntry(entry);
    frame.setHelpPath( HELPPATH );
    return frame;
}
The method returns an instance of type ViewBean. A ViewBean is similar to a JavaBean, with getters and setters for display properties. Its main purpose is to specify the JSP used by the feed generation framework to create the HTML for the plug-in specific editor. Since the renderEditor method could be called to edit an existing instance, it retrieves any previously saved data for this instance by calling the Entry::getAttribute method. We will see later when and how these data are saved. The retrieved values are then passed to the Html2XmlUrlViewBean so that the associated JSP can display the values previously supplied by the user. Note that the plug-in specific Html2XmlUrlViewBean is not returned directly, but instead is wrapped inside a FormViewBean instance via the addComponent method. The FormViewBean provides the custom JavaScript logic to send the user-entered information to the plug-in when the Next button of the wizard-like editor interface is clicked. Finally, the FormViewBean is in turn wrapped inside a FrameViewBean. It is the latter which is returned.
One last note before moving on. We call the setOnsubmit method to provide a chunk of JavaScript code to execute when the Next button is clicked. The JavaScript code calls the hub.managers.InvokePlugin's invokeServer function described in Section 6.3.2 of the Application Programming Interface Reference. The first parameter specifies the displayHtmlPage method in this class which will be used to service the next editor page.
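To make the shape of the generated JavaScript easier to see, the sketch below repeats the string concatenation from renderEditor with plain strings. The values me_1 and id_1 are invented placeholders; the article does not show what PluginHelper.getClientMe and PluginHelper.getClientId actually return, and the class name OnSubmitSketch is hypothetical.

```java
/** Rebuilds the onsubmit snippet from renderEditor using placeholder values. */
final class OnSubmitSketch {

    /**
     * clientMe and clientId stand in for the results of PluginHelper.getClientMe
     * and PluginHelper.getClientId; suffix is the value of FormViewBean.getSuffix().
     */
    static String onSubmit(String clientMe, String serverMethod, String clientId, String suffix) {
        // The second argument is emitted unquoted, so the browser treats it as a
        // JavaScript identifier rather than a string literal.
        return clientMe + ".invokeServer('" + serverMethod + "'," + clientId + "_" + suffix + ");";
    }
}
```

With those placeholders, onSubmit("me_1", "displayHtmlPage", "id_1", "tidyhtmlUrl") produces me_1.invokeServer('displayHtmlPage',id_1_tidyhtmlUrl); which is the statement the browser runs when Next is clicked.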
Html2XmlUrlViewBean and its associated JSP file
We have briefly described the Html2XmlUrlViewBean already in the previous section. Listing 4 shows part of the class definition:
Listing 4. Html2XmlUrlViewBean
public class Html2XmlUrlViewBean extends ViewBean
{
    public static final String PARAM_HTMLURL = "htmlurl";
    public static final String PARAM_SNAPSHOT = "snapshot";

    private String htmlUrl;
    private String snapshot;

    public Html2XmlUrlViewBean()
    {
        this.setI18NProperties( Html2XmlConstants.PLUGIN_NAME + ".nls.tidyhtml");
    }

    /* (non-Javadoc)
     * @see com.ibm.mashuphub.model.ViewBean#getJSPPath()
     */
    @Override
    public String getJSPPath() {
        return "/server/plugins/" + Html2XmlConstants.PLUGIN_NAME + "/tidyhtmlUrl.jsp";
    }

    public String getSuffix() {
        return "tidyhtmlUrl";
    }
The getJSPPath method will be called by the FormViewBean when it tries to generate the HTML form for gathering these plug-in specific parameters. The getSuffix method should return a string unique among the various ViewBeans from this plug-in. Before looking at the associated JSP file, it helps to first look at the rendered HTML form:
Figure 2. InfoSphere MashupHub first editor page
Notice that the form has two input elements:
- A textfield for collecting the URL for the HTML the user wants to convert to XML, and
- A checkbox to indicate that we will save the generated XML at the first invocation and will simply return the XML in subsequent feed generation requests. This is appropriate when the HTML page is static and rarely changes.
Now that we have seen what is to be generated, it is much easier to understand the JSP file.
Listing 5. tidyhtmlUrl.jsp
<%@page import="java.util.ResourceBundle"%>
<%@page import="sample.mashupcenter.tidyhtml.Html2XmlUrlViewBean"%>
<%
    Html2XmlUrlViewBean htViewBean = new Html2XmlUrlViewBean();
    htViewBean = (Html2XmlUrlViewBean) htViewBean.getViewBeanFromRequest(request);
    ResourceBundle i18n = ResourceBundle.getBundle(htViewBean.getI18NProperties(),
            request.getLocale());
    String objectId = htViewBean.getEntry().getObjectId();
    String id = com.ibm.mashuphub.helper.PluginHelper.getClientId(
            htViewBean.getPluginId(), objectId);
%>
<br/>
<label for='htmlurl'><%=i18n.getString("form.htmlurl.label") %></label>
<div class="rightCol">
    <input type='text'
        id='<%=id%>_htmlurl'
        name='<%= Html2XmlUrlViewBean.PARAM_HTMLURL %>'
        value='<%= htViewBean.getHtmlUrl() %>'
        maxlength='256' style='width:600px;' />
</div>
<div class="rightCol">
    <input type='checkbox'
        id='<%=id%>_snapshot'
        name='<%= Html2XmlUrlViewBean.PARAM_SNAPSHOT %>'
        value='y'
        <%= "y".equals(htViewBean.getSnapshot()) ? "checked" : "" %> />
    <%= i18n.getString("form.snapshot.label") %>
</div>
Ignoring the import statement, the purpose of the first two statements is to retrieve the ViewBean associated with the JSP. This differs slightly from the way JSPs typically retrieve their associated Java bean, which is to read it directly from the request object. Corresponding to the two form fields, there are two HTML input elements, of type text and checkbox respectively. Note that we use the constants PARAM_HTMLURL and PARAM_SNAPSHOT from the class Html2XmlUrlViewBean to name the two input elements. These names will appear as parameter names in the URL query string sent when the Next button is clicked. Using string constants is the best way to ensure that they correspond exactly to what the server expects. Lastly, we initialize the input elements with any previously saved values retrieved by the renderEditor method.
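One detail to watch when initializing an input element from a saved value is what happens when there is no saved value yet, or when the value contains quote characters. The article does not say whether getHtmlUrl() can return null for a brand-new feed; the null check in the later tidyhtmlContent.jsp suggests it might. Under that assumption, a small helper like the hypothetical AttrValue below could be called from the JSP expression instead of printing the raw value.

```java
/** Turns a possibly-null value into text that is safe inside a quoted HTML attribute. */
final class AttrValue {

    static String of(String raw) {
        if (raw == null) {
            return "";                  // avoid rendering the literal text "null"
        }
        StringBuilder sb = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;   // the JSP uses single-quoted attributes
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The browser decodes the entities again when it reads the attribute, so the value submitted with the form is unchanged.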
I mentioned in an earlier section that the displayHtmlPage method in the Html2XmlEditorPlugin class will be used to service the next editor page. The method displayHtmlPage is not inherited from the base class BaseEditorPlugin; like renderEditor, it takes two parameters of types RequestData and Entry. An EditorPlugin can introduce any number of public methods with the same signature. All such methods may be invoked by the client running in the browser through an AJAX call.
Listing 6. displayHtmlPage method
public ViewBean displayHtmlPage(RequestData rdata, Entry entry)
{
    ResourceBundle i18n = ResourceBundle.getBundle(I18N_RESFILE, rdata.getLocale());
    String pluginId = this.getId();

    // do not use "url" since the latter got intercepted in RequestData.init();
    String sHtmlUrl = rdata.getParameter( Html2XmlUrlViewBean.PARAM_HTMLURL );
    String snapshot = rdata.getParameter( Html2XmlUrlViewBean.PARAM_SNAPSHOT );
    log.debug("snapshot,sHtml=" + snapshot + "," + sHtmlUrl );

    Html2XmlContentViewBean htViewBean = new Html2XmlContentViewBean();
    htViewBean.setEntry(entry);
    htViewBean.setHtmlUrl( sHtmlUrl );
    htViewBean.setSnapshot( snapshot );

    FormViewBean form = new FormViewBean();
    form.setSuffix( htViewBean.getSuffix() );
    form.addComponent( htViewBean );
    form.setOnsubmit(PluginHelper.getClientMe(pluginId,
            entry.getObjectId()) + ".submit();");
    form.setEntry(entry); // must be set, used to init JS plugin object

    FrameViewBean frame = new FrameViewBean();
    frame.addComponent(form);
    frame.setLabel(entry.getTitle());
    frame.setTitle(i18n.getString("frame.tabtitle"));
    frame.setEntry(entry);
    frame.setHelpPath( HELPPATH );

    JSONAJAXResponseViewBean ajaxViewBean = new JSONAJAXResponseViewBean();
    ajaxViewBean.setMethod(JSONAJAXResponseViewBean.METHOD_SHOW_EDITOR);
    ajaxViewBean.setCode( JSONAJAXResponseViewBean.PAGE_CONTENT );
    ajaxViewBean.addComponent(frame);
    return ajaxViewBean;
}
The purpose of this method is to render a second editor page for users to verify the content of the retrieved HTML. Accordingly, the return type has to be ViewBean. The logic inside the displayHtmlPage method is similar to the renderEditor method we discussed earlier with three notable differences:
- Instead of retrieving previously entered configuration values from the Entry instance, we retrieved what the user entered during this editing session by calling the RequestData's getParameter method. These parameters correspond to the input elements in the JSP form sent via an AJAX call to the server.
- Each page requires a different ViewBean. This method instantiates an Html2XmlContentViewBean. As before, it has to be wrapped inside a FormViewBean and FrameViewBean chain. In addition, we need to further wrap the FrameViewBean in a JSONAJAXResponseViewBean instance. The latter happened automatically in the renderEditor method but needs to be done explicitly here.
- Since we will be providing our own JavaScript, we show a slight variation in the JavaScript passed to the setOnsubmit method. Instead of calling invokeServer directly, we will be calling the submit method in the associated JavaScript.
One additional detail worth pointing out is the call to the static logger instance to log user specified parameters to help with problem determination.
Html2XmlContentViewBean and the associated JSP
The Html2XmlContentViewBean is fairly simple and basically just returns a different JSP path and suffix from the Html2XmlUrlViewBean we looked at earlier. The reader can examine it by downloading the attached package and we will not dwell on it further. The editor page to be generated is also simple, consisting of an area to display the retrieved HTML. The following screen shot shows one corner of the display area:
Figure 3. Preview HTML content page
We next examine the associated JSP file tidyhtmlContent.jsp. To generate the display area, you can see that the associated JSP simply includes a single div element at the bottom of the JSP file. Since we will be using the id attribute later, this is a good place to discuss its construction.
The id attribute must be unique among all HTML elements within a browser window. Using the id, the browser-provided API can retrieve HTML elements as JavaScript DOM objects, allowing dynamic manipulation. Since a user could have multiple instances of a given plug-in editor open at the same time, HTML elements in the JSP template will be instantiated multiple times. To ensure that the ids of such elements are unique, we call the PluginHelper's getClientId method to retrieve the unique feed instance id and combine it with a fixed suffix to form the element id.
Listing 7. tidyhtmlContent.jsp
<%@page import="java.util.ResourceBundle"%>
<%@page import="sample.mashupcenter.tidyhtml.Html2XmlContentViewBean"%>
<%
    Html2XmlContentViewBean htViewBean = new Html2XmlContentViewBean();
    htViewBean = (Html2XmlContentViewBean) htViewBean.getViewBeanFromRequest(request);
    ResourceBundle i18n = ResourceBundle.getBundle(htViewBean.getI18NProperties(),
            request.getLocale());
    String objectId = htViewBean.getEntry().getObjectId();
    String id = com.ibm.mashuphub.helper.PluginHelper.getClientId(
            htViewBean.getPluginId(), objectId);
    String me = com.ibm.mashuphub.helper.PluginHelper.getClientMe(
            htViewBean.getPluginId(), objectId);
    String snapshot = "\"" + htViewBean.getSnapshot() + "\"";
    String htmlUrl = htViewBean.getHtmlUrl();
    htmlUrl = ( htmlUrl == null ? "\"\"" : "\"" + htmlUrl + "\"" );
%>
<script type="text/javascript">
    dojo.registerModulePath("plugins.tidyhtml",
        "../../../../client/plugins/sample.mashupcenter.tidyhtml/script");
    dojo.require("plugins.tidyhtml.PreviewHtml");
    new plugins.tidyhtml.PreviewHtml(
        <%= me %>.plugin_id,
        <%= me %>.entry_id,
        <%= me %>.workflow);
    <%=me%>.init( <%= htmlUrl %> , <%= snapshot %> );
    <%=me%>.onLoadEditor();
</script>
<div id='<%=id%>_htmlContent' style='width:100%; overflow:auto; border: 2px solid #000000;'>
</div>
One new aspect of this JSP is the inclusion of custom JavaScript to be run on the client side. The IBM Mashup Center feed generation framework uses the Dojo AJAX package. See the Resources section for the link to the Dojo documentation. We will be using the Dojo AJAX package in our custom JavaScript. Most of the custom JavaScript resides in a Dojo class named "plugins.tidyhtml.PreviewHtml".
To use it, we need to import it using a dojo.require function call. The Dojo registerModulePath function call is used to tell Dojo how to locate classes from the "module" plugins.tidyhtml. Note that the specified path is relative to where the Dojo package is located and hence requires the backward reference "../../../..". The above initialization logic is generated inline enclosed inside a script tag. In addition, the inline JavaScript creates an instance of the PreviewHtml class and calls its init and onLoadEditor methods. The next section examines in greater detail the PreviewHtml class.
The PreviewHtml Dojo class inherits from the hub.managers.InvokePlugin class which is part of the client side feed generation framework. The InvokePlugin class is further described in section 6.3.2 of the Application Programming Interface Reference, Version 1.0. The methods of importance in the PreviewHtml Dojo class are onLoadEditor and populateContent.
Listing 8. PreviewHtml Dojo class
onLoadEditor: function()
{
    this.id = this.getEditorId();
    this.htmlContentNode = dojo.byId( this.id + '_htmlContent' );
    this.populateContent();
},

populateContent: function( )
{
    console.log( "populateContent called" );
    var baseUrl = hub.urls.getAjaxUrl( this.plugin_id, this.entry_id, 'getHtmlContent');
    var htmlurl = baseUrl + "?htmlurl=" + escape( this.htmlUrl );
    if ( this.htmlContentInternalNode )
        this.htmlContentNode.removeChild( this.htmlContentInternalNode );
    this.htmlContentInternalNode = document.createElement( 'iframe' );
    this.htmlContentInternalNode.setAttribute( "src", htmlurl );
    this.htmlContentInternalNode.setAttribute( "width", "100%" );
    this.htmlContentInternalNode.setAttribute( "height", "400px" );
    this.htmlContentNode.appendChild( this.htmlContentInternalNode );
},
The function populateContent is called by onLoadEditor when the page loads. It dynamically creates an iframe to display the retrieved HTML. The iframe localizes the effect of any included style sheets and scripts, preventing them from affecting the appearance of the rest of the page. The dynamically created iframe is appended to the static div created by the JSP. To retrieve the DOM node corresponding to the display area, we use the unique id of the div element, which was generated by combining the unique feed instance id with a common suffix.
On the server side, we used a method on the PluginHelper class to get the unique feed instance id. On the browser side, we call the getEditorId function inherited from the PreviewHtml Dojo class's parent, hub.managers.InvokePlugin. To retrieve the HTML content, we take advantage of the iframe src attribute: the iframe automatically retrieves and displays the content pointed to by src during initialization. We set the src attribute to invoke the editor plug-in's getHtmlContent method. Note the way we create the URL: we call the getAjaxUrl function with the method name 'getHtmlContent', and then append the escaped page address as the htmlurl query parameter.
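The page address has to be escaped because it usually contains characters such as ?, & and = that would otherwise be read as part of the outer query string. The Java sketch below shows the same idea with the JDK's URLEncoder. The class name PreviewUrl and the base URL /hub/ajax/getHtmlContent are invented for illustration and are not the output of getAjaxUrl.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Appends a page address to a base URL as an encoded htmlurl query parameter. */
final class PreviewUrl {

    static String withHtmlUrl(String ajaxUrl, String htmlUrl) {
        // URLEncoder applies form encoding: reserved characters become %XX and
        // a space becomes '+', whereas JavaScript's escape() would emit %20.
        return ajaxUrl + "?htmlurl=" + URLEncoder.encode(htmlUrl, StandardCharsets.UTF_8);
    }
}
```

For example, the address http://example.com/a?x=1&y=2 is appended as http%3A%2F%2Fexample.com%2Fa%3Fx%3D1%26y%3D2, so the server sees a single htmlurl parameter instead of three separate ones.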
I mentioned in an earlier section that any public methods with RequestData and Entry as parameters may be invoked using an AJAX call. In particular, the method getHtmlContent can be called by the PreviewHtml Dojo class to return HTML from the user supplied URL. Because the actual HTML retrieval is common to feed generation and will be covered later, I will not provide any code snippets here. The only thing I want to point out is the return type of the method. In the earlier example, the AJAX method displayHtmlPage returns a ViewBean. AJAX methods in general can return any object and its toString value will be returned. See section 6.3.2 of Application Programming Interface Reference, Version 1.0.
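The rule that any public method taking RequestData and Entry can be reached by name is easy to picture with Java reflection. The sketch below is only an assumption about how a dispatcher of this kind might work; it is not the framework's code. RequestData, Entry, and DemoEditor are empty stand-ins defined inside the example.

```java
import java.lang.reflect.Method;

/** Illustrates name-based dispatch to public (RequestData, Entry) methods. */
final class AjaxDispatchSketch {

    // Stand-ins for the framework types; the real classes carry request and feed state.
    static final class RequestData { }
    static final class Entry { }

    /** Hypothetical editor with two reachable methods and one that is not public. */
    public static final class DemoEditor {
        public Object renderEditor(RequestData rdata, Entry entry) { return "editor page"; }
        public Object displayHtmlPage(RequestData rdata, Entry entry) { return "preview page"; }
        Object notExposed(RequestData rdata, Entry entry) { return "hidden"; }
    }

    /** Looks up the named method and returns the string form of its result. */
    static String invoke(Object plugin, String methodName, RequestData rdata, Entry entry)
            throws ReflectiveOperationException {
        // getMethod finds only public methods with exactly this parameter list,
        // so a package-private method raises NoSuchMethodException.
        Method m = plugin.getClass().getMethod(methodName, RequestData.class, Entry.class);
        Object result = m.invoke(plugin, rdata, entry);
        return String.valueOf(result);   // mirrors "its toString value will be returned"
    }
}
```

Asking for "displayHtmlPage" returns "preview page", while asking for "notExposed" fails with NoSuchMethodException because that method is not public.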
Our last editor method: saveFeedEntry
saveFeedEntry is another public method of Html2XmlEditorPlugin invoked via AJAX. It handles the final step in the editing process: saving what the user has entered. It is similar to the save methods in other plug-ins. What's new is "resource" handling. Resources differ from attributes in size and type. Resources can be binary and can be up to one gigabyte in size. In contrast, attributes are limited to strings of up to 10MB. The size limit for attributes should be sufficient for our content, but for illustrative purposes, we will save the converted content as a resource. When the snapshot option is checked, the generator retrieves the HTML content from the specified URL only once. The HTML content is then converted to XML and saved. All subsequent feed generation requests are satisfied from the saved XML. To handle the case where the user wants to take another snapshot because the site might have changed, we delete the saved copy whenever the feed is edited. The code fragment shows how this is done in a two-step process: retrieve the resource by name, then call the deleteResource method on the returned object.
Listing 9. saveFeedEntry method
try {
    entry.generateURL(rdata.getBaseUrl(), this.getId() );
    entry.addAttribute(Html2XmlUrlViewBean.PARAM_HTMLURL, sHtmlUrl , this.getId() );
    entry.addAttribute(Html2XmlUrlViewBean.PARAM_SNAPSHOT , snapshot , this.getId() );
    // after every edit, cleanup any previously cached snapshot
    Resource oldRes = entry.getResource( Html2XmlConstants.CACHED_XHTML );
    if ( oldRes != null )
        oldRes.deleteResource();
} catch (HubException ex) {
    log.error("Error adding entry attribute.", ex);
}
We are finally done with the Editor and on to the Generator.
The Html2XmlGeneratorPlugin class extends BaseGeneratorPlugin and must implement the abstract method generateFeed. It should be no surprise that its input parameters, of types RequestData and Entry, are the same as those passed to the EditorPlugin methods called by the feed generation framework. To generate the feed, one must first retrieve the attributes containing the configuration information saved during the editing process. This is done by calling the getAttribute method of Entry.
Listing 10. generateFeed method
public FeedContent generateFeed(RequestData rdata, Entry entry) {
    String sHtmlUrl = entry.getAttribute(Html2XmlUrlViewBean.PARAM_HTMLURL );
    String snapshot = entry.getAttribute(Html2XmlUrlViewBean.PARAM_SNAPSHOT );
Since this plug-in has no parameterization support, we do not need to retrieve any parameters supplied at run time. We will either return the saved XML content, or retrieve the HTML content and convert it to XML using JTidy. The logic illustrates how resources are created and is fairly straightforward.
Listing 11. generateFeed body
    String result = "Html might have changed. Table not found.";
    Resource oldRes = entry.getResource( Html2XmlConstants.CACHED_XHTML );
    if ( "y".equals( snapshot ) && oldRes != null ) {
        log.warn( "returning cached, snapshot=" + snapshot );
        return new FeedContent(oldRes.loadResource(), entry.getLifeTime());
    }
    String sHtml = getXhtml( sHtmlUrl );
    if ( sHtml.length() > 0 ) {
        result = sHtml;
        if ( "y".equals( snapshot ) ) {
            try {
                Resource prepared = new Resource();
                prepared.setObjectid( entry.getObjectId() );
                prepared.setMimetype( "text/xml; charset=utf-8" );
                prepared.setFilename( Html2XmlConstants.CACHED_XHTML );
                prepared.uploadResource( sHtml.getBytes( "utf-8" ) );
            } catch (HubException e) {
                log.error(e);
            }
        }
    }
    return new FeedContent( result.getBytes( "utf-8" ), entry.getLifeTime());
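The interplay between the generator and the editor (serve the snapshot if one exists, otherwise convert and store it, and discard it on every edit) can be modeled without the framework. The sketch below is a simplified in-memory stand-in, not the plug-in's code: a Map replaces the Entry resources, a Function replaces the fetch-and-convert step, and the names SnapshotFeedSketch and cached.xhtml are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** In-memory model of the snapshot logic in generateFeed and saveFeedEntry. */
final class SnapshotFeedSketch {

    static final String CACHED_XHTML = "cached.xhtml";   // hypothetical resource name
    static final String FALLBACK = "Html might have changed. Table not found.";

    private final Map<String, byte[]> resources = new HashMap<>();  // stands in for Entry resources
    private final Function<String, String> converter;                // stands in for fetch + JTidy

    SnapshotFeedSketch(Function<String, String> converter) {
        this.converter = converter;
    }

    /** Returns the feed bytes, reusing the stored snapshot when one is available. */
    byte[] generateFeed(String htmlUrl, boolean snapshot) {
        byte[] cached = resources.get(CACHED_XHTML);
        if (snapshot && cached != null) {
            return cached;                      // no network access, no conversion
        }
        String xml = converter.apply(htmlUrl);
        if (xml.isEmpty()) {
            return FALLBACK.getBytes(StandardCharsets.UTF_8);   // nothing worth caching
        }
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        if (snapshot) {
            resources.put(CACHED_XHTML, bytes); // first request stores the snapshot
        }
        return bytes;
    }

    /** Mirrors the cleanup in saveFeedEntry: every edit discards the old snapshot. */
    void onEdit() {
        resources.remove(CACHED_XHTML);
    }
}
```

With snapshot enabled, two consecutive requests call the converter once; after onEdit() the next request converts again and stores a fresh copy.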
Further details on how the HTML input is converted to XML can be found in the Java source files. I will just mention two key points. To make the output XML usable by the Feed Mashup Editor, we strip out any DOCTYPE declaration. In addition, the generation logic makes the simplifying assumption that the input HTML is encoded in UTF-8; it needs to be enhanced to support pages in other encodings.
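The author's own DOCTYPE handling is in the downloadable source. As an illustration of the idea only, a regular expression is one possible way to drop the declaration from the generated text. The sketch assumes the declaration has no internal subset (no [...] block), which holds for the usual XHTML doctypes; the class name DoctypeStripper is made up.

```java
import java.util.regex.Pattern;

/** Removes a simple DOCTYPE declaration from XML text (one possible approach). */
final class DoctypeStripper {

    // Matches from "<!DOCTYPE" to the first '>' plus trailing whitespace.
    // Public and system identifiers are quoted strings that contain no '>'.
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^>]*>\\s*", Pattern.CASE_INSENSITIVE);

    static String strip(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }
}
```

Text without a DOCTYPE declaration passes through unchanged, so the helper can be applied unconditionally to the converter output.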
The complete Eclipse project with all the source files is available as a zip file in the download area. In addition, to make it easy to try out the plug-in, the plug-in zip (sample.mashupcenter.tidyhtml.zip) file is also provided. To install the plug-in, perform the following steps:
- Download Tidy.jar from the link in the Resources section.
- After removing class files from the package org.w3c.dom, add Tidy.jar to the plug-in zip file under the directory WEB-INF/lib.
- Place the plug-in zip file in the <WebApplication>/WEB-INF/plugins directory.
- Stop and restart the server.
We have just walked through the construction of a more complicated plug-in involving multiple editor pages, custom JavaScript and saving of resources. You now have the basics to begin extending the feed generation capabilities of IBM Mashup Center. A subsequent article will discuss more advanced topics such as security and parameterization.
| Description | Name | Size | Download method |
|---|---|---|---|
| Samples for this article | Download.zip | 325KB | HTTP |
Learn
- To learn the basics of creating an IBM Mashup Center plug-in, read "Extend the reach of data for IBM Mashup Center" (developerWorks, Aug 2008).
- Section 6 of Application Programming Interface Reference, Version 1.0 is the official documentation for the IBM Mashup Center Plug-in API.
- Learn about the W3C Document Object Model (DOM).
- To learn about Dojo, read The Book of Dojo.
- For examples on how to use the converted HTML, read "Convert from HTML to XML with HTML Tidy" (developerWorks, Sep 2003).
- In the Information Management area on developerWorks, get the resources you need to advance your skills on IBM Information Management products.
- Browse the technology bookstore for books on these and other technical topics.
Get products and technologies
- You can download JTidy from SourceForge.
- To get hands-on experience with IBM Mashup Center, visit Lotus greenhouse.
- Download a free trial version of IBM Mashup Center from developerWorks.
- You may also access Mashup Center on the Amazon Elastic Compute Cloud.
- Download IBM product evaluation versions or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- Participate in the discussion forum.
- Check out developerWorks blogs and get involved in the developerWorks community.