Plotting scientific data with Eclipse BIRT

BIRT was made for business reports, but that doesn't mean you can't use it for creating plots of scientific data. Learn how to use BIRT for scientific purposes by creating two plots: one of the magnitude of a variable star and one of the number of sunspots per year.

Cesar Otero, Consultant, Freelance Consultant

Cesar Otero is a freelance Java and Python consultant. He holds a degree in electrical engineering with a minor in mathematics.



01 September 2009


The Eclipse Business Intelligence and Reporting Tool (BIRT) is — as its name implies — meant for creating business reports. But that doesn't mean you can't use it for plotting scientific data, as well. As it turns out, BIRT is great for generating a quick, professional-looking 2-D plot of time series data from different data sources — for example, SQL or plain-text files. This article shows how to:

  • Find sources of data from the Internet.
  • Create a time series plot of a variable-magnitude star using data from a plain-text file.
  • Create a plot of the number of sunspots per year, retrieving the data from a database.

Frequently used acronyms

  • 2-D: Two-dimensional
  • JAR: Java™ Archive
  • POSIX: Portable Operating System Interface
  • SQL: Structured Query Language
  • URL: Uniform Resource Locator

It was recently the anniversary of the Apollo landing on the moon, and the name of the new Eclipse release is Galileo, so I just couldn't resist the temptation to do some space science plots.

Obtaining data

Depending on the kind of data you want to visualize, you have a plethora of sources to choose from. For example, you can use the National Oceanic and Atmospheric Administration (NOAA), the National Astronomy and Ionosphere Center (NAIC), the European Incoherent Scatter Scientific Association (EISCAT), or the IAU Minor Planet Center. Often, the data you find will be in a binary format, and you may need a third-party library to extract the data. This article focuses on generating the plots with data that is in a simple plain-text file.

You'll be using data from the Time Series Data Library. This site contains data sets from numerous fields, all of which are in text format.


A note about acquiring data

The data you obtain doesn't need to come from an observatory or lab; it can be data published by the government or historical weather data, for example. The data you acquire may have terms of use, so be sure to read any license and abide by its terms.


Setting up

First, download Eclipse, if you don't already have it (see Resources). If you use the most recent version of Eclipse, Galileo, you also need to create a new workspace to avoid some issues. The report files for this article were created and tested using the latest version of Eclipse and BIRT, but you should be able to create the same reports with previous versions. BIRT is a plug-in for Eclipse that has the following dependencies:

  • DTP: Data Tools Platform
  • EMF: Eclipse Modeling Framework
  • GEF: Graphical Editor Framework
  • WTP: Web Tools Platform

Fortunately, there is an all-in-one download that includes Eclipse, all the dependencies for BIRT, and BIRT itself (see Resources). After getting and installing BIRT, start Eclipse:

  1. From the Eclipse menu, click File > New > Other.
  2. In the window that appears, click Business Intelligence and Reporting Tools > Report Project.
  3. Enter a project name, then click OK. For this example, use birtPlotting.
  4. A window appears, asking to switch to the Report Design Perspective; click Yes.
  5. Right-click the birtPlotting folder on the Navigator tab, then click New > Report.
  6. Type starmag.rptdesign as the name for the new report, then click Next.
  7. In the window that appears, select Blank Report, then click Finish.

Plotting star magnitude using a flat-file data source

The first plot is of the changes in magnitude (brightness) from a variable star. According to Wikipedia, "A star is classified as variable if its apparent brightness as seen from Earth changes over time." Your plot will be a simple 2-D plot showing the change in magnitude of a variable star over time — specifically, of observations made over 600 nights. (The data file, starmagnitudetimeseries.ssv, is available in Download.)

Create a bar chart

On the left side, click the Report Items tab. From there, drag a chart to the designer. Although you might be tempted to use a line chart, a bar chart is more suitable. As the number of data points increases, the size of the bars decreases to accommodate space. Click Next.

Add a data source

The data is in space-separated value (SSV) format. To add a data source:

  1. Select Use Data From, then, from the drop-down list, select <New Data Set>.
  2. When prompted to add a new data source, click Yes.
  3. In the window that appears, select Flat File Data Source. Give the data source a name and click Next. For this example, use the name starMagDataSource.
  4. In the next window, select the flatfile style as ssv. Other options include comma-separated value (CSV), pipe-separated value (PSV), and tab-separated value (TSV).
  5. Clear the Use first line as column name indicator check box.
  6. Optionally, click Test connection to make sure BIRT can find the SSV file.
  7. Click Finish to continue.

A window appears, prompting for a data source selection: Your previously created data source should now appear under Flat file data source. Give the data set a name, such as starMagDataSet, and click Next. On the following page, there are two lists: the left side showing the available columns from the data set, the right side showing the columns selected ready for use in the chart. There should only be one column on the left side. Select it, then click the right arrow. Change the column name to magnitude, select the integer type, then click Finish. Finally, click OK.
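
Before binding the file in BIRT, it can help to confirm that every line really holds a single integer, since the data set defined above has one integer column and no header row. The following is a minimal Python sketch, not part of BIRT; the function name parse_magnitudes and the sample values are made up for illustration.

```python
def parse_magnitudes(lines):
    """Parse a one-column, whitespace-separated listing into integers.

    Blank lines are skipped; a line with more than one field is rejected,
    mirroring the single integer column defined in the data set.
    """
    values = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()  # SSV: fields are separated by spaces
        if not fields:
            continue
        if len(fields) != 1:
            raise ValueError(
                f"line {line_number}: expected 1 column, found {len(fields)}"
            )
        values.append(int(fields[0]))  # raises ValueError on non-integers
    return values


# Illustrative values only, not taken from the article's data file.
sample = ["12", "15", "", "9"]
print(parse_magnitudes(sample))
```

To check the real file, pass an open file object instead of a list, for example parse_magnitudes(open("starmagnitudetimeseries.ssv")).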

Create the category integers

You should now be back at the bar chart wizard. Perform the following steps to create the category integers:

  1. In the Select Data area, select the custom-created starMagDataSet. In the preview area, you should see the column name magnitude, along with several integer values.
  2. Near Category(X) Series, click the button with a function symbol to invoke the expression builder.
  3. In the Invoke expression builder window, select Available Column Bindings, then choose the subcategory Chart.
  4. Double-click rowNum.
  5. In the editor, you should see row.__rownum. Click OK.

Repeat the procedure for the Value (Y) Series, but this time, double-click magnitude instead of rowNum.
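
The file has no time column, so the row number stands in for the night of observation. A rough Python equivalent of pairing row.__rownum with magnitude might look like this; the helper name is hypothetical, and the starting index of 0 is an assumption about how BIRT numbers rows.

```python
def pair_with_row_numbers(magnitudes, start=0):
    """Return (row_number, magnitude) pairs, one per observation."""
    return list(enumerate(magnitudes, start=start))


# Illustrative values only, not taken from the article's data file.
for row_number, magnitude in pair_with_row_numbers([12, 15, 9]):
    print(row_number, magnitude)
```

If the chart should count nights from 1 rather than 0, pass start=1.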

Customize the chart

Optionally, to make the chart a bit prettier, click the Format Chart tab. From there, you can change the chart title, remove the legend on the right side, change the X and Y axis titles, change the colors, or even change the plot scaling. For example, the default scaling is linear, but you can change it to a logarithmic scale. For this plot, use the title Variable Star Magnitude Time Series. Label the X axis Period (nights) and the Y axis Magnitude. When you're done, click Finish.
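
To see what the logarithmic option changes, consider where a value lands along the axis under each scaling. This Python sketch only illustrates the arithmetic and is not BIRT's implementation; the function names and the axis bounds are assumptions.

```python
import math


def linear_position(value, axis_min, axis_max):
    """Fraction of the axis length at which value is drawn on a linear scale."""
    return (value - axis_min) / (axis_max - axis_min)


def log_position(value, axis_min, axis_max):
    """Same fraction on a base-10 logarithmic scale (all inputs must be > 0)."""
    span = math.log10(axis_max) - math.log10(axis_min)
    return (math.log10(value) - math.log10(axis_min)) / span


# On an axis running from 1 to 100, the value 10 sits near the bottom of a
# linear scale but halfway up a logarithmic one.
print(linear_position(10, 1, 100), log_position(10, 1, 100))
```

A logarithmic axis is mainly useful when the values span several orders of magnitude.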

On the report designer, expand the chart object to fill the full width of the report, and make the height a little more than 3 inches. To preview the plot, click the Preview tab or, from the menu, click Page > Preview.

You'll probably notice that not all 600 nights are on the chart. At the top of the plot, it says something like, "Note: Current maximum number of data rows is...," followed by "Note: (Click to change Preview Preferences)." Click that message. In the resulting window, click No limits of the number of rows to display, then click OK. You will be prompted to refresh the page view: Click Yes. You should now see all 600 points on the plot. Figure 1 shows the finished chart.

Time series

A time series is a periodically measured sequence of data points.

Figure 1. A time series of a variable-magnitude star

To save the plot, click Run > View Report > As PDF (or choose another output format), then save the result to disk.


Plotting the number of sunspots using a database

In the next plot, you'll enter the data into a relational database, create your report, and use a Java technology program to generate a final product. The plot will be of the number of sunspots within a chosen time span.

First, download the H2 database (see Resources). H2 is implemented as a pure Java database and has a small size. After downloading it, extract the file, and navigate to the bin directory. There should be a JAR file named h2-version-number (on my machine, h2-1.1.114.jar). You can type java -jar h2-1.1.114.jar or run the .sh or .bat file (depending on whether you're running a POSIX or Windows® machine).

Now that the server is running, populate it with data. The sunspot data you need is in the sunspots.sql file from the archive you downloaded (see Download). In the H2 console, click Tools > Run Script. In the Target database URL field, type jdbc:h2:~/sunspots.db. Because the database doesn't yet exist, H2 will create one for you. In the Source script file name field, type the full path to your sunspots.sql file, then click Run.
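
The contents of sunspots.sql are not reproduced in this article. As a rough idea of what running such a script does, the Python sketch below executes a small script against an in-memory SQLite database, which stands in for H2 here because SQLite ships with Python. The table name sunspots and the column names YEAR and NUM come from the query and chart bindings used later in the article; the column types and the two rows are placeholders, not real observations.

```python
import sqlite3

# Hypothetical script in the spirit of sunspots.sql; types and rows are assumed.
SCRIPT = """
CREATE TABLE sunspots (year INTEGER PRIMARY KEY, num REAL);
INSERT INTO sunspots (year, num) VALUES (1, 0.0);
INSERT INTO sunspots (year, num) VALUES (2, 0.0);
"""


def load_script(script):
    """Run a SQL script in a fresh in-memory database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(script)
    return conn


conn = load_script(SCRIPT)
row_count = conn.execute("SELECT COUNT(*) FROM sunspots").fetchone()[0]
print(row_count)
```

Counting the rows after the load is a quick way to confirm that the script ran to completion.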

Back in Eclipse:

  1. Right-click the birtPlotting project, then click New > Report.
  2. Change the name of the new report to sunspots.rptdesign, then click Finish. In addition to the previously mentioned method for adding data sources and data sets, you can add them using the menu.
  3. From the menu bar, click Data > New Data Source.
  4. In the window that appears, select JDBC Data Source, give the data source a name, then click Next.
  5. Add the H2 JDBC driver by clicking Manage Drivers and clicking Add.
  6. In the file chooser, navigate to your H2 installation. Select the H2 JAR file, then click OK. The H2 JDBC driver should now appear in the Driver class box.
  7. Select the H2 JDBC driver, then type the driver URL jdbc:h2:/path/to/sunspots.db, pointing at the database you created in the H2 console.
  8. Type sa in the User Name field.
  9. Click Test Connection to make sure Eclipse can find the sunspot database, then click Finish. As before, now that you have a data source, you need a data set.
  10. From the menu bar, click Data > New Data Set > New Data Set.
  11. In the Data set window, select the sunspot data source you just created, set the data set type to SQL Select Query, type a name (this example uses sunSpotsDataSet), then click Next.
  12. Type the following query in the query text area, then click Finish:
    select * from sunspots where year between 1900 and 1980;
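
Note that BETWEEN includes both endpoints, so the query returns 1900 and 1980 as well as the years in between. The sketch below checks this using Python's built-in sqlite3 module as a stand-in for H2; the column types are assumed, and every num value is a placeholder rather than real data.

```python
import sqlite3


def years_between(conn, first, last):
    """Return the years selected by an inclusive BETWEEN filter."""
    cursor = conn.execute(
        "SELECT year FROM sunspots WHERE year BETWEEN ? AND ? ORDER BY year",
        (first, last),
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sunspots (year INTEGER, num REAL)")
# Placeholder rows covering 1895 through 1985; num is not real data.
conn.executemany(
    "INSERT INTO sunspots (year, num) VALUES (?, ?)",
    [(year, 0.0) for year in range(1895, 1986)],
)

selected = years_between(conn, 1900, 1980)
print(selected[0], selected[-1], len(selected))
```

Passing the bounds as parameters makes it easy to try other time spans without editing the query text.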

Now, drag a new chart object to the layout. Again, you'll be using a bar chart. Click Next in the chart window. Select the Use Data From option, then select sunSpotsDataSet. Click the expression builder button for the Category (X) Series, then select Available Column Bindings > Chart and double-click YEAR. Click OK.

Next, invoke the expression builder for the Value (Y) Series. Select Available Column Bindings > Chart, double-click NUM, then click OK. Make any formatting changes you want, then click Finish. When you preview the plot, you'll see a very neat time series. The first thing you should notice is that the number of sunspots follows a cycle, as shown in Figure 2. (Neat, huh?)

Figure 2. A plot of the number of sunspots per year

One obvious advantage to having the data in a database is the ability to manipulate the data through queries. Change the query to select * from sunspots to see all the data.
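
A query can also aggregate the data before it reaches the chart. The sketch below averages the yearly counts by decade, again using SQLite from Python as a stand-in for H2 and made-up values in place of the real sunspot numbers. Computing the decade through integer division is an assumption that should be checked against the database actually in use.

```python
import sqlite3

# Integer division groups 1900..1909 under 1900, 1910..1919 under 1910, etc.
DECADE_QUERY = """
SELECT (year / 10) * 10 AS decade, AVG(num) AS mean_num
FROM sunspots
GROUP BY year / 10
ORDER BY decade
"""


def mean_by_decade(conn):
    """Return (decade, mean num) rows, one per decade present in the table."""
    return conn.execute(DECADE_QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sunspots (year INTEGER, num REAL)")
# Synthetic values (the last digit of the year), not real sunspot counts.
conn.executemany(
    "INSERT INTO sunspots (year, num) VALUES (?, ?)",
    [(year, float(year % 10)) for year in range(1900, 1920)],
)

print(mean_by_decade(conn))
```

A query like this could be pasted into the data set's query text area in place of the original select, with the chart bound to the new column names.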


Conclusion

In this article, you've seen how you can use a business intelligence tool for creating reports of scientific data. You don't have to buy an expensive platform to get a quick visualization of good quality. The point of this article is to demonstrate how this business tool can be used differently from what was originally intended. You could just as easily adapt it for other types of data.


Download

Description       Name              Size
Code sample (1)   birt_source.zip   3KB

Note

  1. SSV and SQL data sources

Resources

Learn

Get products and technologies

Discuss

  • The Eclipse Platform newsgroups should be your first stop to discuss questions regarding Eclipse. (Selecting this will launch your default Usenet news reader application and open eclipse.platform.)
  • The Eclipse newsgroups have many resources for people interested in using and extending Eclipse.
  • Participate in developerWorks blogs and get involved in the developerWorks community.
