Integrate traffic data with IBM Intelligent Transportation using a traffic data gateway

Meet the challenges of today's raw traffic data collection from detectors embedded in roadway infrastructures

The transportation industry must modernize its information technology resources to meet current business and regulatory standards and to gain better visibility and control across the transportation roadways. A particular challenge is capturing traffic data in both real time and near real time. In this article, we demonstrate how an implementation team can develop a traffic data gateway to meet the challenges of today's raw traffic data collection from various detectors embedded in roadway infrastructures.

Qiang Bai (baiqiang@cn.ibm.com), Software Engineer, IBM

Bai Qiang is an Application Architect in the IBM Global Business Solution Center (GBSC). He has more than eight years of experience in IT application development and architecture in the Telecom and Transportation industries. Currently, he is the asset technical owner of the traffic data gateway and the integrated traffic management system asset in the GBSC team.



Tony Carrato, Distinguished Chief IT Architect, IBM  

Tony Carrato is the Chief Product Architect for Smarter Cities products in IBM SWG Industry Solutions. As such, he is responsible for architecture across the IBM Intelligent Operations Center, IBM Intelligent Transportation and other products in the Smarter Cities portfolio. Prior to taking this role, Tony was a lead architect in IBM SWG's SOA Advanced Technology team. He has over 30 years of IT experience, working in North America, Asia and Australia. Tony is an IBM Senior Certified IT Architect, an Open Group Certified Distinguished IT Architect and a member of the IBM Academy of Technology.



Christopher M. Laffoon (claffoon@us.ibm.com), Software Engineer, IBM

Chris is an industry solution standards engineer in the Transportation and Education industries. He has valuable development experience with various industry messaging standards (including TMDD and DATEXII), XML, XML Schema, and Java™ technologies.



Dr. Magda Mourad (magdam@us.ibm.com), Distinguished Chief IT Architect, IBM

Dr. Magda Mourad is a Certified Distinguished Chief IT Architect with IBM Software Group’s Industry Solutions team. She is the architect of the IBM Intelligent Transportation product. She joined IBM in 1989 as a Research Scientist at the T.J. Watson Research Center, where she became a Manager then CTO of the Digital Media business unit in 2005. She also went on two international assignments in Europe and the Middle East. She is currently chairing the IEEE working group that developed a Recommended Practice for Digital Rights Expression Languages (DRELs) Suitable for eLearning Technologies.



Pam Nesbitt (pnesbitt@us.ibm.com), Systems Management Specialist, IBM

Pam Nesbitt is a Senior Technical Staff Member in Industry Solutions Software at IBM. Much of her work in the last year has centered around helping to operationalize IBM’s Smarter Cities technical strategy, alignment of IBM’s business and technical strategies, and helping enable clients with IBM’s new Smarter Cities solutions. Her previous activities include software development and solutions delivery to clients. Ms. Nesbitt has won a number of external and internal awards for her work, has presented at numerous international conferences and has published in a number of peer-reviewed journals. She is an IBM Master Inventor and has 110 patents issued and pending with the USPTO. She holds a B.S. in Neurobiology from Cornell University and an MCIS from Cleveland State University.



Jacqueline L. Ryan (jacqryan@us.ibm.com), Program Director, IBM

Jackie Ryan currently leads industry solution product management within IBM's Information Management division. With more than 20 years of experience in information technologies in various roles across software development, marketing strategy, and product management, Jackie has actively worked with clients to achieve their business goals leveraging IBM's technologies across data management, information integration, and information analytics.



Lei Zhang (zzhangl@cn.ibm.com), Advisory IT Architect, IBM

Lei Zhang is an IBM Certified Architect, and is the ITS Lead Architect in the Global Business Solution Center. Lei has participated in many Smarter Transportation project engagements and deliveries to build and harvest the reusable assets, which is enabling more Smarter Transportation opportunities for IBM.



Viswanath Srikanth (ahs@us.ibm.com), Senior Software Engineer, IBM

Sri is a senior software engineer and Industry solution standards leader for the Transportation Industry in IBM. He is an active member at transportation standards organizations including Open Travel Alliance working on creating standardized service models for core functions such as reservation and ticketing. Sri has previously co-chaired technical reports from industry consortia on the subject of applying service-oriented architecture into industries such as Hotel, Retail, and Education.



15 July 2011

Introduction

Transportation is the backbone of our civilization and the reason for our economic prosperity. Departments of Transportation worldwide are struggling with technologies and infrastructures implemented decades ago and must modernize a mix of information technology resources and existing resources to meet citizens’ needs.

Frequently used acronyms

  • ATMS: Advanced Traffic Management Systems
  • ETL: Extraction, transformation, and loading
  • OC: TMDD owner center
  • SOAP: Simple Object Access Protocol
  • TMC: Transportation Management Center
  • TMDD: Traffic Management Data Dictionary
  • XML: Extensible Markup Language

Transportation departments are focusing on deploying communications, control, and computer information technologies to improve the performance of highway, transit (rail and bus), and air and maritime transportation systems. Consequently, transportation systems and infrastructures are being monitored at increasing levels, resulting in very large captured data sets. This data can include information such as geographic location, speed, count, and behavior patterns, which are combined to obtain estimates of traffic conditions. There are many unique challenges involved, such as:

  • Large volumes of data from disparate sources have to be collected in a timely manner to support near real-time traffic representation.
  • Multiple clean-up strategies are required to manage the error characteristics of the different data streams, filtering out bad data and handling gaps within the data.
  • Different data types must be combined to accurately represent traffic information.
  • Data must be converted into a standard format that can be interpreted by distributed traffic information centers.

More specifically, consider the business requirements for current traffic data collection, where we must:

  • Integrate different traffic data sensors (loop, microwave, video sensors and floating car data).
  • Integrate the same traffic data source from different vendors.
  • Obtain consistent time and space coordinates from different types of traffic data sensors.
  • Adjust for different acquisition frequencies (2 minutes for fixed sensor systems, 5 minutes for floating car data) and for the uneven distribution of sensors.
  • Compensate for missing traffic data. Traffic data expansion algorithms are needed where sensors are lacking and where traffic data acquisition methods are limited.
  • Correct abnormal, unreliable, false, and missing traffic data caused by hardware faults, communication breakdowns, noise jamming, weather factors, and traffic accidents.

Figure 1. Example of traffic data collection points

This article describes how to build a traffic data gateway that gathers raw data, transforms it into a standards-conformant format, and loads it into the IBM Intelligent Transportation system.

In the following sections, we identify and give a brief overview of the challenges facing data collection and integration using a real-life use case. Further, we present the overall architectural view of IBM Intelligent Transportation with special focus on the raw traffic data-collection component. Additionally, we describe a detailed implementation using IBM InfoSphere Information Server to extract, cleanse, transform, and load the incoming traffic data in near real time. This data is transformed into Traffic Management Data Dictionary (TMDD) V3.0 standard format and made available to traffic engineers and planners to conduct freeway operational analyses, bottleneck identification, and evaluation of advanced control strategies.


Typical roadside sensor configuration

Figure 2 depicts a typical configuration in an urban environment, where traffic from an arterial road feeds into the primary road with traffic being regulated by a traffic signal system. However, the traffic at this intersection is variable throughout the day, and the traffic operators can improve traffic flow by controlling the duration of the signal at this intersection based on volume, average speed, and length-of-queue data at the intersection.

Figure 2. Typical configuration of road traffic feeds in an urban environment

In this case, data of this nature is acquired through two loop detectors (labeled A and B in Figure 2), which detect vehicles passing over them. One detector is on the arterial road and another is on the primary road before the intersection. Continuous wave microwave radar on the primary road after the intersection also monitors the traffic flow. The traffic operator uses this information to make decisions on the duration of the signal.

Loop detectors are among the most widely deployed stationary traffic detectors and can be configured to detect volume of traffic, speed, and queue of vehicles at a given section of the road. Continuous wave microwave radar is non-invasive and is used when excessive interference can be avoided.

All data feeds are continuous (streaming) and need to be cleansed and transformed in a logically consistent fashion that is representative of the physical world. Using standards, such as TMDD, helps achieve that consistency.

Subsequent sections detail how the IBM Intelligent Transportation product, combined with a traffic data gateway, helps take the sensor feed data and make it available for productive use to operators of traffic management centers.


Implementation challenges

The primary traffic data source today is loop detector data from traffic monitoring stations.

Loop detector data requires investment in physical infrastructure to expand traffic monitoring coverage, plus ongoing maintenance expenditure. New technologies (for example, traffic data from wireless networks and non-pavement-embedded sources) have the potential to assist in obtaining system-monitoring data by providing greater coverage at a fraction of the cost associated with more permanent data collection technologies. The most commonly available data today is cell-phone usage data due to the high penetration rate of mobile phones. Also significant is the increasing penetration of smart phones with GPS sensors that provide location data and yield accurate velocity information. Additionally, a wide range of data collection methods is offered including GPS probes, Bluetooth sensors, and imaging and radar technologies.

Secondary data sources (data vendors) also collect, process, and repackage data for sale.

The combination of various data types is referred to as data fusion, as shown in Figure 3. Data fusion is the process of combining traffic data of various types (speed, count, and so on) from multiple sources to obtain estimates of traffic conditions for an entire roadway. In general, data fusion involves capturing data from multiple sources, filtering the data to remove unnecessary artifacts, combining the data in a mathematical model loaded into a data warehouse, outputting estimates or predictions of traffic conditions, and visualizing the output.

Figure 3. Traffic data collection process flow for permanent data collection technology

There are several challenges to such an implementation:

  • Large volumes of data from disparate sources have to be collected in a timely manner to support real-time traffic representation.
  • Multiple clean-up strategies are required to manage the error characteristics of the different data streams and handle gaps in the data.
  • Process management strategies have to be developed to successfully execute the fusion process using a mix of multiple data streams with overlapping values.
  • The sustainability of such a process has to be examined and the risks mitigated given the inherent variability, complexity, and growth.

The market for real-time traffic information from different types of data sources is growing. Suppliers can provide new services, at competitive costs, to more users who are willing to pay for precise traffic information in real time. However, success depends on overcoming major obstacles: the need for a clear policy framework and standards, better transparency from providers about technology performance, and improved synergy between the actors (mobile phone operators, traffic engineering, and service providers).


Accessing traffic data

IBM Intelligent Transportation obtains traffic data in real time from a city’s Transportation Management Center (TMC). The data is transferred through the city’s wide area network to IBM Intelligent Transportation, which connects all other city districts’ data managed by multiple TMCs. Traffic operators can access IBM Intelligent Transportation over the Internet through a web browser. The software architecture is modular and open; it uses industry-leading software commercially available for communication and computation. An overview of the system components is shown in Figure 4.

Figure 4. IBM Intelligent Transportation multi-tier architecture model and standards support

IBM Intelligent Transportation expects data originating from traffic detectors to be cleansed, filtered, and transformed to the TMDD XML format. Often this is done by the Advanced Traffic Management System (ATMS) operating at the Traffic Management Center; however, in many cases the ATMS does not support the TMDD XML format or does not exist. The remaining sections of this article show how a traffic data gateway can consume raw traffic data from field sensors and output TMDD XML traffic data that can be integrated with IBM Intelligent Transportation.


Traffic data collection and processing

The sample traffic data used in this article was collected from single loop detectors for approximately 28 links over a period of one week. The road shown in Figure 5 has four lanes and one loop sensor; this means that there is one road and four points that link to four traffic-flow data records from each loop sensor. The time interval of loop-sensor data collection (input) is 30 seconds, and the time interval of output is five minutes.

Figure 5. View of the speed and volume detectors providing the sample traffic data

The 30-second data received from the loop sensor consists of counts (that is, the number of vehicles crossing the loop) and occupancy (that is, the average fraction of time a vehicle is present over the loop). A data-capturing component processes the data in real time and aggregates the 30-second values of counts and occupancy:

  • It calculates the speed for each lane.
  • It calculates the aggregated values of flow, occupancy, and speed across all lanes at each detector station. Flow data shows the count of vehicles passing over a loop detector every 30 seconds, and occupancy data shows how long vehicles are present over this loop detector in percent of an hour. (One station typically serves the detectors in all the lanes at one location.)
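
The article does not state how the data-capturing component derives speed from a single loop detector. The following minimal sketch assumes one common approach: dividing the distance covered by the counted vehicles by the time the loop was occupied. The effective vehicle length, the occupancy expressed as a fraction between 0 and 1, and all names in the code are assumptions made for illustration, not values taken from the sample data.

```python
from dataclasses import dataclass
from typing import List

INTERVAL_S = 30.0                 # input interval stated in the article
EFFECTIVE_VEHICLE_LENGTH_M = 6.0  # assumption: vehicle length plus loop length


@dataclass
class Sample:
    count: int        # vehicles that crossed the loop in one 30-second interval
    occupancy: float  # assumed fraction (0..1) of the interval the loop was occupied


def estimate_speed_kmh(sample: Sample) -> float:
    """Estimate speed from one single-loop sample (assumed method)."""
    if sample.count <= 0 or sample.occupancy <= 0:
        return 0.0
    occupied_s = sample.occupancy * INTERVAL_S
    speed_ms = sample.count * EFFECTIVE_VEHICLE_LENGTH_M / occupied_s
    return speed_ms * 3.6


def aggregate(samples: List[Sample]) -> dict:
    """Roll 30-second samples up into one output record (for example, ten
    samples for a five-minute interval)."""
    total_count = sum(s.count for s in samples)
    mean_occupancy = sum(s.occupancy for s in samples) / len(samples)
    # Weight each interval's speed by its vehicle count so that empty
    # intervals do not pull the average toward zero.
    if total_count > 0:
        speed = sum(estimate_speed_kmh(s) * s.count for s in samples) / total_count
    else:
        speed = 0.0
    return {"count": total_count, "occupancy": mean_occupancy, "speed_kmh": speed}
```

Weighting the speed by count is one possible design choice; a deployment might instead use a calibrated per-lane length factor.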

The TMC uses the five-minute average values of flow and speed to compute the following performance measures:

  • VMT (vehicle-miles traveled)
  • VHT (vehicle-hours traveled)
  • Delay
  • Travel time
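
The article lists these measures without giving formulas. The sketch below uses definitions that are common in freeway performance measurement and should be read as assumptions: VMT as flow multiplied by link length, VHT as VMT divided by speed, delay as the extra vehicle-hours relative to an assumed reference speed, and travel time as link length divided by speed. The function name, parameter names, and the 60 mph reference speed are illustrative only.

```python
from typing import Optional


def performance_measures(flow: float, speed_mph: float, link_length_mi: float,
                         reference_speed_mph: float = 60.0) -> dict:
    """Compute five-minute performance measures for one link.

    flow is the number of vehicles counted in the five-minute interval.
    reference_speed_mph is an assumed free-flow speed used only for delay.
    """
    vmt = flow * link_length_mi
    if speed_mph <= 0:
        # No measurable movement (for example, flow and speed are both zero).
        return {"vmt": vmt, "vht": 0.0, "delay_veh_h": 0.0, "travel_time_s": None}
    vht = vmt / speed_mph
    delay = max(0.0, vht - vmt / reference_speed_mph)
    travel_time_s: Optional[float] = link_length_mi / speed_mph * 3600.0
    return {"vmt": vmt, "vht": vht, "delay_veh_h": delay, "travel_time_s": travel_time_s}
```

For example, `performance_measures(flow=100, speed_mph=50, link_length_mi=0.5)` describes a half-mile link carrying 100 vehicles in five minutes at 50 mph.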

Details of the processed traffic data are available in the sample files in the Download section of this article.

The traffic data collected by the TMC is then sent to a traffic data gateway for further cleansing and filtering, in addition to transformation to the TMDD standards-conformant format as described in the following section. The cleansing and filtering steps use IBM InfoSphere DataStage and IBM InfoSphere QualityStage to perform the traffic data extraction, transformation, and loading (ETL) jobs as shown in Figure 6.

Figure 6. Detailed view of the IIS ETL jobs for traffic data

Traffic data extraction

This section provides a general guide to the InfoSphere DataStage and InfoSphere QualityStage jobs typically developed to extract, cleanse, transform, and load traffic data. Sample code is provided in the sample files (trafficxpath3.dsx); see Download.

Traffic data cleansing job

  • Function description: Design and develop jobs that cleanse the data by extracting the data fields that contain data and eliminating the data fields that do not contain values.

Figure 7a, Figure 7b, and Figure 8 show deploying and applying the extraction. IBM InfoSphere DataStage and QualityStage Designer provides a Microsoft® Windows® client with a graphical user interface that enables developers to design data integration and data cleansing jobs without having to write code. The integration process is drawn first, and then the details are added for each stage. In these examples, the process for extracting data from a file is graphically defined and the attributes that will be maintained are identified.

Figure 7a. InfoSphere DataStage and QualityStage Designer Data cleansing job details

Figure 7b. InfoSphere DataStage and QualityStage Designer Data cleansing job details

Input

Figure 8 shows a sample of the attributes of the source traffic data.

Analyzing the source data with InfoSphere QualityStage showed that some data fields did not have an input value (empty columns can be seen). The input of the extract job (TrafficDataIn.txt) is provided in the Download.

Figure 8. InfoSphere DataStage and QualityStage Designer input data for the extraction job

Output

The following is a view of the data that was generated through the ETL job. As shown in Figure 9, the ETL job removed the columns without values so that only the columns required for further downstream processing are loaded.

The output of the extract job (CleanseData.txt) is provided in the Download. More details about the output data table model are shown in Figure 9.
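
The cleansing job itself is built graphically in DataStage, so no code is needed there. For readers who want to see the logic in isolation, the following sketch shows the same idea outside DataStage: keep only the columns that contain at least one value. The comma delimiter, the header row, and the column name LANE_NOTE in the usage example are assumptions, not the layout of TrafficDataIn.txt.

```python
import csv
import io


def drop_empty_columns(text: str) -> str:
    """Return delimited text with every all-empty column removed.

    Assumes comma-delimited input whose first row is a header.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, data = rows[0], rows[1:]
    # Keep a column only if at least one data row has a non-blank value in it.
    keep = [i for i in range(len(header))
            if any(i < len(row) and row[i].strip() for row in data)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([header[i] for i in keep])
    for row in data:
        writer.writerow([row[i] if i < len(row) else "" for i in keep])
    return out.getvalue()


if __name__ == "__main__":
    sample = "POINTID,LANE_NOTE,VOLUME,SPEED\n10,,12,55\n11,,0,0\n"
    print(drop_empty_columns(sample))
```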

Figure 9. InfoSphere DataStage and QualityStage Designer output data for the extraction job

Traffic data filtering job

The filter stage, shown in Figure 10, is a processing stage that transfers the records from an input file that satisfy specified requirements and filters out all other records. In this case, rule sets were developed to eliminate sensor data that contained errors. The data source input and output specifications shown in Figure 11 and Figure 12 are examples.

The data records that were kept are those that conform to a specified predicate. This predicate determines the valid data records or rows whose data entries satisfy the following logical expression:

(VOLUME>0 AND SPEED>0) OR (VOLUME=0 AND SPEED=0).
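
The predicate rejects records in which volume and speed contradict each other, for example vehicles counted at zero speed or a nonzero speed with no vehicles. A minimal sketch of the same rule outside DataStage follows; the function names and the dictionary-based record layout are assumptions for illustration.

```python
from typing import Dict, Iterable, List


def is_valid(volume: float, speed: float) -> bool:
    """Mirror the filter predicate: both values positive, or both zero."""
    return (volume > 0 and speed > 0) or (volume == 0 and speed == 0)


def filter_records(records: Iterable[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep only records whose VOLUME and SPEED satisfy the predicate."""
    return [r for r in records if is_valid(r["VOLUME"], r["SPEED"])]


if __name__ == "__main__":
    rows = [
        {"VOLUME": 12, "SPEED": 55},  # kept: traffic is moving
        {"VOLUME": 0, "SPEED": 0},    # kept: empty road
        {"VOLUME": 5, "SPEED": 0},    # dropped: vehicles counted at zero speed
        {"VOLUME": 0, "SPEED": 40},   # dropped: speed without vehicles
    ]
    print(filter_records(rows))
```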

Figure 10. InfoSphere DataStage and QualityStage Designer data filtering job details

Input

Deploy and run the filtering job, as shown in Figure 11. The input of the filter job (CleansedData.txt) is provided in the Download.

Figure 11. InfoSphere DataStage and QualityStage Designer input data of filtering job

Output

The output of the filter job (FilteredData.txt) is provided in the Download.

Figure 12. InfoSphere DataStage and QualityStage Designer output data of filtering job

Traffic data transformation jobs

Several data transformation jobs (see Figure 13) were created and executed to transform the traffic detector data into the format that is specified by the TMDD standard. The Column Import stage shown in this example is a restructure stage that imports data from a single column and outputs it to one or more columns. Here, this stage is used to divide data arriving in a single column into multiple columns.
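
To illustrate what a column import does, the following sketch splits one delimited string into named columns. It is not the DataStage stage itself. The column names are the ones that appear in the field mapping later in this article, but their order and the comma delimiter are assumptions.

```python
from typing import Dict

# Assumed column order; the names come from the sample traffic data columns.
COLUMNS = ("POINTID", "DATEYMD", "TIMEHMS", "VOLUME", "OCCUPANCY", "SPEED")


def import_columns(raw: str, delimiter: str = ",") -> Dict[str, str]:
    """Split a single-column record into one value per named column."""
    parts = [p.strip() for p in raw.split(delimiter)]
    if len(parts) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(parts)}")
    return dict(zip(COLUMNS, parts))


if __name__ == "__main__":
    print(import_columns("10,20090601,003759,12,8,55"))
```

Raising an error on a wrong field count is a design choice made here so that malformed records are not silently shifted into the wrong columns.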

Figure 13. DataStage and QualityStage data transformation jobs

The input of the transformation jobs (FilteredData.txt) is provided in the Download. Figure 14 shows a sample of the input data.

Figure 14. InfoSphere DataStage and QualityStage Designer input data of transformation jobs

Output

The output of the transformation jobs (binarydatetimeFilteredData15.txt) is provided in the Download. Figure 15 shows a sample of the output data.

Figure 15. InfoSphere DataStage and QualityStage Designer job output

Transform error-free traffic data stream into TMDD XML data

IBM Intelligent Transportation supports interfacing with Traffic Management Centers and Advanced Traffic Management Systems (ATMS) using the Traffic Management Data Dictionary (TMDD) standard from the Institute of Transportation Engineers (ITE). TMDD V3 not only standardizes the data objects for traffic and event data, but also defines the messages and dialogs to be exchanged between systems in a U.S. ITS National Architecture Center-to-Center (C2C) pattern. (See Resources.)

In this C2C pattern of communication, TMDD defines the abstract interface between an owner center and an external center. The owner center is the organization or system that is capturing and processing the raw traffic and event data from the field and thus owns that information, while the external center is the organization or system that is interested in receiving the traffic and event data from the owner center. For the purpose of integration, IBM Intelligent Transportation fills the role of the external center, and organizations or systems providing data to it fill the role of owner centers. Figure 16 shows a conceptual view for the operational environment of a C2C interface as defined by TMDD standards.

Figure 16. External traffic management center communications environment (source: TMDD standard documentation)

The next step in preparing the sample data presented for feeding into IBM Intelligent Transportation is to convert the cleansed traffic data into appropriate TMDD XML data object formats and implement a web service that supports processing of relevant TMDD Owner Center dialogs. The web service must support the receiving of TMDD dialogs within Simple Object Access Protocol (SOAP) envelopes over HTTP in accordance with the SOAP/HTTP application profile of the NTCIP 2306 standard. (See Resources.) It is important to note that the TMDD dialogs, which conform to one of the three generic request-response, subscription, or publication dialogs (as shown in Figure 17, Figure 18, and Figure 19), are defined and intended to be used to communicate and load near real-time traffic, events, and device data into the Intelligent Transportation operational data store.

Figure 17. Generic TMDD request-response dialog

Figure 18. Generic TMDD subscription dialog

Figure 19. Generic TMDD publication dialog

To help with implementing TMDD-based owner center interfaces, IBM has built a Traffic Management Owner Center Simulator. This sample owner center implements the TMDD dialogs with the corresponding owner center web services and loads TMDD XML data object data from a file system. For the purposes of this article and sample data, the InfoSphere DataStage XMLOut Stage can be used to convert the cleansed traffic data presented above to TMDD XML data object files. These data object files can then be picked up by the Traffic Management Owner Center Simulator to implement a full TMDD owner center interface that can be used by IBM Intelligent Transportation to gather and store this sample traffic data.

Traffic Management Owner Center Simulator

Editor's Note: This article will be updated to include information on where you can download the Traffic Management Owner Center Simulator when it becomes available soon.

At a high level, the following steps should be taken to transform cleansed traffic data, such as what is shown above, to the appropriate TMDD XML data object files that can be consumed by the Traffic Management Owner Center Simulator:

  1. Design how sample traffic data and metadata should be represented with various TMDD data objects.
  2. Map appropriate fields from sample traffic data to the selected TMDD data object to represent the traffic data.
  3. Transform cleansed traffic data, based on the preceding mapping, to the TMDD XML format using the InfoSphere DataStage Transformation Utility.
  4. Validate that the generated set of TMDD XML files conform to the XML schema provided by the Traffic Management Owner Center Simulator, using a simple XML validation tool, such as that built into Eclipse.
  5. Map appropriate fields from traffic metadata to this TMDD object, for each selected metadata TMDD object.
  6. Transform traffic metadata to the TMDD XML format for that object, using InfoSphere DataStage.
  7. Validate that the generated TMDD XML file conforms to the XML schema provided by the Traffic Management Owner Center Simulator, using a simple XML validation tool, such as one built into Eclipse.

Now that a high-level process has been defined, the rest of this section walks through this process to show how we applied it to the sample traffic data and metadata shown in this article.

Design how sample traffic data and metadata should be represented with various TMDD data objects

To best design how the sample traffic data and metadata should be represented with various TMDD data objects, you must first understand to some degree how each representation views the world of traffic management. The better you understand how the two different models or representations view the same physical transportation world, the more accurately you can design a mapping between the two different representations. This article does not discuss in depth either of these data models or why mapping was designed as shown but rather shows you an example of a mapping.

The metadata associated with the sample traffic data has been modeled into these high-level objects: roads, links, nodes, points, and geo-locations. In the TMDD standard, the metadata objects are mostly those focused on inventory. For example, in the traffic data world, the primary metadata objects include NodeInventory and LinkInventory. These two models and sets of objects can be mapped fairly easily, as shown in Table 1, because they share the same terminology and represent the same physical entities. The sample traffic data itself originally came from loop detectors in the field, and thus mapping to TMDD detector objects is a logical fit.

Table 1. Mapping between sample traffic data and TMDD data objects

Sample traffic data concept                            | TMDD data object
node (with geo-location associations)                  | NodeInventory
link (with road, node, and geo-location associations)  | LinkInventory
point (with link and geo-location associations)        | DetectorInventory
traffic data                                           | DetectorData

Map appropriate fields from sample traffic data to the TMDD data object

Table 2 shows the mapping from the traffic data fields to the TMDD DetectorData object, based on the understanding of what each field represents in each corresponding model.

Table 2. Mapping from TMDD detector data fields to the sample traffic data column

TMDD detector data field              | Sample traffic data column
detector-id                           | POINTID
detection-time-stamp (date and time)  | DATEYMD & TIMEHMS
vehicle-count                         | VOLUME
vehicle-occupancy                     | OCCUPANCY
start-time (date and time)            | DATEYMD & TIMEHMS (-30 seconds)
end-time (date and time)              | DATEYMD & TIMEHMS
detector-data-type                    | “actual” (static string)
vehicle-speed                         | SPEED
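
The mapping in Table 2 can be sketched in code as follows. This is an illustration of the mapping only, not the DataStage job. It assumes that DATEYMD uses the form YYYYMMDD and TIMEHMS the form HHMMSS, which matches the date and time values visible in Listing 1; the function name and the nested dictionary layout are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict


def _stamp(moment: datetime) -> Dict[str, str]:
    """Split a datetime into the separate date and time strings."""
    return {"date": moment.strftime("%Y%m%d"), "time": moment.strftime("%H%M%S")}


def to_detector_data(row: Dict[str, str]) -> dict:
    """Map one sample traffic data row to TMDD detector data fields (Table 2)."""
    end = datetime.strptime(row["DATEYMD"] + row["TIMEHMS"], "%Y%m%d%H%M%S")
    start = end - timedelta(seconds=30)  # each record covers the previous 30 seconds
    return {
        "detector-id": row["POINTID"],
        "detection-time-stamp": _stamp(end),
        "vehicle-count": row["VOLUME"],
        "vehicle-occupancy": row["OCCUPANCY"],
        "start-time": _stamp(start),
        "end-time": _stamp(end),
        "detector-data-type": "actual",  # static string
        "vehicle-speed": row["SPEED"],
    }
```

Computing the start time with date arithmetic, rather than by subtracting 30 from the seconds field, keeps records near midnight correct because the date rolls back together with the time.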

Transform cleansed traffic data, based on the preceding mapping, to the TMDD XML format

To generate a TMDD XML document that corresponds to the preceding mapping using InfoSphere DataStage, the filtered raw traffic data had to be transformed, through several jobs, into a format that is easier for the InfoSphere DataStage XMLOut job to process. Below are some of the details of the XMLOut job, the incoming data format, and a sample output data format. Sample files ("devWorks_TMDD_Data" subdirectory) are provided in the Download of this article.

Traffic detector data XML transformation stage

  • Function description: InfoSphere DataStage and QualityStage Designer provides an XMLOut processing stage that transforms a comma-delimited input file (see Figure 20) into an XML file (see Figure 21). This file (DetectorData_2009-06-01T00_37_59.xml) is also provided in the Download section of this article.

Figure 20. InfoSphere DataStage and QualityStage Designer XMLOut stage

Figure 21. Sample Detector Data TMDD XML format

The TMDD XML metadata schema shown in Figure 22 is used to generate the following graphic, which depicts a structure that contains traffic detector data information. In the schema, a document might contain multiple detector-id elements, and each detector-id, in turn, might have multiple vehicle-count and vehicle-occupancy elements. Finally, a detectorData can have multiple detector-id elements.

Figure 22. InfoSphere DataStage and QualityStage imported XML metadata of traffic detector data

Input

The cleansed and filtered traffic data is used as the input file for this final XML transformation stage. The data elements in this file have to be mapped to the TMDD traffic data detector metadata elements, as shown in Figure 23, Figure 24, and Figure 25.

The source file (binarydatetimeFilteredData15fix1.txt) that is used, the job export file (trafficxpath2.dsx), and the source and XML schema definition (DetectorData_2009-06-01T00_37_59.xml) are provided in the Download section of this article.

Figure 23. DataStage mapping of TMDD traffic data detector metadata elements to XML table definitions

Figure 24. XMLOutputPX stage configuration

Figure 25. XMLOutputPX stage configuration

Output

The final XML output stage receives the results coming out of the transformer stage and produces the target TMDD XML as shown in Listing 1.

After InfoSphere DataStage creates the output files, the last step in getting this data into a format that the Traffic Management Owner Center Simulator accepts is to name the files consistently with the structure that the simulator expects. This naming is a relatively minor step, and it was done manually for this article. The output (the "DetectorData" subdirectory of devWorks_TMDD_Data.zip) is provided in the Download section of this article.
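The renaming was done by hand for this article, but it could be scripted. The following is a minimal sketch, assuming that the expected file name follows the pattern of the sample file DetectorData_2009-06-01T00_37_59.xml, that is, the entry start time with its colons replaced by underscores and the fractional seconds dropped. The function name is hypothetical and is not part of the simulator or of InfoSphere DataStage.

```python
from datetime import datetime


def detector_data_filename(entry_start_time: str) -> str:
    """Derive a file name such as DetectorData_2009-06-01T00_37_59.xml.

    Assumes entry_start_time looks like the entryStartTime attribute in the
    generated XML, for example 2009-06-01T00:37:59.000000.
    """
    # Parse first so that malformed timestamps are rejected early.
    stamp = datetime.strptime(entry_start_time, "%Y-%m-%dT%H:%M:%S.%f")
    # The sample file name uses underscores where the timestamp has colons.
    return "DetectorData_" + stamp.strftime("%Y-%m-%dT%H_%M_%S") + ".xml"


print(detector_data_filename("2009-06-01T00:37:59.000000"))
```

If the simulator expects a different pattern in your environment, only the format string in the return statement needs to change.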

Listing 1. TMDD XML file generated by InfoSphere DataStage and QualityStage
<?xml version="1.0" encoding="UTF-8"?>
<tns:DetectorDataTimeEntries
	xmlns:p1="http://www.ntcip.org/c2f-object-references"
	xmlns:p2="http://www.ntcip.org/c2c-message-administration"
	xmlns:p3="http://www.LRMS-Adopted-02-00-00"
	xmlns:p4="http://www.LRMS-Local-02-00-00"
	xmlns:p5="http://www.ITIS-Adopted-03-00-02"
	xmlns:p6="http://www.ITIS-Local-03-00-02"
	xmlns:tmdd="http://www.tmdd.org/3/messages"
	xmlns:tns="http://www.ibm.com/xmlns/transportation/tmdd/simulation"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<tns:DetectorDataTimeEntry entryStartTime="2009-06-01T00:37:59.000000">
		<tns:DetectorData>
			<tmdd:detector-id>10</tmdd:detector-id>
			<tmdd:detection-time-stamp>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003759</tmdd:time>
			</tmdd:detection-time-stamp>
			<tmdd:vehicle-count>0</tmdd:vehicle-count>
			<tmdd:vehicle-occupancy>0</tmdd:vehicle-occupancy>
			<tmdd:start-time>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003729</tmdd:time>
			</tmdd:start-time>
			<tmdd:end-time>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003759</tmdd:time>
			</tmdd:end-time>
			<tmdd:detector-data-type>actual</tmdd:detector-data-type>
			<tmdd:vehicle-speed>0</tmdd:vehicle-speed>
		</tns:DetectorData>
		<tns:DetectorData>
			<tmdd:detector-id>12</tmdd:detector-id>
			<tmdd:detection-time-stamp>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003759</tmdd:time>
			</tmdd:detection-time-stamp>
			<tmdd:vehicle-count>0</tmdd:vehicle-count>
			<tmdd:vehicle-occupancy>0</tmdd:vehicle-occupancy>
			<tmdd:start-time>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003729</tmdd:time>
			</tmdd:start-time>
			<tmdd:end-time>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003759</tmdd:time>
			</tmdd:end-time>
			<tmdd:detector-data-type>actual</tmdd:detector-data-type>
			<tmdd:vehicle-speed>0</tmdd:vehicle-speed>
		</tns:DetectorData>
		<tns:DetectorData>
			<tmdd:detector-id>14</tmdd:detector-id>
			<tmdd:detection-time-stamp>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003759</tmdd:time>
			</tmdd:detection-time-stamp>
			<tmdd:vehicle-count>1</tmdd:vehicle-count>
			<tmdd:vehicle-occupancy>3</tmdd:vehicle-occupancy>
			<tmdd:start-time>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003729</tmdd:time>
			</tmdd:start-time>
			<tmdd:end-time>
				<tmdd:date>20090601</tmdd:date>
				<tmdd:time>003759</tmdd:time>
			</tmdd:end-time>
			<tmdd:detector-data-type>actual</tmdd:detector-data-type>
			<tmdd:vehicle-speed>1</tmdd:vehicle-speed>
		</tns:DetectorData>
	</tns:DetectorDataTimeEntry>

</tns:DetectorDataTimeEntries>
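To make the shape of Listing 1 easier to follow, here is a rough Python sketch that builds the same element structure from comma-delimited rows. It is not the DataStage job and does not reproduce its configuration. The column order (detector ID, end timestamp, count, occupancy, speed) is an assumption made for illustration, as is the 30-second interval, which is inferred from the start and end times in the listing. Only the tmdd and tns namespaces are declared.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

TMDD = "http://www.tmdd.org/3/messages"
TNS = "http://www.ibm.com/xmlns/transportation/tmdd/simulation"
ET.register_namespace("tmdd", TMDD)
ET.register_namespace("tns", TNS)


def _date_time(parent, tag, moment):
    """Append a tmdd element that holds tmdd:date and tmdd:time children."""
    node = ET.SubElement(parent, f"{{{TMDD}}}{tag}")
    ET.SubElement(node, f"{{{TMDD}}}date").text = moment.strftime("%Y%m%d")
    ET.SubElement(node, f"{{{TMDD}}}time").text = moment.strftime("%H%M%S")


def build_entries(csv_text, interval_seconds=30):
    """Build a DetectorDataTimeEntries tree from comma-delimited rows.

    Assumed (hypothetical) columns: detector id, ISO end timestamp,
    vehicle count, occupancy, speed. All rows go into one time entry.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    root = ET.Element(f"{{{TNS}}}DetectorDataTimeEntries")
    # Listing 1 uses the interval end time as entryStartTime.
    first_end = datetime.fromisoformat(rows[0][1])
    entry = ET.SubElement(
        root, f"{{{TNS}}}DetectorDataTimeEntry",
        entryStartTime=first_end.strftime("%Y-%m-%dT%H:%M:%S.%f"))
    for det_id, end_iso, count, occupancy, speed in rows:
        end = datetime.fromisoformat(end_iso)
        data = ET.SubElement(entry, f"{{{TNS}}}DetectorData")
        ET.SubElement(data, f"{{{TMDD}}}detector-id").text = det_id
        _date_time(data, "detection-time-stamp", end)
        ET.SubElement(data, f"{{{TMDD}}}vehicle-count").text = count
        ET.SubElement(data, f"{{{TMDD}}}vehicle-occupancy").text = occupancy
        _date_time(data, "start-time", end - timedelta(seconds=interval_seconds))
        _date_time(data, "end-time", end)
        ET.SubElement(data, f"{{{TMDD}}}detector-data-type").text = "actual"
        ET.SubElement(data, f"{{{TMDD}}}vehicle-speed").text = speed
    return root


sample = "10,2009-06-01T00:37:59,0,0,0\n14,2009-06-01T00:37:59,1,3,1\n"
tree = build_entries(sample)
print(ET.tostring(tree, encoding="unicode"))
```

The element order inside each DetectorData record follows Listing 1, which matters if the output is later validated against a schema that defines the children as a sequence.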

Validate that the TMDD XML files conform to the XML schema

Using a simple XML validation tool, such as the one built into Eclipse, validate that the generated set of TMDD XML files conforms to the XML schema provided by the Traffic Management Owner Center Simulator.

For this validation exercise, the set of generated XML files, named according to the format that the Traffic Management Owner Center Simulator expects, is loaded into Eclipse, and XML validation is then run against the entire set. Figure 26 shows the completed validation with no errors or warnings.

Figure 26. Display of Eclipse XML validation of DetectorData files
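Before opening Eclipse, a quick scripted pre-check can catch obviously broken files. The sketch below only confirms that a document is well-formed and that each DetectorData record has the child elements seen in Listing 1. It is not schema validation and does not replace the Eclipse step. The function name and the list of required children are assumptions drawn from the listing, not from the simulator's XML schema.

```python
import xml.etree.ElementTree as ET

TMDD = "{http://www.tmdd.org/3/messages}"
# Child elements that every DetectorData record has in Listing 1.
REQUIRED = ["detector-id", "detection-time-stamp", "vehicle-count",
            "vehicle-occupancy", "start-time", "end-time",
            "detector-data-type", "vehicle-speed"]


def precheck(xml_text):
    """Return a list of problems; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    # Match on the local name so the check does not depend on the prefix.
    records = [el for el in root.iter() if el.tag.endswith("}DetectorData")]
    if not records:
        problems.append("no DetectorData elements found")
    for index, record in enumerate(records, start=1):
        for name in REQUIRED:
            if record.find(TMDD + name) is None:
                problems.append(f"DetectorData #{index} is missing {name}")
    return problems


# A deliberately incomplete record, used only to exercise the check.
sample = (
    '<s:DetectorDataTimeEntries'
    ' xmlns:s="http://www.ibm.com/xmlns/transportation/tmdd/simulation"'
    ' xmlns:t="http://www.tmdd.org/3/messages">'
    '<s:DetectorDataTimeEntry><s:DetectorData>'
    '<t:detector-id>10</t:detector-id><t:vehicle-count>0</t:vehicle-count>'
    '</s:DetectorData></s:DetectorDataTimeEntry></s:DetectorDataTimeEntries>'
)
for problem in precheck(sample):
    print(problem)
```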

For each selected metadata TMDD object, map appropriate fields from traffic metadata to this TMDD object.

Three TMDD data objects need to be created in TMDD XML format to represent the metadata for the traffic detectors that produced the sample traffic data. These three objects, as listed in Table 1, are NodeInventory, LinkInventory, and DetectorInventory. Table 3, Table 4, and Table 5 capture the mappings from the metadata objects to these three objects. As the tables show, some fields are static, some are simple one-to-one mappings, and some require lookups across several values and tables.

Table 3. NodeInventory mapping to metadata
NodeInventory TMDD fields | Mapping to metadata
network-id | Static field to be set
node-id | Node::Node_ID
node-name | Node::Node_Name
node-location::latitude | Geo_info::Latitude (where geo_info's link_id matches a link where either this node is the “from node” and geo_info::pointtype=1 OR where this node is the “to node” and geo_info::pointtype=2)
node-location::longitude | Geo_info::Longitude (where geo_info's link_id matches a link where either this node is the “from node” and geo_info::pointtype=1 OR where this node is the “to node” and geo_info::pointtype=2)
last-update-time | Static field to be set
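The node-location rule in Table 3 is the least obvious of the NodeInventory mappings, so a small sketch might help. It assumes the Link and Geo_info rows have already been read into dictionaries keyed by the column names used in the tables. The function name and the sample coordinates are made up for illustration.

```python
def node_location(node_id, links, geo_info):
    """Return (latitude, longitude) for a node, or None if no point matches.

    A node takes the Pointtype=1 point of a link that it starts (F_Node)
    or the Pointtype=2 point of a link that it ends (T_Node).
    """
    for link in links:
        if link["F_Node"] == node_id:
            wanted = 1
        elif link["T_Node"] == node_id:
            wanted = 2
        else:
            continue
        for point in geo_info:
            if point["Link_ID"] == link["Link_ID"] and point["Pointtype"] == wanted:
                return point["Latitude"], point["Longitude"]
    return None


# Illustrative rows only; these are not values from the sample data.
links = [{"Link_ID": "L1", "F_Node": "N1", "T_Node": "N2"}]
geo_info = [
    {"Link_ID": "L1", "Pointtype": 1, "Latitude": 39.90, "Longitude": 116.40},
    {"Link_ID": "L1", "Pointtype": 2, "Latitude": 39.91, "Longitude": 116.41},
]
print(node_location("N2", links, geo_info))
```

The sketch returns the first matching point. If a node sits on several links, the article does not say which link should win, so that choice is left open here.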
Table 4. LinkInventory mapping to metadata
LinkInventory TMDD fields | Mapping to metadata
network-id | Static field to be set
link-id | Link::Link_ID
link-name | Link::Link_Name
link-type | Map from Road::Road_Type (where Link::Road_ID = Road::Road_ID) {freeway=Highway}
link-begin-node-id | Link::F_Node
link-begin-node-location::latitude | Geo_info::Latitude (where Link::Link_ID=Geo_info::Link_ID and Geo_info::Pointtype=1)
link-begin-node-location::longitude | Geo_info::Longitude (where Link::Link_ID=Geo_info::Link_ID and Geo_info::Pointtype=1)
link-end-node-id | Link::T_Node
link-end-node-location::latitude | Geo_info::Latitude (where Link::Link_ID=Geo_info::Link_ID and Geo_info::Pointtype=2)
link-end-node-location::longitude | Geo_info::Longitude (where Link::Link_ID=Geo_info::Link_ID and Geo_info::Pointtype=2)
link-length | Link::Link_Len
last-update-time | Static field to be set
Table 5. DetectorInventory mapping to metadata
DetectorInventory TMDD fields | Mapping to metadata
organization-id | Static field to be set
device-id | Point_ID
device-location::latitude | Latitude of corresponding link (midpoint)
device-location::longitude | Longitude of corresponding link (midpoint)
device-name | “Device at Point ” + Point_ID
link-id | Link_ID
link-name | Link::Link_Name (where Link_ID=Link::Link_ID)
link-direction | Direction
last-update-time | Static field to be set
detector-type | “inductive loop”
detection-lanes::lanes | One “lanes” entry set to Point::Road_Lane
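As a rough illustration of Table 5, the following sketch maps one Point row to the DetectorInventory fields. It assumes the rows are dictionaries keyed by the column names in the table. The function name and all sample values are hypothetical. The device-location fields are left out because the article does not specify how the link midpoint is calculated.

```python
def detector_inventory(point, links, organization_id, last_update_time):
    """Map one Point row to DetectorInventory fields, following Table 5."""
    # link-name comes from the Link row that shares the point's Link_ID.
    # next() raises StopIteration if the point refers to an unknown link.
    link = next(row for row in links if row["Link_ID"] == point["Link_ID"])
    return {
        "organization-id": organization_id,    # static field
        "device-id": point["Point_ID"],
        "device-name": "Device at Point " + str(point["Point_ID"]),
        "link-id": point["Link_ID"],
        "link-name": link["Link_Name"],
        "link-direction": point["Direction"],
        "last-update-time": last_update_time,  # static field
        "detector-type": "inductive loop",
        "detection-lanes": [point["Road_Lane"]],  # one "lanes" entry
    }


# Illustrative rows only; these are not values from the sample data.
links = [{"Link_ID": "L1", "Link_Name": "Sample link"}]
point = {"Point_ID": 10, "Link_ID": "L1", "Direction": "N", "Road_Lane": 2}
record = detector_inventory(point, links, "sample-org", "2009-06-01T00:00:00")
print(record["device-name"])
```

In the article's workflow, this mapping is specified in InfoSphere FastTrack and executed by InfoSphere DataStage. The dictionary here only shows which source column feeds which TMDD field.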

Transform traffic metadata to the TMDD XML format for that object

For this step, three TMDD metadata objects need to be created based on the preceding design. InfoSphere DataStage transformation and XMLOut stages can be used to generate these files, and InfoSphere FastTrack can be used to specify the mappings that InfoSphere DataStage executes.

Validate that the generated TMDD XML file conforms to the XML schema

Using a process similar to the one described in Step 4 in the preceding list of instructions, validate the three TMDD inventory data files with Eclipse and the XML schema from the Traffic Management Owner Center Simulator.

The combination of these three files, a static OrganizationInformation.xml file, and the detector data files form the simulator input data structure. This data is now ready to be used by the Traffic Management Owner Center Simulator to feed into IBM Intelligent Transportation.


Load the TMDD XML messages to the Traffic Management Owner Center Simulator

The last step in this process is to create an owner center and load into it the sample data (see "devWorks_TMDD_Data.zip" in the Download section) that you have cleansed and converted, so that the owner center can send the data to the external center on request.

The methods for sending data between the owner center and the external center were discussed and diagrammed earlier under "Transforming error free traffic data stream into TMDD XML data." Those methods are the request-response, subscription, and publication dialogs (as shown in the figures in the previous section). Any of these three methods is a valid mechanism to communicate and load near real-time traffic, event, and device data into the IBM Intelligent Transportation operational data store (EC).

The steps in this section cover creating a simulated owner center in WebSphere Application Server, loading the sample data (or the data that you have just cleansed and converted, if you are following along in this article), and using the simulator to send that data to the external center over SOAP. Follow these steps:

  1. Prepare and run the TMDD owner center (OC) code on WebSphere Application Server:
    1. Ensure that you have a clean installation of WebSphere Application Server V7.0.
    2. Enable start-up bean service on WebSphere Application Server by doing the following:
      1. Open the WebSphere Application Server Administration console.
      2. Browse to Servers, then WebSphere Application Servers.
      3. Select server1.
      4. Expand Container Services under Container Settings.
      5. Select the Startup Beans Service option.
      6. Select the Enable service at server startup option.
      7. Click OK.
      8. Save the configuration.
      9. Restart WebSphere Application Server to ensure that the setting takes effect. (You can also do this after increasing the Java™ Virtual Machine heap size; see sub-step 3 that follows.)
      10. To stop WebSphere Application Server from the command line on the server, enter the following command from the <WAS install directory>/bin:
        ./stopServer.sh server1
      11. To start WebSphere Application Server from the command line on the server, enter the following command from the <WAS install directory>/bin:
        ./startServer.sh server1
    3. Increase the Java Virtual Machine heap size in WebSphere Application Server:
      1. Open the WebSphere Application Server Administration console.
      2. Browse to Servers, then WebSphere Application Servers.
      3. Select server1.
      4. Expand Java and Process Management under Server Infrastructure.
      5. Select the Process definition option.
      6. Select Java Virtual Machine from the properties on the right.
      7. Set the maximum heap size to 1024 MB.
      8. Click OK.
      9. Save the configuration.
      10. Restart WebSphere Application Server to ensure that the setting takes effect. (You can combine this restart with the one in sub-step 2 above; see that sub-step for instructions on stopping and starting the server.)
    4. Install the TMDD Simulator EAR to WebSphere Application Server:
      1. Open the WebSphere Application Server Administration console.
      2. On the left navigation, select Applications then New Application.
      3. Select New Enterprise Application, as shown in Figure 27.
        Figure 27. WebSphere Application Server – New Enterprise Application
      4. Select the location of the TMDD TMD Simulator.ear file that should be part of the TMDD Simulator.
      5. Use the default settings except select the Deploy Web services option as shown in Figure 28.
        Figure 28. Installation options for TMDD Simulator EAR on WebSphere Application Server
    5. Copy the sample data file structure, including all files, to the server's file directory:
      1. Download the TMDD Simulation Data (devWorks_TMDD_Data.zip); refer to the Download section of this article.
      2. Extract devWorks_TMDD_Data.zip. (The default is to extract to "c:/" if you are working on a Microsoft Windows system.) The data must be local to the WebSphere Application Server.
      3. You can extract to any other directory as long as you set the WebSphere variable TMDD_SIMULATOR_DATA_DIRECTORY to that directory. See the instructions that follow.
    6. Update the WebSphere Application Server variable that points to the directory.

      Note: The default value for the variable is C:/TMDD_Simulator_Data/. If you installed the data there and it's on a Windows system, you don’t need to do anything.

      1. To change the variable, open the WebSphere Application Server Administration console.
      2. Expand the Environment menu.
      3. Select the WebSphere Variables option.
      4. Select Node in the scope drop-down menu.
      5. Select New to create a new variable.
      6. Set the name of the variable to TMDD_SIMULATOR_DATA_DIRECTORY.
      7. Set the directory (be sure to include "/" at end of the directory name) to the data directory. See Figure 29.
        Figure 29. WebSphere Application Server Variable Management
      8. Click Apply.
      9. You must restart the server for the new value to be picked up by the simulator application.
    7. Start the TMDD TMC Simulator enterprise application:
      1. Open the WebSphere Application Server Administration console.
      2. Expand the Applications menu.
      3. Select WebSphere enterprise applications.
    8. Verify that the TMDD TMC Simulator application started successfully and is running, as shown in Figure 30.
      Figure 30. TMDD TMC Simulator running on WebSphere Application Server

      If it is not running, check the server logs to find out why. One likely cause is a mistyped sample data directory name. The trace.log file in <WAS install dir>/profiles/AppSrv01/logs/server1 contains startup and failure information. The example in Figure 31 shows the simulator checking the WebSphere Application Server variable and then falling back to the default value. In this case, the actual directory did not match the value of the WebSphere Application Server variable, so the simulator reverted to the default.

      Figure 31. Sample log file from WebSphere Application Server
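Because a mistyped data directory is a likely cause of a failed startup, it can help to check the value before you restart the server. The sketch below is a hypothetical helper, not part of the simulator. It checks only the two conditions that the steps above mention: the value ends with "/" and the directory exists. Run it on the machine that hosts WebSphere Application Server, because the data must be local to that server.

```python
import os
import tempfile


def check_data_directory(value):
    """Return a list of problems with a TMDD_SIMULATOR_DATA_DIRECTORY value."""
    problems = []
    if not value.endswith("/"):
        problems.append("value should end with '/'")
    if not os.path.isdir(value):
        problems.append("directory does not exist on this machine: " + value)
    return problems


# A temporary directory stands in for the real data directory here.
with tempfile.TemporaryDirectory() as tmp:
    print(check_data_directory(tmp + "/"))  # existing directory, trailing slash
    print(check_data_directory(tmp))        # same directory, no trailing slash
```

An empty list means that neither problem was found. It does not confirm that the directory contains the expected simulator file structure.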

  2. Run the owner center and send data to the external center.

    You now have a functional owner center that is reading TMDD XML from a local directory and is ready to send it to a requesting external center. Here are instructions for configuring the simulator and using it.

    1. Configure the simulator owner center according to the desired preferences:
      1. Open TMDD_Simulator_Data/simulation.properties in a text editor (the directory will vary depending on where you installed it on the server).
      2. Edit the configuration properties as desired.
      3. Save and close the properties file.
    2. Find and open the Simulator Administrator Console:
      1. Identify the ports that the simulator is listening on (you can find this information in the trace.log referred to above).
      2. Open the Simulator Administrator Console at http://server-host:port/Simulator_Web_Admin/, using the port number identified in the previous step.
      3. The screen should look like Figure 32.
        Figure 32. View of successfully running Traffic Management Owner Center
      4. If you modified the configuration in the preceding steps, verify that those configurations are reflected.
      5. If they are not, click Reload Configuration.
      6. Verify that the configuration values are updated.
    3. After the simulator is installed and running correctly, start sending data to a TMDD External Center:
      1. To start the simulator sending data, click Start Simulation.
      2. After you start the simulation, you can tell that it's sending data because the simulation time continues changing as it cycles through the sample owner center data as shown in Figure 33.
        Figure 33. Running simulator view from Simulator Administration Console

        You can verify this transmission by checking that the external center is receiving the data. Figure 34 and Figure 35 show receipt of the data in a simulated external center.

        You see the data request and subscription commands being processed, as well as receipt of the data.

        Figure 34. Sample testing output showing sample data running correctly

        In Figure 35, you can see the external center simulator receiving the SOAP envelope metadata as well as the XML data.

        Figure 35. Sample TCP/IP monitor showing TMDD XML being sent from simulator
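As a small convenience for the console step above, the following sketch builds the Simulator Administrator Console URL from a host and port and offers an optional reachability probe. The host name and port 9080 are placeholders; use the port that you found in trace.log. Both function names are hypothetical, and the probe only reports whether the page answers with HTTP 200. It does not start or configure the simulation.

```python
import urllib.error
import urllib.request
from urllib.parse import urlunsplit


def admin_console_url(host, port):
    """Build the Simulator Administrator Console URL for a host and port."""
    return urlunsplit(("http", f"{host}:{port}", "/Simulator_Web_Admin/", "", ""))


def console_is_reachable(url, timeout=5):
    """Return True if the console answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


# Placeholder host and port; replace them with your own values.
url = admin_console_url("server-host", 9080)
print(url)
```

Call console_is_reachable(url) from a machine that can reach the server. A False result can mean a wrong port, a stopped application, or a network restriction, so check trace.log before drawing conclusions.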


Conclusion

This article demonstrated how to use the traffic data gateway to consume data that is provided in relatively arbitrary formats and bring it into the IBM Intelligent Transportation system. It included step-by-step instructions and samples of both code and data to help you build your own integration with your data sources. We hope that this material helps make you more productive when you build your traffic data integration.


Acknowledgement

The authors wish to thank Curt Brobst for his contributions to this article.


Download

Description | Name | Size
Sample code and data for this article | SampleCodeandData.zip | 47KB

Publish date: July 15, 2011