Machine data analytics

Drop-in-place security and safety monitors

The effort to improve cyber-physical security and safety systems has spawned a growing industry focused on the challenge. Closed-circuit analog TV cameras are rapidly being replaced by higher-definition, feature-rich digital cameras for image-based security and safety systems; digital cameras are more flexible, smarter, and better integrated with cloud services and data analytics. This article describes drop-in-place safety and security monitors, combined with cloud-based data analytics, that make it possible to quickly deploy monitoring in areas without power or networking infrastructure.

Sam B. Siewert, Assistant Professor, University of Alaska Anchorage

Dr. Sam Siewert is an assistant professor in the Computer Science and Engineering department at the University of Alaska Anchorage. He is also an adjunct assistant professor at the University of Colorado at Boulder and teaches several summer courses in the Electrical, Computer, and Energy Engineering department. As a computer system design engineer, Dr. Siewert has worked in the aerospace, telecommunications, and storage industries since 1988. Ongoing interests as a researcher and consultant include scalable systems, computer and machine vision, hybrid reconfigurable architecture, and operating systems. Related research interests include real-time theory, digital media, and fundamental computer architecture.



14 January 2014

In the not-too-distant future, we are likely to rely on a rapidly evolving machine-to-machine infrastructure with advanced sensing that closely rivals (and in some cases surpasses) human capabilities. Intelligent transportation systems and self-driving cars have already been demonstrated, and Toyota and Audi are joining Google in field testing similar technology in Nevada. Today, autonomous operation is enabled by costly light-detection and ranging (LIDAR) technology, in addition to simpler instruments and software. But costs for this instrumentation are decreasing, and visible-spectrum computer vision solutions have the potential to lower the cost of implementation further.

Even more interesting are applications in which vehicles transmit fleet behavior to optimize traffic flow or communicate with infrastructure such as traffic control systems (stop lights, for example), airport air traffic control equipment, and the sensors, automation, and control systems found in buildings, on roadways, in ports, at airports, and throughout transportation systems in general.

Already, no one needs to stop or even slow down for toll collection. This kind of technology is known as machine-to-machine automation.

Congested and heavily traveled corridors such as Interstate 5 in California could benefit from fleet automation for trucking, and private vehicles might be able to take advantage of this function in less than a decade (see Resources). The work behind the Google car has been going on for more than eight years, since Sebastian Thrun's team at Stanford University won the 2005 Defense Advanced Research Projects Agency (DARPA) Grand Challenge, and it continues. In much less congested places, such as Alaska, similar benefits can come from machine-to-machine advanced sensing to monitor arctic operations, provide better safety and security for ports, perform environmental surveys and protect resources, and explore for energy on the North Slope (see Resources).

The largest payoff for machine-to-machine systems is likely to be realized in some of the most remote, harsh, and congested environments. In these locations, the savings from automation will most quickly outweigh the initial costs. For remote applications where infrastructure is much more limited, drop-in-place self-powering safety and security monitors have the highest value.

Drop-in-place sensor networks (see Resources) have also been in development for some time. Perhaps the best known are sensor networks often called motes, such as the Berkeley Network Embedded Systems Technology (NEST) and Smart Dust projects (see Resources). Likewise, machine vision has been in use for decades, with automation that employs the full spectrum, from thermal imaging to x-ray. Machine vision often surpasses human ability to inspect fabrication processes and improve safety with remote operations.

This article focuses on drop-in-place deployment of low-cost 3D and multi-spectral imaging for machine-to-machine systems.

Webcams and wearable flash memory cameras

Webcams, dashboard cameras, backup cameras, and wearable solid-state recorder cameras have become so ubiquitous that events that were rarely recorded or witnessed in the past are now regularly recorded by the public. One example is the Chelyabinsk meteor that exploded over Russia's Ural Mountains region in February 2013. In that situation, scientists relied on amateur video as well as broadcast news (see Resources). Note that some of the video of the event includes fairly graphic depictions of car accidents and the meteor. These images underscore the value of high-fidelity observation from vehicles, not only to prevent accidents but also to determine what went wrong. Science, public safety, accident investigators, and insurance agencies find the graphic detail useful. Perhaps this is just the beginning of what's to come.

The value of drop-in-place computer vision

Many safety and security applications that can benefit from computer and machine vision exist outside urban environments with power and data network infrastructure. A simple example is field operations such as those found on the north slope of Alaska for energy production and exploration, as well as for the pipeline (and proposed new natural gas pipeline). Another example is disaster areas where power and data infrastructure has been damaged. Environmental compliance, resource management, and surveys (handled by the U.S. Geological Survey in the United States; see Resources) involve field data collection. Finally, construction projects where power and data might not be fully developed can benefit from drop-in-place computer vision.

In the past, safety and security services relied on piloted aircraft, human hikers, and arduous land navigation at high cost and low frequency. To provide truly useful data, a drop-in-place computer vision platform must rival and even surpass human observation by extending vision into the infrared spectrum and by employing 3D imaging to produce ranging and point-cloud models of environments (see Resources). Point clouds provide a 3D data model of a scene based on multiple camera observations, often from binocular cameras or from many cameras at coordinated viewpoints around the same scene. Any camera that can be dropped in place (perhaps on a tripod) or mounted on a tree can also be adapted for use in unmanned aerial systems. Imagine a multi-spectral 3D camera similar to the popular GoPro wearable cameras (see Resources), but with more sophisticated detectors and computational capability built in.

Along with research sponsors and graduate and undergraduate students at the University of Colorado and the University of Alaska Anchorage, I have been working on this type of device. Although you can create such a device with off-the-shelf hardware and software, our team decided we needed something better. The following sections describe the steps and required technology.


How to build a drop-in-place 3D camera

Building an off-the-shelf, drop-in-place, 3D camera is fairly straightforward. To make the device practical, however, the power requirements, size, uplink mechanism, data storage, and intelligence of the computer vision processing must be workable. To build this device with open source software and off-the-shelf hardware, you must integrate the following components:

Computational photography

The combination of digital image capture, image processing, graphics, and computer vision integrated into a single device or perhaps a mobile device with a cloud connection has led to the concept of computational photography (see Resources). Think of it as Adobe Photoshop embedded in your digital camera, and you have the basic idea. However, with more scientific uses and field-based computer vision and augmented reality (AR) applications, this concept could be extended to a more general concept of computational photometry for intelligent instrumentation, perhaps using the drop-in-place, 3D and multi-spectral imaging devices described in this article.

  • Computational photometer open hardware can be built readily by using the BeagleBoard xM platform, based on the Texas Instruments TI-OMAP processor, running Linux®. In fact, I have preconfigured a distribution based on Ubuntu and Debian for this purpose (see Resources). I use this simple recipe as a platform for mobile, embedded computer vision teaching and lab work in university classes. The xM platform has a built-in digital camera port for HD cameras.
  • Solid-state recording with the BeagleBoard xM simply requires file system space on the Linux file system or an external USB flash drive, plus the proper codecs (encoding and decoding software that uses Texas Instruments Open Multimedia Applications Platform [TI-OMAP] hardware acceleration to compress digital video and images with MPEG and JPEG, respectively). The reference image available in the Downloads section includes FFmpeg (also known as avconv) and OpenCV; a minimal recording sketch follows this list.
  • GStreamer streams video off the open computational photometer that either can't be stored or must be stored in the cloud. This is a new feature that student researchers working on the Open Computational Photometer project intend to integrate in future reference Linux configurations.
  • Integration of field-programmable gate array (FPGA) or GPU co-processing for the camera interface is a key feature to support low-power, highly concurrent pixel transformations for the Open Computational Photometer concept. The project uses the Altera Development and Education boards DE0 and DE2i. This work is unfinished, but you can explore on your own or look for updates from the project in 2014 (see Resources). This integration step is tricky but important and has required the Open Computational Photometer project at the University of Colorado and University of Alaska Anchorage to build custom hardware that will be released as an open hardware reference design by our research sponsors.
  • Battery power for the BeagleBoard xM TI-OMAP and FPGA co-processing is required for the drop-in-place aspect of the open computational photometer. Luckily, many mobile Linux users have already invented battery power options for the BeagleBoard xM. Likewise, Altera has suitable battery-powered solutions for the DE0 Nano (see Resources).
  • Wireless uplink has simple off-the-shelf USB device solutions, including Xbee, ZigBee, and cellular modem GSM (see Resources). The challenge is to enable wireless uplink from really remote locations at MPEG transport bandwidths ranging from 1Mbps and up for standard-definition video. Higher compression available from H.265 (standard released in 2013) and H.264 will help, but uplinks will be lossy video. Urban deployments will of course be much easier and might simply use an 802.11 USB adapter and an urban wireless hotspot.
  • Advanced Linux power management and awareness is also required for the drop-in-place aspect of the open computational photometer. Significantly more work is needed than is presented here, but a starting point is general-purpose I/O (GPIO) control to turn off external devices when they are not in use (see Resources). A primary reason for using the Altera DE0 and DE0 Nano for the image-processing features in the open computational photometer is higher-efficiency frame transformation compared with digital camera port or USB cameras.
  • Custom binocular camera interface board is the one element that is not off the shelf. The University of Colorado and University of Alaska Anchorage team is working to build this board as open hardware. It is not available to the public yet, but we intend to publish a reference design and would like to identify a manufacturer that might make it available for purchase through a university program and for developers, much like the DE0. Look for it in 2014, after we test our first revision. The team decided to build its own camera board for the following reasons:
    • Fully open design down to the signal level (for education)
    • Fully time-coordinated image capture from two or more cameras for accurate time and image registration
    • Reliable frame rates and buffer delays
    The most important aspect is that the camera data paths feed directly into an FPGA first in, first out (FIFO) buffer for concurrent, state-machine-driven pixel transformations, rather than relying on much more costly CPU-based transformation. The idea is to have the FPGA act as a computer vision coprocessor that performs as much of the low-level, pixel-by-pixel transformation as possible, much as a GPU offloads the low-level aspects of raster graphics. We think of the FPGA as a computer vision processing unit (CVPU).
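
Here is the minimal recording sketch referred to in the solid-state recording item above. It is not the project's reference software: it is a sketch that assumes OpenCV 3.x or later (the API shown here; the 2.x API differs slightly), a V4L2-compatible camera at index 0, and an illustrative flash-storage mount point.

    // record_flash.cpp: minimal sketch of solid-state recording on the
    // BeagleBoard xM (or any Linux host). Captures from the first V4L2
    // camera and writes MJPEG-compressed video to local flash storage.
    // The output path, resolution, and duration are illustrative assumptions.
    #include <opencv2/opencv.hpp>
    #include <cstdio>

    int main()
    {
        cv::VideoCapture cap(0);                        // first camera (e.g., /dev/video0)
        if (!cap.isOpened()) { std::fprintf(stderr, "no camera\n"); return 1; }

        cap.set(cv::CAP_PROP_FRAME_WIDTH, 640);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, 480);

        // MJPEG keeps CPU load modest; substitute another FOURCC for MPEG-4 encoding.
        cv::VideoWriter writer("/media/flash/capture.avi",
                               cv::VideoWriter::fourcc('M', 'J', 'P', 'G'),
                               30.0, cv::Size(640, 480));
        if (!writer.isOpened()) { std::fprintf(stderr, "no writer\n"); return 1; }

        cv::Mat frame;
        for (int i = 0; i < 30 * 60; ++i)               // roughly one minute at 30 fps
        {
            if (!cap.read(frame)) break;                // grab a frame from the webcam
            writer.write(frame);                        // compress and append to flash
        }
        return 0;
    }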

This is by no means an exhaustive list of components required for computational photometer technology, but you can find more to explore in Resources. The goal here is to reset your thinking on computational photography and photometry, to lower cost, and to make computational 3D and multi-spectral drop-in-place instrumentation more accessible for education and research.


Design concept for the computational photometer

In the embedded systems and Capstone programs at many universities, students often build stereo vision systems based on frame grabbers (and a VxWorks driver I adapted from Linux for the Bt878).

Capstone students build custom interface solutions as well, but the cameras are often difficult to integrate (at full frame rate), difficult to embed for mobile use, or simply difficult to deal with because they are proprietary hardware with poor documentation. Additional challenges of low-cost off-the-shelf webcams and mobile cameras include limited options for advanced optics configurations (most lenses are built in, and you have to live with them), awkward packaging, and frame encoding that makes computer vision a challenge (for example, MJPEG with no raw RGB or YUV output).

The old analog National Television System Committee (NTSC) cameras and frame grabbers like the Bt878 remained one of the better options, based on large-gamut NTSC color, consistent frame rate, low-level driver integration for direct memory access (DMA) and interrupts, and a huge number of options for optics and complementary metal-oxide semiconductor (CMOS)/charge-coupled device (CCD) detectors, priced from $10 (U.S.) up to $1,000 (U.S.), with clear optical advantages for the higher-cost cameras. The webcam is a poor replacement: proprietary, fixed optics; inconsistent frame rates and buffering; limited documentation; and pixel color formats (YCbCr, for example) that are lossy compared with full Alpha, Red, Green, and Blue (ARGB) color channels. Furthermore, for research and education, it would be ideal to feed captured pixel data directly through a FIFO, with data transformation provided by highly concurrent FPGA state-machine operations and with highly accurate two-channel stereo coordination, time-stamping, and registration.

None of this seems to be available off the shelf, so we started a project to build a two-channel analog camera capture board (using Texas Instruments TVP decoders) for the DE line of Altera FPGA boards. The goal of this project is to provide open hardware that the research and education community can use for computational photometry. Alternatives like Camera Link (see Resources) are available for higher-end Altera DE4 boards using daughter cards, but that solution is expensive and not great for students. The project's goal is a two-channel card that costs no more than a textbook.

Figure 1 shows the conceptual design for the open computational photometer. The key feature to note is that the analog cameras feed directly through dual Texas Instruments analog video decoders into the DE0 or DE2i I/O header and into an FPGA FIFO. This configuration allows the open computational photometer FPGA design to transform data in the FIFO and to produce associated metadata, such as timestamps, for cross-link on the DE2i over PCIe to the mobile Intel Atom multi-core microprocessor or over dual USB 2.0 links to the TI-OMAP BeagleBoard processors (or any Linux laptop, for that matter). The open computational photometer includes an update to the USB Video Class (UVC) driver to allow this custom binocular camera to interface easily with Linux. Much more technical information will be published in the future, and the reference design will be made available through a university program working with our industry sponsors. Exactly how this will work remains to be determined, but we are planning to test and integrate in spring 2014 and hope to release sometime in 2014.

Figure 1. The partially off-the-shelf open computational photometer: Design concept
Image showing the design concept for the partially off-the-shelf open computational photometer

So why not just use webcams, OpenCV, and Linux for the computational photometer?

OpenCV, the Open Computer Vision application programming interface (API) developed by Intel and released as open source, came about because of the observation that universities researching computer vision and interactive systems benefited greatly from reusable image-processing algorithms. Based on the numerous excellent references for OpenCV, I have created a stereo webcam example (see Downloads).

The example works, but as you'll probably see, the frame rate is not a deterministic 29.97Hz, such as color NTSC closed-circuit cameras provide; the frame rate tends to vary. Second, all of the image transformation is done after the data has been transferred through direct memory access (DMA) from the webcam to the Linux CPU. Assume that the left and right frames are roughly synchronized in time (probably a safe assumption unless it is a high-motion scene). The real problem with CPU-based Canny edge transformations (and even blurring, sharpening, and more advanced transformations like the Hough linear transform) is that CPUs weren't optimized for this processing. Much as a GPU offloads a CPU from rasterization of render data with vector processing and purpose-built multi-core stream processors, we envision a CVPU, and so does Khronos (OpenVX).

This purpose-built co-processing to offload OpenCV is a huge advantage. OpenCV has hardware acceleration features that use GPUs and general-purpose GPUs, but why not apply the processing directly on the path from the cameras rather than moving data from the camera to the CPU, out to a GPU, and back to the CPU? A coprocessor that directly interfaces with low-cost cameras is highly useful, and it is possible with Camera Link and DE4 daughter cards, but not for the cost of a textbook (comparable with a high-end webcam). For now, let's proceed with webcams and OpenCV on a Linux laptop (an approach that costs $60 at most, if you don't already have webcams) to explore more. I used Logitech C270 cameras (see Resources).
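
Before moving on, the following is a minimal sketch in the spirit of the stereo_capture.cpp download (not the download itself). It assumes OpenCV 3.x and two UVC webcams at device indexes 0 and 1; the Canny thresholds and frame count are illustrative. Printing the measured frame period makes the inconsistent webcam timing discussed above easy to observe.

    // stereo_sketch.cpp: grab roughly synchronized frames from two UVC webcams,
    // run a CPU-based Canny edge transform on each, and print the measured frame
    // period to illustrate inconsistent webcam frame rates (see Figure 4).
    // Camera indexes, thresholds, and frame count are illustrative assumptions.
    #include <opencv2/opencv.hpp>
    #include <cstdio>

    int main()
    {
        cv::VideoCapture left(0), right(1);             // two USB webcams
        if (!left.isOpened() || !right.isOpened()) return 1;

        cv::Mat frameL, frameR, grayL, grayR, edgesL, edgesR;
        double lastTick = (double)cv::getTickCount();

        for (int i = 0; i < 300; ++i)
        {
            left.grab(); right.grab();                  // trigger captures back to back
            left.retrieve(frameL); right.retrieve(frameR);

            cv::cvtColor(frameL, grayL, cv::COLOR_BGR2GRAY);
            cv::cvtColor(frameR, grayR, cv::COLOR_BGR2GRAY);
            cv::Canny(grayL, edgesL, 50.0, 150.0);      // CPU-based edge transform
            cv::Canny(grayR, edgesR, 50.0, 150.0);

            double now = (double)cv::getTickCount();
            std::printf("frame %d period %.1f ms\n", i,
                        1000.0 * (now - lastTick) / cv::getTickFrequency());
            lastTick = now;

            cv::imshow("left edges", edgesL);
            cv::imshow("right edges", edgesR);
            if (cv::waitKey(1) == 27) break;            // press Esc to quit early
        }
        return 0;
    }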

Frame acquisition and disparity image generation using OpenCV and webcams

Another challenge with webcam stereo vision is computation of intrinsics and extrinsics for the cameras, which is needed to compute distances to objects based on binocular disparity. This process is complicated and requires observations of reference scenes (the OpenCV chessboard) to calibrate camera pixel coordinates to physical coordinates, compute dimensions of the camera optics (detector size and focal length), and account for any out-of-plane orientation of the two focal planes separated on a baseline.

Figure 2 depicts the stereo vision ranging calculation for perfectly planar cameras (detectors that lie in the same plane, separated by a known baseline, with known and identical detector sizes and focal lengths). Non-linearities in the cameras, lens distortions (barrel or pincushion distortion, which produce fish-eye or hourglass scene effects), out-of-plane detectors, and differences in the camera detectors and focal lengths all cause significant error in the simple triangulation. Furthermore, for a webcam, you might have trouble finding the focal length and detector size (except by indirect characterization of the camera). Chapter 12 of Learning OpenCV and numerous excellent OpenCV references cover this in much more detail (see Resources).
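
For a sense of what the calibration step looks like in code, here is a compressed sketch of the OpenCV flow described above, in which chessboard corner detections from both cameras feed cv::stereoCalibrate. It assumes OpenCV 3.x; the board geometry, number of image pairs, and file names are illustrative assumptions, and Chapter 12 of Learning OpenCV covers the full procedure.

    // stereo_calib_sketch.cpp: compressed sketch of stereo calibration with OpenCV.
    // Assumes 20 pairs of left/right chessboard images are already on disk;
    // the 9x6 corner pattern and 25 mm squares are illustrative assumptions.
    #include <opencv2/opencv.hpp>
    #include <cstdio>
    #include <vector>

    int main()
    {
        const cv::Size boardSize(9, 6);                 // inner corners on the chessboard
        const float squareSize = 0.025f;                // square edge length in meters

        // One reference view of the physical board, lying in the Z = 0 plane.
        std::vector<cv::Point3f> board;
        for (int y = 0; y < boardSize.height; ++y)
            for (int x = 0; x < boardSize.width; ++x)
                board.push_back(cv::Point3f(x * squareSize, y * squareSize, 0.0f));

        std::vector<std::vector<cv::Point3f>> objectPoints;
        std::vector<std::vector<cv::Point2f>> pointsL, pointsR;
        cv::Size imageSize;

        for (int i = 0; i < 20; ++i)                    // image-pair file names are assumptions
        {
            cv::Mat imgL = cv::imread(cv::format("left%02d.png", i), cv::IMREAD_GRAYSCALE);
            cv::Mat imgR = cv::imread(cv::format("right%02d.png", i), cv::IMREAD_GRAYSCALE);
            if (imgL.empty() || imgR.empty()) continue;
            imageSize = imgL.size();

            std::vector<cv::Point2f> cornersL, cornersR;
            if (cv::findChessboardCorners(imgL, boardSize, cornersL) &&
                cv::findChessboardCorners(imgR, boardSize, cornersR))
            {
                pointsL.push_back(cornersL);
                pointsR.push_back(cornersR);
                objectPoints.push_back(board);
            }
        }

        // Solve for intrinsics (camera matrices, distortion) and extrinsics (R, T)
        // in one pass; passing 0 for flags lets OpenCV estimate the intrinsics too.
        cv::Mat M1 = cv::Mat::eye(3, 3, CV_64F), D1, M2 = cv::Mat::eye(3, 3, CV_64F), D2;
        cv::Mat R, T, E, F;
        double rms = cv::stereoCalibrate(objectPoints, pointsL, pointsR,
                                         M1, D1, M2, D2, imageSize, R, T, E, F, 0);
        std::printf("stereo calibration RMS reprojection error: %f\n", rms);
        return 0;
    }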

Ubiquitous imaging of all public areas

High-resolution satellite imagery combined with street-level cameras has enabled useful mobile applications such as Google Maps, Google Earth, Street View photo spheres, and Microsoft® Virtual Earth, among others (see Resources). I would be lost without Google Maps, as I'm sure many of us would be. This is simple, already-in-use machine-to-machine automation in which the Google cloud feeds geographic information system (GIS) data to my mobile device to correlate my position, the map, and locations of interest, with voice navigation (and maybe translation).

The concept of virtual tourism makes sense, and with more immersive and convincing devices such as Google Glass and Oculus Rift that can better fool our senses and blend reality with virtual presence or rendered worlds, no one needs to miss seeing the wonders of the world. To balance this euphoria, though, we must give careful thought to the line between well-intentioned monitoring and privacy invasion. The use of high-resolution and 3D imaging from unmanned aerial systems has been tied up in government deliberation for some time, not only because of privacy concerns but also because of safety concerns (see Resources).

Figure 2. Simple planar stereo triangulation
Image showing simple planar stereo triangulation

Figure 2 shows the computation of distance to an object registered on each camera detector with a common center pixel. (Some processing is required to find the common point. This topic is not covered in this article, but you can learn more in OpenCV samples/cpp/stereo_match.cpp.) The diagram is conceptual and works only for perfectly aligned cameras and focal planes (a feat that is not really possible), but it conveys the basic math used to derive distance to objects observed by a binocular camera, such as the one proposed here, or by two simple webcams. Chapter 12 of Learning OpenCV and stereo_match.cpp provide much better computational examples; in fact, stereo_match.cpp can produce a point cloud from a left and right image. A good model for the optical intrinsics and extrinsics of a binocular camera is a huge help, along with accurate alignment. But even with a better optical design, characterization and calibration of each camera is still required and is well supported by OpenCV.
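
Reduced to code, the idealized geometry in Figure 2 gives range as the focal length (in pixels) times the baseline, divided by the disparity. The tiny helper below is a worked example of that relationship only; the focal length, baseline, and disparity values are illustrative assumptions, not measured camera parameters.

    // Idealized planar stereo triangulation from Figure 2: for co-planar,
    // identical cameras, range Z = (focal length in pixels * baseline) / disparity.
    #include <cstdio>

    // focalPx: focal length expressed in pixels; baselineM: camera separation in meters;
    // disparityPx: left/right pixel offset of the common point.
    static double rangeMeters(double focalPx, double baselineM, double disparityPx)
    {
        if (disparityPx <= 0.0) return -1.0;            // unmatched point or at infinity
        return (focalPx * baselineM) / disparityPx;
    }

    int main()
    {
        // For example, a 700-pixel focal length, 10 cm baseline, and 35-pixel
        // disparity put the object at about 2 meters.
        std::printf("range = %.2f m\n", rangeMeters(700.0, 0.10, 35.0));
        return 0;
    }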

In stereo ranging, two cameras (or two eyes) see a common object offset between the left and right images, with the offset inversely proportional to the object's distance from the camera baseline. This is a major distance cue that allows fine judgment of distance for objects that are close and much less accurate judgment for objects that are far away. As shown in Figure 3, it is possible to compute a disparity image that shades a gray-map image based on the disparity between common points in the left and right images. The example has many errors resulting from misalignment, blurring in the right image because of a rapidly moving subject (my 3-year-old son), and lack of calibration, but it does at least produce a disparity estimate. With significantly more work to tune the disparity-algorithm parameters in OpenCV and to calibrate and align the cameras using tripods or an optical bench, we could get a good disparity map.

Figure 3. Right, left, and disparity image from webcam software
Image showing a right, left, and disparity image from webcam software
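
For reference, a disparity image like the one in Figure 3 can be produced with OpenCV's block matcher. The following minimal sketch assumes OpenCV 3.x and rectified left/right grayscale images; the file names and block-matching parameters are illustrative, and samples/cpp/stereo_match.cpp remains the fuller reference.

    // disparity_sketch.cpp: compute a gray-map disparity image, as in Figure 3.
    // Assumes (ideally rectified) left/right grayscale images on disk; file names
    // and block-matching parameters are illustrative assumptions.
    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::Mat left  = cv::imread("left.png",  cv::IMREAD_GRAYSCALE);
        cv::Mat right = cv::imread("right.png", cv::IMREAD_GRAYSCALE);
        if (left.empty() || right.empty()) return 1;

        // 64 disparity levels and a 15x15 matching block; tune for your scene and baseline.
        cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(64, 15);
        cv::Mat disparity16;                            // fixed-point result: 16 x true disparity
        bm->compute(left, right, disparity16);

        // Scale to an 8-bit gray map for display, as in Figure 3.
        cv::Mat disparity8;
        disparity16.convertTo(disparity8, CV_8U, 255.0 / (16.0 * 64.0));
        cv::imwrite("disparity.png", disparity8);
        return 0;
    }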

The main point of the stereo_capture.cpp example (in example-stereo.zip; see Downloads) is to enable you to experiment with and learn OpenCV, but also to underscore the value of a high-quality, low-cost binocular camera that is not proprietary and that includes a computational coprocessor. As described, webcam frame rates are often not reliable, as Figure 4 shows.

Figure 4. Low and inconsistent frame rates typical of webcams
Image showing low and inconsistent frame rates typical of webcams

The open hardware computational photometer

Open hardware is a relatively new idea, popularized by low-cost boards such as the BeagleBoard and Raspberry Pi, for producing affordable computing hardware with openly published designs. The Open Computational Photometer project at the University of Colorado and University of Alaska Anchorage intends to make 3D vision and multi-spectral photometry more affordable for education and research. Designed for Linux and open source software such as OpenCV, the photometer won't replace high-end computer and machine vision hardware, but it will make this kind of capability accessible to students, researchers, and product innovators who can help create tomorrow's vision of the world. Watch for our release in 2014 so that you can join us in this effort. Seeing is believing.

The goal of the computational photometer effort under way at the University of Colorado Boulder and University of Alaska Anchorage is to create a reference design for use by other researchers and educators interested in computer vision and in instrument building and design. The emphasis is on visible-spectrum 3D image capture, but we hope this work opens up more academic open hardware efforts that include multi-spectral computational photometers as well, especially infrared. Such a device could be useful, for example, for ice observation in the Alaskan arctic. Watch the comments section of this article for updates.


Conclusion

Drop-in-place advanced computer vision systems are valuable in safety and security applications. Join us in supporting open hardware and the Open Computational Photometer project through soon-to-be-announced university and developer programs. Even if our design does not work perfectly, much like open source Linux, everyone will have access to improve it. High-quality, affordable stereo computer vision will be accessible to all. The value, of course, goes beyond safety and security to include applications such as agriculture (crop-damage assessment from unmanned aerial vehicles and automated pesticide application, for example). For geophysical surveys and monitoring, the ability to create high-quality 3D images and models is valuable for monitoring waterways, volcanoes, forests, and ecosystems. For remote social interaction, these devices are valuable for telemedicine and social networking applications.

One challenge that remains is how to present and display 3D information. To that end, the Point Cloud Library offers options, as does a growing pool of consumer 3D displays. Explore these topics further, and consider offering suggestions for low-cost, hardware-accelerated computational photometry.
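
As a starting point for that exploration, the following minimal sketch turns a disparity image into a simple point cloud that Point Cloud Library tools (or most 3D viewers) can import. It assumes OpenCV 3.x, rectified input images, and a 4x4 reprojection matrix Q saved from cv::stereoRectify during calibration; the file names are illustrative assumptions.

    // pointcloud_sketch.cpp: produce a simple XYZ point cloud from stereo images.
    // Assumes rectified left/right grayscale images and a reprojection matrix Q
    // (from cv::stereoRectify) stored in a YAML file; file names are assumptions.
    #include <opencv2/opencv.hpp>
    #include <cstdio>
    #include <cmath>

    int main()
    {
        cv::Mat left  = cv::imread("left.png",  cv::IMREAD_GRAYSCALE);
        cv::Mat right = cv::imread("right.png", cv::IMREAD_GRAYSCALE);
        cv::Mat Q;
        cv::FileStorage fs("stereo_params.yml", cv::FileStorage::READ);
        fs["Q"] >> Q;
        if (left.empty() || right.empty() || Q.empty()) return 1;

        // Block-matching disparity, as in the earlier sketch, then reprojection to 3D.
        cv::Mat disparity16, disparity32, xyz;
        cv::StereoBM::create(64, 15)->compute(left, right, disparity16);
        disparity16.convertTo(disparity32, CV_32F, 1.0 / 16.0);    // undo the x16 fixed point
        cv::reprojectImageTo3D(disparity32, xyz, Q, true);         // per-pixel (X, Y, Z)

        // Dump valid points as plain "X Y Z" text that point-cloud tools can import.
        std::FILE *out = std::fopen("cloud.xyz", "w");
        if (!out) return 1;
        for (int y = 0; y < xyz.rows; ++y)
            for (int x = 0; x < xyz.cols; ++x)
            {
                cv::Vec3f p = xyz.at<cv::Vec3f>(y, x);
                if (std::isfinite(p[2]) && std::fabs(p[2]) < 1000.0f)  // drop unmatched pixels
                    std::fprintf(out, "%f %f %f\n", p[0], p[1], p[2]);
            }
        std::fclose(out);
        return 0;
    }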

The use of drop-in-place computational photometers and digital video analytics processing in the cloud is only getting started, and the promise of a safer and more secure world is exciting. At the same time, many people worry that there will never be privacy again. Keep in mind that the more than 7 billion people of the world are basically organic, binocular vision systems that invade your privacy every day. Sites such as EarthCam enable you to view the world from your web browser, and it's not clear how many Internet-connected cameras exist today. Join the discussion on vision (one of the most valued human senses) and the efforts to extend, replicate, or even improve on it. It all starts with better and smarter cameras.


Downloads

Description                     Name                   Size
OpenCV stereo camera examples   example-stereo.zip     96KB
Beagle xM Natty Linux Image     beagle-xm-image.zip    2397372KB

Resources

Learn

Get products and technologies

Discuss
