Editor’s note: This post was written by Zvi Kons, a researcher in the Speech Technologies group at IBM Research – Haifa.
To gather, process, analyze, and ultimately separate useful sound from white noise, my team at IBM Research – Haifa is working on new technology for searchable audio analysis as part of the EU-funded SMART project (Search Engine for Multimedia Environment Generated Content).
Capturing the sights and sounds of city streets to gain insight
Our team collected data from two locations in Santander, Spain. Because the municipality is a partner in the SMART project, it offered to support the technical aspects of the required infrastructure and is helping test the technology. Cameras and microphones set up in the town square and market area recorded normal daily activity continuously for one month, yielding more than 1,000 hours of audio and visual data. We analyzed the sounds to note various types of activity and to identify patterns and anomalies, such as peak hours for crowds in the market square, traffic, and special events.
Santander city square
Visual representation of weekly audio from the city square
Audio from the video above, along with other recordings, produced this diagram of the weekly crowd activity level: blue indicates low activity and red indicates high activity.
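To make the idea of a weekly activity diagram concrete, here is a minimal sketch of how timestamped audio levels could be folded into a day-by-hour grid. It is not the SMART team's actual pipeline: the function names `rms_db` and `weekly_activity_grid` are hypothetical, and using the RMS level as a stand-in for "crowd activity" is an assumption made purely for illustration.

```python
import math
from datetime import datetime

def rms_db(samples):
    """Return the RMS level in dB of a block of samples scaled to [-1, 1]."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    # The floor avoids log10(0) for blocks of digital silence.
    return 10.0 * math.log10(max(mean_square, 1e-12))

def weekly_activity_grid(blocks):
    """Average block levels into a 7 x 24 grid (weekday x hour of day).

    blocks: iterable of (datetime, level_db) pairs.
    Cells with no data are None. Averaging dB values directly is a
    simplification chosen for readability (assumption, not the source's method).
    """
    sums = [[0.0] * 24 for _ in range(7)]
    counts = [[0] * 24 for _ in range(7)]
    for timestamp, level_db in blocks:
        day, hour = timestamp.weekday(), timestamp.hour  # Monday is 0
        sums[day][hour] += level_db
        counts[day][hour] += 1
    return [
        [sums[d][h] / counts[d][h] if counts[d][h] else None for h in range(24)]
        for d in range(7)
    ]
```

Each cell of the returned grid could then be mapped to a color scale running from blue (low) to red (high) to produce a picture like the one described above.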
Another sample revealed a day with unusual crowd noise, music, and applause. Cross-referencing with video footage from nearby street cameras showed that the sounds came from a protest rally on a nearby street. Information like this could be important for assessing an immediate security risk or deciding whether to send a news team to report on a developing story.
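One simple way to flag an unusual day like this is to compare each day's average level against the rest of the period. The sketch below uses a plain z-score test. This is a generic statistical technique chosen for illustration, not the method used in the SMART project, and the function name `unusual_days`, the threshold of 2.0, and the sample values are all hypothetical.

```python
from statistics import mean, stdev

def unusual_days(daily_levels, threshold=2.0):
    """Return indices of days whose level deviates strongly from the mean.

    daily_levels: one average level (for example, in dB) per day.
    threshold: number of sample standard deviations that counts as unusual
               (hypothetical default, to be tuned on real data).
    """
    if len(daily_levels) < 2:
        return []
    mu = mean(daily_levels)
    sigma = stdev(daily_levels)
    if sigma == 0:
        return []  # every day is identical, so nothing stands out
    return [
        i for i, level in enumerate(daily_levels)
        if abs(level - mu) / sigma > threshold
    ]

# Made-up example: one week of daily levels with a much louder sixth day.
week = [-30.0, -31.0, -29.0, -30.0, -30.0, -12.0, -31.0]
flagged = unusual_days(week)
```

A flagged day is only a starting point. As in the rally example, it still needs to be cross-referenced with other sources, such as camera footage, before drawing conclusions.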
Applying audio analytics
The sounds of privacy
To address potential privacy and legal issues, the SMART team used wide angles and low resolution for the video cameras. The microphones were placed at a distance to pick up crowd noise rather than intelligible speech or individual conversations.
Our research highlights the enormous potential of easily accessible information in our physical surroundings. The technology to use that information has exciting and practical applications for smart cities, with innovative ways to interpret sounds and images.