Voice Ingestion service

IBM® Surveillance Insight® for Financial Services processes voice data in the following formats:

  • WAV files in uncompressed PCM format: 16-bit little endian, 8 kHz sampling rate, mono
  • PCAP files and PCAP data that is captured directly from a network port
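
The WAV requirements in the list above can be checked before a file is submitted. The following is a minimal sketch that uses only the Python standard library; the function name is_supported_wav is hypothetical and is not part of the product.

```python
import wave


def is_supported_wav(path):
    """Return True if path is an uncompressed PCM, 16-bit, 8 kHz, mono WAV file.

    Hypothetical helper for illustration only. RIFF/WAV sample data is
    little endian by definition, so byte order is not checked separately.
    """
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getframerate() == 8000          # 8 kHz sampling rate
                and wav.getnchannels() == 1         # mono
                and wav.getsampwidth() == 2         # 16-bit samples (2 bytes)
                and wav.getcomptype() == "NONE"     # uncompressed PCM
            )
    except (wave.Error, EOFError):
        # The file is not a WAV file that the wave module can read.
        return False
```

Files that fail this check are not necessarily rejected by the service, because incoming audio is converted during ingestion; the check only confirms that a file already matches the target format.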

The voice ingestion service accepts multipart requests from the user. The multipart requests should contain the following parts:

  • A part named metadata that contains the metadata JSON
  • A part named audiofile that contains the audio binary data
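
A request with these two parts can be assembled by hand. The following is a minimal sketch that uses only the Python standard library. The function name build_multipart and its parameters are hypothetical, and the service URL is not given in this section, so sending the request is left to the caller.

```python
import json
import uuid


def build_multipart(metadata, audio_bytes, audio_name, boundary=None):
    """Build a multipart/form-data body with 'metadata' and 'audiofile' parts.

    Hypothetical helper for illustration only. Returns the Content-Type
    header value and the encoded request body.
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    crlf = b"\r\n"
    parts = [
        ("metadata", "meta.json", json.dumps(metadata).encode("utf-8")),
        ("audiofile", audio_name, audio_bytes),
    ]
    body = b""
    for name, filename, payload in parts:
        disposition = (
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'
        )
        body += delimiter + crlf
        body += disposition.encode("utf-8") + crlf
        body += b"Content-Type: application/octet-stream" + crlf + crlf
        body += payload + crlf
    # The closing delimiter carries two trailing hyphens.
    body += delimiter + b"--" + crlf
    return f"multipart/form-data; boundary={boundary}", body
```

The returned header value and body can then be passed to any HTTP client, for example urllib.request.Request with the Content-Type header set to the first return value.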

The voice metadata JSON is parsed to get the call start date and the gcid (GlobalCommId) values. These values are used to determine where the converted audio file is persisted on HDFS. The voice metadata JSON is then published to the Kafka topic for further processing by the Voice Streams application. The incoming audio file can be in WAV, MP3, or OPUS format. The MP4, M4A, M4P, M4B, M4R, and M4V formats are not supported.
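
The parsing step can be sketched as follows. The field names CallStartTimeStamp and GlobalCommId come from the sample metadata in this section. The directory layout, the /voice root, and the function name storage_key are assumptions made for illustration; the actual HDFS layout used by the service is not described here.

```python
import json
from datetime import datetime


def storage_key(metadata_json, root="/voice"):
    """Derive a storage directory from the call start date and the gcid.

    Hypothetical helper: the <root>/<date>/<gcid> layout is an assumption,
    not the documented behavior of the service.
    """
    meta = json.loads(metadata_json)
    # Timestamps in the sample metadata use the form "2017-04-13 11:18:20".
    start = datetime.strptime(meta["CallStartTimeStamp"], "%Y-%m-%d %H:%M:%S")
    gcid = meta["GlobalCommId"]
    return f"{root}/{start:%Y-%m-%d}/{gcid}"
```

A missing field raises KeyError and a malformed timestamp raises ValueError, so a caller can reject bad metadata before any audio is stored.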

The audio file is converted by using the ffmpeg utility to a WAV file in uncompressed PCM, 16-bit little endian, 8 kHz sampling, mono format. For example, if an audio file named call001.mp3 is passed to the Voice Ingestion service, the file is converted to call001.wav and persisted on HDFS.
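
An equivalent conversion can be sketched as an ffmpeg command line. The options shown (-acodec pcm_s16le, -ar 8000, -ac 1) are standard ffmpeg options that produce the target format; the exact command that the service runs is not documented here, and the function name ffmpeg_command is hypothetical.

```python
from pathlib import Path


def ffmpeg_command(src):
    """Return an ffmpeg argument list that converts src to 8 kHz mono PCM WAV.

    Hypothetical helper for illustration only. The output keeps the input
    name with a .wav extension, so call001.mp3 becomes call001.wav. An input
    that already ends in .wav would need a different output name, because
    ffmpeg cannot write to the file that it is reading.
    """
    dst = str(Path(src).with_suffix(".wav"))
    return [
        "ffmpeg",
        "-i", src,               # input file in WAV, MP3, or OPUS format
        "-acodec", "pcm_s16le",  # uncompressed PCM, 16-bit little endian
        "-ar", "8000",           # 8 kHz sampling rate
        "-ac", "1",              # mono
        dst,
    ]
```

The list can be passed to subprocess.run with check=True on a system where ffmpeg is installed.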

A sample voice data set that consists of audio files and metadata JSON is provided to help you ingest voice files. Use the following command to run the ingestion script:

./processvoice.sh

The following is a sample voice ingestion service multipart request:

HEADERS

Content-Type: multipart/form-data; boundary=----------------------------e73b4c199aee


BODY

------------------------------e73b4c199aee
Content-Disposition: form-data; name="metadata"; filename="meta.json"
Content-Type: application/octet-stream
 
{
	"Initiator": {
		"ContactID": "(+1)-204-353-7282",
		"Name": "Chris Brown",
		"DeviceID": "dev004",
		"CallStartTimeStamp": "2017-04-13 11:18:20",
		"CallEndTimeStamp": "2017-04-13 11:19:26"
	},
	"Participants": [{
		"ContactID": "(+1)-687-225-8261",
		"Name": "Jaxson Armstrong",
		"DeviceID": "dev002",
		"CallStartTimeStamp": "2017-04-13 11:18:20",
		"CallEndTimeStamp": "2017-04-13 11:19:26"
	}, {
		"ContactID": "(+1)-395-309-9915",
		"Name": "Henry Bailey",
		"DeviceID": "dev003",
		"CallStartTimeStamp": "2017-04-13 11:18:20",
		"CallEndTimeStamp": "2017-04-13 11:19:26"
	}],
	"ContactIDType": "phone",
	"AudioFileName": "File1.wav",
	"CallStartTimeStamp": "2017-04-13 11:18:20",
	"CallEndTimeStamp": "2017-04-13 11:19:26",
	"GlobalCommId": "gcid100906524390995"
}

------------------------------e73b4c199aee
Content-Disposition: form-data; name="audiofile"; filename="File_2.mp3"
Content-Type: application/octet-stream

ID3......vTSS....Logic 10.1.0COM..h.engiTunNORM. 00000284 00000287 00004435 000043F8 00005A00 00005A00 00007D9C 00007E47 0000715E 0000715E.COM....engiTunSMPB. 00000000
............................
................................................................
------------------------------e73b4c199aee--