Developing cognitive IoT solutions for anomaly detection by using deep learning, Part 3

Using Deeplearning4j for anomaly detection

Create a deep learning neural network on Apache Spark with Deeplearning4j

This content is part 3 of 5 in the series: Developing cognitive IoT solutions for anomaly detection by using deep learning.

Stay tuned for additional content in this series.

In the first article in this series, Introducing deep learning and long-short term memory networks, I spent some time introducing concepts about deep learning and neural networks. I also described a demo use case on anomaly detection for IoT time-series data. Our task is to detect anomalies in the vibration (accelerometer) sensor data from a bearing, as shown in Figure 1.

Figure 1. Accelerometer sensor on a bearing records vibrations on each of the three geometrical axes x, y, and z

Because it is hard to take such a system with you, I generated test data by using a physical Lorenz Attractor model, which is capable of generating a three-dimensional data stream. I used the generated data in this demo to detect anomalies and to predict when a bearing is about to break.

We'll need to do some development environment setup, but an overview of the process is as follows:

  • Test data is generated by a Node-RED flow running in the IBM Cloud (or alternatively on an IoT gateway like a Raspberry Pi to simulate an even more realistic scenario).
  • The Watson IoT Platform Service is used as the MQTT message broker (also running in the cloud).
  • Eclipse, installed on your desktop and running a deep learning system, subscribes to the data on the MQTT message broker.

How are we going to deploy the Node-RED test data to the IBM Cloud platform? Which deep learning system are we going to use? Several different technologies exist for implementing a deep learning system. As mentioned, these open standard and open source solutions can run in the IBM Cloud: Deeplearning4j, Apache SystemML, and TensorFlow (TensorSpark). This article presents the Deeplearning4j solution.

What you’ll need to build your app

  • An IBM Cloud account. (Sign up for an IBM Cloud Lite account, a free account that never expires.)
  • Eclipse (an integrated development environment (IDE) for JVM-based languages).
  • Eclipse Maven Plugin (dependency management and automated build tool).
  • Eclipse Scala Plugin (programming language).
  • Eclipse GIT Plugin (version control system).

Setting up your development environment

Before we talk about the deep learning use case, spend some time setting up your development environment.

  1. Install Eclipse Oxygen. Select the IDE for Java Developers.
  2. Install the Eclipse Maven Plugin.
  3. Install the Eclipse Scala Plugin as described for Scala 2.10.
  4. Install the Eclipse GIT Plugin.
  5. Follow the instructions in the getting started docs of my deeplearning4j GitHub repo to import the source code for this tutorial.
  6. Finalize the setup.
    1. Switch to the Scala Perspective. Right-click the dl4j-examples-spark project, and then click Configure > Add Scala Nature.
    2. Right-click the dl4j-examples-spark project again, and then click Maven > Update Project.
      Note: Ignore the Maven errors. As long as Run.scala compiles without error you are fine!
    3. Update src/resources/ibm_watson_iot_mqtt.properties with the credentials of the IBM Watson IoT Platform. Specify the Organization-ID, Authentication-Method (apikey), API-Key, and Authentication-Token. You noted these credentials in my "Generating data for anomaly detection" article.
  7. Run the Scala application to test the connection.
    1. Open the Eclipse package explorer.
    2. In the dl4j-examples-scala project expand the src/main/scala folder.
    3. Find the Run.scala file, right-click the file, and select Run As > Scala Application.

      You should see the following output as shown in Figure 2.

      Figure 2. Scala application output

      Note: Ignore warnings that the Vfs.Dir is not found. Those are only warnings and don't affect the behavior of the application.

Congratulations, the most important part is working. Stop the application by clicking the red Stop button in the upper right of the window as shown in Figure 2. We will run this application again during a later stage in the article.

What is Deeplearning4j?

Deeplearning4j is a Java-based, open-source, distributed deep learning toolkit that runs in many different environments, including Apache Spark. Deeplearning4j does not need any additional components to be installed because it is a native Apache Spark application that uses the interfaces Apache Spark provides.

The most important components of the framework for this article are:

  • The Deeplearning4j runtime is the core module. With this runtime module, you can define and execute all sorts of neural networks on top of Apache Spark; the actual computations don't run on Spark directly but in a tensor library.
  • ND4J is a scientific computing library for the JVM. This tensor library is really the heart of Deeplearning4j. It can be used stand-alone and provides accelerated linear algebra on top of CPUs and GPUs. Porting code to a GPU requires no code changes because a JVM property configures the underlying execution engine, which can also be a CUDA backend for NVIDIA GPU cards.

ND4J is a tensor and linear algebra library. This means multidimensional arrays (also called tensors) and operations on them are its main purpose. The operations are simple but fast. The advantages of using ND4J are:

  • When using Apache Spark, you stay in the same JVM process and don't have to pay the overhead of interprocess communication (IPC).
  • ND4J is capable of using SIMD instruction sets on modern CPUs, which doubles its performance over other tensor libraries such as NumPy. This is achieved by using OpenBLAS, an open-source implementation of the Basic Linear Algebra Subprograms (BLAS) API.
  • ND4J can take advantage of GPUs present on your machine by just setting a system property on the JVM (provided a recent version of the CUDA drivers and framework is installed on your system).

How can ND4J take advantage of the GPUs? Look at this Scala syntax to understand how it works.

import org.nd4j.linalg.factory.Nd4j
import org.nd4j.linalg.api.ndarray.INDArray
var v: INDArray = Nd4j.create(Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d)))
var w: INDArray = Nd4j.create(Array(Array(1d, 2d), Array(3d, 4d), Array(5d, 6d)))
print(v.mmul(w))

As you can see, I created two matrices v and w of type INDArray using the Nd4j.create method. I provided a nested Scala array of type double, which I can create inline like this:

Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d))

The code v.mmul(w) triggers the matrix multiplication, which runs either on a CPU or a GPU; this is totally transparent to us.
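For the two matrices above (a 2x3 matrix multiplied by a 3x2 matrix), the result is a 2x2 matrix. The printed output should look roughly like the following; the exact formatting depends on the ND4J version:

[[22.00, 28.00],
[49.00, 64.00]]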

Practice training a neural network using the XOR operation

Now that you can see what ND4J can do, I want to show you how to create a neural network. Before we move on to our IoT time-series data, let's start with an XOR example. First, using Scala, generate some training data inline:

/*
* List of input values: 4 training samples with data for 2 input-neurons each.
*/
var input: INDArray = Nd4j.zeros(4, 2)
/*
* Corresponding list with expected output values, 4 training samples with
* data for 2 output-neurons each.
*/
var labels: INDArray = Nd4j.zeros(4, 2);
/*
* Create first data set when first input=0 and second input=0.
*/
input.putScalar(Array(0, 0), 0);
input.putScalar(Array(0, 1), 0);
/*
* Then the first output fires for false, and the second is 0 (see class comment).
*/
labels.putScalar(Array(0, 0), 1);
labels.putScalar(Array(0, 1), 0);
/*
* When first input=1 and second input=0.
*/
input.putScalar(Array(1, 0), 1);
input.putScalar(Array(1, 1), 0);
/*
* Then XOR is true, therefore the second output neuron fires.
*/
labels.putScalar(Array(1, 0), 0);
labels.putScalar(Array(1, 1), 1);
/*
* Same as above.
*/
input.putScalar(Array(2, 0), 0);
input.putScalar(Array(2, 1), 1);
labels.putScalar(Array(2, 0), 0);
labels.putScalar(Array(2, 1), 1);
/*
* When both inputs fire, XOR is false again. The first output should fire.
*/
input.putScalar(Array(3, 0), 1);
input.putScalar(Array(3, 1), 1);
labels.putScalar(Array(3, 0), 1);
labels.putScalar(Array(3, 1), 0);

We have now created two ND4J arrays: one called input containing the features, and one called labels containing the expected outcomes. As a reminder, see the inputs and outputs in Table 1.

Table 1. XOR function table inputs and outputs
Input 1 | Input 2 | Output
0       | 0       | 0
0       | 1       | 1
1       | 0       | 1
1       | 1       | 0

Note: The output is 1 only if exactly one input is 1.
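If you want to double-check the table, the following one-liner reproduces it with Scala's bitwise XOR operator:

for (a <- 0 to 1; b <- 0 to 1) println(s"$a XOR $b = ${a ^ b}")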

Now, let's use the data we've created above for neural network training.

import org.nd4j.linalg.dataset.DataSet

var ds: DataSet = new DataSet(input, labels)
print(ds)

The DataSet class, not to be confused with the Dataset from Apache Spark SQL, is a Deeplearning4j data structure containing ND4J arrays for training. Here is what the internal mathematical representation of this data set looks like:

===========INPUT===================
[[0.00, 0.00],
[1.00, 0.00],
[0.00, 1.00],
[1.00, 1.00]]
===========OUTPUT==================
[[1.00, 0.00],
[0.00, 1.00],
[0.00, 1.00],
[1.00, 0.00]]

This array reflects the structure of the Table 1 XOR function table with two differences:

  • ND4J uses float as its internal data type representation.
  • The output is in binary form, that is, a two-dimensional array. Two-dimensional arrays are very handy for training binary classifiers with neural networks: binary classification is done with two output neurons instead of one, and after training, each output neuron gives the probability that a sample belongs to its class (see the sketch below).
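As a small illustration, a class index can be turned into such a one-hot label row as follows. This helper is hypothetical and not part of the example code; it only shows the two-output-neuron encoding used above.

def oneHotLabel(classIndex: Int, numClasses: Int = 2): INDArray = {
  // One row, one column per class; the column for the given class is set to 1.
  val row = Nd4j.zeros(1, numClasses)
  row.putScalar(Array(0, classIndex), 1d)
  row
}
// oneHotLabel(0) => [1.00, 0.00]  (XOR result is false)
// oneHotLabel(1) => [0.00, 1.00]  (XOR result is true)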

1. Create a Deeplearning4j neural network for XOR

Still using our XOR inputs and outputs, we will define and create a neural network in Deeplearning4j with the NeuralNetConfiguration.Builder class.

You can find all of the code that I discuss in the following sections in the XOrExampleScala class.

1a. Set the global parameters

This code sets the global parameters of the neural network. Digging into each of those parameters is beyond the scope of this article.

/*
* Set up network configuration.
*/
var builder: NeuralNetConfiguration.Builder = new NeuralNetConfiguration.Builder();

/*
* How often should the training set be run? We need something above
* 1000, or a higher learning-rate; found this value just by trial and error.
*/
builder.iterations(10000);

/*
* Learning rate.
*/
builder.learningRate(0.1);

/*
* Fixed seed for the random generator. Any run of this program
* brings the same results. Might not work if you do something like ds.shuffle()
*/
builder.seed(123);

/*
* Not applicable as this network is too small, but for bigger networks it
* can help make the network less prone to overfitting the training data.
*/
builder.useDropConnect(false);

/*
* A standard algorithm for moving on the error-plane. This one works
* best for me. LINE_GRADIENT_DESCENT or CONJUGATE_GRADIENT can do the
* job, too. Which one matches your problem best is an empirical question.
*/
builder.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT);

/*
* Initialize the bias with 0; empirical value, too.
*/
builder.biasInit(0);

/*
* From "http://deeplearning4j.org/architecture": The networks can
* process the input more quickly and more accurately by ingesting
* minibatches of 5-10 elements at a time in parallel.
* This example runs better without, because the data set is smaller than
* the minibatch size.
*/
builder.miniBatch(false);

/*
* Create a multilayer network with two layers (including the output layer, excluding the input layer)
*/
var listBuilder: ListBuilder = builder.list();
var hiddenLayerBuilder: DenseLayer.Builder = new DenseLayer.Builder();

/*
* Two input connections simultaneously define the number of input
* neurons, because this is the first non-input layer.
*/
hiddenLayerBuilder.nIn(2);

/*
* Number of outgoing connections, nOut simultaneously defines the
* number of neurons in this layer.
*/
hiddenLayerBuilder.nOut(4);

/*
* Put the output through the sigmoid function, to cap the output
* value between 0 and 1.
*/
hiddenLayerBuilder.activation(Activation.SIGMOID);

/*
* Randomly initialize the weights with values between 0 and 1.
*/
hiddenLayerBuilder.weightInit(WeightInit.DISTRIBUTION);
hiddenLayerBuilder.dist(new UniformDistribution(0, 1));

1b. Set the neural network layers

After setting the global parameters, we next need to add individual neural network layers to form a deep neural network. This code adds two layers to the neural network: a hidden layer that takes two inputs (one for each input column of the XOR function table shown in Table 1) and an output layer with two neurons, one for each class (because we have the outcomes zero and one in the XOR function table).

Note: We can specify an abundance of layer-specific parameters, but this is beyond the scope of this article.

/*
* Build and set as layer 0.
*/
listBuilder.layer(0, hiddenLayerBuilder.build());

/*
* MCXENT or NEGATIVELOGLIKELIHOOD (both are mathematically equivalent) work for this example.
* This function calculates the error value (or 'cost' or 'loss function value') and quantifies
* the goodness or badness of a prediction in a differentiable way.
* For classification (with mutually exclusive classes, like here), use multiclass cross entropy
* in conjunction with the softmax activation function.
*/
var outputLayerBuilder: Builder = new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD);

/*
* Must equal the number of neurons in the previous layer.
*/
outputLayerBuilder.nIn(4);

/*
* Two neurons in this layer.
*/
outputLayerBuilder.nOut(2);
outputLayerBuilder.activation(Activation.SOFTMAX);
outputLayerBuilder.weightInit(WeightInit.DISTRIBUTION);
outputLayerBuilder.dist(new UniformDistribution(0, 1));
listBuilder.layer(1, outputLayerBuilder.build());

1c. Create the neural network

You have the global parameters and the neural network layers, now create the neural network.

/*
* No pretrain phase for this network.
*/
listBuilder.pretrain(false);

/*
* Seems to be mandatory
* according to agibsonccc: You typically only use that with
* pretrain(true) when you want to do pretrain/finetune without changing
* the previous layers' fine-tuned weights; that's for autoencoders and restricted Boltzmann machines (RBMs).
*/
listBuilder.backprop(true);

/*
* Build and initialize the network and check if everything is configured correctly.
*/
var conf: MultiLayerConfiguration = listBuilder.build();
var net: MultiLayerNetwork = new MultiLayerNetwork(conf);
net.init();

1d. Train the neural network with XOR data

Now the net variable contains our ready-made neural network, and the only thing we have to do to train it with our XOR function table is to run the following code.

net.fit(ds)

If we now look at the output (sysout), we see debug messages that show how the learning progresses.

08:52:56.714 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 400 is 0.6919901371002197
08:52:56.905 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 500 is 0.6902942657470703
08:52:57.085 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 600 is 0.6845208406448364
....
08:53:11.720 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 9700 is 0.0012604787480086088
08:53:11.847 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 9800 is 0.0012446331093087792
08:53:11.994 [main] INFO o.d.o.l.ScoreIterationListener - Score at iteration 9900 is 0.001229131012223661

As you can see, the neural network is trained for 10,000 iterations (basically, the very same data set is shown to the neural network multiple times), and every 100 iterations a measure called score is printed. This score is the value of the loss function, a measure of how well the neural network fits the data; the lower, the better. As you can observe, after nearly 10,000 iterations the score went down to 0.001229131012223661, which is a very good value in this case.
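The scores above are printed by a training listener. Here is a minimal sketch of how such a listener is attached, assuming a print frequency of 100 iterations; the XOrExampleScala class in the repo may use a different frequency:

import org.deeplearning4j.optimize.listeners.ScoreIterationListener

// Print the current score every 100 training iterations.
net.setListeners(new ScoreIterationListener(100))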

1e. Evaluate how well the training went

We can check how well we are actually doing, because Deeplearning4j has a built-in component for evaluation.

/*
* Compute the network's predictions for the training features (depending on the
* Deeplearning4j version, ds.getFeatures() can be used instead of ds.getFeatureMatrix()).
*/
var output: INDArray = net.output(ds.getFeatureMatrix())

/*
* Let Evaluation print stats on how often the right output had the correct label.
*/
var eval: Evaluation = new Evaluation(2);
eval.eval(ds.getLabels(), output);
println(eval.stats());

The code outputs the following measures on prediction (classification) performance:

==========================Scores=========================
Accuracy:1
Precision: 1
Recall: 1
F1 Score: 1
=========================================================

Getting a 1 for all measures means that we have scored 100% and have built a perfect classifier to compute XOR.

2. Create a Deeplearning4j neural network for anomaly detection

Learning how to train a neural network using XOR as an example was educational, but now we need to build something useful on Apache Spark with Deeplearning4j using a generated data set. Remember we used a Lorenz Attractor model to get simulated real-time vibration sensor data. And we need to get that data to the IBM Cloud platform; see my "Generating data for anomaly detection" article for the steps.

I'm using Scala because not only is it similar to Java, it is also considered a data science language. This example consists of three Scala classes.

  • WatsonIoTConnector is responsible for subscribing to real-time data from the MQTT message broker.
  • IoTAnomalyExampleLSTMFFTWatsonIoT contains the actual neural network configuration.
  • Run contains the glue code between the WatsonIoTConnector and the IoTAnomalyExampleLSTMFFTWatsonIoT.

2a. Subscribe to the IBM Watson IoT Platform with MQTT to ingest the IoT sensor data stream in real-time

Let's start with the WatsonIoTConnector. I'm only showing the relevant code here, but you can download the complete code from my GitHub repo, dl4j-examples.

First, create an MQTT application client to subscribe to a MQTT sensor data stream.

val props = new Properties()
props.load(getClass.getResourceAsStream("/ibm_watson_iot_mqtt.properties"))
val myClient = new ApplicationClient(props)
myClient.connect
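
The ibm_watson_iot_mqtt.properties file holds the credentials you noted in the "Generating data for anomaly detection" article. A sketch of its contents, with placeholder values that you must replace with your own:

Organization-ID = <your-organization-id>
Authentication-Method = apikey
API-Key = <your-api-key>
Authentication-Token = <your-authentication-token>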

Now you can subscribe to so-called device events. Because we probably don't want to receive all the traffic that is going over the message bus, we filter it right away. This is a very smart way of decoupling the sensors attached to IoT devices and gateways from the actual analytics applications, because they don't have to know anything about each other anymore. So how do we react to incoming data? With a callback handler that is set on the ApplicationClient instance myClient.

myClient.setEventCallback(eventbk)

Look at this event handler defined in the Run class.

object MyEventCallback extends EventCallback {

The first thing we do is to create a fifo variable to store a tumbling count window of events.

var fifo: Queue[Array[Double]] = new CircularFifoQueue[Array[Double]](windowSize)

Next, we implement the processEvent method, which is called whenever a message arrives from the MQTT queue.

override def processEvent(arg0: Event) {

Now convert the event to an array of type double and add it to the fifo object.

val json = arg0.getData().asInstanceOf[JsonObject]
def conv = { v: Object => v.toString.toDouble }
val event: Array[Double] = Array(conv(json.get("x")), conv(json.get("y")), conv(json.get("z")))
fifo.add(event);

After our tumbling count window is filled, we apply a fast Fourier transform (FFT) to obtain the frequency spectrum of the signals and finally transform the result into an INDArray, an internal Deeplearning4j data type.

val ixNd = Nd4j.create(fifo.toArray(Array.ofDim[Double](windowSize, 3)));
def xtCol = { (x: INDArray, i: Integer) => x.getColumn(i).dup.data.asDouble }
val fftXYZ = Nd4j.hstack(Nd4j.create(fft(xtCol(ixNd, 0))), Nd4j.create(fft(xtCol(ixNd, 1))), Nd4j.create(fft(xtCol(ixNd, 2))))
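
The fft helper used above comes from the example code in the repo. For reference, here is a minimal, unoptimized sketch of what such a function could look like, assuming it returns the magnitude spectrum of a real-valued signal as Array[Double]; the actual implementation in the repo may differ:

def fft(signal: Array[Double]): Array[Double] = {
  // Naive discrete Fourier transform returning the magnitude of each frequency bin.
  val n = signal.length
  (0 until n).map { k =>
    val (re, im) = signal.zipWithIndex.foldLeft((0.0, 0.0)) { case ((r, i), (x, t)) =>
      val angle = -2.0 * math.Pi * k * t / n
      (r + x * math.cos(angle), i + x * math.sin(angle))
    }
    math.sqrt(re * re + im * im)
  }.toArray
}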

Now it's time to instantiate the neural network.

val lstm: IoTAnomalyExampleLSTMFFTWatsonIoT = new IoTAnomalyExampleLSTMFFTWatsonIoT(windowSize * 6)

After this is done, we can actually send our tumbling count window downstream to the neural network to detect anomalies.

Note: Training and anomaly detection take place at the same time because the neural network continuously learns what normal data looks like; once it sees anomalous data, the reconstruction error rises significantly.

lstm.detect(fftXYZ)

2b. Create the deep neural network LSTM auto-encoder for anomaly detection

But how does this magic happen? Let's have a look at our neural network implementation in IoTAnomalyExampleLSTMFFTWatsonIoT:

val conf = new NeuralNetConfiguration.Builder()
.seed(12345)
.iterations(1)
.weightInit(WeightInit.XAVIER)
.updater(Updater.ADAGRAD)
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
.learningRate(learningRate)
.regularization(true)
.l2(0.0001)
.list()

First, we set global parameters for the neural network, such as the learning rate. Then it's time to add the actual layers. We'll start with a long short-term memory (LSTM) layer, the layer responsible for recognizing temporal patterns in our IoT time-series sensor data stream.

.layer(0, new GravesLSTM.Builder().activation(Activation.TANH).nIn(windowSize).nOut(10)
.build())

To detect anomalies, it is crucial to use an autoencoder, which we'll add as the second layer.

.layer(1, new VariationalAutoencoder.Builder()
.activation(Activation.LEAKYRELU)
.encoderLayerSizes(256, 256) 
//2 encoder layers, each of size 256
.decoderLayerSizes(256, 256) 
//2 decoder layers, each of size 256
.pzxActivationFunction(Activation.IDENTITY) 
//p(z|data) activation function
//Bernoulli reconstruction distribution + sigmoid activation - for modelling binary data (or data in range 0 to 1)
.reconstructionDistribution(new BernoulliReconstructionDistribution(Activation.SIGMOID))
.nIn(10) //Input size: 10, matching the nOut of the LSTM layer
.nOut(10) //Size of the latent variable space p(z|x)
.build())

Finally, we conclude with an output layer and we are done.

.layer(2, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
      .activation(Activation.IDENTITY).nIn(10).nOut(windowSize).build())

2c. Run the neural network on a single, local machine

Now let's first have a look at what a single-node configuration would look like.

val net = new MultiLayerNetwork(conf)

That's all. We just use the configuration and obtain a neural network object that we can train. Going from a single node to Apache Spark is actually really easy in Deeplearning4j.

val tm = new ParameterAveragingTrainingMaster.Builder(batchSizePerWorker)
.averagingFrequency(5)
.workerPrefetchNumBatches(2)
.batchSizePerWorker(16)
.build();

val net = new SparkDl4jMultiLayer(sc, conf, tm);

2d. Parallelize this neural network using Apache Spark

Let's skip the TrainingMaster for now and have a look at the constructor signature of SparkDl4jMultiLayer. The conf parameter we already know; this is the neural network configuration. Then, sc stands for SparkContext, which we have available when we are using Apache Spark. Finally, let's check out the TrainingMaster. Parallel training of neural networks happens using parameter averaging. During training, the neural network parameters, or weights, are updated in each training iteration. Because multiple neural networks are trained in parallel on different data partitions, the learned parameters of each individual neural network are periodically sent to a parameter server, where they are averaged and sent back.
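
To make the idea concrete, here is a toy illustration of what parameter averaging does conceptually; this is not DL4J's internal code, just the element-wise averaging of the weight vectors learned by three hypothetical workers:

// Weight vectors learned by three workers on different data partitions.
val workerWeights = Seq(Array(0.2, 0.4), Array(0.6, 0.8), Array(0.1, 0.3))
// The parameter server averages them element-wise before redistributing.
val averaged = workerWeights.map(_.toSeq).transpose.map(ws => ws.sum / ws.length)
println(averaged) // List(0.3, 0.5), up to floating-point rounding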

Let's review the following code to see the minimal differences required to switch from local to parallel execution on top of Apache Spark.

if (runLocal) {
    net = new MultiLayerNetwork(conf)
    net.setListeners(Collections.singletonList(new ScoreIterationListener(1).asInstanceOf[IterationListener]))
  } else {
    val tm = new ParameterAveragingTrainingMaster.Builder(20)
      .averagingFrequency(5)
      .workerPrefetchNumBatches(2)
      .batchSizePerWorker(16)
      .build();

    val sparkConf = new SparkConf()

    sparkConf.setAppName("DL4J Spark Example");
    sc = new JavaSparkContext(sparkConf);
    sparkNet = new SparkDl4jMultiLayer(sc, conf, tm);
  }

If runLocal is false, a ParameterAveragingTrainingMaster is instantiated; running on the Apache Spark master, it is responsible for parallel neural network training. Then sparkNet is created using the SparkContext, the actual neural network configuration conf, and the TrainingMaster instance we've just created, all through the SparkDl4jMultiLayer constructor. If you want to know more about how parameter averaging works in detail, please look at the data parallelism explanation in my video, "Parallelization Strategies of DeepLearning Neural Networks" (the explanation starts at the 8:19 point in this 32:11 min video).

2e. Close the loop

So let's finally close the loop by looking at the implementation of the detect method, which is called directly from the MQTT callback handler after the tumbling count window is full.

def detect(xyz: INDArray): Double = {
 for (a <- 1 to 1000) {
   net.fit(xyz, xyz)
 }
 return net.score(new DataSet(xyz, xyz))
}

This does nothing other than show the very same data to the neural network multiple times. It actually works better to use a smaller learning rate and repeatedly train the neural network with the same data set; here we show the neural network the same data set 1,000 times.

3. Start the local neural network and see how it reacts to the received data

Let's actually start the process and see what happens. We'll do two rounds of training with healthy data and then switch the test data generator to a broken state. We will see a significant difference in the so-called reconstruction error that the neural network reports when it suddenly sees unknown data after some time of training on normal data.

First, run the Run.scala class again as described in the Setting up your development environment section. You should see an output similar to Figure 3.

Figure 3. Scala application output

This output means that the neural network has been instantiated locally and we are waiting for data to arrive in real-time from the IBM Watson IoT Platform MQTT message broker.

Next, switch back to the browser window of the Node-RED instance where the test data generator is running. Start the test data generator by clicking the Reset button to produce some data.

Note: Although the Lorenz Attractor model continuously generates data (like a real accelerometer sensor attached to a bearing would), it publishes another 30 seconds worth of data to the IBM Watson IoT Platform MQTT message broker only when we click the Reset button; this prevents our locally running neural network from thrashing.

You can observe in the Node-RED debug pane how the data is streamed to the message broker, because the flow contains nodes that subscribe to and debug the very same data stream, as shown in Figure 4.

Figure 4. Node-RED flow that shows the nodes for subscribing to the data stream

Last, switch back to Eclipse, where our neural network is running. In the console, shown in Figure 5, you see debug messages indicating that data is arriving. We are waiting for a count-based, tumbling window to be filled so that it contains 30 seconds worth of data. We will submit each tumbling window to the neural network.

Figure 5. Eclipse console that shows the data that is arriving

Figure 6 shows the output of the neural network during training after it has received the first tumbling window for processing. It prints the actual training iteration and the current reconstruction error. It is important that this number converges to a local minimum after some time and that there is a significant drop in the reconstruction error.

Figure 6. Output of the neural network during training

In Figure 7, we start at iteration 0 with an initial reconstruction error of 392314.67211754626. This high value is due to the random initialization of the neural network's weight parameters.

Figure 7. Neural network training output with healthy data iterations 0 to 30

Note: Every time we run this example we will get slightly different numbers.

In Figure 8, we end up with a reconstruction error of 372.6741075529085 at iteration 999, which is significantly lower than the value of 392314.67211754626 at iteration 0.

Figure 8. Neural network training output with healthy data iterations 969 to 999

In Figure 9, after a second round of training, and after processing the second tumbling count window, we end up with a reconstruction error of 77.8737141122287 at iteration 1999.

Note: You can see that the score at iteration 1999 is higher than at iteration 1969. This is due to oscillation and is not a problem as long as the score converges to a low value over time.

Figure 9. Neural network training output with healthy data iterations 1969 to 1999

As you can see in Figure 10, we have now fed abnormal data into the neural network, and the reconstruction error of 11091.125671441947 at iteration 2999 is significantly higher than the values in Figure 8 and Figure 9.

Figure 10. Neural network training output with abnormal data

Note: I demonstrated this end-to-end scenario at the HadoopSummit 17 conference, which you can watch in this video (I begin the demo at the 25:06 point in this 34:47 min video).

Conclusion

This completes our first deep learning tutorial for IoT time-series data. As you have seen, defining and running a deep neural network is straightforward in Deeplearning4j. The Deeplearning4j framework takes care of all the complex math that is involved in parallel neural network training. Running it on Apache Spark makes it an ideal candidate for building highly scalable cognitive IoT solutions in elastic Apache Spark cloud environments like IBM Watson Studio, which comes with an elastically scalable Apache Spark "as a Service" offering. In addition, you don't lock yourself into a specific cloud provider and can even run this system in your own private cloud or traditional data center.

In the next two articles, we'll work with the same generated test data but with two different deep learning frameworks: Apache SystemML and TensorFlow (TensorSpark).

