If I say sky,
…what is the very next thing that comes to your mind? Don’t think…just say it…
…blue, or maybe clouds. And if I say green, you’ll likely say grass. Although some of my students are quick to say cut!
Let’s keep up the rigor: if I say AI, what might you say? Don’t think…first thought that comes to mind…
Let me suggest Machine Learning. Sprinkling AI throughout our rhetoric caters better to a TV ad and dilutes our message. But machine learning is about building patterns, whether the goal is to gain insights from images or from a bag of words. Now that is something tangible, more intuitive–it is about building patterns.
Let’s keep this up…if I say machine learning, I encourage you to say Prediction. Whether your data is qualitative (classification) or quantitative (regression), it is about predicting–displaying a passage from the corpus or generating an image comes about by bubbling up the output that has garnered the highest confidence.
It comes easily to us when we think of predicting weather patterns, yet translation systems work the same way: the prediction machine runs all the tools in its NLP (Natural Language Processing) stack to understand the question, squeezes the bag of words, now normalized into 1s and 0s, through an RNN (Recurrent Neural Network) and likely an LSTM (Long Short-Term Memory) to garner outputs with varying confidence values…and there is always a top score.
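To make that "bag of words in, top score out" idea concrete, here is a toy sketch–not a real NLP stack, and the vocabulary and scores are made up for illustration. It turns a sentence into a binary bag-of-words vector, then converts some pretend raw scores into confidence values and bubbles up the top one:

```python
import math

# Hypothetical toy vocabulary -- a real system would learn this from a corpus
vocab = ["sky", "blue", "green", "grass"]

def bag_of_words(sentence):
    """Binary bag of words: 1 if the vocab word appears in the sentence, else 0."""
    words = sentence.lower().split()
    return [1 if w in words else 0 for w in vocab]

def softmax(scores):
    """Turn raw scores into confidence values that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

vec = bag_of_words("the sky is blue")   # -> [1, 1, 0, 0]

# Pretend these are the network's raw scores for three candidate answers
scores = [2.0, 0.5, -1.0]
confidences = softmax(scores)

# ...and there is always a top score
top = max(range(len(scores)), key=lambda i: confidences[i])
```

The point is the shape of the pipeline, not the particulars: text becomes 1s and 0s, the network produces scores, and the answer with the highest confidence wins.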
Funny, if you think about this question-answer paradigm between machine and human: we humans tend to understand the question, with its built-in innuendos and colloquialisms, yet we do not have all the answers; the machine has a hard time understanding the question, but has all the answers!
So how/when/where did machine learning happen? That, my dears, beckons far more than this article can offer as an answer. But the trick, I believe, is in explaining the complex in its simplest form. So let me try to explain machine learning using the following analogy:
One day, your machine learning colleagues come up to you and challenge you to climb a small hill, blindfolded, and in as few steps as possible. You agree, and so they cover your eyes with a bandana. The path is safe, friends are all around, and so you begin the climb. Very soon you notice that it is quite steep, and since they said as few steps as possible, you take giant steps…all of a sudden, you have reached the apex and are struggling to break your hectic pace as the slope abruptly heads downward. Well, your friends’ hypothesis that you would overshoot the top was spot on.
You decide to attempt the climb again. Going back, in ML lingo, is called back propagation, and so you do that. This time, being leery of the fast-approaching crest, you take smaller steps; it ends up taking numerous small steps, and that does not satisfy the “minimum-steps” condition. The ML folks call that not optimized.
You are determined to get this right, so up you go again (they call it feed-forward). This time, you note that the slope (calculated using derivatives, partial derivatives in this case) starts to change after about 10 steps and becomes less steep…you can’t see it, but you sense it as it happens…you decide to adjust your gait, the size of your steps; maybe smaller steps would be better on this new, less steep climb. The size of your step is akin to the neural network’s learning rate, a hyper-parameter, and each adjusted step nudges the network’s weights, which dangle from the synapses in a neural network.
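The blindfolded climb can be sketched as gradient ascent in a few lines. The hill shape and step sizes below are made up for illustration: the climber feels only the local slope (the derivative), steps uphill in proportion to it, and the step size decides whether you overshoot the peak or creep up on it:

```python
def hill(x):
    """A hypothetical hill with its peak at x = 3."""
    return -(x - 3.0) ** 2

def slope(x):
    """Derivative of the hill: the steepness you feel underfoot."""
    return -2.0 * (x - 3.0)

def climb(step_size, n_steps, x=0.0):
    """Take n_steps uphill, each proportional to the local slope."""
    for _ in range(n_steps):
        x += step_size * slope(x)  # step in the uphill direction
    return x

too_big = climb(step_size=1.1, n_steps=10)   # giant steps: overshoot and oscillate past the peak
careful = climb(step_size=0.1, n_steps=50)   # small steps: creep close to x = 3
```

With the big step size, each stride overshoots the peak by more than the last–your friends’ hypothesis in code. With the small one, you land near the apex, but only after many steps: the “not optimized” outcome.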
The trick here is that the system is not “told” when and by how much to alter the size of its steps, nor can it see what’s coming. It lives in the moment, but moment by moment becomes a training exercise for the system. Patterns begin to emerge and solidify as the confidence value of said patterns comes darn close to the actual values (ML folks also speak of feature vectors–if you must, think of the columns in a CSV file as features, and so are the 42 muscles in the human face, just another set of feature vectors).
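The CSV aside is easy to see in code. Here is a minimal sketch with hypothetical data (the column names and values are invented): each row’s column values, read together, form one feature vector:

```python
import csv
import io

# Hypothetical CSV: each column is a feature, each row one observation
raw = "height_cm,weight_kg,age\n170,70,30\n160,55,25\n"

rows = list(csv.DictReader(io.StringIO(raw)))

# Each row's column values, taken together, form one feature vector
features = [
    [float(r["height_cm"]), float(r["weight_kg"]), float(r["age"])]
    for r in rows
]
# features -> [[170.0, 70.0, 30.0], [160.0, 55.0, 25.0]]
```

The 42 facial muscles work the same way: measure each one and you have a 42-column row, i.e., a 42-dimensional feature vector.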
It is important to note that this optimization, if you will, comes about after numerous feed-forwards and back propagations. A lot happens to the 1s and 0s that pass through the network, but all efforts are meant to have you climbing the hill as though you did not have a blindfold and could tell when it’s time to ease up on the big steps.
Being a probabilistic system, it will never mimic 100% the actual event that happened in the past, but it can come darn close to predicting from the ample data that was ingested. You might even be 99.998 percent confident that you classified something properly, for example, but it’s never 100%…that is the realm of deterministic systems, the very laptop in front of you and me…not cognitive systems.
Perhaps it helps to think that we are happy to let the machine predict, knowing that we are the final arbiters of that prediction; machine predicts, human decides…for now.