I went to China in April to meet colleagues and participate in various conferences and events. I was very happy and honored to have a session on Zhihu Live, a very respected Chinese media outlet. I spoke about artificial intelligence, machine learning, and deep learning. The most important part was a Q&A session with attendees.
The questions themselves are very interesting to me as they paint a picture of what is hot in China now. Not surprisingly, deep learning is hot, but I think the focus on deep learning is stronger than in Western countries. I reproduce the questions below together with the answers I gave.
The questions were most often asked in Chinese (Mandarin), and were translated by my colleagues Ke Wei Wei and Henry Zeng. They also translated my answers into Chinese.
- Are the mature machine learning algorithms used commercially?
Yes. Machine learning is used in several areas. One of them is product recommendation, where matrix factorization algorithms are routinely used. Other areas where machine learning is now used commercially include natural language processing, image recognition, sales forecasting, predictive maintenance, and customer churn prediction.
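To make the matrix factorization idea concrete, here is a minimal sketch in plain Python. It learns a small vector for each user and each item so that their dot product approximates the known ratings. The toy ratings, the function names (`train_mf`, `predict`, `rmse`), and the hyperparameters are all assumptions made for illustration; production recommenders use dedicated libraries and much more data.

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    """Root mean squared error over the given ratings."""
    return (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5

# Hypothetical toy data: users 0-1 like items 0-1, users 2-3 like items 2-3.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 2),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 2), (3, 2, 4), (3, 3, 5),
]

P0, Q0 = train_mf(ratings, n_users=4, n_items=4, epochs=0)  # untrained baseline
P, Q = train_mf(ratings, n_users=4, n_items=4)

print("RMSE before training:", rmse(P0, Q0, ratings))
print("RMSE after training: ", rmse(P, Q, ratings))
print("Predicted rating for user 0, item 3 (unobserved):", predict(P, Q, 0, 3))
```

The last line is the actual recommendation step: the model scores a user-item pair that was never rated, and the highest scoring unseen items are the ones to recommend.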
- What can be done in machine learning? What products?
A lot can be done as soon as one has a clear business goal and the data that supports this business goal. For instance, if your business goal is to reduce the time it takes to ship goods once they are ordered, then you must have enough data from the past to learn what influences the time to delivery.
- How to implement and realize the machine learning?
Start small with a well-defined, small-scope project. Then use open source tools to build models. Then use an industry platform like IBM Machine Learning to manage the lifecycle of your models.
- Voice recognition, natural language processing, image recognition, currently in e-commerce, is nothing more than voice customer service. Based on the searching and recommendation of deep learning, identification etc., are there any other application directions?
Customer service is key but there are other areas where ML is relevant. For instance, predictive maintenance is a great area for machine learning. The idea is to use IoT to gather information on various pieces of equipment, and predict their health condition so that failures can be prevented. Another area is health, where machine learning can help with diagnosis and help select the best treatment.
- So for the classification, what are the classic cases? Do you have any thoughts or suggestions? When do we need to consider complex models?
Anomaly detection is a classical use case where you want to distinguish between what is normal and what is not. This is a two-class, or binary classification, problem. This includes fraud detection (normal vs fraud), predictive maintenance (normal operation vs failure), health (normal vs disease), etc. I recommend starting with simple models, e.g. logistic regression, then moving to more complex models, e.g. gradient boosted decision trees or deep learning, if accuracy isn’t good enough and if there is lots of training data.
- What is the progress of the cancer diagnosis project jointly studied by IBM Watson and Department of Medicine of the University of Tokyo last year?
I don’t know, I need to check.
- Watson Pepper is able to get images and text from social media. I wonder how Pepper processes the information and what it is for?
Watson Pepper uses deep learning to process that data.
- Do you think there is any bubble/hype in machine learning now?
Yes. I think deep learning is oversold and that people have unrealistic expectations. Deep learning is great, and it enables breakthroughs in computer vision and natural language processing. But this comes with significant investment and lots of data. Most companies do not have enough data to make deep learning relevant. Moreover, deep learning isn’t the technology of choice in many areas where other machine learning techniques are better suited. I wish the power and limits of deep learning were better explained in general.
- Why is deep learning more academic than industrial?
This is changing fast. The most advanced teams work in companies like IBM, Facebook, Google, etc, not academia. Yet, deep learning is still in the hands of researchers instead of engineers. One reason is that deep learning is not well understood. Designing the right network architecture is still an art that few master.
- What do you think of transfer learning?
It is a great idea. It can save lots of time when training complex models.
- If deep learning has better performance than any other algorithms, is it possible to replace other classic ML algorithms?
Deep learning has greater performance for sound and images, but not for the rest. The other classic ML algorithms are here to stay for a while for many ML applications, either because deep learning isn’t yielding good results, or because there isn’t enough training data.
- What are the industry application directions of unsupervised learning?
I don’t think unsupervised learning is used much as a standalone technique. Unsupervised learning is used a lot as a preprocessing step for supervised learning. For instance, clustering data then using cluster id as a new feature may help the performance of supervised ML algorithms.
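As a rough illustration of the cluster-id-as-feature idea, the sketch below runs a tiny k-means in plain Python and appends each row's cluster index as an extra column. The data points and the helper names (`kmeans`, `assign`, `add_cluster_feature`) are hypothetical; in practice one would typically use a library implementation.

```python
import random

def assign(point, centers):
    """Index of the center closest to the point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])),
    )

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: alternate between assigning points and recomputing centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers

def add_cluster_feature(points, centers):
    """Append the cluster id to each row as a new (categorical) feature."""
    return [p + (assign(p, centers),) for p in points]

# Hypothetical toy data: two well separated groups of 2-D points.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]

centers = kmeans(points, k=2)
augmented = add_cluster_feature(points, centers)
for row in augmented:
    print(row)
```

The augmented rows can then be passed to any supervised learner. Since the cluster id is a category rather than a quantity, it is usually one-hot encoded first for models that treat inputs as numeric.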
- If enterprises use machine learning, how should they start it? Is the technical threshold high? Which industries have the opportunity?
Enterprises need to start with trained data scientists on a small and well-defined project. Enterprises can train their employees to become data scientists via online courses like the Stanford ML course on Coursera. But training isn’t enough, people must practice. A good way to practice is to enter ML competitions. Several web sites host such competitions.
- What is the difficulty of reinforcement learning? Is it closer to general AI?
Reinforcement learning aims at learning the next best action. It has demonstrated great success in domains where the number of possible actions is limited, like board games (Go) or poker. It remains to be seen how these successes can be extended to real world situations where the number of possible actions is endless. If we can do it, then yes, we would be closer to general AI.
- Is there a plan to release IBM NN chips to the market?
IBM does not disclose plans about potential future products.
- I’m trying to predict the price of artworks with machine learning. In the training data, the price of the works and other parameters are known. I would like to know which algorithm I should use, supervised or unsupervised? Can IBM’s current products do it?
You need to use regression algorithms. I guess you want to learn both from art images, and meta data such as artist, year of creation, dimensions, material, etc. I would recommend a mix of deep learning to process the images, and classical ML for the rest. My favorite classical ML algorithm is gradient boosted decision trees like XGBoost or LightGBM. We intend to support these in IBM ML.
- Do you think there is any privacy issue in machine learning?
Yes, definitely. Think of using ML for health, for instance for diagnosing cancer from lung X-rays. In order to train an ML model you need to get a large sample of lung X-rays. If the data is not handled with care, it can be possible to identify who has cancer and who has not from the training data. This would be a major privacy breach, and could be unlawful in some countries. One way to deal with it is to anonymize the data before it is passed to machine learning algorithms.
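A first step in that direction can be sketched as follows: drop direct identifiers and replace the patient id with a keyed hash, so records from the same patient can still be linked without exposing who the patient is. The field names, the sample records, and the key below are hypothetical. Note that this is only pseudonymization; on its own it does not guarantee anonymity, since remaining fields such as age or location may still allow re-identification.

```python
import hashlib
import hmac

def pseudonymize(records, id_field, drop_fields, key):
    """Return copies of the records without direct identifiers.

    The id field is replaced by a keyed hash (HMAC-SHA256), so the same
    patient always maps to the same pseudo id, but the original id cannot
    be read back without the secret key.
    """
    cleaned = []
    for rec in records:
        out = {
            name: value
            for name, value in rec.items()
            if name != id_field and name not in drop_fields
        }
        digest = hmac.new(key, str(rec[id_field]).encode("utf-8"), hashlib.sha256)
        out["pseudo_id"] = digest.hexdigest()[:16]
        cleaned.append(out)
    return cleaned

# Hypothetical records; in a real setting the key is stored separately and securely.
records = [
    {"patient_id": "P001", "name": "Alice Example", "age": 54, "diagnosis": 1},
    {"patient_id": "P002", "name": "Bob Example", "age": 61, "diagnosis": 0},
    {"patient_id": "P001", "name": "Alice Example", "age": 55, "diagnosis": 1},
]
key = b"hypothetical-secret-key"

clean = pseudonymize(records, id_field="patient_id", drop_fields={"name"}, key=key)
for row in clean:
    print(row)
```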
- Is it possible to combine deep learning with traditional programming? Will the development of NTM replace that of some programs?
Not sure I understand the question right. If you ask about combining deep learning with traditional machine learning, the answer is definitely yes. For instance, if you have training data that is a mix of pictures and structured data, you would use an ensemble approach. Train a deep learning model on the pictures, train a classical ML model on the rest of the features, then use a third classifier that takes the predictions of the first two models as input.
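The ensemble described above is often called stacking. The sketch below shows its shape in plain Python, under strong simplifying assumptions: the data is synthetic, each "picture" is reduced to a single made-up number, and a small logistic regression stands in for both the deep learning model and the classical model. All function and variable names are hypothetical. The third model is trained on rows the first two never saw, so that it learns from honest predictions.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=100):
    """Logistic regression trained with stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def proba(model, xi):
    """Predicted probability of class 1."""
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))

def make_data(n, rng):
    """Synthetic rows: (picture features, structured features, label)."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        center = 1.0 if y else -1.0
        picture = [rng.gauss(center, 1.0)]     # stand-in for image features
        structured = [rng.gauss(center, 1.0)]  # stand-in for tabular features
        rows.append((picture, structured, y))
    return rows

rng = random.Random(0)
base_rows, meta_rows, test_rows = (make_data(200, rng) for _ in range(3))

# Step 1: one model per kind of input (stand-ins for deep learning and classical ML).
picture_model = train_logreg([r[0] for r in base_rows], [r[2] for r in base_rows])
structured_model = train_logreg([r[1] for r in base_rows], [r[2] for r in base_rows])

def base_predictions(row):
    return [proba(picture_model, row[0]), proba(structured_model, row[1])]

# Step 2: a third classifier takes the two predictions as its input features.
meta_model = train_logreg(
    [base_predictions(r) for r in meta_rows], [r[2] for r in meta_rows]
)

def stacked_predict(row):
    return int(proba(meta_model, base_predictions(row)) >= 0.5)

accuracy = sum(stacked_predict(r) == r[2] for r in test_rows) / len(test_rows)
print("Stacked accuracy on held-out rows:", accuracy)
```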
- Deep learning is a multi-layer NN. Is it possible to use other multi-layer algorithms? Say, multi-layer trees?
Yes, see for instance deep forests: https://arxiv.org/abs/1702.08835