My first contact with Machine Learning (ML) was through the field of image processing back in the late 90s, while I was pursuing an academic career. A few years later, I started working at Alcatel on troubleshooting problems, where I combined ML with signal processing theory. Over time, I arrived at my own interpretation of ML as a sort of “classical” mathematics applied to data problems: Monte Carlo, matrices, feature engineering, Markov, Kalman, clustering, etc. The ML approach still required a deep understanding of the underlying phenomena, either because feature engineering was involved or simply because some reasonable assumptions had to be made while processing data.
I considered other things with fancier-sounding names, such as reinforcement learning, unsupervised learning, smart agents, evolutionary algorithms or knowledge modeling (fuzzy logic, Bayes nets and others), as falling under the scope of Artificial Intelligence (AI). Here, things were less “clear-cut”, and the right strategy to achieve the desired goal (taking utility into account) would evolve over time, after some level of optimization or convergence.
So, that is how I made up my mind and got my own definitions of ML and AI. If it was about image processing, clustering or data mining, I would call it ML. If it was about fancy robotics, natural language processing (NLP), knowledge modeling or fuzzy algorithms, I would call it AI.
During that time, Neural Networks (NN), a technique inspired by the information processing patterns found in the human brain, seemed a dead end to almost everyone in the field. I had a vague idea that they had to do with optimization and classification in contexts where things were not completely known or understood. But there was a lot of scepticism around the NN approach, best summed up by one of my professors, who said something along these lines: “Once you know the distribution, you can’t beat Bayes anyway, so what’s the point?”
At the time I didn’t fully understand all the nuances in that sentence, nor was I fully aware of the grievances and resentment that NN folks harbored towards these sorts of views. To them, such statements probably sounded as if the first unknown in the “unknown unknowns” problem was simply the lack of knowledge, or ignorance, of NN practitioners. Anyway, as an opportunistic PhD candidate, I chose to skip the muddy waters of AI and NN and stick to ML.
I was happy with all of that, but I was heading for a surprise. Over the next decades we were to witness the rise of NN, under the new name of Deep Learning (DL), like a phoenix from its ashes.
Neural Networks aka Deep Learning
Deep Learning exploded thanks to two forces that rose at about the same time. Firstly, huge data sets became available to large audiences, with the advent of Google and the internet at large. Secondly, cloud computing and almost infinite compute capacity allowed researchers to test neural networks of much higher complexity: what we call deep nets today. And the results were astonishing! Neural Networks were making a comeback in style, this time under the name Deep Learning (DL).
As results got better and DL was applied to more areas, there was a need to find a better place for NN in the overall AI story. At first, DL, like NN before it, was seen by many as a subset of ML, which in turn was seen as a subset of AI:
A few years passed, and ML started to be viewed as the old, boring, classical statistical approach, while symbolic AI, Bayes nets, etc. would still fall within AI. Deep Learning was grabbing all the attention, so we got this new Venn diagram:
Shortly after, DL had become the new AI! Or to put it differently, the future was unsupervised!
Things have become so bizarre that if you now do anomaly detection with SARIMA, you are doing ML, while if you try to achieve the same with DL, you are labeled as doing AI. In this particular example, it doesn’t really matter that all anomaly detection does is find a fitted curve (via training) against which we will later look for outliers. In this particular application, DL is the better tool, as it is, after all, the best curve-fitting approach, provided you have plenty of data to start with and watch out for overfitting. But who cares about details? Let’s call it AI and we have a new fancy tagline.
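The residual-based idea behind this kind of anomaly detection is simple enough to sketch in a few lines. The snippet below is a minimal illustration, not SARIMA or a deep net: it fits a plain moving-average curve to a series and flags the points whose residuals stray beyond k standard deviations of the fit.

```python
import numpy as np

def detect_anomalies(series, window=5, k=3.0):
    """Fit a moving-average curve to the series and flag points
    whose residual exceeds k standard deviations."""
    series = np.asarray(series, dtype=float)
    # Smooth the series: this plays the role of the "fitted curve".
    kernel = np.ones(window) / window
    fitted = np.convolve(series, kernel, mode="same")
    residuals = series - fitted
    threshold = k * residuals.std()
    # Indices where the observation strays too far from the fit.
    return np.where(np.abs(residuals) > threshold)[0]

# A smooth signal with one injected spike at index 120.
t = np.linspace(0, 4 * np.pi, 200)
signal = np.sin(t)
signal[120] += 5.0  # the anomaly
print(detect_anomalies(signal))
```

Whether the curve comes from a moving average, SARIMA, or a deep net, the anomaly logic stays the same: model the expected behavior, then look at what the model cannot explain.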
The same story has been repeating in many other commercial fields as well, where the allure of using the term AI has led to utter confusion about what was really happening.
Since, for the layman, Deep Learning had by now become AI, and any AI was perceived as General Artificial Intelligence (GAI), guess what happened next? Everyone who wanted to get attention, funding or a few minutes on stage started selling DL not as the next big thing, but as the only next thing.
Several application fields that used to have their own proper names, like image processing and language translation, were now being labeled and sold as AI as soon as Deep Learning was applied in one way or another.
The promise of magic
I grew up reading Isaac Asimov and Arthur C. Clarke, and my first contact with AI came from those SF novels and movies where robots behave and act like humans. Or where humans can just ask machines things like “make me a coffee” or “get me to the airport” and then things would happen, just like magic.
Magic is the layman’s only definition of AI. To them, it really doesn’t matter that the tasks I’ve given as examples above would be described by scientists as General Artificial Intelligence (GAI) problems. In one of my earlier talks, “Automation, intelligence and knowledge modelling”, I tried to address these misconceptions by presenting different definitions (and views) of AI. But no matter how hard we try, the truth is, magic is all people hear. And I think that everyone who works in the field of AI (both scientifically and in applied engineering) needs to be well aware of this general perception when discussing AI topics in the public domain.
Over the past few years, “machines” have beaten us at games that were until recently considered too computationally challenging for programs to deal with: chess and Go. The promise of magic was now so tempting that it was used to sell us on the dream that the holy grail (“singularity”) was just around the corner. Despite the fact that both chess and Go are “deterministic perfect-information games”, many jumped on the bandwagon and started selling the public things that looked as if they came straight from Isaac Asimov’s novels. And what happened?
After years of sunny optimism and big promises, automakers are beginning to realize just how difficult it is to make a market-ready, full self-driving car. The CEO of Volkswagen’s autonomous driving division recently admitted that Level 5 autonomy — that’s full computer control of the vehicle with zero limitations — might actually never happen.
From the article: “Key Volkswagen Exec Admits Full Self-Driving Cars ‘May Never Happen’”
From the very beginning of this frenzy, large parts of the AI community pointed out that DL is not capable of “generalizing”, which means it can’t lead us towards General Artificial Intelligence (GAI). The people who were most vocal about it were labeled as old-fashioned skeptics, envious that the other camp got some glory on their side.
After all, many folks working on neural networks had suffered from lack of funds for decades, working hard to prove that there was something important there that others didn’t quite see. This is to be absolutely admired and supported, as humanity needs people who follow their passions, regardless of what the fashion of the day is. And regardless of what governments through public funds dictate that academics should study.
And let’s be clear, DL results are impressive! So the awards, the news and the attention were rightfully theirs. But things have changed. It has become painfully apparent that current DL is nowhere close to GAI, so people have started talking about an “AI winter 2.0”.
And now, instead of a calm and sincere conversation about how to deal with the shortcomings, what have we got? We got ourselves into two different arguments (led by AI giants Peter Norvig, Judea Pearl, Yoshua Bengio, Yann LeCun, Geoffrey Hinton, Gary Marcus …). The first one (again) is about the definition of intelligence. The second, about what Deep Learning actually is, and what qualifies as Deep Learning. So, here we go: the scientific arguments 2.0 (on Twitter, obviously).
One of the founding fathers of modern Neural Networks research, Yann LeCun, posted the following on his Facebook page:
Some folks still seem confused about what deep learning is. Here is a definition: DL is constructing networks of parameterized functional modules & training them from examples using gradient-based optimization. That’s it.
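LeCun’s definition can be made concrete in a few lines. The snippet below is a deliberately minimal instance of it, assuming nothing beyond NumPy: a single parameterized module (a line, y = w·x + b) is trained from examples using plain gradient descent on a mean-squared-error loss.

```python
import numpy as np

# One parameterized functional module: y_hat = w * x + b.
# The "examples" are samples from the line y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 1.0

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    y_hat = w * x + b
    err = y_hat - y
    # Analytic gradients of the MSE loss with respect to w and b.
    grad_w = 2 * np.mean(err * x)
    grad_b = 2 * np.mean(err)
    # Gradient-based optimization: step against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # converges toward w ≈ 2, b ≈ 1
```

A deep net is this same recipe with many such modules composed together and the gradients propagated through the composition by backpropagation; which is also why Pearl’s “curve fitting” remark below lands as a fair, if pointed, description.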
To which Judea Pearl, another pioneering figure in AI, had this to say:
All the impressive achievements of deep learning amount to just curve fitting.
In a more recent tweet by Yann LeCun, we see another effort to broaden the scope of DL:
Deep Learning is far, far more than old-style neural nets with more than a couple of layers. DL is an “architectural language” with enormous flexibility.
This definition is similar to an idea Andrej Karpathy, director of AI at Tesla, presented in his blog post, where he called DL “Software 2.0”:
Neural networks are not just another classifier, they represent the beginning of a fundamental shift in how we write software.
And indeed, all respect to Andrej for “eating his own dog food”: all of Tesla’s Autopilot features are based solely on DL. Tesla got to Level 3, which truly is a great achievement, no doubt about it! But are we getting close to fully autonomous, computer-controlled vehicles with zero human intervention any time soon? Not really, not even close.
Where we go from here
First, let’s not worry! The technological singularity (a hypothetical future point in time when technological growth becomes uncontrollable and irreversible, resulting in inconceivable changes to human civilization) will not come any time soon.
But neither is a new AI winter coming. On the contrary, there are thousands of new students going into AI every year. These young men and women will be pushing the boundaries of AI every day, wondering what intelligence is, and how we can trick nature into telling us just a little bit more about it. Sceptics would also say that AI has been weaponized by all the world’s superpowers, similar to what happened with the nuclear and space races before. In that sense, maybe a little bit of a “fight” is not that bad after all, as we all crave answers.
Sometimes we fool ourselves into believing that we are almost there and that the answers are just around the corner. And yes, we are all invited to dream about it, evangelists, bullshitters and scientists alike. The greatest voyage of our time is for everyone to enjoy!