Machine Learning is getting a lot more air time these days, but are we actually sure what it is?
The most common definition goes along the lines of:
“It gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959).
This is an old quote, but it has stood the test of time. But how can computers “learn”? Have we really reached the age of artificial intelligence where machines will take over the world and make humans redundant? I suspect not.
Let’s explore the core of the definition: the ability to learn.
What this really means is that there is a set of algorithms that, rather than simply following a static set of program instructions, can make data-driven predictions or decisions by building a model.
There are three recognised categories of algorithms:
Supervised learning – the computer is presented with example inputs (training data) and their desired outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to outputs. The “easiest” example of supervised learning is a decision tree – this uses a tree-like graph or model of decisions and their possible consequences, including chance-event outcomes, resource costs and utility. From a business decision point of view, a decision tree is the minimum number of yes/no questions one has to ask to assess the probability of making a correct decision most of the time. As a method, it allows you to approach the problem in a structured and systematic way and arrive at a logical conclusion.
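To make the tree-of-yes/no-questions idea concrete, here is a minimal sketch in Python of a hand-built decision tree for a hypothetical loan-approval decision. The question thresholds and the `approve_loan` function are illustrative assumptions, not a real lending policy; in practice, a supervised learner would induce a tree like this automatically from labelled training examples.

```python
# A toy decision tree for a hypothetical loan-approval decision.
# Each internal node asks a yes/no question; each leaf is a decision.
# (Illustrative thresholds only -- real trees are learned from data.)

def approve_loan(income: float, has_collateral: bool, credit_score: int) -> str:
    """Walk a hand-built decision tree and return a decision."""
    if income >= 50_000:                 # root question: high income?
        if credit_score >= 650:          # second question: good credit?
            return "approve"
        return "refer to underwriter"
    else:
        if has_collateral:               # low income, but secured?
            return "approve"
        return "decline"

print(approve_loan(60_000, False, 700))  # approve
print(approve_loan(30_000, False, 700))  # decline
```

Reading the tree top to bottom gives exactly the “minimum number of yes/no questions” described above: two questions are enough to reach every leaf.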
There are many other supervised learning algorithms, which I will only mention here: Naïve Bayes Classification, Ordinary Least Squares Regression, Logistic Regression, Support Vector Machines and Ensemble Methods.
Unsupervised learning – this is where the data is not labelled, so there are no error or reward signals, leaving the algorithm to find structure on its own. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end. Examples include Clustering Algorithms, Principal Component Analysis, Singular Value Decomposition and Independent Component Analysis.
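A short sketch of the first example, clustering, shows what “finding structure without labels” means in practice. This is a minimal k-means implementation over one-dimensional data (the function name, data values and parameters are my own illustrative choices); no point carries a label, yet the algorithm recovers the two natural groups by itself.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    random.seed(seed)
    centroids = random.sample(points, k)          # random initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                          # assignment step
            i = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[i].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]   # update step
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.3]           # two obvious groups
print(kmeans(data, 2))                            # one centroid per group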
Reinforcement learning – inspired by behavioural psychology, this is concerned with how software agents ought to take actions in an environment so as to maximise a “reward”. There are many adjacent theories in this space – game theory, control theory, operational research, swarm intelligence and so on. Reinforcement learning is different because the correct input/output pairs are never presented, nor are suboptimal actions corrected; the focus is on online performance. A good and relevant example is a self-driving car (autonomous car), which operates without a teacher explicitly telling it how close it has come to its goal.
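The reward-without-a-teacher idea can be sketched with tabular Q-learning, one of the classic reinforcement learning algorithms (the corridor environment and all parameter values below are my own toy assumptions). The agent is never shown the correct action; it only receives a reward for reaching the goal, and learns a value table by trial and error.

```python
import random

# Q-learning on a 1-D corridor: states 0..4, start at 0, goal at 4.
# The only feedback is a reward of 1.0 on reaching the goal.
random.seed(1)
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # step left / step right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration

for _ in range(500):                     # episodes of trial and error
    s = 0
    while s != GOAL:
        if random.random() < epsilon:    # explore: random action
            a = random.randrange(2)
        else:                            # exploit: current best action
            a = max(range(2), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: nudge Q toward reward + discounted future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [max(range(2), key=lambda i: Q[s][i]) for s in range(N_STATES - 1)]
print(policy)   # learned action per non-goal state (1 = "right")
```

After enough episodes the learned policy is to move right in every state, even though no teacher ever labelled “right” as the correct answer – the behaviour emerges purely from the reward signal.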
So, we now have a set of algorithms that clearly need people who understand the techniques deeply – hence the need for Data Scientists.
An interesting summary can be found here. This uses the tube-map analogy, and as you can see, Machine Learning is merely one stop on the full journey.
However, even with the best algorithms, we need ways of storing and visualising the data. There are many analytics platforms; the leaders’ quadrant of the Gartner Magic Quadrant is held by the likes of SAS, IBM, KNIME, RapidMiner and Dell. There is also MS Cortana (see the CSC announcement regarding IML solutions) hosted on Azure.
It’s important to put Machine Learning into context – best summarised, I believe, by the 2016 Gartner Hype Cycle for Smart Machines. The diagram shows that machine learning is on the wave of data science, but there is a lot before it and a lot after, and the cycle time is long: 5–10 years before it becomes properly mainstream.
Don’t be put off by data science – this is a very new and exciting area. First, you don’t need to be a deep mathematician who understands the complexities of clustering algorithms versus principal component analysis; like many of these phenomena, it’s about knowing people who do understand these approaches. Of equal importance, however, is knowing how to apply the data findings. Data science has its own challenges – a data scientist can look at the data, write some algorithms, find some patterns and start to tell a story; oftentimes, however, it doesn’t get the right level of stakeholder commitment because it’s not close enough to the business problem itself. Using an agile technique here is clearly one of the options that will increase stakeholder engagement: involve the right people (a variety) and iterate on the hypotheses. One technique I am particularly enthused about is the OODA loop.
Software tooling (the platform and the applications) is going to become the catalyst for increased speed-to-market and, to a certain degree, for dumbing down the complexities so it can become more mainstream. It will be a multi-layered architecture: industry applications and solutions at the top, followed by visualisation/collaboration, analytics, storage, and finally infrastructure. The danger, as is the case with many of these stacks, is spending too much on the technology and associated components rather than the actual business problem. A classic example would be to ask an organisation whether it has a “Big Data strategy” – with the likely retort of “Yes, of course. We have deployed Hadoop.” Clearly not the right answer in this context.
As with many of these next generation technologies, the best way to understand and learn is to simply try it with a use case that is relevant to where you are. Get as close to the business problem and iterate on a hypothesis.
A great place to start is Kaggle. Try the “Titanic: Machine Learning from Disaster” prediction competition.
This post first appeared in Neil’s blog.
Neil Fagan — Distinguished Architect
Neil Fagan is CTO of the UK Government Security and Intelligence Account in Global Infrastructure Services and chair of CSC’s Architecture & Engineering Community. He is an enterprise architecture expert, leading teams of architects who work on solutions from initial concept through delivery and support. He is co-creator of the CSC Global Architecture (A10) Capability Framework and created the Architecture Best Practice course at CSC, delivering it to hundreds of architects. He has received the Silver President’s Award twice.
See Neil’s full bio.