What Is Machine Learning?
Machine learning, by definition, is any technology that uses algorithms to try to create repeatable results. When you talk about machine learning, you’re talking about machine learning algorithms, no matter what form they may take.
Another way to put this is that the algorithms allow the machine to learn from its operations. The process is iterative – as the machine runs, it works on new sets of data to provide insights. Much of what the algorithms do involves extrapolating from available data. Essentially, the algorithms are taking in that available data and parsing it, evaluating it and comparing different data pieces to come up with results.
How does this happen? Where do the data results come from?
That’s where the training set comes in. A training set is critically important in a machine learning program. The idea is that the training data set offers the machine learning program something to start with – initial baseline data to compare to new data sets.
Here’s a very simple example that helps to show how machine learning works: An engineer might want a machine learning program to learn to differentiate between three different kinds of fruit: bananas, grapes and oranges. The training data set will be critical, because it will orient the machine learning program to understand markers for each fruit type – the names of the fruits and the colors yellow, purple and orange, and the shapes: long and thin, round or clustered.
After training runs utilizing a training set, and perhaps a test or validation set, the machine will take in new pieces of information and look for those markers or properties which are often called “labels” in the machine learning world. In a sense, the computer will learn to recognize these fruits from any data background – a new data set might have all sorts of fruits and vegetables, or even consist of a whole virtual world in which the machine learning program will learn to pick out those individual fruits and identify them.
The whole key is that it has done this through internalizing the original training set and applying that to new sets of data.
Read More - Techopedia |