AI (artificial intelligence) technology, such as ChatGPT, has grown rapidly in recent years, and more and more organizations are using it for marketing and other commercial purposes. One of the data analysis technologies that underpins this kind of AI is called “machine learning,” and a common learning method in machine learning is called “supervised learning.” In this article, we explain how supervised learning works, give examples of its use, and describe how it differs from other learning methods such as “unsupervised learning” and “reinforcement learning.”
What is supervised learning?
Supervised learning is one of the learning methods in machine learning. Machine learning itself is a technology in which a machine (computer) automatically learns from the data given to it and discovers the rules and patterns behind that data. Supervised learning is the method in which that data includes the “correct answers.” Let’s take a closer look.
How does supervised learning work?
For example, suppose you have a large amount of image data depicting different animals, and each image is labeled with the name of the animal it shows (the correct answer), such as “monkey,” “dog,” or “pheasant.” By learning from this data, the computer picks up the shape and characteristics of each animal, so that it can look at a new picture and say “This is a monkey,” “This is a dog,” or “This is a pheasant.”
In other words, by learning repeatedly from data that includes correct answers, the method becomes able to recognize data whose answer is unknown and predict the correct answer for it. Supervised learning therefore consists of two processes: “learning” and “recognition/prediction.”
- – Learning
Learn rules and patterns from data that includes the correct answers.
- – Recognition/Prediction
Apply what was learned to data whose correct answer is unknown, and recognize or predict that answer.
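The two processes above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production method: it uses a nearest-average (nearest-centroid) rule chosen only for brevity, the function names `learn` and `predict` are hypothetical, and the made-up body measurements stand in for features that a real system would extract from images.

```python
from collections import defaultdict
from math import dist


def learn(samples):
    """Learning: average the feature vectors of each label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Recognition/prediction: return the label whose average is closest."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical labeled data: (body length in cm, weight in kg) with the correct answer.
training_data = [
    ((55.0, 9.0), "monkey"),
    ((60.0, 11.0), "monkey"),
    ((90.0, 30.0), "dog"),
    ((100.0, 34.0), "dog"),
    ((80.0, 1.1), "pheasant"),
    ((85.0, 1.3), "pheasant"),
]

model = learn(training_data)
prediction = predict(model, (95.0, 28.0))  # data whose correct answer is unknown
print(prediction)
```

Here `learn` only ever sees data with correct answers, while `predict` is given a measurement with no label, which mirrors the split between the two processes.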
As explained later, there is also a related learning method called “unsupervised learning,” which learns from data that does not include information about the correct answers. Because supervised learning uses data that contains the correct answers, it tends to achieve higher accuracy and is generally the more practical of the two.
“Classification” and “Regression” of Supervised Learning
Supervised learning can be divided into two types: classification and regression.
Classification means “predicting which category a piece of data belongs to.” Predicting animal names from image data, as described above, is one example. Another is predicting whether a new email is spam by learning the text features of emails already identified as spam. Put simply, classification predicts which of several categories an item belongs to, including yes/no questions such as whether or not an email is spam.
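As a rough sketch of the spam example, the following uses a small naive Bayes classifier written with only the Python standard library. Naive Bayes is one common choice for text classification and is an assumption here, not something the article prescribes. The messages are invented, and the label “ham” (meaning “not spam”) and the function names `train` and `classify` are hypothetical.

```python
from collections import Counter
from math import log


def train(messages):
    """Count how often each word appears in spam and in non-spam ("ham") messages."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(model, text):
    """Return the label with the higher naive Bayes score for the text."""
    word_counts, label_counts = model
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = log(label_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so that unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented training emails, each labeled with the correct answer.
training_messages = [
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("please review the report", "ham"),
]

model = train(training_messages)
print(classify(model, "free prize inside"))
print(classify(model, "review the meeting report"))
```

The sketch assumes that both labels occur at least once in the training data; a real filter would also need far more messages and better text preprocessing.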
Regression means “predicting numbers.” For example, a regression model can learn the relationship between data such as weather and average temperature and data on the number of ice creams sold, and then predict that “this many sales can be expected at this average temperature.” Other examples include predicting price changes for homes and cars. Regression can be described as using past data to predict values for which no data is available yet.
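The ice cream example can be illustrated with a straight line fitted by ordinary least squares. This is a sketch under assumptions: the temperature and sales figures are invented, a single straight line is the simplest possible model, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented past data: average temperature (deg C) and ice creams sold that day.
temperatures = [20, 22, 25, 28, 30]
units_sold = [105, 118, 152, 178, 203]

slope, intercept = fit_line(temperatures, units_sold)

# Predict sales for a temperature that does not appear in the past data.
predicted_sales = slope * 27 + intercept
print(round(predicted_sales))
```

The output is a number rather than a category, which is the difference between regression and classification.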
The main purpose of using supervised learning
The main purpose of using supervised learning is to turn the data all around us into something of practical value. In recent years, many organizations have needed to make use of large volumes and many types of data, known as big data, in their work.
For example, in the manufacturing industry, data such as the temperature of machines running on a factory line can be collected. If supervised learning is performed on this data with correct-answer information such as “a temperature above 60 degrees Celsius is abnormal,” AI can quickly detect anomalies in the equipment.
If only the temperature of a machine needs to be watched, people can handle it. But as the items to monitor grow beyond temperature and the number of machines increases, checking each one by hand takes too much time and money. If the AI is trained with supervised learning and its accuracy is improved, it can take over these checks and reduce labor and cost. As a result, data such as temperature gains practical value and can be used in business.
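To make the factory example concrete, the sketch below learns a temperature cut-off from labeled readings instead of hard-coding it. It is a deliberately simple one-feature rule (often called a decision stump); the readings are invented, and the function names `learn_threshold` and `is_abnormal` are hypothetical. A real system would monitor many more signals than temperature.

```python
def learn_threshold(readings):
    """Pick the cut-off temperature that misclassifies the fewest labeled readings.

    Assumes at least two distinct temperatures are present.
    """
    temps = sorted({temp for temp, _ in readings})
    # Candidate cut-offs: midpoints between neighbouring observed temperatures.
    candidates = [(low + high) / 2 for low, high in zip(temps, temps[1:])]

    def errors(cut):
        return sum((temp > cut) != abnormal for temp, abnormal in readings)

    return min(candidates, key=errors)


def is_abnormal(temp, cut):
    """Flag a new reading as abnormal when it exceeds the learned cut-off."""
    return temp > cut


# Invented readings: (temperature in deg C, labeled abnormal by an engineer).
labeled_readings = [
    (45, False), (52, False), (58, False),
    (62, True), (67, True), (75, True),
]

threshold = learn_threshold(labeled_readings)
print(threshold)
print(is_abnormal(63, threshold))
```

With these particular readings the learned cut-off comes out at 60.0, matching the rule of thumb in the text, but the value depends entirely on the labeled data supplied.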
Examples of using supervised learning
Supervised learning is used mainly for inference and prediction. Here are some examples:
- – Visual inspection
Perform automatic visual inspection of products on the production line.
- – Infrastructure inspection
Discover rust and cracks in structures such as factories and bridges.
- – Crop assessment
Predict the harvest time of crops, fruits, etc.
- – X-ray inspection
Identify the location and shape of cancer, etc., to help doctors interpret images.
- – Price prediction
Predict stock prices, house prices, etc.
Another specific use case is detecting fraud in banking transactions. For example, each of thousands or tens of thousands of banking transactions is labeled “fraud” or “not fraud.” These labels are the correct-answer information, and by learning the patterns in the data, the model can predict whether a new transaction is fraudulent.
Therefore, supervised learning is effective when large amounts of historical data are used for learning.
The difference from “unsupervised learning” and “reinforcement learning”
Other learning methods in machine learning include “unsupervised learning” and “reinforcement learning.”
Unsupervised learning is the counterpart of supervised learning: a method that learns repeatedly from data that does not contain information about the correct answers. It calculates the degree of similarity between the items in a set of input data and discovers the rules and patterns behind the data. A typical use case is the recommendation logic in online shopping.
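The similarity calculation behind a recommendation can be sketched as follows. This is only an illustration: cosine similarity is one common similarity measure and is an assumption here, the purchase histories are invented, and the function names `cosine_similarity` and `most_similar` are hypothetical. Note that no correct-answer labels appear anywhere in the data.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Similarity of two vectors; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented purchase history: item -> one 0/1 flag per customer (1 = bought it).
purchases = {
    "tent": [1, 1, 0, 1, 0],
    "sleeping bag": [1, 1, 0, 1, 1],
    "notebook": [0, 0, 1, 0, 1],
}


def most_similar(item):
    """Return the other item whose buyers overlap most with this item's buyers."""
    others = (name for name in purchases if name != item)
    return max(
        others,
        key=lambda name: cosine_similarity(purchases[item], purchases[name]),
    )


print(most_similar("tent"))
```

A shop could use a result like this to suggest the most similar item to customers viewing a product, although real recommendation systems are considerably more elaborate.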
Unlike supervised and unsupervised learning, reinforcement learning does not require data to be prepared in advance; the AI itself improves its accuracy through trial and error. Take the Go AI “AlphaGo” as an example. The number of possible move sequences in Go is so large that even today’s most advanced computers cannot read out all of them. So instead of trying to read every move that leads to victory, the AI learns which moves bring it closer to winning. By playing games over and over and learning from them, it gradually becomes able to choose only the best actions. The key point is that, where there is no clear right answer, it learns which course of action is best by trial and error.
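Trial-and-error learning can be illustrated with a much smaller problem than Go. The sketch below is an epsilon-greedy “multi-armed bandit,” a standard toy setting for reinforcement learning; it is not how AlphaGo works. The win rates, the function name `run_trials`, and all parameter values are assumptions made for the example.

```python
import random


def run_trials(win_rates, steps=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which action wins most often.

    win_rates are hidden from the learner; it only observes wins and losses.
    """
    rng = random.Random(seed)
    plays = [0] * len(win_rates)
    estimates = [0.0] * len(win_rates)  # learned win rate of each action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(win_rates))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(win_rates)), key=estimates.__getitem__)
        reward = 1.0 if rng.random() < win_rates[action] else 0.0
        plays[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / plays[action]
    return estimates, plays


estimates, plays = run_trials([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=estimates.__getitem__)
print(best_action, plays)
```

No labeled data is supplied at any point: the learner starts with no knowledge and shifts toward the action with the highest observed win rate as the trials accumulate.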
Advantages and disadvantages of supervised learning
The advantages of supervised learning are high learning accuracy and fast learning, both of which come from providing the correct answers. In general, the more data is used for learning, the higher the accuracy.
The disadvantage is that you need to prepare data that contains the correct answers. Increasing accuracy requires a large amount of data, and labeling each piece of data takes significant effort and cost. Also, poor-quality data, such as data with incorrect or missing labels, reduces learning accuracy.
- – Advantages
High learning accuracy and fast learning speed
- – Disadvantages
A large amount of training data must be prepared.
Things to keep in mind when implementing supervised learning
There are three important things to keep in mind when implementing supervised learning. There is some overlap with the above, but if you are considering introducing it, please keep the following in mind.
Prepare a large amount of training data in advance.
First, you need a large amount of training data. In supervised learning, accuracy generally improves as the amount of training data grows, so this data must be prepared in advance. If the data has already been accumulated within your organization, it can be used effectively as organization-specific data. If it has not, data can be collected from the web, or you can use open datasets published by companies or researchers.
Guarantee the quality of learning data.
Second, the quality of the training data must be ensured. Supervised learning predicts the correct answer for unknown data by learning repeatedly from data that contains the correct answers. If the correct-answer labels contain errors, the model learns from incorrect information and accuracy naturally drops. In supervised learning, it is important both to prepare a large amount of data and to ensure the quality of all of it.
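One simple quality check that can be automated is looking for inputs that carry conflicting labels or no label at all. The sketch below shows the idea; the sample data and the function name `audit_labels` are hypothetical, and a check like this catches only a small subset of labeling problems.

```python
from collections import defaultdict


def audit_labels(samples):
    """Return inputs labeled inconsistently and inputs with no label at all."""
    labels_by_input = defaultdict(set)
    missing = []
    for features, label in samples:
        if label is None:
            missing.append(features)
        else:
            labels_by_input[features].add(label)
    conflicts = {
        features: labels
        for features, labels in labels_by_input.items()
        if len(labels) > 1
    }
    return conflicts, missing


# Invented labeled readings: (temperature in deg C,) with a label.
samples = [
    ((45,), "normal"),
    ((61,), "abnormal"),
    ((61,), "normal"),   # same input, different label: one of them is wrong
    ((47,), None),       # the label was never filled in
    ((72,), "abnormal"),
]

conflicts, missing = audit_labels(samples)
print(conflicts)
print(missing)
```

Items flagged this way still need a person to decide which label is correct; the code only narrows down where to look.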
Increase accuracy by running the PDCA cycle.
Third, running the PDCA (Plan-Do-Check-Act) cycle is critical to increasing accuracy. For example, AI developed using only open datasets may be less useful than AI developed by competitors. In such cases, you can differentiate yourself from the competition and increase accuracy by retraining the system as new data is collected in your organization. The PDCA cycle calls for repeated, continuous learning.
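The cycle described above might look like the following in code. This is a sketch under assumptions: the model is a simple learned temperature cut-off, the open and in-house readings are invented, the 0.9 target accuracy is arbitrary, and the names `train` and `accuracy` are hypothetical.

```python
def train(readings):
    """Do: learn the cut-off that misclassifies the fewest labeled readings."""
    temps = sorted({temp for temp, _ in readings})
    candidates = [(low + high) / 2 for low, high in zip(temps, temps[1:])]
    return min(
        candidates,
        key=lambda cut: sum((temp > cut) != label for temp, label in readings),
    )


def accuracy(cut, readings):
    """Check: share of labeled readings that the cut-off classifies correctly."""
    return sum((temp > cut) == label for temp, label in readings) / len(readings)


# Invented data: (temperature in deg C, labeled abnormal).
open_data = [(40, False), (50, False), (70, True), (80, True)]
in_house_batches = [
    [(52, False), (55, False), (57, True), (58, True)],
    [(53, False), (54, False), (59, True), (61, True)],
]

target = 0.9                       # Plan: decide what accuracy is good enough
training_data = list(open_data)
model = train(training_data)       # Do: first model, from open data only
history = []
for batch in in_house_batches:
    score = accuracy(model, batch)  # Check: measure on newly collected data
    history.append(score)
    if score < target:
        training_data.extend(batch)  # Act: fold the new data in and retrain
        model = train(training_data)

print(history, model)
```

In this toy run the model trained only on open data scores 0.5 on the first in-house batch, is retrained, and then scores 1.0 on the second batch, which illustrates why repeating the cycle with organization-specific data can pay off.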
Summary
In this article, we introduced how supervised learning works, its use cases, and its advantages and disadvantages. Because supervised learning learns repeatedly from the correct answers, it can be expected to be more accurate than other learning methods. On the other hand, a large amount of data must be prepared to train the AI. If preparing the data is difficult, there are services that can do it for you, so it may be a good idea to use them when necessary. The commercial use of AI in organizations is expected to become even more widespread in the future. We hope this article is helpful to those who are considering using AI in their work.