Understand the categories of machine learning algorithms to better grasp the problem at hand


Introduction

Studying machine learning systematically through a course has been very rewarding, and I think it is essential to understand a few basic questions first, such as how machine learning algorithms are categorized.

Why? As a beginner, I admit it may be impossible to form a comprehensive and clear view of a subject at the outset. Still, a preliminary but reasonably clear understanding of a few key concepts helps us judge at what level we understand a problem. Put plainly, it lets us learn new material with a purpose, study with questions in mind, and experiment with the motivation of solving a problem. I find this approach healthy and productive.

I have run into many problems in this area before. Often I had not analyzed the problem well enough, so I was at a loss when looking for a method or model to solve it. There are two likely reasons: (1) my foundation was weak, and I did not know even the most commonly used algorithms and models; (2) while learning, I did not understand which practical problems each model suits, and I had no experience applying models to real problems.

Therefore, when learning, pay attention not only to the essence of each algorithm but also to its applications, the conditions under which it applies, and its limitations. If you only explore principles and ignore practical use, you become an armchair strategist; if you only want to use an algorithm without understanding its essence, you will hover at the edge of the core technique and never truly master it. Only by combining theory and practice can we get the most out of learning, and this takes perseverance.

Input space, feature space, and output space

The input space and the output space are the sets of all possible values of the input and the output. Each can be a finite set or the entire Euclidean space. The input space and the output space can be the same space or different spaces. In general, the output space is much smaller than the input space.

The feature space is the space where all feature vectors exist. Each dimension of a feature space corresponds to one feature. Sometimes it is assumed that the input space and the feature space are the same space, and they are not distinguished; sometimes it is assumed that the input space and the feature space are different spaces, and the instances are mapped from the input space to the feature space. The model is actually defined in the feature space.
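The mapping from input space to feature space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the input space is raw e-mail text, the two features (exclamation-mark count and uppercase ratio) are hypothetical, and the weights are hand-picked rather than learned.

```python
def to_features(email):
    """Map an instance from the input space (raw text) to a 2-D feature space."""
    letters = [c for c in email if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    # Each dimension of the feature vector corresponds to one feature.
    return [float(email.count("!")), upper_ratio]


def model(features):
    """A linear rule defined on the feature space; the output space is {-1, +1}."""
    weights, bias = [1.0, 2.0], -2.0  # illustrative values, not learned from data
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else -1


print(model(to_features("HI!!")))   # instance mapped to [2.0, 1.0]
print(model(to_features("hello")))  # instance mapped to [0.0, 0.0]
```

Note that the model never sees the raw text; it only sees the feature vector, which is what "the model is defined in the feature space" means here.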

This provides a good basis for classifying machine learning algorithms: we can categorize them by the nature of the input space, the feature space, and the output space that an algorithm is defined on.

Classification of various learning algorithms

First, a caveat. The categories below are a summary I put together while taking related courses. They are not complete, but they do cover a number of problems. I record them here to clarify my thinking and to refine them later.

The classifications below are presented from a beginner's perspective, ordered from the most commonly used criteria to the less familiar ones.

Classification by output space

Binary classification, commonly known as a yes/no question. Its output space is Y = {-1, +1}.

Multiclass classification, with output space Y = {1, 2, ..., K}.

Regression, with output space Y = R, the real numbers; the output has infinitely many possible values.

Structured learning, with Y = structures. It can also be viewed as a kind of multiclass learning with a very large number of classes.

For example:

Binary classification has very broad applications, such as deciding whether an e-mail is spam, whether an advertising investment will be profitable, or whether a student will answer the next question in a learning system correctly. Binary classification is very important in machine learning and is the basis of other algorithms.

Multiclass classification is used, for example, to decide whether a picture shows an apple, an orange, or a strawberry. Another case is Google Mail, which automatically sorts e-mail into spam, important, social, and promotional messages. Multiclass classification is widely used in visual and audio recognition.

Regression is widely used for forecasting, for example stock prices or the weather and temperature.
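The first three output spaces can be contrasted in a short sketch. It assumes a plain linear score as the underlying model, which is a choice made here for illustration only; the function names and weights are hypothetical.

```python
def score(weights, x):
    """Linear score w . x shared by all three predictors."""
    return sum(w * xi for w, xi in zip(weights, x))


def predict_binary(weights, x):
    # Output space Y = {-1, +1}: keep only the sign of the score.
    return 1 if score(weights, x) >= 0 else -1


def predict_multiclass(weight_rows, x):
    # Output space Y = {1, ..., K}: one weight vector per class, pick the best score.
    scores = [score(w, x) for w in weight_rows]
    return scores.index(max(scores)) + 1


def predict_regression(weights, x):
    # Output space Y = R: the raw score is the prediction.
    return score(weights, x)


x = [1.0, 2.0]
print(predict_binary([1.0, -1.0], x))
print(predict_multiclass([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], x))
print(predict_regression([1.0, -1.0], x))
```

The same score function serves all three cases; only the way the score is turned into an element of the output space changes.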

Structured learning

In natural language processing, automatic part-of-speech tagging is a typical example of structured learning. Given a sentence, the machine must tag each word, and because a word can have different parts of speech in different sentences, the method has to take the structure of the sentence into account. This can be viewed as multiclass classification, but unlike ordinary multiclass classification, the space of target structures can be very large, and the classes are hidden in the sentence rather than listed explicitly.

Other examples of structured learning include the 3D structure of proteins in organisms and further tasks in natural language processing.

Structured learning is described in some places as a tagging problem. The input of a tagging problem is an observation sequence, and the output is a tag sequence or state sequence. The goal is to learn a model that predicts a tag sequence for a given observation sequence.
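To make the input and output of a tagging problem concrete, here is a deliberately simple sketch: a most-frequent-tag baseline. It is an assumption chosen for brevity, not a structured model, because it tags each word independently and ignores the rest of the sentence. The tag names and the tiny training set are invented for the example.

```python
from collections import Counter, defaultdict


def train_tagger(tagged_sentences):
    """Remember, for every word, the tag it carried most often in training."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(model, words, default="NOUN"):
    """Input: an observation sequence (words). Output: a tag sequence."""
    return [model.get(w.lower(), default) for w in words]


training = [
    [("I", "PRON"), ("love", "VERB"), ("ML", "NOUN")],
    [("love", "NOUN"), ("is", "VERB")],
    [("I", "PRON"), ("love", "VERB"), ("you", "PRON")],
]
model = train_tagger(training)
print(tag(model, ["I", "love", "cats"]))
```

The word "love" appears with two different tags in the training data, which is exactly the ambiguity that a real structured model resolves by looking at the whole sentence.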

Classification by data labels

Under this criterion, the most common categories are supervised learning, unsupervised learning, and semi-supervised learning. The discussion of output spaces above was essentially about supervised learning, so let's look at the other two.

Unsupervised learning

Clustering, {x[n]} => cluster(x): the classes of the data are unknown, and groups are formed according to some rule.

It can be seen as roughly an unsupervised version of multiclass classification.

Density estimation, {x[n]} => density(x), where density(x) can be a probability density function or a probability function.

It can be seen as roughly an unsupervised version of bounded regression.

Outlier detection, {x[n]} => unusual(x).

It can be seen as roughly an unsupervised version of binary classification.

Examples of clustering include grouping articles on the web by topic, or a company segmenting its customers by their profile data and then applying a different promotion strategy to each segment.

A typical example of density estimation is using traffic reports with locations to predict the areas where the risk of accidents is high.

An example of outlier detection is deciding from network logs whether an abnormal intrusion has occurred. This is an extreme "yes/no" question, and it can be approached with unsupervised methods.
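The three unsupervised tasks can each be sketched on one-dimensional data. The specific techniques are assumptions picked for brevity (two-cluster k-means, a histogram, and a distance-from-the-mean rule); the text above does not prescribe any particular method.

```python
from statistics import mean, pstdev


def two_means(xs, iterations=10):
    """Clustering: {x[n]} => cluster(x), here 1-D k-means with k = 2."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs]


def histogram_density(xs, low, high, bins):
    """Density estimation: {x[n]} => density(x), piecewise constant per bin.

    Assumes every x satisfies low <= x <= high.
    """
    width = (high - low) / bins
    counts = [0] * bins
    for x in xs:
        counts[min(int((x - low) / width), bins - 1)] += 1
    return [c / (len(xs) * width) for c in counts]


def is_unusual(xs, x, threshold=3.0):
    """Outlier detection: {x[n]} => unusual(x), by distance from the mean."""
    return abs(x - mean(xs)) > threshold * pstdev(xs)


data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
print(two_means(data))
print(histogram_density(data, 0.0, 6.0, 3))
print(is_unusual(data, 100.0), is_unusual(data, 3.5))
```

None of the three functions receives a label y[n]; each works from the inputs x[n] alone, which is what makes them unsupervised.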

Semi-supervised learning

Semi-supervised learning combines supervised and unsupervised learning. It studies how to train and classify using a small number of labeled samples together with a large number of unlabeled samples. The aim is to use the abundant unlabeled data to improve the performance of the learning algorithm.

There are five main families of semi-supervised learning algorithms: probability-based algorithms; methods that modify existing supervised algorithms; methods that rely directly on the cluster assumption; multi-view methods; and graph-based methods.

An example of semi-supervised learning is face recognition in photos on Facebook, where only a small fraction of the faces may be tagged and most are untagged.
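One way to picture how unlabeled samples can help is self-training on top of a supervised classifier. The sketch below is an assumption for illustration: it uses a nearest-centroid classifier on one-dimensional data, and the `confidence_gap` parameter and the sample values are hypothetical. It needs at least two classes in the labeled set.

```python
from statistics import mean


def centroids(labeled):
    """Supervised step: one centroid per class, computed from labeled points."""
    labels = sorted({y for _, y in labeled})
    return {label: mean(x for x, y in labeled if y == label) for label in labels}


def self_train(labeled, unlabeled, confidence_gap=2.0, rounds=10):
    """Repeatedly pseudo-label the unlabeled points the classifier is sure about."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        centers = centroids(labeled)
        confident, uncertain = [], []
        for x in pool:
            ranked = sorted(centers, key=lambda label: abs(x - centers[label]))
            # Gap between the distances to the second-nearest and nearest centroid.
            gap = abs(x - centers[ranked[1]]) - abs(x - centers[ranked[0]])
            (confident if gap >= confidence_gap else uncertain).append((x, ranked[0]))
        if not confident:
            break  # nothing left that the classifier is sure about
        labeled += confident
        pool = [x for x, _ in uncertain]
    return centroids(labeled)


few_labels = [(0.0, "a"), (10.0, "b")]
many_unlabeled = [1.0, 2.0, 9.0, 4.5]
print(self_train(few_labels, many_unlabeled))
```

With only two labeled points the centroids start at 0.0 and 10.0; after absorbing the confidently pseudo-labeled points they move to where the data actually lies, while the ambiguous point 4.5 is left unlabeled.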

Reinforcement learning

Reinforcement learning is a special machine learning method that adapts to its environment by taking feedback from the environment as input. It learns a mapping from environment states to actions so that the cumulative reward the system receives from the environment is maximized. Unlike supervised learning, it is not told which action to take through positive and negative examples; instead, it discovers the optimal behavior policy through trial and error.

The output here is not necessarily the output you really want; instead, a reward or a penalty tells the system whether it is doing well or not.

For example, an online advertising system can be seen as being trained by its customers. The system shows a customer an advertisement, which is one possible output, and whether the customer clicks it, or whether it earns money, measures how well the advertisement was placed. From this feedback the advertising system learns to place more suitable ads.
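The advertising example can be simulated as a trial-and-error loop. This sketch assumes an epsilon-greedy strategy, which is one simple choice among many and is not named in the text; the click rates are invented, and the "customer" is simulated with a seeded random number generator.

```python
import random


def run_ad_system(click_rates, steps=5000, epsilon=0.1, seed=0):
    """Show one ad per step and learn from click / no-click feedback only."""
    rng = random.Random(seed)
    shows = [0] * len(click_rates)
    clicks = [0] * len(click_rates)

    def estimate(ad):
        # Observed click rate so far; 0.0 for an ad that was never shown.
        return clicks[ad] / shows[ad] if shows[ad] else 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            ad = rng.randrange(len(click_rates))             # explore: try any ad
        else:
            ad = max(range(len(click_rates)), key=estimate)  # exploit: best so far
        # Simulated customer: the reward is 1 for a click, 0 otherwise.
        reward = 1 if rng.random() < click_rates[ad] else 0
        shows[ad] += 1
        clicks[ad] += reward
    return [estimate(ad) for ad in range(len(click_rates))]


estimates = run_ad_system([0.05, 0.5, 0.2])  # hypothetical true click rates
print(estimates.index(max(estimates)))       # index of the ad the system prefers
```

The system is never told which ad is correct. It only receives a reward after each choice, and the true click rates are hidden inside the simulated customer.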

Classification by the protocol for communicating with the machine

Batch learning: the data is fed to the learning algorithm in one batch, which can be likened to spoon-feeding.

Online learning: the data arrives sequentially, and the model is continually revised and improved.

The hypothesis "improves" by receiving data instances.

Both of the above can be viewed as passive learning.

Active learning: the machine has the ability to ask questions; it chooses an input x[n] and asks for the corresponding output y[n].

It improves the hypothesis with (hopefully) fewer labels by asking questions.

It is used when labels are expensive to obtain.

Most machine learning in practice is batch learning.

An example of online learning is a spam filter. The e-mails are not all available for training and recognition at once; they arrive one by one, so the filter is updated continually as it learns sequentially.

A simple example of active learning is the tagging of friends' photos in Qzone (QQ space), where the machine asks people questions.
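The online protocol can be sketched with a perceptron-style update, in which the hypothesis is corrected after each example that arrives. The perceptron is an assumption chosen for illustration, and the small data stream is invented; the first component of every input is a constant 1.0 that acts as a bias term.

```python
def online_update(weights, x, y, learning_rate=1.0):
    """One online step: see a single (x, y) and correct the hypothesis if wrong."""
    score = sum(w * xi for w, xi in zip(weights, x))
    prediction = 1 if score > 0 else -1
    if prediction != y:
        # Move the weights toward classifying this example correctly.
        weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
    return weights


# Examples arrive one at a time, as e-mails would for a spam filter.
stream = [
    ([1.0, 2.0], 1),
    ([1.0, -1.0], -1),
    ([1.0, 1.0], 1),
    ([1.0, -2.0], -1),
]

weights = [0.0, 0.0]
for x, y in stream:
    weights = online_update(weights, x, y)
print(weights)
```

A batch learner would receive the whole `stream` list at once before producing a model. Here the model exists, and is usable, after every single example.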

Classification by input space

Concrete features: each dimension of the input X has been organized and analyzed by humans, often using knowledge of a professional domain.

Raw features: these must be converted, by people or by machines, into concrete features. Machine vision and audio signal recognition belong to this class.

Abstract features

Examples of abstract features are a student's ID number in an online teaching system and an advertisement's ID number in an advertising system. Using them requires even more feature extraction.

A supplement on raw features

Here we add some background on deep learning as it relates to raw features.

In deep learning, the machine performs feature extraction automatically. It needs a large amount of data, or unsupervised learning, to learn how to extract concrete features from the raw input.

Deep learning combines low-level features into more abstract high-level representations (attribute categories or features) in order to discover distributed representations of the data.

The above is a brief description of the types of learning algorithms I met as a beginner. It may be inaccurate, so readers should analyze and judge for themselves. Finally, a note to myself: in my studies I do not need to make these notes detailed and comprehensive. What matters is deepening my understanding of the issues while recording and writing, and thinking for myself. After all, I am not writing this to publish a book; I am writing to accumulate the key points of what I learn and to manage my knowledge better. Of course, it would be even better if it also helps the reader.
