What are the algorithms behind VR positioning and motion capture?

Lei Feng Network note: The author of this article is a senior practitioner in the VR industry.

Via:news.livedoor.com

In 2016, VR swept through the entire technology circle like a tornado. For a time, every industry was tempted to follow VR to show off its innovativeness, and more and more people began to understand and step into VR. Of course, there was also no shortage of people who simply joined in the excitement. As the saying goes, these days you cannot even brag without knowing a few professional terms. So the author has put together several algorithms commonly used in VR for your reference. How many of these impressive-sounding algorithms do you know?

1, FK algorithm

Kinematics is divided into forward kinematics and inverse kinematics. FK is the abbreviation of Forward Kinematics; IK is the abbreviation of Inverse Kinematics. The hierarchical skeleton of the human body is composed of a number of link chains organized in a hierarchy, including hierarchical joints or chains, motion constraints, and effectors, with the effector driving the movement of all the parts together.

For example, the shoulder joint, elbow joint, wrist joint, and their child bones form a link chain, that is, a kinematic chain. It is one branch of the kinematic chain of the whole body, and the body uses such chains to control movement. Given the rotation angle of each joint on the chain, finding the position of each joint and of the end effector is the forward kinematics problem. Given the position of the end effector, determining the rotation angles and positions of its ancestor joints is the inverse kinematics problem.

First, let's take a closer look at FK, forward kinematics.

Forward kinematics assumes that child joints follow the motion of their parent joints, while a child joint can move independently without affecting the state of its parent. Take human motion as an example. When we raise an arm, the wrist joint rises with its parent elbow joint, and the elbow moves with the rotation of its parent shoulder joint. When the wrist joint rotates, none of its parent joints move. This is typical forward kinematic motion. Therefore, if we know the rotation angle of each joint in the kinematic chain, we can determine the motion of its child joints.

The advantage of forward kinematics is that the calculation is simple and fast. The disadvantage is that the angle and position of every joint must be specified. Because the nodes of the skeleton are inherently correlated, directly specifying the value of each joint easily produces unnatural, uncoordinated motion. When applied to VR motion capture, users need to wear a motion capture device on every bone segment, which is inconvenient.

When forward kinematics is applied to VR motion capture, the specific implementation flow is:

The user wears a motion capture node on each bone segment of a chain, for example the hand, forearm, upper arm, and shoulder. The nodes measure the rotation angle of each joint during movement, and these angles are fed into the FK algorithm. Together with the corresponding bone lengths, the positions of the child joints and of the end effector can be calculated, and this information is then used to drive the movement of the whole human body model.
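The flow above can be illustrated with a minimal sketch of forward kinematics on a planar (2D) chain. It assumes each joint angle is measured relative to its parent bone; the function name `forward_kinematics` and the bone lengths are illustrative, not taken from any particular motion capture product.

```python
import math

def forward_kinematics(bone_lengths, joint_angles, origin=(0.0, 0.0)):
    """Return the position of every joint of a planar chain.

    The first entry is the root joint, the last is the end effector.
    joint_angles[i] is the rotation of bone i relative to its parent (radians).
    """
    x, y = origin
    heading = 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle  # a child inherits the accumulated rotation of its parents
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions

# Hypothetical arm: upper arm 0.30 m, forearm 0.25 m, hand 0.08 m.
# Shoulder rotated +90 degrees, elbow -90 degrees, wrist straight.
for joint in forward_kinematics([0.30, 0.25, 0.08], [math.pi / 2, -math.pi / 2, 0.0]):
    print(joint)
```

Changing only the last angle moves only the end effector, while changing the first angle moves every joint after it, which is the parent-child behavior described above.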

Application: Motion capture technology.

2, IK algorithm

Next, I will introduce IK, inverse kinematics.

The problem that IK solves was introduced above. Let me use throwing a ball as an example: if we know the ball's starting position, final position, and path, then the rotation of the pitcher's arm can be calculated automatically by inverse kinematics. Inverse kinematics relieves some of the tedious work of forward kinematics and is one of the best ways to generate realistic joint motion.

There are many ways to solve the IK problem, which can be roughly divided into two categories:

Analytical methods (analytic solutions): All solutions can be obtained, and IK chains with few degrees of freedom are solved quickly, so these methods suit control problems with few degrees of freedom and are convenient for real-time control. However, as the number of joints grows, the complexity of the analytical equations increases dramatically. Therefore, analytical methods are only suitable for chains with few degrees of freedom and are not suitable for complex IK chains.
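As a small illustration of an analytical solution, here is a sketch for the simplest case: a planar chain with two bones whose root is at the origin. The closed form comes from the law of cosines. The function name `two_link_ik` is illustrative, and the example returns both the "elbow up" and "elbow down" solutions, which shows what "all solutions can be obtained" means in practice.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Analytic IK for a planar two-bone chain rooted at the origin.

    Returns a list of (shoulder_angle, elbow_angle) pairs in radians,
    or an empty list if the target (x, y) is out of reach.
    """
    d2 = x * x + y * y
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_elbow) > 1.0:
        return []  # target is too far away or too close to the root
    solutions = []
    for sign in (1.0, -1.0):  # the two mirror-image elbow configurations
        elbow = sign * math.acos(cos_elbow)
        shoulder = math.atan2(y, x) - math.atan2(
            l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
        )
        solutions.append((shoulder, elbow))
    return solutions

print(two_link_ik(1.0, 1.0, 1.0, 1.0))
```

With more joints there is no such compact formula, which is why the text turns to numerical methods next.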

Numerical methods (numerical solutions): The advantages of numerical methods are generality and flexibility. They can handle IK chains with more degrees of freedom and more complex hierarchical structure, and new constraints can easily be added to the chain. A numerical method is essentially a process of repeated approximation and continuous iteration. Because the IK problem is complex, the weakness of numerical methods is their high computational cost, and because the solution is obtained iteratively, the result is an approximation that may not be exact.
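One well-known numerical method is cyclic coordinate descent (CCD), which the article does not name; it is used here only as an assumed example of the "repeated approximation" idea. The sketch below works on a planar chain with relative joint angles, and the names `chain_points` and `ccd_ik` are illustrative.

```python
import math

def chain_points(lengths, angles):
    """Forward kinematics of a planar chain rooted at the origin."""
    x = y = heading = 0.0
    points = [(0.0, 0.0)]
    for length, angle in zip(lengths, angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points

def ccd_ik(lengths, angles, target, tol=1e-4, max_iters=200):
    """Iteratively rotate one joint at a time so the end effector approaches target.

    Returns the final joint angles and the remaining distance to the target.
    """
    angles = list(angles)
    for _ in range(max_iters):
        if math.dist(chain_points(lengths, angles)[-1], target) < tol:
            break
        for j in reversed(range(len(angles))):  # from the last joint back to the root
            points = chain_points(lengths, angles)
            jx, jy = points[j]
            ex, ey = points[-1]
            to_end = math.atan2(ey - jy, ex - jx)
            to_target = math.atan2(target[1] - jy, target[0] - jx)
            angles[j] += to_target - to_end  # swing the end effector toward the target
    error = math.dist(chain_points(lengths, angles)[-1], target)
    return angles, error

solution, error = ccd_ik([1.0, 1.0, 1.0], [0.1, 0.1, 0.1], (1.5, 1.0))
print(solution, error)
```

The returned error makes the trade-off visible: the answer is only as accurate as the tolerance and iteration budget allow, and a different starting pose can lead to a different solution.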

Since inverse kinematics can solve positioning problems, both VR motion capture and gesture recognition can use IK algorithms. Taking motion capture as an example, the specific implementation process is as follows:

Motion capture uses a special sensor called a tracker to record the motion information of the performer. The recorded data can then be used to generate animated motion.

The general flow of motion capture using the IK algorithm is as follows:

First, set up the human body model in the VR content and reserve a data interface for the model;

Next, use hardware to obtain the position of the end effector, and use the IK (inverse kinematics) algorithm to calculate the human motion data, including joint rotation angles and positions;

Finally, pass this data to the interface reserved for the human body model, which drives the model to move in step with the person wearing the hardware and displays it in the content.

Applications: motion capture technology, gesture recognition technology.

3, PNP

Strictly speaking, PNP (Perspective-n-Point) is a problem rather than an algorithm. The PNP problem was posed by Fischler and Bolles in 1981.

The PNP problem is stated as follows: given n feature points, where the distance between every pair of feature points and the angle that every pair subtends at the optical center are known, solve for the distance between each feature point and the optical center. The main use of PNP is to determine the coordinates of the n feature points on the target object in the camera coordinate system, then calculate their coordinates in the world coordinate system from the camera's intrinsic and extrinsic parameters obtained by calibration, and finally give the pose of the target.

Solving the PNP problem is a localization method based on a single image, and it is widely used in VR for target positioning and pose estimation.

There are many ways to solve the PNP problem, which can be roughly divided into two categories: non-iterative algorithms and iterative algorithms:

Non-iterative algorithms mainly address PNP problems with few feature points, such as P3P and P4P. They use algebraic methods to solve directly for the relative pose of the measured target, and a variety of analytical algorithms have been derived from them. Non-iterative algorithms are fast, but they are affected by systematic errors and their accuracy is generally not high, so they are mainly used to compute the initial value for iterative algorithms. Non-iterative solutions mainly target two cases: six or more non-coplanar feature points, or four or more coplanar feature points.

Analytical solutions of the PNP problem are derived on the assumption that there is no image noise, and they are particularly sensitive to position errors of the image points. To overcome the influence of noise and improve the precision of pose calculation, iterative PNP algorithms are often used. The main idea is to express the PNP problem as a constrained nonlinear optimization problem and solve it to obtain a numerical solution for the relative pose of the measured target. The variable space of this optimization is (N+6)-dimensional, where N is the number of feature points, so the amount of iterative computation is large, and the result depends on the accuracy of the initial value. As a consequence, the algorithm often converges to a local minimum or to a wrong solution rather than to the global minimum.

The description above may be rather abstract, so let me use P3P as an example:

In the P3P setup, O is the camera's optical center, and the distances from O to the three feature points A, B, and C of the target are x, y, and z. The angles between the three lines of sight, α, β, and γ, are known, as are the side lengths |AB| = c, |AC| = b, and |BC| = a. Solving for x, y, and z from α, β, γ and a, b, c is the P3P problem.

The P3P problem has only three feature points, so a non-iterative algorithm can be used directly. The equations come from applying the law of cosines to the triangles OBC, OAC, and OAB.
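The equations themselves are not reproduced in this text. As a sketch, assume that α is the angle between OB and OC, β the angle between OA and OC, and γ the angle between OA and OB (the text does not say which angle is which). The law of cosines then gives three constraints on x, y, and z:

```latex
\begin{aligned}
y^2 + z^2 - 2yz\cos\alpha &= a^2 \\
x^2 + z^2 - 2xz\cos\beta  &= b^2 \\
x^2 + y^2 - 2xy\cos\gamma &= c^2
\end{aligned}
```

This system can have more than one valid solution (up to four), which is one reason a fourth point is often used to pick the correct one.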

Let A', B', and C' be the projections of A, B, and C on the imaging plane of the camera. After x, y, and z have been obtained, the coordinates of the feature points in the camera coordinate system can be solved from the coordinates of A', B', and C' using the camera's imaging model.

PNP algorithms can be applied to VR positioning technologies, such as infrared optical positioning, to obtain pose information.
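This last step can be sketched under the assumption of an ideal pinhole camera without lens distortion. The intrinsic parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) and the function name are illustrative: the image point defines a viewing ray through the optical center, and the solved distance fixes the point on that ray.

```python
import math

def pixel_to_camera_point(u, v, distance, fx, fy, cx, cy):
    """Place a feature point in the camera coordinate system.

    (u, v) is the image point in pixels; distance is the solved length
    from the optical center O to the feature point (x, y, or z in the text).
    """
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)  # direction of the line of sight
    norm = math.sqrt(sum(c * c for c in ray))
    return tuple(distance * c / norm for c in ray)

# Hypothetical camera: 800 px focal length, principal point at (320, 240).
print(pixel_to_camera_point(400.0, 400.0, math.sqrt(105.0), 800.0, 800.0, 320.0, 240.0))
```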

Concrete implementation process:

First, the camera captures an image of the target object, and the feature points are extracted from the image;

Next, a PNP algorithm is used to obtain the coordinates of the feature points in the camera coordinate system;

Finally, a coordinate transformation (rotation and translation) converts the coordinates from the camera coordinate system to the world coordinate system, giving the positions of the feature points in the world coordinate system.
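The final conversion can be sketched as follows. It assumes the calibrated extrinsic parameters are a rotation matrix R and a translation vector t that map world coordinates to camera coordinates as p_cam = R * p_world + t; other conventions exist, so this direction is an assumption, and the function name is illustrative.

```python
def camera_to_world(p_cam, R, t):
    """Invert p_cam = R * p_world + t, i.e. p_world = R^T * (p_cam - t).

    R is a 3x3 rotation matrix given as a list of rows, t is a 3-vector.
    """
    d = [p_cam[i] - t[i] for i in range(3)]
    # Multiply by the transpose of R (the inverse of a rotation matrix).
    return [sum(R[r][c] * d[r] for r in range(3)) for c in range(3)]

# Hypothetical extrinsics: 90 degree rotation about the z axis plus an offset.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [1.0, 2.0, 3.0]
print(camera_to_world([-4.0, 6.0, 9.0], R, t))
```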

Applications: Infrared optical positioning and other VR positioning technologies. In addition, the pose information obtained by PNP can be fed into the IK algorithm to achieve VR motion capture.

4, POSIT algorithm

In fact, the POSIT algorithm is one of the iterative algorithms for the PNP problem mentioned above. Because it has a wide convergence region and runs fast, POSIT is very widely used in the VR industry. As one branch of PNP solutions, iterative algorithms avoid solving systems of nonlinear equations, unlike non-iterative solutions, which reduces the computational complexity to a certain extent. POSIT is a typical representative of the iterative algorithms.

The input to POSIT is the 3D coordinates of at least four non-coplanar feature points on the surface of a three-dimensional object, together with the 2D coordinates of the corresponding feature points in the image. It is built on the weak-perspective assumption that all points on the object have the same depth, that is, the depth differences between points of the object are ignored.

First, an initial value of the object's pose parameters is obtained from the orthographic projection and scaling relationship (the POS algorithm, Pose from Orthography and Scaling). The feature points are then re-projected using this initial pose, the resulting image points are used as new measurements, POS is run again, and the iteration is repeated until the desired accuracy is reached.
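The loop described above can be sketched for the smallest possible case. This is a simplified illustration, not a reference implementation: it assumes exactly four non-coplanar points (so a 3x3 linear solve replaces the pseudo-inverse used for more points), image coordinates measured relative to the principal point, and a known focal length in pixels. The name `posit` and the helper names are illustrative.

```python
import math

def _sub(a, b): return [a[i] - b[i] for i in range(3)]
def _dot(a, b): return sum(a[i] * b[i] for i in range(3))
def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]
def _det3(m): return _dot(m[0], _cross(m[1], m[2]))

def _solve3(A, b):
    """Solve the 3x3 linear system A v = b with Cramer's rule."""
    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("object points must be non-coplanar")
    return [_det3([[b[r] if c == k else A[r][c] for c in range(3)] for r in range(3)]) / det
            for k in range(3)]

def posit(object_pts, image_pts, focal, max_iters=100, tol=1e-12):
    """Estimate rotation (rows i, j, k) and translation of the first object point."""
    A = [_sub(p, object_pts[0]) for p in object_pts[1:4]]  # vectors M0 -> Mi
    x0, y0 = image_pts[0]
    eps = [0.0, 0.0, 0.0]  # eps = 0 is the weak-perspective (POS) starting point
    for _ in range(max_iters):
        xp = [image_pts[n + 1][0] * (1.0 + eps[n]) - x0 for n in range(3)]
        yp = [image_pts[n + 1][1] * (1.0 + eps[n]) - y0 for n in range(3)]
        I, J = _solve3(A, xp), _solve3(A, yp)  # scaled first two rows of the rotation
        s1, s2 = math.sqrt(_dot(I, I)), math.sqrt(_dot(J, J))
        i_vec, j_vec = [c / s1 for c in I], [c / s2 for c in J]
        k_vec = _cross(i_vec, j_vec)
        z0 = focal / ((s1 + s2) / 2.0)  # depth of the reference point M0
        new_eps = [_dot(a, k_vec) / z0 for a in A]  # relative depth corrections
        done = max(abs(new_eps[n] - eps[n]) for n in range(3)) < tol
        eps = new_eps
        if done:
            break
    return [i_vec, j_vec, k_vec], [x0 * z0 / focal, y0 * z0 / focal, z0]

# Hypothetical data: a small tetrahedron seen by a camera with focal length 760 px.
obj = [[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]]
ang, T, f = 0.4, [10.0, -5.0, 200.0], 760.0
R = [[math.cos(ang), 0.0, math.sin(ang)], [0.0, 1.0, 0.0], [-math.sin(ang), 0.0, math.cos(ang)]]
cam = [[_dot(R[r], p) + T[r] for r in range(3)] for p in obj]
img = [(f * c[0] / c[2], f * c[1] / c[2]) for c in cam]
print(posit(obj, img, f))
```

No initial pose guess is passed in: the first pass with all corrections set to zero is the POS step, and each later pass only refines the depth corrections.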

The POSIT algorithm has the following advantages: unlike traditional iterative algorithms, it does not need an approximate initial pose estimate; it is easy to implement in code, reportedly requiring only about 25 lines of essential code in MATLAB; and its running time is only about 10% of that of numerical iterative algorithms.

Applications: Infrared optical positioning and other VR positioning technology.

I have introduced several problems and algorithms that are commonly used in VR. This article does not go deep and is only a simple popular-science introduction. If you are an industry enthusiast and want to learn about VR, I hope it helps.

Lei Feng Network note: This article is an exclusive Lei Feng Network contribution. For reprints, please contact us for authorization, credit the source and author, and do not delete any content.
