(01) ORB-SLAM2 source code analysis with no blind spots - (04) Monocular tracking: overall framework, TrackMonocular → GrabImageMonocular

This post belongs to my series of SLAM articles, "The most complete SLAM in history, starting from scratch". The catalog for the (01) ORB-SLAM2 source code analysis in this column is here:
(01) ORB-SLAM2 source code analysis with no blind spots - (00) Catalog (latest): https://blog.csdn.net/weixin_43013761/article/details/123092196


1. Introduction

Earlier we debugged with depth (RGB-D) images and gave a brief explanation. The depth-image path, however, touches fewer parts of the system than the monocular path, so to cover more ground we will use monocular images from here on. As described in the previous post, the monocular example is run as follows:

cd "/my_work/01.ORB-SLAM2 source code analysis/ORB_SLAM2"
# Usage: ./Examples/Monocular/mono_tum Vocabulary/ORBvoc.txt Examples/Monocular/TUMX.yaml PATH_TO_SEQUENCE_FOLDER
# TUMX.yaml must match the dataset you downloaded (TUM1.yaml for freiburg1 sequences), and PATH_TO_SEQUENCE_FOLDER is the path to your dataset folder, so I changed it to:
./Examples/Monocular/mono_tum Vocabulary/ORBvoc.txt Examples/Monocular/TUM1.yaml "/my_work/01.ORB-SLAM2 source code analysis/Datasets/rgbd_dataset_freiburg1_xyz"

After running the command above, execution starts at the main function in ./Examples/Monocular/mono_tum.cc. Its core code is as follows (only the important parts are shown):

//Loop, read images for tracking
	for(int ni=0; ni<nImages; ni++)
	{
		//Read the image to get the pixels
		im = cv::imread(string(argv[3])+"/"+vstrImageFilenames[ni],CV_LOAD_IMAGE_UNCHANGED);
		//Timestamp of the current image
		double tframe = vTimestamps[ni];
		//Perform monocular tracking based on the input image
		SLAM.TrackMonocular(im,tframe);
	}
	// Stop all threads
	SLAM.Shutdown();

As you can see, the core of the loop is SLAM.TrackMonocular(im,tframe), and all the following chapters are organized around this function.
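The loop indexes vstrImageFilenames and passes a timestamp tframe to TrackMonocular, so both lists have to be filled before the loop starts. As a rough sketch of how such a list could be loaded, the block below parses text in the layout of the TUM rgb.txt file (comment lines starting with #, then one "timestamp filename" pair per line). The names ImageList and parseImageList are hypothetical, and skipping lines by their # prefix is a choice made for this sketch, not a copy of the loader in mono_tum.cc.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the two parallel arrays used by the main loop.
struct ImageList {
    std::vector<std::string> filenames;
    std::vector<double> timestamps;
};

// Parse "timestamp filename" pairs, ignoring blank lines and '#' comments.
ImageList parseImageList(std::istream &in) {
    ImageList list;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#')
            continue;
        std::istringstream fields(line);
        double timestamp = 0.0;
        std::string filename;
        // Keep the entry only if both fields were read successfully.
        if (fields >> timestamp >> filename) {
            list.timestamps.push_back(timestamp);
            list.filenames.push_back(filename);
        }
    }
    // Both arrays must stay index-aligned for the tracking loop.
    assert(list.filenames.size() == list.timestamps.size());
    return list;
}
```

Because the function takes any std::istream, it can be fed a std::ifstream opened on the sequence folder's rgb.txt or a std::istringstream in a quick experiment.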

2. TrackMonocular

The implementation of TrackMonocular is in src/System.cc. Before going through it, let us first review the ORB_SLAM2::System constructor explained in the previous section, which contains the following code:

    //Tracker, responsible for tracking and its related operations
    mpTracker = new Tracking(this,mpVocabulary,mpFrameDrawer,mpMapDrawer,mpMap,mpKeyFrameDatabase,strSettingsFile,mSensor);
    //Local mapper, responsible for building the local map
    mpLocalMapper = new LocalMapping(mpMap,mSensor==MONOCULAR);
    //Loop closer, responsible for loop detection and loop closing
    mpLoopCloser = new LoopClosing(mpMap,mpKeyFrameDatabase,mpVocabulary,mSensor!=MONOCULAR);

The objects mpTracker, mpLocalMapper, and mpLoopCloser are the three core objects of the entire system. With the previous chapter in mind, let's take a look at the TrackMonocular function:

//Similarly, the tracker interface when the input is a monocular image
cv::Mat System::TrackMonocular(const cv::Mat &im, const double &timestamp)
{
    //The sensor type must be monocular, otherwise report the error and exit
    if(mSensor!=MONOCULAR)
    {
        cerr << "ERROR: you called TrackMonocular but input sensor was not set to Monocular." << endl;
        exit(-1);
    }

    // Check mode change
    {
        // Exclusive lock, protects mbActivateLocalizationMode and mbDeactivateLocalizationMode
        unique_lock<mutex> lock(mMutexMode);
        // If mbActivateLocalizationMode is true, stop the local mapping thread
        if(mbActivateLocalizationMode)
        {
            mpLocalMapper->RequestStop();

            // Wait until Local Mapping has effectively stopped
            while(!mpLocalMapper->isStopped())
            {
                usleep(1000);
            }

            // With local mapping stopped, tracking only estimates the camera pose and does not update the local map.
            // Set mbOnlyTracking to true
            mpTracker->InformOnlyTracking(true);
            // Stopping the thread leaves more resources for the other threads
            mbActivateLocalizationMode = false;
        }
        // If mbDeactivateLocalizationMode is true, release the local mapping thread and clear its queue of new keyframes
        if(mbDeactivateLocalizationMode)
        {
            mpTracker->InformOnlyTracking(false);
            mpLocalMapper->Release();
            mbDeactivateLocalizationMode = false;
        }
    }

    // Check reset
    {
        unique_lock<mutex> lock(mMutexReset);
        if(mbReset)
        {
            mpTracker->Reset();
            mbReset = false;
        }
    }

    //Get the estimation result of camera pose
    cv::Mat Tcw = mpTracker->GrabImageMonocular(im,timestamp);

    unique_lock<mutex> lock2(mMutexState);
    mTrackingState = mpTracker->mState;
    mTrackedMapPoints = mpTracker->mCurrentFrame.mvpMapPoints;
    mTrackedKeyPointsUn = mpTracker->mCurrentFrame.mvKeysUn;

    return Tcw;
}

Overall, this function is fairly simple. The main process is as follows:

1. Check whether the sensor type is monocular. If not, the configuration is wrong: the function prints an error and terminates the program.
2. Take the mode lock (mMutexMode):
	(1) If localization mode has been requested, ask local mapping to stop, wait until the local mapping thread has stopped, and switch the tracker to tracking-only mode.
	(2) If leaving localization mode has been requested, turn off tracking-only mode and release local mapping so it can work again.
3. Take the reset lock (mMutexReset): check whether a reset has been requested and, if so, perform the reset.
4. Core part: obtain the camera pose from the input image (this covers feature extraction and matching, map initialization, keyframe queries, and so on).
5. Update the shared data: the tracking state, the map points of the current frame, the undistorted keypoints of the current frame, and so on.
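Step 2 follows a common pattern: another thread sets a request flag, and the next tracking call consumes it under a lock. The block below is a minimal sketch of that pattern using only the standard library. ModeRequests and its member names are hypothetical and exist only for illustration; the sketch leaves out the calls that actually stop and release local mapping.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the two request flags guarded by mMutexMode.
class ModeRequests {
public:
    // Called from another thread (for example a viewer) to request a switch.
    void requestLocalizationOnly() {
        std::lock_guard<std::mutex> lock(mutex_);
        activate_ = true;
    }
    void requestFullSlam() {
        std::lock_guard<std::mutex> lock(mutex_);
        deactivate_ = true;
    }

    // Called at the start of every tracking call. Consumes pending requests
    // and returns the tracking-only setting that should now be in effect.
    bool apply(bool onlyTracking) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (activate_) {
            onlyTracking = true;   // the real code also stops local mapping here
            activate_ = false;
        }
        if (deactivate_) {
            onlyTracking = false;  // the real code also releases local mapping here
            deactivate_ = false;
        }
        // Every pending request has now been consumed.
        assert(!activate_ && !deactivate_);
        return onlyTracking;
    }

private:
    std::mutex mutex_;
    bool activate_ = false;
    bool deactivate_ = false;
};
```

Clearing each flag inside the locked section is what makes a request take effect once: a later call to apply() with no new request returns its argument unchanged.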

The core part, obtaining the camera pose from the input image, is the function GrabImageMonocular().

3. GrabImageMonocular

This function is called as follows:

cv::Mat Tcw = mpTracker->GrabImageMonocular(im,timestamp);

It mainly performs the following operations:

1. If the input image is not grayscale, convert it to grayscale.
2. Construct the Frame with a different ORB extractor depending on whether the system has been initialized: before initialization, the extractor requests twice the configured number of feature points.
3. Call Track() to perform tracking.
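The choice in step 2 can be reduced to a small function. The sketch below is an illustration only: the enum TrackingState and the function featuresToExtract are hypothetical names that mirror the states mentioned in this post, and the factor of two comes from the comment on the initialization extractor in the code listing.

```cpp
#include <cassert>

// Hypothetical mirror of the tracking states discussed in this post.
enum class TrackingState { NoImagesYet, NotInitialized, Ok, Lost };

// Returns how many ORB features to request for the current frame:
// twice the configured number until the map has been initialized.
int featuresToExtract(TrackingState state, int nFeatures) {
    assert(nFeatures > 0);
    const bool initializing = state == TrackingState::NoImagesYet ||
                              state == TrackingState::NotInitialized;
    return initializing ? 2 * nFeatures : nFeatures;
}
```

With a configured count of 1000, for example, this sketch requests 2000 features while the state is NoImagesYet or NotInitialized and 1000 afterwards. A plausible reason for the larger budget is that monocular initialization needs enough matches between two frames to triangulate an initial map.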

The commented code is as follows:

cv::Mat Tracking::GrabImageMonocular(const cv::Mat &im,const double &timestamp)
{
    mImGray = im;

    // Step 1: Convert color image to grayscale image
    //If the picture has 3 or 4 channels, it needs to be converted into a grayscale image.
    if(mImGray.channels()==3)
    {
        if(mbRGB)
            cvtColor(mImGray,mImGray,CV_RGB2GRAY);
        else
            cvtColor(mImGray,mImGray,CV_BGR2GRAY);
    }
    else if(mImGray.channels()==4)
    {
        if(mbRGB)
            cvtColor(mImGray,mImGray,CV_RGBA2GRAY);
        else
            cvtColor(mImGray,mImGray,CV_BGRA2GRAY);
    }

    // Step 2: Construct Frame
    //Check whether initialization has been completed. For the first frame, mState==NO_IMAGES_YET, which also means not initialized.
    if(mState==NOT_INITIALIZED || mState==NO_IMAGES_YET) //NO_IMAGES_YET is the state before any image has been received
        mCurrentFrame = Frame(mImGray,timestamp,
            mpIniORBextractor, //ORB extractor used during initialization, extracts twice the specified number of feature points
            mpORBVocabulary,mK,mDistCoef,mbf,mThDepth);
    else
        mCurrentFrame = Frame(mImGray,timestamp,
            mpORBextractorLeft, //ORB extractor during normal operation, extracts the specified number of feature points
            mpORBVocabulary,mK,mDistCoef,mbf,mThDepth);

    // Step 3: Tracking
    Track();

    //Return the pose of the current frame
    return mCurrentFrame.mTcw.clone();
}
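Step 1 leaves the grayscale conversion to OpenCV. To make the idea concrete, here is a minimal per-pixel sketch, assuming the standard luma weights 0.299, 0.587 and 0.114 for red, green and blue. The function toGray and its isRGB parameter are hypothetical; the parameter plays the role of a channel-order flag such as the Camera.RGB entry in the settings file. OpenCV's own implementation may round slightly differently, so treat the numbers as illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert one pixel to gray. c0..c2 are the channels in memory order;
// isRGB says whether c0 is red (true) or blue (false).
std::uint8_t toGray(std::uint8_t c0, std::uint8_t c1, std::uint8_t c2, bool isRGB) {
    const double r = isRGB ? c0 : c2;
    const double g = c1;
    const double b = isRGB ? c2 : c0;
    // Weighted sum of the channels, rounded to the nearest integer.
    const long y = std::lround(0.299 * r + 0.587 * g + 0.114 * b);
    assert(y >= 0 && y <= 255);
    return static_cast<std::uint8_t>(y);
}
```

The same three bytes give different gray values depending on the assumed order: a pixel whose first channel is 255 and whose other channels are 0 maps to 76 when read as RGB but to 29 when read as BGR, which is why the channel order in the settings has to match the dataset.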

4. Conclusion

From the GrabImageMonocular function we can see that its core logic lives in Track(). The Frame construction before it is also very important, because it does much of the preparatory work for tracking, such as building the image pyramid, extracting features, undistorting keypoints, and distributing feature points evenly. These are the main topics of our next study.

