Visual Perception: Fundamental Geometry and Camera Basics
A series on Visual Perception fundamentals such as Camera Calibration and Epipolar Geometry, with their mathematical implementations from scratch
--
Visual perception is the human ability to interpret the surrounding environment [1]. In this series, we will learn the fundamentals of visual perception for robotics, using no artificial (learning-based) visual perception methods, only classical approaches built step by step.
These methods assume some familiarity with geometry. Therefore, in this first chapter, we will cover the mathematical concepts required for classical visual perception methods, and then walk through the methods themselves step by step:
- Fundamental Geometry and Camera Basics (starts right below 💃)
- Camera Calibration
  1. DLT
  2. Zhang’s Calibration
- Epipolar Geometry
  1. Essential Matrix
  2. Fundamental Matrix
  3. Triangulation
  4. Feature Matching
- Visual Perception GUI
  A tool implemented with Qt Creator, OpenCV, and C++, in which you can calibrate your camera and apply different epipolar geometry methods
Let’s start with the first section!
Fundamental Geometry and Camera Basics
To express the environment around us in geometric terms, we assume a coordinate system called the “world frame”. Using this world coordinate frame, we can express every single point as a 3D point (x, y, z).
As we know, an image consists only of 2D points with x and y coordinates. What a camera system actually does is express these 3D world-frame points as 2D points in the image frame. But how?
Using a camera model, we project these 3D points to 2D points. The projection may be perspective or orthographic. We will examine perspective projection through the Perspective Projection (pinhole) Camera Model; its basic quantities are defined below, followed by a short worked example:
P: 3D point in the world frame
o: the center of projection = camera center = optical center = aperture
f: the focal length (the distance between the optical center and the image plane, also called the retina)
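To make these definitions concrete, here is the relation they imply. As a minimal sketch, assume the point P = (X, Y, Z) is already expressed in the camera coordinate frame, with o at the origin and the optical axis along Z; the perspective projection onto the image plane is then

```latex
x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}
```

These equations follow from similar triangles: doubling a point’s distance Z from the optical center halves its image coordinates. Below is a small C++ sketch of the same projection; the structs and the function projectPoint are illustrative names, not part of the series’ GUI tool:

```cpp
#include <iostream>

// Illustrative plain structs; the actual tool in this series uses OpenCV types.
struct Point3D { double X, Y, Z; }; // a point in the camera coordinate frame
struct Point2D { double x, y; };    // a point on the image plane

// Pinhole perspective projection with focal length f.
// Assumes Z > 0, i.e., the point lies in front of the camera.
Point2D projectPoint(const Point3D& P, double f) {
    return { f * P.X / P.Z, f * P.Y / P.Z };
}

int main() {
    Point3D P{1.0, 2.0, 4.0};               // 4 units in front of the camera
    Point2D p = projectPoint(P, /*f=*/2.0); // focal length of 2 units
    std::cout << "(" << p.x << ", " << p.y << ")\n"; // prints (0.5, 1)
    return 0;
}
```

Note that the division by Z is what makes the projection perspective: farther points map closer to the image center, which an orthographic projection would not do.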