iPhone Camera Calibration

2023. 12. 16. 10:00ㆍComputer Vision

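What I want to do

I took a photo with an iPhone.

I want to know where a particular pixel of the photo is located in the real world.
I don't need UTM coordinates or latitude and longitude; I want its position relative to a reference point of my own choosing.
In other words, I want to convert image coordinates to world coordinates.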

What you need to know

First, there are three coordinate systems to be aware of: (1) Image, (2) Camera, (3) World

source: Vehicle Localization in 3D World Coordinates Using Single Camera at Traffic Intersection, Sensors'23

What we have is an image in the image coordinate system, and what we want to know is where an image coordinate lies in the world coordinate system.
This requires a transformation between coordinate systems (image -> world), which is possible once the camera matrix is known.
The camera matrix consists of two parts.
1. With the intrinsic matrix $\mathbf{K}$, we can convert between image coordinates $(u, v)$ $\leftrightarrow$ camera coordinates $(x_c, y_c, z_c)$. Here $s$ is the scale factor that makes the third component equal to 1 (in effect $s = 1 / z_c$).
  $$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = s \mathbf{K} \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix}$$
2. With the extrinsic matrix $[\mathbf{R} \vert \mathbf{t}]$, we can convert between camera coordinates $(x_c, y_c, z_c)$ $\leftrightarrow$ world coordinates $(x_w, y_w, z_w)$.
  $$\begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} = [\mathbf{R} \vert \mathbf{t}] \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}$$
The camera matrix can be obtained through calibration.
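As a rough illustration of the two steps, here is a minimal NumPy sketch that chains the extrinsic and intrinsic transforms to map one world point to a pixel. The values of `K`, `R`, and `t` are made up for the example and do not come from any real camera.

```python
import numpy as np

def project(world_xyz, K, R, t):
    """Project one 3D world point to pixel coordinates (u, v)."""
    cam_xyz = R @ world_xyz + t        # world -> camera (extrinsic)
    uvw = K @ cam_xyz                  # camera -> homogeneous image coordinates (intrinsic)
    return uvw[:2] / uvw[2], cam_xyz   # dividing by z_c corresponds to s = 1 / z_c

# Hypothetical values, for illustration only
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                    # camera axes aligned with the world axes
t = np.array([0.0, 0.0, 5.0])    # world origin sits 5 units in front of the camera

uv, cam_xyz = project(np.array([1.0, 0.5, 0.0]), K, R, t)
print(uv)  # [840. 460.]
```

Going in this direction (world -> image) is straightforward; the rest of the post is about recovering `K`, `R`, and `t` so that the mapping can be used at all.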

Finding the Intrinsic Matrix

$$\mathbf{K} = \begin{bmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

1) $\gamma$: skew coefficient

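It describes how much the y-axis of the image sensor cells is tilted, i.e. how far the sensor's x- and y-axes are from being perpendicular.

For most cameras in use today this value is reportedly negligible, so let's just set it to 0.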

A true CCD camera has only four internal camera parameters, since generally s = 0. If s ≠ 0 then this can be interpreted as a skewing of the pixel elements in the CCD array so that the x- and y- axes are not perpendicular. This is admittedly very unlikely to happen.
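Source: Computer Vision Algorithms and Applications by Richard Szeliski
Note that $s$ in this quote denotes the skew (written $\gamma$ in this post), not the scale factor.

For reference, OpenCV's camera model also simply assumes the skew is 0.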
- opencv calibrateCamera func: https://docs.opencv.org/4.0.1/d9/d0c/group__calib3d.html#ga3207604e4b1a1758aa66acb6ed5aa65d

2) $c_x, c_y$: optical center

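This is the offset between the origin of the image coordinate system and the origin of the camera coordinate system.
The origin of the image coordinate system is usually the top-left corner, while the optical axis of the camera usually passes through roughly the center of the image, so $(c_x, c_y)$ is approximated as $(\textup{image_width} / 2, \textup{image_height} / 2)$.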

3) $f_x, f_y$: focal length

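The focal length is the distance between the lens and the image sensor.
Then it should be a single value, so why are there both $f_x$ and $f_y$?
Because the focal length here is expressed in pixels.
- A pixel corresponds to a cell of the image sensor, and if the horizontal and vertical cell spacing differ, $f_x$ and $f_y$ take different values.
The horizontal and vertical cell spacing is also reportedly equal in most cameras today, so let's use a single focal length $f = f_x = f_y$.
The focal length $f$ can be computed if the field of view (fov) $\alpha$ is known.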

source: learnopencv.com/approximate-focal-length-for-webcams-and-cell-phone-cameras/

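Basic trigonometry gives the focal length $f$:
$$f=\frac{w}{2 \tan(\alpha/2)}$$
Here $w$ is the image size in pixels along the direction in which the fov is measured, so check whether $\alpha$ is the vertical, horizontal, or diagonal fov and set $w$ accordingly (image height, width, or diagonal).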

source: https://learnopencv.com/approximate-focal-length-for-webcams-and-cell-phone-cameras/

The iPhone 14 Pro product page lists a camera fov of 120°, but note that this is the DFOV (diagonal), not the HFOV (horizontal).
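To see how much the direction matters, the sketch below derives a focal length from a diagonal fov and then converts it back to horizontal and vertical fovs. It assumes an ideal pinhole camera with no lens distortion, and the 4032x3024 frame size is an assumption for illustration, not a value taken from the spec page; the helper names are hypothetical.

```python
import math

def focal_length_px(width, height, fov_deg, direction):
    """Focal length in pixels from a fov measured along the given direction."""
    extent = {
        "horizontal": width,
        "vertical": height,
        "diagonal": math.hypot(width, height),
    }[direction]
    return extent / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

def fov_deg(extent, focal_px):
    """Fov in degrees spanned by `extent` pixels at focal length `focal_px`."""
    return math.degrees(2.0 * math.atan(extent / (2.0 * focal_px)))

# Assumed frame size (4:3), for illustration only
width, height = 4032, 3024
f = focal_length_px(width, height, 120.0, "diagonal")
hfov = fov_deg(width, f)
vfov = fov_deg(height, f)
print(f"f = {f:.0f} px, HFOV = {hfov:.1f} deg, VFOV = {vfov:.1f} deg")
```

Both the horizontal and vertical fov come out noticeably smaller than 120°, so plugging the DFOV into the formula with $w$ set to the image width would underestimate $f$.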

source: www.apple.com/am/iphone-14-pro/specs/

The fov of the iPhone 14 Pro is summarized as:

| Orientation | Type | x0.5 (ultra wide) | x1.0 (wide) |
| --- | --- | --- | --- |
| portrait | VFOV | 85° | 50° |
| landscape | | | |
source: pixelcraft.photo.blog/2023/06/16/the-real-info-regarding-angle-of-view-on-iphone-cameras/

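4) So the Intrinsic Matrix $\mathbf{K}$ is

If the photo was taken with an iPhone 14 Pro at 0.5x zoom in portrait mode at 1920x1080 (1080 pixels wide, 1920 pixels tall), then the VFOV of 85° gives $f = 1920 / (2 \tan(42.5^\circ)) \approx 1048$ and $(c_x, c_y) = (540, 960)$: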
$$\mathbf{K} = \begin{pmatrix} 1048 & 0 & 540 \\ 0 & 1048 & 960 \\ 0 & 0 & 1 \\ \end{pmatrix}$$
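The numbers in this matrix can be reproduced with a few lines of NumPy. This is only a sketch under the simplifications made above (zero skew, $f_x = f_y$, principal point at the image center); `intrinsic_matrix` is a hypothetical helper, not an OpenCV function.

```python
import math
import numpy as np

def intrinsic_matrix(width, height, vfov_deg):
    """Approximate K with zero skew, f_x = f_y, and the principal point at the image center."""
    f = height / (2.0 * math.tan(math.radians(vfov_deg) / 2.0))
    return np.array([[f, 0.0, width / 2.0],
                     [0.0, f, height / 2.0],
                     [0.0, 0.0, 1.0]])

# Portrait photo: 1080 px wide, 1920 px tall, VFOV of 85 degrees (x0.5 ultra wide)
K = intrinsic_matrix(width=1080, height=1920, vfov_deg=85.0)
print(np.round(K))
```

A proper checkerboard calibration would give a more accurate $\mathbf{K}$ (and distortion coefficients); this approximation only needs the image size and a published fov.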

Finding the Extrinsic Matrix

$$[\mathbf{R} \vert \mathbf{t} ] = \begin{bmatrix} r_{00} & r_{01} & r_{02} & t_{x} \\ r_{10} & r_{11} & r_{12} & t_{y} \\ r_{20} & r_{21} & r_{22} & t_{z} \\ \end{bmatrix}$$

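To find the extrinsic matrix, it seems we just need to determine the 12 unknown variables in the equation below.
- Couldn't we simply solve a system of equations using N pairs of corresponding world coordinates $(x_w, y_w, z_w)$ and camera coordinates $(x_c, y_c, z_c)$?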

$$\begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} = \begin{bmatrix} r_{00} & r_{01} & r_{02} & t_{x} \\ r_{10} & r_{11} & r_{12} & t_{y} \\ r_{20} & r_{21} & r_{22} & t_{z} \\ \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}$$

If that is what you are thinking, you have overlooked something. What we actually know for each world coordinate $(x_w, y_w, z_w)$ is not the corresponding camera coordinate $(x_c, y_c, z_c)$ but the image coordinate $(u, v)$.
Yet solving the equations between the image and world coordinate systems directly raises another problem.
- Because the depth of a 2D image pixel is unknown, many camera coordinates map to a single image coordinate. The transformation between the two coordinate systems therefore retains an unknown scale factor $s$.
  $$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \textcolor{red}{s} \mathbf{K} \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix}$$
- This unknown scale factor $s$ also appears in the relation between the image and world coordinate systems, so the problem is hard to solve as a simple system of linear equations.
  $$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \textcolor{red}{s} \mathbf{K} [\mathbf{R} \vert \mathbf{t}] \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}$$
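The depth ambiguity is easy to check numerically. In the sketch below, an arbitrary camera-frame point (chosen only for illustration) is pushed further away along its viewing ray, and every copy lands on the same pixel.

```python
import numpy as np

K = np.array([[1048.0, 0.0, 540.0],
              [0.0, 1048.0, 960.0],
              [0.0, 0.0, 1.0]])

def to_pixel(cam_xyz):
    """Project a camera-frame point to pixel coordinates."""
    uvw = K @ cam_xyz
    return uvw[:2] / uvw[2]

point = np.array([0.5, -0.2, 2.0])   # arbitrary point in camera coordinates
pixels = [to_pixel(scale * point) for scale in (1.0, 2.0, 10.0)]
for pixel in pixels:
    print(pixel)   # the same (u, v) each time, up to floating-point rounding
```

A single pixel therefore constrains a ray, not a point, which is why extra information (known world points, a known plane) is needed.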
An equation with an unknown scale factor like this can be solved with the Direct Linear Transform (DLT).
In this case, however, DLT is unlikely to give an accurate result, for two reasons.
- The rotation matrix $\mathbf{R}$ has 3 degrees of freedom but is represented by 9 unknown variables, so there is no guarantee that the $\mathbf{R}$ obtained by DLT is a valid rotation matrix.
- We want to minimize the reprojection error, but DLT minimizes an algebraic error instead.
A better option is one of the several algorithms that adjust $\mathbf{R}$ and $\mathbf{t}$ little by little to find the extrinsic matrix with the smallest reprojection error.
- OpenCV's solvePnP function makes a variety of PnP algorithms available simply by setting a flag.
- If there are enough (3D point, 2D point) correspondences, solvePnPRansac can be used in place of solvePnP for robustness to outliers.
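For concreteness, the reprojection error can be written down as follows. This is a minimal NumPy sketch with a made-up pose (the camera flipped 180° about the x-axis, 7 units from the plane); `project` and `reprojection_rmse` are hypothetical helper names, not OpenCV functions.

```python
import numpy as np

def project(world_xyzs, K, R, t):
    """Project (N, 3) world points to (N, 2) pixels using intrinsics K and pose [R|t]."""
    cam = world_xyzs @ R.T + t          # world -> camera
    uvw = cam @ K.T                     # camera -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]     # perspective division

def reprojection_rmse(world_xyzs, image_uvs, K, R, t):
    """Root-mean-square pixel distance between observed and reprojected points."""
    diff = project(world_xyzs, K, R, t) - image_uvs
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

# Synthetic setup with a hypothetical ground-truth pose
K = np.array([[1048.0, 0.0, 540.0], [0.0, 1048.0, 960.0], [0.0, 0.0, 1.0]])
world = np.array([[-1.0, 1.37, 0.0], [1.0, 1.37, 0.0],
                  [1.0, -1.37, 0.0], [-1.0, -1.37, 0.0]])
R_true = np.diag([1.0, -1.0, -1.0])     # 180-degree rotation about the x-axis
t_true = np.array([0.0, 0.0, 7.0])
observed = project(world, K, R_true, t_true)

err_true = reprojection_rmse(world, observed, K, R_true, t_true)
err_off = reprojection_rmse(world, observed, K, R_true, t_true + np.array([0.1, 0.0, 0.0]))
print(err_true, err_off)
```

The error is zero at the true pose and grows to roughly 15 px when the translation is off by 0.1 units, which is the quantity the iterative solvers try to drive down.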

| Flag | Algorithm | Note |
| --- | --- | --- |
| SOLVEPNP_ITERATIVE | Levenberg-Marquardt optimization | at least 4 coplanar points, or at least 6 non-coplanar points, when no initial guess is given |
| SOLVEPNP_P3P | Complete Solution Classification for the Perspective-Three-Point Problem, PAMI’03 | exactly 4 points |
| SOLVEPNP_AP3P | An Efficient Algebraic Solution to the Perspective-Three-Point Problem, CVPR’17 | exactly 4 points |
| SOLVEPNP_EPNP | EPnP: An Accurate O(n) Solution to the PnP Problem, IJCV’09 | at least 4 points |
| SOLVEPNP_SQPNP | A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem, ECCV’20 | at least 3 points |
| SOLVEPNP_IPPE | Infinitesimal Plane-based Pose Estimation, IJCV'14 | at least 4 coplanar points |
| ~~SOLVEPNP_DLS~~ | A Direct Least-Squares (DLS) method for PnP, ICCV’11 | unstable results; falls back to SOLVEPNP_EPNP |
| ~~SOLVEPNP_UPNP~~ | Exhaustive Linearization for Robust Camera Pose and Focal Length Estimation, PAMI’13 | unstable results; falls back to SOLVEPNP_EPNP |

Code example

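import cv2
import numpy as np


# 3D reference points in the world coordinate system.
# The origin is a point of our choosing; all four points lie on the plane z_w = 0.
world_xyzs = np.array([
    (-1.0, 1.37, 0.0),   # left-up
    (1.0, 1.37, 0.0),    # right-up
    (1.0, -1.37, 0.0),   # right-down
    (-1.0, -1.37, 0.0),  # left-down
], dtype="double")
# Pixel coordinates (u, v) of the same points in the photo, in the same order.
image_uvs = np.array([
    (380, 682),   # left-up
    (676, 678),   # right-up
    (686, 1051),  # right-down
    (406, 1077),  # left-down
], dtype="double")
# Intrinsic matrix K estimated above (f = 1048, principal point at the image center).
intrinsic_matrix = np.array([
    [1048, 0, 540],
    [0, 1048, 960],
    [0, 0, 1],
], dtype="double")
# Lens distortion is assumed to be zero.
dist_coeffs = np.zeros((4, 1), dtype="double")

# rot is a Rodrigues rotation vector and trn a translation vector;
# together they map world coordinates to camera coordinates.
success, rot, trn = cv2.solvePnP(
    world_xyzs,
    image_uvs,
    intrinsic_matrix,
    dist_coeffs,
    flags=cv2.SOLVEPNP_ITERATIVE,
)

print("Success:\n {0}".format(success))
print("Rotation Vector:\n {0}".format(rot.flatten()))
print("Translation Vector:\n {0}".format(trn.flatten()))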

"""
Success:
 True
Rotation Vector:
 [-2.93926979  0.0819878  -0.19144951]
Translation Vector:
 [-2.82854393e-03 -5.69217691e-01  7.36804737e+00]
"""

What you can do

Through calibration we can recover the 3D plane that the 4 pixels lie on, and project a 3D line back onto pixels.
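The original goal, image -> world, also becomes possible for pixels that lie on the known plane. The sketch below intersects a pixel's viewing ray with the world plane $z_w = 0$, using the rotation and translation vectors printed in the code example above. `rodrigues` and `pixel_to_world_on_plane` are hypothetical helpers written for this illustration (OpenCV's `cv2.Rodrigues` performs the same vector-to-matrix conversion), and lens distortion is again assumed to be zero.

```python
import numpy as np

def rodrigues(rvec):
    """Rotation vector (axis * angle) -> 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    Kx = np.array([[0.0, -k[2], k[1]],
                   [k[2], 0.0, -k[0]],
                   [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * (Kx @ Kx)

def pixel_to_world_on_plane(uv, K, R, t, plane_z=0.0):
    """Intersect the viewing ray of pixel `uv` with the world plane z_w = plane_z."""
    ray_cam = np.linalg.solve(K, np.array([uv[0], uv[1], 1.0]))  # K^-1 [u, v, 1]^T
    ray_world = R.T @ ray_cam        # ray direction expressed in the world frame
    cam_center = -R.T @ t            # camera position in the world frame
    lam = (plane_z - cam_center[2]) / ray_world[2]
    return cam_center + lam * ray_world

K = np.array([[1048.0, 0.0, 540.0], [0.0, 1048.0, 960.0], [0.0, 0.0, 1.0]])
rvec = np.array([-2.93926979, 0.0819878, -0.19144951])              # from the output above
tvec = np.array([-2.82854393e-03, -5.69217691e-01, 7.36804737e+00])  # from the output above
R = rodrigues(rvec)

# World position (on the plane) seen at the image center
print(pixel_to_world_on_plane((540, 960), K, R, tvec))
```

This only works because the plane supplies the missing depth; a pixel showing something off the plane would be mapped to the wrong world position.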

p.s. The result varies quite a bit depending on which algorithm is used.
- Perhaps it is too much to expect an iterative algorithm to find the global minimum from only 4 matching points.

License: Attribution, NonCommercial
