Firefly Eindhoven - Localization - Verax: Difference between revisions

From Control Systems Technology Group
Jump to navigation Jump to search
Line 10: Line 10:
===Linear algebra===
===Linear algebra===
The automatically generated OPnP algorithm (thus an implementation of the Groebner basis solver) has two main function calls that limit its performance:  
The automatically generated OPnP algorithm (thus an implementation of the Groebner basis solver) has two main function calls that limit its performance:  
* The solving of a linear system <math> Ax = b </math> with dimensions of A (
* The solving of a linear system <math> Ax = b </math> with size(A) = 575x575 and size(b) = 575x81.
*
* An eigenvalue decomposition of a matrix with dimensions (81,81) to retrieve all its eigenvectors.
Parts that are required for the Groebner basis solver: matrix inversion/solving linear system and eigenvector solver
====Solving the linear system====
In general, there are three ways to approach this problem:
# Invert the matrix A and postmultiply the inverted matrix with b to obtain x. This requires explicitely calculating the inverse of matrix A, which is time consuming.
# Decompose matrix A into lower and upper triangular matrices L and U, such that A = LU and Ax = LUx = b. Then first solve Ly = b for y (=Ux) by forward substitution, as L is lower triangular. Afterwards solve Ux = y for x by backward substitution, as U is upper triangular.
# Decompose matrix A into orthonormal and upper triangular matrices R and Q, such that A = QR and Ax = QRx = b. Then first solve Ly = b for y (=Rx) by backward substitution, as Q is upper triangular. Afterwards solve Rx = y by inverting R. Since R is orthonormal, the inverse is given by the transpose, which greatly reduces the computational costs.


===Software===
===Software===

Revision as of 11:28, 22 May 2018

Working principle

1) Hardware system: LEDs, fisheye camera, jetson, find LEDs 2) Describe general PnP problem: locate camera with n camera points

3) Introduction to the OPnP algorithm and general steps, including step to Groebner basis solver

Implementation

Since a regular implementation on either the GPU or CPU of the MATLAB code by making use of MATLAB's embedded C/GPU coder does not meet the performance requirements, the software has to be ported to a lower level language that makes use of GPU accelerated libraries to fully utilize the Jetsons hardware. Since Rogier lacked the expertise to port the code and did not have time to fully look into this, Daan and Johan were tasked with porting the MATLAB code and achieve an update rate of approximately 30 Hz.

Linear algebra

The automatically generated OPnP algorithm (thus an implementation of the Groebner basis solver) has two main function calls that limit its performance:

  • The solving of a linear system [math]\displaystyle{ Ax = b }[/math] with size(A) = 575x575 and size(b) = 575x81.
  • An eigenvalue decomposition of a matrix with dimensions (81,81) to retrieve all its eigenvectors.

Solving the linear system

In general, there are three ways to approach this problem:

  1. Invert the matrix A and postmultiply the inverted matrix with b to obtain x. This requires explicitely calculating the inverse of matrix A, which is time consuming.
  2. Decompose matrix A into lower and upper triangular matrices L and U, such that A = LU and Ax = LUx = b. Then first solve Ly = b for y (=Ux) by forward substitution, as L is lower triangular. Afterwards solve Ux = y for x by backward substitution, as U is upper triangular.
  3. Decompose matrix A into orthonormal and upper triangular matrices R and Q, such that A = QR and Ax = QRx = b. Then first solve Ly = b for y (=Rx) by backward substitution, as Q is upper triangular. Afterwards solve Rx = y by inverting R. Since R is orthonormal, the inverse is given by the transpose, which greatly reduces the computational costs.

Software

This section describes the different software approaches we tried to increase the performance of the algorithm on the Jetson in chronological order.

  1. GPU coder. Rogier requested a trial of the experimental Matlab GPU Coder to generate CUDA code from the matlab implementation. The resulting performance was terrible, most likely due to the fact that the GPU Coder is not yet very well optimized and that some parts of the code that are more efficient to run on the CPU. Unfortunately the resulting code was also very messy and we could not reuse parts of it.
  2. C coder. We used Matlab's CPU coder to generate C code instead from the matlab implementation. This outperformed the GPU implementation even though the GPU of the Jetson is significantly more powerful than the CPU, but it was still far from optimal as we were not using the GPU at all and only a single CPU core.
  3. CUSolver for solving the linear system.
  4. Paper with CUDA pseudocode
  5. Armadillo with MAGMA