Understanding Probabilistic Inverse Graphics through Optical Digit Recognition with 2D Gaussians

Ethan Chun


Abstract

Probabilistic inverse graphics has the potential to revolutionize visual inference in the face of uncertainty. However, the prerequisite knowledge needed to develop state-of-the-art systems is immense. In this work, we simplify the issue and target optical digit recognition, one of the simplest inverse graphics problems known. While the task itself has largely been solved, we aim to provide an intuitive gateway into the world of probabilistic inverse graphics to ease the development of more complex systems. To this end, we present a system that recognizes and reconstructs the ten basic digits by modeling each as a collection of 2D Gaussians. Running inference on noisy point-set images, we demonstrate a success rate of 0.95 over 50 trials per digit. Furthermore, we present two of our past attempts at this problem and provide intuitive reasoning about their strengths and pitfalls.
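
To make the idea of "digits as collections of 2D Gaussians" concrete, here is a minimal sketch, not the implementation from the paper. It assumes each digit is an equal-weight mixture of isotropic 2D Gaussians placed along the digit's strokes, and it classifies a noisy point set by plain maximum likelihood, which stands in for whatever inference procedure the paper actually uses. The templates (only 0, 1, and 7), the names `TEMPLATES`, `segment`, `log_likelihood`, `classify`, and `sample_digit`, and the value of `sigma` are all hypothetical choices made for illustration.

```python
import numpy as np


def segment(p, q, n=5):
    """Return n evenly spaced Gaussian centres along the segment p -> q."""
    return np.linspace(p, q, n)


# Hypothetical stroke templates on a unit canvas (assumption, not from the paper).
_t = np.linspace(0.0, 2.0 * np.pi, 10, endpoint=False)
TEMPLATES = {
    0: np.stack([0.5 + 0.3 * np.cos(_t), 0.5 + 0.5 * np.sin(_t)], axis=1),
    1: segment((0.5, 0.0), (0.5, 1.0)),
    7: np.vstack([segment((0.2, 1.0), (0.8, 1.0)),
                  segment((0.8, 1.0), (0.4, 0.0))]),
}


def log_likelihood(points, centres, sigma):
    """Log-likelihood of points under an equal-weight isotropic Gaussian mixture."""
    # Squared distance from every point to every centre, shape (N, K).
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    log_comp = (-d2 / (2.0 * sigma ** 2)
                - np.log(2.0 * np.pi * sigma ** 2)
                - np.log(len(centres)))
    # Numerically stable log-sum-exp over the mixture components.
    m = log_comp.max(axis=1, keepdims=True)
    per_point = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return per_point.sum()


def classify(points, sigma=0.08):
    """Return the template digit whose mixture best explains the point set."""
    return max(TEMPLATES, key=lambda d: log_likelihood(points, TEMPLATES[d], sigma))


def sample_digit(digit, n_points, noise, rng):
    """Draw a noisy point set by jittering randomly chosen template centres."""
    centres = TEMPLATES[digit]
    idx = rng.integers(len(centres), size=n_points)
    return centres[idx] + rng.normal(scale=noise, size=(n_points, 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for digit in TEMPLATES:
        pts = sample_digit(digit, n_points=60, noise=0.05, rng=rng)
        print(digit, "->", classify(pts))
```

The same generative model runs in both directions here: `sample_digit` renders a point set from a digit, and `classify` inverts that rendering by asking which digit makes the observed points most probable. A fuller system would also infer pose, scale, and per-Gaussian parameters rather than comparing against fixed templates.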

Paper

Please see our paper for additional details on the project. (pdf link here)


Webpage created by Ethan Chun and ChatGPT :)