
Job Description

We are looking for master thesis students who are enthusiastic about applying deep learning techniques to one of the most challenging questions in the virtual reality industry: how to realize natural interactions between humans and computers in virtual environments.

Natural interaction requires human-like perception and action, which poses complex yet exciting machine learning challenges: thoroughly understanding 3D environments and objects, inferring an avatar's kinematic structure from its 3D mesh, solving inverse kinematics for arms and fingers, generating human-like motion and animation, and analyzing human interactive behavior in next-generation VR-powered training games.

Recently, machine learning techniques that exploit deep neural network architectures have made significant progress on many practical industrial problems. In the context of 3D geometric data, deep neural networks (DNNs) are an active research area with a wide range of potential applications [1], including 3D shape reconstruction [2], recognition and pose estimation [3, 4], segmentation [5-7], and deformation [8]. The goal of this thesis is to apply DNNs to the analysis of organic 3D structures, i.e., humanoid avatar mesh data, and to develop an automatic DNN-based avatar rigging system. More specifically, given an avatar mesh, we want to automatically detect and segment the hand structure, locate hand and finger joints, and estimate basic finger geometry parameters for animation generation and grasp synthesis.


The thesis work includes the following tasks:

  • Summarize the state of the art in deep learning for modeling, representing, and segmenting 3D object shapes.
  • Collect a training database of avatar and hand representations.
  • Implement and train the DNN models, preferably in C++ using the Caffe2 deep learning framework.
  • Test, optimize and evaluate the implemented process using the database.
  • Summarize and discuss the findings in a report / thesis.


[1] Introduction slides on 3D deep learning: http://ai.stanford.edu/~haosu/slides/IntroTo3DDL.pdf
[2] “DeepHuman: 3D Human Reconstruction from a Single Image”, Zerong Zheng et al., 2019
[3] “V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map”, Gyeongsik Moon et al., CVPR, 2018
[4] “Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs”, Liuhao Ge et al., CVPR, 2016
[5] “Learning Shape Abstractions by Assembling Volumetric Primitives”, Shubham Tulsiani et al., CVPR, 2017
[6] “3D Shape Segmentation with Projective Convolutional Networks”, Evangelos Kalogerakis et al., 2017
[7] “Multi-view Convolutional Neural Networks for 3D Shape Recognition”, Hang Su, Subhransu Maji, Evangelos Kalogerakis, et al., 2015
[8] “Learning Free-Form Deformations for 3D Object Reconstruction”, Dominic Jack et al., 2018

Hiring Manager
Dan Song

About the company

At Gleechi, we want to enable humans to interact naturally with the digital world, and robots to interact like humans. We are a Stockholm-based startup with roots in robotics research and the first in the world to enable artificial hand movement and interaction in real time. We collaborate with world-leading game and VR developers, make it possible for stroke patients to interact with virtual worlds, and enable robots to collaborate with people in hospitals. Our small team in central Stockholm combines award-winning entrepreneurs, top-ranked robotics researchers and experienced developers. The company was founded at the end of 2014 and has since been named Super Startup of the Year by Veckans Affärer and won the European startup competition EIT Digital Idea Challenge, among other distinctions. We have a ridiculously exciting time ahead and we'd love to get more awesome people onboard!
