We introduce a learning-based method to reconstruct objects acquired in a casual handheld scanning setting with a depth camera. Our method is based on two core components. The first is a deep network that provides a semantic segmentation and labeling of the frames of an input RGBD sequence. The second is an alignment and reconstruction method that uses the semantic labeling to reconstruct the acquired object from the frames. We demonstrate that using a semantic labeling improves the reconstructions of the objects compared to methods that use only the depth information of the frames. Moreover, since training a deep network requires a large amount of labeled data, a key contribution of our work is an active self-learning framework that simplifies the creation of the training data. Specifically, we iteratively predict the labeling of frames with the neural network, reconstruct the object from the labeled frames, and evaluate the confidence of the labeling, so that the neural network is trained incrementally while requiring only a small amount of user-provided annotations. We show that this method enables the creation of data for training a neural network with high accuracy, while requiring little manual effort.
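The iterative predict, reconstruct, and evaluate loop described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function `active_self_learning`, its callback parameters (`predict`, `reconstruct`, `confidence`, `annotate`, `train`), the confidence `threshold`, and the per-round annotation budget. In particular, the policy of keeping confident predictions as pseudo-labels and sending the least confident frames to the user is one plausible reading of "active self-learning", not the authors' confirmed procedure.

```python
from typing import Callable, Dict, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")
Label = TypeVar("Label")


def active_self_learning(
    frames: Sequence[Frame],
    predict: Callable[[Frame], Label],            # hypothetical: network inference on one frame
    reconstruct: Callable[[Sequence[Frame], Sequence[Label]], object],  # hypothetical: labeled frames -> object
    confidence: Callable[[Frame, Label, object], float],  # hypothetical: score a labeling against the reconstruction
    annotate: Callable[[Frame], Label],           # hypothetical: ask the user for a label
    train: Callable[[Dict[int, Label]], None],    # hypothetical: (re)train the network on the current set
    threshold: float = 0.8,                       # assumed confidence cutoff
    budget_per_round: int = 2,                    # assumed number of user annotations per round
    max_rounds: int = 10,
) -> Tuple[Dict[int, Label], List[int]]:
    """Return the accumulated training set and the indices the user annotated."""
    training_set: Dict[int, Label] = {}
    user_annotated: List[int] = []

    for _ in range(max_rounds):
        # 1. Predict a labeling for every frame with the current network.
        labels = [predict(f) for f in frames]
        # 2. Reconstruct the object from the labeled frames.
        reconstruction = reconstruct(frames, labels)
        # 3. Evaluate the confidence of each frame's labeling.
        scores = [confidence(f, l, reconstruction) for f, l in zip(frames, labels)]

        uncertain: List[int] = []
        for i, score in enumerate(scores):
            if i in user_annotated:
                continue  # user-provided labels are never overwritten
            if score >= threshold:
                training_set[i] = labels[i]  # self-learning: keep the confident prediction
            else:
                uncertain.append(i)

        if not uncertain:
            break  # every frame is either confident or user-annotated

        # 4. Active learning: request labels for the least confident frames only.
        uncertain.sort(key=lambda i: scores[i])
        for i in uncertain[:budget_per_round]:
            training_set[i] = annotate(frames[i])
            user_annotated.append(i)

        # 5. Incrementally retrain on the enlarged training set.
        train(training_set)

    return training_set, user_annotated
```

In this sketch the user is consulted only for frames whose labeling scores below the threshold, so the number of manual annotations is bounded by `budget_per_round * max_rounds` regardless of the sequence length.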

ACM Transactions on Graphics
School of Computer Science

Hu, R. (Ruizhen), Wen, C. (Cheng), van Kaick, O., Chen, L. (Luanmin), Lin, D. (Di), Cohen-Or, D. (Daniel), & Huang, H. (Hui). (2018). Semantic object reconstruction via casual handheld scanning. ACM Transactions on Graphics, 37(6). doi:10.1145/3272127.3275024