HDN
Homography Decomposition Networks for Planar Object Tracking
Xinrui Zhan
Zhejiang University
Yueran Liu
Zhejiang University
Jianke Zhu
Zhejiang University
Yang Li
East China Normal University
[ Paper ]
[ Codes ]
[ MindSpore ]

Examples

Video mosaic
Video replace
Image replace
Face editing

Abstract

Planar object tracking plays an important role in AI applications such as robotics, visual servoing, and visual SLAM. Although previous planar trackers work well in most scenarios, tracking remains challenging under rapid motion and large transformations between two consecutive frames. The essential reason behind this problem is that the condition number of such a non-linear system changes unstably as the search range of the homography parameter space grows. To this end, we propose a novel Homography Decomposition Networks (HDN) approach that drastically reduces and stabilizes the condition number by decomposing the homography transformation into two groups. Specifically, a similarity transformation estimator is designed to predict the first group robustly with a deep convolutional equivariant network. Taking advantage of the high-confidence scale and rotation estimates, a simple regression model then estimates the residual transformation. Furthermore, the proposed end-to-end network is trained in a semi-supervised fashion. Extensive experiments show that our approach outperforms state-of-the-art planar tracking methods by a large margin on the challenging POT, UCSB, and POIC datasets.


Homography Decomposition Networks

In this paper, we propose a novel Homography Decomposition Networks approach to planar object tracking in video sequences. It decomposes the homography transformation into two groups: a similarity group and a residual group. Estimating the similarity group first substantially reduces the condition number of the entire system. Inspired by warped convolution, we employ a rotation-scale invariant convolution operator to predict the similarity transformation robustly. The second stage then predicts the residual transformation through semi-supervised regression, where the residual transformation consists of the residual group plus the error left over from the first stage.
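The two-group idea can be illustrated with plain matrices. The sketch below is not the HDN networks and is not taken from the released code; it assumes one particular convention, H = R · S, where S is the similarity (scale, rotation, translation) closest in the least-squares sense to the affine part of H, and R is whatever remains. The function names `similarity_matrix` and `decompose_homography` are hypothetical, and the sketch assumes H[2, 2] is non-zero and the fitted scale is non-zero.

```python
import numpy as np


def similarity_matrix(scale, theta, tx, ty):
    """Build a 3x3 similarity transform: uniform scale, rotation, translation."""
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])


def decompose_homography(H):
    """Split H into (S, R) with R @ S == H / H[2, 2].

    Illustrative convention only (an assumption, not the paper's definition):
    S is fitted to the upper-left 2x2 block of H in the least-squares sense
    and reuses the translation column of H; R is the residual transform.
    """
    H = np.asarray(H, dtype=float)
    H = H / H[2, 2]                      # assumes H[2, 2] != 0
    A = H[:2, :2]
    # Closest matrix of the form [[a, -b], [b, a]] to A (Frobenius norm).
    a = 0.5 * (A[0, 0] + A[1, 1])
    b = 0.5 * (A[1, 0] - A[0, 1])
    scale = np.hypot(a, b)               # assumes scale != 0
    theta = np.arctan2(b, a)
    S = similarity_matrix(scale, theta, H[0, 2], H[1, 2])
    R = H @ np.linalg.inv(S)             # residual group
    return S, R


# A homography made of a known similarity plus a small projective distortion.
S_true = similarity_matrix(1.5, np.pi / 6, 10.0, -5.0)
P = np.array([[1.0, 0.02, 0.0],
              [0.01, 1.0, 0.0],
              [1e-4, 2e-4, 1.0]])
H_example = P @ S_true

S_est, R_est = decompose_homography(H_example)
print("similarity part:\n", S_est)
print("residual part:\n", R_est)
```

When H is already a pure similarity, the residual comes out as the identity, which matches the intuition that the second stage only has to correct what the first stage could not explain.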

Evaluation on POT

You can download it here in case the URL is blocked. We present tracking results on several sequences in POT. To make the tracking performance easy to observe, we replace the object with another image in a promotional video.


BibTeX

@misc{zhan2021homography,
title={Homography Decomposition Networks for Planar Object Tracking},
author={Xinrui Zhan and Yueran Liu and Jianke Zhu and Yang Li},
year={2021},
eprint={2112.07909},
archivePrefix={arXiv},
primaryClass={cs.CV}
}



Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants 61831015 and 62102152, and sponsored by the CAAI-Huawei MindSpore Open Fund. Most of the demo videos are from Pexels. The image editing assets come from Gangealing. This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.