Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Bridging the Domain Gap in Scene Flow Estimation via Hierarchical Smoothness Refinement
AU - Zhang, Dejun
AU - Zhang, Mian
AU - Tan, Xuefeng
AU - Liu, Jun
PY - 2024/8/31
Y1 - 2024/8/31
N2 - This article introduces SmoothFlowNet3D, an innovative encoder-decoder architecture specifically designed for bridging the domain gap in scene flow estimation. To achieve this goal, SmoothFlowNet3D divides the scene flow estimation task into two stages: initial scene flow estimation and smoothness refinement. Specifically, SmoothFlowNet3D comprises a hierarchical encoder that extracts multi-scale point cloud features from two consecutive frames, along with a hierarchical decoder responsible for predicting the initial scene flow and further refining it to achieve smoother estimation. To generate the initial scene flow, a cross-frame nearest-neighbor search operation is performed between the features extracted from two consecutive frames, resulting in forward and backward flow embeddings. These embeddings are then combined to form the bidirectional flow embedding, serving as input for predicting the initial scene flow. Additionally, a flow smoothing module based on the self-attention mechanism is proposed to predict the smoothing error and facilitate the refinement of the initial scene flow for more accurate and smoother estimation results. Extensive experiments demonstrate that the proposed SmoothFlowNet3D approach achieves state-of-the-art performance on both synthetic datasets and real LiDAR point clouds, confirming its effectiveness in enhancing scene flow smoothness.
AB - This article introduces SmoothFlowNet3D, an innovative encoder-decoder architecture specifically designed for bridging the domain gap in scene flow estimation. To achieve this goal, SmoothFlowNet3D divides the scene flow estimation task into two stages: initial scene flow estimation and smoothness refinement. Specifically, SmoothFlowNet3D comprises a hierarchical encoder that extracts multi-scale point cloud features from two consecutive frames, along with a hierarchical decoder responsible for predicting the initial scene flow and further refining it to achieve smoother estimation. To generate the initial scene flow, a cross-frame nearest-neighbor search operation is performed between the features extracted from two consecutive frames, resulting in forward and backward flow embeddings. These embeddings are then combined to form the bidirectional flow embedding, serving as input for predicting the initial scene flow. Additionally, a flow smoothing module based on the self-attention mechanism is proposed to predict the smoothing error and facilitate the refinement of the initial scene flow for more accurate and smoother estimation results. Extensive experiments demonstrate that the proposed SmoothFlowNet3D approach achieves state-of-the-art performance on both synthetic datasets and real LiDAR point clouds, confirming its effectiveness in enhancing scene flow smoothness.
U2 - 10.1145/3661823
DO - 10.1145/3661823
M3 - Journal article
VL - 20
SP - 1
EP - 21
JO - ACM Transactions on Multimedia Computing, Communications, and Applications
JF - ACM Transactions on Multimedia Computing, Communications, and Applications
SN - 1551-6857
IS - 8
M1 - 236
ER -