Precise Facial Landmark Detection by Reference Heatmap Transformer

Computing and Communications

Text available via DOI:

https://doi.org/10.1109/TIP.2023.3261749
Final published version

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Precise Facial Landmark Detection by Reference Heatmap Transformer. / Wan, Jun; Liu, Jun; Zhou, Jie et al.
In: IEEE Transactions on Image Processing, Vol. 32, 31.12.2023, p. 1966-1977.

Harvard

Wan, J, Liu, J, Zhou, J, Lai, Z, Shen, L, Sun, H, Xiong, P & Min, W 2023, 'Precise Facial Landmark Detection by Reference Heatmap Transformer', IEEE Transactions on Image Processing, vol. 32, pp. 1966-1977. https://doi.org/10.1109/TIP.2023.3261749

APA

Wan, J., Liu, J., Zhou, J., Lai, Z., Shen, L., Sun, H., Xiong, P., & Min, W. (2023). Precise Facial Landmark Detection by Reference Heatmap Transformer. IEEE Transactions on Image Processing, 32, 1966-1977. https://doi.org/10.1109/TIP.2023.3261749

Vancouver

Wan J, Liu J, Zhou J, Lai Z, Shen L, Sun H et al. Precise Facial Landmark Detection by Reference Heatmap Transformer. IEEE Transactions on Image Processing. 2023 Dec 31;32:1966-1977. Epub 2023 Mar 29. doi: 10.1109/TIP.2023.3261749

Author

Wan, Jun ; Liu, Jun ; Zhou, Jie et al. / Precise Facial Landmark Detection by Reference Heatmap Transformer. In: IEEE Transactions on Image Processing. 2023 ; Vol. 32. pp. 1966-1977.

Bibtex

@article{82830fdc586f4b068b0930777d9a1d50,

title = "Precise Facial Landmark Detection by Reference Heatmap Transformer",

abstract = "Most facial landmark detection methods predict landmarks by mapping the input facial appearance features to landmark heatmaps and have achieved promising results. However, when the face image is suffering from large poses, heavy occlusions and complicated illuminations, they cannot learn discriminative feature representations and effective facial shape constraints, nor can they accurately predict the value of each element in the landmark heatmap, limiting their detection accuracy. To address this problem, we propose a novel Reference Heatmap Transformer (RHT) by introducing reference heatmap information for more precise facial landmark detection. The proposed RHT consists of a Soft Transformation Module (STM) and a Hard Transformation Module (HTM), which can cooperate with each other to encourage the accurate transformation of the reference heatmap information and facial shape constraints. Then, a Multi-Scale Feature Fusion Module (MSFFM) is proposed to fuse the transformed heatmap features and the semantic features learned from the original face images to enhance feature representations for producing more accurate target heatmaps. To the best of our knowledge, this is the first study to explore how to enhance facial landmark detection by transforming the reference heatmap information. The experimental results from challenging benchmark datasets demonstrate that our proposed method outperforms the state-of-the-art methods in the literature.",

author = "Jun Wan and Jun Liu and Jie Zhou and Zhihui Lai and Linlin Shen and Hang Sun and Ping Xiong and Wenwen Min",

year = "2023",

month = dec,

day = "31",

doi = "10.1109/TIP.2023.3261749",

language = "English",

volume = "32",

pages = "1966--1977",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

RIS

TY - JOUR

T1 - Precise Facial Landmark Detection by Reference Heatmap Transformer

AU - Wan, Jun

AU - Liu, Jun

AU - Zhou, Jie

AU - Lai, Zhihui

AU - Shen, Linlin

AU - Sun, Hang

AU - Xiong, Ping

AU - Min, Wenwen

PY - 2023/12/31

Y1 - 2023/12/31

N2 - Most facial landmark detection methods predict landmarks by mapping the input facial appearance features to landmark heatmaps and have achieved promising results. However, when the face image is suffering from large poses, heavy occlusions and complicated illuminations, they cannot learn discriminative feature representations and effective facial shape constraints, nor can they accurately predict the value of each element in the landmark heatmap, limiting their detection accuracy. To address this problem, we propose a novel Reference Heatmap Transformer (RHT) by introducing reference heatmap information for more precise facial landmark detection. The proposed RHT consists of a Soft Transformation Module (STM) and a Hard Transformation Module (HTM), which can cooperate with each other to encourage the accurate transformation of the reference heatmap information and facial shape constraints. Then, a Multi-Scale Feature Fusion Module (MSFFM) is proposed to fuse the transformed heatmap features and the semantic features learned from the original face images to enhance feature representations for producing more accurate target heatmaps. To the best of our knowledge, this is the first study to explore how to enhance facial landmark detection by transforming the reference heatmap information. The experimental results from challenging benchmark datasets demonstrate that our proposed method outperforms the state-of-the-art methods in the literature.

U2 - 10.1109/TIP.2023.3261749

DO - 10.1109/TIP.2023.3261749

M3 - Journal article

VL - 32

SP - 1966

EP - 1977

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

ER -
