
Electronic data

  • sensors-1063450

    Accepted author manuscript, 12.6 MB, Word document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Links

Text available via DOI: https://doi.org/10.3390/s21030888


R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation. / Wang, Xiqi; Zheng, Shunyi; Zhang, Ce et al.
In: Sensors, Vol. 21, No. 3, 888, 28.01.2021, p. 1-20.

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Harvard

Wang, X, Zheng, S, Zhang, C, Li, R & Gui, L 2021, 'R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation', Sensors, vol. 21, no. 3, 888, pp. 1-20. https://doi.org/10.3390/s21030888

APA

Wang, X., Zheng, S., Zhang, C., Li, R., & Gui, L. (2021). R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation. Sensors, 21(3), 1-20. Article 888. https://doi.org/10.3390/s21030888

Vancouver

Wang X, Zheng S, Zhang C, Li R, Gui L. R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation. Sensors. 2021 Jan 28;21(3):1-20. 888. doi: 10.3390/s21030888

Author

Wang, Xiqi ; Zheng, Shunyi ; Zhang, Ce et al. / R-YOLO : A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation. In: Sensors. 2021 ; Vol. 21, No. 3. pp. 1-20.

Bibtex

@article{503d76d15d8544acb20a8cf4574cb7e7,
title = "R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation",
abstract = "Accurate and efficient text detection in the natural scene is a fundamental yet challenging task in computer vision, especially when dealing with arbitrary-oriented texts. Currently, the majority of text detection methods are designed to identify the horizontal or approximate horizontal text, which cannot satisfy various practical requirements in real-time detection such as image streams or videos. To address this gap, we proposed a novel method of Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrary-oriented texts in natural image scenes. First, the rotated anchor box with angle information was exploited to represent the text bounding box over different orientations. Second, features of different scales were extracted from the input image to achieve the probability, confidence, and inclined bounding boxes of the text. Finally, the Rotational Distance Intersection over Union Non-Maximum Suppression (RDIoU-NMS) is proposed to eliminate the redundancy and acquire the detection results with the highest accuracy. Experiments on benchmark comparison were conducted using four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and HRSC2016. For example, the proposed R-YOLO method obtains an F-measure of 82.3% at 62.5fps with 720p resolution on the ICDAR2015 dataset. The results demonstrate that the proposed R-YOLO method can outperform the state-of-the-art methods significantly in terms of detection efficiency and accuracy. The code will be released at: https://github.com/wxq-888/R-YOLO.",
keywords = "scene text detection, arbitrary-oriented text, rotation anchor, convolutional neural network, YOLOv4",
author = "Xiqi Wang and Shunyi Zheng and Ce Zhang and Rui Li and Li Gui",
year = "2021",
month = jan,
day = "28",
doi = "10.3390/s21030888",
language = "English",
volume = "21",
pages = "1--20",
journal = "Sensors",
issn = "1424-8220",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "3",

}

RIS

TY - JOUR

T1 - R-YOLO

T2 - A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation

AU - Wang, Xiqi

AU - Zheng, Shunyi

AU - Zhang, Ce

AU - Li, Rui

AU - Gui, Li

PY - 2021/1/28

Y1 - 2021/1/28

N2 - Accurate and efficient text detection in the natural scene is a fundamental yet challenging task in computer vision, especially when dealing with arbitrary-oriented texts. Currently, the majority of text detection methods are designed to identify the horizontal or approximate horizontal text, which cannot satisfy various practical requirements in real-time detection such as image streams or videos. To address this gap, we proposed a novel method of Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrary-oriented texts in natural image scenes. First, the rotated anchor box with angle information was exploited to represent the text bounding box over different orientations. Second, features of different scales were extracted from the input image to achieve the probability, confidence, and inclined bounding boxes of the text. Finally, the Rotational Distance Intersection over Union Non-Maximum Suppression (RDIoU-NMS) is proposed to eliminate the redundancy and acquire the detection results with the highest accuracy. Experiments on benchmark comparison were conducted using four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and HRSC2016. For example, the proposed R-YOLO method obtains an F-measure of 82.3% at 62.5fps with 720p resolution on the ICDAR2015 dataset. The results demonstrate that the proposed R-YOLO method can outperform the state-of-the-art methods significantly in terms of detection efficiency and accuracy. The code will be released at: https://github.com/wxq-888/R-YOLO.

AB - Accurate and efficient text detection in the natural scene is a fundamental yet challenging task in computer vision, especially when dealing with arbitrary-oriented texts. Currently, the majority of text detection methods are designed to identify the horizontal or approximate horizontal text, which cannot satisfy various practical requirements in real-time detection such as image streams or videos. To address this gap, we proposed a novel method of Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model to detect arbitrary-oriented texts in natural image scenes. First, the rotated anchor box with angle information was exploited to represent the text bounding box over different orientations. Second, features of different scales were extracted from the input image to achieve the probability, confidence, and inclined bounding boxes of the text. Finally, the Rotational Distance Intersection over Union Non-Maximum Suppression (RDIoU-NMS) is proposed to eliminate the redundancy and acquire the detection results with the highest accuracy. Experiments on benchmark comparison were conducted using four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and HRSC2016. For example, the proposed R-YOLO method obtains an F-measure of 82.3% at 62.5fps with 720p resolution on the ICDAR2015 dataset. The results demonstrate that the proposed R-YOLO method can outperform the state-of-the-art methods significantly in terms of detection efficiency and accuracy. The code will be released at: https://github.com/wxq-888/R-YOLO.

KW - scene text detection

KW - arbitrary-oriented text

KW - rotation anchor

KW - convolutional neural network

KW - YOLOv4

U2 - 10.3390/s21030888

DO - 10.3390/s21030888

M3 - Journal article

VL - 21

SP - 1

EP - 20

JO - Sensors

JF - Sensors

SN - 1424-8220

IS - 3

M1 - 888

ER -
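
Illustrative code sketch

The abstract above describes two ingredients worth illustrating: text boxes represented as rotated anchors carrying an angle, and a Rotational Distance Intersection over Union Non-Maximum Suppression (RDIoU-NMS) step that removes redundant detections. The following is a minimal Python sketch of that post-processing idea only, not the authors' released implementation (the abstract points to https://github.com/wxq-888/R-YOLO for the official code). It assumes boxes given as (cx, cy, w, h, angle, score), uses shapely to compute rotated-polygon overlap, and models the distance term as a DIoU-style center-distance penalty; the exact RDIoU formulation is defined in the paper, and all names here are illustrative.

# Minimal sketch (not the authors' code): rotated boxes parameterized as
# (cx, cy, w, h, angle_deg, score), greedy NMS driven by a rotated IoU with
# an assumed DIoU-style center-distance penalty. shapely is used only to
# compute the overlap of the rotated rectangles.
import math
from shapely.geometry import Polygon

def rotated_corners(cx, cy, w, h, angle_deg):
    # Four corners of a w-by-h box rotated by angle_deg about its center (cx, cy).
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * cos_a - y * sin_a, cy + x * sin_a + y * cos_a) for x, y in local]

def rdiou(box_a, box_b):
    # Rotated IoU minus a normalized squared center distance (assumed DIoU-style penalty).
    pa = Polygon(rotated_corners(*box_a[:5]))
    pb = Polygon(rotated_corners(*box_b[:5]))
    union_geom = pa.union(pb)
    union_area = union_geom.area
    iou = pa.intersection(pb).area / union_area if union_area > 0 else 0.0
    d2 = (box_a[0] - box_b[0]) ** 2 + (box_a[1] - box_b[1]) ** 2
    minx, miny, maxx, maxy = union_geom.bounds
    c2 = (maxx - minx) ** 2 + (maxy - miny) ** 2   # diagonal of the enclosing box
    return iou - d2 / c2 if c2 > 0 else iou

def rdiou_nms(detections, threshold=0.5):
    # Keep the highest-confidence box, drop boxes whose RDIoU with it exceeds
    # the threshold, and repeat until no detections remain.
    remaining = sorted(detections, key=lambda b: b[5], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [b for b in remaining if rdiou(best, b) <= threshold]
    return kept

if __name__ == "__main__":
    dets = [
        (100, 100, 80, 30, 15, 0.95),
        (102, 101, 82, 28, 14, 0.90),   # near-duplicate of the first box
        (300, 200, 60, 25, -40, 0.80),
    ]
    print(rdiou_nms(dets))   # the 0.90 near-duplicate should be suppressed

Greedy suppression is used here only for clarity; any rotated-IoU routine (for example one from a detection toolbox) could replace the shapely-based overlap computation without changing the overall scheme.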