
Electronic data

  • sensors-1063450

    Accepted author manuscript, 12.6 MB, Word document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

R-YOLO: A Real-Time Text Detector for Natural Scenes with Arbitrary Rotation

Research output: Contribution to journal, Journal article, peer-review

Published
  • Xiqi Wang
  • Shunyi Zheng
  • Ce Zhang
  • Rui Li
  • Li Gui
Article number: 888
Journal publication date: 28/01/2021
Journal: Sensors
Issue number: 3
Volume: 21
Number of pages: 20
Pages (from-to): 1-20
Publication status: Published
Original language: English

Abstract

Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily oriented text. Currently, the majority of text detection methods are designed to identify horizontal or nearly horizontal text, which cannot satisfy various practical requirements of real-time detection in image streams or videos. To address this gap, we propose Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model for detecting arbitrarily oriented text in natural scene images. First, a rotated anchor box with angle information is used to represent the text bounding box over different orientations. Second, features at different scales are extracted from the input image to predict the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression (RDIoU-NMS) is proposed to eliminate redundant detections and retain the most accurate results. Benchmark experiments were conducted on four popular datasets: ICDAR2015, ICDAR2013, MSRA-TD500, and HRSC2016. For example, the proposed R-YOLO method obtains an F-measure of 82.3% at 62.5 fps with 720p resolution on the ICDAR2015 dataset. The results demonstrate that R-YOLO can significantly outperform state-of-the-art methods in terms of detection efficiency and accuracy. The code will be released at: https://github.com/wxq-888/R-YOLO.
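
The abstract names rotated boxes and RDIoU-NMS but does not define them, so the following is only an illustrative sketch and not the authors' implementation. It assumes a box is given as (cx, cy, w, h, angle in radians), that the score is the rotated IoU minus a DIoU-style penalty (squared centre distance divided by the squared diagonal of the axis-aligned rectangle enclosing both boxes), and that suppression is greedy with a threshold of 0.5. All function names and the sample boxes are hypothetical.

```python
import math

def corners(box):
    """Counter-clockwise corners of a rotated box (cx, cy, w, h, angle_rad)."""
    cx, cy, w, h, a = box
    c, s = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(pts):
    """Shoelace formula; positive for counter-clockwise polygons."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex CCW `clipper`."""
    out = subject
    for (ax, ay), (bx, by) in zip(clipper, clipper[1:] + clipper[:1]):
        if not out:
            break
        inp, out = out, []

        def side(p):
            # Positive when p lies to the left of (inside) the edge a -> b.
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        for p, q in zip(inp, inp[1:] + inp[:1]):
            sp, sq = side(p), side(q)
            if sp >= 0:
                out.append(p)
            if sp * sq < 0:  # the edge p -> q crosses the clip line
                t = sp / (sp - sq)
                out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def rotated_iou(a, b):
    """Intersection over union of two rotated rectangles."""
    inter_pts = clip(corners(a), corners(b))
    inter = polygon_area(inter_pts) if len(inter_pts) >= 3 else 0.0
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def rdiou(a, b):
    """Rotated IoU minus a normalised centre-distance penalty (assumed form)."""
    pts = corners(a) + corners(b)
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    diag2 = (max(xs) - min(xs)) ** 2 + (max(ys) - min(ys)) ** 2
    dist2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return rotated_iou(a, b) - dist2 / diag2

def rdiou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep a box unless it overlaps an already kept, higher-scoring one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rdiou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep

# Hypothetical detections: a near-duplicate, a perpendicular box, and a distant box.
boxes = [(50, 50, 100, 20, 0.0), (52, 50, 100, 20, 0.0),
         (50, 50, 100, 20, math.pi / 2), (300, 300, 50, 20, 0.3)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = rdiou_nms(boxes, scores)
print(kept)
```

In this sketch the near-duplicate of the top-scoring box is suppressed, while the perpendicular box sharing the same centre survives because its rotated overlap is small. An axis-aligned IoU would not capture that difference as directly, which is the motivation for angle-aware suppression.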