Author: Anirudha Kolpyakwar
Date of Publication: 24th October 2019
Abstract: Reading text from images is a difficult problem that has received a significant amount of attention. The key components of most systems are (i) text detection in images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. In this paper, an efficient algorithm is presented that can automatically detect, localize, and extract horizontally aligned text in images (and digital videos) with complex backgrounds. The proposed approach is based on the application of a color reduction technique, an edge detection method, and the localization of text regions using projection profile analyses and geometrical properties. The output of the algorithm is a set of text boxes with a simplified background, ready to be fed into an OCR engine for subsequent character recognition. Our proposal is robust with respect to different font sizes, font colors, languages, and background complexities. The performance of the approach is demonstrated by presenting promising experimental results for a set of images taken from different types of video sequences.
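The edge detection and projection profile steps mentioned in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function name, the simple horizontal-gradient edge detector, and both thresholds are assumptions introduced for demonstration.

```python
import numpy as np

def detect_text_rows(gray, edge_thresh=50, row_thresh=5):
    """Locate horizontal bands of candidate text in a grayscale image
    using a horizontal projection profile over an edge map.
    Thresholds are illustrative, not from the paper."""
    # Stand-in edge detector: absolute horizontal intensity gradient.
    edges = np.abs(np.diff(gray.astype(int), axis=1)) > edge_thresh
    # Horizontal projection profile: count edge pixels in each row.
    profile = edges.sum(axis=1)
    # Rows with high edge density are candidate text rows; group
    # consecutive such rows into (top, bottom) bands.
    rows = profile > row_thresh
    bands, start = [], None
    for y, on in enumerate(rows):
        if on and start is None:
            start = y
        elif not on and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(rows)))
    return bands
```

In a full pipeline, each detected band would then be segmented vertically with an analogous column-wise profile and filtered by geometrical properties (aspect ratio, minimum height) before being passed to an OCR engine.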