
NEOCR: Natural Environment OCR Dataset, Containing 659 Real-World Images

1.31G
NLP, Arts and Entertainment; 2D Box, Classification


    README.md

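    The NEOCR dataset contains 659 real-world images with 5238 annotated bounding boxes (text fields). The images were captured by multiple people independently of the dataset, so the dataset covers a broad range of characteristics that distinguish real-world images from scanned documents. All human-recognizable text has been annotated in all images. The dataset creation process was stopped once the dataset contained at least 100 text fields for every metadata dimension.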

     

    Example images from the NEOCR dataset. Note that the dataset also includes images with text in different languages, text with vertical character arrangement, light text on dark and dark text on light backgrounds, occlusion, and good and bad contrast.


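    The ground truth contains not only the visible text but also distortion quadrangles, which enclose the visible text more precisely than bounding boxes. The dataset includes rich metadata: brightness, contrast, inversion, texture, resolution, noise, blur, distortion, rotation, character arrangement, occlusion, typeface and language information. The annotations are provided in an XML format based on the LabelMe schema.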

    Metadata and Ground Truth Data

    The annotation was created manually with an adapted version of the LabelMe annotation tool. All text visible and recognizable by humans has been annotated in all images. The annotation is provided in XML; the LabelMe schema was extended to our needs, and the extended XML schema is also provided as part of the dataset. Metadata is provided globally (per image) and locally (per text field).

     

    Example of different text characteristics present in images of the NEOCR dataset, along with ground truth bounding boxes and distortion quadrangles.

    Global image metadata includes the filename, folder, source information, image width, height, depth, brightness and contrast. Text field (local, bounding box) metadata contains the visible text and its optical, geometrical and typographical characteristics. Bounding boxes are rectangular and parallel to the axes. Additionally, distortion quadrangles are provided, which enclose the visible text more precisely.
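
    The relationship between a distortion quadrangle and its axis-parallel bounding box can be sketched in a few lines of Python. This is only an illustration: the XML fragment below is invented, its tag names follow the general LabelMe layout (`annotation/object/polygon/pt`), and the `<text>` element is an assumed stand-in for the NEOCR extension. The schema file shipped with the dataset is the authority on the real element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation fragment (NOT copied from the dataset). The <text>
# element and the file name are assumptions made for this illustration.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <object>
    <text>EXIT</text>
    <polygon>
      <pt><x>120</x><y>80</y></pt>
      <pt><x>260</x><y>95</y></pt>
      <pt><x>255</x><y>150</y></pt>
      <pt><x>118</x><y>140</y></pt>
    </polygon>
  </object>
</annotation>
"""


def quadrangle_points(obj):
    """Return the corner points of one text field as (x, y) tuples."""
    return [(int(pt.findtext("x")), int(pt.findtext("y")))
            for pt in obj.iterfind("polygon/pt")]


def axis_parallel_box(points):
    """Smallest axis-parallel rectangle (xmin, ymin, xmax, ymax) around points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def load_text_fields(xml_string):
    """Parse every <object> into a dict with text, quadrangle and box."""
    root = ET.fromstring(xml_string.strip())
    fields = []
    for obj in root.iterfind("object"):
        quad = quadrangle_points(obj)
        fields.append({
            "text": obj.findtext("text"),
            "quadrangle": quad,
            "box": axis_parallel_box(quad),
        })
    return fields


fields = load_text_fields(SAMPLE_XML)
field = fields[0]
xmin, ymin, xmax, ymax = field["box"]
box_area = (xmax - xmin) * (ymax - ymin)
quad_area = polygon_area(field["quadrangle"])
print(field["text"], field["box"], quad_area, box_area)
```

    For this invented text field the quadrangle covers a smaller area than the rectangle around it, which is the sense in which a quadrangle encloses the text more precisely.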

     

    The LabelMe interface used for ground truthing.

    Optical characteristics include texture, brightness, contrast, inversion, resolution, noise and blur information. Texture, noise and inversion were annotated manually; the remaining characteristics were computed automatically using ImageMagick. Geometrical characteristics cover distortion, rotation, character arrangement and occlusion information. Typographical characteristics contain typeface and language metadata. Please see the CBDAR paper [1], the technical report [2] or the metadata documentation for further details on the metadata.
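
    To give a feel for automatically computed optical characteristics, the sketch below derives two simple statistics from a grayscale patch. It assumes that brightness is taken as the mean intensity and contrast as the standard deviation of intensity. That is a common convention, and mean and standard deviation are among the channel statistics ImageMagick reports, but it is an assumption here; the definitions actually used for NEOCR are given in the metadata documentation. The function name and the sample patches are hypothetical.

```python
from statistics import fmean, pstdev


def brightness_and_contrast(gray_rows):
    """Return (mean, population standard deviation) of grayscale values.

    gray_rows is a list of rows, each a list of intensities in 0..255.
    Treating the mean as brightness and the standard deviation as contrast
    is an assumption for this sketch, not the dataset's stated definition.
    """
    pixels = [value for row in gray_rows for value in row]
    return fmean(pixels), pstdev(pixels)


# Hypothetical 2x2 patches: one flat gray, one with black and white pixels.
flat_patch = [[100, 100], [100, 100]]
checker_patch = [[0, 255], [255, 0]]

flat_stats = brightness_and_contrast(flat_patch)
checker_stats = brightness_and_contrast(checker_patch)
print(flat_stats, checker_stats)
```

    A flat patch has zero spread, while the black-and-white patch has the same spread as its mean, so the two statistics separate "how light" a text field is from "how strongly" its pixels differ.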


    References

    1. R. Nagy, A. Dicker, and K. Meyer-Wegener, "NEOCR: A Configurable Dataset for Natural Image Text Recognition". In CBDAR Workshop 2011 at ICDAR 2011, pp. 53-58, September 2011.

    2. R. Nagy, A. Dicker, and K. Meyer-Wegener, "Definition and evaluation of the NEOCR Dataset for Natural-Image Text Recognition". University of Erlangen, Dept. of Computer Science, Technical Reports, CS-2011-07, September 2011.


    Disclaimer

    By downloading and using the dataset you agree to acknowledge its source and cite the above papers in related publications. Please link to the authors' web page for the dataset at http://www6.cs.fau.de/neocr.

    Contact Author

    Robert Nagy
    University of Erlangen-Nuremberg
    Chair for Computer Science 6 (Data Management)
    Martensstr. 3
    D-91058 Erlangen
    Germany
    Email: robert[dot]nagy [at] cs[dot]fau[dot]de

