MAGICDATA Mandarin Chinese Read Speech Corpus (Training Set)

Size: 52G

    README.md

    MAGICDATA Mandarin Chinese Read Speech Corpus was developed by MAGIC DATA Technology Co., Ltd. and freely published for non-commercial use.

    The contents and the corresponding descriptions of the corpus include:


    • The corpus contains 755 hours of speech data, most of it recorded on mobile devices.

    • 1,080 speakers from different accent areas in China participated in the recording.

    • The sentence transcription accuracy is higher than 98%.

    • Recordings were conducted in a quiet indoor environment.

    • The database is divided into a training set, a validation set, and a testing set in a ratio of 51:1:2.

    • Detailed information such as speech data coding and speaker information is preserved in the metadata file.

    • The recording texts cover diverse domains, including interactive Q&A, music search, SNS messages, home command and control, etc.

    • Segmented transcripts are also provided.
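
As a rough illustration of what the 51:1:2 split means in practice, the sketch below divides the stated 755 hours proportionally. It assumes the ratio applies to recording duration; the README does not say whether the split was made by duration, by speaker, or by utterance count, so the resulting figures are estimates only. The helper name `split_by_ratio` is hypothetical and not part of any corpus tooling.

```python
from fractions import Fraction


def split_by_ratio(total, ratio):
    """Divide `total` proportionally to the integer parts in `ratio`."""
    parts = sum(ratio)
    # Fraction keeps the arithmetic exact until the final float conversion.
    return [float(Fraction(total) * r / parts) for r in ratio]


TOTAL_HOURS = 755      # total speech duration stated in the README
RATIO = (51, 1, 2)     # training : validation : testing, as stated in the README

# Assumption: the ratio is applied to hours of audio.
train_h, dev_h, test_h = split_by_ratio(TOTAL_HOURS, RATIO)

print(f"training   ~{train_h:.1f} h")   # ~713.1 h
print(f"validation ~{dev_h:.1f} h")     # ~14.0 h
print(f"testing    ~{test_h:.1f} h")    # ~28.0 h
```

Under that assumption, the training set listed on this page would account for roughly 713 of the 755 hours.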

    The corpus aims to support researchers in speech recognition, machine translation, speaker recognition, and other speech-related fields. To that end, the corpus is free for academic use.

    The corpus is a subset of a much larger dataset (the 10,566.9-hour Chinese Mandarin Speech Corpus), which was recorded in the same environment. Please feel free to contact us via business@magicdatatech.com for more details.

    Citation

    Please cite the corpus as "Magic Data Technology Co., Ltd., "http://www.imagicdatatech.com/index.php/home/dataopensource/data_info/id/101", 05/2019".

    About Us

    Magic Data Technology Co., Ltd. (referred to as Magic Data) was established in 2016. Through our high-expertise, high-precision data services, Magic Data has quickly grown into one of the foremost companies in the artificial intelligence industry. We strive to provide the most efficient and highest-quality one-stop data services for customers in the fields of speech recognition, intelligent imaging, and Natural Language Understanding (NLU). Our services include data scheme design, data collection, data annotation/transcription, etc.

    Contact


    • Tel: (+86) 10-82527250

    • Email: business@magicdatatech.com

    • Website: http://www.imagicdatatech.com


    External URL: http://www.imagicdatatech.com/index.php/home/dataopensource/data_info/id/101 (full description on the company website)

