
EmojifyData dataset: 18 million English tweets, all containing emoji


2.58G
NLP,Online Communities,Text Data,Social Networks Classification



    README.md

    My friend and I were taking the IPavlov course on deep learning in NLP. For our final project we wanted to work on a new, unsolved problem, so we decided to train a model that predicts where to place emojis around a piece of text. As it happens, the biggest publicly available source of emoji-rich text is Twitter. We are releasing this work to the public so that everyone can focus on model design rather than data preparation.

    Content

    The original files for this dataset were four archives from the ArchiveTeam TwitterStream project. We reformatted these files and selected all English-language tweets containing at least one emoji. We also applied some ordinary preprocessing: removing hashtags, URLs, and mentions.
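The selection and cleaning steps described above could look roughly like the following. This is only a sketch, not the authors' actual pipeline. It makes several assumptions: the `lang` and `text` keys, the emoji character ranges, the regular expressions, and the choice to drop a hashtag entirely (rather than keep its word) are all illustrative, and the function names are hypothetical.

```python
import re

# Assumption: a rough emoji matcher covering common Unicode emoji blocks.
# It is not exhaustive and is not the pattern used to build the dataset.
EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]"
)
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")  # assumption: the whole hashtag is removed
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Remove URLs, mentions, and hashtags, then collapse whitespace."""
    # URLs go first so that '#' or '@' inside a link is not handled separately.
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def select_and_clean(tweets):
    """Keep English tweets that still contain an emoji after cleaning.

    `tweets` is an iterable of dicts with hypothetical 'lang' and 'text' keys.
    """
    result = []
    for tweet in tweets:
        if tweet.get("lang") != "en":
            continue
        cleaned = clean_tweet(tweet.get("text", ""))
        if EMOJI_RE.search(cleaned):
            result.append(cleaned)
    return result
```

For example, a tweet such as `Love this 😍 #happy @bob https://t.co/xyz` would be reduced to the text and its emoji, while tweets without an emoji or in another language would be skipped.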

