
DBWorld E-mail Classification Dataset

Scene:

Computer

Data Type:

Classification


    Data Set Information:

    I collected 64 e-mails from the DBWorld newsletter and used them to train different algorithms to classify each e-mail as either 'announcements of conferences' or 'everything else'. I used a binary bag-of-words representation, preceded by a stopword-removal pre-processing step.
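
The binary bag-of-words representation described above can be illustrated with a minimal sketch. It is not the author's actual pipeline: the stopword list, the tokenizer, the function names, and the two sample e-mails are all assumptions made for illustration, and no stemming is applied.

```python
import re

# Hypothetical, tiny stopword list for illustration only; the source does not
# say which stopword list was actually used.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "on", "is"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_vocabulary(documents):
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({tok for doc in documents for tok in tokenize(doc)})


def binary_bow(text, vocabulary):
    """Return a 0/1 vector: 1 if the vocabulary word occurs in the text, else 0."""
    present = set(tokenize(text))
    return [1 if word in present else 0 for word in vocabulary]


# Made-up sample e-mails (not taken from the data set).
emails = [
    "Call for papers: the conference on databases",
    "Job opening for a database engineer",
]
vocab = build_vocabulary(emails)
vectors = [binary_bow(e, vocab) for e in emails]
```

Each position in a vector corresponds to one vocabulary word, and a word that occurs several times in an e-mail still yields 1, which is what distinguishes a binary representation from a count-based one.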


    Attribute Information:

    Each attribute corresponds to a specific word or stem in the data set's vocabulary (I used a bag-of-words representation).


    Relevant Papers:

    Michele Filannino, 'DBWorld e-mail classification using a very small corpus', Project of Machine Learning course, University of Manchester, 2011. [Web link]


    Citation Request:

    Thanks to ACM-SIGMOD for its useful service! :)


    Michele Filannino, PhD
    University of Manchester
    Centre for Doctoral Training
    Email: filannim_AT_cs.man.ac.uk
