Microblog PCU Dataset: a dataset used to explore spammers in microblogs

5.8M
Computer Classification



    README.md

    Jun Liu (liukeen '@' mail.xjtu.cn), Hao Chen (lechenhao '@' gmail.com), Mengting Zhan, Jianhong Mi, Yanzhang Lv
    MOEKLINNS Lab, Department of Computer Science, Xi'an Jiaotong University, China


    Data Set Information:

    We use this dataset to explore spammers in microblogs. You can access our demo system at
    [Web link]

    Append :8080 to the domain name as the port. The repository web page fails to parse the web link when the port is included in the source (under inspection).


    Attribute Information:

    weibo_user.csv has the following attributes:
    -user_id: account ID in Sina Weibo;
    -user_name: account nickname;
    -gender: account registration gender, including male, female and other;
    -class: account level given by Sina Weibo;
    -message: account registration location or other personal information;
    -post_num: the number of posts of this account up to now;
    -follower_num: the number of followers of this account;
    -followee_num: the number of followees of this account;
    -follow ratio: followee_num/follower_num;
    -is_spammer: manually annotated label, 1 means spammer and -1 means non-spammer;
    user_post.csv has the following attributes:
    -post_id: user post ID given by Sina Weibo;
    -post_time: the time when the post was published;
    -poster_id: the user ID of the account that published the post;
    -repost_num: the number of reposts by others;
    -commnet_num: the number of comments by others;
    followe-followee.csv has the following attributes:
    -follower: the nickname of follower;
    -follower_id: the user ID of follower;
    -followee: the nickname of followee;
    -followee_id: the user ID of followee;
    post.csv is almost the same as user_post.csv, but its posts were retrieved with a keyword related to a particular topic;

    -content: the post text (mostly in Chinese; configure Microsoft Office to display Chinese text so it is readable)
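
The weibo_user.csv attributes above can be read with a few lines of Python. The following is only a minimal sketch: the sample rows are made up for illustration, the assumption that the file has a header row using exactly the attribute names listed above is not confirmed by this README, and the helper names `follow_ratio` and `load_users` are hypothetical. It recomputes the follow ratio as followee_num/follower_num (returning None when an account has no followers) and maps the is_spammer label (1 or -1) to a boolean.

```python
import csv
import io

# Made-up sample rows for illustration only. The column names follow the
# attribute list above; the real file's header row is an assumption.
SAMPLE_CSV = """user_id,user_name,gender,post_num,follower_num,followee_num,is_spammer
1001,alice,female,120,300,150,-1
1002,promo_bot,other,5000,10,2000,1
1003,newbie,male,0,0,25,-1
"""


def follow_ratio(followee_num, follower_num):
    """Return followee_num / follower_num, or None when there are no followers."""
    if follower_num == 0:
        return None
    return followee_num / follower_num


def load_users(text_stream):
    """Parse user rows into dicts with numeric fields and a boolean label."""
    users = []
    for row in csv.DictReader(text_stream):
        followers = int(row["follower_num"])
        followees = int(row["followee_num"])
        users.append({
            "user_id": row["user_id"],
            "post_num": int(row["post_num"]),
            "follow_ratio": follow_ratio(followees, followers),
            # In the label column, 1 means spammer and -1 means non-spammer.
            "is_spammer": int(row["is_spammer"]) == 1,
        })
    return users


users = load_users(io.StringIO(SAMPLE_CSV))
for user in users:
    print(user["user_id"], user["follow_ratio"], user["is_spammer"])
```

To run this against the real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for a file object opened with `open("weibo_user.csv", newline="", encoding=...)`. The correct encoding is not stated in this README, so it would need to be determined from the file itself.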


    Relevant Papers:

    N/A



    Citation Request:

    Thanks to MOEKLINNS Lab [[Web link]], especially the Spammer Detection Group, for opening its data.
