Goodreads book dataset with 10M user ratings
Arts and Entertainment,Social Science,NLP,Literature,Recommender Systems Classification
1128.5M
846
Bahram Jannesar
Septuagint
Earth and Nature,Religion and Belief Systems,NLP,Text Data,Languages Classification
7.39M
381
Abbrivia
Subreddit data from wallstreetbets and others, for sentiment analysis to backtest quantitative trading algorithms
All of the submissions to each of the r/wallstreetbets, r/investing, r/options, and r/SecurityAnalysis subreddits since... NLP,Online Communities,Investing Classification
1.49G
445
Sheridan Green
Japanese-English Subtitle Corpus (JESC) [CLEANED], a large corpus of 2.8 million sentences
This dataset is a cleaned version of JESC, produced by handling misspelled English words and doing word segmentation using: English=... NLP,Business,Computer Science,Languages Classification
220.08M
459
Wahyu Setianto
Stack Overflow question classification challenge
Context: Asking questions is a part of learning. There's no shame in not knowing something and coming to others for he... NLP Classification
6.37M
1128
Nasser Boan
Zeki Müren Şarkı Sözleri | Lyrics
Music,NLP,Artificial Intelligence,LSTM Classification
0.33M
343
ferhatmetin34
IMDB sentiment classification dataset, tokenized with spaCy and stored as JSON
Context: IMDB sentiment classification dataset derived from torchtext, tokenized using spaCy and then stored as JSON... NLP,Beginner,Earth and Nature,Movies and TV Shows,Text Data,Binary Classification,spaCy Classification
104.31M
404
Manoj Patra
Longformer base model for the HuggingFace Transformers library
allenai-longformer-base-4096 Longformer-base model for HuggingFace Transformers library... NLP,Arts and Entertainment,Transfer Learning Classification
568.34M
371
Akim Tsvigun
News headlines dataset for sarcasm detection, a high-quality dataset for sarcasm and fake news detection tasks
Past studies in Sarcasm Detection mostly make use of Twitter datasets collected using hashtag-based supervision but such... NLP,Deep Learning,Classification,Earth and Nature,Computer Science,Programming Classification
11.13M
513
Rishabh Misra
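Gazeta summaries, a Russian news summarization dataset
Each line of the file is a JSON object with 5 fields: URL, title, text, summary, and date. The dataset consists of 74126 examples. So far, the first 60964 exam... NLP,Arts and Entertainment,Computer Science,Programming,News,Russia Classification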
545.11M
1542
Yallen
Comparison of sentiment libraries in Python, analyzing presidential speeches and tweets with Vader, Pattern, and Stanford CoreNLP
Context: These datasets were produced as part of a little research project I undertook for a blog post on sentiment... NLP,Computer Science,Programming,Social Science,Python,Retail and Shopping Classification
14.45M
1086
Kristof Boghe
Toxic comment detection, multilingual [extended], a supplement to the Jigsaw toxic comment classification competition
This is a compilation of all the toxic comment databases out there. I made this for ease of use during the Jigsaw Toxic... NLP,Deep Learning,Classification,Binary Classification Classification
117.55M
453
Alan Sun
Mental health FAQ for chatbots, with answers to all mental health related questions
Content: Mental health includes our emotional, psychological, and social well-being. Mental health is integral to living a... NLP,Beginner,Text Data,Mental Health Classification
0.16M
528
Narendra
Reddit India NLP dataset, containing posts from the r/India subreddit from 2017 to 2020
NLP,Classification,Multiclass Classification,India Classification
117.86M
465
Pranav Hari
Jigsaw toxic comment classification cleaned data, Jigsaw comments with sentiment, comment length, and translated text
I've been working on the Jigsaw Multilingual Toxic Comment classification competition and found that the data requir... NLP,Deep Learning,Feature Engineering,Text Data Classification
263.44M
652
Sleeba Paul
OSCAR Nepali corpus, a Nepali text corpus for training unsupervised language models for NLP
The files are from [OSCAR Corpus](https://oscar-corpus.com/). Please visit their site for more information. The dataset i... NLP,Computer Science,Movies and TV Shows,Text Data,Languages Classification
3.1G
555
Prabesh Dhakal



















