Public Datasets

IMDB sentiment classification dataset, tokenized with spaCy and stored as JSON. Context: IMDB sentiment classification dataset derived from torchtext, tokenized using spaCy and then stored as JSON... NLP, Beginner, Earth and Nature, Movies and TV Shows, Text Data, Binary Classification, spaCy. Classification
104.31M 483
Zeki Müren Şarkı Sözleri | Lyrics. Music, NLP, Artificial Intelligence, LSTM. Classification
0.33M 357
CORD-19 QA. Coronavirus, NLP. Classification
23.62M 447
Complement cases of Finnish verbs. NLP. Classification
1.56M 400
Stack Overflow question classification challenge. Context: Asking questions is a part of learning. There's no shame in not knowing something and coming to others for he... NLP. Classification
6.37M 1246
Large Movie Review Dataset. Original: https://ai.stanford.edu/~amaas/data/sentiment/ This is a convenient presentation of the dataset of rev... NLP, Arts and Entertainment, Movies and TV Shows. Classification
63.18M 369
Multiple languages. Software, NLP, Deep Learning. Classification
0.23M 1137
arXiv quantum physics papers, 1994-2009. Education, NLP, Physics. Classification
92.19M 582
News article dataset from The Indian Express. Business, Arts and Entertainment, News, NLP, Classification, Deep Learning, Linguistics, Recommender Systems. Classification
63.24M 537
Tokenizers. NLP. Classification
14.88M 468
Tamil binary classification, 1K labeled tweets, V1. NLP, Classification. Classification
0.38M 386
Zomato restaurants in Hyderabad. NLP, Ratings and Reviews, Cooking and Recipes, spaCy. Classification
3.44M 1126
NERu dataset. NLP, Text Data, LSTM. Classification
14.5M 377
ELI5 scorer training data: a prototype of 816,000 examples for building a scoring model. ELI5 means "Explain like I am 5". It's originally a long and free-form question-answering scrape from reddit eli5 s... NLP, Earth and Nature, Arts and Entertainment, Education, Social Science, Sports, Regression, Transformers. Classification
672.61M 482
Gutenberg. Education, Software, NLP, Text Data. Classification
14.25M 367
Japanese-English Subtitle Corpus (JESC) [CLEANED], a large corpus of 2.8 million sentences. This dataset is a cleaned version of JESC, produced by handling misspelled English words and doing word segmentation using: English=... NLP, Business, Computer Science, Languages. Classification
220.08M 487
IMDB summaries. Arts and Entertainment, Movies and TV Shows, NLP, Text Data. Classification
93.03M 403
Subreddit data from wallstreetbets and others, for sentiment analysis when backtesting quantitative trading algorithms. All of the submissions to each of the r/wallstreetbets, r/investing, r/options, and r/SecurityAnalysis subreddits since... NLP, Online Communities, Investing. Classification
1.49G 511
Media article collection, 2020 edition. Arts and Entertainment, Computer Science, Education, NLP. Classification
1.63M 560
India subreddit data. Social Networks, NLP. Classification
4.41M 387