Public Datasets

Indonesian Pantun | NLP, Literature, Text Data, Art, Languages | Classification
0.06M 369
Indonesian Puisi | NLP, Literature, Text Data, Art, Languages | Classification
10.07M 488
Research Articles Dataset | Earth and Nature, NLP, Research | Classification
31.44M 392
Mann Ki Baat | Business, News, Government, Politics, NLP, Psychology, India, Languages | Classification
1.44M 366
A Million News Headlines | Format: CSV; single file. publish_date: date of publication of the article in yyyyMMdd format. headline_text: text of the h... | NLP, News | Classification
57.43M 475
Tongue Twisters Dataset: a dataset of tongue twisters (in English) | This is a dataset consisting of tongue twisters (in English), mostly from web scraping. This dataset contains about 600 s... | NLP, TensorFlow, Languages | Classification
0.16M 578
A dataset of labeled and unlabeled sentences from reviews containing conditions | This dataset was created during my PhD (http://www.tdg-seville.info/fogallego/Personal%20Info) at the University of Sevi... | NLP, Text Data, Universities and Colleges, Ratings and Reviews | Classification
794.68M 1004
WebMD Drug Reviews Dataset: user reviews of various drugs | The dataset provides user reviews on specific drugs along with related conditions, side effects, age, sex, and ratings r... | NLP, Computer Science, Education, Tabular Data, Drugs and Medications | Classification
168.58M 449
Plain Text Wikipedia: each file contains a collection of Wikipedia articles | Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d... | NLP, Computer Science, Text Data, Text Mining | Classification
23.71G 577
German News Dataset | Computer Science, Internet, Education, Software, News, NLP | Classification
726.72M 574
Wine review data from wine tasters: using text classification to identify the reviewer of each review | Thinking of Natural Language Processing as a beginner!! The dataset has been about the wine comments or reviews that has... | NLP, Business, News, Text Data, Multiclass Classification, Alcohol | Classification
50.35M 970
Arabic News Articles from Aljazeera.net | Business, Education, News, NLP, Text Data, Psychology, Text Mining | Classification
111.89M 883
Online Food Delivery Preferences in the Bangalore Region | Business, Food, NLP, Text Data, Geospatial Analysis, Jobs and Career | Classification
0.23M 472
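Multimodal Hate Speech: 150,000 tweets with text and images, for hate detection | Existing hate speech datasets contain only text data. We created a new, manually annotated multimodal hate speech dataset consisting of 150,000 tweets, each... | NLP, Online Communities, Image Data, Multiclass Classification, Social Networks | Classification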
6.55G 1306
Collection of tags from 202 StackExchange sites | This data is extracted from StackExchange for 200+ websites under its umbrella. This data consists of all possible... | NLP, Business, Online Communities, Text Data | Classification
16.75M 417
Name Languages | Email and Messaging, NLP, Deep Learning, LSTM | Classification
0.16M 325
Short Story Corpus: the collected short stories of Edgar Allan Poe | Content: The present data set includes the full corpus of 69 Edgar Allan Poe's short stories in tabular format. In add... | NLP, Text Data, Literature, Text Mining | Classification
1.86M 923
Email Classification NLP | Business, Computer Science, Internet, Email and Messaging, NLP | Classification
0.1M 354
Arabic RT News Headlines 20200419 | News, NLP, Text Data, Languages | Classification
88.17M 322
Huge Reddit Data | Online Communities, Social Networks, NLP, Basketball | Classification
38.72M 815