A_Public Datasets 帕依提提 - High-Quality AI Dataset Service Platform

Tweets about the Bollywood movie Kabir Singh Online Communities,Movies and TV Shows,NLP,India Classification

5.48M 498

Shirish Kadam

SMS Spam Ham Prediction Business,Earth and Nature,Internet,Economics,NLP Classification

0.48M 400

Lampu

Warframe Steam user review data The data is crawled from STEAM, up until April 22nd, 2019...NLP,Video Games Classification

20.22M 429

Jiaxu Zhang

Wikipedia Word2Vec: Apache Spark word2vec trained on 200K Wikipedia pages I used Apache Spark to extract more than 6 million phrases from 200,000 English Wikipedia pages. Here is the process of...NLP,Business,Earth and Nature,Text Mining Classification

132.74M 612

Maziyar

ConceptNet Numberbatch vectors: word vectors from ConceptNet These are the word vectors released by the ConceptNet project. ConceptNet is essentially a triple:...NLP Classification

899.91M 456

Nohman

AllenNLP package Computer Science,NLP Classification

715.44M 482

bilal2vec

Hate speech data from the Korean extremist website Womad NLP,Classification Classification

0.16M 439

Yoo Beyoung Woo

Kaggle jobs Computer Science,Education,NLP,Recommender Systems,Search Engines Classification

0.27M 378

AbdullahAli

Vegetables (Google Word2Vec News)...NLP,News Classification

3.73M 1108

Liling Tan

Arabic ULMFiT model: an Arabic language model based on the Ar Wikipedia corpus Arabic is a major world language, yet it is underrepresented on the Internet and there is a lack of resources for Arabic...NLP,Transfer Learning,Languages Classification

160.13M 1159

Abed Khooli

Medical transcriptions: medical transcription data obtained from mtsamples Medical data is extremely hard to find due to HIPAA privacy regulations. This dataset offers a solution by providing med...NLP,Health,Medicine Classification

16.22M 649

Tara Boyle

FakeNewsNet: data collection for fake news research (fake news, disinformation, data mining) This is a repository for an ongoing data collection project for fake news research at ASU. We describe and compare FakeN...NLP,News,Social Science,Social Networks Classification

72.61M 3381

Deepak Mahudeswaran

Strongbad emails Business,NLP,Text Data Classification

0.11M 415

Nolan Conaway

Medium Articles: posts tagged AI, machine learning, data science, or artificial intelligence, plus user information Medium taps into the brains of the world’s most insightful writers, thinkers, and storytellers to bring you the smartes...NLP,Text Data,Literature Classification

1.8G 637

AiswaryaRamachandran

Stack Overflow 2018 questions dataset In this dataset, we explore StackOverflow questions and try to use unsupervised algorithms to extract tags, then train c...NLP,Earth and Nature,Computer Science,Multiclass Classification Classification

230.27M 693

Réda

Thousands of questions about love: this dataset contains love-related questions and answers from a Q&A service Context: RUSSIAN LANGUAGE. This dataset was collected from real answers to questions on the mail.ru service: https://otvet.mail....NLP,Education,Text Data,Languages Classification

176.23M 390

Boris Zubarev

ACL Anthology papers: paper data from the ACL Anthology The accepted papers' data from the ACL Anthology. An abstract of a paper is extracted from arXiv if it exists. The data i...NLP,Education,Literature Classification

1.14M 404

Takahiro Kubo

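curationCorpus (Curation Corpus) The Curation Corpus is a collection of 40,000 professionally written summaries of news articles, with links to the articles themselves. This repository provides a scraper to access them. If you...NLP Text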

123.13M 721

Henry Dashwood

PANDA PANDA is the first gigaPixel-level humAN-centric video dataset, for large-scale, long-term, and multi-object visual analy...Person 2D Box

6.31G 1158

Tsinghua University

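Video Analysis Scale of Engagement (VASE) project dataset This file contains validity testing and rating data for the Video Analysis Scale of Engagement (VASE) project. ...Video Data Classification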

4.5M 834

Lai, Daniel LL; Crutch, Sebastian J; West, Julian; Brotherhood, Emilie V; Harding, Emma; Takhar, Rohan; Firth, Nicholas; Camic, Paul M

Dataset Category

Public datasets