Public Datasets

CC-100 Kannada monolingual dataset: 13 million monolingual sentences from web crawl data. This monolingual dataset includes roughly 13 million uncleaned Kannada sentences crawled from numerous websites.... NLP,Text Data,Languages Classification
3.51G 439
Emotions in text: text data with the primary emotion expressed in each sentence. I was looking for a well labeled dataset to perform a multiclass classification. I wanted to do something more than just... NLP,Earth and Nature,Text Data,Multiclass Classification Classification
2.15M 515
四元 2.0 (Quaternary 2.0) NLP,Deep Learning,Brazil Classification
74.9M 356
Kennedy University speeches NLP,Text Data,Websites Classification
7.5M 322
OZON product categories Business,NLP,Text Data,Multiclass Classification,Marketing Classification
181.16M 344
Chatbot intent recognition corpus from AskUbuntu. Context: 190 questions and answers from https://askubuntu.com. Content: What's inside is more than just rows and columns... NLP,Artificial Intelligence Classification
0.23M 1034
Siyam Rupali Bengali font NLP,International Relations Classification
0.38M 349
Team dataset (csv) Transportation,NLP Classification
72.43M 372
Bash.im quotes Internet,NLP,Text Data,Text Mining,Russia Classification
38.65M 505
Language Detection Dataset: Text data with language labels. It can be used for language detection.... NLP,Classification,Computer Science,Multiclass Classification,Languages Classification
31.7M 1013
Test case dataset: a collection of sample datasets used in software testing. There are lots of datasets available for different machine learning tasks like NLP, Computer vision etc. However I could... NLP,Deep Learning,Earth and Nature Classification
1.3M 657
Russian toxic comments Internet,Social Networks,NLP,Text Data Classification
37.45M 345
NLP: report and news classification Social Science,Investing,NLP,Literature,Environment,Binary Classification,Multilabel Classification,Water Bodies Classification
0.03M 362
COVID fake news dataset Health,News,Coronavirus,NLP Classification
1.06M 414
Diseases Health,Health Conditions,NLP,Russia Classification
4.47M 330
Topic modeling for research papers Business,Earth and Nature,Education,NLP,Psychology Classification
21.96M 407
COVID19-related FAQs: a collection of questions and answers related to COVID19. What is this? This data contains collection of question and answers related to COVID19. Where does this come from? Thi... NLP,Health,Coronavirus,Psychology,Diseases Classification
0.1M 498
GENIA biomedical event dataset. Context: Bio-medical texts have a lot of information which can be used for developments in the medical field. Traditionall... NLP,Biology,Text Mining,Medicine Classification
2.67M 787
Chinese stop words Earth and Nature,NLP Classification
0.03M 403
Tanglish sentiment analysis tweets: 4 labels are used to describe the sentiment of each tweet. So it all started when I was looking for Abusive Tamil tweets in the Roman Script to use for a project and instead of fi... NLP,Deep Learning,Online Communities,People Classification
0.85M 501