WebMD Drug Reviews Dataset: user reviews of various drugs
The dataset provides user reviews on specific drugs along with related conditions, side effects, age, sex, and ratings r... NLP,Computer Science,Education,Tabular Data,Drugs and Medications Classification
168.58M
243
Rohan Harode
Wine review data from wine tasters: use text classification to identify the reviewer behind each review
Thinking of Natural Language Processing as a beginner!! The dataset is about the wine comments or reviews that has... NLP,Business,News,Text Data,Multiclass Classification,Alcohol Classification
50.35M
513
Subhasree Mohapatra
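Multimodal hate speech: 150,000 tweets with text and images, for hate speech detection
Existing hate speech datasets contain only textual data. We created a new manually annotated multimodal hate speech dataset consisting of 150,000 tweets, each... NLP,Online Communities,Image Data,Multiclass Classification,Social Networks Classification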
6.55G
533
Victor Callejas Fuentes
Text data with language labels. It can be used for language detection.
Language Detection Dataset: text data with language labels. It can be used for language detection.... NLP,Classification,Computer Science,Multiclass Classification,Languages Classification
31.7M
506
Ishant
Test case dataset: a collection of sample datasets used in software testing
There are lots of datasets available for different machine learning tasks like NLP, computer vision, etc. However, I could... NLP,Deep Learning,Earth and Nature Classification
1.3M
299
sapal6
Tanglish sentiment analysis tweets: 4 labels are used to describe tweet sentiment
So it all started when I was looking for abusive Tamil tweets in the Roman script to use for a project and instead of fi... NLP,Deep Learning,Online Communities,People Classification
0.85M
246
vyom bhatia
Goodreads book dataset with 10M user ratings
Arts and Entertainment,Social Science,NLP,Literature,Recommender Systems Classification
1128.5M
425
Bahram Jannesar
Subreddit data from wallstreetbets and others, for sentiment analysis in backtesting quantitative trading algorithms
All of the submissions to each of the r/wallstreetbets, r/investing, r/options, and r/SecurityAnalysis subreddits since... NLP,Online Communities,Investing Classification
1.49G
243
Sheridan Green
ELI5 scorer training data prototype: 816,000 examples, for building a scoring model
ELI5 means "Explain like I am 5". It's originally a long and free-form Question-Answering scraping from reddit eli5 s... NLP,Earth and Nature,Arts and Entertainment,Education,Social Science,Sports,Regression,Transformers Classification
672.61M
250
Neuron Engineer
IMDB sentiment classification dataset, tokenized with spaCy and stored in JSON format
Context: IMDB sentiment classification dataset derived from torchtext, tokenized using spaCy and then stored as JSON... NLP,Beginner,Earth and Nature,Movies and TV Shows,Text Data,Binary Classification,spaCy Classification
104.31M
236
Manoj Patra
News Headlines Dataset for Sarcasm Detection: a high-quality dataset for sarcasm and fake news detection tasks
Past studies in Sarcasm Detection mostly make use of Twitter datasets collected using hashtag-based supervision but such... NLP,Deep Learning,Classification,Earth and Nature,Computer Science,Programming Classification
11.13M
256
Rishabh Misra
OSCAR Nepali corpus: a Nepali text corpus for training unsupervised language models for NLP
The files are from [OSCAR Corpus](https://oscar-corpus.com/). Please visit their site for more information. The dataset i... NLP,Computer Science,Movies and TV Shows,Text Data,Languages Classification
3.1G
302
Prabesh Dhakal
English multi-speaker corpus for voice cloning: the CSTR VCTK Corpus
This CSTR VCTK Corpus includes speech data uttered by 109 native speakers of English with various accents. Each speaker... NLP,Audio Data Classification
15.22G
381
Michael Fekadu