"单" | Public Datasets | Payititi (帕依提提): High-Quality AI Dataset Service Platform

2017 Canadian National Pollutant Release Inventory | Others Classification

4.44M 286

Mitch Roman

CC-100 Kannada Monolingual Dataset: 13 million monolingual sentences from web crawl data | This monolingual dataset includes roughly 13 million uncleaned Kannada sentences crawled from numerous websites.... | NLP, Text Data, Languages Classification

3.51G 444

Darshan

SpongeBob SquarePants Transcripts | Arts and Entertainment, NLP Classification

4.85M 394

Mikhail Gaerlan

Word Difficulty Prediction | Computer Science, Games, NLP, Text Data, Languages Classification

1.85M 447

koustubhk

Sentiment Analysis Unit Dataset | Earth and Nature, Software, NLP, Africa Classification

0M 354

Dr. Paul Azunre

fastText Aligned Word Vectors | Education, NLP Classification

18167.9M 1199

Max Söderman

fastText Pre-trained Persian Word Vectors | Health, NLP Classification

4316.7M 1043

javad helali

fastText Pre-trained Arabic Word Vectors | NLP Classification

4313.21M 1017

javad helali

NLP Word2Vec: existing word2vec embeddings, including GloVe and Google News, from models trained to reconstruct the linguistic contexts of words | Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neur... | NLP, Computer Science Classification

5.89G 627

pkugoodspeed

300-Dimensional Pretrained FastText English Word Vectors Released by Facebook | 300-dimensional pretrained FastText English word vectors released by Facebook. The first line of the file contains the nu... | NLP, Arts and Entertainment, Games Classification

4.52G 597

Vladimir Demidov

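GloVe: An Unsupervised Learning Algorithm for Obtaining Vector Representations of Words | GloVe is an unsupervised learning algorithm for obtaining vector representations of words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations show... | NLP, Deep Learning, Education Classification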

1.5G 1297

JdPaletto
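
The GloVe entry above refers to aggregated global word-word co-occurrence statistics. As a rough illustration of what such statistics are, here is a minimal sketch that counts how often two words appear near each other. It assumes whitespace tokenization and a symmetric window of 2 tokens; the function name `cooccurrence_counts` and the toy corpus are hypothetical and not part of any listed dataset. This is only the counting step, not GloVe itself, which goes on to fit word vectors to statistics of this kind.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count ordered word pairs that occur within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: lowercase + whitespace split stands in for a real tokenizer.
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[(word, tokens[j])] += 1
    return counts


# Toy corpus, for illustration only.
corpus = ["the cat sat on the mat", "the dog sat on the log"]
counts = cooccurrence_counts(corpus)
print(counts[("sat", "on")])
```

Because the window is symmetric, the count for a pair `(a, b)` equals the count for `(b, a)`; pairs that never fall within the window simply have a count of zero.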

NLP: Simple Math Questions from a Chatbot Application | Earth and Nature, Internet, Education, NLP Classification

0M 501

dusan2

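Word2vec Trained on Wikipedia Data (Unigrams + Bigrams) to Capture Unigrams and Bigrams | This is a word embedding model created from Wikipedia plus comments from various sources. Unlike creating bigrams with a phrase-based approach (which does not consider the phrase/bigram context of neighboring words), this... | NLP, Computer Science, Software, Programming, Neural Networks Classification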

8.62G 688

aintnosunshine

GloVe (Global Vectors for Word Representation) Dataset Based on Reddit Comments | GloVe Reddit Comments Global Vectors for Word Representation based on Reddit comments... | NLP Classification

19.1G 637

Leigh

Simple LSTM (Long Short-Term Memory Neural Network) Model Output Dataset | Simple LSTM model epoch4... | NLP Classification

6.32G 361

Bo Wang

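SMILES OCR Dataset Containing More Than 900,000 Single-Product Reactions in SMILES Format | SMILES (Simplified Molecular-Input Line-Entry System) is a line notation (a typographical method using printable characters) for entering and representing molecules and reactions. This dataset contains more than... | NLP, Chemistry Classification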

175M 1368

Elahi

Medical Transcriptions: medical transcription data obtained from mtsamples | Medical data is extremely hard to find due to HIPAA privacy regulations. This dataset offers a solution by providing med... | NLP, Health, Medicine Classification

16.22M 634

Tara Boyle

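Chromonomer: A Tool Set for Repairing and Enhancing Assembled Genomes Through Integration of Genetic Maps and Conserved Synteny | The pace of sequencing and computational assembly of new reference genomes is accelerating. Despite continued improvements in DNA sequencing technologies and assembly software tools, biological features of genomes, such as repetitive sequences and... | Others Classification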

192.28M 1015

Catchen, Julian; Amores, Angel; Bassham, Susan

Dataset Category: Public Datasets