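Word2vec trained on Wikipedia data (unigrams + bigrams), to capture unigrams and bigrams
This is a word embedding model built from Wikipedia plus comments from various sources. Unlike bigrams created with a phrase-based approach (which does not consider the phrase/bigram context of neighboring words), this...NLP,Computer Science,Software,Programming,Neural Networks Classification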
8.62G
573
aintnosunshine
Flickr image dataset, Flickr image captioning dataset
The Flickr30k dataset has become a standard benchmark for sentence-based image description. This paper presents Flickr30...NLP,Image Data,Computer Vision Classification
8.2G
606
Hsankesara
300-dimensional pretrained word vectors released by Facebook: 2 million word vectors trained on Common Crawl
300-dimensional pretrained FastText English word vectors released by Facebook. The first line of the file contains the nu...NLP,Arts and Entertainment Classification
650M
594
Manish Maharjan
SMS Spam Ham Prediction
Business,Earth and Nature,Internet,Economics,NLP Classification
0.48M
352
Lampu
Stanford GloVe 200d dataset, converted to word2vec format
Is the Stanford GloVe 200d dataset converted to word2vec format...NLP,Computer Science Classification
661.31M
902
the kwisatz haderach
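SMILES OCR dataset, containing more than 900,000 single-product reactions in SMILES format
SMILES (Simplified Molecular Input Line Entry System) is a line notation (a typographical method using printable characters) for entering and representing molecules and reactions. This dataset contains more than...NLP,Chemistry Classification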
175M
1176
Elahi
Wikipedia Word2Vec: Apache Spark word2vec trained on 200K Wikipedia pages
I used Apache Spark to extract more than 6 million phrases from 200,000 English Wikipedia pages. Here is the process of...NLP,Business,Earth and Nature,Text Mining Classification
132.74M
531
Maziyar
ConceptNet Numberbatch vectors, word vectors from ConceptNet
These are the word vectors released by the ConceptNet project. ConceptNet is essentially a triple:...NLP Classification
899.91M
411
Nohman
Reddit vectors dataset, for training the sense2vec model
The sense2vec word embeddings model works better than word2vec, since it utilises contextual information from words. This re...NLP,Computer Science,Text Data,spaCy Classification
635.76M
828
Poonam Ligade
Medium Articles: posts tagged AI, machine learning, data science, or artificial intelligence, along with user information
Medium taps into the brains of the world’s most insightful writers, thinkers, and storytellers to bring you the smartes...NLP,Text Data,Literature Classification
1.8G
512
AiswaryaRamachandran
Entity extraction from Pitchfork reviews
Business,Arts and Entertainment,Music,Retail and Shopping,NLP,Popular Culture Classification
14.49M
927
Justin K
Stack Overflow 2018 questions dataset
In this dataset, we explore StackOverflow questions and try to use unsupervised algorithms to extract tags, then train c...NLP,Earth and Nature,Computer Science,Multiclass Classification Classification
230.27M
579
Réda
ACL Anthology papers, with paper data from the ACL Anthology
Data on accepted papers from the ACL Anthology. An abstract of a paper is extracted from arXiv if it exists. The data i...NLP,Education,Literature Classification
1.14M
353
Takahiro Kubo
123.13M
602
Henry Dashwood
JHU-CROWD++
A large-scale unconstrained crowd counting dataset. A comprehensive dataset with 4,372 images and 1.51 million annotation...Person 2D Box
2.87G
866
JHU-VIU lab
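PCR data for corticotropin-releasing hormone (CRH) and glucocorticoid receptor (GR) in the hippocampus of male and female rats
These files contain, for corticotropin-releasing hormone (CRH) and glucocorticoid receptor (GR) measured in the hippocampus and hypothalamus of male and female rats, all of the quantitative real-time PCR...Others Classification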
0.04M
824
Rowe, Rachel; Bromberg, Caitlin; Condon, Andrew; Ridgway, Samantha; Krishna, Gokul; GarciaFilion, Pamela; Adelson, P. David; Thomas, Theresa C.