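Word2vec trained on Wikipedia data (unigrams + bigrams), capturing both unigrams and bigrams
This is a word embedding model built from Wikipedia plus comments from various sources. Unlike creating bigrams with a phrase-based approach (which does not consider the phrase/bigram context of neighboring words), this...NLP,Computer Science,Software,Programming,Neural Networks Classification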
8.62G
283
aintnosunshine
Wikipedia Word2Vec, Apache Spark word2vec trained on 200K Wikipedia pages
I used Apache Spark to extract more than 6 million phrases from 200,000 English Wikipedia pages. Here is the process of...NLP,Business,Earth and Nature,Text Mining Classification
132.74M
305
Maziyar
Wikipedia Sentences, 7.8 million sentences collected from an English Wikipedia dump
The Wikipedia dump is a giant XML file and contains loads of not-so-useful content. I needed some English text for some...NLP,Text Mining Classification
891.28M
312
Mike Ortman
Wikipedia Movie Plots
Arts and Entertainment,Movies and TV Shows,NLP,Text Data,Recommender Systems Classification
77.43M
535
JustinR