
1000 Parallel Sentences
Linguistics,Languages
Classification
# Context

A few years ago, I analyzed a Korean corpus to find its 1000 most frequent words. I then asked native speakers to translate those words and their example sentences into English, Japanese, Spanish, and Indonesian. I had completely forgotten about this data since then, but it occurred to me that it might be helpful to some people. Admittedly, 1000 sentences make for a pretty small corpus, but parallel corpora are hard to come by.

# Content

It's a CSV file. As you would expect, the first line is the header.

* ID: ID of the headword. Entries are arranged in alphabetical order.
* HEADWORD: The 1000 most frequent Korean words.
* POS: Part of speech.
* ENGLISH: English meaning or equivalent.
* JAPANESE: Japanese meaning or equivalent.
* SPANISH: Spanish meaning or equivalent.
* INDONESIAN: Indonesian meaning or equivalent.
* EXAMPLE (KO): An example sentence.
* EXAMPLE (EN): English translation.
* EXAMPLE (JA): Japanese translation.
* EXAMPLE (ES): Spanish translation.
* EXAMPLE (ID): Indonesian translation.

# Inspiration

For now, I'm not sure how this small corpus can be used. Hopefully it will be helpful for some pilot linguistic project.
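
As a rough illustration of the column layout listed under Content, here is a minimal Python sketch that reads the CSV by its header and pulls out example-sentence pairs for two languages. It assumes the header cells match the column names above exactly. The helper names `read_rows` and `sentence_pairs` are hypothetical, and the single sample row is a made-up placeholder, not a row from the dataset.

```python
import csv
import io

# Column names as listed in the Content section (assumed to match the header exactly).
COLUMNS = [
    "ID", "HEADWORD", "POS",
    "ENGLISH", "JAPANESE", "SPANISH", "INDONESIAN",
    "EXAMPLE (KO)", "EXAMPLE (EN)", "EXAMPLE (JA)",
    "EXAMPLE (ES)", "EXAMPLE (ID)",
]


def read_rows(handle):
    """Yield one dict per data row, keyed by the names in the header line."""
    yield from csv.DictReader(handle)


def sentence_pairs(rows, src="KO", tgt="EN"):
    """Collect (source, target) example-sentence pairs for two language codes."""
    src_col, tgt_col = f"EXAMPLE ({src})", f"EXAMPLE ({tgt})"
    return [(r[src_col], r[tgt_col]) for r in rows if r[src_col] and r[tgt_col]]


# Made-up placeholder row for illustration only; NOT taken from the dataset.
SAMPLE = io.StringIO(
    ",".join(COLUMNS) + "\n"
    "1,hw,noun,en,ja,es,id,ko sentence,en sentence,ja sentence,es sentence,id sentence\n"
)

pairs = sentence_pairs(read_rows(SAMPLE))
print(pairs)
```

To use it on the downloaded file, replace `SAMPLE` with something like `open(path, encoding="utf-8", newline="")`, where `path` is wherever you saved the CSV. The UTF-8 encoding is an assumption, not something the description states.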
Copyright information
- Data size: 0.26M
- Publisher: Kyubyong Park
- Citation URL:
- License: CC BY-NC-SA 4.0