News Clickbait Dataset

Business,Online Communities,News,NLP,Classification,Deep Learning,Text Data

Classification

Description

Online content publishers often use catchy headlines to attract users to their websites. These headlines, popularly known as clickbait, exploit a user's curiosity gap and lure them into clicking on links that often disappoint. Existing methods for automatically detecting clickbait rely on heavy feature engineering and domain knowledge.

Dataset

train1.csv was collected from Abhijnan Chakraborty, Bhargavi Paranjape, Sourya Kakarla, and Niloy Ganguly, "Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media", in Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, US, August 2016. [GitHub](https://github.com/bhargaviparanjape/clickbait/tree/master/dataset) It has two columns: the first contains headlines and the second contains numerical labels, where 1 marks a clickbait headline and 0 marks a non-clickbait headline. The dataset contains 32,000 rows in total, of which 50% are clickbait and 50% are non-clickbait.

train2.csv was collected from the [Clickbait news detection dataset](https://www.kaggle.com/c/clickbait-news-detection/data) of the Kaggle InClass Prediction Competition. It contains the title and text of each news item, along with a label.
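As a quick sanity check on the train1.csv layout described above (headline in the first column, 0/1 label in the second), the following minimal sketch parses rows by position and computes the share of each label. It rests on several assumptions: that the file has a single header row, and that the helper names `read_headlines` and `label_shares` are purely illustrative. The column names are not stated in the description, so none are hard-coded. The sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def read_headlines(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse CSV rows of (headline, label), addressing columns by position.

    Assumes one header row, which is skipped. Accepts an open file or any
    iterable of CSV lines.
    """
    reader = csv.reader(lines)
    next(reader, None)  # skip the assumed header row
    rows = []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        rows.append((row[0], int(row[1])))
    return rows


def label_shares(rows: List[Tuple[str, int]]) -> Dict[int, float]:
    """Return the fraction of rows carrying each label."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Made-up sample standing in for train1.csv (not real dataset rows).
sample = [
    "headline,clickbait",
    '"You Won\'t Believe What Happened Next",1',
    '"Central bank holds interest rates steady",0',
]
rows = read_headlines(sample)
shares = label_shares(rows)
```

To run this on the real file, pass an open file handle instead of the sample list, for example `with open("train1.csv", newline="", encoding="utf-8") as f: rows = read_headlines(f)`; the UTF-8 encoding is an assumption. If the description holds, `label_shares` should report roughly 0.5 for each label across the 32,000 rows.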