Predicting real v. fake news using natural language processing

Last updated on Aug 5, 2022

For this project, I wanted to see whether I could use machine learning to reliably predict whether a news article was real or fake. I used a collection of real and fake news articles, which can be found here. I compared two models: Naive Bayes with bag-of-words features and an artificial neural network (ANN) with word embeddings. The ANN outperformed Naive Bayes, reaching an accuracy of 94% on the final test set. For a bit of fun, I then evaluated how well these models generalized to a random sample of POTUS tweets from 2019-2020 (found here).
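To give a feel for the simpler of the two approaches, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with only the Python standard library. It is not the code from this project: the whitespace tokenizer, the add-one smoothing, the function names, and the four toy headlines are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_doc_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    class_doc_counts, word_counts, vocab = model
    total_docs = sum(class_doc_counts.values())
    scores = {}
    for label, n_docs in class_doc_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, made up for this example only.
train_docs = [
    "senate passes budget bill after long debate",
    "city council approves new transit funding",
    "shocking miracle cure doctors hate revealed",
    "you will not believe this shocking secret",
]
train_labels = ["real", "real", "fake", "fake"]

model = train_naive_bayes(train_docs, train_labels)
prediction = predict(model, "shocking secret cure revealed")
print(prediction)
```

The model treats each article as an unordered bag of words, which is why it is cheap to train but blind to word order and context. Word embeddings fed to a neural network are one way to recover some of that lost information.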

The code for this project and an outline of my findings can be found here.

Data science

Lea E. Frank