NLP

Movie Sentiment Analysis

Comparative study of ML methods and neural networks for sentiment classification of movie reviews.

Course: Introduction to Data Science

Objectives

1. Classify sentiment in movie reviews, comparing traditional ML methods with neural networks.
2. Compare text vectorization methods: bag of words, TF-IDF, and word embeddings.
3. Analyze the effectiveness of regularization techniques (dropout) in reducing overfitting.
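
To make the difference between bag of words and TF-IDF concrete, here is a minimal library-free sketch. The three toy reviews are invented for illustration and are not from the project's dataset, and the plain `log(N / df)` weighting is an assumption: scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2-normalizes each row by default, so its numbers would differ.

```python
import math
from collections import Counter

# Toy reviews, invented for illustration (not the project's data).
reviews = [
    "great movie great acting",
    "terrible movie boring plot",
    "great movie plot",
]

tokenized = [review.split() for review in reviews]
vocab = sorted({token for doc in tokenized for token in doc})

# Document frequency: in how many reviews each term appears.
n_docs = len(tokenized)
doc_freq = {t: sum(1 for doc in tokenized if t in doc) for t in vocab}

# Unsmoothed inverse document frequency (an assumption of this sketch).
idf = {t: math.log(n_docs / doc_freq[t]) for t in vocab}


def bag_of_words(doc):
    """Raw term counts, one entry per vocabulary term."""
    counts = Counter(doc)
    return [counts[t] for t in vocab]


def tf_idf(doc):
    """Term counts scaled by how rare each term is across reviews."""
    counts = Counter(doc)
    return [counts[t] * idf[t] for t in vocab]


bow = [bag_of_words(doc) for doc in tokenized]
tfidf = [tf_idf(doc) for doc in tokenized]

for term, count, weight in zip(vocab, bow[0], tfidf[0]):
    print(f"{term:10s} count={count} tfidf={weight:.3f}")
```

In the first review, bag of words ranks "great" highest because it occurs twice, while TF-IDF gives "movie" a weight of zero (it appears in every review) and ranks the rarer "acting" above "great".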

Conclusions

Logistic Regression with TF-IDF achieved 88% accuracy, comparable to the LSTM networks while being a much simpler model.
Word embeddings capture semantic relationships but require more data to outperform TF-IDF on this dataset.
Dropout at rates of 0.3-0.5 reduces LSTM overfitting, a 10-15% improvement as measured on validation accuracy.
Traditional ML models generalize better on limited data, while neural networks require larger datasets.
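
The dropout mechanism mentioned above can be sketched without any framework. This is an illustrative version of standard "inverted" dropout, not the project's code; the function name `dropout` and its parameters are hypothetical. In Keras the same effect is typically obtained with the `Dropout` layer or the `dropout` argument of the `LSTM` layer.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors so the expected value is unchanged.
    At inference time the activations pass through untouched."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # seeded so runs are repeatable
hidden = [1.0] * 8      # stand-in for one layer's activations

train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
infer_out = dropout(hidden, rate=0.5, training=False, rng=rng)

print("training: ", train_out)
print("inference:", infer_out)
```

Because a different random subset of units is silenced at each training step, the network cannot rely on any single unit, which is the intuition behind dropout's regularizing effect.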

Technologies

NumPy
Matplotlib
Scikit-learn
TensorFlow
Keras