NLP

Movie Sentiment Analysis

Comparative study of ML methods and neural networks for sentiment classification of movie reviews.

Course: Introduction to Data Science

Objectives

  1. Classify the sentiment of movie reviews, comparing traditional ML methods with neural networks.
  2. Compare text vectorization methods: bag of words, TF-IDF, and word embeddings.
  3. Analyze how effectively regularization techniques (dropout) reduce overfitting.
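
To make the vectorization comparison concrete, here is a minimal sketch using Scikit-learn. It assumes a tiny hand-written corpus (not the project's dataset), and the helper name build_classifier is hypothetical. The only thing that changes between the two models is the vectorizer.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus for illustration; the project's real reviews are not shown here.
reviews = [
    "a wonderful film with a great cast",
    "great story and wonderful acting",
    "I loved this brilliant movie",
    "a boring film with a terrible script",
    "terrible acting and a dull story",
    "I hated this awful movie",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative


def build_classifier(vectorizer):
    """Chain a text vectorizer with a logistic regression classifier."""
    return make_pipeline(vectorizer, LogisticRegression(max_iter=1000))


# Same classifier, two different text representations.
bow_model = build_classifier(CountVectorizer())     # raw term counts
tfidf_model = build_classifier(TfidfVectorizer())   # counts reweighted by rarity

bow_model.fit(reviews, labels)
tfidf_model.fit(reviews, labels)

new_reviews = ["wonderful and brilliant", "dull and awful"]
bow_pred = bow_model.predict(new_reviews)
tfidf_pred = tfidf_model.predict(new_reviews)
print("bag of words:", bow_pred)
print("TF-IDF:", tfidf_pred)
```

On a real dataset the two pipelines would be compared on a held-out test split rather than on two hand-picked sentences.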

Conclusions

  • Logistic Regression with TF-IDF achieved 88% accuracy, comparable to the LSTM networks at lower model complexity.
  • Word embeddings capture semantic relationships but require more data to outperform TF-IDF on this dataset.
  • Dropout at rates of 0.3-0.5 reduces LSTM overfitting, improving validation accuracy by 10-15%.
  • Traditional ML models generalize better on limited data, while neural networks require larger datasets.
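
As an illustration of where dropout enters an LSTM classifier, the following Keras sketch may help. The vocabulary size, sequence length, layer widths, and the function name build_lstm_classifier are all assumptions chosen for the example, not the project's actual configuration. Only the dropout rate is taken from the 0.3-0.5 range mentioned above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative hyperparameters (assumptions, not the project's real values).
VOCAB_SIZE = 10_000   # number of distinct token ids
MAX_LEN = 200         # padded review length in tokens
EMBED_DIM = 64        # size of each learned word embedding
DROPOUT_RATE = 0.4    # inside the 0.3-0.5 range


def build_lstm_classifier(dropout_rate=DROPOUT_RATE):
    """Binary sentiment classifier: embedding -> LSTM -> dropout -> sigmoid."""
    model = keras.Sequential([
        keras.Input(shape=(MAX_LEN,), dtype="int32"),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        # dropout= applies dropout to the LSTM inputs during training only.
        layers.LSTM(64, dropout=dropout_rate),
        # Extra dropout on the LSTM output before the classification layer.
        layers.Dropout(dropout_rate),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


model = build_lstm_classifier()

# Random token ids stand in for tokenized, padded reviews.
dummy_batch = np.random.randint(0, VOCAB_SIZE, size=(4, MAX_LEN))
probs = model.predict(dummy_batch, verbose=0)
print(probs.shape)  # one probability per review: (4, 1)
```

Dropout is active only during training, so comparing training and validation accuracy across several values of dropout_rate is one way to observe its effect on overfitting.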

Technologies

  • NumPy
  • Matplotlib
  • Scikit-learn
  • TensorFlow
  • Keras