Sentiment Analysis with Deep Learning using BERT

Tech Stack

BERT
PyTorch
Transformers
scikit-learn
Python
Deep Learning
NLP
Fine-tuning
Emotion Classification

Project Overview

This project implements a sophisticated sentiment analysis system using BERT (Bidirectional Encoder Representations from Transformers) for emotion classification. The model was trained on the SMILE dataset to classify text into 6 different emotion categories, demonstrating advanced natural language processing capabilities.

What I Built

Deep Learning Architecture

  • BERT Base Model: Pre-trained BERT model fine-tuned for emotion classification
  • Multi-class Classification: 6 emotion categories (happy, sad, angry, etc.)
  • Fine-tuning Pipeline: Complete training and evaluation workflow
  • Model Deployment: Saved the fine-tuned model for inference and real-time predictions (see the save/load sketch below)
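
Deployment here amounts to persisting the fine-tuned weights and reloading them for inference. A minimal sketch, assuming the model and label_dict from the sections below; the file name is illustrative:

Python
import torch
from transformers import BertForSequenceClassification

# Save the fine-tuned weights (file name is illustrative)
torch.save(model.state_dict(), 'finetuned_bert_emotion.pt')

# Reload for inference: rebuild the architecture, then load the weights
model = BertForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    num_labels=len(label_dict)
)
model.load_state_dict(torch.load('finetuned_bert_emotion.pt', map_location='cpu'))
model.eval()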

Key Features

  • Advanced NLP: Leveraged state-of-the-art BERT architecture
  • Multi-emotion Classification: Classifies text into 6 different emotions
  • High Accuracy: Achieved strong performance metrics
  • Scalable Design: Model can be easily deployed for production use

Technical Implementation

Data Processing Pipeline

  1. Dataset: SMILE Twitter emotion dataset with labeled emotion categories
  2. Preprocessing: Text cleaning, tokenization, and encoding
  3. Data Splitting: Stratified train/validation split (see the sketch below)
  4. Tokenization: BERT tokenizer with special tokens and padding
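
Stratification keeps the emotion distribution consistent between the training and validation sets. A minimal sketch with scikit-learn, assuming df is a DataFrame with text and label columns (the column names are illustrative):

Python
from sklearn.model_selection import train_test_split

# Stratify on the label column so each emotion keeps the same proportion
# in the training and validation sets (column names are illustrative)
X_train, X_val, y_train, y_val = train_test_split(
    df['text'].values,
    df['label'].values,
    test_size=0.15,
    random_state=17,
    stratify=df['label'].values
)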

Model Architecture

Python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# BERT Model Setup: pre-trained encoder with a classification head
# sized to the number of emotion labels
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(label_dict),
    output_attentions=False,
    output_hidden_states=False
)

# Move the model to the GPU if one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Tokenization
tokenizer = BertTokenizer.from_pretrained(
    'bert-base-uncased',
    do_lower_case=True
)

# Data Encoding: add [CLS]/[SEP], pad/truncate to a fixed length,
# and return PyTorch tensors plus attention masks
encoded_data = tokenizer.batch_encode_plus(
    texts,
    add_special_tokens=True,
    return_attention_mask=True,
    padding='max_length',
    truncation=True,
    max_length=256,
    return_tensors='pt'
)
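
The encoded tensors are then wrapped into a dataset so each batch can be unpacked as (input_ids, attention_mask, labels), which is the order the training loop below expects. A minimal sketch, assuming labels is a list of integer label ids aligned with texts:

Python
from torch.utils.data import TensorDataset, DataLoader, RandomSampler

# Bundle input ids, attention masks, and labels in the order the
# training loop expects (batch[0], batch[1], batch[2])
dataset_train = TensorDataset(
    encoded_data['input_ids'],
    encoded_data['attention_mask'],
    torch.tensor(labels)
)

dataloader_train = DataLoader(
    dataset_train,
    sampler=RandomSampler(dataset_train),
    batch_size=32
)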

Training Process

  • Optimizer: AdamW with a learning rate of 1e-5
  • Scheduler: Linear warmup schedule (see the setup sketch below)
  • Batch Size: 32
  • Epochs: Up to 10, with early stopping
  • Evaluation: F1-score and accuracy metrics
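
The optimizer and scheduler described above can be wired up as follows. A minimal sketch, assuming model and dataloader_train from the earlier sections; the warmup step count is illustrative:

Python
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

epochs = 10

optimizer = AdamW(model.parameters(), lr=1e-5, eps=1e-8)

# Linear warmup followed by linear decay over all training steps
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,
    num_training_steps=len(dataloader_train) * epochs
)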

Challenges & Solutions

Challenge 1: Model Complexity

Problem: BERT models are computationally intensive and require careful tuning.

Solution:

  • Used pre-trained BERT base model for efficiency
  • Implemented proper learning rate scheduling
  • Optimized batch size and training parameters

Challenge 2: Data Imbalance

Problem: Emotion categories in the dataset were not evenly distributed.

Solution:

  • Used stratified sampling for the train/validation split
  • Implemented a weighted F1-score for evaluation (see the sketch below)
  • Applied data augmentation techniques
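
The weighted F1-score averages per-class F1 weighted by class support, so minority emotions still contribute without being drowned out. A minimal sketch of the f1_score_func helper referenced in the evaluation code, assuming preds are raw logits and labels are integer ids:

Python
import numpy as np
from sklearn.metrics import f1_score

def f1_score_func(preds, labels):
    # Convert logits to predicted class ids, then compute
    # the support-weighted F1 across all emotion classes
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    return f1_score(labels_flat, preds_flat, average='weighted')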

Challenge 3: Overfitting

Problem: Complex models can overfit on limited training data.

Solution:

  • Monitored performance on a held-out validation set
  • Implemented early stopping (see the sketch below)
  • Applied regularization techniques
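
Early stopping just tracks the validation metric after each epoch and halts once it stops improving. A minimal sketch, assuming the evaluate function and dataloader_validation from the snippets below; the patience value is illustrative:

Python
best_val_f1 = 0.0
patience, epochs_without_improvement = 2, 0  # patience value is illustrative

for epoch in range(epochs):
    # ... run one training epoch, then evaluate on the validation set ...
    val_accuracy, val_f1 = evaluate(dataloader_validation)

    if val_f1 > best_val_f1:
        best_val_f1 = val_f1
        epochs_without_improvement = 0
        torch.save(model.state_dict(), 'best_model.pt')  # keep the best checkpoint
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f'Stopping early after epoch {epoch + 1}')
            break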

Results & Performance

Model Performance

  • Accuracy: High classification accuracy across emotion categories
  • F1-Score: Strong weighted F1-score indicating balanced performance
  • Training Time: Efficient training with GPU acceleration
  • Model Size: Manageable footprint for deployment

Emotion Categories

The model classifies text into 6 emotion categories, which map to the label_dict used throughout the code (see the sketch after this list):

  1. Happy: Positive emotions and joy
  2. Sad: Negative emotions and sadness
  3. Angry: Anger and frustration
  4. Surprise: Unexpected or surprising content
  5. Fear: Fearful or anxious content
  6. Disgust: Disgusting or unpleasant content
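
The label_dict that appears in the model setup and prediction code is simply a mapping from these category names to integer ids. A minimal sketch, assuming the six names listed above; in practice it would be built from the unique labels in the dataset:

Python
# Map each emotion name to an integer class id
# (in practice, built from the unique labels in the dataset)
possible_labels = ['happy', 'sad', 'angry', 'surprise', 'fear', 'disgust']
label_dict = {label: index for index, label in enumerate(possible_labels)}
# e.g. {'happy': 0, 'sad': 1, 'angry': 2, 'surprise': 3, 'fear': 4, 'disgust': 5}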

What I Learned

Technical Skills

  • BERT Architecture: Deep understanding of transformer-based models
  • Fine-tuning: Practical experience with transfer learning
  • NLP Pipelines: End-to-end natural language processing workflows
  • PyTorch: Advanced deep learning framework usage
  • Model Evaluation: Proper evaluation metrics for classification tasks

Deep Learning Concepts

  • Transfer Learning: Leveraging pre-trained models for specific tasks
  • Attention Mechanisms: Understanding transformer attention
  • Tokenization: Advanced text preprocessing techniques
  • Model Optimization: Training and fine-tuning strategies

Code Snippets

Model Training Loop

Python
def train_model(model, dataloader_train, dataloader_validation, optimizer, scheduler):
    # epochs and device are defined in the setup above
    for epoch in range(epochs):
        model.train()
        total_loss = 0

        for batch in dataloader_train:
            # Each batch holds (input_ids, attention_mask, labels)
            batch_input_ids = batch[0].to(device)
            batch_attention_mask = batch[1].to(device)
            batch_labels = batch[2].to(device)

            model.zero_grad()
            outputs = model(batch_input_ids,
                            attention_mask=batch_attention_mask,
                            labels=batch_labels)

            # With labels supplied, the first output is the cross-entropy loss
            loss = outputs[0]
            total_loss += loss.item()

            loss.backward()
            # Clip gradients to stabilize fine-tuning
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
            optimizer.step()
            scheduler.step()

        # Validation after each epoch
        val_accuracy, val_f1 = evaluate(dataloader_validation)
        print(f'Epoch {epoch+1}: Loss = {total_loss/len(dataloader_train):.4f}, '
              f'Val Accuracy = {val_accuracy:.4f}, Val F1 = {val_f1:.4f}')

Evaluation Function

Python
def evaluate(dataloader_val):
    # model and device are defined in the setup above; flat_accuracy and
    # f1_score_func are metric helpers (see the sketches under Challenge 2
    # and after this function)
    model.eval()
    total_eval_accuracy = 0
    total_eval_f1 = 0
    total_eval_loss = 0

    for batch in dataloader_val:
        batch_input_ids = batch[0].to(device)
        batch_attention_mask = batch[1].to(device)
        batch_labels = batch[2].to(device)

        # No gradients needed during evaluation
        with torch.no_grad():
            outputs = model(batch_input_ids,
                            attention_mask=batch_attention_mask,
                            labels=batch_labels)

        loss = outputs[0]
        logits = outputs[1]

        total_eval_loss += loss.item()
        logits = logits.detach().cpu().numpy()
        label_ids = batch_labels.to('cpu').numpy()

        total_eval_accuracy += flat_accuracy(logits, label_ids)
        total_eval_f1 += f1_score_func(logits, label_ids)

    # Average the per-batch metrics
    avg_val_accuracy = total_eval_accuracy / len(dataloader_val)
    avg_val_f1 = total_eval_f1 / len(dataloader_val)

    return avg_val_accuracy, avg_val_f1
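
The flat_accuracy helper referenced above is not shown in this writeup; a conventional implementation, assuming logits and integer label ids as inputs, could look like this:

Python
import numpy as np

def flat_accuracy(preds, labels):
    # Fraction of examples whose highest-scoring class matches the true label
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    return np.sum(preds_flat == labels_flat) / len(labels_flat)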

Prediction Function

Python
def predict_emotion(text, model, tokenizer, label_dict):
    # Tokenize input text
    inputs = tokenizer.encode_plus(
        text,
        add_special_tokens=True,
        return_attention_mask=True,
        padding='max_length',
        truncation=True,
        max_length=256,
        return_tensors='pt'
    )

    # Move tensors to the same device as the model
    input_ids = inputs['input_ids'].to(model.device)
    attention_mask = inputs['attention_mask'].to(model.device)

    # Get prediction (no gradients needed at inference time)
    model.eval()
    with torch.no_grad():
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs[0]
        predictions = torch.argmax(logits, dim=1)

    # Map the predicted class id back to its emotion label
    emotion = list(label_dict.keys())[list(label_dict.values()).index(predictions.item())]
    return emotion
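
A hypothetical usage example, assuming the fine-tuned model, tokenizer, and label_dict from the earlier sections are available:

Python
# Example text is illustrative
text = "I can't believe how wonderful this exhibition was!"
emotion = predict_emotion(text, model, tokenizer, label_dict)
print(f'Predicted emotion: {emotion}')  # e.g. 'happy'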

Future Improvements

  1. Multi-modal Analysis: Integrate text with audio/visual data
  2. Real-time Processing: Optimize for live sentiment analysis
  3. Domain Adaptation: Fine-tune for specific domains (social media, reviews)
  4. Ensemble Methods: Combine multiple models for better accuracy
  5. API Development: Create RESTful API for easy integration

Project Impact

This project demonstrates my ability to:

  • Advanced NLP: Work with state-of-the-art transformer models
  • Deep Learning: Implement complex neural network architectures
  • Transfer Learning: Effectively use pre-trained models
  • Model Optimization: Fine-tune models for specific tasks
  • Production Readiness: Create deployable machine learning solutions

The project showcases practical application of cutting-edge NLP techniques and demonstrates my proficiency in deep learning and natural language processing, making it a valuable addition to my portfolio.