2025 · Complete · LSTM · FinBERT · PyTorch · Multi-modal

Multi-Modal Apple Stock Prediction

Fused LSTM time-series features with FinBERT sentiment embeddings to predict AAPL direction and price, beating a logistic-regression baseline by 15 percentage points of accuracy. IEOR 242B team capstone.

01

Overview

IEOR 242B final project, built with a team of six. Many traditional stock models rely purely on price and volume and leave sentiment on the table. We built a multi-modal deep-learning architecture that treats the tape and the news as two complementary signals and fuses them at the representation layer.

02

Process

  1. Time-series branch

    Pulled five years of AAPL OHLCV data from yfinance and Alpha Vantage. Engineered MACD, RSI, and 5- and 10-day moving averages. Built a hybrid 1D-CNN + bidirectional LSTM, using SMOTE oversampling and focal loss to handle the imbalanced up/down/neutral labels.

  2. Sentiment branch

    Collected date-matched financial news from Kaggle. Built three parallel embedders: FinBERT for contextual, finance-aware vectors; Word2Vec as a fast static baseline; and a custom LSTM encoder for purely sequential learning. Each produced 128-d vectors per day.

  3. Fusion & evaluation

    Concatenated the outputs of both branches, fed them into an MLP head, and trained end-to-end for both regression (next-day close) and classification (up/down). Validated against a logistic-regression baseline to isolate the sentiment lift.
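
The indicators named in the time-series step can be sketched in plain Python as follows. This is an illustration, not the project's code: the function names, the 12/26/9 MACD spans, the 14-day RSI period, and the simple-average RSI variant (rather than Wilder's smoothing) are all assumptions, since the write-up does not state which settings were used.

```python
from typing import List, Optional, Tuple


def sma(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window is available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    if not values:
        return []
    alpha = 2.0 / (span + 1)
    out = [values[0]]  # seed with the first observation
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes: List[float], fast: int = 12, slow: int = 26,
         signal: int = 9) -> Tuple[List[float], List[float]]:
    """MACD line (fast EMA minus slow EMA) and its signal line (EMA of MACD)."""
    fast_ema, slow_ema = ema(closes, fast), ema(closes, slow)
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    return line, ema(line, signal)


def rsi(closes: List[float], period: int = 14) -> List[Optional[float]]:
    """RSI from simple averages of gains and losses over the last `period` changes."""
    out: List[Optional[float]] = [None] * len(closes)
    for i in range(period, len(closes)):
        changes = [closes[j] - closes[j - 1] for j in range(i - period + 1, i + 1)]
        gain = sum(c for c in changes if c > 0) / period
        loss = sum(-c for c in changes if c < 0) / period
        out[i] = 100.0 if loss == 0 else 100.0 - 100.0 / (1.0 + gain / loss)
    return out
```

Calling `sma(closes, 5)` and `sma(closes, 10)` on the daily close series gives the two moving-average features; the leading `None` values mark days where the window is not yet full and would need to be dropped or masked before training.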
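
To make the fusion step concrete, here is a minimal PyTorch sketch of a CNN + bidirectional LSTM price branch concatenated with a daily sentiment vector and passed to an MLP with a regression head and a classification head. Everything beyond that outline is an assumption for illustration: the class names `PriceBranch` and `FusionModel`, the layer widths, the 9 input features, the 20-day window, the use of a single sentiment vector per sample, and the focal-loss `gamma` are hypothetical and not taken from the project.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PriceBranch(nn.Module):
    """1D-CNN followed by a bidirectional LSTM over a window of daily features."""

    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.conv = nn.Conv1d(n_features, 32, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, n_features); Conv1d expects (batch, channels, window).
        h = F.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        _, (h_n, _) = self.lstm(h)
        # Concatenate the final forward and backward hidden states.
        return torch.cat([h_n[0], h_n[1]], dim=1)  # (batch, 2 * hidden)


class FusionModel(nn.Module):
    """Concatenates price and sentiment representations, then applies an MLP."""

    def __init__(self, n_features: int, sent_dim: int = 128,
                 hidden: int = 64, n_classes: int = 2):
        super().__init__()
        self.price = PriceBranch(n_features, hidden)
        self.mlp = nn.Sequential(nn.Linear(2 * hidden + sent_dim, 64), nn.ReLU())
        self.reg_head = nn.Linear(64, 1)          # next-day close
        self.cls_head = nn.Linear(64, n_classes)  # direction logits

    def forward(self, prices: torch.Tensor, sentiment: torch.Tensor):
        fused = torch.cat([self.price(prices), sentiment], dim=1)
        z = self.mlp(fused)
        return self.reg_head(z).squeeze(-1), self.cls_head(z)


def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 2.0) -> torch.Tensor:
    """Multi-class focal loss: mean of (1 - p_t)^gamma * cross-entropy."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)  # model probability assigned to the true class
    return ((1.0 - p_t) ** gamma * ce).mean()


if __name__ == "__main__":
    model = FusionModel(n_features=9)
    prices = torch.randn(4, 20, 9)   # 4 samples, 20-day window, 9 features
    sentiment = torch.randn(4, 128)  # one 128-d sentiment vector per sample
    price_pred, logits = model(prices, sentiment)
    loss = F.mse_loss(price_pred, torch.zeros(4)) + focal_loss(
        logits, torch.tensor([0, 1, 1, 0]))
    loss.backward()
```

Focal loss down-weights examples the model already classifies confidently, which is why it is a common choice when one direction label dominates; with `gamma=0` it reduces to ordinary cross-entropy.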

03

Result

Regression: MAE 0.6146, RMSE 0.7198. Classification: 53.70% accuracy vs. 38.65% for the baseline (a 15.05-percentage-point lift), ROC-AUC 0.5838, F1 0.3866. The sentiment branch measurably helped, which is evidence that even noisy public text carries price-relevant signal. The real lesson: label imbalance matters more than model depth at this sample size.
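
For readers who want to check how the headline numbers relate, the sketch below shows the standard definitions of MAE, RMSE, and accuracy, plus the percentage-point lift. The helper names are hypothetical and the sample inputs are toy values, not project data; only the two accuracy figures passed to `lift_pp` come from the results above.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error; never smaller than MAE on the same data."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def lift_pp(model_acc_pct: float, baseline_acc_pct: float) -> float:
    """Accuracy lift in percentage points (a difference, not a ratio)."""
    return model_acc_pct - baseline_acc_pct


if __name__ == "__main__":
    # Toy values for the error metrics; reported accuracies for the lift.
    print(mae([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))
    print(rmse([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))
    print(round(lift_pp(53.70, 38.65), 2))
```

The reported RMSE (0.7198) exceeding the reported MAE (0.6146) is consistent with these definitions, since squaring penalizes larger errors more heavily.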

By the numbers

Accuracy lift: +15pp

Regression RMSE: 0.72

Team size: 6
