Time Series Forecasting: ARIMA, LightGBM, and LSTM Compared

Time series data has seasonality, trend, and temporal dependencies that standard ML workflows ignore. Here is when to use ARIMA vs. LightGBM with lag features vs. LSTM -- and why random data splits are a critical mistake.

Mahmudul Haque Qudrati

CEO & ML Engineer

May 18, 2026

12 min read

// tags

#time-series #forecasting #arima #lightgbm #lstm #prophet #machine-learning


Time series forecasting is one of the most practically valuable and most frequently botched areas of applied machine learning. The demand forecast that saves a retailer millions in inventory costs, the energy load prediction that prevents grid failures, the sales forecast that guides hiring decisions -- all of these are time series problems. And all of them have a fundamental property that standard ML workflows ignore at their peril: observations are not independent. What happened yesterday predicts what happens today.

This dependency violates the core assumption of most ML workflows and requires fundamentally different approaches to data preparation, model selection, and evaluation.

Understanding Time Series Structure

A time series is a sequence of observations indexed by time. Unlike tabular data where rows are independent examples, time series observations are inherently ordered and temporally dependent.

Time series typically exhibit three components:

Trend: A long-term directional movement. Revenue grows 15% year-over-year. A user base expands. Global temperatures rise. Trend is often captured by fitting a line or polynomial to the data over time.

Seasonality: Regular, periodic patterns that repeat. Retail sales spike in November-December. Air conditioning load peaks in summer. Website traffic drops on weekends. Seasonality has a fixed period (daily, weekly, annual) and repeatable shape.

Residual (or noise): What is left after removing trend and seasonality. Ideally, the residual is random noise. In practice, it often contains additional structure such as autocorrelation: today's residual correlates with yesterday's.

Understanding these components guides model selection and feature engineering.
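The three components can be made concrete with a small classical additive decomposition. This is a minimal sketch on synthetic data, assuming a daily series with a weekly cycle; the series, the `weekly_pattern` values, and the variable names are invented for illustration, and libraries such as statsmodels offer more complete decompositions.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (illustrative): linear trend + weekly pattern + noise
rng = np.random.default_rng(0)
n_days = 140
t = np.arange(n_days)
weekly_pattern = np.array([5.0, 3.0, 0.0, -1.0, -2.0, -3.0, -2.0])  # sums to zero
series = pd.Series(
    100 + 0.5 * t + weekly_pattern[t % 7] + rng.normal(0, 0.5, n_days),
    index=pd.date_range("2024-01-01", periods=n_days, freq="D"),  # starts on a Monday
)

# Trend: centered 7-day moving average, which averages out one full weekly cycle
trend = series.rolling(window=7, center=True).mean()

# Seasonality: average detrended value for each day of the week
dow = series.index.dayofweek.to_numpy()
detrended = series - trend
seasonal_by_dow = detrended.groupby(dow).mean()
seasonal = pd.Series(seasonal_by_dow.to_numpy()[dow], index=series.index)

# Residual: whatever trend and seasonality do not explain
residual = series - trend - seasonal

print(seasonal_by_dow.round(2))
print("residual std:", round(float(residual.std()), 2))
```

The estimated weekly effects should land close to the pattern used to generate the data, and the residual should look like the injected noise. On real data, structure left in the residual is a hint that lags or extra features are still needed.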

The Critical Mistake: Random Train-Test Splits

The most dangerous mistake in time series ML: splitting data randomly into training and test sets.

In standard tabular ML, random splitting is correct because examples are independent. In time series, random splitting is catastrophically wrong because it creates data leakage: information from the future leaks into the training set, making your model appear more accurate than it actually is.

If you train on a random 80% of your time series data and test on the remaining 20%, your training set will contain observations recorded after many of the test observations. The model will have implicitly seen future information and will appear to forecast well -- but this performance will not generalize to actual future prediction.
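The leakage can be counted directly. The sketch below uses only row positions (100 hypothetical observations, oldest first) and checks, for each test row, whether the training set contains any later row. The sizes and seed are arbitrary choices for illustration.

```python
import numpy as np

# 100 time-ordered observations; position 0 is the oldest, 99 the newest
n_obs = 100
rng = np.random.default_rng(42)
shuffled = rng.permutation(n_obs)

# Random 80/20 split: positions are scattered across the whole timeline
random_train, random_test = shuffled[:80], shuffled[80:]
# Fraction of test rows that have at least one later row in the training set
leaky_fraction = float(np.mean([np.any(random_train > i) for i in random_test]))

# Time-based 80/20 split: every training row precedes every test row
time_train, time_test = np.arange(80), np.arange(80, n_obs)
leaks_in_time_split = bool(time_train.max() >= time_test.min())

print(f"random split: {leaky_fraction:.0%} of test rows have later rows in training")
print(f"time-based split leaks: {leaks_in_time_split}")
```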

Always use time-based splits:

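# WRONG: random split (train_test_split shuffles rows by default)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# CORRECT: time-based split (assumes df is already sorted in chronological order)
cutoff = int(len(df) * 0.8)
train_df = df.iloc[:cutoff]   # oldest 80% for training
test_df = df.iloc[cutoff:]    # most recent 20% for testing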

For cross-validation, use time series cross-validation (walk-forward validation): train on periods 1-N, test on period N+1, then train on 1-(N+1), test on N+2, and so on. Scikit-learn provides TimeSeriesSplit for this.
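A minimal sketch of walk-forward validation with scikit-learn's TimeSeriesSplit, using 12 placeholder observations in place of real data. The array `X` and the choice of `n_splits=3` are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 12 time-ordered observations standing in for 12 periods (illustrative data)
X = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
folds = []
for fold, (train_idx, test_idx) in enumerate(tscv.split(X), start=1):
    folds.append((train_idx, test_idx))
    print(f"fold {fold}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

Each fold trains on an expanding prefix of the series and tests on the block that immediately follows it, so no fold ever trains on data from after its test period.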

ARIMA: The Classical Statistical Approach

ARIMA (AutoRegressive Integrated Moving Average) is the classical approach to time series forecasting. It models the time series as a function of its own past values (autoregressive component), past forecast errors (moving average component), and differences to handle non-stationarity (integrated component).

ARIMA is parameterized by (p, d, q):

p: number of lagged observations in the autoregressive component
d: number of differencing operations to make the series stationary
q: number of lagged forecast errors in the moving average component
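The role of d is easiest to see numerically. The sketch below builds an illustrative series with a linear trend (invented data, not from any real dataset), applies one round of differencing as d=1 would, and shows that the mean stops drifting. It also shows that differencing is reversible, which is how forecasts of the differences are converted back to levels.

```python
import numpy as np

# Illustrative non-stationary series: linear trend plus noise, so the mean rises over time
rng = np.random.default_rng(1)
t = np.arange(200)
series = 10.0 + 0.3 * t + rng.normal(0, 1.0, size=t.size)

# d=1: model the change between consecutive observations instead of the level
diff1 = np.diff(series)

# The level's mean drifts between the two halves; the differenced series' mean barely moves
level_shift = abs(series[100:].mean() - series[:100].mean())
diff_shift = abs(diff1[100:].mean() - diff1[:100].mean())
print(f"mean shift in levels:      {level_shift:.2f}")
print(f"mean shift in differences: {diff_shift:.2f}")

# Differencing is invertible: the first value plus cumulative sums recovers the levels
reconstructed = np.concatenate([[series[0]], series[0] + np.cumsum(diff1)])
```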

from statsmodels.tsa.arima.model import ARIMA

model = ARIMA(train_series, order=(2, 1, 2))  # AR(2), I(1), MA(2)
fitted = model.fit()
forecast = fitted.forecast(steps=30)  # Forecast 30 periods ahead

ARIMA handles trend and autocorrelation well. SARIMA extends it with seasonal components. These models are interpretable, have well-understood statistical properties, and work well for short to medium forecast horizons on univariate time series.

When ARIMA works well:

Univariate forecasting (one variable predicted from its own history)
Clear trend and/or seasonality
Short to medium forecast horizons (days to weeks)
When confidence intervals and statistical inference matter

ARIMA limitations:

Struggles with multivariate inputs (many external variables affecting the forecast)
Cannot capture complex non-linear patterns
Requires stationarity (constant statistical properties over time) -- often requires differencing
Sensitive to outliers and structural breaks

LightGBM with Lag Features: The Workhorse for Complex Forecasting

For most production forecasting problems -- especially when you have multiple relevant features, complex non-linear patterns, or need to forecast many series simultaneously -- gradient boosting with lag features is a strong default and a widely used production choice.

The key transformation: convert the time series forecasting problem into a standard supervised ML problem by creating lag features.

import pandas as pd
import lightgbm as lgb

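def create_lag_features(df, target_col, lags, windows):
    # Assumes rows are sorted by time with one row per period.
    # Work on a copy so the caller's DataFrame is not modified in place.
    df = df.copy()

    # Lagged targets: the value observed `lag` rows earlier
    for lag in lags:
        df[f'lag_{lag}'] = df[target_col].shift(lag)

    # Rolling statistics over past values only: shift(1) keeps each row's own target out of its window
    for window in windows:
        df[f'rolling_mean_{window}'] = df[target_col].shift(1).rolling(window).mean()
        df[f'rolling_std_{window}'] = df[target_col].shift(1).rolling(window).std()

    return df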

df = create_lag_features(df, 'sales', lags=[1, 7, 14, 28], windows=[7, 14, 28])

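# Add date features
df['day_of_week'] = df['date'].dt.dayofweek
df['month'] = df['date'].dt.month
# isocalendar().week returns a nullable UInt32 column; cast it to a plain int
df['week_of_year'] = df['date'].dt.isocalendar().week.astype(int)

# Feature columns: everything except the raw date and the target
# (assumes df holds only 'date', 'sales', and the engineered columns)
feature_cols = [c for c in df.columns if c not in ('date', 'sales')]

# Time-based split and train (dropna removes early rows that lack a full lag history)
train = df[df['date'] < '2024-01-01'].dropna()
test = df[df['date'] >= '2024-01-01'].dropna()

model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
model.fit(train[feature_cols], train['sales'])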

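The lag features encode temporal dependencies explicitly: "how much did we sell 7 days ago, 14 days ago, 28 days ago?" Rolling statistics encode trend and local volatility; they are computed on the series shifted by one step so that a row's features never include its own target value. Date features encode seasonality (day of week, month of year). Note that a lag of 1 is only known one step before the target date, so forecasting further ahead requires either lags at least as long as the horizon or feeding predictions back in recursively.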
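The effect of shifting before taking a rolling window is easy to verify on a toy series. This is a small sketch with made-up numbers; the column names in the printed table are illustrative.

```python
import pandas as pd

sales = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], name="sales")

# Leaky: the window ending at row t includes sales[t], the value being predicted
leaky_mean = sales.rolling(3).mean()

# Safe: shift first, so the window used at row t covers sales[t-3] through sales[t-1]
safe_mean = sales.shift(1).rolling(3).mean()

print(pd.DataFrame({"sales": sales, "leaky_mean_3": leaky_mean, "safe_mean_3": safe_mean}))
```

In the last row the leaky feature averages 30, 40, and 50, which includes the target itself, while the safe feature averages 20, 30, and 40, all of which were known before that row's value was observed.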

Why LightGBM beats ARIMA for complex forecasting:

Easily incorporates external features (promotions, holidays, price changes, weather)
Captures complex non-linear interactions between features
Scales to forecasting many series simultaneously (same model for all products/locations)
Feature importance gives interpretability into which lags and features matter
Competitive or superior performance on most practical forecasting benchmarks

The Kaggle M5 competition (forecasting Walmart sales across 42,840 time series) was dominated by LightGBM and related gradient boosting approaches, validating their practical effectiveness.
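When one model serves many series, lag features have to be computed within each series so that one product's history does not spill into another's. A minimal sketch, assuming a long-format table with a hypothetical `product_id` column and invented sales figures:

```python
import pandas as pd

# Two hypothetical products in one long-format table
df_multi = pd.DataFrame({
    "product_id": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
    "sales": [10.0, 11.0, 12.0, 200.0, 210.0, 220.0],
})
df_multi = df_multi.sort_values(["product_id", "date"]).reset_index(drop=True)

# Lag within each series: the first row of product B gets NaN, not product A's last value
df_multi["lag_1"] = df_multi.groupby("product_id")["sales"].shift(1)

# Store the series identifier as a categorical column so a single model can use it as a feature
df_multi["product_id"] = df_multi["product_id"].astype("category")

print(df_multi)
```

A plain `shift(1)` on the whole column would have filled product B's first lag with product A's last sale, which is a quiet but serious bug in multi-series setups.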

LSTM: Deep Learning for Sequential Patterns

LSTMs (Long Short-Term Memory networks) are recurrent neural networks designed for sequences. They maintain a "cell state" that can carry information across many time steps, which mitigates the vanishing gradient problem that plagued earlier RNNs.

import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, forecast_horizon):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, forecast_horizon)

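    def forward(self, x):
        # x has shape (batch, seq_len, input_size) because batch_first=True
        lstm_out, _ = self.lstm(x)
        # Use the output at the last time step to predict all horizon steps at once
        return self.linear(lstm_out[:, -1, :])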

LSTMs can learn to weight information from many time steps ago when it is relevant, which is useful for long-range dependencies that lag features miss.
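An LSTM consumes fixed-length windows of the series rather than a table of lag columns. The helper below is a hypothetical sketch (the name `make_windows` and its parameters are not from any library) that turns a 1-D series into arrays shaped (samples, seq_len, 1) for inputs and (samples, horizon) for targets, which matches a model built with batch_first=True and input_size=1.

```python
import numpy as np

def make_windows(values, seq_len, horizon):
    """Slice a 1-D series into (input window, target window) pairs."""
    inputs, targets = [], []
    for start in range(len(values) - seq_len - horizon + 1):
        inputs.append(values[start : start + seq_len])
        targets.append(values[start + seq_len : start + seq_len + horizon])
    # Trailing axis of size 1: one feature per time step
    X = np.asarray(inputs, dtype=np.float32)[..., np.newaxis]
    y = np.asarray(targets, dtype=np.float32)
    return X, y

values = np.arange(10, dtype=np.float32)
X, y = make_windows(values, seq_len=4, horizon=2)
print(X.shape, y.shape)
print(X[0, :, 0], "->", y[0])
```

The arrays can then be converted to tensors (for example with torch.from_numpy) and batched. As with any other model, windows should be built separately for the training and test periods so that no training target falls inside the test period.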

When LSTMs are appropriate:

Very long-range dependencies (information from months or years ago is relevant)
Complex multivariate sequence inputs (multiple sensor readings over time)
When the sequential structure is genuinely important (not just lagged features)

LSTM caveats:

Slower to train than LightGBM
Requires more data to realize the advantage over simpler methods
Hyperparameter tuning is more complex
In practice, LightGBM with good lag features often matches or beats LSTM on tabular time series

Modern alternatives to LSTMs for time series: Temporal Convolutional Networks (TCNs), N-BEATS, and Temporal Fusion Transformer. For most practitioners, these are advanced options to explore after establishing a solid LightGBM baseline.

Facebook Prophet: Time Series for Non-Specialists

Prophet, developed by Facebook, is designed for business forecasting by non-specialists. It models trend, seasonality, and holidays explicitly and handles missing data, outliers, and trend changes gracefully.

from prophet import Prophet

model = Prophet(seasonality_mode='multiplicative', yearly_seasonality=True)
model.add_country_holidays(country_name='US')
model.fit(train_df[['ds', 'y']])  # ds: datetime, y: target

future = model.make_future_dataframe(periods=90)
forecast = model.predict(future)

Prophet is particularly good for business metrics with strong yearly seasonality and holiday effects (website traffic, retail sales). It is less suitable for fine-grained forecasting (sub-hourly, highly irregular series) or when external features need to be incorporated in complex ways.

Choosing the Right Approach

Univariate, strong trend/seasonality, statistical inference needed: ARIMA/SARIMA
Multivariate, complex features, production forecasting at scale: LightGBM with lag features
Long-range sequential dependencies, complex multivariate inputs: LSTM or Temporal Fusion Transformer
Business metrics forecasting for non-specialists, holiday effects matter: Prophet
Exploratory / sanity check baseline: Simple seasonally-adjusted average (naive seasonal baseline)

Always start with a simple baseline (naive forecast: tomorrow = today, or seasonal naive: this week = same week last year). If your ML model cannot beat the naive baseline substantially, something is wrong with your data, features, or evaluation approach.
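Both baselines take only a few lines. The sketch below uses an invented daily series with a weekly pattern, so the seasonal naive forecast repeats the previous 7 days rather than the same week last year; the data, the season length, and the `mae` helper are assumptions for illustration.

```python
import numpy as np

# Illustrative daily series with a weekly pattern: 4 weeks of history, 1 week held out
rng = np.random.default_rng(7)
weekly_pattern = np.array([50.0, 52.0, 55.0, 60.0, 80.0, 95.0, 70.0])
series = np.tile(weekly_pattern, 5) + rng.normal(0, 2.0, size=35)
history, actual = series[:28], series[28:]

# Naive: repeat the last observed value for every future day
naive_forecast = np.full(actual.shape, history[-1])

# Seasonal naive: repeat the last full season (here, the previous 7 days)
seasonal_naive_forecast = history[-7:]

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length arrays."""
    return float(np.mean(np.abs(y_true - y_pred)))

naive_mae = mae(actual, naive_forecast)
seasonal_naive_mae = mae(actual, seasonal_naive_forecast)
print(f"naive MAE:          {naive_mae:.2f}")
print(f"seasonal naive MAE: {seasonal_naive_mae:.2f}")
```

Whatever error these baselines achieve on the held-out period is the number a more complex model has to beat under the same time-based evaluation.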

Forecasting is hard. The uncertainty in forecasts grows rapidly with the forecast horizon. Be honest about forecast confidence intervals, communicate them to stakeholders, and build systems that are robust to forecast error rather than assuming point estimates are correct.
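How quickly uncertainty grows depends on the process, but one case can be worked out exactly. Assuming the series behaves like a random walk (an assumption made here purely for illustration), the standard deviation of the h-step-ahead error is sigma * sqrt(h). The sketch below checks that formula by simulation and prints the implied approximate 95% interval widths.

```python
import numpy as np

# Assumption: random walk, y[t] = y[t-1] + noise, with noise standard deviation sigma.
# The best point forecast is the last observed value, and the h-step-ahead error
# has standard deviation sigma * sqrt(h).
sigma = 1.0
horizons = np.array([1, 7, 30, 90])
theoretical_std = sigma * np.sqrt(horizons)

# Check by simulation: 20,000 future paths, each 90 steps long
rng = np.random.default_rng(3)
steps = rng.normal(0.0, sigma, size=(20_000, int(horizons.max())))
paths = steps.cumsum(axis=1)  # deviation from the last observed value
simulated_std = paths[:, horizons - 1].std(axis=0)

# Approximate 95% interval half-widths under normal errors: 1.96 * std
half_widths = 1.96 * theoretical_std
for h, sim, w in zip(horizons, simulated_std, half_widths):
    print(f"h={h:>2}: simulated error std {sim:.2f}, 95% interval roughly +/- {w:.1f}")
```

Even in this simple setting, a 90-step forecast is far less certain than a 1-step forecast, which is why intervals, not just point estimates, belong in what stakeholders see.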

Keep Reading

Machine Learning Complete Guide for Software Developers -- where time series forecasting fits in the broader ML landscape
Feature Engineering Practical Guide -- lag features and cyclical encoding are central to time series ML
Overfitting and Underfitting: How to Fix Them -- time series models overfit in subtle ways, especially with too many lag features
