From Theory to Code: Building Your First LSTM Model for Futures Price Forecasting
February 5, 2026
We have defined the problem in our broader introduction to AI in futures trading, explored the theoretical building blocks, and established a clean benchmark dataset. Now, it is time to write code. This is where the abstract concepts of deep learning meet the practical realities of financial markets.
In this fourth article in the Xbratai technical series, we will build our first predictive model. Using the Xbratai Futures Benchmark (XFB-v1) dataset, we will construct a Long Short-Term Memory (LSTM) network designed for a simple but crucial task: price forecasting. This tutorial will provide a complete, reproducible blueprint from loading data to making your first prediction.
Get your Python environment ready. We are moving from theory to implementation.
The Goal: Building a Price Forecasting Model
Our objective is to create an LSTM model that, given a sequence of past market data (e.g., the last 60 minutes), predicts the closing price at the next time step. We are not building a complete, profitable trading system just yet. Instead, we are building the core predictive engine that would power such a system.
We will follow a structured workflow common in any machine learning project:
Environment Setup and Data Loading: Preparing our tools and loading the XFB-v1 dataset.
Feature Selection and Preprocessing: Choosing which data to feed the model and scaling it correctly.
Data Structuring for LSTMs: Converting our time-series data into the "sequence" and "target" format that LSTMs require.
Model Architecture: Defining the layers of our LSTM network using TensorFlow and Keras.
Training the Model: Feeding the data to our model and letting it learn.
Step 1: Environment Setup and Data Loading
First, you need the right tools. We will use standard Python data science libraries. If you do not have them installed, you can get them via pip.
```shell
# Ensure you have these libraries installed
pip install tensorflow pandas numpy scikit-learn matplotlib
```
With our environment ready, let's load the E-mini S&P 500 (ES) data from our benchmark dataset. We will use the pandas library to load the CSV file into a DataFrame, which is a powerful tool for manipulating tabular data.
```python
import pandas as pd

# Load the dataset
df = pd.read_csv('xfb_v1_es_1min.csv', index_col='timestamp', parse_dates=True)

# Display the first few rows to verify
print(df.head())
```
Expected Output:
| timestamp | open | low | close | volume | contract_id |
| --- | --- | --- | --- | --- | --- |
| 2022-01-03 00:00:00 | 4772.25 | 4771.75 | 4772.00 | 330 | ESH22 |
| 2022-01-03 00:01:00 | 4771.75 | 4771.75 | 4772.25 | 144 | ESH22 |
Step 2: Feature Selection and Preprocessing
A model is only as good as its inputs. We will start with a simple but effective feature: the closing price. However, as discussed in our introduction to AI in futures trading, raw prices are not ideal for neural networks. We need to normalize them. We will use the MinMaxScaler from scikit-learn, which scales the data to a range between 0 and 1.
```python
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Select the 'close' price and reshape for the scaler
close_prices = df['close'].values.reshape(-1, 1)

# Initialize the scaler
scaler = MinMaxScaler(feature_range=(0, 1))

# Fit and transform the data
scaled_prices = scaler.fit_transform(close_prices)
```
This step is critical. It ensures that large price values do not disproportionately influence the model's learning process. Note that for simplicity we fit the scaler on the entire series here; in a production workflow you would fit it on the training portion only, so that no information from the future leaks into the inputs.
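A quick sanity check you can run to see that the scaling is fully reversible. The prices below are arbitrary example values, not drawn from the XFB-v1 dataset:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Arbitrary example prices (not from the XFB-v1 dataset)
prices = np.array([4770.0, 4775.0, 4772.5, 4780.0]).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)

# All scaled values now lie in [0, 1]; the min maps to 0, the max to 1
print(scaled.min(), scaled.max())  # 0.0 1.0

# inverse_transform recovers the original prices exactly
recovered = scaler.inverse_transform(scaled)
print(np.allclose(recovered, prices))  # True
```

Keep a reference to the fitted scaler; you will need it later to convert the model's scaled predictions back into real prices.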
Step 3: Structuring Data for Sequence Modeling
LSTMs learn from sequences. We need to restructure our flat list of prices into input sequences (X) and corresponding target values (y). We will define a "look-back" period, which is the number of past time steps our model will use to make a prediction. Let's use 60 minutes.
Our task is to take 60 minutes of prices (X) and predict the price at the 61st minute (y).
```python
def create_sequences(data, look_back=60):
    X, y = [], []
    for i in range(len(data) - look_back):
        X.append(data[i:(i + look_back), 0])
        y.append(data[i + look_back, 0])
    return np.array(X), np.array(y)

look_back_period = 60
X, y = create_sequences(scaled_prices, look_back_period)

# Reshape X to be [samples, time steps, features] for the LSTM layer
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

print(f"Shape of X: {X.shape}")  # e.g., (Number of samples, 60, 1)
print(f"Shape of y: {y.shape}")  # e.g., (Number of samples,)
```
Step 4: Defining the LSTM Model Architecture
Now for the exciting part: building the brain of our operation. We will use the Keras API within TensorFlow to define a simple but powerful LSTM network.
Our architecture will consist of:
Two stacked LSTM layers with 50 units each. These are the core memory components; the first returns its full output sequence so the second can process it.
A Dropout layer after each LSTM layer to prevent overfitting by randomly "turning off" some neurons during training.
A final Dense output layer with one neuron to give us our predicted price.
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

model = Sequential()

# First LSTM layer with dropout
model.add(LSTM(units=50, return_sequences=True, input_shape=(look_back_period, 1)))
model.add(Dropout(0.2))

# Second LSTM layer (return_sequences is False by default)
model.add(LSTM(units=50))
model.add(Dropout(0.2))

# Dense output layer
model.add(Dense(units=1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
```
We use the adam optimizer, a popular and effective choice, and mean_squared_error as our loss function, which is standard for regression tasks like price forecasting.
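To make the loss function concrete: mean squared error is simply the average of the squared differences between predicted and actual values. A minimal sketch with made-up numbers:

```python
import numpy as np

# Toy example: actual vs. predicted scaled prices (made-up values)
y_true = np.array([0.52, 0.55, 0.53])
y_pred = np.array([0.50, 0.56, 0.51])

# Mean squared error: average of the squared prediction errors
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # ≈ 0.0003
```

Squaring the errors penalizes large misses much more heavily than small ones, which is why MSE is the default choice for regression tasks like this.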
Step 5: Training the Model
With our data prepared and model built, we can start training. We will feed our X and y arrays to the model.fit() function. This process involves the model making predictions, comparing them to the actual values (y), calculating the error (loss), and adjusting its internal weights to reduce that error.
```python
# For demonstration, we'll use a small portion of data and few epochs
# In a real project, you would split data into train/validation sets
history = model.fit(X, y, epochs=10, batch_size=32, verbose=1)
```
During training, you will see the loss value decrease with each epoch. This indicates that the model is learning to map the input sequences to the target prices more accurately.
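The comment in the training snippet mentions a train/validation split. For time series, that split should be chronological rather than random, so the model is always validated on data that comes after its training window. A minimal sketch, using stand-in arrays with the same shapes as the X and y from Step 3 (the 80/20 ratio is an arbitrary choice):

```python
import numpy as np

# Stand-in arrays with the same shapes as X and y from Step 3
n_samples, look_back = 1000, 60
X = np.random.rand(n_samples, look_back, 1)
y = np.random.rand(n_samples)

# Chronological split: first 80% for training, last 20% for validation.
# Never shuffle time-series data before splitting -- that leaks the future.
split = int(n_samples * 0.8)
X_train, X_val = X[:split], X[split:]
y_train, y_val = y[:split], y[split:]

print(X_train.shape, X_val.shape)  # (800, 60, 1) (200, 60, 1)

# The split would then be passed to Keras as:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=10, batch_size=32)
```

Watching the validation loss alongside the training loss is the quickest way to spot overfitting: if training loss keeps falling while validation loss rises, the model is memorizing rather than generalizing.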
From Prediction to Action
After training, our model object is ready to make predictions. We can feed it a new sequence of 60 minutes of scaled data, and it will output a single value—its prediction for the 61st minute. Remember, this output will be scaled, so you must use the scaler.inverse_transform() method to convert it back into an actual price.
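The inference step above can be sketched as follows. Since it depends on the trained model from Step 5, a placeholder value stands in for the model's output here, and the prices used to fit the scaler are arbitrary examples:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Fit a scaler on arbitrary example prices (stand-in for the real data)
close_prices = np.linspace(4700.0, 4800.0, 200).reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_prices = scaler.fit_transform(close_prices)

# Take the most recent 60 scaled prices and shape them as one sample
# of [samples, time steps, features] -- the input the LSTM expects
last_window = scaled_prices[-60:].reshape(1, 60, 1)
print(last_window.shape)  # (1, 60, 1)

# With the trained model from Step 5 you would call:
#   predicted_scaled = model.predict(last_window)
# Here a placeholder value stands in for the model output:
predicted_scaled = np.array([[0.75]])

# Convert the scaled prediction back into an actual price
predicted_price = scaler.inverse_transform(predicted_scaled)
print(predicted_price)  # [[4775.]]
```

The key detail is that the same fitted scaler must be used for both directions; fitting a fresh scaler at inference time would map the prediction back to the wrong price range.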
This is a foundational step. We have built a model that can "see" patterns in price history. However, a raw prediction is not a trading strategy. As we emphasized in our introduction to AI in futures trading, models must be validated under realistic market conditions before they can be trusted. Is the model any good? How do we measure its performance in a way that is relevant to trading?
Up Next: Rigorous Model Evaluation
A low training loss does not guarantee a profitable model. Financial models can easily overfit to historical data, learning the noise of the past instead of the true signal.
In the next article, "Evaluating AI Models in Futures Trading," we will confront this challenge directly. We will introduce critical evaluation techniques like backtesting, walk-forward analysis, and calculating risk-adjusted return metrics like the Sharpe ratio. We will learn how to determine if our model has a real predictive edge or if it is just a product of wishful thinking.
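As a small preview of the risk-adjusted metrics covered there: the Sharpe ratio divides a strategy's mean excess return by the volatility of those returns. A minimal sketch on made-up daily returns, assuming a zero risk-free rate and 252 trading days per year for annualization:

```python
import numpy as np

# Made-up daily strategy returns (fractions, not percentages)
daily_returns = np.array([0.001, -0.002, 0.003, 0.0005, -0.001, 0.002])

risk_free_daily = 0.0  # simplifying assumption: zero risk-free rate

excess = daily_returns - risk_free_daily

# Annualized Sharpe ratio: mean excess return over its volatility,
# scaled by sqrt(252 trading days per year)
sharpe = np.sqrt(252) * excess.mean() / excess.std(ddof=1)
print(round(sharpe, 2))
```

A handful of cherry-picked returns like these can produce an impressive-looking ratio; the next article covers how to compute it honestly over out-of-sample backtests.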
