
Using Lévy Area to Develop a Model for U.S. Motor Gasoline Prices

  • Writer: Henry Salkever
  • Nov 18
  • 3 min read

Updated: Nov 25

Forecast Thesis


This report forecasts U.S. gasoline prices for the end of November 2025 using a multivariable time-series regression model. The model incorporates market-relevant fundamentals: copper futures prices, the Brent crack spread, the crude roll spread, and the U.S.-Euro exchange rate. By quantifying the dynamic relationship between these variables and monthly U.S. gasoline prices, the analysis evaluates whether current macroeconomic signals point toward upward or downward pressure on consumer fuel prices for the coming month.


Data Sources


The dataset was primarily obtained from the U.S. Energy Information Administration (EIA), covering U.S. energy production, inventories, and consumption from January 2000 through October 2025. Additional macroeconomic indicators were gathered from the Federal Reserve Economic Data site (FRED). Futures market data for commodities relevant to the energy ecosystem (crude oil, soybean oil, copper, and gold) were retrieved from LSEG (London Stock Exchange Group).


All variables were converted to a monthly frequency to ensure temporal consistency across sources, which originally included weekly, monthly, and quarterly data. For quarterly variables, we filled in the missing months using linear interpolation so that each month had an estimated value. For weekly variables, we reduced the data to one observation per month by selecting the weekly value closest to the first day of the following month, ensuring consistent alignment across the dataset. To avoid distortions caused by extreme volatility and non-stationary dynamics during the early COVID-19 period, observations from March through July 2020 were excluded. Data is available here.
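
The two alignment rules above can be sketched in a few lines. This is a minimal illustration, not the code used for the report: the function names are hypothetical, quarterly values are assumed to sit exactly three months apart, and ties between equally close weekly dates are broken by taking the earlier one.

```python
from datetime import date

def first_of_next_month(year, month):
    """Return the first calendar day of the month after (year, month)."""
    return date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)

def weekly_to_monthly(weekly, year, month):
    """Pick the weekly value dated closest to the first day of the following month.

    `weekly` is a list of (date, value) pairs (hypothetical layout).
    """
    target = first_of_next_month(year, month)
    _, value = min(weekly, key=lambda obs: abs((obs[0] - target).days))
    return value

def quarterly_to_monthly(quarterly):
    """Linearly interpolate consecutive quarterly values to one value per month.

    Assumes the quarterly observations are exactly three months apart.
    """
    monthly = []
    for left, right in zip(quarterly, quarterly[1:]):
        for step in range(3):
            monthly.append(left + (right - left) * step / 3)
    monthly.append(quarterly[-1])
    return monthly
```

For example, with weekly readings dated September 22, September 29, and October 6, the September observation would be the September 29 value, since that date is closest to October 1.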


Methodology


We used the Lévy area to identify which of roughly 50 candidate predictors consistently moved before U.S. gasoline prices. We computed the Lévy area between the monthly percent changes of each predictor and the percent change in U.S. gas prices, since this transformation isolates timing relationships rather than levels. When X = U.S. gasoline prices, a negative Lévy area indicates that the other variable (Y) tends to lead X. To obtain the Lévy area, we applied a discrete adaptation of the shoelace formula for polygon area, which gives the signed area traced out by the point-wise pairing of the two time series.

Since the resulting Lévy areas have units that are the product of the units of the X and Y variables (e.g., dollars * barrels per day), we made the area dimensionless by dividing by the product of the standard deviations of X and Y.
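
A minimal sketch of the signed-area calculation and its normalization follows. It rests on assumptions the text does not spell out: the two series are treated directly as the coordinates of a path, the path is closed with a straight chord back to its first point, and the population standard deviation is used for scaling. The function names are illustrative.

```python
from statistics import pstdev

def levy_area(x, y):
    """Signed area enclosed by the path (x[i], y[i]) and the chord that closes it.

    Uses the shoelace formula. Counterclockwise loops give a positive area,
    which corresponds to x moving before y; a negative area means y moves first.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # wrap around so the last point connects back to the first
        twice_area += x[i] * y[j] - x[j] * y[i]
    return 0.5 * twice_area

def normalized_levy_area(x, y):
    """Divide by the product of standard deviations to remove the units.

    Population standard deviation is an assumption; the source does not say
    which estimator was used.
    """
    return levy_area(x, y) / (pstdev(x) * pstdev(y))

# Unit square traversed counterclockwise: area is +1.
print(levy_area([0, 1, 1, 0], [0, 0, 1, 1]))  # 1.0
```

With X as gasoline prices, this sign convention matches the one stated above: a negative result means the paired variable Y tends to move first.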

To ensure our results weren’t due to chance, we used a block resampling method to calculate p-values. This method shuffles blocks of data to break up long-term patterns while keeping short-term relationships intact. We recalculated the Lévy area for each shuffled version and compared the observed Lévy area against the resulting distribution to compute its p-value.
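
One way such a block-shuffle test could look is sketched below. Several details are assumptions rather than the report's actual procedure: only the predictor series is shuffled while gasoline prices stay fixed, the test is one-sided toward negative areas, the block length of 3 months and the 1,000 shuffles are placeholders, and the p-value uses an add-one correction. All names are hypothetical.

```python
import random

def signed_area(x, y):
    """Shoelace signed area of the closed path (x[i], y[i])."""
    n = len(x)
    return 0.5 * sum(x[i] * y[(i + 1) % n] - x[(i + 1) % n] * y[i] for i in range(n))

def block_shuffle(series, block_len, rng):
    """Cut the series into consecutive blocks and reorder the blocks at random.

    Values inside a block keep their original order.
    """
    blocks = [series[i:i + block_len] for i in range(0, len(series), block_len)]
    rng.shuffle(blocks)
    return [value for block in blocks for value in block]

def block_permutation_pvalue(x, y, block_len=3, n_shuffles=1000, seed=0):
    """Share of shuffled areas at least as negative as the observed one.

    x is held fixed; y is block-shuffled (an assumption, see lead-in).
    """
    rng = random.Random(seed)
    observed = signed_area(x, y)
    hits = 0
    for _ in range(n_shuffles):
        if signed_area(x, block_shuffle(y, block_len, rng)) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)
```

Fixing the seed makes the result reproducible, which is convenient when the same test is run across dozens of candidate predictors.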


We expected gas price drivers to vary by season (peak demand in summer, lowest in winter, transition periods in spring and fall), so we used only data from the demand season we are predicting (winter). We selected variables that met three criteria: a negative Lévy area, a p-value <= 0.10, and updated values available as of October 30. The variables that met these conditions were 1) the front-month copper futures price, 2) the Brent crack spread, 3) the WTI front-month / spot spread (the crude roll spread), and 4) the USD / Euro exchange rate.




Model Fitting and Coefficient Interpretation


We fit a multivariable regression on the selected variables, and also included the current price of gas as an autoregressive variable. The resulting model achieved an R-squared of 0.907.
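
The structure of such a regression can be sketched with ordinary least squares in NumPy. This is an illustration under assumptions, not the fitted model itself: it pairs each month's predictors and gas price with the following month's gas price, and it includes a constant term, which the report does not mention. The function names are hypothetical.

```python
import numpy as np

def build_design(gas, predictors):
    """Pair month-t inputs with the month-(t+1) gas price.

    gas: array of shape (n,); predictors: array of shape (n, k).
    The last column of the returned features is the current gas price,
    which plays the role of the autoregressive term.
    """
    features = np.column_stack([predictors[:-1], gas[:-1]])
    target = gas[1:]
    return features, target

def fit_ols(features, target):
    """Least-squares fit with an intercept (assumed); returns coefficients and R-squared.

    The intercept is the first entry of the coefficient vector.
    """
    X = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    residuals = target - X @ coef
    total = np.sum((target - target.mean()) ** 2)
    r_squared = 1.0 - (residuals @ residuals) / total
    return coef, r_squared
```

Because the current gas price enters as a regressor, a high R-squared here largely reflects month-to-month persistence, so it is worth reading alongside the out-of-sample error.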


Variable | Coefficient | T-statistic | P-value
Copper Futures Price | 0.1490 | 3.312 | 0.001
Brent Crack Spread | 0.0055 | 2.061 | 0.042
Crude Roll Spread | 0.1525 | 1.936 | 0.055
US/Euro Exchange | 0.2152 | 1.219 | 0.225
Current Gas Price | 0.798 | 14.858 | 0.000

The current gas price was by far the strongest predictor: about 80% of the current month’s value carries over into the next month, indicating strong price persistence. As a result, the regression essentially takes the current price as a baseline and adjusts it according to market fundamentals. The positive copper coefficient makes sense, since copper acts as an industrial barometer and rises with broader demand. The positive Brent crack spread coefficient indicates that when refinery margins expand due to higher wholesale gasoline prices, retailers pass those costs on to consumers. A positive crude roll spread reflects backwardation (tight current supply and strong demand), so wider spreads correspond to higher prices. The U.S.-Euro exchange rate was the weakest predictor (p = 0.225), but it was retained to avoid post-hoc variable selection.


Results



The final fitted model achieved a test error of 13.3 cents per gallon.


Prediction of November Month-End Value


Front-Month Copper Futures ($/lb) | Brent Crack Spread ($/bbl) | Crude Roll Spread ($) | US/Euro Exchange | Gas ($/gal)
5.10 | 18.0 | -0.16 | 1.16 | 3.038

Predicted November month-end gas price: $3.35/gallon.
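
As a rough check on how the inputs map to the forecast, the snippet below multiplies each reported coefficient by its input value. It assumes the model is linear in the raw levels shown in the tables. No intercept is reported, so the gap between the weighted sum and the stated forecast is labeled here as an implied constant; that label is an inference, not a reported result.

```python
# Coefficients and inputs copied from the two tables in this report.
coefficients = {
    "copper": 0.1490,
    "brent_crack": 0.0055,
    "crude_roll": 0.1525,
    "usd_eur": 0.2152,
    "current_gas": 0.798,
}
inputs = {
    "copper": 5.10,
    "brent_crack": 18.0,
    "crude_roll": -0.16,
    "usd_eur": 1.16,
    "current_gas": 3.038,
}

# Sum of coefficient * input, with no constant term.
weighted_sum = sum(coefficients[name] * inputs[name] for name in coefficients)

reported_forecast = 3.35
# Whatever the unreported constant term (or any rounding) would have to supply.
implied_constant = reported_forecast - weighted_sum

print(round(weighted_sum, 3))      # 3.508
print(round(implied_constant, 3))  # -0.158
```

The current gas price term alone contributes about $2.42 of the total, consistent with persistence being the dominant driver of the forecast.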

 
 