## Introduction

The Time Series Foundation Model, or TimesFM for short, is a pretrained time-series foundation model developed by Google Research for forecasting univariate time series. As a pretrained foundation model, it simplifies the often complex process of time-series analysis. Google Research reports that its zero-shot forecasts rival the accuracy of leading supervised forecasting models across multiple public datasets.

### Overview

- TimesFM is a pretrained model developed by Google Research for univariate time-series forecasting, providing zero-shot prediction capabilities that rival leading supervised models.
- TimesFM is a transformer-based model with 200 million parameters, designed to predict future values of a single variable based on its historical data, supporting context lengths up to 512 points.
- It exhibits strong forecasting accuracy on unseen datasets, leveraging its transformer layers and tunable hyperparameters such as model dimensions, patch lengths, and horizon lengths.
- The demo applies TimesFM to Kaggle’s Electric Production dataset and forecasts 24 months zero-shot, with low error against the held-out actual values (e.g., MAE = 3.34).
- TimesFM is an advanced model that simplifies time-series analysis while achieving near state-of-the-art accuracy in predicting future trends across various datasets without needing additional training.

## Background

A time series consists of data points collected at consistent time intervals, such as daily stock prices or hourly temperature readings. Forecasting such data is often complex due to elements like trends, seasonal variations, and erratic patterns. These challenges can hinder accurate predictions of future values, but models like TimesFM are designed to streamline this task.

## Understanding TimesFM Architecture

TimesFM 1.0 is a 200M-parameter, decoder-only transformer model pretrained on a dataset of over 100 billion real-world time points.

TimesFM 1.0 generates accurate forecasts on unseen datasets without additional training. It performs univariate forecasting: it predicts the future values of **a single variable** from that variable’s own history. It supports context lengths of up to 512 time points and any horizon length, and it accepts an optional frequency indicator as input.

### Parameters (Hyperparameters)

These are tunable values that control the behavior of the model and impact its performance:

- **model_dim**: Dimensionality of the input and output vectors.
- **input_patch_len (p)**: Length of each input patch.
- **output_patch_len (h)**: Length of the forecast generated in each step.
- **num_heads**: Number of attention heads in the multi-head attention mechanism.
- **num_layers (nl)**: Number of stacked transformer layers.
- **context length (L)**: The length of the historical data used for prediction.
- **horizon length (H)**: The length of the forecast horizon.
- **Number of input tokens (N)**: The total context length divided by the input patch length, N = L/p. Each of these tokens is fed into the transformer layers for processing.
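As a quick worked example of the token arithmetic, the sketch below computes N = L/p for a few context lengths, plus the number of decoding steps needed to cover a horizon H when each step emits h points. The `ceil(H / h)` step count is my reading of the "forecast generated in each step" definition rather than something taken from the TimesFM code, and both function names are illustrative.

```python
import math


def num_input_tokens(context_len: int, input_patch_len: int) -> int:
    """N = L / p: number of patches (tokens) the context is split into."""
    if context_len % input_patch_len != 0:
        raise ValueError("context_len must be a multiple of input_patch_len")
    return context_len // input_patch_len


def num_decode_steps(horizon_len: int, output_patch_len: int) -> int:
    """Steps needed to cover the horizon if each step emits h points (assumption)."""
    return math.ceil(horizon_len / output_patch_len)


print(num_input_tokens(512, 32))   # 16
print(num_input_tokens(128, 32))   # 4
print(num_decode_steps(24, 128))   # 1
print(num_decode_steps(300, 128))  # 3
```

With the demo settings used later (context 128, input patch 32, horizon 24, output patch 128), this gives 4 input tokens and a single decoding step.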

### Components

These are the fundamental building blocks of the model’s architecture:

**Residual Blocks**: Neural network blocks used to process input and output patches.

**Stacked Transformer**: The core transformer layers in the model.

**t_j**: The input tokens fed to the transformer layers, derived from the processed patches:

$$t_j = \mathrm{InputResidualBlock}\big(\tilde{y}_j \odot (1 - \tilde{m}_j)\big) + \mathrm{PE}_j$$

where ỹ_j is the j-th patch of the input series, m̃_j is the corresponding mask, and PE_j is the positional encoding.

**o_j**: The **output token** at step j, generated by the transformer layers based on the input tokens. It is used to predict the corresponding output patch:

$$o_j = \mathrm{StackedTransformer}\big((t_1, \dot{m}_1), \ldots, (t_j, \dot{m}_j)\big)$$

where ṁ_j is the patch-level mask that accompanies token t_j.

**m_{1:L} (mask)**: The mask used to ignore certain parts of the input during processing.
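To make the input-token formula more concrete, here is a rough NumPy sketch of the patching and masking step. It only illustrates the shapes involved: the weights are random, and both the one-hidden-layer residual block and the sinusoidal positional encoding are stand-ins I chose for illustration, not the actual TimesFM layers.

```python
import numpy as np

rng = np.random.default_rng(0)

p, model_dim = 4, 8                # toy patch length and model dimension
y = np.arange(12, dtype=float)     # toy series with L = 12 points
m = np.zeros(12)
m[:2] = 1.0                        # mask: 1 means "ignore this point"

# Split the series and the mask into N = L / p patches of length p
y_patches = y.reshape(-1, p)       # shape (N, p)
m_patches = m.reshape(-1, p)

# Toy stand-in for the input residual block (random, untrained weights)
W_hidden = rng.normal(size=(p, model_dim))
W_out = rng.normal(size=(model_dim, model_dim))
W_skip = rng.normal(size=(p, model_dim))


def input_residual_block(x):
    hidden = np.maximum(x @ W_hidden, 0.0)   # ReLU hidden layer
    return hidden @ W_out + x @ W_skip       # plus a linear skip connection


def positional_encoding(n, d):
    # Standard sinusoidal encoding, used here purely as a placeholder
    pos = np.arange(n)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))


# t_j = InputResidualBlock(y_j * (1 - m_j)) + PE_j, for all patches at once
masked = y_patches * (1.0 - m_patches)
tokens = input_residual_block(masked) + positional_encoding(len(masked), model_dim)
print(tokens.shape)  # (3, 8): N = 3 tokens, each of size model_dim
```

The masked points are zeroed out before the patch is embedded, so they contribute nothing to the token.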

The **loss function** is used during training. In the case of point forecasting, it is the **Mean Squared Error (MSE)**, averaged over the N output patches:

$$\mathrm{TrainLoss} = \frac{1}{N} \sum_{j=1}^{N} \mathrm{MSE}\big(\hat{y}_{pj+1:pj+h},\; y_{pj+1:pj+h}\big)$$

where ŷ are the model’s predictions and y are the true future values.
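The loss can be sketched in a few lines of NumPy. For each j from 1 to N, the model sees the first j patches and is scored on the next h points. The `last_value_forecaster` below is a hypothetical stand-in for the model, used only so the example runs end to end.

```python
import numpy as np


def train_loss(y, predict_patch, p, h, n_tokens):
    """Average over j = 1..N of the MSE between the predicted and true next h points."""
    losses = []
    for j in range(1, n_tokens + 1):
        context = y[: p * j]               # the first j input patches
        target = y[p * j : p * j + h]      # points pj+1 .. pj+h (1-based)
        y_hat = predict_patch(context, h)
        losses.append(np.mean((y_hat - target) ** 2))
    return float(np.mean(losses))


# Hypothetical stand-in for the model: repeat the last observed value
def last_value_forecaster(context, h):
    return np.full(h, context[-1])


y = np.arange(10, dtype=float)             # 0, 1, ..., 9 (length p * N + h)
loss = train_loss(y, last_value_forecaster, p=2, h=2, n_tokens=4)
print(loss)  # 2.5
```

On this linear toy series every patch is off by 1 and then 2, so each per-patch MSE is (1 + 4) / 2 = 2.5, and so is the average.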

## TimesFM 1.0 for Forecasting

The **“Electric Production”** dataset is available on Kaggle and contains data related to electric production over time. It consists of only two columns: **DATE**, which represents the date of the recorded values, and **Value**, which indicates the amount of electricity produced in that month. Our task is to forecast 24 months of data using TimesFM.

### Demo

Before we start, make sure that you’re using a GPU. I’m running this demonstration on Kaggle with the GPU T4 x 2 accelerator.

Let’s install “timesfm” using pip. The “-q” flag keeps the installation output quiet.

`!pip -q install timesfm`

Let’s import a few necessary libraries and read the dataset.

```
import timesfm
import pandas as pd

data = pd.read_csv('/kaggle/input/electric-production/Electric_Production.csv')
data.head()
```

```
data['DATE']=pd.to_datetime(data['DATE'])
data.head()
```

The DATE column is now a datetime type, displayed in YYYY-MM-DD format.
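Once the dates are parsed, it can be worth checking that the timestamps really are evenly spaced, since the forecasting call later passes `freq="MS"` (month start). The sketch below uses `pd.infer_freq` on a small toy frame; the four rows are made up for illustration and are not read from the Kaggle file.

```python
import pandas as pd

# Toy stand-in for the DATE column (made-up month-start dates)
toy = pd.DataFrame({"DATE": ["1985-01-01", "1985-02-01", "1985-03-01", "1985-04-01"]})
toy["DATE"] = pd.to_datetime(toy["DATE"])

# infer_freq returns a frequency string for regularly spaced dates, otherwise None
inferred = pd.infer_freq(toy["DATE"])
print(inferred)  # MS
```

On the real data you would run the same check on `data["DATE"]`. A result of `None` would suggest missing or irregular timestamps that should be handled before forecasting.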

```
# Let's visualise the data
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore') # Setting the warnings to be ignored
sns.set(style="darkgrid")
plt.figure(figsize=(15, 6))
sns.lineplot(x="DATE", y='Value', data=data, color="green")
plt.title('Electric Production')
plt.xlabel('Date')
plt.ylabel('Value')
plt.show()
```

Let’s decompose the series into its trend, seasonal, and residual components:

```
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
# Set index to DATE and decompose the data
data.set_index("DATE", inplace=True)
result = seasonal_decompose(data['Value'])
# Create a 2x2 grid for the subplots
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 10))
result.observed.plot(ax=ax1, color="darkgreen")
ax1.set_ylabel('Observed')
result.trend.plot(ax=ax2, color="darkgreen")
ax2.set_ylabel('Trend')
result.seasonal.plot(ax=ax3, color="darkgreen")
ax3.set_ylabel('Seasonal')
result.resid.plot(ax=ax4, color="darkgreen")
ax4.set_ylabel('Residual')
# Adjust layout and show the plots
plt.tight_layout()
plt.show()
# Reset the index after plotting
data.reset_index(inplace=True)
```

We can see the components of the time series, like trend and seasonality, and we can get an idea of their relation to time.

```
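# forecast_on_df expects a series identifier (unique_id), a timestamp (ds), and a value column
df = pd.DataFrame({
    'unique_id': [1] * len(data),
    'ds': data["DATE"],
    'y': data['Value'],
})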
```

```
# Split into roughly 94% train and 6% test (the last 24 months)
split_idx = int(len(df) * 0.94)
# Split the dataframe into train and test sets
train_df = df[:split_idx]
test_df = df[split_idx:]
print(train_df.shape, test_df.shape)
```

(373, 3) (24, 3)
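The 94% figure is chosen so that the test set ends up with exactly 24 rows. An alternative that states the intent more directly is to slice off the last `horizon` rows. The sketch below shows this on a toy frame with the same number of rows as the article’s `df`. The values are placeholders, and the 1985-01-01 start date is inferred from the test set beginning at 2016-02-01.

```python
import pandas as pd

horizon = 24  # number of months to hold out

# Toy frame with the same row count as the article's df (373 + 24 = 397)
toy_df = pd.DataFrame({
    "unique_id": 1,
    "ds": pd.date_range("1985-01-01", periods=397, freq="MS"),
    "y": range(397),  # placeholder values
})

train_toy = toy_df.iloc[:-horizon]   # everything except the last 24 rows
test_toy = toy_df.iloc[-horizon:]    # the last 24 rows
print(train_toy.shape, test_toy.shape)  # (373, 3) (24, 3)
```

This keeps the split correct even if the dataset grows, whereas a fixed percentage would silently change the test length.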

Let’s forecast the final 24 months (2 years) of the data, using the earlier observations as the historical context.

```
# Initialize the TimesFM model with specified parameters
tfm = timesfm.TimesFm(
    context_len=128,       # Length of the context window for the model
    horizon_len=24,        # Forecasting horizon length
    input_patch_len=32,    # Length of input patches
    output_patch_len=128,  # Length of output patches
    num_layers=20,         # Number of stacked transformer layers
    model_dims=1280,       # Model dimensionality
)
# Load the pretrained model checkpoint
tfm.load_from_checkpoint(repo_id="google/timesfm-1.0-200m")
# Forecast the values using the TimesFM model
timesfm_forecast = tfm.forecast_on_df(
    inputs=train_df,   # Historical data used as context (no training happens here)
    freq="MS",         # Frequency of the time-series data (month start)
    value_name="y",    # Name of the column containing the values to be forecasted
    num_jobs=-1,       # Set to -1 to use all available cores
)
# Keep only the timestamp and point-forecast columns
timesfm_forecast = timesfm_forecast[["ds", "timesfm"]]
```

The predictions are ready. Let’s look at both the predicted values and the actual values.

`timesfm_forecast.head()`

| | ds | timesfm |
|---|---|---|
| 0 | 2016-02-01 | 111.673813 |
| 1 | 2016-03-01 | 100.474892 |
| 2 | 2016-04-01 | 89.024544 |
| 3 | 2016-05-01 | 90.391014 |
| 4 | 2016-06-01 | 100.934502 |

`test_df.head()`

| | unique_id | ds | y |
|---|---|---|---|
| 373 | 1 | 2016-02-01 | 106.6688 |
| 374 | 1 | 2016-03-01 | 95.3548 |
| 375 | 1 | 2016-04-01 | 89.3254 |
| 376 | 1 | 2016-05-01 | 90.7369 |
| 377 | 1 | 2016-06-01 | 104.0375 |

```
import numpy as np
actuals = test_df['y']
predicted_values = timesfm_forecast['timesfm']
# Convert to numpy arrays
actual_values = np.array(actuals)
predicted_values = np.array(predicted_values)
# Calculate error metrics
MAE = np.mean(np.abs(actual_values - predicted_values)) # Mean Absolute Error
MSE = np.mean((actual_values - predicted_values)**2) # Mean Squared Error
RMSE = np.sqrt(np.mean((actual_values - predicted_values)**2)) # Root Mean Squared Error
# Print the error metrics
print(f"Mean Absolute Error (MAE): {MAE}")
print(f"Mean Squared Error (MSE): {MSE}")
print(f"Root Mean Squared Error (RMSE): {RMSE}")
```

Mean Absolute Error (MAE): 3.3446476043701163

Mean Squared Error (MSE): 22.60650784076036

Root Mean Squared Error (RMSE): 4.754630147630872
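The metric code above compares the two arrays by position, which works here because the forecast and the test set cover the same 24 months in the same order. A slightly more defensive variant, sketched below, joins the two frames on `ds` first so that a missing or shifted date is caught rather than silently misaligned. To keep it self-contained, it uses only the five rows printed in the two tables above, so its numbers are not the full 24-month metrics.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(
    ["2016-02-01", "2016-03-01", "2016-04-01", "2016-05-01", "2016-06-01"]
)

# First five rows of the forecast and of the test set, as printed above
forecast = pd.DataFrame({
    "ds": dates,
    "timesfm": [111.673813, 100.474892, 89.024544, 90.391014, 100.934502],
})
actual = pd.DataFrame({
    "ds": dates,
    "y": [106.6688, 95.3548, 89.3254, 90.7369, 104.0375],
})

# Align on the timestamp; validate= raises if a date is duplicated on either side
merged = actual.merge(forecast, on="ds", how="inner", validate="one_to_one")
if len(merged) != len(actual):
    raise ValueError("some test dates have no matching forecast")

errors = merged["y"] - merged["timesfm"]
mae = errors.abs().mean()
rmse = np.sqrt((errors ** 2).mean())
print(f"MAE on these five rows: {mae:.3f}")
print(f"RMSE on these five rows: {rmse:.3f}")
```

In the notebook, the same merge would be applied to `test_df` and `timesfm_forecast` before computing the metrics.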

```
# Let's Visualise the Data
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore') # Setting the warnings to be ignored
# Set the style for seaborn
sns.set(style="darkgrid")
# Plot size
plt.figure(figsize=(15, 6))
# Plot forecasted values
sns.lineplot(x="ds", y='timesfm', data=timesfm_forecast, color="red", label="Forecast")
# Plot actual time-series data
sns.lineplot(x="DATE", y='Value', data=data, color="green", label="Actual Time Series")
# Set plot title and labels
plt.title('Electric Production: Actual vs Forecast')
plt.xlabel('Date')
plt.ylabel('Value')
# Show the legend
plt.legend()
# Display the plot
plt.show()
```

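The predictions are close to the actual values. With an MAE of about 3.34 and an RMSE of about 4.75 on values of roughly 90 to 110 in the rows shown above, the model does well on the error metrics (MSE, RMSE, MAE), even though it forecasts zero-shot, without any training on this dataset.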

## Conclusion

In conclusion, TimesFM, a transformer-based pretrained model by Google Research, demonstrates impressive zero-shot forecasting capabilities for univariate time-series data. Its architecture and training on extensive datasets enable accurate predictions, showing the potential to streamline time-series analysis while approaching the accuracy of state-of-the-art models in various applications.

## Frequently Asked Questions

**Q1. How would you explain MAE (Mean Absolute Error)?**

Ans. The Mean Absolute Error (MAE) calculates the average of the absolute differences between predictions and actual values, providing an easy way to evaluate model performance. A smaller MAE implies more accurate forecasts and a more reliable model.

**Q2. What does seasonality mean in time series analysis?**

Ans. Seasonality refers to the regular, predictable variations in a time series that arise from seasonal influences. For example, annual retail sales often surge during the holiday period. Accounting for seasonality matters because a forecast that ignores it will systematically miss these recurring peaks and troughs.

**Q3. What is a trend in time series analysis?**

Ans. A trend in time series data denotes a sustained direction or movement observed over time, which can be upward, downward, or stable. Identifying trends is crucial for comprehending the data’s long-term behavior, as it impacts forecasting and the effectiveness of the predictive model.

**Q4. How does TimesFM forecast univariate time-series data?**

Ans. TimesFM predicts a single variable from its own historical values. It uses a decoder-only transformer architecture that splits the history into patches and generates the forecast for that same variable.