Time Series and LSTM: Time Series Forecasting of GDP using LSTM

Sina_席納

9 min read · Oct 26, 2023

Time series with ARIMA(ENG)
Time series with ARIMA(CH)
Time series with LSTM(ENG)
Time series with LSTM(CH)
GitHub

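In my previous Medium article I used ARIMA for time series forecasting. In this analysis I use an LSTM model to forecast GDP. LSTM is a specialized variant of the recurrent neural network (RNN) that is particularly good at modeling sequences.

1. Data Processing

Data processing and visualization are the same as in the previous article; see the links at the top of this article for more insights.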

2. Feature Engineering:

— Sequence transformation:

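import numpy as np
import torch

# Create sequences of data suitable for time-series forecasting
def create_sequences(input_data, tw):
    """
    Transform a time series into a prediction dataset.
    input_data : the (normalized) series, indexable along time
    tw : training window (number of time steps per sequence)
    Returns X (input windows) and y (the same windows shifted one step ahead).
    """
    X, y = [], []
    L = len(input_data)
    for i in range(L - tw):
        feature = input_data[i:i+tw]       # steps i .. i+tw-1
        target = input_data[i+1:i+tw+1]    # steps i+1 .. i+tw (next-step targets)
        X.append(feature)
        y.append(target)
    # Stack into one array first; building a tensor from a list of arrays is slow
    X, y = np.array(X), np.array(y)
    return torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)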

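Creating data suitable for time series forecasting: the raw series is converted into overlapping sequences. Each input sequence has length tw (here tw=12, i.e. 3 years of quarterly data). The target is the same window shifted one step ahead, so its last element is the value that follows the input sequence.
Forecasting with neural networks, and LSTMs in particular, requires input in the form of sequences. Rather than predicting from a single timestamp, the model predicts from a series of previous observations, which lets it capture temporal patterns and dependencies within the given window size.
If the data is [10, 20, 30, 40, 50] and tw=2, the result of the transformation is:
Features (X): [10, 20], [20, 30], [30, 40]
Targets (y): [20, 30], [30, 40], [40, 50] (the last element of each target, 30, 40, 50, is the value that follows the input window)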
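The toy example can be checked with a few lines of plain Python. This is a minimal sketch that mirrors the windowing logic of create_sequences without the tensor conversion; the helper name make_windows is hypothetical and not part of the original code. Note that each target is itself a window shifted one step ahead, so the "next value" for each input (30, 40, 50 in the toy case) is the last element of the corresponding target.

```python
def make_windows(series, tw):
    """Return (X, y): input windows of length tw and the same windows shifted one step ahead."""
    X, y = [], []
    for i in range(len(series) - tw):
        X.append(series[i:i + tw])          # steps i .. i+tw-1
        y.append(series[i + 1:i + tw + 1])  # steps i+1 .. i+tw
    return X, y


X, y = make_windows([10, 20, 30, 40, 50], tw=2)
print(X)  # [[10, 20], [20, 30], [30, 40]]
print(y)  # [[20, 30], [30, 40], [40, 50]]
print([target[-1] for target in y])  # [30, 40, 50]
```

A series of length L yields L - tw windows, so a short GDP series with tw=12 loses 12 usable samples to the window itself.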

— Normalizing the data:

from sklearn.preprocessing import MinMaxScaler

# Normalize the data to [0, 1]; gdp must be 2-D, shape (n_samples, 1)
scaler = MinMaxScaler()
gdp_normalized = scaler.fit_transform(gdp)

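Neural networks, including LSTMs, usually perform better when input values lie in a small range. Large input values can produce large gradients and make training unstable. Normalization helps the model converge faster and improves overall performance. It also ensures that all features (if more than one is used) contribute equally to training regardless of their original scale. Normalizing means rescaling the values to a specified range; here MinMaxScaler's default range [0, 1] is used, so every value is mapped to between 0 and 1.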
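For a single feature and the default [0, 1] range, min-max scaling computes x' = (x - min) / (max - min). The sketch below reproduces that arithmetic in plain Python on made-up numbers (not the GDP data), together with the inverse mapping needed to turn predictions back into original units. The function names are hypothetical; in the article's code the same roles are played by the scaler's fit_transform and inverse_transform methods.

```python
def min_max_fit(values):
    """Return the (min, max) pair that defines the scaling."""
    return min(values), max(values)


def min_max_transform(values, lo, hi):
    """Map values into [0, 1] using a previously fitted (lo, hi)."""
    return [(v - lo) / (hi - lo) for v in values]


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


toy_series = [100.0, 150.0, 200.0, 300.0]  # made-up values for illustration
lo, hi = min_max_fit(toy_series)
scaled = min_max_transform(toy_series, lo, hi)
print(scaled)                            # [0.0, 0.25, 0.5, 1.0]
print(min_max_inverse(scaled, lo, hi))   # [100.0, 150.0, 200.0, 300.0]
```

One design choice worth considering: fitting the minimum and maximum on the training portion only, and reusing them for the test portion, keeps information about future values out of the scaling.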

— Splitting the data:

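# Split data into train and test sets, keeping chronological order
train_size = int(len(gdp_normalized) * 0.7)  # first 70% of observations (rounded down here)
train, test = gdp_normalized[:train_size], gdp_normalized[train_size:]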

70% of the data is used for training and the remaining 30% for testing.
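For time series the split is chronological: the earliest observations train the model and the most recent ones test it, with no shuffling. A minimal sketch, assuming the split point is simply 70% of the length rounded down (the helper name and the stand-in series are hypothetical):

```python
def chronological_split(series, train_fraction=0.7):
    """Split a series into train/test without shuffling, preserving time order."""
    train_size = int(len(series) * train_fraction)
    return series[:train_size], series[train_size:]


series = list(range(100))  # stand-in for 100 time-ordered observations
train, test = chronological_split(series)
print(len(train), len(test))  # 70 30
print(train[-1], test[0])     # 69 70
```

Every training observation precedes every test observation, which is what makes the test RMSE a fair proxy for forecasting unseen future quarters.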

3. Model:

— LSTM architecture:

import torch.nn as nn

# Define LSTM model
class LSTM(nn.Module):
    def __init__(self, input_dim=1, hidden_dim=15, output_dim=1, layer_num=1):
        super().__init__()
        # Define LSTM layer; batch_first=True means input is (batch, seq_len, input_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_num, batch_first=True)
        # Define output layer
        self.linear = nn.Linear(hidden_dim, output_dim)

    # Define forward pass through the network
    def forward(self, input_seq):
        lstm_out, _ = self.lstm(input_seq)    # (batch, seq_len, hidden_dim)
        predictions = self.linear(lstm_out)   # (batch, seq_len, output_dim), one per time step
        return predictions

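Long short-term memory (LSTM) networks are recurrent neural networks (RNNs) designed for time series and other sequential data. Their main advantage over standard RNNs is that they mitigate the long-term dependency problem, which matters here because patterns in the data are spread over long time spans.
The model consists of one LSTM layer followed by a linear layer. The LSTM layer captures the sequential dependencies in the data, and the linear layer maps the LSTM output to the required output shape to produce the predictions.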
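To make the long-term dependency point concrete, here is a sketch of one LSTM cell step in scalar form, following the standard textbook gate equations (input, forget, candidate, output). It is not the article's code: nn.LSTM applies the same equations with vectors and weight matrices, and the parameter names and values below are hypothetical, chosen so that the forget gate stays almost fully open and the input gate almost fully closed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p holds the (hypothetical) weights and biases."""
    i = sigmoid(p["w_ii"] * x + p["w_hi"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_if"] * x + p["w_hf"] * h_prev + p["b_f"])    # forget gate
    g = math.tanh(p["w_ig"] * x + p["w_hg"] * h_prev + p["b_g"])  # candidate value
    o = sigmoid(p["w_io"] * x + p["w_ho"] * h_prev + p["b_o"])    # output gate
    c = f * c_prev + i * g       # cell state: keep part of the old, add part of the new
    h = o * math.tanh(c)         # hidden state exposed to the next layer
    return h, c


names = ("w_ii", "w_hi", "b_i", "w_if", "w_hf", "b_f",
         "w_ig", "w_hg", "b_g", "w_io", "w_ho", "b_o")
params = {name: 0.0 for name in names}
params["w_ig"] = 1.0    # candidate responds to the input
params["b_f"] = 20.0    # forget gate close to 1: remember
params["b_i"] = -20.0   # input gate close to 0: ignore new input

h, c = 0.0, 1.0
for x in [0.3, -0.8, 0.5] * 40:   # 120 time steps of arbitrary input
    h, c = lstm_cell_step(x, h, c, params)
print(round(c, 4))  # 1.0
```

The cell state survives 120 steps almost unchanged because it is updated additively and gated, which is the mechanism that lets an LSTM carry information across long spans where a plain RNN's signal would fade.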

— Hyperparameter settings:

# Set hyperparameters
tw = 12

# Initialize LSTM model, optimizer and loss function
model = LSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_function = nn.MSELoss()

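Training window (tw): the number of previous time steps (quarters, in our case) considered before predicting the next time step. A value of 12 means the next quarter's GDP is predicted from the GDP of the last 12 quarters (3 years). The choice of window can affect the model's ability to capture long-term patterns.
Hidden dimensions of the LSTM: the hidden dimension sets the number of LSTM units (neurons) in the hidden layer; the model here uses 15. More units give the model greater learning capacity, but without proper regularization they can also make it prone to overfitting.
Learning rate: this parameter controls the step size during gradient-based optimization; here it is 0.001 with the Adam optimizer. A smaller learning rate makes it less likely that the model overshoots the optimum but may converge slowly, while a larger one can speed up training at the risk of skipping past the optimum.
Loss function: mean squared error (MSE).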
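One way to see how the hidden dimension drives model capacity is to count trainable parameters. The sketch below assumes the parameter layout of a unidirectional torch.nn.LSTM with biases and no projection (four gates, each with input weights, hidden weights, and two bias vectors) plus the final linear layer; the helper names are hypothetical.

```python
def lstm_param_count(input_dim, hidden_dim, layer_num=1):
    """Trainable parameters in a unidirectional LSTM with biases and no projection."""
    total = 0
    for layer in range(layer_num):
        in_features = input_dim if layer == 0 else hidden_dim
        # 4 gates x (input weights + hidden weights + two bias vectors)
        total += 4 * (hidden_dim * in_features + hidden_dim * hidden_dim + 2 * hidden_dim)
    return total


def linear_param_count(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


for hidden_dim in (5, 15, 50):
    total = lstm_param_count(1, hidden_dim) + linear_param_count(hidden_dim, 1)
    print(hidden_dim, total)
# 5 166
# 15 1096
# 50 10651
```

With hidden_dim=15 the model has about a thousand parameters. Because the count grows roughly with the square of the hidden dimension, a short quarterly series can be overfit quickly as hidden_dim increases.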

— Training procedure:

epochs_n = 1000  # maximum number of epochs
# Bookkeeping for evaluation and early stopping (assumed initial values)
train_rmse_list, test_rmse_list = [], []
prev_test_loss = float("inf")
increasing_count = 0

# loader is assumed to be a DataLoader over (train_X, train_y) built earlier
for epoch in range(epochs_n):
    model.train()
    for x_batch, y_batch in loader:
        optimizer.zero_grad()
        pred_y = model(x_batch)
        loss = loss_function(pred_y, y_batch)
        loss.backward()
        optimizer.step()
    # Print losses every 100 epochs
    if epoch % 100 == 0:
        model.eval()
        with torch.no_grad():
            pred_y = model(train_X)
            train_loss = torch.sqrt(loss_function(pred_y, train_y)).item()
            train_rmse_list.append(train_loss)

            pred_test_y = model(test_X)  # evaluate on the test inputs, not on test_y
            test_loss = torch.sqrt(loss_function(pred_test_y, test_y)).item()
            test_rmse_list.append(test_loss)

        print(f"Epoch {epoch}: Training RMSE: {train_loss:.4f}, Testing RMSE: {test_loss:.4f}")
        # Check for increasing RMSE
        if test_loss > prev_test_loss:
            increasing_count += 1
        else:
            increasing_count = 0
        # Update the previous test loss
        prev_test_loss = test_loss
        # Stop if RMSE increased consecutively for 2 checkpoints
        if increasing_count == 2:
            print("Stopping early due to increasing test RMSE.")
            break

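The RMSE curves show clearly that while the test RMSE trends downward, the training RMSE rises periodically throughout training.

The model is trained for at most 1000 epochs. An early-stopping mechanism is in place: if the root mean squared error (RMSE) on the test data rises at two consecutive checkpoints (evaluated every 100 epochs), training stops.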
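The early-stopping rule can be isolated from the training loop and checked on its own. The sketch below applies the same counter logic to a list of checkpoint RMSE values; the function name and the RMSE numbers are hypothetical and only illustrate the rule.

```python
def first_stop_checkpoint(rmse_history, patience=2):
    """Index of the checkpoint at which training would stop, or None if it never stops."""
    prev = float("inf")
    increasing_count = 0
    for idx, rmse in enumerate(rmse_history):
        if rmse > prev:
            increasing_count += 1   # RMSE rose compared with the previous checkpoint
        else:
            increasing_count = 0    # any non-increase resets the counter
        prev = rmse
        if increasing_count == patience:
            return idx
    return None


print(first_stop_checkpoint([0.50, 0.40, 0.45, 0.50, 0.30]))  # 3
print(first_stop_checkpoint([0.50, 0.60, 0.55, 0.70]))        # None
```

In the first history the RMSE rises at checkpoints 2 and 3 in a row, so training stops at checkpoint 3 and never reaches the lower value at checkpoint 4. That is the trade-off of a short patience: it saves compute but can stop before a later improvement.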

In our final forecast, the predicted values align closely with the actual figures. Yet, there’s a caveat.

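When digging into time series analysis, especially in economics, we cannot ignore the potential impact of external shocks or unforeseen events on the dataset. Sudden swings in GDP, whether sharp rises or falls, may be attributable to factors such as global recessions, technological breakthroughs, or key political decisions. Although LSTMs are good at identifying patterns inherent in the data, they have no built-in ability to account for these external influences. Combining time series models with auxiliary data, or with more complex structures such as hybrid models, may therefore yield more accurate and resilient forecasts.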

