
1140722 meeting

The cover image shows the day-1 prediction after training on 250 days; the predicted time is 11:30 on September 7, 2024.

Preface

This experiment aims to predict the next observation through the linear relationship between a spatial basis and the observed data.

Experiment

In this experiment, the original three-dimensional spatio-temporal data (5000 locations, 250 days, 24 hours) are first standardized along each location's time series and then centered by subtracting the mean at the corresponding time point. A (5000, 5000) spatial basis matrix over the locations is computed via MRTS and used in a linear regression, and the $SSSD^{S4}$ model is then used to forecast the next-period regression coefficients $\boldsymbol{\beta}$ and residuals $\boldsymbol{\varepsilon}$.

The study area is China; 5,000 locations are selected with a fixed random seed, and the number of training iterations is 3,800.
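To summarize the setup in one equation (the notation below is introduced here for clarity and does not appear in the original post): for each hour $h = 1, \dots, 24$ and day $t = 1, \dots, 250$, the standardized and centered observations over the 5,000 locations are modeled as

$$
\mathbf{y}_{h,t} = \mathbf{X}\,\boldsymbol{\beta}_{h,t} + \boldsymbol{\varepsilon}_{h,t}, \qquad \mathbf{X} \in \mathbb{R}^{5000 \times 5000},
$$

where $\mathbf{X}$ is the MRTS spatial basis matrix, and the $SSSD^{S4}$ model forecasts the coefficient sequences $\{\boldsymbol{\beta}_{h,t}\}_t$ and residual sequences $\{\boldsymbol{\varepsilon}_{h,t}\}_t$ for the future days.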

Part of the code is shown below:

  • Standardize each time series
train_ts_mean = np.mean(stacked_train, axis=1)  # (24, 5000)
train_ts_std = np.std(stacked_train, axis=1, ddof=0)  # (24, 5000)
for i in range(24):  # standardize each (hour, location) time series over the 250 days
    stacked_train[i] = (stacked_train[i] - train_ts_mean[i]) / train_ts_std[i]
  • Subtract the spatial mean

    To simplify the computation, a single mean over all spatial locations (averaged over all days) is subtracted here. In a later iteration we plan to handle this more carefully at prediction time, for example by subtracting the previous period's spatial mean and adding it back after $\boldsymbol{\beta}$ and $\boldsymbol{\varepsilon}$ have been predicted.

train_sp_mean = np.mean(stacked_train, axis=2)  # (24, 250)
for i in range(24):  # center each hour by a single overall spatial mean (simplification)
    stacked_train[i] = stacked_train[i] - np.mean(train_sp_mean[i][:, None])

The centering line above should eventually be revised to the following:

    stacked_train[i] = stacked_train[i] - train_sp_mean[i][:, None]  # subtract each day's spatial mean instead of a single scalar
  • Spatial basis and linear regression code
## mrts
mrts = MRTS(locs = torch.tensor(locations_choose), k = locations_choose.shape[0]).forward()
mrts = np.asarray(mrts)

## linear model
betas_all = []
residuals_all = []

X = mrts                     # shape: (5000, 5000)
X_pinv = np.linalg.pinv(X)   # shape: (5000, 5000)

for i in range(24):
    y = stacked_train[i].T   # shape: (5000, 250)
    beta = X_pinv @ y        # shape: (5000, 250)
    y_hat = X @ beta         # shape: (5000, 250)
    residuals = y - y_hat    # shape: (5000, 250)

    betas_all.append(beta.T)
    residuals_all.append(residuals.T)

## beta and residual
betas_all = np.stack(betas_all)         # (24, 250, 5000); the last axis indexes basis coefficients, not locations
residuals_all = np.stack(residuals_all) # (24, 250, 5000)
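
The post does not show how betas_all and residuals_all are handed off to the $SSSD^{S4}$ training script. A minimal sketch, assuming they are simply written to .npy files (the beta path below is taken from training.yaml; the residual path and the dtype cast are assumptions):

import numpy as np

## Hypothetical export step: save the regression outputs so the SSSD^{S4}
## training script can load them. Only the beta path appears in the configs;
## the residual path is assumed here.
np.save("./datasets/surface air temperature/beta/train_betas.npy",
        betas_all.astype(np.float32))         # (24, 250, 5000)
np.save("./datasets/surface air temperature/residual/train_residuals.npy",
        residuals_all.astype(np.float32))     # (24, 250, 5000)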

The first 40 $\boldsymbol{\beta}$ series, $\boldsymbol{\varepsilon}$ series, and QQ plots are shown below.

  • $\boldsymbol{\beta}$: [image gallery]
  • $\boldsymbol{\varepsilon}$: [image gallery]
  • QQ plots: [image gallery]

Only the next 10 periods are forecast, using the $SSSD^{S4}$ model. For inference, the original training set (5000, 250, 24) is extended by 10 days to form a (5000, 260, 24) array, and the last 10 days are imputed. The number of iterations is again 3,800.
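
The post does not show how the 10 extra days are appended. A minimal sketch, assuming the (5000, 250, 24) layout stated above and placeholder values for the future days (variable names here are stand-ins, not from the post):

import numpy as np

## Hypothetical padding step: extend the (5000, 250, 24) training array with
## 10 placeholder days to obtain (5000, 260, 24). With masking: "forecast" and
## missing_k: 10, SSSD^{S4} regenerates the last 10 days.
train_array = np.zeros((5000, 250, 24))                       # stand-in for the real training set
pad = np.zeros((train_array.shape[0], 10, train_array.shape[2]))
extended_array = np.concatenate([train_array, pad], axis=1)   # (5000, 260, 24)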

The experiment parameters are listed below.

model.yaml

wavenet:
  # WaveNet model parameters
  input_channels: 24  # Number of input channels
  output_channels: 24  # Number of output channels
  residual_layers: 26  # Number of residual layers
  residual_channels: 64  # Number of channels in residual blocks
  skip_channels: 64  # Number of channels in skip connections

  # Diffusion step embedding dimensions
  diffusion_step_embed_dim_input: 128  # Input dimension
  diffusion_step_embed_dim_hidden: 512  # Middle dimension
  diffusion_step_embed_dim_output: 512  # Output dimension

  # Structured State Spaces sequence model (S4) configurations
  s4_max_sequence_length: 250  # Maximum sequence length
  s4_state_dim: 64  # State dimension
  s4_dropout: 0.0  # Dropout rate
  s4_bidirectional: true  # Whether to use bidirectional layers
  s4_use_layer_norm: true  # Whether to use layer normalization

diffusion:
  # Diffusion model parameters
  T: 200  # Number of diffusion steps
  beta_0: 0.0001  # Initial beta value
  beta_T: 0.02  # Final beta value

training.yaml

# Training configuration
batch_size: 500  # Batch size
output_directory: "./results/surface air temperature/beta"  # Output directory for checkpoints and logs
ckpt_iter: "max"  # Checkpoint mode (max or min)
iters_per_ckpt: 100  # Checkpoint frequency (number of epochs)
iters_per_logging: 100  # Log frequency (number of iterations)
n_iters: 60000  # Maximum number of iterations
learning_rate: 0.0005  # Learning rate

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  train_path: "./datasets/surface air temperature/beta/train_betas.npy"  # Path to training data

inference.yaml

# Inference configuration
batch_size: 500  # Batch size for inference
output_directory: "./results/surface air temperature/inference/beta"  # Output directory for inference results
ckpt_path: "./results/surface air temperature/beta"  # Path to checkpoint for inference
trials: 1 # Replications

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  test_path: "./datasets/surface air temperature/beta/test_betas.npy"  # Path to test data

Imputation

During imputation, the data are restored by reversing the preprocessing steps: first reconstruct the values from the linear model, then add back the spatial mean, and finally undo the per-time-series standardization.

## inference
beta_inference_data = inference_data_concatenate(beta_inference_path).transpose(1, 2, 0)  # (24, 260, 5000)
residual_inference_data = inference_data_concatenate(residual_inference_path).transpose(1, 2, 0)  # (24, 260, 5000)

X = mrts                                     # shape: (5000, 5000)
y_inference = []
for i in tqdm(range(24)):
    beta = beta_inference_data[i].T          # shape: (5000, 260)
    residual = residual_inference_data[i].T  # shape: (5000, 260)
    y_hat = X @ beta + residual              # shape: (5000, 260)

    y_inference.append(y_hat)                # (24, 5000, 260)

y_inference = np.array(y_inference)
y_inference = y_inference.transpose(0, 2, 1)  # (24, 260, 5000)

for i in range(24):
    y_inference[i] = y_inference[i] + np.mean(train_sp_mean[i][:, None])

for i in range(24):
    y_inference[i] = y_inference[i] * train_ts_std[i] + train_ts_mean[i]
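
In equation form, the restoration above amounts to the following, per hour $h$ and day $t$ (the notation is introduced here: $\bar{m}_h$ is the single overall spatial mean subtracted during preprocessing, and $\boldsymbol{\mu}_h$, $\boldsymbol{\sigma}_h$ are the per-location time-series mean and standard deviation, i.e. train_ts_mean and train_ts_std):

$$
\hat{\mathbf{y}}_{h,t} = \boldsymbol{\sigma}_h \odot \left( \mathbf{X}\,\hat{\boldsymbol{\beta}}_{h,t} + \hat{\boldsymbol{\varepsilon}}_{h,t} + \bar{m}_h \right) + \boldsymbol{\mu}_h ,
$$

where $\odot$ denotes element-wise multiplication over the 5,000 locations.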

Results

The imputation results are shown below for the first 40 locations, over 260 days and 24 hours.

[Image gallery: imputed time series per location]

The figures show the 260 days of ground-truth values together with the imputed results; the last 10 days are the missing values.

The evaluation metrics are reported below, over all days (All) and over the imputed days (Future).

| Metric | Value |
| --- | --- |
| MSPE (All) | 275.036719 |
| MSPE (Future) | 73.921309 |
| MAPE (All) | 11.352546 |
| MAPE (Future) | 6.607120 |
| MSPE% (All) | 0.040497 |
| MSPE% (Future) | 0.022734 |
| MAPE% (All) | 0.040497 |
| MAPE% (Future) | 0.022734 |
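
The exact metric definitions are not given in the post. Judging from the magnitudes, MAPE here appears to denote the mean absolute prediction error rather than a percentage error; a minimal sketch under that assumption is shown below, with the "%" variants omitted because their normalization is not recoverable from the post.

import numpy as np

def prediction_errors(y_true, y_pred):
    ## Assumed definitions: MSPE = mean squared prediction error,
    ## MAPE = mean absolute prediction error (not percentage-based).
    err = np.asarray(y_pred) - np.asarray(y_true)
    return {"MSPE": float(np.mean(err ** 2)),
            "MAPE": float(np.mean(np.abs(err)))}

## "All" would use every day; "Future" only the last 10 imputed days, e.g.
## prediction_errors(y_true[:, -10:, :], y_pred[:, -10:, :]).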

Results for the first 10 locations are shown below.

| Metric | location 0 | location 1 | location 2 | location 3 | location 4 | location 5 | location 6 | location 7 | location 8 | location 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSPE (All) | 5.255262 | 10.026982 | 14.820487 | 11.327233 | 21.766030 | 10.163997 | 12.028227 | 27.281164 | 15.930138 | 12.857481 |
| MSPE (Future) | 3.356809 | 6.474035 | 14.732642 | 10.664811 | 13.967894 | 8.502941 | 12.174201 | 14.512227 | 13.899978 | 8.035698 |
| MAPE (All) | 1.828430 | 2.530852 | 3.230515 | 2.770408 | 3.856013 | 2.537044 | 2.792764 | 4.207124 | 3.220459 | 2.865893 |
| MAPE (Future) | 1.385229 | 2.076008 | 3.560789 | 3.016031 | 3.385609 | 2.551944 | 3.087012 | 3.242486 | 3.098211 | 2.263944 |
| MSPE% (All) | 0.006074 | 0.008465 | 0.010717 | 0.009204 | 0.012819 | 0.008425 | 0.009239 | 0.013915 | 0.010648 | 0.009491 |
| MSPE% (Future) | 0.004619 | 0.006998 | 0.011953 | 0.010142 | 0.011376 | 0.008567 | 0.010323 | 0.010823 | 0.010338 | 0.007556 |
| MAPE% (All) | 0.006074 | 0.008465 | 0.010717 | 0.009204 | 0.012819 | 0.008425 | 0.009239 | 0.013915 | 0.010648 | 0.009491 |
| MAPE% (Future) | 0.004619 | 0.006998 | 0.011953 | 0.010142 | 0.011376 | 0.008567 | 0.010323 | 0.010823 | 0.010338 | 0.007556 |

Control Group

To check whether the modifications above are effective, a control experiment was also carried out. For fairness, the number of iterations is likewise 3,800, and the data are standardized per time series before being fed into the $SSSD^{S4}$ model to forecast the next 10 days. The results are shown below.

[Image gallery: control-group imputed time series per location]

Control-group metrics over all days (All) and over the imputed days (Future):

| Metric (Control) | Value |
| --- | --- |
| MSPE (All) | 33.366627 |
| MSPE (Future) | 21.066000 |
| MAPE (All) | 4.087378 |
| MAPE (Future) | 3.584273 |
| MSPE% (All) | 0.014458 |
| MSPE% (Future) | 0.012363 |
| MAPE% (All) | 0.014458 |
| MAPE% (Future) | 0.012363 |

Control-group results for the first 10 locations:

| Metric | location 0 | location 1 | location 2 | location 3 | location 4 | location 5 | location 6 | location 7 | location 8 | location 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSPE (All) | 10.087422 | 20.628242 | 32.429203 | 25.994247 | 26.916309 | 24.316294 | 23.361771 | 22.525789 | 26.675638 | 9.733860 |
| MSPE (Future) | 9.859146 | 22.754442 | 42.287827 | 29.440161 | 30.574316 | 36.257183 | 31.123371 | 25.592594 | 33.918751 | 18.377745 |
| MAPE (All) | 2.443704 | 3.676670 | 5.009152 | 4.464825 | 4.596467 | 4.187994 | 4.139386 | 4.070803 | 4.407689 | 2.394271 |
| MAPE (Future) | 2.536753 | 4.275434 | 6.336579 | 5.214918 | 5.353099 | 5.744732 | 5.237920 | 4.642370 | 5.360564 | 3.896869 |
| MSPE% (All) | 0.008098 | 0.012278 | 0.016622 | 0.014840 | 0.015277 | 0.013898 | 0.013675 | 0.013449 | 0.014565 | 0.007949 |
| MSPE% (Future) | 0.008460 | 0.014416 | 0.021272 | 0.017527 | 0.018002 | 0.019290 | 0.017522 | 0.015505 | 0.017897 | 0.013019 |
| MAPE% (All) | 0.008098 | 0.012278 | 0.016622 | 0.014840 | 0.015277 | 0.013898 | 0.013675 | 0.013449 | 0.014565 | 0.007949 |
| MAPE% (Future) | 0.008460 | 0.014416 | 0.021272 | 0.017527 | 0.018002 | 0.019290 | 0.017522 | 0.015505 | 0.017897 | 0.013019 |

Conclusion

With the spatial basis embedded, the accuracy is still lower than that of the plain time-series forecast (the control). However, perhaps because only a small region is being predicted, the overall accuracy is higher than in the earlier experiment that sampled 5,000 points from a quarter of the Earth. The linear regression procedure can be re-examined, and we should reconsider how the spatial mean at each time step should be handled so that the result retains temporal structure.

Closing Remarks

https://raw.githubusercontent.com/Josh-test-lab/website-assets-repository/refs/heads/main/posts/1140722%20meeting/To%20be%20continued.jpg
To be continued!

Runtime Environment

  • Local operating system: Windows 11 24H2
    • Programming language: Python 3.12.9
  • Computing platform: Taiwan AI Cloud, National Center for High-performance Computing, National Applied Research Laboratories
    • Operating system: Ubuntu
    • Miniconda
    • GPU: NVIDIA Tesla V100 32GB GPU
    • CUDA 12.8 driver
    • Programming language: Python 3.10.16 for Linux
