
1140722 meeting

The cover image shows the day-1 prediction after training on 250 days; the predicted time is 11:30 on September 7, 2024.

Preface

This experiment aims to predict the next observation through the linear relationship between a spatial basis and the observed data.

Experiment

In this experiment, the original three-dimensional spatio-temporal data (5000 locations, 250 days, 24 hours) are first standardized along each location's time series and then centered by subtracting the mean at the corresponding time point. A (5000, 5000) spatial basis matrix over the locations is computed via MRTS and used in a linear regression, and the $SSSD^{S4}$ model is then used to forecast the next-period regression coefficients $\boldsymbol{\beta}$ and residuals $\boldsymbol{\varepsilon}$.

The study area is China; 5,000 locations are selected with a fixed random seed, and the number of training iterations is 3,800.
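To summarize the setup in one equation (the notation below is introduced here for clarity and does not appear in the original post): for each hour $h = 1, \dots, 24$ and day $t = 1, \dots, 250$, the standardized and centered observations over the 5,000 locations are modeled as

$$
\mathbf{y}_{h,t} = \mathbf{X}\,\boldsymbol{\beta}_{h,t} + \boldsymbol{\varepsilon}_{h,t}, \qquad \mathbf{X} \in \mathbb{R}^{5000 \times 5000},
$$

where $\mathbf{X}$ is the MRTS spatial basis matrix, and the $SSSD^{S4}$ model forecasts the coefficient sequences $\{\boldsymbol{\beta}_{h,t}\}_t$ and residual sequences $\{\boldsymbol{\varepsilon}_{h,t}\}_t$ for the future days.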

Part of the code is shown below:

  • Standardize each time series
train_ts_mean = np.mean(stacked_train, axis=1)  # (24, 5000)
train_ts_std = np.std(stacked_train, axis=1, ddof=0)  # (24, 5000)
for i in range(24):  # standardize each (hour, location) time series over the 250 days
    stacked_train[i] = (stacked_train[i] - train_ts_mean[i]) / train_ts_std[i]
  • Subtract the spatial mean

    To simplify the computation, a single mean over all spatial locations (averaged over all days) is subtracted here. In a later iteration we plan to handle this more carefully at prediction time, for example by subtracting the previous period's spatial mean and adding it back after $\boldsymbol{\beta}$ and $\boldsymbol{\varepsilon}$ have been predicted.

train_sp_mean = np.mean(stacked_train, axis=2)  # (24, 250)
for i in range(24):  # center each hour by a single overall spatial mean (simplification)
    stacked_train[i] = stacked_train[i] - np.mean(train_sp_mean[i][:, None])

The centering line above should eventually be revised to the following:

    stacked_train[i] = stacked_train[i] - train_sp_mean[i][:, None]  # subtract each day's spatial mean instead of a single scalar
  • Spatial basis and linear regression code
## mrts
mrts = MRTS(locs = torch.tensor(locations_choose), k = locations_choose.shape[0]).forward()
mrts = np.asarray(mrts)

## linear model
betas_all = []
residuals_all = []

X = mrts                     # shape: (5000, 5000)
X_pinv = np.linalg.pinv(X)   # shape: (5000, 5000)

for i in range(24):
    y = stacked_train[i].T   # shape: (5000, 250)
    beta = X_pinv @ y        # shape: (5000, 250)
    y_hat = X @ beta         # shape: (5000, 250)
    residuals = y - y_hat    # shape: (5000, 250)

    betas_all.append(beta.T)
    residuals_all.append(residuals.T)

## beta and residual
betas_all = np.stack(betas_all)         # (24, 250, 5000); the last axis indexes basis coefficients, not locations
residuals_all = np.stack(residuals_all) # (24, 250, 5000)
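
The post does not show how betas_all and residuals_all are handed off to the $SSSD^{S4}$ training script. A minimal sketch, assuming they are simply written to .npy files (the beta path below is taken from training.yaml; the residual path and the dtype cast are assumptions):

import numpy as np

## Hypothetical export step: save the regression outputs so the SSSD^{S4}
## training script can load them. Only the beta path appears in the configs;
## the residual path is assumed here.
np.save("./datasets/surface air temperature/beta/train_betas.npy",
        betas_all.astype(np.float32))         # (24, 250, 5000)
np.save("./datasets/surface air temperature/residual/train_residuals.npy",
        residuals_all.astype(np.float32))     # (24, 250, 5000)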

The first 40 $\boldsymbol{\beta}$ series, $\boldsymbol{\varepsilon}$ series, and QQ plots are shown below.

  • $\boldsymbol{\beta}$: [image gallery]
  • $\boldsymbol{\varepsilon}$: [image gallery]
  • QQ plots: [image gallery]

Only the next 10 periods are forecast, using the $SSSD^{S4}$ model. For inference, the original training set (5000, 250, 24) is extended by 10 days to form a (5000, 260, 24) array, and the last 10 days are imputed. The number of iterations is again 3,800.
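
The post does not show how the 10 extra days are appended. A minimal sketch, assuming the (5000, 250, 24) layout stated above and placeholder values for the future days (variable names here are stand-ins, not from the post):

import numpy as np

## Hypothetical padding step: extend the (5000, 250, 24) training array with
## 10 placeholder days to obtain (5000, 260, 24). With masking: "forecast" and
## missing_k: 10, SSSD^{S4} regenerates the last 10 days.
train_array = np.zeros((5000, 250, 24))                       # stand-in for the real training set
pad = np.zeros((train_array.shape[0], 10, train_array.shape[2]))
extended_array = np.concatenate([train_array, pad], axis=1)   # (5000, 260, 24)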

The experiment parameters are listed below.

model.yaml

wavenet:
  # WaveNet model parameters
  input_channels: 24  # Number of input channels
  output_channels: 24  # Number of output channels
  residual_layers: 26  # Number of residual layers
  residual_channels: 64  # Number of channels in residual blocks
  skip_channels: 64  # Number of channels in skip connections

  # Diffusion step embedding dimensions
  diffusion_step_embed_dim_input: 128  # Input dimension
  diffusion_step_embed_dim_hidden: 512  # Middle dimension
  diffusion_step_embed_dim_output: 512  # Output dimension

  # Structured State Spaces sequence model (S4) configurations
  s4_max_sequence_length: 250  # Maximum sequence length
  s4_state_dim: 64  # State dimension
  s4_dropout: 0.0  # Dropout rate
  s4_bidirectional: true  # Whether to use bidirectional layers
  s4_use_layer_norm: true  # Whether to use layer normalization

diffusion:
  # Diffusion model parameters
  T: 200  # Number of diffusion steps
  beta_0: 0.0001  # Initial beta value
  beta_T: 0.02  # Final beta value

training.yaml

# Training configuration
batch_size: 500  # Batch size
output_directory: "./results/surface air temperature/beta"  # Output directory for checkpoints and logs
ckpt_iter: "max"  # Checkpoint mode (max or min)
iters_per_ckpt: 100  # Checkpoint frequency (number of epochs)
iters_per_logging: 100  # Log frequency (number of iterations)
n_iters: 60000  # Maximum number of iterations
learning_rate: 0.0005  # Learning rate

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  train_path: "./datasets/surface air temperature/beta/train_betas.npy"  # Path to training data

inference.yaml

# Inference configuration
batch_size: 500  # Batch size for inference
output_directory: "./results/surface air temperature/inference/beta"  # Output directory for inference results
ckpt_path: "./results/surface air temperature/beta"  # Path to checkpoint for inference
trials: 1 # Replications

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  test_path: "./datasets/surface air temperature/beta/test_betas.npy"  # Path to test data

Imputation

During imputation, the data are restored by reversing the preprocessing steps: first reconstruct the values from the linear model, then add back the spatial mean, and finally undo the per-time-series standardization.

## inference
beta_inference_data = inference_data_concatenate(beta_inference_path).transpose(1, 2, 0)  # (24, 260, 5000)
residual_inference_data = inference_data_concatenate(residual_inference_path).transpose(1, 2, 0)  # (24, 260, 5000)

X = mrts                                     # shape: (5000, 5000)
y_inference = []
for i in tqdm(range(24)):
    beta = beta_inference_data[i].T          # shape: (5000, 260)
    residual = residual_inference_data[i].T  # shape: (5000, 260)
    y_hat = X @ beta + residual              # shape: (5000, 260)

    y_inference.append(y_hat)                # (24, 5000, 260)

y_inference = np.array(y_inference)
y_inference = y_inference.transpose(0, 2, 1)  # (24, 260, 5000)

for i in range(24):
    y_inference[i] = y_inference[i] + np.mean(train_sp_mean[i][:, None])

for i in range(24):
    y_inference[i] = y_inference[i] * train_ts_std[i] + train_ts_mean[i]
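
In equation form, the restoration above amounts to the following, per hour $h$ and day $t$ (the notation is introduced here: $\bar{m}_h$ is the single overall spatial mean subtracted during preprocessing, and $\boldsymbol{\mu}_h$, $\boldsymbol{\sigma}_h$ are the per-location time-series mean and standard deviation, i.e. train_ts_mean and train_ts_std):

$$
\hat{\mathbf{y}}_{h,t} = \boldsymbol{\sigma}_h \odot \left( \mathbf{X}\,\hat{\boldsymbol{\beta}}_{h,t} + \hat{\boldsymbol{\varepsilon}}_{h,t} + \bar{m}_h \right) + \boldsymbol{\mu}_h ,
$$

where $\odot$ denotes element-wise multiplication over the 5,000 locations.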

Results

The imputation results are shown below for the first 40 locations, over 260 days and 24 hours.

[Image gallery: imputed time series per location]

The figures show the 260 days of ground-truth values together with the imputed results; the last 10 days are the missing values.

The evaluation metrics are reported below, over all days (All) and over the imputed days (Future).

| Metric | Value |
| --- | --- |
| MSPE (All) | 275.036719 |
| MSPE (Future) | 73.921309 |
| MAPE (All) | 11.352546 |
| MAPE (Future) | 6.607120 |
| MSPE% (All) | 0.040497 |
| MSPE% (Future) | 0.022734 |
| MAPE% (All) | 0.040497 |
| MAPE% (Future) | 0.022734 |
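
The exact metric definitions are not given in the post. Judging from the magnitudes, MAPE here appears to denote the mean absolute prediction error rather than a percentage error; a minimal sketch under that assumption is shown below, with the "%" variants omitted because their normalization is not recoverable from the post.

import numpy as np

def prediction_errors(y_true, y_pred):
    ## Assumed definitions: MSPE = mean squared prediction error,
    ## MAPE = mean absolute prediction error (not percentage-based).
    err = np.asarray(y_pred) - np.asarray(y_true)
    return {"MSPE": float(np.mean(err ** 2)),
            "MAPE": float(np.mean(np.abs(err)))}

## "All" would use every day; "Future" only the last 10 imputed days, e.g.
## prediction_errors(y_true[:, -10:, :], y_pred[:, -10:, :]).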

Results for the first 10 locations are shown below.

| Metric | location 0 | location 1 | location 2 | location 3 | location 4 | location 5 | location 6 | location 7 | location 8 | location 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSPE (All) | 5.255262 | 10.026982 | 14.820487 | 11.327233 | 21.766030 | 10.163997 | 12.028227 | 27.281164 | 15.930138 | 12.857481 |
| MSPE (Future) | 3.356809 | 6.474035 | 14.732642 | 10.664811 | 13.967894 | 8.502941 | 12.174201 | 14.512227 | 13.899978 | 8.035698 |
| MAPE (All) | 1.828430 | 2.530852 | 3.230515 | 2.770408 | 3.856013 | 2.537044 | 2.792764 | 4.207124 | 3.220459 | 2.865893 |
| MAPE (Future) | 1.385229 | 2.076008 | 3.560789 | 3.016031 | 3.385609 | 2.551944 | 3.087012 | 3.242486 | 3.098211 | 2.263944 |
| MSPE% (All) | 0.006074 | 0.008465 | 0.010717 | 0.009204 | 0.012819 | 0.008425 | 0.009239 | 0.013915 | 0.010648 | 0.009491 |
| MSPE% (Future) | 0.004619 | 0.006998 | 0.011953 | 0.010142 | 0.011376 | 0.008567 | 0.010323 | 0.010823 | 0.010338 | 0.007556 |
| MAPE% (All) | 0.006074 | 0.008465 | 0.010717 | 0.009204 | 0.012819 | 0.008425 | 0.009239 | 0.013915 | 0.010648 | 0.009491 |
| MAPE% (Future) | 0.004619 | 0.006998 | 0.011953 | 0.010142 | 0.011376 | 0.008567 | 0.010323 | 0.010823 | 0.010338 | 0.007556 |

Control Group

To check whether the modifications above are effective, a control experiment was also carried out. For fairness, the number of iterations is likewise 3,800, and the data are standardized per time series before being fed into the $SSSD^{S4}$ model to forecast the next 10 days. The results are shown below.

[Image gallery: control-group imputed time series per location]

Control-group metrics over all days (All) and over the imputed days (Future):

| Metric (Control) | Value |
| --- | --- |
| MSPE (All) | 33.366627 |
| MSPE (Future) | 21.066000 |
| MAPE (All) | 4.087378 |
| MAPE (Future) | 3.584273 |
| MSPE% (All) | 0.014458 |
| MSPE% (Future) | 0.012363 |
| MAPE% (All) | 0.014458 |
| MAPE% (Future) | 0.012363 |

Control-group results for the first 10 locations:

| Metric | location 0 | location 1 | location 2 | location 3 | location 4 | location 5 | location 6 | location 7 | location 8 | location 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSPE (All) | 10.087422 | 20.628242 | 32.429203 | 25.994247 | 26.916309 | 24.316294 | 23.361771 | 22.525789 | 26.675638 | 9.733860 |
| MSPE (Future) | 9.859146 | 22.754442 | 42.287827 | 29.440161 | 30.574316 | 36.257183 | 31.123371 | 25.592594 | 33.918751 | 18.377745 |
| MAPE (All) | 2.443704 | 3.676670 | 5.009152 | 4.464825 | 4.596467 | 4.187994 | 4.139386 | 4.070803 | 4.407689 | 2.394271 |
| MAPE (Future) | 2.536753 | 4.275434 | 6.336579 | 5.214918 | 5.353099 | 5.744732 | 5.237920 | 4.642370 | 5.360564 | 3.896869 |
| MSPE% (All) | 0.008098 | 0.012278 | 0.016622 | 0.014840 | 0.015277 | 0.013898 | 0.013675 | 0.013449 | 0.014565 | 0.007949 |
| MSPE% (Future) | 0.008460 | 0.014416 | 0.021272 | 0.017527 | 0.018002 | 0.019290 | 0.017522 | 0.015505 | 0.017897 | 0.013019 |
| MAPE% (All) | 0.008098 | 0.012278 | 0.016622 | 0.014840 | 0.015277 | 0.013898 | 0.013675 | 0.013449 | 0.014565 | 0.007949 |
| MAPE% (Future) | 0.008460 | 0.014416 | 0.021272 | 0.017527 | 0.018002 | 0.019290 | 0.017522 | 0.015505 | 0.017897 | 0.013019 |

Conclusion

With the spatial basis embedded, the accuracy is still lower than that of the plain time-series forecast (the control). However, perhaps because only a small region is being predicted, the overall accuracy is higher than in the earlier experiment that sampled 5,000 points from a quarter of the Earth. The linear regression procedure can be re-examined, and we should reconsider how the spatial mean at each time step should be handled so that the result retains temporal structure.

Closing Remarks

https://raw.githubusercontent.com/Josh-test-lab/website-assets-repository/refs/heads/main/posts/1140722%20meeting/To%20be%20continued.jpg
To be continued!

Runtime Environment

  • Local operating system: Windows 11 24H2
    • Programming language: Python 3.12.9
  • Computing platform: Taiwan AI Cloud, National Center for High-performance Computing, National Applied Research Laboratories
    • Operating system: Ubuntu
    • Miniconda
    • GPU: NVIDIA Tesla V100 32GB GPU
    • CUDA 12.8 driver
    • Programming language: Python 3.10.16 for Linux
