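## Preface
This experiment uses the same setup as 1140812-meeting; the only change is that the training dataset is replaced with a self-made spatial simulation dataset (#simulation01).
## SSSD + autoFRK
model.yaml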
```yaml
wavenet:
  # WaveNet model parameters
  input_channels: 24  # Number of input channels
  output_channels: 24  # Number of output channels
  residual_layers: 32  # Number of residual layers
  residual_channels: 64  # Number of channels in residual blocks
  skip_channels: 64  # Number of channels in skip connections

  # Diffusion step embedding dimensions
  diffusion_step_embed_dim_input: 128  # Input dimension
  diffusion_step_embed_dim_hidden: 512  # Middle dimension
  diffusion_step_embed_dim_output: 512  # Output dimension

  # Structured State Spaces sequence model (S4) configurations
  s4_max_sequence_length: 250  # Maximum sequence length
  s4_state_dim: 64  # State dimension
  s4_dropout: 0.0  # Dropout rate
  s4_bidirectional: true  # Whether to use bidirectional layers
  s4_use_layer_norm: true  # Whether to use layer normalization

diffusion:
  # Diffusion model parameters
  T: 200  # Number of diffusion steps
  beta_0: 0.0001  # Initial beta value
  beta_T: 0.02  # Final beta value
```
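To make the `diffusion` settings concrete, here is a minimal sketch of the noise schedule they imply. It assumes a linear schedule between `beta_0` and `beta_T`, as in standard DDPM-style models; that SSSD_CP builds its schedule exactly this way is an assumption, and the function names `linear_beta_schedule` and `cumulative_alphas` are hypothetical.

```python
def linear_beta_schedule(T, beta_0, beta_T):
    """Return T betas spaced evenly from beta_0 to beta_T (inclusive)."""
    if T < 2:
        raise ValueError("T must be at least 2")
    step = (beta_T - beta_0) / (T - 1)
    return [beta_0 + step * t for t in range(T)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s<=t} (1 - beta_s) for every step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


# Values taken from the diffusion section of model.yaml.
# Assumption: the schedule is linear in beta.
betas = linear_beta_schedule(T=200, beta_0=0.0001, beta_T=0.02)
alpha_bars = cumulative_alphas(betas)

# Fraction of the original signal variance that survives the last forward step.
print(f"alpha_bar at the final step: {alpha_bars[-1]:.4f}")
```

The smaller the final `alpha_bar`, the closer the fully diffused sample is to pure Gaussian noise, so this value is a quick check on whether `T` and `beta_T` are large enough.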
```yaml
# Training configuration
batch_size: 200  # Batch size
output_directory: "./results/surface air temperature/simulation_spatial"  # Output directory for checkpoints and logs
ckpt_iter: "max"  # Checkpoint mode (max or min)
iters_per_ckpt: 100  # Checkpoint frequency (number of epochs)
iters_per_logging: 100  # Log frequency (number of iterations)
n_iters: 60000  # Maximum number of iterations
learning_rate: 0.0005  # Learning rate

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  train_path: "./datasets/surface air temperature/train/train_sssd.npy"  # Path to training data
```
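The `masking: "forecast"` and `missing_k: 10` settings can be illustrated with a small sketch. It assumes that "forecast" masking hides the last `missing_k` time steps of every channel and leaves everything before them observed. The function name `forecast_mask` and the window length of 250 are hypothetical (250 is only the S4 maximum sequence length in model.yaml, not a confirmed window size).

```python
def forecast_mask(num_steps, num_channels, missing_k):
    """Build a mask with 1 for observed entries and 0 for entries to generate.

    Assumption: the last `missing_k` time steps of every channel are hidden.
    """
    if not 0 <= missing_k <= num_steps:
        raise ValueError("missing_k must lie between 0 and num_steps")
    observed = num_steps - missing_k
    return [
        [1 if t < observed else 0 for _ in range(num_channels)]
        for t in range(num_steps)
    ]


# 24 channels as in input_channels; 250 steps is a hypothetical window length.
mask = forecast_mask(num_steps=250, num_channels=24, missing_k=10)

hidden = sum(row.count(0) for row in mask)
print(f"hidden entries: {hidden} of {250 * 24}")
```

With `only_generate_missing: true`, only the entries marked 0 would be produced by the model, while the entries marked 1 serve as conditioning input.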
```yaml
# Inference configuration
batch_size: 200  # Batch size for inference
output_directory: "./results/surface air temperature/inference/simulation_spatial"  # Output directory for inference results
ckpt_path: "./results/surface air temperature/simulation_spatial"  # Path to checkpoint for inference
trials: 1  # Replications

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  test_path: "./datasets/surface air temperature/test/test_sssd.npy"  # Path to test data
```
SSSD inference took 06:20:00 (GPU).
autoFRK inference took 2.186092 hours (CPU).
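The two timings above are reported in different formats (a clock string and decimal hours). A small sketch for putting them on a common scale follows; the helper name `parse_clock` is hypothetical, and adding the two stages together assumes they run one after the other.

```python
from datetime import timedelta


def parse_clock(text):
    """Parse 'H:MM:SS' or 'H:MM:SS.ffffff' into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


sssd = parse_clock("06:20:00")        # SSSD inference (GPU)
autofrk = timedelta(hours=2.186092)   # autoFRK inference (CPU)

# Assumption: the two stages run sequentially, so their durations add up.
total = sssd + autofrk

print(f"SSSD:    {sssd.total_seconds() / 3600:.3f} h")
print(f"autoFRK: {autofrk}")
print(f"total:   {total.total_seconds() / 3600:.3f} h")
```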
| Metric | All Locs & All Time | Known Locs & All Time | Unknown Locs & All Time | All Locs & Future | Known Locs & Future | Unknown Locs & Future | All Locs & Past | Known Locs & Past | Unknown Locs & Past |
|--------|---------------------|-----------------------|-------------------------|-------------------|---------------------|-----------------------|-----------------|-------------------|---------------------|
| MSPE | 3.067854381 | 2.944473417 | 3.813358697 | 4.446151669 | 4.317139150 | 5.225683507 | 3.012722490 | 2.889566788 | 3.756865704 |
| RMSPE | 1.751529155 | 1.715946799 | 1.952782296 | 2.108589972 | 2.077772641 | 2.285975395 | 1.735719588 | 1.699872580 | 1.938263580 |
| MSPE% | 0.010246664 | 0.009823395 | 0.012804179 | 0.015253644 | 0.014798086 | 0.018006261 | 0.010046385 | 0.009624408 | 0.012596096 |
| RMSPE% | 0.101225808 | 0.099113044 | 0.113155553 | 0.123505645 | 0.121647385 | 0.134187409 | 0.100231656 | 0.098104066 | 0.112232330 |
| MAPE | 0.974059015 | 0.958734496 | 1.066654299 | 1.487598057 | 1.476281373 | 1.555976813 | 0.953517454 | 0.938032621 | 1.047081399 |
| MAPE% | 0.003235654 | 0.003183061 | 0.003553432 | 0.005065191 | 0.005025057 | 0.005307689 | 0.003162472 | 0.003109382 | 0.003483261 |
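As a reading aid for the table above, the sketch below shows one way such errors could be computed and split by location group. In the table, each RMSPE entry is the square root of the matching MSPE entry, which the code mirrors. Reading "MAPE" as the mean absolute prediction error is an assumption. The "%" variants are left out because their definition is not stated. The toy values and the names `prediction_errors` and `subset` are hypothetical and are not the experiment's data.

```python
import math


def prediction_errors(y_true, y_pred):
    """Return MSPE, RMSPE and mean absolute prediction error for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    mspe = sum(d * d for d in diffs) / len(diffs)
    mape = sum(abs(d) for d in diffs) / len(diffs)  # assumed meaning of "MAPE"
    return {"MSPE": mspe, "RMSPE": math.sqrt(mspe), "MAPE": mape}


def subset(values, keep):
    """Keep only the values whose flag in `keep` is True."""
    return [v for v, k in zip(values, keep) if k]


# Made-up toy values, one per (location, time) pair.
y_true = [10.0, 12.0, 11.0, 13.0]
y_pred = [11.0, 12.0, 9.0, 13.0]
known_loc = [True, True, False, False]  # True: location seen during training
unknown_loc = [not k for k in known_loc]

all_locs = prediction_errors(y_true, y_pred)
known = prediction_errors(subset(y_true, known_loc), subset(y_pred, known_loc))
unknown = prediction_errors(subset(y_true, unknown_loc), subset(y_pred, unknown_loc))

for name, result in [("All", all_locs), ("Known", known), ("Unknown", unknown)]:
    print(name, result)
```

The same `subset` step could be applied with a time flag (future versus past) to obtain the remaining columns.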
## TSMixer + autoFRK
TSMixer inference took 4:35:02.560534 (GPU).
autoFRK inference took (not recorded) hours (CPU).
## RegressionEnsemble + autoFRK
RegressionEnsemble inference took 0:02:46.558833 (CPU).
autoFRK inference took 3.306048 hours (CPU).
| Metric | All Locs & All Time | Known Locs & All Time | Unknown Locs & All Time | All Locs & Future | Known Locs & Future | Unknown Locs & Future | All Locs & Past | Known Locs & Past | Unknown Locs & Past |
|------------|---------------------|-----------------------|-------------------------|-------------------|---------------------|-----------------------|-----------------|-------------------|---------------------|
| MSPE | 2.991342157 | 2.861356059 | 3.776756648 | 3.675732451 | 3.552316007 | 4.421451142 | 2.963966545 | 2.833717661 | 3.750968868 |
| RMSPE | 1.729549698 | 1.691554332 | 1.943387930 | 1.917219980 | 1.884758872 | 2.102724695 | 1.721617421 | 1.683364982 | 1.936741818 |
| MSPE% | 0.009989546 | 0.009544637 | 0.012677817 | 0.012557865 | 0.012124315 | 0.015177499 | 0.009886813 | 0.009441449 | 0.012577830 |
| RMSPE% | 0.099947714 | 0.097696656 | 0.112595814 | 0.112061880 | 0.110110469 | 0.123196995 | 0.099432453 | 0.097167121 | 0.112150925 |
| MAPE | 0.934706683 | 0.915188150 | 1.052643436 | 1.220234210 | 1.210561085 | 1.278682096 | 0.923285582 | 0.903373233 | 1.043601890 |
| MAPE% | 0.003105028 | 0.003038683 | 0.003505907 | 0.004144856 | 0.004110203 | 0.004354238 | 0.003063435 | 0.002995822 | 0.003471973 |
## Conclusion

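## Environment
- Local operating system: Windows 11 24H2
- Programming language: Python 3.12.9
- Computing platform: Taiwan AI Cloud, National Center for High-performance Computing (NCHC), National Applied Research Laboratories
  - Operating system: Ubuntu
  - Miniconda
  - GPU: NVIDIA Tesla V100 32GB
  - CUDA 12.8 driver
  - Programming language: Python 3.10.16 for Linux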
## Further Learning
- My [GitHub repository](https://github.com/Josh-test-lab/sssd_cp_learning_and_testing) for testing this project.
## References
- Global Modeling and Assimilation Office (GMAO). (2015). *MERRA-2 tavg1_2d_flx_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Surface Flux Diagnostics* (Version 5.12.4) [Data set]. Goddard Earth Sciences Data and Information Services Center (GES DISC). Retrieved from https://doi.org/10.5067/7MCPBJ41Y0K6
- Juan Lopez Alcaraz & Nils Strodthoff. (2022). Diffusion-based time series imputation and forecasting with structured state space models. *Transactions on Machine Learning Research*. Retrieved from https://openreview.net/forum?id=hHiIbk7ApW
- *SSSD*. (2022). GitHub. Retrieved from https://github.com/AI4HealthUOL/SSSD
- *SSSD_CP*. (2024). GitHub. Retrieved from https://github.com/egpivo/SSSD_CP
- Unit8 SA. (n.d.). *Time Series Made Easy in Python*. Darts. Retrieved from https://unit8co.github.io/darts/index.html
- *darts*. (2025). GitHub. Retrieved from https://github.com/unit8co/darts
- Tzeng, S., & Huang, H. C. (2018). Resolution Adaptive Fixed Rank Kriging. *Technometrics*, 60(2), 198–208. Retrieved from https://doi.org/10.1080/00401706.2017.1345701
- *autoFRK*. (2024). GitHub. Retrieved from https://github.com/egpivo/autoFRK
- Si-An Chen, Chun-Liang Li, Nate Yoder, Sercan O. Arik, & Tomas Pfister. (2023). *TSMixer: An all-MLP architecture for time series forecasting*. arXiv. Retrieved from https://arxiv.org/abs/2303.06053