
1140819 meeting

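## Introduction

This experiment uses the same setup as 1140812-meeting; the only change is that the training dataset is replaced with a self-made spatial simulation dataset (#simulation01).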

## SSSD + autoFRK

model.yaml

```yaml
wavenet:
  # WaveNet model parameters
  input_channels: 24  # Number of input channels
  output_channels: 24  # Number of output channels
  residual_layers: 32  # Number of residual layers
  residual_channels: 64  # Number of channels in residual blocks
  skip_channels: 64  # Number of channels in skip connections

  # Diffusion step embedding dimensions
  diffusion_step_embed_dim_input: 128  # Input dimension
  diffusion_step_embed_dim_hidden: 512  # Middle dimension
  diffusion_step_embed_dim_output: 512  # Output dimension

  # Structured State Spaces sequence model (S4) configurations
  s4_max_sequence_length: 250  # Maximum sequence length
  s4_state_dim: 64  # State dimension
  s4_dropout: 0.0  # Dropout rate
  s4_bidirectional: true  # Whether to use bidirectional layers
  s4_use_layer_norm: true  # Whether to use layer normalization

diffusion:
  # Diffusion model parameters
  T: 200  # Number of diffusion steps
  beta_0: 0.0001  # Initial beta value
  beta_T: 0.02  # Final beta value
```
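
To make the `diffusion` settings more concrete, the sketch below builds a noise schedule from `T`, `beta_0`, and `beta_T`. It assumes the betas are spaced linearly between the two endpoints, which is the common DDPM-style choice; the notes do not state which schedule is actually used, and the function names `linear_beta_schedule` and `cumulative_alpha` are hypothetical.

```python
def linear_beta_schedule(T: int, beta_0: float, beta_T: float) -> list[float]:
    """Linearly spaced noise variances, one per diffusion step (endpoints included).

    Assumption: a linear schedule; the source only gives T, beta_0 and beta_T.
    """
    if T < 2:
        raise ValueError("T must be at least 2")
    step = (beta_T - beta_0) / (T - 1)
    return [beta_0 + i * step for i in range(T)]


def cumulative_alpha(betas: list[float]) -> list[float]:
    """Running product of (1 - beta): share of the original signal variance kept after each step."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


# Values copied from the `diffusion` block of model.yaml.
betas = linear_beta_schedule(T=200, beta_0=0.0001, beta_T=0.02)
alpha_bars = cumulative_alpha(betas)

print(len(betas), betas[0], betas[-1])
print(alpha_bars[-1])  # signal share remaining at the final step
```

Under this assumption the schedule starts with almost no noise per step and ends with most of the signal replaced by noise, which is what the reverse (denoising) process then undoes.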

```yaml
# Training configuration
batch_size: 200  # Batch size
output_directory: "./results/surface air temperature/simulation_spatial"  # Output directory for checkpoints and logs
ckpt_iter: "max"  # Checkpoint mode (max or min)
iters_per_ckpt: 100  # Checkpoint frequency (number of epochs)
iters_per_logging: 100  # Log frequency (number of iterations)
n_iters: 60000  # Maximum number of iterations
learning_rate: 0.0005  # Learning rate

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  train_path: "./datasets/surface air temperature/train/train_sssd.npy"  # Path to training data
```
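
The `masking: "forecast"` and `missing_k: 10` settings can be pictured with a small mask. The sketch below assumes that forecast masking hides the last `missing_k` time steps of every channel, so the model conditions on the earlier steps and generates the rest; this interpretation, the function name `forecast_mask`, and the toy series length of 30 are assumptions for illustration, not taken from the SSSD code.

```python
def forecast_mask(n_steps: int, n_channels: int, missing_k: int) -> list[list[int]]:
    """Build an n_steps x n_channels mask: 1 = observed, 0 = to be generated.

    Assumption: the final `missing_k` time steps are hidden in every channel.
    """
    if not 0 <= missing_k <= n_steps:
        raise ValueError("missing_k must lie between 0 and n_steps")
    n_observed = n_steps - missing_k
    return [
        [1 if t < n_observed else 0 for _ in range(n_channels)]
        for t in range(n_steps)
    ]


# 24 channels as in model.yaml; 30 time steps is a made-up toy length.
mask = forecast_mask(n_steps=30, n_channels=24, missing_k=10)

hidden_per_channel = sum(1 for row in mask if row[0] == 0)
print(hidden_per_channel)   # number of hidden steps in the first channel
print(mask[19][0], mask[20][0])  # last observed step, first hidden step
```

With `only_generate_missing: true`, the training loss is presumably evaluated only where the mask is 0, while the observed positions are passed through as conditioning.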

```yaml
# Inference configuration
batch_size: 200  # Batch size for inference
output_directory: "./results/surface air temperature/inference/simulation_spatial"  # Output directory for inference results
ckpt_path: "./results/surface air temperature/simulation_spatial"  # Path to checkpoint for inference
trials: 1 # Replications

# Additional training settings
only_generate_missing: true  # Generate missing values only
use_model: 2  # Model to use for training
masking: "forecast"  # Masking strategy for missing values
missing_k: 10  # Number of missing values

# Data paths
data:
  test_path: "./datasets/surface air temperature/test/test_sssd.npy"  # Path to test data
```

SSSD inference took 06:20:00 (GPU).

autoFRK inference took 2.186092 hours (CPU).

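| Metric | All Locs & All Time | Known Locs & All Time | Unknown Locs & All Time | All Locs & Future | Known Locs & Future | Unknown Locs & Future | All Locs & Past | Known Locs & Past | Unknown Locs & Past |
|--------|---------------------|-----------------------|-------------------------|-------------------|---------------------|-----------------------|-----------------|-------------------|---------------------|
| MSPE   | 3.067854381 | 2.944473417 | 3.813358697 | 4.446151669 | 4.317139150 | 5.225683507 | 3.012722490 | 2.889566788 | 3.756865704 |
| RMSPE  | 1.751529155 | 1.715946799 | 1.952782296 | 2.108589972 | 2.077772641 | 2.285975395 | 1.735719588 | 1.699872580 | 1.938263580 |
| MSPE%  | 0.010246664 | 0.009823395 | 0.012804179 | 0.015253644 | 0.014798086 | 0.018006261 | 0.010046385 | 0.009624408 | 0.012596096 |
| RMSPE% | 0.101225808 | 0.099113044 | 0.113155553 | 0.123505645 | 0.121647385 | 0.134187409 | 0.100231656 | 0.098104066 | 0.112232330 |
| MAPE   | 0.974059015 | 0.958734496 | 1.066654299 | 1.487598057 | 1.476281373 | 1.555976813 | 0.953517454 | 0.938032621 | 1.047081399 |
| MAPE%  | 0.003235654 | 0.003183061 | 0.003553432 | 0.005065191 | 0.005025057 | 0.005307689 | 0.003162472 | 0.003109382 | 0.003483261 |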

## TSMixer + autoFRK

TSMixer inference took 4:35:02.560534 (GPU).

autoFRK inference time (CPU): not yet recorded.

## RegressionEnsemble + autoFRK


RegressionEnsemble inference took 0:02:46.558833 (CPU).

autoFRK inference took 3.306048 hours (CPU).
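
The inference times in this post are reported in two formats, `H:MM:SS` for the forecasting models and decimal hours for autoFRK. The sketch below converts both to hours so they can be compared on one scale. The helper name `hms_to_hours` is hypothetical, and adding the two stages assumes they run one after the other; TSMixer is left out because its autoFRK time is not recorded.

```python
def hms_to_hours(text: str) -> float:
    """Convert 'H:MM:SS' or 'H:MM:SS.ffffff' to fractional hours."""
    hours, minutes, seconds = text.split(":")
    return int(hours) + int(minutes) / 60 + float(seconds) / 3600


# Durations as reported in this post; autoFRK times are already in hours.
runs = {
    "SSSD + autoFRK": (hms_to_hours("06:20:00"), 2.186092),
    "RegressionEnsemble + autoFRK": (hms_to_hours("0:02:46.558833"), 3.306048),
}

for name, (forecast_h, autofrk_h) in runs.items():
    total_h = forecast_h + autofrk_h  # assumes the two stages run sequentially
    print(f"{name}: {forecast_h:.3f} h + {autofrk_h:.3f} h = {total_h:.3f} h")
```

Note that the SSSD figure is GPU time while the others are CPU time, so the totals are only a rough wall-clock comparison.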


| Metric     | All Locs & All Time | Known Locs & All Time | Unknown Locs & All Time | All Locs & Future | Known Locs & Future | Unknown Locs & Future | All Locs & Past | Known Locs & Past | Unknown Locs & Past |
|------------|---------------------|-----------------------|-------------------------|-------------------|---------------------|-----------------------|-----------------|-------------------|---------------------|
| MSPE       | 2.991342157          | 2.861356059            | 3.776756648              | 3.675732451        | 3.552316007          | 4.421451142            | 2.963966545      | 2.833717661        | 3.750968868          |
| RMSPE      | 1.729549698          | 1.691554332            | 1.943387930              | 1.917219980        | 1.884758872          | 2.102724695            | 1.721617421      | 1.683364982        | 1.936741818          |
| MSPE%      | 0.009989546          | 0.009544637            | 0.012677817              | 0.012557865        | 0.012124315          | 0.015177499            | 0.009886813      | 0.009441449        | 0.012577830          |
| RMSPE%     | 0.099947714          | 0.097696656            | 0.112595814              | 0.112061880        | 0.110110469          | 0.123196995            | 0.099432453      | 0.097167121        | 0.112150925          |
| MAPE       | 0.934706683          | 0.915188150            | 1.052643436              | 1.220234210        | 1.210561085          | 1.278682096            | 0.923285582      | 0.903373233        | 1.043601890          |
| MAPE%      | 0.003105028          | 0.003038683            | 0.003505907              | 0.004144856        | 0.004110203          | 0.004354238            | 0.003063435      | 0.002995822        | 0.003471973          |
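
As a rough illustration of how the six metrics in the results tables could be computed, here is a minimal Python sketch. The definitions are assumptions, not taken from the author's code: MSPE is taken as the mean squared prediction error, RMSPE as its square root (which matches the tabulated values), MAPE as the mean absolute prediction error, and the `%` variants as the same quantities on errors divided by the true value. The function name `prediction_metrics` and the toy values are made up for illustration.

```python
import math


def prediction_metrics(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """Absolute and relative prediction-error summaries (assumed definitions)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    # Relative errors; assumes no true value is zero.
    rel_errors = [e / t for e, t in zip(errors, y_true)]
    mspe = sum(e * e for e in errors) / n
    mspe_rel = sum(r * r for r in rel_errors) / n
    return {
        "MSPE": mspe,
        "RMSPE": math.sqrt(mspe),
        "MSPE%": mspe_rel,
        "RMSPE%": math.sqrt(mspe_rel),
        "MAPE": sum(abs(e) for e in errors) / n,
        "MAPE%": sum(abs(r) for r in rel_errors) / n,
    }


# Toy values (not from the experiment), on a Kelvin-like scale.
y_true = [300.0, 301.0, 299.0, 302.0]
y_pred = [301.0, 300.0, 299.5, 303.5]
metrics = prediction_metrics(y_true, y_pred)

# A column such as "Future" is the same computation on a subset of the points.
is_future = [False, False, True, True]
future_true = [t for t, f in zip(y_true, is_future) if f]
future_pred = [p for p, f in zip(y_pred, is_future) if f]
future_metrics = prediction_metrics(future_true, future_pred)

print(metrics)
print(future_metrics)
```

The nine table columns would then correspond to nine such subsets, formed by crossing the location split (all, known, unknown) with the time split (all, future, past).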














## Closing Remarks
![To be continued!](https://raw.githubusercontent.com/Josh-test-lab/website-assets-repository/refs/heads/main/posts/1140819%20meeting/To%20be%20continued.jpg "To be continued!")



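## Environment
- Local operating system: Windows 11 24H2
  - Programming language: Python 3.12.9
- Compute platform: Taiwan AI Cloud, National Center for High-performance Computing (NCHC), National Applied Research Laboratories
  - Operating system: Ubuntu
  - Miniconda
  - GPU: NVIDIA Tesla V100 32GB GPU
  - CUDA 12.8 driver
  - Programming language: Python 3.10.16 for Linux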

  

## Further Reading
- The [GitHub repository](https://github.com/Josh-test-lab/sssd_cp_learning_and_testing) I used to test this project.




## References

- Global Modeling and Assimilation Office (GMAO). (2015). *MERRA-2 tavg1_2d_flx_Nx: 2d, 1-Hourly, Time-Averaged, Single-Level, Assimilation, Surface Flux Diagnostics* (Version 5.12.4) [Data set]. Goddard Earth Sciences Data and Information Services Center (GES DISC). Retrieved from https://doi.org/10.5067/7MCPBJ41Y0K6

- Juan Lopez Alcaraz & Nils Strodthoff. (2022). Diffusion-based time series imputation and forecasting with structured state space models. *Transactions on Machine Learning Research*. Retrieved from https://openreview.net/forum?id=hHiIbk7ApW

- *SSSD*. (2022). GitHub. Retrieved from https://github.com/AI4HealthUOL/SSSD

- *SSSD_CP*. (2024). GitHub. Retrieved from https://github.com/egpivo/SSSD_CP

- Unit8 SA. (n.d.). *Time Series Made Easy in Python*. Darts. Retrieved from https://unit8co.github.io/darts/index.html

- *darts*. (2025). GitHub. Retrieved from https://github.com/unit8co/darts

- Tzeng, S., & Huang, H. C. (2018). Resolution Adaptive Fixed Rank Kriging. *Technometrics*, 60(2), 198–208. Retrieved from https://doi.org/10.1080/00401706.2017.1345701

- *autoFRK*. (2024). GitHub. Retrieved from https://github.com/egpivo/autoFRK

- Si-An Chen, Chun-Liang Li, Nate Yoder, Sercan O. Arik, & Tomas Pfister. (2023). *TSMixer: An all-MLP architecture for time series forecasting*. arXiv. Retrieved from https://arxiv.org/abs/2303.06053