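I am currently experimenting with a basic LSTM-based autoencoder using the TensorFlow library. The goal is for the autoencoder to reconstruct multivariate time series. I am interested in moving the feature normalization of the data out of the data pipeline and into the model.
Currently I normalize the data as follows: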
normalizer = Normalization(axis=-1)
normalizer.adapt(data_train)
data_train = normalizer(data_train)
inputs = Input(shape=[None, n_inputs])
x = LSTM(4, return_sequences=True)(inputs)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed((Dense(n_inputs)))(x)
model = Model(inputs, x)
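This works as expected and gives a respectable loss (~1e-2), but the normalization happens outside the model. According to the documentation (under "Preprocessing data before the model or inside the model"), the following code should be equivalent to the snippet above, except that it runs inside the model: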
normalizer = Normalization(axis=-1)
normalizer.adapt(data_train)
inputs = Input(shape=[None, n_inputs])
x = normalizer(inputs)
x = LSTM(4, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed((Dense(n_inputs)))(x)
model = Model(inputs, x)
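However, running the latter variant leads to astronomical loss values (~1e3) and also gives worse results in testing. So my question is: what am I doing wrong? Am I misunderstanding the documentation?
Any advice is much appreciated!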
Answer:
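Both approaches appear to give consistent results, as long as the normalizer, when used outside the model, is applied only to the inputs (i.e. the feature matrix):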
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM, TimeDistributed
from tensorflow.keras.models import Model
# TF >= 2.6; older releases expose this layer under layers.experimental.preprocessing
from tensorflow.keras.layers import Normalization
np.random.seed(42)
# define the input parameters
num_samples = 100
time_steps = 10
train_size = 0.8
# generate the data
X = np.random.normal(loc=10, scale=5, size=(num_samples, time_steps, 1))
y = np.mean(X, axis=1) + np.random.normal(loc=0, scale=1, size=(num_samples, 1))
# split the data
# np.int was removed in NumPy 1.24; the built-in int does the same job
n_train = int(train_size * X.shape[0])
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]
# normalize the inputs inside the model
normalizer = Normalization()
normalizer.adapt(X_train)
inputs = Input(shape=[None, 1])
x = normalizer(inputs)
x = LSTM(4, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed((Dense(1)))(x)
model = Model(inputs, x)
model.compile(loss='mae', optimizer='adam')
model.fit(X_train, y_train, batch_size=32, epochs=10, verbose=0)
print(model.evaluate(X_test, y_test))
# 10.704551696777344
# normalize the inputs outside the model
normalizer = Normalization()
normalizer.adapt(X_train)
X_train_normalized = normalizer(X_train)
X_test_normalized = normalizer(X_test)
inputs = Input(shape=[None, 1])
x = LSTM(4, return_sequences=True)(inputs)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(2, return_sequences=True)(x)
x = LSTM(4, return_sequences=True)(x)
x = TimeDistributed((Dense(1)))(x)
model = Model(inputs, x)
model.compile(loss='mae', optimizer='adam')
model.fit(X_train_normalized, y_train, batch_size=32, epochs=10, verbose=0)
print(model.evaluate(X_test_normalized, y_test))
# 10.748750686645508
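The condition above matters for an autoencoder, where the targets are the inputs themselves. The question does not show the fit call, so this is only a guess: if the first variant was trained with the normalized array as both input and target, its loss was measured in normalized units, whereas the second variant compares the output against raw targets. The NumPy-only sketch below uses made-up data and a made-up reconstruction error (none of it comes from the question) to show how the same reconstruction quality gives very different MSE values on the two scales, which may account for at least part of the gap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multivariate series: (samples, time steps, features),
# with features far from zero mean and unit variance.
data = rng.normal(loc=[10.0, 200.0], scale=[5.0, 50.0], size=(1000, 10, 2))

# Per-feature statistics over all other axes, i.e. the quantities that
# Normalization(axis=-1).adapt() estimates from the training data.
mean = data.mean(axis=(0, 1))
std = data.std(axis=(0, 1))
normalized = (data - mean) / std

# Stand-in for an imperfect reconstruction: the normalized signal plus
# an error with standard deviation 0.1 (an arbitrary illustrative value).
error = rng.normal(scale=0.1, size=data.shape)
recon_normalized = normalized + error

# Loss when the targets are normalized too (normalization outside the model).
mse_normalized = np.mean((recon_normalized - normalized) ** 2)

# Loss for the very same reconstruction when it is compared to raw targets
# (normalization inside the model, targets left untouched).
recon_raw = recon_normalized * std + mean
mse_raw = np.mean((recon_raw - data) ** 2)

print(f"MSE in normalized units: {mse_normalized:.4f}")
print(f"MSE in raw units:        {mse_raw:.4f}")
```

If that is what happened, the two loss values are not directly comparable. One option would be to keep the layer inside the model and pass normalized targets to fit, for example normalizer(data_train), so that both variants are scored on the same scale.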