I am building a Siamese network with Keras (TensorFlow), where the target is a binary column, i.e. match or no match (1 or 0). However, the model's fit method throws an error saying that the y_pred shape is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:
Here is the code I am using:
model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(), X_train_entity_2.todense()],
                    np.array(y_train),
                    epochs=2,
                    batch_size=32,
                    verbose=2,
                    shuffle=True)
My input data shapes are as follows:
Inputs:
X_train_entity_1.shape is (700, 2822)
X_train_entity_2.shape is (700, 2822)
Target:
y_train.shape is (700, 1)
In the error it throws, y_pred is a variable created internally. Even though my target is binary, the y_pred dimension is 2822. That 2822 actually matches the input size, but how should I interpret this?
Here is the model I have created:
in_layers = []
out_layers = []
for i in range(2):
    input_layer = Input(shape=(1,))
    embedding_layer = Embedding(embed_input_size + 1, embed_output_size)(input_layer)
    lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
    lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
    in_layers.append(input_layer)
    out_layers.append(lstm_layer_2)
merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()
Since my data is very sparse, I used todense. The types are as follows:
type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
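As an aside, `.todense()` returns a `numpy.matrix`, a legacy type NumPy discourages; the usual alternative is `.toarray()`, which yields a plain `ndarray` that Keras accepts directly. A minimal sketch with a tiny made-up sparse matrix standing in for the real training data:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Tiny made-up sparse matrix standing in for X_train_entity_1.
X_sparse = csr_matrix(np.array([[0.0, 1.0, 0.0],
                                [2.0, 0.0, 0.0]]))

X_dense_matrix = X_sparse.todense()   # numpy.matrix (legacy type)
X_dense_array = X_sparse.toarray()    # plain numpy.ndarray

print(type(X_dense_matrix).__name__)  # matrix
print(type(X_dense_array).__name__)   # ndarray
print(X_dense_array.shape)            # (2, 3)
```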
The last few layers of the model summary are as follows:
Answer from a forum user:
The shape mismatch is in your Input layer. The input shape needs to match the shape of a single element of the x you pass, i.e. dataset.shape[1:]. Since your dataset has shape (700, 2822), that is 700 samples each of size 2822, your input shape should be 2822.
Change:
input_layer = Input(shape=(1,))
to:
input_layer = Input(shape=(2822,))
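The rule above can be checked directly with NumPy; the zero array below is just a stand-in for the real training data. `Input(shape=...)` describes one sample, i.e. everything after the batch dimension:

```python
import numpy as np

# Stand-in for X_train_entity_1: 700 samples, each of size 2822.
X = np.zeros((700, 2822))

# Keras' Input(shape=...) describes a single sample, so it should
# match everything after the batch dimension.
per_sample_shape = X.shape[1:]
print(per_sample_shape)  # (2822,)
```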
Answer from another forum user:
You need to set return_sequences to False in lstm_layer_2:
lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=False, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
Otherwise, you still carry the timestep dimension of the input, which is why you end up with the shape (None, 2822, 1). You could also add a Flatten layer before the output layer, but I would recommend setting return_sequences=False.
Note that a Dense layer computes the dot product between the input and the kernel along the last axis of the input.
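That last point is the key to the (None, 2822, 1) shape: applied to a 3-D sequence, a Dense layer contracts only the last axis and leaves the timestep axis untouched. A minimal NumPy sketch of that behaviour, with made-up sizes and a matmul mimicking what a Dense(1) kernel does:

```python
import numpy as np

batch, timesteps, features = 4, 2822, 32

# Sequence output such as an LSTM with return_sequences=True produces.
x = np.ones((batch, timesteps, features))

# A Dense(1) layer's kernel has shape (features, units).
kernel = np.ones((features, 1))

# The dot product runs along the last axis only, so the timestep
# axis survives -- this is where (None, 2822, 1) comes from.
y = x @ kernel
print(y.shape)  # (4, 2822, 1)

# Keeping only the final timestep (what return_sequences=False
# effectively does) yields the expected (batch, 1) output.
y_last = x[:, -1, :] @ kernel
print(y_last.shape)  # (4, 1)
```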