I am building a Siamese network with Keras (TensorFlow), where the target is a binary column, i.e. match or no match (1 or 0). However, the model's fit method throws an error saying that the y_pred shape is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:
Here is the code I am using:
model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(), X_train_entity_2.todense()],
                    np.array(y_train),
                    epochs=2,
                    batch_size=32,
                    verbose=2,
                    shuffle=True)
My input data shapes are as follows:
Inputs:
X_train_entity_1.shape is (700, 2822)
X_train_entity_2.shape is (700, 2822)
Target:
y_train.shape is (700, 1)
In the error it throws, y_pred is a variable created internally. Even though my target is binary, the y_pred dimension is 2822. That 2822 actually matches the input size, but how should I interpret this?
Here is the model I have created:
in_layers = []
out_layers = []
for i in range(2):
    input_layer = Input(shape=(1,))
    embedding_layer = Embedding(embed_input_size + 1, embed_output_size)(input_layer)
    lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
    lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
    in_layers.append(input_layer)
    out_layers.append(lstm_layer_2)
merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()
Since my data is very sparse, I used todense. The types are as follows:
type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
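As an aside, `.todense()` returns a `numpy.matrix`, a legacy type NumPy discourages; the usual alternative is `.toarray()`, which yields a plain `ndarray` that Keras accepts directly. A minimal sketch with a tiny made-up sparse matrix standing in for the real training data:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Tiny made-up sparse matrix standing in for X_train_entity_1.
X_sparse = csr_matrix(np.array([[0.0, 1.0, 0.0],
                                [2.0, 0.0, 0.0]]))

X_dense_matrix = X_sparse.todense()   # numpy.matrix (legacy type)
X_dense_array = X_sparse.toarray()    # plain numpy.ndarray

print(type(X_dense_matrix).__name__)  # matrix
print(type(X_dense_array).__name__)   # ndarray
print(X_dense_array.shape)            # (2, 3)
```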
The last few layers of the model summary are as follows:
Answer from a forum user:
The shape mismatch is in your Input layer. The input shape needs to match the shape of a single element of the x you pass, i.e. dataset.shape[1:]. Since your dataset has shape (700, 2822), that is 700 samples each of size 2822, your input shape should be 2822.
Change:
input_layer = Input(shape=(1,))
to:
input_layer = Input(shape=(2822,))
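The rule above can be checked directly with NumPy; the zero array below is just a stand-in for the real training data. `Input(shape=...)` describes one sample, i.e. everything after the batch dimension:

```python
import numpy as np

# Stand-in for X_train_entity_1: 700 samples, each of size 2822.
X = np.zeros((700, 2822))

# Keras' Input(shape=...) describes a single sample, so it should
# match everything after the batch dimension.
per_sample_shape = X.shape[1:]
print(per_sample_shape)  # (2822,)
```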
Answer from another forum user:
You need to set return_sequences to False in lstm_layer_2:
lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=False, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
Otherwise, you still carry the timestep dimension of the input, which is why you end up with the shape (None, 2822, 1). You could also add a Flatten layer before the output layer, but I would recommend setting return_sequences=False.
Note that a Dense layer computes the dot product between the input and the kernel along the last axis of the input.
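That last point is the key to the (None, 2822, 1) shape: applied to a 3-D sequence, a Dense layer contracts only the last axis and leaves the timestep axis untouched. A minimal NumPy sketch of that behaviour, with made-up sizes and a matmul mimicking what a Dense(1) kernel does:

```python
import numpy as np

batch, timesteps, features = 4, 2822, 32

# Sequence output such as an LSTM with return_sequences=True produces.
x = np.ones((batch, timesteps, features))

# A Dense(1) layer's kernel has shape (features, units).
kernel = np.ones((features, 1))

# The dot product runs along the last axis only, so the timestep
# axis survives -- this is where (None, 2822, 1) comes from.
y = x @ kernel
print(y.shape)  # (4, 2822, 1)

# Keeping only the final timestep (what return_sequences=False
# effectively does) yields the expected (batch, 1) output.
y_last = x[:, -1, :] @ kernel
print(y_last.shape)  # (4, 1)
```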