[python] Loading a trained Keras model and continuing training

Question 1

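I was wondering whether it is possible to save a partly trained Keras model, then reload it and continue training.

The reason is that I will have more training data in the future, and I do not want to retrain the whole model from scratch.

The functions I am using are: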

# Partly train the model
model.fit(first_training, first_classes, batch_size=32, epochs=20)

# Save the partly trained model
model.save('partly_trained.h5')

# Load the partly trained model
from keras.models import load_model
model = load_model('partly_trained.h5')

# Continue training
model.fit(second_training, second_classes, batch_size=32, epochs=20)

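Edit 1: added a fully working example

On the first dataset, after 10 epochs, the loss of the last epoch is 0.0748 and the accuracy is 0.9863.

After saving, deleting, and reloading the model, the loss and accuracy of the model trained on the second dataset are 0.1711 and 0.9504, respectively.

Is this caused by the new training data, or by the model being completely retrained?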

"""
Model by: http://machinelearningmastery.com/
"""
# load (downloaded if needed) the MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.models import load_model
numpy.random.seed(7)

def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='random_normal', activation='relu'))
    model.add(Dense(num_classes, kernel_initializer='random_normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255
    # one hot encode outputs
    y_train = np_utils.to_categorical(y_train)
    y_test = np_utils.to_categorical(y_test)
    num_classes = y_test.shape[1]

    # build the model
    model = baseline_model()

    #Partly train model
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, epochs=10, batch_size=200, verbose=2)

    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

    #Save partly trained model
    model.save('partly_trained.h5')
    del model

    #Reload model
    model = load_model('partly_trained.h5')

    #Continue training
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, epochs=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

Question 2

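Actually, model.save stores all the information needed to resume training in your case. The only thing that could be lost by reloading the model is the optimizer state. To check this, save the model, reload it, and train it on the training data.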
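To see why the optimizer state matters, here is a minimal pure-Python sketch. It assumes a hand-written scalar Adam update and a toy loss (w - 3)^2; it is not Keras's implementation, and the function names and hyperparameters are illustrative only. It compares the next update when the accumulated moments are kept with the next update when they are reset, as would happen if the state were not restored.

```python
import math

def adam_step(w, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar weight; state is (m, v, t)."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, (m, v, t)

def grad(w):
    """Gradient of the toy loss (w - 3)^2 (an assumed example loss)."""
    return 2.0 * (w - 3.0)

# "Partly train": 50 updates starting from w = 0 with fresh optimizer state.
w, state = 0.0, (0.0, 0.0, 0)
for _ in range(50):
    w, state = adam_step(w, grad(w), state)

# Resume with the saved state versus with a freshly initialized state.
w_kept, _ = adam_step(w, grad(w), state)
w_reset, _ = adam_step(w, grad(w), (0.0, 0.0, 0))

step_kept = w_kept - w
step_reset = w_reset - w
print("step with saved optimizer state:", step_kept)
print("step with reset optimizer state:", step_reset)
```

With the state reset, the first update has a magnitude of about the full learning rate regardless of how small the gradient is, while the update with the saved state is damped by the gradient history. In this toy setting, that is one way a reloaded model can move away from where training left off.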

Question 3

The problem may be that you are using a different optimizer, or different arguments to your optimizer. I had the same issue with a custom pretrained model, using

reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                              patience=patience, min_lr=min_lr, verbose=1)

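for the pretrained model, where the learning rate starts at 0.0003 and is reduced during pre-training to the minimum learning rate (min_lr), 0.000003.

I simply copied that line into the script that uses the pretrained model and got very poor accuracy, until I noticed that the last learning rate of the pretrained model was the minimum learning rate, i.e. 0.000003. If I start with that learning rate, I get exactly the same initial accuracy as the output of the pretrained model. That makes sense: starting with a learning rate 100 times larger than the last one used in pre-training makes gradient descent overshoot badly, so accuracy drops sharply.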
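The overshoot can be illustrated with a toy example. The sketch below assumes a one-dimensional quadratic loss with a hypothetical curvature of 10000 (chosen only so that the effect is visible) and plain gradient descent; it is not the model from this answer. Only the two learning rates, 0.0003 and 0.000003, come from the text.

```python
CURVATURE = 10_000.0  # assumed steepness of the toy loss, not from the answer

def loss(w):
    """Toy loss 0.5 * CURVATURE * w^2, with its minimum at w = 0."""
    return 0.5 * CURVATURE * w * w

def gradient_descent(w, lr, steps):
    """Run plain gradient descent on the toy loss for a number of steps."""
    for _ in range(steps):
        w = w - lr * CURVATURE * w  # gradient of the toy loss is CURVATURE * w
    return w

w_pretrained = 0.01  # stands in for a model that is already near its minimum

# Resume with the last learning rate used during pre-training.
w_small = gradient_descent(w_pretrained, lr=0.000003, steps=5)
# Resume with the original learning rate, 100 times larger.
w_large = gradient_descent(w_pretrained, lr=0.0003, steps=5)

loss_start = loss(w_pretrained)
loss_small_lr = loss(w_small)
loss_large_lr = loss(w_large)
print("loss before resuming:     ", loss_start)
print("loss after resuming, 3e-6:", loss_small_lr)
print("loss after resuming, 3e-4:", loss_large_lr)
```

With the small learning rate each step shrinks the weight slightly toward the minimum; with the large one each step jumps past the minimum and lands farther away, so the loss grows instead of shrinking.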

Question 4

Most of the answers above cover the important points. If you are using a recent version of TensorFlow (TF 2.1 or above), the following example should help. The model part of the code is taken from the TensorFlow website.

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])

  model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
  return model

# Create a basic model instance
model=create_model()
model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

Save the model in *.tf format. In my experience, if a custom_loss is defined, the *.h5 format does not save the optimizer state, so it will not serve your purpose if you want to resume training from where you left off.

# saving the model in tensorflow format
model.save('./MyModel_tf',save_format='tf')


# loading the saved model
loaded_model = tf.keras.models.load_model('./MyModel_tf')

# retraining the model
loaded_model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

This approach resumes training from where it stopped before the model was saved. As others have mentioned, if you want to save the weights of the best model, or the weights at every epoch, use the Keras callback ModelCheckpoint with options such as save_weights_only=True, save_freq='epoch', and save_best_only.

For more details, check here, and see another example here.

Question 5

Note that Keras sometimes has issues with loaded models, as described here. This may explain cases where training does not start from the same accuracy the model had already reached.

Question 6

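All of the above helps. You must also resume from the same learning rate (LR) that was in use when the model and weights were saved. Set it directly on the optimizer.

Note that improvement from there is not guaranteed, because the model may have reached a local minimum, which may even be the global one. There is no point in resuming a model in order to search for another local minimum, unless you intend to increase the learning rate in a controlled fashion and nudge the model into a possibly better minimum not far away.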

Question 7

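You might also be hitting concept drift; see "Should you retrain a model when new observations are available". There is also the concept of catastrophic forgetting, which a number of academic papers discuss. Here is one that uses MNIST: "Empirical investigation of catastrophic forgetting".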
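As a rough illustration of forgetting, the sketch below trains a one-parameter linear model first on one dataset and then only on a second dataset with a different input-output relationship. The datasets, slopes, and hyperparameters are invented for the example, and a single weight is far simpler than a neural network, so treat it as an analogy rather than a reproduction of the results in the question.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w, data, lr=0.05, epochs=200):
    """Full-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

xs = [0.5, 1.0, 1.5, 2.0]
first_data = [(x, 2.0 * x) for x in xs]   # first task: slope 2.0 (assumed)
second_data = [(x, 0.5 * x) for x in xs]  # second task: slope 0.5 (assumed)

w = train(0.0, first_data)
loss_first_before = mse(w, first_data)

# Continue training on the second dataset only.
w = train(w, second_data)
loss_first_after = mse(w, first_data)
loss_second_after = mse(w, second_data)

print("loss on first data before continuing:", loss_first_before)
print("loss on first data after continuing: ", loss_first_after)
print("loss on second data after continuing:", loss_second_after)
```

After the second round the model fits the second dataset and no longer fits the first, because nothing in the continued training keeps it close to the earlier solution. Mixing some of the earlier data into later training runs is one common way to limit this effect.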