3.3.1 TensorFlow for large datasets
Since the MNIST dataset is small and quick to process, it will not trigger GPU out-of-memory (OOM) errors. Real-world datasets, however, are often much larger and can crash the program with a GPU OOM error.
A general approach is to feed data to the GPU one batch at a time. Although "model.fit" and "model.predict" accept a "batch_size" argument, TensorFlow by default reserves far more GPU memory than a single batch needs, so the risk of GPU OOM remains. Hence, please add the following code to limit GPU memory usage:
1. At the beginning of main.py
    import tensorflow as tf

    # Allocate GPU memory on demand instead of claiming it all at startup.
    gpus = tf.config.experimental.list_physical_devices(device_type="GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
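If you would rather reserve a fixed amount of GPU memory than grow it on demand, the sketch below shows an alternative setup. The 4096 MB cap is an assumed value, not part of the steps above; adjust it to your GPU.

    import tensorflow as tf

    # Alternative setup (assumed, not from the steps above): cap each GPU at a
    # fixed 4096 MB instead of growing memory on demand. This must run before
    # TensorFlow initializes the GPU, i.e. at the very top of main.py.
    # Older TF 2.x releases expose the same feature as
    # tf.config.experimental.set_virtual_device_configuration.
    gpus = tf.config.experimental.list_physical_devices(device_type="GPU")
    for gpu in gpus:
        tf.config.set_logical_device_configuration(
            gpu,
            [tf.config.LogicalDeviceConfiguration(memory_limit=4096)],
        )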
2. At the end of data preparation
    # After xTrain, yTrain, xTest, yTest have been prepared
    # and batch_size has been set.
    with tf.device("CPU"):
        trainset = tf.data.Dataset.from_tensor_slices(
            (xTrain, yTrain)
        ).shuffle(8192).batch(batch_size)
        testset = tf.data.Dataset.from_tensor_slices(
            (xTest, yTest)
        ).shuffle(8192).batch(batch_size)
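As an optional refinement (not part of the original steps), tf.data can prepare the next batch on the CPU while the GPU is still busy with the current one. The snippet below assumes a recent TF 2.x release where tf.data.AUTOTUNE is available; older releases use tf.data.experimental.AUTOTUNE instead.

    # Optional (assumed refinement): overlap CPU-side batch preparation with
    # GPU computation by prefetching.
    trainset = trainset.prefetch(tf.data.AUTOTUNE)
    testset = testset.prefetch(tf.data.AUTOTUNE)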
3. Use a for loop to train and validate on batches
Before:

    model.fit(xTrain, yTrain, epochs=1, batch_size=batch_size)
    model.evaluate(xTest, yTest, batch_size=batch_size)

After:

    # Each element of trainset/testset is already one batch,
    # so no batch_size argument is needed inside the loop.
    for xTrainBatch, yTrainBatch in trainset:
        model.fit(xTrainBatch, yTrainBatch, epochs=1)
    for xTestBatch, yTestBatch in testset:
        model.evaluate(xTestBatch, yTestBatch)
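For reference, here is a minimal, self-contained sketch that ties the three steps together. The small Sequential model, the random placeholder arrays, and the epoch count are assumptions made only so the sketch runs on its own; substitute your actual model and data.

    # Minimal sketch combining steps 1-3 (assumed model and placeholder data).
    import numpy as np
    import tensorflow as tf

    # Step 1: grow GPU memory on demand instead of grabbing it all at once.
    gpus = tf.config.experimental.list_physical_devices(device_type="GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)

    # Placeholder data so the sketch runs on its own (replace with real data).
    xTrain = np.random.rand(1024, 28, 28).astype("float32")
    yTrain = np.random.randint(0, 10, size=(1024,))
    xTest = np.random.rand(256, 28, 28).astype("float32")
    yTest = np.random.randint(0, 10, size=(256,))
    batch_size = 64

    # Step 2: keep the full arrays on the CPU and stream batches to the GPU.
    with tf.device("CPU"):
        trainset = tf.data.Dataset.from_tensor_slices(
            (xTrain, yTrain)
        ).shuffle(8192).batch(batch_size)
        testset = tf.data.Dataset.from_tensor_slices(
            (xTest, yTest)
        ).shuffle(8192).batch(batch_size)

    # A small model, assumed only for illustration.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Step 3: train and evaluate one batch at a time.
    epochs = 3
    for epoch in range(epochs):
        for xTrainBatch, yTrainBatch in trainset:
            model.fit(xTrainBatch, yTrainBatch, epochs=1, verbose=0)
        for xTestBatch, yTestBatch in testset:
            model.evaluate(xTestBatch, yTestBatch, verbose=0)

Because each call to model.fit here sees only one batch, verbose=0 keeps Keras from printing a progress bar for every batch; remove it if you want per-batch output.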