Tutorial 1
This is a tutorial for developing a basic autoencoder using Python and Keras. The tutorial on this page follows from: https://blog.keras.io/building-autoencoders-in-keras.html
Overview
(From the link mentioned above:) "Autoencoding" is a data compression algorithm where the compression and decompression functions are:
- Data-specific
- Lossy, and
- Learned automatically from examples rather than engineered by a human
Additionally, in almost all contexts where the term "autoencoder" is used, the compression and decompression functions are implemented with neural networks. This tutorial will focus on building two types of autoencoders:
- A deep autoencoder
- A convolutional autoencoder
Deep autoencoder for grayscale images
To build a deep autoencoder for grayscale images, we will follow these steps:
- Import required packages
- Create the structure of the autoencoder
- Import the data and fit our newly built autoencoder to it
- Plot our test data
Importing the required packages
This tutorial requires Keras, plus NumPy, which is used later to reshape the image data. From Keras, we will import the Input function and the Dense and Model classes.
from keras.layers import Input, Dense
from keras.models import Model
The Input function takes a shape tuple as its argument and creates a symbolic tensor of that shape. Dense specifies a fully connected layer of neurons and the activation function they use. Model groups a chain of layers, from a given input to a given output, so that we can train and use them together as a single model.
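To make the role of a Dense layer more concrete, here is a minimal NumPy sketch of the computation such a layer performs: a matrix multiplication, a bias addition, and an activation. It is not Keras code; the names dense_forward and relu, the random weights, and the batch of 4 fake images are all assumptions made for illustration.

```python
import numpy as np


def relu(z):
    # ReLU keeps positive values and replaces negative ones with zero.
    return np.maximum(z, 0.0)


def dense_forward(x, weights, bias, activation):
    # x: (batch, n_in), weights: (n_in, n_out), bias: (n_out,)
    return activation(x @ weights + bias)


rng = np.random.default_rng(0)
x = rng.random((4, 784)).astype("float32")  # 4 fake flattened 28x28 images
weights = rng.normal(0.0, 0.05, (784, 128)).astype("float32")  # illustrative values
bias = np.zeros(128, dtype="float32")

hidden = dense_forward(x, weights, bias, relu)
print(hidden.shape)  # (4, 128)
```

In Keras the weights and bias are created and trained for us; the sketch only shows how an input of 784 values per image becomes an output of 128 values per image.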
Creating the structure of the Autoencoder
We will use the aforementioned Dense and Model to create the structure of the neural network. Let us create an autoencoder for the MNIST dataset (http://yann.lecun.com/exdb/mnist/). This is a dataset of 70,000 28x28 grayscale images of handwritten digits (60,000 for training and 10,000 for testing). Since each image has 28x28 = 784 pixels, our autoencoder must have 784 neurons in its input layer.
input_img = Input(shape=(784,))
We now have an input layer with 784 neurons. Next, we construct the hidden layers.
Let us choose an architecture in which the number of neurons decreases gradually from the input layer through 128 and 64, down to 32 neurons at the "waist" of the autoencoder. Since we also need to decode the image from these 32 values back to 784 pixels, the decoder mirrors the encoder. Thus, the architecture is 784 ---> 128 ---> 64 ---> 32 (waist) ---> 64 ---> 128 ---> 784. Dense takes the number of neurons in the new layer and the activation function for those neurons; calling the resulting layer on the output of the previous layer connects the two.
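As a rough check on the size of this architecture, the following sketch counts the trainable parameters of each fully connected layer (one weight per input-output pair plus one bias per output neuron) and the compression factor at the waist. The helper name dense_param_count is an assumption for illustration, not part of Keras.

```python
# Layer widths of the architecture described above, input to output.
layer_sizes = [784, 128, 64, 32, 64, 128, 784]


def dense_param_count(n_in, n_out):
    # A fully connected layer has n_in * n_out weights and n_out biases.
    return n_in * n_out + n_out


per_layer = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
compression = layer_sizes[0] / min(layer_sizes)

print(per_layer)    # [100480, 8256, 2080, 2112, 8320, 101136]
print(total)        # 222384
print(compression)  # 24.5
```

Most of the parameters sit in the first and last layers, and each image is squeezed to 32 values, about 24.5 times fewer than the 784 input pixels.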
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
encoded = Dense(32, activation='relu')(encoded)
decoded = Dense(64, activation='relu')(encoded)
decoded = Dense(128, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
Finally, Model creates a container that lets us address, as a single unit, the neural network that begins at the input_img layer and ends at the decoded layer. The compile function configures the model for training: here it selects the adadelta optimizer and the binary_crossentropy loss function.
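To give a feel for what the binary_crossentropy loss measures, here is a simplified pure-Python sketch of the per-pixel formula, averaged over the pixels of one image. It is an illustration under assumptions, not the Keras implementation: the function name, the clipping constant eps, and the four-pixel example values are all made up for this sketch.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Average of -(t*log(p) + (1-t)*log(1-p)) over all pixels.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)


target = [0.0, 1.0, 0.5, 1.0]  # hypothetical normalized pixel values
good = [0.1, 0.9, 0.5, 0.8]    # reconstruction close to the target
bad = [0.9, 0.1, 0.5, 0.2]     # reconstruction far from the target

print(binary_crossentropy(target, good) < binary_crossentropy(target, bad))  # True
```

The loss is smaller when the reconstructed pixels are closer to the original ones, which is why the sigmoid output layer (values in [0, 1]) pairs naturally with inputs normalized to [0, 1].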
Importing data and fitting it to our autoencoder
Within the datasets module of keras, we have the option to import the MNIST dataset. For this, all we have to do is:
from keras.datasets import mnist
Next we load the dataset, which comes already split into a training set and a test set. We use the load_data() function of the mnist module. The labels are not needed to train an autoencoder, so we discard them by assigning them to _.
(x_train, _), (x_test, _) = mnist.load_data()
In order to provide the data to the neural network in the right fashion, it is important for us to perform three main steps:
- Convert the data type from integer to float.
- Normalize the inputs by mapping pixel values from [0, 255] to [0, 1].
- Reshape the data from (x, 28, 28) (that is, x images of 28x28 pixels) to (x, 784).
Fortunately, the NumPy arrays returned by load_data() provide methods that cover all three steps.
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
We have now converted the datatype to float and divided all the values by the maximum, which is 255.
import numpy as np
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
The np.prod function multiplies the image dimensions together (28 x 28 = 784), and the reshape function uses this value to convert the data from shape (x, 28, 28) to shape (x, 784).
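The three preprocessing steps can be tried without downloading MNIST. The sketch below applies them to a small random array that only mimics the shape and data type of the real images; the name fake_images and the batch of 5 images are assumptions for illustration.

```python
import numpy as np

# Stand-in for MNIST: 5 random 28x28 images with integer pixels in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Steps 1 and 2: convert to float and scale to [0, 1].
scaled = fake_images.astype("float32") / 255.0

# Step 3: flatten each 28x28 image into a vector of 784 values.
flat = scaled.reshape((len(scaled), np.prod(scaled.shape[1:])))

print(fake_images.shape, fake_images.dtype)  # (5, 28, 28) uint8
print(flat.shape, flat.dtype)                # (5, 784) float32
```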
To fit the data, we call the fit function of the Model class. Optionally, we can also log the training and validation loss for each epoch using the TensorBoard callback from keras.callbacks, so that it can be viewed as a graph.
from keras.callbacks import TensorBoard
autoencoder.fit(x_train, x_train,
                epochs=100,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir=<path to where the logs must be stored>)])
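Here we train our autoencoder for 100 epochs with a batch size of 256, shuffling the training data before each epoch. Because an autoencoder learns to reproduce its input, x_train serves as both the input and the target, and x_test plays the same double role for validation. The logs are stored in the specified path. TensorBoard reads this log data and plots it.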
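To get a sense of how much work these settings imply, the following sketch estimates the number of weight updates performed during training. It assumes the standard MNIST training split of 60,000 images and that the final, smaller batch of each epoch is still processed; the variable names are illustrative.

```python
import math

n_train = 60000   # assumed size of the MNIST training set
batch_size = 256
epochs = 100

# Each epoch walks through the training set once, one batch per update.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 235
print(total_updates)    # 23500
```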