Tutorial 1

This is a tutorial for developing a basic autoencoder using Python and Keras. It follows the tutorial at: https://blog.keras.io/building-autoencoders-in-keras.html

Overview

(From the link mentioned above:) "Autoencoding" is a data compression algorithm where the compression and decompression functions are:

  • Data-specific
  • Lossy, and
  • Learned automatically from examples rather than engineered by a human

Additionally, in almost all contexts where the term "autoencoder" is used, the compression and decompression functions are implemented with neural networks. This tutorial will focus on building two types of autoencoders:

  • A deep autoencoder
  • A convolutional autoencoder

Deep autoencoder for grayscale images

In order to build a deep autoencoder for grayscale images, these are the steps we will follow:

  • Import required packages
  • Create the structure of the autoencoder
  • Import data and fit it to our newly built autoencoder
  • Plot our test data

Importing the required packages

This tutorial requires only one package: keras. From keras, we will import the Input, Dense, and Model functions. We will also import numpy, which we need later when reshaping the data.

from keras.layers import Input, Dense   # layer constructors
from keras.models import Model          # wraps layers into a trainable model
import numpy as np                      # needed later to reshape the data

The Input function takes a shape tuple as its argument and creates a tensor based on that. The Dense function is used to specify a layer of neurons and the activation function they possess. The Model class combines different layers and allows us to address several layers together as one model.
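
To illustrate how these three pieces fit together, here is a toy sketch that is separate from the autoencoder we are about to build (the names x, y, and tiny are ours):

x = Input(shape=(4,))               # a tensor for 4-dimensional inputs
y = Dense(2, activation='relu')(x)  # a layer of 2 neurons applied to x
tiny = Model(x, y)                  # addresses both layers as one model
tiny.summary()                      # prints the resulting structure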

Creating the structure of the Autoencoder

We will be using the aforementioned Dense and Model functions to create the structure of the neural net. Let us create an autoencoder for the MNIST dataset (http://yann.lecun.com/exdb/mnist/). This is a dataset of thousands of 28x28 grayscale pictures of handwritten digits. Since we have 28x28 = 784 pixels, our autoencoder must have 784 neurons in its input layer.

input_img = Input(shape=(784,))

We now have an input layer with 784 neurons. Next, we construct the hidden layers.

Let us choose an architecture where, starting from the input layer, the number of neurons gradually decreases through 128 and 64 until, at the "waist" of the autoencoder, we have 32 neurons. Since we also need to decode the image from 32 values back to 784, the decoder progresses back through the same numbers of neurons. Thus, the architecture is 784 ---> 128 ---> 64 ---> 32 (waist) ---> 64 ---> 128 ---> 784. Each Dense call takes the number of neurons in the new layer and the activation function for those neurons, and is applied to the previous layer, which is how consecutive layers are connected.

encoded = Dense(128, activation='relu')(input_img)   # 784 -> 128
encoded = Dense(64, activation='relu')(encoded)      # 128 -> 64
encoded = Dense(32, activation='relu')(encoded)      # waist: 64 -> 32

decoded = Dense(64, activation='relu')(encoded)      # 32 -> 64
decoded = Dense(128, activation='relu')(decoded)     # 64 -> 128
decoded = Dense(784, activation='sigmoid')(decoded)  # 128 -> 784

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')


Finally, the Model function creates a container that lets us collectively address the neural network that begins at the input_img layer and ends at the decoded layer. The compile function sets the optimizer (here adadelta) and designates binary_crossentropy as the loss function to be used.
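
Optionally, we can verify the structure we just built; summary() is a standard method on Keras models that prints one row per layer along with its output shape and parameter count:

autoencoder.summary()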

Importing data and fitting it to our autoencoder

Within the datasets module of keras, we have the option to import the MNIST dataset. For this, all we have to do is:

from keras.datasets import mnist

Next we have to split the dataset into a training set and a test set. We will make use of the load_data() function provided by the mnist module.

(x_train, _), (x_test, _) = mnist.load_data()
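
The underscores discard the digit labels that load_data() also returns, since an autoencoder only reconstructs its inputs and never uses labels. A quick check of the shapes reflects the standard MNIST split of 60,000 training and 10,000 test images:

print(x_train.shape)  # (60000, 28, 28)
print(x_test.shape)   # (10000, 28, 28)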

In order to provide the data to the neural network in the right form, we must perform three main steps:

  • Convert the datatype of the data to float.
  • Normalize the inputs to the neural network by mapping the range [0,255] to [0,1].
  • Reshape the data from its current shape (x, 28, 28) (that is, x instances of 28x28 images) to (x, 784).

Fortunately, for the three aforementioned steps, we have predefined functions that make our lives easier.

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

We have now converted the datatype to float and divided all the values by the maximum, which is 255.
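
Before reshaping, a quick optional check confirms the new datatype and value range:

print(x_train.dtype)                 # float32
print(x_train.min(), x_train.max())  # 0.0 1.0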

x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

The np.prod function multiplies the trailing dimensions together (28 x 28 = 784), and the reshape function flattens each image accordingly, converting the (x, 28, 28) array into shape (x, 784).
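
Checking the shapes once more confirms that each image is now a flat 784-element vector:

print(x_train.shape)  # (60000, 784)
print(x_test.shape)   # (10000, 784)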

In order to fit the data, we call the fit function on the Model. Optional: if needed, we can also log the training error versus iterations as a graph using the TensorBoard callback from keras.callbacks.

from keras.callbacks import TensorBoard
autoencoder.fit(x_train, x_train,   # input and target are the same images
                epochs=100,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir=<path to where the logs must be stored>)])  # fill in a real path

Here we train our autoencoder for 100 epochs, shuffling the data each epoch and validating against the test set. The logs are stored at the specified path, and TensorBoard reads this log data and plots it.
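
Once training completes, a quick way to sanity-check the autoencoder is to reconstruct the test images with predict, a standard Model method (decoded_imgs is our own name for the result):

decoded_imgs = autoencoder.predict(x_test)
print(decoded_imgs.shape)  # (10000, 784): reconstructions match the input shape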