Autoencoders – Part 1

Dawson Metzger-Fleetwood

Visualization of autoencoder architecture


The Basics

An autoencoder is a neural network that does two things:
  • It compresses its input data into a lower dimension
  • It attempts to reconstruct the original input from the lower dimensional data
The formula for reconstruction error
If you have experience with machine learning, this formula may look familiar: it’s the formula for SSE (sum of squared errors), but instead of using y for the target output you use x, because for an autoencoder the target output of the model is the input itself.
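As a minimal sketch of that calculation, assuming the input and its reconstruction are NumPy arrays of the same shape (the function name `reconstruction_error` and the sample values are made up for illustration):

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Sum of squared errors between an input x and its reconstruction x_hat."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return float(np.sum((x - x_hat) ** 2))

# Hypothetical input and an imperfect reconstruction of it
x = np.array([0.0, 0.5, 1.0])
x_hat = np.array([0.0, 0.0, 1.0])

error = reconstruction_error(x, x_hat)   # only the middle value differs
perfect = reconstruction_error(x, x)     # a perfect reconstruction has zero error
print(error, perfect)
```

The error is zero only when the reconstruction matches the input exactly, and it grows as the two drift apart.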


The Encoder and Decoder

The first part of the network is called the encoder. Its job is to create a lower dimensional representation of the input data that can be “passed through” the middle layer. The values of this middle layer are that lower dimensional representation.

The two main terms for this lower dimensional representation that I will use throughout this article are “embedding” and “latent space representation.”

Visualization of encoder architecture

The second part of the network is the decoder. Its job is to take the embedding and reconstruct the original input from it.

To reconstruct the input effectively, the information from the input must pass through the middle layer. Training the neural network to minimize the reconstruction error therefore forces it to find an efficient lower dimensional representation of the input data.

Visualization of decoder architecture
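The encoder and decoder roles described above can be sketched with a toy example. This is not the network from this article: it assumes a single linear layer for each half, plain gradient descent, and made-up 3D data that lies along one direction, so that a 1-value embedding is enough. All names (`encode`, `decode`, `w_enc`, `w_dec`, `mean_sse`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 3 dimensions that all lie along one direction,
# so a 1-dimensional embedding is enough to describe them.
direction = np.array([1.0, 2.0, -1.0])
x = rng.normal(size=(200, 1)) * direction      # shape (200, 3)

# Encoder and decoder are each one linear layer (no bias, no activation).
w_enc = rng.normal(scale=0.1, size=(3, 1))     # 3 inputs -> 1 embedding value
w_dec = rng.normal(scale=0.1, size=(1, 3))     # 1 embedding value -> 3 outputs

def encode(x):
    return x @ w_enc

def decode(z):
    return z @ w_dec

def mean_sse(x):
    """Average per-sample sum of squared reconstruction errors."""
    return float(np.mean(np.sum((x - decode(encode(x))) ** 2, axis=1)))

initial_error = mean_sse(x)
learning_rate = 0.01
for _ in range(2000):
    z = encode(x)
    residual = decode(z) - x                   # reconstruction minus input
    # Gradients of the mean SSE with respect to each weight matrix
    grad_dec = 2 * z.T @ residual / len(x)
    grad_enc = 2 * x.T @ (residual @ w_dec.T) / len(x)
    w_enc -= learning_rate * grad_enc
    w_dec -= learning_rate * grad_dec
final_error = mean_sse(x)

print(encode(x).shape, decode(encode(x)).shape)
print(initial_error, final_error)
```

Every reconstruction has to be built from the single value the encoder outputs, so the error can only fall if that value captures where each point sits along the line.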


Lower Dimensional Subspaces

A 3D scatterplot of random points
Figure 1
A 3D scatterplot of points arranged in a spiral
In our airplane dataset example, you can think of Figure 1 as representing all possible images. This figure represents only images of airplanes.
A math formula that maps from one dimension to three
A math formula that maps between one and three dimensions
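The article's own formulas are not reproduced here, so the following is only an illustration of the idea. It assumes a simple helix as the spiral: one number t determines a point in 3D, and that point can be mapped back to t. The function names `to_3d` and `to_1d` are hypothetical.

```python
import numpy as np

def to_3d(t):
    """Map a 1D value t to a point on a spiral (helix) in 3D."""
    t = np.asarray(t, dtype=float)
    return np.stack([np.cos(t), np.sin(t), t], axis=-1)

def to_1d(points):
    """Recover t from a point on the helix; here it is just the z coordinate."""
    return np.asarray(points)[..., 2]

t = np.linspace(0.0, 4 * np.pi, 100)   # 100 one-dimensional values
points = to_3d(t)                      # the same data written with 3 coordinates
recovered = to_1d(points)              # back to one coordinate per point
print(points.shape, np.allclose(recovered, t))
```

Although each point is stored with three coordinates, one number is enough to describe it, which is the sense in which the spiral is a lower dimensional subspace of the 3D space.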


Training an Autoencoder

import keras
from keras.datasets import mnist, fashion_mnist
from keras.layers import Dense
from keras.models import Model
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import ImageGrid
# Load the data
(x_train, _), (x_test, _) = mnist.load_data()
num_pixels = x_train.shape[1] * x_train.shape[2]
# Flatten and normalize the data
x_train = x_train.reshape(x_train.shape[0], num_pixels).astype('float32') / 255
x_test = x_test.reshape(x_test.shape[0], num_pixels).astype('float32') / 255
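# Add a single Fashion-MNIST image to the training set as an outlier
(fashion_x_train, _), (fashion_x_test, _) = fashion_mnist.load_data()
fashion_x_train = fashion_x_train[:1]  # Let's use 1 outlier
# Flatten and normalize the outlier the same way as the MNIST digits,
# otherwise its shape (1, 28, 28) cannot be concatenated with x_train
fashion_x_train = fashion_x_train.reshape(fashion_x_train.shape[0], num_pixels).astype('float32') / 255
x_train = np.concatenate((x_train, fashion_x_train), axis=0)
np.random.shuffle(x_train)  # Shuffling the dataset so we don't know where the outlier is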
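# Build the autoencoder: 784 -> 128 -> 32 -> 8 -> 32 -> 128 -> 784
autoencoder = keras.Sequential()
# Encoder: compress the input down to the 8-value embedding
autoencoder.add(Dense(128, activation='relu', input_shape=(num_pixels,)))
autoencoder.add(Dense(32, activation='relu'))
autoencoder.add(Dense(8, activation='sigmoid'))
# Decoder: reconstruct the input from the embedding
autoencoder.add(Dense(32, activation='relu'))
autoencoder.add(Dense(128, activation='relu'))
autoencoder.add(Dense(num_pixels, activation='sigmoid'))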
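print("Total parameters:", autoencoder.count_params())
# Assumption: the compile step is missing from this listing. Adam with mean
# squared error (SSE scaled by a constant) is used here so that fit() can run.
autoencoder.compile(optimizer='adam', loss='mse')
# The input is also the target: the model learns to reproduce x_train
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))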
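# Split the trained network into its encoder and decoder halves
encoder = keras.Sequential(autoencoder.layers[:3])
decoder = keras.Sequential(autoencoder.layers[3:])
# Encode the test images, then reconstruct them from their embeddings
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)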
def display(images, padding=1, figsize=(8, 4)):
    """Show a row of (image, title) pairs; each image is a flat array."""
    fig = plt.figure(figsize=figsize)
    grid = ImageGrid(fig, 111, nrows_ncols=(1, len(images)), axes_pad=padding)

    for i, (image, title) in enumerate(images):
        if image.size == 8:
            # 8-value embedding: draw it as a single column
            reshaped_image = image.reshape(8, 1)
        elif image.size == 784:
            # Flattened 28x28 image
            reshaped_image = image.reshape(28, 28)
        else:
            raise ValueError("Image size not supported")

        ax = grid[i]
        ax.imshow(reshaped_image, cmap='gray')
        ax.set_title(title)
        ax.axis('off')

    plt.show()
image_index = 21  # You can change this to view different images

images = [
    (x_test[image_index], "Original Image"),
    (encoded_imgs[image_index], "Encoded Image"),
    (decoded_imgs[image_index], "Decoded Image"),
]

display(images)
An image displaying model layer inputs and outputs
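The total printed by `autoencoder.count_params()` can be checked by hand. The sketch below assumes the layer widths from the model definition above (784, 128, 32, 8, 32, 128, 784) and that each Dense layer has one weight per input-unit pair plus one bias per unit; the helper name `dense_params` is made up for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

# Input width followed by the width of each Dense layer in the model
layer_sizes = [784, 128, 32, 8, 32, 128, 784]

# Pair each layer's input width with its output width
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)   # [100480, 4128, 264, 288, 4224, 101136]
print("Total parameters:", total)   # Total parameters: 210520
```

Almost all of the parameters sit in the first and last layers, which connect the 784 pixels to the 128-unit layers; the 8-value middle layer itself is tiny.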