Digit Classifier: Neural Network Architecture
The following link takes you to a GitHub repository containing all the material for this course. It’s assumed that you have a copy of this.
Introduction
In this lab, you’ll implement part of the neural network class that we will use for digit classification. This lab is meant to let you put what you learned in the first chapter into practice.
Background - Data
The MNIST dataset is the machine learning equivalent of writing a “Hello, World” program when learning to code. MNIST consists of 28×28 pixel grayscale images of handwritten digits (0-9). To feed these images into our neural network, each 28×28 image is flattened into a 784-dimensional vector.
To match what the network should ideally output, each digit label is one-hot encoded as a 10-dimensional vector with a “1” at the index corresponding to the digit and “0” everywhere else. For example, the digit 2 would be encoded as:
y = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
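In NumPy, the flattening and the one-hot encoding above might look like the following sketch (the one_hot helper is illustrative, not part of the provided code):

```python
import numpy as np

# A 28x28 grayscale image flattens to a 784-dimensional column vector.
image = np.zeros((28, 28))
x = image.reshape(784, 1)

# One-hot encoding: a 1 at the digit's index, 0 everywhere else.
def one_hot(digit):
    y = np.zeros((10, 1))
    y[digit] = 1.0
    return y

y = one_hot(2)  # matches the encoding of the digit 2 shown above
```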
Understanding
Let’s take a look at neural-network.ipynb.
First, we have a class called NeuralNetwork. We’ll be implementing part of this class in this lab.
Inside the main function in the code provided, you’ll also find three variables: training, validation, and testing. For this first lab, we’re only going to use the testing data to evaluate the performance of the neural network (which won’t be much better than random for now). The dataset is parsed as an array of tuples that contain image-label pairs:
testing = [(image_0, label_0), ..., (image_n-1, label_n-1)]
We then initialize a neural network, digit_classifier. We pass the list [784, 15, 15, 10] to specify that the network takes an input of 784 pixels, processes it through two hidden layers of 15 neurons each, and finally returns a prediction of which digit it is. The choice of hidden layers and neuron counts is arbitrary; feel free to experiment with them.
You can ignore digit_classifier.stochastic_gradient_descent(training, 10, 30, 3, testing) for now. It won’t do anything yet; we’ll implement it in the next lab.
At the end of main, we test the accuracy of our neural network’s predictions. They will be terrible for now.
Neural Network Class Specification
1. Complete the __init__ Method
- The method should initialize the neural network structure based on the input list sizes, where each entry represents the number of neurons in a layer.
- The method should store the number of layers in layer_count and the sizes list itself.
- The weights and biases should be initialized by sampling from a normal distribution. The reason for this will be explained in Chapter 3.
- You may assume that the sizes list contains at least two values.
2. Complete the activation_function Method
- This method should return the result of applying the sigmoid to the input z, which may be a scalar or a NumPy array.
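A minimal sigmoid sketch; because np.exp broadcasts elementwise, the same expression handles both scalars and arrays:

```python
import numpy as np

def activation_function(z):
    # Sigmoid: 1 / (1 + e^(-z)), applied elementwise.
    return 1.0 / (1.0 + np.exp(-z))
```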
3. Complete the activation_function_derivative Method
- This method should return the result of applying the sigmoid derivative to the input z, which may be a scalar or a NumPy array.
- This method will be used in the next lab.
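The sigmoid derivative has a convenient closed form, sigma'(z) = sigma(z) * (1 - sigma(z)), so one possible sketch reuses the sigmoid itself:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def activation_function_derivative(z):
    # sigma'(z) = sigma(z) * (1 - sigma(z)), applied elementwise.
    s = sigmoid(z)
    return s * (1.0 - s)
```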
4. Complete the feed_forward Method
- This method takes an input vector a.
- This method returns the output of the neural network.
- The method should pass the input through each layer of the network, applying weights, biases, and the activation function at each step.
- You may assume that the input size matches the size of the input layer.
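The loop at the heart of feed_forward can be sketched as follows. It is written here as a free function over explicit weights and biases lists so it stands alone; inside the class it would iterate over self.weights and self.biases instead:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(weights, biases, a):
    # a is a column vector; each layer computes a' = sigmoid(W a + b),
    # so the output of one layer becomes the input of the next.
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a
```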
5. Complete the evaluate Method
- The test_data parameter is a list of tuples (x, y) where x is the input and y is the expected output.
- The method should return the number of correctly classified inputs.
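One way evaluate might work, assuming the labels y are one-hot vectors as described in the Background section: the predicted digit is the index of the strongest output activation, compared against the index of the 1 in the label. It is sketched here as a free function over any object with a feed_forward method:

```python
import numpy as np

def evaluate(network, test_data):
    # Count inputs where the index of the largest output activation
    # matches the index of the 1 in the one-hot label.
    correct = 0
    for x, y in test_data:
        prediction = np.argmax(network.feed_forward(x))
        if prediction == np.argmax(y):
            correct += 1
    return correct
```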
Testing Your Implementation
After implementing the required functions, test your network with the provided testing data and compare your results against the corresponding solution to make sure everything works. Although the network won’t perform well yet, your implementation should process inputs end to end and produce predictions.
Looking Forward
In the next chapter, you’re going to learn how we teach neural networks, and then implement the training algorithms in the corresponding lab.