Digit Classifier: Neural Network Architecture

The following link takes you to a GitHub repository containing all the material for this course. It is assumed that you have a local copy of it.

Introduction

In this lab, you’ll implement part of the neural network class that we will use for digit classification. This lab is meant to let you put what you learned in the first chapter into practice.

Background - Data

The MNIST dataset is the machine learning equivalent of writing a “Hello, World” program when learning to code. MNIST consists of 28×28-pixel grayscale images of handwritten digits (0-9). To feed these images into our neural network, each 28×28 image needs to be flattened into a 784-dimensional vector.
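The flattening step can be sketched with NumPy as follows (a random array stands in for a real MNIST image here; the repository's loading code may do this slightly differently):

```python
import numpy as np

# Stand-in for one 28x28 grayscale MNIST image with pixel values in [0, 1].
image = np.random.rand(28, 28)

# Flatten into a 784-dimensional column vector, the shape the network expects.
x = image.reshape(784, 1)
print(x.shape)  # (784, 1)
```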

To match the network’s ideal output, each digit label is one-hot encoded: a 10-dimensional vector with a “1” at the index corresponding to the digit and “0” everywhere else. For example, the digit 2 would be encoded as:

y = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
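This one-hot encoding can be sketched as a small helper (the name `one_hot` is illustrative, not part of the lab's class):

```python
import numpy as np

def one_hot(digit):
    """Encode a digit 0-9 as a 10-dimensional one-hot vector."""
    y = np.zeros(10)
    y[digit] = 1.0
    return y

print(one_hot(2))  # [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
```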

Understanding

Let’s take a look at neural-network.ipynb.

First, we have a class called NeuralNetwork. We’ll be implementing part of the class in this lab.

Inside the main function in the code provided, you’ll also find three variables: training, validation, and testing. For this first lab, we’re only going to use the testing data to evaluate the performance of the neural network (which won’t be much better than random for now). The dataset is parsed as an array of tuples that contain image-label pairs:

testing = [(image_0, label_0), ..., (image_n-1, label_n-1)]

We then initialize a neural network, digit_classifier, passing the list [784, 15, 15, 10] to specify that it takes a 784-dimensional input, processes it through two hidden layers of 15 neurons each, and returns a 10-dimensional prediction of which digit the image shows. The choice of hidden layer count and sizes is arbitrary; feel free to experiment with them.

You can ignore digit_classifier.stochastic_gradient_descent(training, 10, 30, 3, testing). It won’t do anything for now. We’ll implement it in the next lab.

At the end of main, we test the accuracy of our neural network’s predictions. They will be terrible for now.

Neural Network Class Specification

1. Complete the __init__ Method

  • The function should initialize the neural network structure based on the input list sizes, where each entry represents the number of neurons in a layer.
  • The function should store the number of layers in layer_count and the sizes list itself.
  • The weights and biases should be initialized by sampling from a normal distribution. The reason for this will be explained in Chapter 3.
  • You may assume that the sizes list contains at least two values.
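The requirements above can be sketched like this (a minimal version, assuming standard-normal initialization via `np.random.randn`; your attribute names should match the provided class skeleton):

```python
import numpy as np

class NeuralNetwork:
    def __init__(self, sizes):
        # sizes[i] is the number of neurons in layer i, e.g. [784, 15, 15, 10].
        self.layer_count = len(sizes)
        self.sizes = sizes
        # One bias vector per non-input layer, sampled from a standard normal.
        self.biases = [np.random.randn(n, 1) for n in sizes[1:]]
        # One weight matrix per layer transition, shape (next_layer, prev_layer),
        # so that (weights @ activations + bias) maps layer m to layer n.
        self.weights = [np.random.randn(n, m)
                        for m, n in zip(sizes[:-1], sizes[1:])]

net = NeuralNetwork([784, 15, 15, 10])
print(net.layer_count)       # 4
print(net.weights[0].shape)  # (15, 784)
```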

2. Complete the activation_function Method

  • This function should return the result of applying the sigmoid to the input z, which may be a scalar or a NumPy array.
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
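A sketch of this method (written as a standalone function here; in the lab it belongs on the class):

```python
import numpy as np

def activation_function(z):
    """Sigmoid; np.exp works elementwise, so z may be a scalar or an array."""
    return 1.0 / (1.0 + np.exp(-z))

print(activation_function(0.0))  # 0.5
```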

3. Complete the activation_function_derivative Method

  • This function should return the result of applying the sigmoid derivative to the input z, which may be a scalar or a NumPy array.
  • This function will be used in the next lab.
$$\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$$
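Given the sigmoid, the derivative follows directly (again sketched as standalone functions rather than class methods):

```python
import numpy as np

def activation_function(z):
    """Sigmoid, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def activation_function_derivative(z):
    """Sigmoid derivative: sigma(z) * (1 - sigma(z))."""
    s = activation_function(z)
    return s * (1.0 - s)

print(activation_function_derivative(0.0))  # 0.25
```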

4. Complete the feed_forward Method

  • This function takes an input vector a.
  • This function returns the output of the neural network.
  • The method should pass the input through each layer of the network, applying weights, biases, and the activation function at each step.
  • You may assume that the input size matches the size of the input layer.
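The layer-by-layer pass above can be sketched as follows (shown as a free function taking the weights and biases explicitly; in the lab it would read them from `self`):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(a, weights, biases):
    """Propagate column vector a through each layer: a' = sigma(W a + b)."""
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a

# Tiny illustration with hypothetical random parameters for a [3, 2] network.
w = [np.random.randn(2, 3)]
b = [np.random.randn(2, 1)]
out = feed_forward(np.ones((3, 1)), w, b)
print(out.shape)  # (2, 1)
```

Because the sigmoid squashes its input into (0, 1), every entry of the output lies strictly between 0 and 1.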

5. Complete the evaluate Method

  • The test_data parameter is a list of tuples (x, y) where x is the input and y is the expected output.
  • The method should return the number of correctly classified inputs.
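One way to sketch the counting logic: take the argmax of the network’s output as the predicted digit and compare it to the argmax of the one-hot label. (Written here as a function that accepts a forward-pass callable so it runs standalone; in the lab, `evaluate` is a method that calls `self.feed_forward`.)

```python
import numpy as np

def evaluate(test_data, feed_forward):
    """Return how many (x, y) pairs the network classifies correctly,
    where the predicted digit is the index of the largest output."""
    results = [(np.argmax(feed_forward(x)), np.argmax(y))
               for x, y in test_data]
    return sum(int(prediction == actual) for prediction, actual in results)

# Hypothetical check: an identity "network" on one-hot inputs is always right.
data = [(np.eye(10)[:, [d]], np.eye(10)[:, [d]]) for d in range(10)]
print(evaluate(data, lambda x: x))  # 10
```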

Testing Your Implementation

After implementing the required functions, test your network with the provided testing data and compare against the corresponding solution to ensure everything is working properly. Although the network won’t perform well yet, your implementation should be able to process inputs and produce predictions.

Looking Forward

In the next chapter, you’re going to learn how we teach neural networks, and then implement the training algorithms in the corresponding lab.

Neural Networks From Scratch

Prioritize understanding over memorization. Good luck!