{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "5qqJjXeFxkxK" }, "source": [ "# MNIST Code Along with CNN\n", "Let's work with the same data using a Convolutional Neural Network (CNN).\n", "Make sure to watch the theory lectures! You'll want to be comfortable with:\n", "* convolutional layers\n", "* filters/kernels\n", "* pooling\n", "* depth, stride and zero-padding\n", "\n", "Note that in this exercise there is no need to flatten the MNIST data, as a CNN expects 2-dimensional data." ] }, { "cell_type": "code", "source": [ "%pip install torchvision" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "V6dJ49t3x5PU", "outputId": "b42a108c-fbdc-4634-843a-cfbe28895f25" }, "execution_count": null, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (0.16.0+cu118)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from torchvision) (1.23.5)\n", "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from torchvision) (2.31.0)\n", "Requirement already satisfied: torch==2.1.0 in /usr/local/lib/python3.10/dist-packages (from torchvision) (2.1.0+cu118)\n", "Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/local/lib/python3.10/dist-packages (from torchvision) (9.4.0)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0->torchvision) (3.13.1)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0->torchvision) (4.5.0)\n", "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0->torchvision) (1.12)\n", "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0->torchvision) (3.2.1)\n", "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from 
torch==2.1.0->torchvision) (3.1.2)\n", "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0->torchvision) (2023.6.0)\n", "Requirement already satisfied: triton==2.1.0 in /usr/local/lib/python3.10/dist-packages (from torch==2.1.0->torchvision) (2.1.0)\n", "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision) (3.3.2)\n", "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision) (3.4)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision) (2.0.7)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->torchvision) (2023.7.22)\n", "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch==2.1.0->torchvision) (2.1.3)\n", "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch==2.1.0->torchvision) (1.3.0)\n" ] } ] }, { "cell_type": "markdown", "metadata": { "id": "I0gOt6sixkxL" }, "source": [ "## Perform standard imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZiYKA7vwxkxL" }, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "from torch.utils.data import DataLoader\n", "from torchvision import datasets, transforms\n", "from torchvision.utils import make_grid\n", "\n", "import numpy as np\n", "import pandas as pd\n", "from sklearn.metrics import confusion_matrix\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": { "id": "nBpZ5SyaxkxL" }, "source": [ "## Load the MNIST dataset\n", "PyTorch makes the MNIST train and test datasets available through torchvision. 
The first time they're called, the datasets will be downloaded onto your computer to the path specified. From that point, torchvision will always look for a local copy before attempting another download.\n", "\n", "Refer to the previous section for explanations of transformations, batch sizes and DataLoader." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "cOVtdy8OxkxM", "outputId": "9c642f75-c165-43e0-d692-d6f0e53126b7" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\n", "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../Data/MNIST/raw/train-images-idx3-ubyte.gz\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "100%|██████████| 9912422/9912422 [00:00<00:00, 71259373.24it/s]\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Extracting ../Data/MNIST/raw/train-images-idx3-ubyte.gz to ../Data/MNIST/raw\n", "\n", "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\n", "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../Data/MNIST/raw/train-labels-idx1-ubyte.gz\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "100%|██████████| 28881/28881 [00:00<00:00, 78455760.25it/s]" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Extracting ../Data/MNIST/raw/train-labels-idx1-ubyte.gz to ../Data/MNIST/raw\n", "\n", "Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../Data/MNIST/raw/t10k-images-idx3-ubyte.gz\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "100%|██████████| 1648877/1648877 [00:00<00:00, 23057121.60it/s]" ] }, { "output_type": 
"stream", "name": "stdout", "text": [ "Extracting ../Data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../Data/MNIST/raw\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "\n" ] }, { "output_type": "stream", "name": "stdout", "text": [ "\n", "Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\n", "Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../Data/MNIST/raw/t10k-labels-idx1-ubyte.gz\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "100%|██████████| 4542/4542 [00:00<00:00, 13312738.48it/s]" ] }, { "output_type": "stream", "name": "stdout", "text": [ "Extracting ../Data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../Data/MNIST/raw\n", "\n" ] }, { "output_type": "stream", "name": "stderr", "text": [ "\n" ] } ], "source": [ "transform = transforms.ToTensor()\n", "\n", "train_data = datasets.MNIST(root='../Data', train=True, download=True, transform=transform)\n", "test_data = datasets.MNIST(root='../Data', train=False, download=True, transform=transform)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ThZMPjBHxkxM", "outputId": "32df1e42-2911-459a-ffb1-817b7462f659" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "Dataset MNIST\n", " Number of datapoints: 60000\n", " Root location: ../Data\n", " Split: Train\n", " StandardTransform\n", "Transform: ToTensor()" ] }, "metadata": {}, "execution_count": 4 } ], "source": [ "train_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "OgYyG1ZbxkxM", "outputId": "acd87b03-4808-453e-8153-9b7535795c61" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "Dataset MNIST\n", " Number of datapoints: 10000\n", " Root location: ../Data\n", " Split: Test\n", " StandardTransform\n", "Transform: ToTensor()" ] }, "metadata": {}, "execution_count": 5 } ], "source": [ 
"test_data" ] }, { "cell_type": "markdown", "metadata": { "id": "PBXoRJuPxkxN" }, "source": [ "### Create loaders\n", "When working with images, we want relatively small batches; a batch size of 4 is not uncommon." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "s3wKS5klxkxN" }, "outputs": [], "source": [ "train_loader = DataLoader(train_data, batch_size=10, shuffle=True)\n", "test_loader = DataLoader(test_data, batch_size=10, shuffle=False)" ] }, { "cell_type": "markdown", "metadata": { "id": "31W9Nv8BxkxN" }, "source": [ "## Define a convolutional model\n", "In the previous section we used only fully connected layers, with an input layer of 784 (our flattened 28x28 images), hidden layers of 120 and 84 neurons, and an output size representing 10 possible digits.\n", "\n", "This time we'll employ two convolutional layers and two pooling layers before feeding data through fully connected hidden layers to our output. The model follows CONV/RELU/POOL/CONV/RELU/POOL/FC/RELU/FC." ] }, { "cell_type": "markdown", "metadata": { "id": "IhwT70wZxkxN" }, "source": [ "