2. Neural Networks with Keras#

First things first, I’m a realist…

I added instructions to install TensorFlow and Keras (they don't come pre-packaged with Anaconda). I'm assuming some of you have yet to do that, and these packages take time to install.

Try to run the code block below. If it raises an error that TensorFlow or Keras can't be found, you'll need to install those packages: uncomment the first line in the code block below and run it again.

While that’s happening, we’ll go over some background and vocabulary relating to neural networks.

#!conda install tensorflow keras

from keras.datasets import fashion_mnist
# Load the Fashion-MNIST dataset
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

2.1. What are Neural Networks?#

Neural networks are machine learning models (loosely) inspired by the structure of the human brain. They can be used for both regression and classification and are widely applied to complex tasks such as image and speech processing, language translation, time-series modeling and forecasting, and anomaly detection. There are numerous variants of neural networks. Some notable examples:

  • Multi-layer Perceptron (MLP) - the vanilla neural network we’ll use today

  • Convolutional Neural Network (CNN) - largely used in image and video processing, but can also be applied to time-series and forecasting problems

  • Recurrent Neural Network (RNN) - used for time-series modeling and forecasting

  • Long Short-Term Memory Neural Network (LSTM) - an upgrade of the RNN. This was the cutting edge of natural language processing before transformers (as in Generative Pre-trained Transformer) ushered in the age of LLMs.

  • Generative Adversarial Networks (GANs) - used for generating novel data (i.e., a genAI model) and for fraud detection. GANs comprise two parallel neural networks: a generator (which creates fake instances of data) and a discriminator (which is trained to distinguish real data from fake).

2.2. Why Neural Networks over other models? The PROs.#

  • Ability to model complex nonlinear relationships: Neural networks can automatically learn and represent intricate, nonlinear patterns between inputs and outputs, which many traditional models (like linear regression) cannot do without extensive manual feature engineering.

  • Handling of high-dimensional and unstructured data: NNs excel at processing large-scale, high-dimensional data (e.g. images, audio, and text) where other models often struggle.

  • Feature extraction and automatic feature engineering: NNs ‘discover’ and construct relevant features from raw data, reducing the need for manual intervention and domain expertise in feature selection.

  • Adaptability in deployment: NNs adapt to new data and improve performance over time, making them suitable for dynamic, real-world applications.

2.3. Why not Neural Networks? The CONs.#

  • Require large amounts of data: NNs have many, many parameters, so they typically need thousands to millions of labeled examples to perform well. For smaller data sets, other ML models are more suitable.

  • Lack interpretability (“black box”): For most NNs (especially deep NNs), we cannot glean meaning from the parameters.

  • High computational cost: Training neural networks, especially deep NNs, demands significant computational resources (powerful GPUs/TPUs) and can take much longer than training traditional models. Many such models are trained on remote cloud computing servers (pay per compute).

  • Risk of overfitting: Neural networks, with their large number of parameters, are prone to overfitting if not properly regularized, especially when trained on small or noisy datasets.

  • Complexity in development and tuning: Designing, training, and tuning neural networks (e.g., choosing architecture, hyperparameters) is often more complex and time-consuming than working with traditional models, which generally have fewer parameters and simpler structures.

2.4. Neural Network Anatomy and The Multi-Layer Perceptron#

The term perceptron has two common usages: a single artificial neuron or several artificial neurons arranged in a single layer. Either way, a perceptron is a sort of building block for more complex neural networks.

First, let’s consider the single ‘neuron’ interpretation.

perceptron, by Adam Weaver

In the diagram above, each input feature is assigned a weight (the weights and bias are the parameters), and the weighted sum of the features plus a bias term is fed through some non-linear activation function. The diagram above can be represented by the following equation:

\[ \hat{y} = f(b + w_0 \cdot x_0 + w_1 \cdot x_1+ w_2 \cdot x_2 + w_3 \cdot x_3) \]
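To make this concrete, here's that forward pass as a minimal NumPy sketch; the feature values, weights, bias, and sigmoid activation are all made up for illustration:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0, 0.7])   # input features x_0 ... x_3
w = np.array([0.1, 0.4, -0.2, 0.3])   # one weight per feature
b = 0.05                              # bias term

y_hat = sigmoid(b + np.dot(w, x))     # weighted sum plus bias, through the activation
print(y_hat)                          # a single scalar output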

2.4.1. Activation Functions#

Every perceptron that comprises a neural network has some activation function. Activation functions are non-linear, and without them a neural network (regardless of size) could be reduced to a single linear neuron. Activation functions also mean that, for any given region of the feature space, only some neurons participate while others stay dormant. So the feature space is parsed by different subsets of neurons.

Some example activation functions (sketched in code after this list) are:

  • Heaviside function - a step function

  • Sigmoid function - the logistic function from logistic regression

  • Rectified Linear Unit (ReLU) - linear for positive values and zero otherwise. This is the most common activation function.

  • Tanh - hyperbolic tangent function, similar to the sigmoid but ranging from -1 to 1.
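As promised, here is a quick NumPy sketch of these four activations (the grid of z values is arbitrary; np.heaviside's second argument sets the value at exactly zero):

import numpy as np

z = np.linspace(-5, 5, 101)

step = np.heaviside(z, 1.0)       # Heaviside: 0 for z < 0, 1 for z >= 0
sigmoid = 1 / (1 + np.exp(-z))    # squashes to (0, 1)
relu = np.maximum(0, z)           # linear for z > 0, zero otherwise
tanh = np.tanh(z)                 # squashes to (-1, 1)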

2.4.2. Multi-Layer Perceptron (the vanilla NN)#

MLP, Kishgore NG

An MLP comprises layers of perceptrons, and each layer may contain numerous neurons. In the diagram above, every neuron in one layer projects onto every neuron in the subsequent layer; layers connected this way are called dense layers.

Generally, in densely connected NNs, every perceptron in a given layer has the same form (the same number of parameters and the same activation function).
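In code, a whole dense layer is just the single-neuron equation vectorized: stack each neuron's weights as a row of a matrix W, and the layer becomes one matrix-vector product. A sketch with made-up sizes (4 inputs, 3 neurons, ReLU activation):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4)         # 4 input features
W = rng.standard_normal((3, 4))    # 3 neurons, each with 4 weights
b = rng.standard_normal(3)         # one bias per neuron

h = np.maximum(0, W @ x + b)       # all 3 neurons computed at once (ReLU)
print(h.shape)                     # (3,) -- one output per neuron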

Glossary:

  • Input layer - This layer accepts the features

  • Hidden layer - layers of perceptrons between input and output. Deep Neural Networks refer to NNs with many hidden layers. In the past, ‘many’ meant more than 3, but today, we have NNs with hundreds or thousands of layers. Deep is subjective and changes as technology improves.

  • Output layer - the layer where predictions are made

  • For regression, the output layer will have a single neuron for each predicted value (often one)

  • For binary classification, the output layer may have one or two neurons.

  • For multi-class classification, the output layer will have as many neurons as there are classes. (All three cases are sketched in code after this list.)
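Here's what those three output-layer choices might look like in Keras; the input size, hidden-layer width, and class count are placeholders, not recommendations:

from keras.models import Sequential
from keras.layers import Input, Dense

# Regression: a single linear output neuron
regressor = Sequential([Input(shape=(8,)),
                        Dense(16, activation='relu'),
                        Dense(1)])

# Binary classification: one sigmoid neuron giving P(positive class)
binary_clf = Sequential([Input(shape=(8,)),
                         Dense(16, activation='relu'),
                         Dense(1, activation='sigmoid')])

# Multi-class classification: one softmax neuron per class (here, 10)
multi_clf = Sequential([Input(shape=(8,)),
                        Dense(16, activation='relu'),
                        Dense(10, activation='softmax')])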

2.4.3. Architecture and Hyper-parameters#

When we first create the MLP, we have to decide on the architecture (how many layers, how many neurons per layer, activation functions, etc) and the hyper-parameters (regularization, batch size, epochs).

We haven’t had to worry about training time and computational demands with the models we’ve used thus far, but the complexity of NNs makes these issues non-trivial.

In fitting, we can adjust three hyper-parameters that govern learning: learning rate, batch size, and number of epochs.

  • Learning Rate - determines how much the parameters are adjusted at each update

  • Batch - a subset of the training data. The training data set is partitioned into batches and the parameters are updated after each batch is processed.

  • Epoch - one full pass through the entire training data. The training algorithm iterates through the training data set numerous times; each pass is an epoch.

Batch Size   Training Speed   Memory Usage   Generalization
Large        Faster           Higher         Risk of Overfitting
Small        Slower           Lower          Regularized

Rule of thumb: If we increase batch size, we should also increase learning rate.
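In Keras, the learning rate is a property of the optimizer, while batch size and epochs are arguments to fit. A sketch (the values are chosen only for illustration, and model is assumed to be an already-built Keras model):

from keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=1e-3),  # learning rate: step size per update
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=64,   # parameters update after every 64 samples
          epochs=5)        # five full passes through the training data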

import numpy as np

import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay, classification_report, accuracy_score

import matplotlib.pyplot as plt

2.5. Example: Fashion-MNIST, classifying articles of clothing#

The Fashion-MNIST dataset comprises 70,000 (60,000 training and 10,000 test) 28x28-pixel, gray-scale images of clothing items, each from one of the following categories.

  • 0 T-shirt/top

  • 1 Trouser

  • 2 Pullover

  • 3 Dress

  • 4 Coat

  • 5 Sandal

  • 6 Shirt

  • 7 Sneaker

  • 8 Bag

  • 9 Ankle boot

x_train.shape
(60000, 28, 28)
num_samples = 25
fig, ax = plt.subplots(5, 5, figsize = (10, 10), sharex = True, sharey = True)

# Plot the first 25 training images in a 5x5 grid
for k in range(num_samples):
    i, j = divmod(k, 5)
    ax[i, j].imshow(x_train[k, :, :] / 255, cmap = 'gray')
plt.show()
[Figure: a 5x5 grid of sample Fashion-MNIST training images]
# Normalize the data, pixel data is 0-255 (8-bit) but we want 0-1
x_train = x_train / 255.0
x_test = x_test / 255.0

# Create a vanilla neural network
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
Epoch 1/10
/Users/eatai/.pyenv/versions/3.13.1/envs/datascience/lib/python3.13/site-packages/keras/src/layers/reshaping/flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 692us/step - accuracy: 0.8174 - loss: 0.5204 - val_accuracy: 0.8486 - val_loss: 0.4296
Epoch 2/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 604us/step - accuracy: 0.8621 - loss: 0.3878 - val_accuracy: 0.8608 - val_loss: 0.3761
Epoch 3/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 610us/step - accuracy: 0.8724 - loss: 0.3469 - val_accuracy: 0.8721 - val_loss: 0.3569
Epoch 4/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 599us/step - accuracy: 0.8822 - loss: 0.3203 - val_accuracy: 0.8750 - val_loss: 0.3516
Epoch 5/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 599us/step - accuracy: 0.8880 - loss: 0.3034 - val_accuracy: 0.8804 - val_loss: 0.3408
Epoch 6/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 614us/step - accuracy: 0.8945 - loss: 0.2883 - val_accuracy: 0.8815 - val_loss: 0.3330
Epoch 7/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 616us/step - accuracy: 0.8989 - loss: 0.2716 - val_accuracy: 0.8847 - val_loss: 0.3234
Epoch 8/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 608us/step - accuracy: 0.9024 - loss: 0.2623 - val_accuracy: 0.8842 - val_loss: 0.3244
Epoch 9/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 584us/step - accuracy: 0.9063 - loss: 0.2507 - val_accuracy: 0.8904 - val_loss: 0.3164
Epoch 10/10
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 592us/step - accuracy: 0.9098 - loss: 0.2410 - val_accuracy: 0.8884 - val_loss: 0.3171
<keras.src.callbacks.history.History at 0x120ecbe00>
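That trailing History object is fit's return value; its history attribute is a dict of per-epoch metrics. Had we captured it, we could plot learning curves like this (a sketch; note that calling fit again on the same model continues training where it left off):

history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()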
model.layers
[<Flatten name=flatten, built=True>,
 <Dense name=dense, built=True>,
 <Dense name=dense_1, built=True>]
model.layers[1].get_weights()
[array([[-0.0713469 ,  0.03250205,  0.06500132, ...,  0.27670878,
         -0.18487853, -0.08517335],
        [-0.05822118, -0.09768099,  0.07717109, ...,  0.41678566,
         -0.01116703, -0.11538623],
        [-0.05299133,  0.10303343, -0.09143736, ...,  0.17697497,
         -0.11204378, -0.11445503],
        ...,
        [-0.18283023, -0.05082365, -0.15573852, ...,  0.4768174 ,
         -0.13417463, -0.1478706 ],
        [-0.23174927, -0.29934692,  0.14626086, ...,  0.5767144 ,
         -0.3227887 , -0.11754484],
        [-0.22875336, -0.20766266, -0.07919479, ...,  0.42055562,
         -0.42195907, -0.07124062]], shape=(784, 128), dtype=float32),
 array([ 0.33520427,  0.41476032,  0.42854255,  0.26456   ,  0.28423107,
         0.4619906 , -0.01085375, -0.01410234,  0.46816376,  0.04695842,
         0.16199604,  0.1178596 , -0.15440884,  0.26269406, -0.02273692,
         0.39012593,  0.11480645, -0.16267306,  0.01186102, -0.20373592,
         0.77516353,  0.28331184,  0.04801618, -0.01358667,  0.27781582,
        -0.30025145, -0.01313267, -0.02819712,  0.05266671, -0.01129104,
         0.3718213 , -0.06276885,  0.07143743, -0.42740637,  0.27966768,
         0.5643312 ,  0.13939717, -0.02660735,  0.25482723,  0.15903096,
         0.02761725, -0.11521526, -0.018591  , -0.08635442,  0.1875941 ,
        -0.24020974,  0.33598614,  0.14443065,  0.16586447,  0.31778356,
        -0.01498642,  0.92979336, -0.01211132, -0.10277095, -0.1661265 ,
         0.00524938,  0.10501956, -0.11736097, -0.05673584,  0.83130604,
        -0.16295753,  0.4000649 , -0.28859657,  0.11708748,  0.2133803 ,
         0.5311273 , -0.07117038,  0.36213043,  0.09919515,  0.04499178,
        -0.06633466, -0.00488194, -0.06314942,  0.12584068,  0.316148  ,
         0.4582312 ,  0.26652548,  0.32099682,  0.27829206,  0.35091612,
         0.34667146,  0.51486707,  0.4186107 ,  0.12076598,  0.28654033,
         0.383458  ,  0.8813752 , -0.00307724,  0.44035587,  0.24539825,
         0.08246749,  0.34919336, -0.18238382, -0.0148306 , -0.42604852,
        -0.17117037,  0.29033217, -0.26033685,  0.19798407,  0.08194257,
        -0.01106311,  0.2647324 , -0.21065266,  0.45927092, -0.06942318,
        -0.23241717, -0.01085974,  0.51232344,  0.529505  ,  0.3539233 ,
        -0.01077797, -0.13007548, -0.01885176, -0.10258822,  0.37626743,
         0.24545035,  0.23259574, -0.03035248,  0.39216578,  0.68170875,
        -0.01603815,  0.38373974,  0.37374336,  0.31222183, -0.39119828,
         0.3422284 ,  0.01534443, -0.12235121], dtype=float32)]
# Unpack the hidden layer's parameters: a weight matrix and a bias vector
weights, biases = model.layers[1].get_weights()

print(weights.shape)
print(biases.shape)
(784, 128)
(128,)
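Sanity check: each of the 128 hidden neurons has 784 weights (one per input pixel) plus one bias, so this layer has 784 x 128 + 128 = 100,480 parameters. Keras's model.summary() reports the same counts layer by layer:

print(weights.size + biases.size)   # 784*128 + 128 = 100480
model.summary()                     # layer-by-layer parameter counts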
fig, ax = plt.subplots(16, 8, figsize = (10, 20), sharex = True, sharey = True)

# Each hidden neuron has 784 weights (one per pixel), so each can be viewed as a 28x28 image
for k, weight in enumerate(weights.transpose()):
    i, j = divmod(k, 8)
    ax[i, j].imshow(weight.reshape(28, 28), cmap = 'gray')
[Figure: a 16x8 grid visualizing each hidden neuron's weights as a 28x28 image]
# Evaluate the model on the test set
y_pred = np.argmax(model.predict(x_test), axis=1)
y_train_pred = np.argmax(model.predict(x_train), axis=1)
313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 286us/step
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 0s 212us/step
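Keras can also score the model directly: model.evaluate returns the loss and any metrics specified at compile time (here, accuracy):

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f'Test accuracy: {test_acc:.3f}')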
labels = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Create a confusion matrix
cm = confusion_matrix(y_test, y_pred)
ConfusionMatrixDisplay(cm, display_labels = labels).plot()

plt.xticks(rotation = 60)
plt.show()
[Figure: confusion matrix of test-set predictions]
print('TRAINING REPORT:')
print(classification_report(y_train, y_train_pred))

print('TESTING REPORT:')
print(classification_report(y_test, y_pred))
TRAINING REPORT:
              precision    recall  f1-score   support

           0       0.81      0.93      0.87      6000
           1       0.99      0.99      0.99      6000
           2       0.85      0.84      0.85      6000
           3       0.92      0.93      0.92      6000
           4       0.84      0.86      0.85      6000
           5       1.00      0.96      0.98      6000
           6       0.82      0.69      0.75      6000
           7       0.96      0.97      0.97      6000
           8       0.98      0.99      0.99      6000
           9       0.96      0.98      0.97      6000

    accuracy                           0.91     60000
   macro avg       0.91      0.91      0.91     60000
weighted avg       0.91      0.91      0.91     60000

TESTING REPORT:
              precision    recall  f1-score   support

           0       0.78      0.89      0.83      1000
           1       0.99      0.97      0.98      1000
           2       0.79      0.78      0.79      1000
           3       0.88      0.89      0.89      1000
           4       0.80      0.81      0.80      1000
           5       0.99      0.93      0.96      1000
           6       0.73      0.63      0.68      1000
           7       0.93      0.96      0.95      1000
           8       0.97      0.97      0.97      1000
           9       0.94      0.96      0.95      1000

    accuracy                           0.88     10000
   macro avg       0.88      0.88      0.88     10000
weighted avg       0.88      0.88      0.88     10000