--- a +++ b/Code/All Qiskit ML Demos/04 Quantum Kernel 100% Cluster kkawchak.ipynb @@ -0,0 +1,1203 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quantum Kernel Machine Learning\n", + "\n", + "## Overview\n", + "\n", + "The general task of machine learning is to find and study patterns in data. For many datasets, the datapoints are better understood in a higher dimensional feature space. This is the fundamental principle behind a series of machine learning algorithms known as *kernel methods*.\n", + "\n", + "In this notebook, you will learn how to define quantum kernels using `qiskit-machine-learning` and how these can be plugged into different algorithms to solve classification and clustering problems.\n", + "\n", + "All examples used in this tutorial are based on this reference paper: [_Supervised learning with quantum enhanced feature spaces_](https://arxiv.org/pdf/1804.11326.pdf).\n", + "\n", + "The content is structured as follows:\n", + "\n", + "1. [Introduction](#1.-Introduction)\n", + "2. [Classification](#2.-Classification)\n", + "3. [Clustering](#3.-Clustering)\n", + "4. [Kernel Principal Component Analysis](#4.-Kernel-Principal-Component-Analysis)\n", + "5. [Conclusion](#5.-Conclusion)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Introduction\n", + "\n", + "### 1.1. Kernel Methods for Machine Learning\n", + "\n", + "Kernel methods are a collection of pattern analysis algorithms that use kernel functions to operate in a high-dimensional feature space. The best-known application of kernel methods is in **Support Vector Machines (SVMs)**, supervised learning algorithms commonly used for classification tasks. The main goal of SVMs is to find decision boundaries to separate a given set of data points into classes. 
When these data spaces are not linearly separable, SVMs can benefit from the use of kernels to find these boundaries.\n", + "\n", + "Formally, decision boundaries are hyperplanes in a high dimensional space. The kernel function implicitly maps input data into this higher dimensional space, where it can be easier to solve the initial problem. In other words, kernels may allow data distributions that were originally non-linearly separable to become a linearly separable problem. This is an effect known as the \"kernel trick\".\n", + "\n", + "There are use-cases for kernel-based unsupervised algorithms too, for example, in the context of clustering. **Spectral Clustering** is a technique where data points are treated as nodes of a graph, and the clustering task is viewed as a graph partitioning problem where nodes are mapped to a space where they can be easily segregated to form clusters.\n", + "\n", + "### 1.2. Kernel Functions\n", + "\n", + "Mathematically, kernel functions follow:\n", + "\n", + "$k(\\vec{x}_i, \\vec{x}_j) = \\langle f(\\vec{x}_i), f(\\vec{x}_j) \\rangle$\n", + "\n", + "where \n", + "* $k$ is the kernel function\n", + "* $\\vec{x}_i, \\vec{x}_j$ are $n$ dimensional inputs\n", + "* $f$ is a map from $n$-dimension to $m$-dimension space and \n", + "* $\\langle a,b \\rangle$ denotes the inner product\n", + "\n", + "When considering finite data, a kernel function can be represented as a matrix: \n", + "\n", + "$K_{ij} = k(\\vec{x}_i,\\vec{x}_j)$.\n", + "\n", + "### 1.3. Quantum Kernels\n", + "\n", + "The main idea behind quantum kernel machine learning is to leverage quantum feature maps to perform the kernel trick. In this case, the quantum kernel is created by mapping a classical feature vector $\\vec{x}$ to a Hilbert space using a quantum feature map $\\phi(\\vec{x})$. 
Mathematically:\n", + "\n", + "$K_{ij} = \\left| \\langle \\phi(\\vec{x}_i)| \\phi(\\vec{x}_j) \\rangle \\right|^{2}$\n", + "\n", + "where \n", + "* $K_{ij}$ is the kernel matrix\n", + "* $\\vec{x}_i, \\vec{x}_j$ are $n$ dimensional inputs\n", + "* $\\phi(\\vec{x})$ is the quantum feature map\n", + "* $\\left| \\langle a|b \\rangle \\right|^{2}$ denotes the overlap of two quantum states $a$ and $b$\n", + "\n", + "Quantum kernels can be plugged into common classical kernel learning algorithms such as SVMs or clustering algorithms, as you will see in the examples below. They can also be leveraged in new quantum kernel methods like [QSVC](https://qiskit.org/ecosystem/machine-learning/stubs/qiskit_machine_learning.algorithms.QSVC.html) class provided by `qiskit-machine-learning` which is explored in this tutorial, and other methods as shown in later tutorials on [Pegasos QSVC](07_pegasos_qsvc.ipynb) and [Quantum Kernel Training](08_quantum_kernel_trainer.ipynb).\n", + "\n", + "***\n", + "\n", + "Before introducing any example, we set up the global seed to ensure reproducibility:" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [], + "source": [ + "from qiskit.utils import algorithm_globals\n", + "\n", + "algorithm_globals.random_seed = 12345" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Classification\n", + "\n", + "This section illustrates a quantum kernel classification workflow using `qiskit-machine-learning`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.1. Defining the dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example, we will use the _ad hoc dataset_ as described in the reference [paper](https://arxiv.org/pdf/1804.11326.pdf). 
\n", + "\n", + "We can define the dataset dimension and get our train and test subsets:" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": {}, + "outputs": [], + "source": [ + "from qiskit_machine_learning.datasets import ad_hoc_data\n", + "\n", + "adhoc_dimension = 2\n", + "train_features, train_labels, test_features, test_labels, adhoc_total = ad_hoc_data(\n", + " training_size=20,\n", + " test_size=5,\n", + " n=adhoc_dimension,\n", + " gap=0.3,\n", + " plot_data=False,\n", + " one_hot=False,\n", + " include_sample_total=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This dataset is two-dimensional, the two features are represented by the $x$ and $y$ coordinates, and it has two class labels: A and B. We can plot it and see what the distribution looks like. We define utility functions to plot the dataset." + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": {}, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "\n", + "\n", + "def plot_features(ax, features, labels, class_label, marker, face, edge, label):\n", + " # A train plot\n", + " ax.scatter(\n", + " # x coordinate of labels where class is class_label\n", + " features[np.where(labels[:] == class_label), 0],\n", + " # y coordinate of labels where class is class_label\n", + " features[np.where(labels[:] == class_label), 1],\n", + " marker=marker,\n", + " facecolors=face,\n", + " edgecolors=edge,\n", + " label=label,\n", + " )\n", + "\n", + "\n", + "def plot_dataset(train_features, train_labels, test_features, test_labels, adhoc_total):\n", + "\n", + " plt.figure(figsize=(5, 5))\n", + " plt.ylim(0, 2 * np.pi)\n", + " plt.xlim(0, 2 * np.pi)\n", + " plt.imshow(\n", + " np.asmatrix(adhoc_total).T,\n", + " interpolation=\"nearest\",\n", + " origin=\"lower\",\n", + " cmap=\"RdBu\",\n", + " extent=[0, 2 * np.pi, 0, 2 * np.pi],\n", + " )\n", + "\n", + " # A train plot\n", + " 
plot_features(plt, train_features, train_labels, 0, \"s\", \"w\", \"b\", \"A train\")\n", + "\n", + " # B train plot\n", + " plot_features(plt, train_features, train_labels, 1, \"o\", \"w\", \"r\", \"B train\")\n", + "\n", + " # A test plot\n", + " plot_features(plt, test_features, test_labels, 0, \"s\", \"b\", \"w\", \"A test\")\n", + "\n", + " # B test plot\n", + " plot_features(plt, test_features, test_labels, 1, \"o\", \"r\", \"w\", \"B test\")\n", + "\n", + " plt.legend(bbox_to_anchor=(1.05, 1), loc=\"upper left\", borderaxespad=0.0)\n", + " plt.title(\"Ad hoc dataset\")\n", + "\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we actually plot the dataset for classification:" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "tags": [ + "nbsphinx-thumbnail" + ] + }, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 500x500 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "plot_dataset(train_features, train_labels, test_features, test_labels, adhoc_total)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.2. Defining the quantum kernel\n", + "\n", + "The next step is to create a quantum kernel instance that will help classify this data. \n", + "\n", + "We use the [FidelityQuantumKernel](https://qiskit.org/ecosystem/machine-learning/stubs/qiskit_machine_learning.kernels.FidelityQuantumKernel.html) class, and pass two input arguments to its constructor: \n", + "\n", + "1. `feature_map`: in this case, a two-qubit [ZZFeatureMap](https://qiskit.org/documentation/stubs/qiskit.circuit.library.ZZFeatureMap.html).\n", + "\n", + "2. 
`fidelity`: in this case, the [ComputeUncompute](https://qiskit.org/documentation/stubs/qiskit.algorithms.state_fidelities.ComputeUncompute.html) fidelity subroutine that leverages the [Sampler](https://qiskit.org/documentation/stubs/qiskit.primitives.Sampler.html) primitive.\n", + "\n", + "**NOTE:** If you don't pass a `Sampler` or `Fidelity` instance, then the instances of the reference `Sampler` and `ComputeUncompute` classes (found in `qiskit.primitives`) will be created by default." + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": {}, + "outputs": [], + "source": [ + "from qiskit.circuit.library import ZZFeatureMap\n", + "from qiskit.primitives import Sampler\n", + "from qiskit.algorithms.state_fidelities import ComputeUncompute\n", + "from qiskit_machine_learning.kernels import FidelityQuantumKernel\n", + "\n", + "adhoc_feature_map = ZZFeatureMap(feature_dimension=adhoc_dimension, reps=2, entanglement=\"linear\")\n", + "\n", + "sampler = Sampler()\n", + "\n", + "fidelity = ComputeUncompute(sampler=sampler)\n", + "\n", + "adhoc_kernel = FidelityQuantumKernel(fidelity=fidelity, feature_map=adhoc_feature_map)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.3. Classification with SVC\n", + "The quantum kernel can now be plugged into classical kernel methods, such as the [SVC](https://scikit-learn.org/stable/modules/svm.html) algorithm from `scikit-learn`. This algorithm allows us to define a [custom kernel](https://scikit-learn.org/stable/modules/svm.html#custom-kernels) in two ways:\n", + "\n", + "1. by providing the kernel as a **callable function**\n", + "2. by precomputing the **kernel matrix**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Kernel as a callable function\n", + "\n", + "We define a SVC model and directly pass the `evaluate` function of the quantum kernel as a callable. 
Once the model is created, we train it by calling the `fit` method on the training dataset and evaluate the model for accuracy with `score`." + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Callable kernel classification test score: 1.0\n" + ] + } + ], + "source": [ + "from sklearn.svm import SVC\n", + "\n", + "adhoc_svc = SVC(kernel=adhoc_kernel.evaluate)\n", + "\n", + "adhoc_svc.fit(train_features, train_labels)\n", + "\n", + "adhoc_score_callable_function = adhoc_svc.score(test_features, test_labels)\n", + "\n", + "print(f\"Callable kernel classification test score: {adhoc_score_callable_function}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Precomputed kernel matrix\n", + "\n", + "Instead of passing a function of the quantum kernel as a callable, we can also precompute training and testing kernel matrices before passing them to the `scikit-learn` `SVC` algorithm. 
\n", + "\n", + "To extract the train and test matrices, we can call `evaluate` on the previously defined kernel and visualize them graphically as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 1000x500 with 2 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "adhoc_matrix_train = adhoc_kernel.evaluate(x_vec=train_features)\n", + "adhoc_matrix_test = adhoc_kernel.evaluate(x_vec=test_features, y_vec=train_features)\n", + "\n", + "fig, axs = plt.subplots(1, 2, figsize=(10, 5))\n", + "\n", + "axs[0].imshow(\n", + " np.asmatrix(adhoc_matrix_train), interpolation=\"nearest\", origin=\"upper\", cmap=\"Blues\"\n", + ")\n", + "axs[0].set_title(\"Ad hoc training kernel matrix\")\n", + "\n", + "axs[1].imshow(np.asmatrix(adhoc_matrix_test), interpolation=\"nearest\", origin=\"upper\", cmap=\"Reds\")\n", + "axs[1].set_title(\"Ad hoc testing kernel matrix\")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To use these matrices, we set the `kernel` parameter of a new `SVC` instance to `\"precomputed\"`. We train the classifier by calling `fit` with the training matrix and training dataset. Once the model is trained, we evaluate it using the test matrix on the test dataset." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Precomputed kernel classification test score: 1.0\n" + ] + } + ], + "source": [ + "adhoc_svc = SVC(kernel=\"precomputed\")\n", + "\n", + "adhoc_svc.fit(adhoc_matrix_train, train_labels)\n", + "\n", + "adhoc_score_precomputed_kernel = adhoc_svc.score(adhoc_matrix_test, test_labels)\n", + "\n", + "print(f\"Precomputed kernel classification test score: {adhoc_score_precomputed_kernel}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.4. Classification with QSVC\n", + "\n", + "`QSVC` is an alternative training algorithm provided by `qiskit-machine-learning` for convenience. It is an extension of `SVC` that takes in a quantum kernel instead of the `kernel.evaluate` method shown before." + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "QSVC classification test score: 1.0\n" + ] + } + ], + "source": [ + "from qiskit_machine_learning.algorithms import QSVC\n", + "\n", + "qsvc = QSVC(quantum_kernel=adhoc_kernel)\n", + "\n", + "qsvc.fit(train_features, train_labels)\n", + "\n", + "qsvc_score = qsvc.score(test_features, test_labels)\n", + "\n", + "print(f\"QSVC classification test score: {qsvc_score}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.5. 
Evaluation of models used for classification" + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Classification Model | Accuracy Score\n", + "---------------------------------------------------------\n", + "SVC using kernel as a callable function | 1.00\n", + "SVC using precomputed kernel matrix | 1.00\n", + "QSVC | 1.00\n" + ] + } + ], + "source": [ + "print(f\"Classification Model | Accuracy Score\")\n", + "print(f\"---------------------------------------------------------\")\n", + "print(f\"SVC using kernel as a callable function | {adhoc_score_callable_function:10.2f}\")\n", + "print(f\"SVC using precomputed kernel matrix | {adhoc_score_precomputed_kernel:10.2f}\")\n", + "print(f\"QSVC | {qsvc_score:10.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As the classification dataset is small, we find that the three models achieve 100% accuracy." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Clustering\n", + "\n", + "The second workflow in this tutorial focuses on a clustering task using `qiskit-machine-learning` and the spectral clustering algorithm from `scikit-learn`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.1. Defining the dataset\n", + "\n", + "We will once again use the _ad hoc dataset_, but now generated with a higher gap of `0.6` (previous example: `0.3`) between the two classes. \n", + "\n", + "Note that clustering falls under the category of unsupervised machine learning, so a test dataset is not required." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": {}, + "outputs": [], + "source": [ + "adhoc_dimension = 2\n", + "train_features, train_labels, test_features, test_labels, adhoc_total = ad_hoc_data(\n", + " training_size=25,\n", + " test_size=0,\n", + " n=adhoc_dimension,\n", + " gap=0.6,\n", + " plot_data=False,\n", + " one_hot=False,\n", + " include_sample_total=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We plot the clustering dataset below:" + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 500x500 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "plt.figure(figsize=(5, 5))\n", + "plt.ylim(0, 2 * np.pi)\n", + "plt.xlim(0, 2 * np.pi)\n", + "plt.imshow(\n", + " np.asmatrix(adhoc_total).T,\n", + " interpolation=\"nearest\",\n", + " origin=\"lower\",\n", + " cmap=\"RdBu\",\n", + " extent=[0, 2 * np.pi, 0, 2 * np.pi],\n", + ")\n", + "\n", + "# A label plot\n", + "plot_features(plt, train_features, train_labels, 0, \"s\", \"w\", \"b\", \"A\")\n", + "\n", + "# B label plot\n", + "plot_features(plt, train_features, train_labels, 1, \"o\", \"w\", \"r\", \"B\")\n", + "\n", + "plt.legend(bbox_to_anchor=(1.05, 1), loc=\"upper left\", borderaxespad=0.0)\n", + "plt.title(\"Ad hoc dataset for clustering\")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Defining the Quantum Kernel\n", + "We use the same setup as in the classification example. We create another instance of the `FidelityQuantumKernel` class with a `ZZFeatureMap`, but you might notice that in this case we do not provide a `fidelity` instance. This is because the `ComputeUncompute` method provided in the previous case is instantiated by default when the fidelity instance is not provided explicitly. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 176, + "metadata": {}, + "outputs": [], + "source": [ + "adhoc_feature_map = ZZFeatureMap(feature_dimension=adhoc_dimension, reps=2, entanglement=\"full\")\n", + "\n", + "adhoc_kernel = FidelityQuantumKernel(feature_map=adhoc_feature_map)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.3. Clustering with the Spectral Clustering Model\n", + "\n", + "The `scikit-learn` spectral clustering algorithm allows us to define a custom kernel in two ways (just like `SVC`):\n", + "\n", + "1. by providing the kernel as a **callable function**\n", + "2. by precomputing the **kernel matrix**. \n", + "\n", + "With the current `FidelityQuantumKernel` class in `qiskit-machine-learning`, we can only use the latter option, so we precompute the kernel matrix by calling `evaluate` and visualize it as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": 177, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 500x500 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "adhoc_matrix = adhoc_kernel.evaluate(x_vec=train_features)\n", + "\n", + "plt.figure(figsize=(5, 5))\n", + "plt.imshow(np.asmatrix(adhoc_matrix), interpolation=\"nearest\", origin=\"upper\", cmap=\"Greens\")\n", + "plt.title(\"Ad hoc clustering kernel matrix\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we define a spectral clustering model and fit it using the precomputed kernel. Further, we score the labels using normalized mutual information, since we know the class labels a priori (before hand)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 178, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Clustering score: 1.0\n" + ] + } + ], + "source": [ + "from sklearn.cluster import SpectralClustering\n", + "from sklearn.metrics import normalized_mutual_info_score\n", + "\n", + "adhoc_spectral = SpectralClustering(2, affinity=\"precomputed\")\n", + "\n", + "cluster_labels = adhoc_spectral.fit_predict(adhoc_matrix)\n", + "\n", + "cluster_score = normalized_mutual_info_score(cluster_labels, train_labels)\n", + "\n", + "print(f\"Clustering score: {cluster_score}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Kernel Principal Component Analysis\n", + "\n", + "This section focuses on a Principal Component Analysis task using a kernel PCA algorithm. We calculate a kernel matrix using a `ZZFeatureMap` and show that this approach translates the original features into a new space, where axes are chosen along principal components. In this space the classification task can be performed with a simpler model than an SVM." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 4.1. Defining the dataset\n", + "\n", + "We again use the _ad hoc dataset_ with a gap of `0.6` between the two classes. This dataset resembles the one used in the clustering section; the difference is that in this case `test_size` is not zero." + ] + }, + { + "cell_type": "code", + "execution_count": 179, + "metadata": {}, + "outputs": [], + "source": [ + "adhoc_dimension = 2\n", + "train_features, train_labels, test_features, test_labels, adhoc_total = ad_hoc_data(\n", + " training_size=25,\n", + " test_size=10,\n", + " n=adhoc_dimension,\n", + " gap=0.6,\n", + " plot_data=False,\n", + " one_hot=False,\n", + " include_sample_total=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We plot the training and test datasets below. 
Our ultimate goal in this section is to construct new coordinates where the two classes can be linearly separated." + ] + }, + { + "cell_type": "code", + "execution_count": 180, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 500x500 with 1 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "plot_dataset(train_features, train_labels, test_features, test_labels, adhoc_total)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 4.2. Defining the Quantum Kernel\n", + "\n", + "We proceed with the same kernel setup as in the classification task, namely a `ZZFeatureMap` circuit as a feature map and an instance of `FidelityQuantumKernel`." + ] + }, + { + "cell_type": "code", + "execution_count": 181, + "metadata": {}, + "outputs": [], + "source": [ + "feature_map = ZZFeatureMap(feature_dimension=2, reps=2, entanglement=\"linear\")\n", + "qpca_kernel = FidelityQuantumKernel(fidelity=fidelity, feature_map=feature_map)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, we evaluate kernel matrices for the training and test features." + ] + }, + { + "cell_type": "code", + "execution_count": 182, + "metadata": {}, + "outputs": [], + "source": [ + "matrix_train = qpca_kernel.evaluate(x_vec=train_features)\n", + "matrix_test = qpca_kernel.evaluate(x_vec=test_features, y_vec=train_features)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 4.3. Comparison of Kernel PCA on Gaussian and quantum kernel\n", + "\n", + "In this section we use the `KernelPCA` implementation from `scikit-learn`, with the `kernel` parameter set to \"rbf\" for a Gaussian kernel and \"precomputed\" for a quantum kernel. 
The former is very popular in classical machine learning models, whereas the latter allows using a quantum kernel defined as `qpca_kernel`.\n", + "\n", + "One can observe that the Gaussian kernel based Kernel PCA model fails to make the dataset linearly separable, while the quantum kernel succeeds.\n", + "\n", + "While PCA is usually used to reduce the number of features in a dataset, or in other words to reduce the dimensionality of a dataset, we don't do that here. Rather, we keep the number of dimensions and employ kernel PCA, mostly for visualization purposes, to show that classification on the transformed dataset becomes easily tractable by linear methods, like logistic regression. We use this method to separate two classes in the principal component space with a `LogisticRegression` model from `scikit-learn`. As usual, we train it by calling the `fit` method on the training dataset and evaluate the model for accuracy with `score`." + ] + }, + { + "cell_type": "code", + "execution_count": 183, + "metadata": {}, + "outputs": [], + "source": [ + "from sklearn.decomposition import KernelPCA\n", + "\n", + "kernel_pca_rbf = KernelPCA(n_components=2, kernel=\"rbf\")\n", + "kernel_pca_rbf.fit(train_features)\n", + "train_features_rbf = kernel_pca_rbf.transform(train_features)\n", + "test_features_rbf = kernel_pca_rbf.transform(test_features)\n", + "\n", + "kernel_pca_q = KernelPCA(n_components=2, kernel=\"precomputed\")\n", + "train_features_q = kernel_pca_q.fit_transform(matrix_train)\n", + "test_features_q = kernel_pca_q.transform(matrix_test)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here we train and score a model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 184, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Logistic regression score: 0.05\n" + ] + } + ], + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "\n", + "logistic_regression = LogisticRegression()\n", + "logistic_regression.fit(train_features_q, train_labels)\n", + "\n", + "logistic_score = logistic_regression.score(test_features_q, test_labels)\n", + "print(f\"Logistic regression score: {logistic_score}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's plot the results. First, we plot the transformed dataset we get with the quantum kernel. On the same plot we also add model results. Then, we plot the transformed dataset we get with the gaussian kernel." + ] + }, + { + "cell_type": "code", + "execution_count": 185, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 1000x500 with 2 Axes>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "fig, (q_ax, rbf_ax) = plt.subplots(1, 2, figsize=(10, 5))\n", + "\n", + "\n", + "plot_features(q_ax, train_features_q, train_labels, 0, \"s\", \"w\", \"b\", \"A train\")\n", + "plot_features(q_ax, train_features_q, train_labels, 1, \"o\", \"w\", \"r\", \"B train\")\n", + "\n", + "plot_features(q_ax, test_features_q, test_labels, 0, \"s\", \"b\", \"w\", \"A test\")\n", + "plot_features(q_ax, test_features_q, test_labels, 1, \"o\", \"r\", \"w\", \"B test\")\n", + "\n", + "q_ax.set_ylabel(\"Principal component #1\")\n", + "q_ax.set_xlabel(\"Principal component #0\")\n", + "q_ax.set_title(\"Projection of training and test data\\n using KPCA with Quantum Kernel\")\n", + "\n", + "# Plotting the linear separation\n", + "h = 0.01 # step size in the mesh\n", + "\n", + "# create a mesh to plot in\n", + "x_min, x_max = train_features_q[:, 0].min() - 1, train_features_q[:, 0].max() + 
1\n", + "y_min, y_max = train_features_q[:, 1].min() - 1, train_features_q[:, 1].max() + 1\n", + "xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n", + "\n", + "predictions = logistic_regression.predict(np.c_[xx.ravel(), yy.ravel()])\n", + "\n", + "# Put the result into a color plot\n", + "predictions = predictions.reshape(xx.shape)\n", + "q_ax.contourf(xx, yy, predictions, cmap=plt.cm.RdBu, alpha=0.2)\n", + "\n", + "plot_features(rbf_ax, train_features_rbf, train_labels, 0, \"s\", \"w\", \"b\", \"A train\")\n", + "plot_features(rbf_ax, train_features_rbf, train_labels, 1, \"o\", \"w\", \"r\", \"B train\")\n", + "plot_features(rbf_ax, test_features_rbf, test_labels, 0, \"s\", \"b\", \"w\", \"A test\")\n", + "plot_features(rbf_ax, test_features_rbf, test_labels, 1, \"o\", \"r\", \"w\", \"B test\")\n", + "\n", + "rbf_ax.set_ylabel(\"Principal component #1\")\n", + "rbf_ax.set_xlabel(\"Principal component #0\")\n", + "rbf_ax.set_title(\"Projection of training and test data\\n using KernelPCA\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As we can see, the data points in the right-hand figure are not linearly separable, but those in the left-hand figure are. With the quantum kernel we can therefore apply linear models to the transformed dataset, which is why the SVM classifier works so well on the _ad hoc_ dataset, as we saw in the [classification section](#2.-Classification)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. 
Conclusion\n", + "\n", + "In this tutorial:\n", + "\n", + "* We reviewed the fundamentals of quantum kernel learning\n", + "* We understood how to define quantum kernels as instances of `FidelityQuantumKernel`\n", + "* We learned how to use the `scikit-learn` `SVC` algorithm with a custom quantum kernel as a callable function vs precomputed quantum kernel matrix for classification\n", + "* We learned how to train classifiers with the `QSVC` algorithm from `qiskit-machine-learning`\n", + "* We learned how to use the `scikit-learn` `SpectralClustering` algorithms with a precomputed quantum kernel matrix for clustering\n", + "* We investigated how to plug in a quantum kernel into `scikit-learn`'s `KernelPCA` algorithm and transform the ad-hoc dataset into a new one that can be tackled by a linear model." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For further reference, `scikit-learn` has other algorithms that can use a precomputed kernel matrix, such as:\n", + "\n", + "- [Agglomerative clustering](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html)\n", + "- [Support vector regression](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html)\n", + "- [Ridge regression](https://scikit-learn.org/stable/modules/generated/sklearn.kernel_ridge.KernelRidge.html)\n", + "- [Gaussian process regression](https://scikit-learn.org/stable/modules/gaussian_process.html)" + ] + }, + { + "cell_type": "code", + "execution_count": 186, + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "<h3>Version Information</h3><table><tr><th>Software</th><th>Version</th></tr><tr><td><code>qiskit</code></td><td>0.44.1</td></tr><tr><td><code>qiskit-terra</code></td><td>0.25.1</td></tr><tr><td><code>qiskit_machine_learning</code></td><td>0.6.1</td></tr><tr><th colspan='2'>System information</th></tr><tr><td>Python version</td><td>3.10.8</td></tr><tr><td>Python compiler</td><td>GCC 
10.4.0</td></tr><tr><td>Python build</td><td>main, Nov 22 2022 08:26:04</td></tr><tr><td>OS</td><td>Linux</td></tr><tr><td>CPUs</td><td>8</td></tr><tr><td>Memory (Gb)</td><td>31.14303207397461</td></tr><tr><td colspan='2'>Tue Oct 03 21:31:30 2023 UTC</td></tr></table>" + ], + "text/plain": [ + "<IPython.core.display.HTML object>" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "<div style='width: 100%; background-color:#d5d9e0;padding-left: 10px; padding-bottom: 10px; padding-right: 10px; padding-top: 5px'><h3>This code is a part of Qiskit</h3><p>© Copyright IBM 2017, 2023.</p><p>This code is licensed under the Apache License, Version 2.0. You may<br>obtain a copy of this license in the LICENSE.txt file in the root directory<br> of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.<p>Any modifications or derivative works of this code must retain this<br>copyright notice, and modified files need to carry a notice indicating<br>that they have been altered from the originals.</p></div>" + ], + "text/plain": [ + "<IPython.core.display.HTML object>" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import qiskit.tools.jupyter\n", + "\n", + "%qiskit_version_table\n", + "%qiskit_copyright" + ] + } + ], + "metadata": { + "celltoolbar": "Tags", + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.8" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "state": { + "01d066c1f1c54c6c8cfb7842f68b10a6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": 
"IPY_MODEL_87dbae66a3634ce9a17e663132e049af", + "style": "IPY_MODEL_02a2f339bd9145feb46db29329c08d1d", + "value": "<h5>Queue</h5>" + } + }, + "02a2f339bd9145feb46db29329c08d1d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "0eede61492c84361a1162a251e9748ed": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "4307f7cfa8754e28aede069c72cecc43": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_aa639da14ee24fb6b017115d2c7d102c", + "style": "IPY_MODEL_d8838e34cdaf45e1b5d77c5a7035b4e1", + "value": "<p style='font-family: IBM Plex Sans, Arial, Helvetica, sans-serif; font-size: 20px; font-weight: medium;'>Circuit Properties</p>" + } + }, + "4aeec806743247e8a17287ee5a25494a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_aee57f277cdf4edfb3617e7224f7933b", + "style": "IPY_MODEL_0eede61492c84361a1162a251e9748ed", + "value": "<h5>Message</h5>" + } + }, + "568e3882e1a14cfaabb25e8c0e381599": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "ButtonModel", + "state": { + "button_style": "primary", + "description": "Clear", + "layout": "IPY_MODEL_56b5c9b59ddc4ec7820139770c33929f", + "style": "IPY_MODEL_805dc61de96b42b480a3b71dccea1c63", + "tooltip": null + } + }, + "56b5c9b59ddc4ec7820139770c33929f": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "grid_area": "right", + "padding": "0px 0px 0px 0px", + "width": "70px" + } + }, + 
"662bd9ab875446a38b40fd53e41a7a95": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "grid_template_areas": "\n \". . . . right \"\n ", + "grid_template_columns": "20% 20% 20% 20% 20%", + "width": "100%" + } + }, + "7729f5d6c8a0479e9450127f73dee66a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_ae9088c42d7a426aaa942c8b10e8070c", + "style": "IPY_MODEL_98bc89a17b254b5180657f8b616321f2", + "value": "<h5>Job ID</h5>" + } + }, + "7e39e325220040478aed085939ceb13c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HBoxModel", + "state": { + "children": [ + "IPY_MODEL_7729f5d6c8a0479e9450127f73dee66a", + "IPY_MODEL_88a43a8aab17497b988d98fe58011310", + "IPY_MODEL_bf977d5fba7f40149adeed029aca024b", + "IPY_MODEL_01d066c1f1c54c6c8cfb7842f68b10a6", + "IPY_MODEL_4aeec806743247e8a17287ee5a25494a" + ], + "layout": "IPY_MODEL_e9929efdf3d948f284c88be52b7683f0" + } + }, + "805dc61de96b42b480a3b71dccea1c63": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "ButtonStyleModel", + "state": { + "font_family": null, + "font_size": null, + "font_style": null, + "font_variant": null, + "font_weight": null, + "text_color": null, + "text_decoration": null + } + }, + "87dbae66a3634ce9a17e663132e049af": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "width": "70px" + } + }, + "881d7fb7653143b886d3fe7d452946f3": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "width": "145px" + } + }, + "88a43a8aab17497b988d98fe58011310": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": 
"IPY_MODEL_881d7fb7653143b886d3fe7d452946f3", + "style": "IPY_MODEL_d13264c6c07d44299ac4a6f8d67b22a9", + "value": "<h5>Backend</h5>" + } + }, + "9288ad31da9341a79b2c28f4da1fb995": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "width": "95px" + } + }, + "98bc89a17b254b5180657f8b616321f2": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "9bbd7caa4133469c97d1810d18c52541": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "aa639da14ee24fb6b017115d2c7d102c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "margin": "0px 0px 10px 0px" + } + }, + "ae9088c42d7a426aaa942c8b10e8070c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "width": "190px" + } + }, + "aee57f277cdf4edfb3617e7224f7933b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": {} + }, + "bf977d5fba7f40149adeed029aca024b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLModel", + "state": { + "layout": "IPY_MODEL_9288ad31da9341a79b2c28f4da1fb995", + "style": "IPY_MODEL_9bbd7caa4133469c97d1810d18c52541", + "value": "<h5>Status</h5>" + } + }, + "d13264c6c07d44299ac4a6f8d67b22a9": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "d8838e34cdaf45e1b5d77c5a7035b4e1": { + 
"model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "HTMLStyleModel", + "state": { + "description_width": "", + "font_size": null, + "text_color": null + } + }, + "e9929efdf3d948f284c88be52b7683f0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "2.0.0", + "model_name": "LayoutModel", + "state": { + "margin": "0px 0px 0px 37px", + "width": "600px" + } + }, + "efd0a78a618741649620bdff3b084938": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "2.0.0", + "model_name": "GridBoxModel", + "state": { + "children": [ + "IPY_MODEL_568e3882e1a14cfaabb25e8c0e381599" + ], + "layout": "IPY_MODEL_662bd9ab875446a38b40fd53e41a7a95" + } + } + }, + "version_major": 2, + "version_minor": 0 + } + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}