--- a
+++ b/Code/All PennyLane QML Demos/26 Learn QNNs 210.2s kkawchak.ipynb
@@ -0,0 +1,1947 @@
+{
+  "cells": [
+    {
+      "cell_type": "code",
+      "execution_count": 27,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "2eG4NG-MW7k4",
+        "outputId": "a21ced57-ae5d-4584-abb4-283c047eb428"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Time in seconds since beginning of run: 1701021597.313174\n",
+            "Sun Nov 26 17:59:57 2023\n"
+          ]
+        }
+      ],
+      "source": [
+        "# This cell is added by sphinx-gallery\n",
+        "# It can be customized to whatever you like\n",
+        "%matplotlib inline\n",
+        "# !pip install pennylane\n",
+        "# !pip install tensorflow==2.8.1\n",
+        "import time\n",
+        "seconds = time.time()\n",
+        "print(\"Time in seconds since beginning of run:\", seconds)\n",
+        "local_time = time.ctime(seconds)\n",
+        "print(local_time)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "rytAL-cWW7k5"
+      },
+      "source": [
+        "Learning to learn with quantum neural networks\n",
+        "==============================================\n",
+        "\n",
+        "::: {.meta}\n",
+        ":property=\\\"og:description\\\": Use a classical recurrent neural network\n",
+        "to initialize the parameters of a variational quantum algorithm.\n",
+        ":property=\\\"og:image\\\": ../demonstrations/learning2learn/thumbnail.png\n",
+        ":::\n",
+        "\n",
+        "::: {.related}\n",
+        "tutorial\\_qaoa\\_intro Intro to QAOA tutorial\\_qaoa\\_maxcut QAOA for\n",
+        "MaxCut problem\n",
+        ":::\n",
+        "\n",
+        "*Author: Stefano Mangini --- Posted: 02 March 2021. Last updated: 15\n",
+        "September 2021.*\n",
+        "\n",
+        "In this demo we recreate the architecture proposed in *Learning to learn\n",
+        "with quantum neural networks via classical neural networks*, using\n",
+        "**PennyLane** and **TensorFlow**. We use classical recurrent neural\n",
+        "networks to assist the optimization of variational quantum algorithms.\n",
+        "\n",
+        "We start with a brief theoretical overview explaining the problem and\n",
+        "the setup used to solve it. After that, we dive deep into the code to\n",
+        "build a fully functioning model, ready to be further developed or\n",
+        "customized for your own needs. Without further ado, let's begin!\n",
+        "\n",
+        "Problem: Optimization of Variational Quantum Algorithms\n",
+        "-------------------------------------------------------\n",
+        "\n",
+        "Recently, a big effort by the quantum computing community has been\n",
+        "devoted to the study of variational quantum algorithms (VQAs) which\n",
+        "leverage quantum circuits with fixed shape and tunable parameters. The\n",
+        "idea is similar to classical neural networks, where the weights of the\n",
+        "network are optimized during training. Similarly, once the shape of the\n",
+        "variational quantum circuit is chosen --- something that is very\n",
+        "difficult and sensitive to the particular task at hand --- its tunable\n",
+        "parameters are optimized iteratively by minimizing a cost (or loss)\n",
+        "function, which measures how well the quantum algorithm is performing\n",
+        "(see for a thorough overview on VQAs).\n",
+        "\n",
+        "A major challenge for VQAs relates to the optimization of tunable\n",
+        "parameters, which was shown to be a very hard task. Parameter\n",
+        "initialization plays a key role in this scenario, since initializing the\n",
+        "parameters in the proximity of an optimal solution leads to faster\n",
+        "convergence and better results. Thus, a good initialization strategy is\n",
+        "crucial to promote the convergence of local optimizers to local extrema\n",
+        "and to select reasonably good local minima. By local optimizer, we mean\n",
+        "a procedure that moves from one solution to another by small (local)\n",
+        "changes in parameter space. These are opposed to global search methods,\n",
+        "which take into account large sections of parameter space to propose a\n",
+        "new solution.\n",
+        "\n",
+        "One such strategy could come from the classical machine learning\n",
+        "literature.\n",
+        "\n",
+        "Solution: Classical Recurrent Neural Networks\n",
+        "---------------------------------------------\n",
+        "\n",
+        "By building on results from the *meta-learning* literature in machine\n",
+        "learning, the authors of the paper propose to use a Recurrent Neural Network (RNN) as\n",
+        "a black-box controller to optimize the parameters of variational quantum\n",
+        "algorithms, as shown in the figure below. The cost function used is the\n",
+        "expectation value\n",
+        "$\\langle H \\rangle_{\\boldsymbol{\\theta}} = \\langle \\psi_{\\boldsymbol{\\theta}} | H | \\psi_{\\boldsymbol{\\theta}}\\rangle$\n",
+        "of a Hamiltonian $H$ with respect to the parametrized state\n",
+        "$|\\psi_\\boldsymbol{\\theta}\\rangle$ evolved by applying the variational\n",
+        "quantum circuit to the zero state $|00\\cdots0\\rangle$.\n",
+        "\n",
+        "![](../demonstrations/learning2learn/HybridLSTM.png){.align-center\n",
+        "width=\"100.0%\"}\n",
+        "\n",
+        "Given parameters $\\boldsymbol{\\theta}_{t-1}$ of the variational quantum\n",
+        "circuit, the cost function $y_{t-1}$, and the hidden state of the\n",
+        "classical network $\\boldsymbol{h}_{t-1}$ at the previous time step, the\n",
+        "recurrent neural network proposes a new guess for the parameters\n",
+        "$\\boldsymbol{\\theta}_t$, which are then fed into the quantum computer to\n",
+        "evaluate the cost function $y_t$. By repeating this cycle a few times,\n",
+        "and by training the weights of the recurrent neural network to minimize\n",
+        "the loss function $y_t$, a good initialization heuristic is found for\n",
+        "the parameters $\\boldsymbol{\\theta}$ of the variational quantum circuit.\n",
+        "\n",
+        "At a given iteration, the RNN receives as input the previous cost\n",
+        "function $y_t$ evaluated on the quantum computer, where $y_t$ is the\n",
+        "estimate of $\\langle H\\rangle_{t}$, as well as the parameters\n",
+        "$\\boldsymbol{\\theta}_t$ for which the variational circuit was evaluated.\n",
+        "The RNN at this time step also receives information stored in its\n",
+        "internal hidden state from the previous time step $\\boldsymbol{h}_t$.\n",
+        "The RNN itself has trainable parameters $\\phi$, and hence it applies the\n",
+        "parametrized mapping:\n",
+        "\n",
+        "$$\\boldsymbol{h}_{t+1}, \\boldsymbol{\\theta}_{t+1} = \\text{RNN}_{\\phi}(\\boldsymbol{h}_{t}, \\boldsymbol{\\theta}_{t}, y_{t}),$$\n",
+        "\n",
+        "which generates a new suggestion for the variational parameters as well\n",
+        "as a new internal state. Upon training the weights $\\phi$, the RNN\n",
+        "eventually learns a good heuristic to suggest optimal parameters for the\n",
+        "quantum circuit.\n",
+        "\n",
+        "Thus, by training on a dataset of graphs, the RNN can subsequently be\n",
+        "used to provide suggestions for starting points on new graphs! We are\n",
+        "not directly optimizing the variational parameters of the quantum\n",
+        "circuit, but instead, we let the RNN figure out how to do that. In this\n",
+        "sense, we are learning (training the RNN) how to learn (how to optimize\n",
+        "a variational quantum circuit).\n",
+        "\n",
+        "**VQAs in focus: QAOA for MaxCut**\n",
+        "\n",
+        "There are multiple VQAs for which this hybrid training routine could be\n",
+        "used, some of them directly analyzed in the original paper. In the following, we focus on\n",
+        "one such example, the Quantum Approximate Optimization Algorithm (QAOA)\n",
+        "for solving the MaxCut problem. Thus, referring to the picture above,\n",
+        "the shape of the variational circuit is the one dictated by the QAOA\n",
+        "ansatz, and such a quantum circuit is used to evaluate the cost\n",
+        "Hamiltonian $H$ of the MaxCut problem. Check out this great tutorial on\n",
+        "how to use QAOA for solving graph problems:\n",
+        "<https://pennylane.ai/qml/demos/tutorial_qaoa_intro.html>\n",
+        "\n",
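+        "As a reminder of what this ansatz looks like, PennyLane's `qaoa.maxcut`\n",
+        "is documented to return, for a graph with edge set $E$, the cost and mixer\n",
+        "Hamiltonians\n",
+        "\n",
+        "$$H_C = \\frac{1}{2} \\sum_{(i,j) \\in E} \\left( Z_i Z_j - \\mathbb{I} \\right), \\qquad H_M = \\sum_{i} X_i,$$\n",
+        "\n",
+        "so a single-layer QAOA circuit, written here with the angle names $\\gamma$\n",
+        "(cost) and $\\alpha$ (mixer) used in the code below, prepares\n",
+        "\n",
+        "$$|\\psi_{\\boldsymbol{\\theta}}\\rangle = e^{-i \\alpha H_M} \\, e^{-i \\gamma H_C} \\, |+\\rangle^{\\otimes n}, \\qquad \\boldsymbol{\\theta} = (\\gamma, \\alpha),$$\n",
+        "\n",
+        "where $|+\\rangle^{\\otimes n}$ is the uniform superposition obtained by applying\n",
+        "a Hadamard gate to each qubit of $|00\\cdots0\\rangle$. Each cut edge contributes\n",
+        "$-1$ to $\\langle H_C \\rangle$ and each uncut edge contributes $0$, so lower cost\n",
+        "values correspond to larger cuts.\n",
+        "\n",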
+        "::: {.note}\n",
+        "::: {.title}\n",
+        "Note\n",
+        ":::\n",
+        "\n",
+        "Running the tutorial (excluding the Appendix) takes approximately 13 minutes.\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "eu3eqTZIW7k6"
+      },
+      "source": [
+        "**Importing the required packages**\n",
+        "\n",
+        "During this tutorial, we will use **PennyLane** for executing quantum\n",
+        "circuits and for integrating seamlessly with **TensorFlow**, which will\n",
+        "be used for creating the RNN.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 28,
+      "metadata": {
+        "id": "VZImhkcGW7k6"
+      },
+      "outputs": [],
+      "source": [
+        "# Quantum Machine Learning\n",
+        "import pennylane as qml\n",
+        "from pennylane import qaoa\n",
+        "\n",
+        "# Classical Machine Learning\n",
+        "import tensorflow as tf\n",
+        "\n",
+        "# Generation of graphs\n",
+        "import networkx as nx\n",
+        "\n",
+        "# Standard Python libraries\n",
+        "import numpy as np\n",
+        "import matplotlib.pyplot as plt\n",
+        "import random\n",
+        "\n",
+        "# Fix the seed for reproducibility, which affects all random functions in this demo\n",
+        "random.seed(42)\n",
+        "np.random.seed(42)\n",
+        "tf.random.set_seed(42)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "fiDUxW0HW7k6"
+      },
+      "source": [
+        "Generation of training data: graphs\n",
+        "===================================\n",
+        "\n",
+        "The first step is to gather or create a good dataset that will be used\n",
+        "to train the model and test its performance. In our case, we are\n",
+        "analyzing MaxCut, which deals with the problem of finding a good binary\n",
+        "partition of nodes in a graph such that the number of edges *cut* by\n",
+        "such a separation is maximized. We start by generating some random\n",
+        "graphs $G_{n,p}$ where:\n",
+        "\n",
+        "-   $n$ is the number of nodes in each graph,\n",
+        "-   $p$ is the probability of having an edge between two nodes.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 29,
+      "metadata": {
+        "id": "c_rJNPLlW7k7"
+      },
+      "outputs": [],
+      "source": [
+        "def generate_graphs(n_graphs, n_nodes, p_edge):\n",
+        "    \"\"\"Generate a list containing random graphs generated by Networkx.\"\"\"\n",
+        "\n",
+        "    datapoints = []\n",
+        "    for _ in range(n_graphs):\n",
+        "        random_graph = nx.gnp_random_graph(n_nodes, p=p_edge)\n",
+        "        datapoints.append(random_graph)\n",
+        "    return datapoints"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "N6Lj3XcbW7k7"
+      },
+      "source": [
+        "An example of a random graph generated using the function\n",
+        "`generate_graphs` just defined:\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 30,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 516
+        },
+        "id": "sT8W61vWW7k7",
+        "outputId": "b5ef7fa4-e75a-47eb-82d8-6254ed445ff2"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "# Define parameters of the graphs\n",
+        "n_graphs = 20\n",
+        "n_nodes = 7\n",
+        "p_edge = 3.0 / n_nodes\n",
+        "graphs = generate_graphs(n_graphs, n_nodes, p_edge)\n",
+        "\n",
+        "nx.draw(graphs[0])"
+      ]
+    },
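+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "As a rough sanity check on these settings: in the $G_{n,p}$ model each of\n",
+        "the $\\binom{n}{2}$ possible edges is present independently with\n",
+        "probability $p$, so with $n = 7$ and $p = 3/7$ the expected number of\n",
+        "edges and the expected degree of a node are\n",
+        "\n",
+        "$$\\mathbb{E}[|E|] = p \\binom{n}{2} = \\frac{3}{7} \\cdot 21 = 9, \\qquad \\mathbb{E}[\\deg] = p\\,(n-1) = \\frac{18}{7} \\approx 2.57.$$\n"
+      ]
+    },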
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "syFn_GVnW7k7"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_Graph0.png){.align-center\n",
+        "width=\"70.0%\"}\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "A-yzL4ASW7k7"
+      },
+      "source": [
+        "Variational Quantum Circuit: QAOA\n",
+        "=================================\n",
+        "\n",
+        "Now that we have a dataset, we move on by creating the QAOA quantum\n",
+        "circuits using PennyLane's built-in sub-packages. In particular, using\n",
+        "PennyLane's `qaoa` module, we will be able to create fully functioning\n",
+        "quantum circuits for the MaxCut problem, with very few lines of code.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 31,
+      "metadata": {
+        "id": "TSE2tqZnW7k7"
+      },
+      "outputs": [],
+      "source": [
+        "def qaoa_from_graph(graph, n_layers=1):\n",
+        "    \"\"\"Uses QAOA to create a cost Hamiltonian for the MaxCut problem.\"\"\"\n",
+        "\n",
+        "    # Number of qubits (wires) equal to the number of nodes in the graph\n",
+        "    wires = range(len(graph.nodes))\n",
+        "\n",
+        "    # Define the structure of the cost and mixer subcircuits for the MaxCut problem\n",
+        "    cost_h, mixer_h = qaoa.maxcut(graph)\n",
+        "\n",
+        "    # Defines a layer of the QAOA ansatz from the cost and mixer Hamiltonians\n",
+        "    def qaoa_layer(gamma, alpha):\n",
+        "        qaoa.cost_layer(gamma, cost_h)\n",
+        "        qaoa.mixer_layer(alpha, mixer_h)\n",
+        "\n",
+        "    # Creates the actual quantum circuit for the QAOA algorithm\n",
+        "    def circuit(params, **kwargs):\n",
+        "        for w in wires:\n",
+        "            qml.Hadamard(wires=w)\n",
+        "        qml.layer(qaoa_layer, n_layers, params[0], params[1])\n",
+        "        return qml.expval(cost_h)\n",
+        "\n",
+        "    # Evaluates the cost Hamiltonian\n",
+        "    def hamiltonian(params, **kwargs):\n",
+        "        \"\"\"Evaluate the cost Hamiltonian, given the angles and the graph.\"\"\"\n",
+        "\n",
+        "        # We set the default.qubit.tf device for seamless integration with TensorFlow\n",
+        "        dev = qml.device(\"default.qubit.tf\", wires=len(graph.nodes))\n",
+        "\n",
+        "        # This qnode evaluates the expectation value of the cost hamiltonian operator\n",
+        "        cost = qml.QNode(circuit, dev, interface=\"tf\", diff_method=\"backprop\")\n",
+        "\n",
+        "        return cost(params)\n",
+        "\n",
+        "    return hamiltonian"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "OghYe4TIW7k7"
+      },
+      "source": [
+        "Before continuing, let's see how to use these functions.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 32,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "_M3VTQRpW7k7",
+        "outputId": "1f47ab70-5a9e-4f05-99fc-4774ba27a4cf"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "tf.Tensor(-3.1932679572555784, shape=(), dtype=float64)\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Create an instance of a QAOA circuit given a graph.\n",
+        "cost = qaoa_from_graph(graph=graphs[0], n_layers=1)\n",
+        "\n",
+        "# Since we use only one layer in QAOA, params have the shape 2 x 1,\n",
+        "# in the form [[gamma], [alpha]].\n",
+        "x = tf.Variable([[0.5], [0.5]], dtype=tf.float32)\n",
+        "\n",
+        "# Evaluate the QAOA instance just created with some angles.\n",
+        "print(cost(x))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "thDyEL3MW7k7"
+      },
+      "source": [
+        "::: {.rst-class}\n",
+        "sphx-glr-script-out\n",
+        "\n",
+        "Out:\n",
+        "\n",
+        "``` {.none}\n",
+        "tf.Tensor(-3.193267957255582, shape=(), dtype=float64)\n",
+        "```\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "SdSJTG_LW7k7"
+      },
+      "source": [
+        "Recurrent Neural Network: LSTM\n",
+        "==============================\n",
+        "\n",
+        "So far, we have defined the machinery which lets us build the QAOA\n",
+        "algorithm for solving the MaxCut problem. Now we wish to implement the\n",
+        "Recurrent Neural Network architecture explained previously. As proposed\n",
+        "in the original paper, we will build a custom model of a Long Short-Term\n",
+        "Memory (LSTM) network, capable of handling the hybrid data passing\n",
+        "between classical and quantum procedures. For this task, we will use\n",
+        "`Keras` and `TensorFlow`.\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "ka7S-aTmW7k7"
+      },
+      "source": [
+        "First of all, let's define the elemental building block of the model, an\n",
+        "LSTM cell (see [TensorFlow\n",
+        "documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers/LSTMCell)\n",
+        "for further details).\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 33,
+      "metadata": {
+        "id": "qlZAWZSIW7k8"
+      },
+      "outputs": [],
+      "source": [
+        "# Set the number of layers in the QAOA ansatz.\n",
+        "# The higher the better in terms of performance, but it also gets more\n",
+        "# computationally expensive. For simplicity, we stick to the single layer case.\n",
+        "n_layers = 1\n",
+        "\n",
+        "# Define a single LSTM cell.\n",
+        "# The cell has two units per layer since each layer in the QAOA ansatz\n",
+        "# makes use of two parameters.\n",
+        "cell = tf.keras.layers.LSTMCell(2 * n_layers)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "XrSz5XhlW7k8"
+      },
+      "source": [
+        "Using the `qaoa_from_graph` function, we create a list `graph_cost_list`\n",
+        "containing the cost functions of a set of graphs. You can see this as a\n",
+        "preprocessing step of the data.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 34,
+      "metadata": {
+        "id": "YLfLPV0SW7k8"
+      },
+      "outputs": [],
+      "source": [
+        "# We create the QAOA MaxCut cost functions of some graphs\n",
+        "graph_cost_list = [qaoa_from_graph(g) for g in graphs]"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "e-EDV2h3W7k8"
+      },
+      "source": [
+        "At this stage, we seek to reproduce the recurrent behavior depicted in\n",
+        "the picture above, outlining the functioning of an RNN as a black-box\n",
+        "optimizer. We do so by defining two functions:\n",
+        "\n",
+        "-   `rnn_iteration`: accounts for the computations happening on a single\n",
+        "    time step in the figure. It performs the calculation inside the CPU\n",
+        "    and evaluates the quantum circuit on the QPU to obtain the loss\n",
+        "    function for the current parameters.\n",
+        "-   `recurrent_loop`: as the name suggests, it accounts for the creation\n",
+        "    of the recurrent loop of the model. In particular, it makes\n",
+        "    consecutive calls to the `rnn_iteration` function, where the outputs\n",
+        "    of a previous call are fed as inputs of the next call.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 35,
+      "metadata": {
+        "id": "yIdzr_qAW7k8"
+      },
+      "outputs": [],
+      "source": [
+        "def rnn_iteration(inputs, graph_cost, n_layers=1):\n",
+        "    \"\"\"Perform a single time step in the computational graph of the custom RNN.\"\"\"\n",
+        "\n",
+        "    # Unpack the input list containing the previous cost, parameters,\n",
+        "    # and hidden states (denoted as 'h' and 'c').\n",
+        "    prev_cost = inputs[0]\n",
+        "    prev_params = inputs[1]\n",
+        "    prev_h = inputs[2]\n",
+        "    prev_c = inputs[3]\n",
+        "\n",
+        "    # Concatenate the previous parameters and previous cost to create new input\n",
+        "    new_input = tf.keras.layers.concatenate([prev_cost, prev_params])\n",
+        "\n",
+        "    # Call the LSTM cell, which outputs new values for the parameters along\n",
+        "    # with new internal states h and c\n",
+        "    new_params, [new_h, new_c] = cell(new_input, states=[prev_h, prev_c])\n",
+        "\n",
+        "    # Reshape the parameters to correctly match those expected by PennyLane\n",
+        "    _params = tf.reshape(new_params, shape=(2, n_layers))\n",
+        "\n",
+        "    # Evaluate the cost using new angles\n",
+        "    _cost = graph_cost(_params)\n",
+        "\n",
+        "    # Reshape to be consistent with other tensors\n",
+        "    new_cost = tf.reshape(tf.cast(_cost, dtype=tf.float32), shape=(1, 1))\n",
+        "\n",
+        "    return [new_cost, new_params, new_h, new_c]\n",
+        "\n",
+        "\n",
+        "def recurrent_loop(graph_cost, n_layers=1, intermediate_steps=False):\n",
+        "    \"\"\"Creates the recurrent loop for the Recurrent Neural Network.\"\"\"\n",
+        "\n",
+        "    # Initialize all inputs (cost, parameters, hidden states) as zeros.\n",
+        "    initial_cost = tf.zeros(shape=(1, 1))\n",
+        "    initial_params = tf.zeros(shape=(1, 2 * n_layers))\n",
+        "    initial_h = tf.zeros(shape=(1, 2 * n_layers))\n",
+        "    initial_c = tf.zeros(shape=(1, 2 * n_layers))\n",
+        "\n",
+        "    # We perform five consecutive calls to 'rnn_iteration', thus creating the\n",
+        "    # recurrent loop. More iterations lead to better results, at the cost of\n",
+        "    # more computationally intensive simulations.\n",
+        "    out0 = rnn_iteration([initial_cost, initial_params, initial_h, initial_c], graph_cost, n_layers)\n",
+        "    out1 = rnn_iteration(out0, graph_cost, n_layers)\n",
+        "    out2 = rnn_iteration(out1, graph_cost, n_layers)\n",
+        "    out3 = rnn_iteration(out2, graph_cost, n_layers)\n",
+        "    out4 = rnn_iteration(out3, graph_cost, n_layers)\n",
+        "\n",
+        "    # This cost function takes into account the cost from all iterations,\n",
+        "    # but using different weights.\n",
+        "    loss = tf.keras.layers.average(\n",
+        "        [0.1 * out0[0], 0.2 * out1[0], 0.3 * out2[0], 0.4 * out3[0], 0.5 * out4[0]]\n",
+        "    )\n",
+        "\n",
+        "    if intermediate_steps:\n",
+        "        return [out0[1], out1[1], out2[1], out3[1], out4[1], loss]\n",
+        "    else:\n",
+        "        return loss"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "AOu2R88QW7k8"
+      },
+      "source": [
+        "**The cost function**\n",
+        "\n",
+        "A key part in the `recurrent_loop` function is given by the definition\n",
+        "of the variable `loss`. In order to drive the learning procedure of the\n",
+        "weights in the LSTM cell, a cost function is needed. While in the\n",
+        "original paper the authors suggest using a measure called *observed\n",
+        "improvement*, for simplicity here we use an easier cost function\n",
+        "$\\cal{L}(\\phi)$ defined as:\n",
+        "\n",
+        "$$\\cal{L}(\\phi) = {\\bf w} \\cdot {\\bf y}_t(\\phi),$$\n",
+        "\n",
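+        "which, written out for the five iterations performed in `recurrent_loop`\n",
+        "(the factor $\\frac{1}{5}$ comes from `tf.keras.layers.average`, which takes\n",
+        "the mean of its five inputs), reads\n",
+        "\n",
+        "$$\\cal{L}(\\phi) = \\frac{1}{5} \\left( 0.1\\, y_1 + 0.2\\, y_2 + 0.3\\, y_3 + 0.4\\, y_4 + 0.5\\, y_5 \\right),$$\n",
+        "\n",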
+        "where ${\\bf y}_t(\\phi) = (y_1, \\cdots, y_5)$ contains the Hamiltonian\n",
+        "cost functions from all iterations, and ${\\bf w}$ are just some\n",
+        "coefficients weighting the different steps in the recurrent loop. In\n",
+        "this case, we used ${\\bf w}=\\frac{1}{5} (0.1, 0.2, 0.3, 0.4, 0.5)$, to\n",
+        "give more importance to the last steps rather than the initial steps.\n",
+        "Intuitively, in this way the RNN is freer (low coefficient) to\n",
+        "explore a larger portion of parameter space during the first steps of\n",
+        "optimization, while it is constrained (high coefficient) to select an\n",
+        "optimal solution towards the end of the procedure. Note that one could\n",
+        "also use just the final cost function from the last iteration to drive\n",
+        "the training procedure of the RNN. However, using values also from\n",
+        "intermediate steps allows for a smoother suggestion routine, since even\n",
+        "non-optimal parameter suggestions from early steps are penalized using\n",
+        "$\\cal{L}(\\phi)$.\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Yhl1qlXuW7k8"
+      },
+      "source": [
+        "**Training**\n",
+        "\n",
+        "Now all the cards are on the table and we just need to prepare a\n",
+        "training routine and then run it!\n",
+        "\n",
+        "First of all, let's wrap a single gradient descent step inside a custom\n",
+        "function `train_step`.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 36,
+      "metadata": {
+        "id": "3sjfCmA8W7k8"
+      },
+      "outputs": [],
+      "source": [
+        "def train_step(graph_cost):\n",
+        "    \"\"\"Single optimization step in the training procedure.\"\"\"\n",
+        "\n",
+        "    with tf.GradientTape() as tape:\n",
+        "        # Evaluates the cost function\n",
+        "        loss = recurrent_loop(graph_cost)\n",
+        "\n",
+        "    # Evaluates gradients, cell is the LSTM cell defined previously\n",
+        "    grads = tape.gradient(loss, cell.trainable_weights)\n",
+        "\n",
+        "    # Apply gradients and update the weights of the LSTM cell\n",
+        "    opt.apply_gradients(zip(grads, cell.trainable_weights))\n",
+        "    return loss"
+      ]
+    },
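+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "To get a feeling for the size of the model being trained, we can count the\n",
+        "entries of `cell.trainable_weights`. Assuming the standard Keras `LSTMCell`\n",
+        "layout (an input kernel, a recurrent kernel and a bias, each covering the\n",
+        "four LSTM gates), a cell with $u$ units and input dimension $d$ has\n",
+        "\n",
+        "$$|\\phi| = 4u\\,(d + u + 1)$$\n",
+        "\n",
+        "trainable weights. Here $u = 2$ (two units for the single QAOA layer) and\n",
+        "$d = 1 + 2 = 3$ (the previous cost concatenated with the two previous\n",
+        "angles), which gives $|\\phi| = 8 \\cdot 6 = 48$.\n"
+      ]
+    },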
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "uw0zIpuPW7k8"
+      },
+      "source": [
+        "We are now ready to start the training. In particular, we will perform a\n",
+        "stochastic gradient descent in the parameter space of the weights of the\n",
+        "LSTM cell. For each graph in the training set, we evaluate gradients and\n",
+        "update the weights accordingly. Then, we repeat this procedure\n",
+        "multiple times (epochs).\n",
+        "\n",
+        "::: {.note}\n",
+        "::: {.title}\n",
+        "Note\n",
+        ":::\n",
+        "\n",
+        "Be careful when using bigger datasets or training for more epochs,\n",
+        "as this may take a while to execute.\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 37,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "U3LB5fBVW7k8",
+        "outputId": "278fd59f-7039-44e0-982a-b2e0d597a4c8"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Epoch 1\n",
+            " > Graph 1/20 - Loss: -1.6641689538955688\n",
+            " > Graph 6/20 - Loss: -1.4287620782852173\n",
+            " > Graph 11/20 - Loss: -1.3801934719085693\n",
+            " > Graph 16/20 - Loss: -1.2876912355422974\n",
+            " >> Mean Loss during epoch: -1.7656798183917999\n",
+            "Epoch 2\n",
+            " > Graph 1/20 - Loss: -2.0880496501922607\n",
+            " > Graph 6/20 - Loss: -1.502755880355835\n",
+            " > Graph 11/20 - Loss: -1.3986393213272095\n",
+            " > Graph 16/20 - Loss: -1.3164150714874268\n",
+            " >> Mean Loss during epoch: -1.830520111322403\n",
+            "Epoch 3\n",
+            " > Graph 1/20 - Loss: -2.1370296478271484\n",
+            " > Graph 6/20 - Loss: -1.4908565282821655\n",
+            " > Graph 11/20 - Loss: -1.3993809223175049\n",
+            " > Graph 16/20 - Loss: -1.32102370262146\n",
+            " >> Mean Loss during epoch: -1.8355653047561646\n",
+            "Epoch 4\n",
+            " > Graph 1/20 - Loss: -2.1322379112243652\n",
+            " > Graph 6/20 - Loss: -1.4994161128997803\n",
+            " > Graph 11/20 - Loss: -1.3980844020843506\n",
+            " > Graph 16/20 - Loss: -1.3252378702163696\n",
+            " >> Mean Loss during epoch: -1.8335859596729278\n",
+            "Epoch 5\n",
+            " > Graph 1/20 - Loss: -2.1243948936462402\n",
+            " > Graph 6/20 - Loss: -1.5013129711151123\n",
+            " > Graph 11/20 - Loss: -1.3975803852081299\n",
+            " > Graph 16/20 - Loss: -1.325676679611206\n",
+            " >> Mean Loss during epoch: -1.8336702287197113\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Select an optimizer\n",
+        "opt = tf.keras.optimizers.Adam(learning_rate=0.2)\n",
+        "\n",
+        "# Set the number of training epochs\n",
+        "epochs = 5\n",
+        "\n",
+        "for epoch in range(epochs):\n",
+        "    print(f\"Epoch {epoch+1}\")\n",
+        "    total_loss = np.array([])\n",
+        "    for i, graph_cost in enumerate(graph_cost_list):\n",
+        "        loss = train_step(graph_cost)\n",
+        "        total_loss = np.append(total_loss, loss.numpy())\n",
+        "        # Log every 5 batches.\n",
+        "        if i % 5 == 0:\n",
+        "            print(f\" > Graph {i+1}/{len(graph_cost_list)} - Loss: {loss[0][0]}\")\n",
+        "    print(f\" >> Mean Loss during epoch: {np.mean(total_loss)}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "x0rSiW_BW7k8"
+      },
+      "source": [
+        "::: {.rst-class}\n",
+        "sphx-glr-script-out\n",
+        "\n",
+        "Out:\n",
+        "\n",
+        "``` {.none}\n",
+        "Epoch 1\n",
+        " > Graph 1/20 - Loss: -1.6641689538955688\n",
+        " > Graph 6/20 - Loss: -1.4186843633651733\n",
+        " > Graph 11/20 - Loss: -1.3757232427597046\n",
+        " > Graph 16/20 - Loss: -1.294339656829834\n",
+        " >> Mean Loss during epoch: -1.7352586269378663\n",
+        "Epoch 2\n",
+        " > Graph 1/20 - Loss: -2.119091749191284\n",
+        " > Graph 6/20 - Loss: -1.4789190292358398\n",
+        " > Graph 11/20 - Loss: -1.3779840469360352\n",
+        " > Graph 16/20 - Loss: -1.2963457107543945\n",
+        " >> Mean Loss during epoch: -1.8252217948436738\n",
+        "Epoch 3\n",
+        " > Graph 1/20 - Loss: -2.1322619915008545\n",
+        " > Graph 6/20 - Loss: -1.459418535232544\n",
+        " > Graph 11/20 - Loss: -1.390620470046997\n",
+        " > Graph 16/20 - Loss: -1.3165746927261353\n",
+        " >> Mean Loss during epoch: -1.8328069806098939\n",
+        "Epoch 4\n",
+        " > Graph 1/20 - Loss: -2.1432175636291504\n",
+        " > Graph 6/20 - Loss: -1.476362943649292\n",
+        " > Graph 11/20 - Loss: -1.3938289880752563\n",
+        " > Graph 16/20 - Loss: -1.3140206336975098\n",
+        " >> Mean Loss during epoch: -1.8369774043560028\n",
+        "Epoch 5\n",
+        " > Graph 1/20 - Loss: -2.1429405212402344\n",
+        " > Graph 6/20 - Loss: -1.477513074874878\n",
+        " > Graph 11/20 - Loss: -1.3909202814102173\n",
+        " > Graph 16/20 - Loss: -1.315887689590454\n",
+        " >> Mean Loss during epoch: -1.8371947884559632\n",
+        "```\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "OKnOYDO7W7k8"
+      },
+      "source": [
+        "As you can see, the mean loss decreases from one epoch to the next, and\n",
+        "the loss for each graph generally decreases as well, indicating that the\n",
+        "training routine is working correctly.\n"
+      ]
+    },
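+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "A quick way to check this trend is to compare the per-epoch mean losses.\n",
+        "The cell below is only a minimal sketch: the values in `epoch_means` are\n",
+        "copied by hand from the reference output shown above (they are not read\n",
+        "from the training loop), and the names `epoch_means` and `improvements`\n",
+        "are introduced here purely for illustration.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import numpy as np\n",
+        "\n",
+        "# Mean loss per epoch, copied from the reference output above\n",
+        "epoch_means = np.array(\n",
+        "    [\n",
+        "        -1.7352586269378663,\n",
+        "        -1.8252217948436738,\n",
+        "        -1.8328069806098939,\n",
+        "        -1.8369774043560028,\n",
+        "        -1.8371947884559632,\n",
+        "    ]\n",
+        ")\n",
+        "\n",
+        "# Change in mean loss between consecutive epochs (negative means improvement)\n",
+        "improvements = np.diff(epoch_means)\n",
+        "\n",
+        "print('Change per epoch:', improvements)\n",
+        "print('Mean loss decreased in every epoch:', bool(np.all(improvements < 0)))"
+      ]
+    },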
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "6deUvXWKW7k8"
+      },
+      "source": [
+        "Results\n",
+        "=======\n",
+        "\n",
+        "Let's see how to use the optimized RNN as an initializer for the angles\n",
+        "in the QAOA algorithm.\n",
+        "\n",
+        "First, we pick a new graph, not present in the training dataset:\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 38,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 516
+        },
+        "id": "iBlg7nwWW7k8",
+        "outputId": "02683683-299d-4e3a-84fc-cc6e277a66b6"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "new_graph = nx.gnp_random_graph(7, p=3 / 7)\n",
+        "new_cost = qaoa_from_graph(new_graph)\n",
+        "\n",
+        "nx.draw(new_graph)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "wWa0Nw6rW7k8"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_Graph1.png){.align-center\n",
+        "width=\"70.0%\"}\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "aI5XYFUoW7k8"
+      },
+      "source": [
+        "Then we apply the trained RNN to this new graph, saving intermediate\n",
+        "results coming from all the recurrent iterations in the network.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 39,
+      "metadata": {
+        "id": "iI7WAs2TW7k8"
+      },
+      "outputs": [],
+      "source": [
+        "# Apply the RNN (be sure that training was performed)\n",
+        "res = recurrent_loop(new_cost, intermediate_steps=True)\n",
+        "\n",
+        "# Extract all angle suggestions\n",
+        "start_zeros = tf.zeros(shape=(2 * n_layers, 1))\n",
+        "guess_0 = res[0]\n",
+        "guess_1 = res[1]\n",
+        "guess_2 = res[2]\n",
+        "guess_3 = res[3]\n",
+        "guess_4 = res[4]\n",
+        "final_loss = res[5]\n",
+        "\n",
+        "# Wrap them into a list\n",
+        "guesses = [start_zeros, guess_0, guess_1, guess_2, guess_3, guess_4]\n",
+        "\n",
+        "# Losses from the hybrid LSTM model\n",
+        "lstm_losses = [new_cost(tf.reshape(guess, shape=(2, n_layers))) for guess in guesses]"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "gqyCdBCDW7k9"
+      },
+      "source": [
+        "**Plot of the loss function**\n",
+        "\n",
+        "We can plot these losses to see how well the RNN proposes new guesses\n",
+        "for the parameters.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 40,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 453
+        },
+        "id": "XlrZtF4aW7k9",
+        "outputId": "8fa0780d-9f7c-4ba0-bd86-554d3af648da"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "fig, ax = plt.subplots()\n",
+        "\n",
+        "plt.plot(lstm_losses, color=\"blue\", lw=3, ls=\"-.\", label=\"LSTM\")\n",
+        "\n",
+        "plt.grid(ls=\"--\", lw=2, alpha=0.25)\n",
+        "plt.ylabel(\"Cost function\", fontsize=12)\n",
+        "plt.xlabel(\"Iteration\", fontsize=12)\n",
+        "plt.legend()\n",
+        "ax.set_xticks([0, 5, 10, 15, 20]);\n",
+        "plt.show()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "KN_QEBswW7k9"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_LossLSTM.png){.align-center\n",
+        "width=\"70.0%\"}\n",
+        "\n",
+        "That's remarkable! The RNN learned to propose new parameters such that\n",
+        "the MaxCut cost is minimized very rapidly: in just a few iterations the\n",
+        "loss reaches a minimum. Actually, it takes just a single step for the\n",
+        "LSTM to find a very good minimum. In fact, due to the recurrent loop,\n",
+        "the loss in each time step is directly dependent on the previous ones,\n",
+        "with the first iteration thus having a lot of influence on the loss\n",
+        "function defined above. Changing the loss function, for example giving\n",
+        "less importance to initial steps and just focusing on the last one,\n",
+        "leads to different optimization behaviors, but with the same final\n",
+        "results.\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "q7SfpqYGW7k9"
+      },
+      "source": [
+        "**Comparison with standard Stochastic Gradient Descent (SGD)**\n",
+        "\n",
+        "How well does this method compare with standard optimization techniques,\n",
+        "for example, leveraging Stochastic Gradient Descent (SGD) to optimize\n",
+        "the parameters in the QAOA?\n",
+        "\n",
+        "Let's check it out.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 41,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "JUa6RL_JW7k9",
+        "outputId": "99477710-7a48-4ba7-8785-1f499e89dde4"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Step 1 - Loss = -4.170080496283098\n",
+            "Step 2 - Loss = -4.675035876642076\n",
+            "Step 3 - Loss = -5.099499088193464\n",
+            "Step 4 - Loss = -5.403885327056364\n",
+            "Step 5 - Loss = -5.595292031792333\n",
+            "Step 6 - Loss = -5.704951974646541\n",
+            "Step 7 - Loss = -5.764256097617734\n",
+            "Step 8 - Loss = -5.795331977029804\n",
+            "Step 9 - Loss = -5.811387524394411\n",
+            "Step 10 - Loss = -5.819665291205046\n",
+            "Step 11 - Loss = -5.8239672171224806\n",
+            "Step 12 - Loss = -5.8262453708807325\n",
+            "Step 13 - Loss = -5.827491260066166\n",
+            "Step 14 - Loss = -5.828206256116937\n",
+            "Step 15 - Loss = -5.828643789480292\n",
+            "Final cost function: -5.828932361904981\n",
+            "Optimized angles: [[ 0.5865477 ]\n",
+            " [-0.32288585]]\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Parameters are randomly initialized\n",
+        "x = tf.Variable(np.random.rand(2, 1))\n",
+        "\n",
+        "# We set the optimizer to be a Stochastic Gradient Descent\n",
+        "opt = tf.keras.optimizers.SGD(learning_rate=0.01)\n",
+        "step = 15\n",
+        "\n",
+        "# Training process\n",
+        "steps = []\n",
+        "sdg_losses = []\n",
+        "for _ in range(step):\n",
+        "    with tf.GradientTape() as tape:\n",
+        "        loss = new_cost(x)\n",
+        "\n",
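+        "    # Store a copy of the current values; appending x itself would\n",
+        "    # store a reference to the same variable at every step\n",
+        "    steps.append(x.numpy().copy())\n",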
+        "    sdg_losses.append(loss)\n",
+        "\n",
+        "    gradients = tape.gradient(loss, [x])\n",
+        "    opt.apply_gradients(zip(gradients, [x]))\n",
+        "    print(f\"Step {_+1} - Loss = {loss}\")\n",
+        "\n",
+        "print(f\"Final cost function: {new_cost(x).numpy()}\\nOptimized angles: {x.numpy()}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "rqAjFUZsW7k9"
+      },
+      "source": [
+        "::: {.rst-class}\n",
+        "sphx-glr-script-out\n",
+        "\n",
+        "Out:\n",
+        "\n",
+        "``` {.none}\n",
+        "Step 1 - Loss = -4.1700805\n",
+        "Step 2 - Loss = -4.67503588\n",
+        "Step 3 - Loss = -5.09949909\n",
+        "Step 4 - Loss = -5.40388533\n",
+        "Step 5 - Loss = -5.59529203\n",
+        "Step 6 - Loss = -5.70495197\n",
+        "Step 7 - Loss = -5.7642561\n",
+        "Step 8 - Loss = -5.79533198\n",
+        "Step 9 - Loss = -5.81138752\n",
+        "Step 10 - Loss = -5.81966529\n",
+        "Step 11 - Loss = -5.82396722\n",
+        "Step 12 - Loss = -5.82624537\n",
+        "Step 13 - Loss = -5.82749126\n",
+        "Step 14 - Loss = -5.82820626\n",
+        "Step 15 - Loss = -5.82864379\n",
+        "Final cost function: -5.828932361904984\n",
+        "Optimized angles: [[ 0.5865477 ]\n",
+        " [-0.3228858]]\n",
+        "```\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 42,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 453
+        },
+        "id": "h1101um-W7k9",
+        "outputId": "6234e57f-0166-4eb1-a758-27e0f9a1b0b5"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "fig, ax = plt.subplots()\n",
+        "\n",
+        "plt.plot(sdg_losses, color=\"orange\", lw=3, label=\"SGD\")\n",
+        "\n",
+        "plt.plot(lstm_losses, color=\"blue\", lw=3, ls=\"-.\", label=\"LSTM\")\n",
+        "\n",
+        "plt.grid(ls=\"--\", lw=2, alpha=0.25)\n",
+        "plt.legend()\n",
+        "plt.ylabel(\"Cost function\", fontsize=12)\n",
+        "plt.xlabel(\"Iteration\", fontsize=12)\n",
+        "ax.set_xticks([0, 5, 10, 15, 20]);\n",
+        "plt.show()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "5R_GawivW7k9"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_LossConfrontation.png){.align-center\n",
+        "width=\"70.0%\"}\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "r0n7g1ZqW7k9"
+      },
+      "source": [
+        "*Hurray!* 🎉🎉\n",
+        "\n",
+        "As is clear from the picture, the RNN reaches a better minimum in fewer\n",
+        "iterations than the standard SGD. Thus, as the authors suggest, the\n",
+        "trained RNN can be used for a few iterations at the start of the\n",
+        "training procedure to initialize the parameters of the quantum circuit\n",
+        "close to an optimal solution. Then, a standard optimizer like the SGD\n",
+        "can be used to fine-tune the proposed parameters and reach even better\n",
+        "solutions. While on this small-scale example the benefits of using an\n",
+        "LSTM to initialize parameters may seem modest, on more complicated\n",
+        "instances and problems it can make a big difference, since, with random\n",
+        "initialization of the parameters, standard local optimizers may have\n",
+        "trouble finding a good minimization direction (for further details,\n",
+        "see the References section).\n"
+      ]
+    },
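+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "The warm-start strategy described above can be illustrated with a small,\n",
+        "self-contained sketch. The cell below does not use the QAOA circuit:\n",
+        "`toy_cost` is a hypothetical quadratic stand-in for the cost function,\n",
+        "`target` is an arbitrary minimum chosen for illustration, and `warm_start`\n",
+        "merely plays the role of the last RNN guess. Under these assumptions, a few\n",
+        "plain gradient-descent steps from the warm start should end closer to the\n",
+        "minimum than the same number of steps from a cold start.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import tensorflow as tf\n",
+        "\n",
+        "# Hypothetical stand-in for the QAOA cost: a quadratic bowl with its minimum at `target`\n",
+        "target = tf.constant([[0.6], [-0.3]], dtype=tf.float64)\n",
+        "\n",
+        "\n",
+        "def toy_cost(params):\n",
+        "    return tf.reduce_sum((params - target) ** 2)\n",
+        "\n",
+        "\n",
+        "def fine_tune(start, n_steps=5, learning_rate=0.1):\n",
+        "    # Run a few plain gradient-descent steps from `start` and return the final cost\n",
+        "    params = tf.Variable(start, dtype=tf.float64)\n",
+        "    for _ in range(n_steps):\n",
+        "        with tf.GradientTape() as tape:\n",
+        "            cost_value = toy_cost(params)\n",
+        "        grad = tape.gradient(cost_value, params)\n",
+        "        params.assign_sub(learning_rate * grad)\n",
+        "    return float(toy_cost(params))\n",
+        "\n",
+        "\n",
+        "cold_start = [[0.0], [0.0]]  # no prior information\n",
+        "warm_start = [[0.5], [-0.25]]  # plays the role of the last RNN guess\n",
+        "\n",
+        "print('Final cost, cold start:', fine_tune(cold_start))\n",
+        "print('Final cost, warm start:', fine_tune(warm_start))"
+      ]
+    },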
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "P8A3iSfEW7k9"
+      },
+      "source": [
+        "Final remarks\n",
+        "=============\n",
+        "\n",
+        "In this demo, we saw how to use a recurrent neural network as a\n",
+        "black-box optimizer to initialize the parameters in a variational\n",
+        "quantum circuit close to an optimal solution. We connected MaxCut QAOA\n",
+        "quantum circuits in PennyLane with an LSTM built with TensorFlow, and we\n",
+        "used a custom hybrid training routine to optimize the whole network.\n",
+        "\n",
+        "This architecture proved to be a good candidate for the\n",
+        "initialization problem of Variational Quantum Algorithms, since it\n",
+        "reaches good solutions in very few iterations. Besides, the\n",
+        "architecture is quite general since the same machinery can be used for\n",
+        "graphs having a generic number of nodes (see \\\"Generalization\n",
+        "performances\\\" in the Appendix).\n",
+        "\n",
+        "**What's next?**\n",
+        "\n",
+        "But the story does not end here. There are multiple ways this work could\n",
+        "be improved. Here are a few:\n",
+        "\n",
+        "-   Use the proposed architecture for VQAs other than QAOA for MaxCut.\n",
+        "    You can check the paper to get some inspiration.\n",
+        "-   Scale up the simulation, using bigger graphs and longer recurrent\n",
+        "    loops.\n",
+        "-   While it works correctly, the training routine is quite basic and\n",
+        "    could be improved, for example, by implementing batch learning or a\n",
+        "    stopping criterion. Also, one could implement the *observed\n",
+        "    improvement* loss function, as used in the original paper.\n",
+        "-   Depending on the problem, you may wish to transform the functions\n",
+        "    `rnn_iteration` and `recurrent_loop` to actual `Keras Layers` and\n",
+        "    `Models`. This way, by compiling the model before the training takes\n",
+        "    place, `TensorFlow` can create the computational graph of the model\n",
+        "    and train more efficiently. You can find some ideas below to start\n",
+        "    working on it.\n",
+        "\n",
+        "If you're interested, the Appendix below contains more\n",
+        "details and insights about this model. Go check it out!\n",
+        "\n",
+        "If you have any doubts or wish to discuss the project, don't\n",
+        "hesitate to contact me. I'll be very happy to help you as much as I can\n",
+        "😁\n",
+        "\n",
+        "Have a great quantum day!\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "YL7cEoMeW7k-"
+      },
+      "source": [
+        "References\n",
+        "==========\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "HCTB3zQaW7k-"
+      },
+      "source": [
+        "Appendix\n",
+        "========\n",
+        "\n",
+        "In this appendix you can find further details about the Learning to\n",
+        "Learn approach introduced in this tutorial.\n",
+        "\n",
+        "Generalization performances\n",
+        "---------------------------\n",
+        "\n",
+        "A very interesting feature of this model is that it can be\n",
+        "straightforwardly applied to graphs having a different number of nodes.\n",
+        "In fact, until now our analysis focused only on graphs with the same\n",
+        "number of nodes for ease of explanation, but there is no actual\n",
+        "restriction in this respect. The same machinery works fine for any\n",
+        "graph, since the number of QAOA parameters depends only on the\n",
+        "number of layers in the ansatz, and not on the number of qubits (equal\n",
+        "to the number of nodes in the graph) in the quantum circuit.\n",
+        "\n",
+        "Thus, we might want to challenge our model to learn a good\n",
+        "initialization heuristic for a non-specific graph, with an arbitrary\n",
+        "number of nodes. For this purpose, let's create a training dataset\n",
+        "containing graphs with a different number of nodes $n$, taken in the\n",
+        "interval $n \\in [7,9]$ (that is, our dataset now contains graphs having\n",
+        "7, 8, or 9 nodes).\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 43,
+      "metadata": {
+        "id": "_BI5C3FoW7k-"
+      },
+      "outputs": [],
+      "source": [
+        "cell = tf.keras.layers.LSTMCell(2 * n_layers)\n",
+        "\n",
+        "g7 = generate_graphs(5, 7, 3 / 7)\n",
+        "g8 = generate_graphs(5, 8, 3 / 7)\n",
+        "g9 = generate_graphs(5, 9, 3 / 7)\n",
+        "\n",
+        "gs = g7 + g8 + g9\n",
+        "gs_cost_list = [qaoa_from_graph(g) for g in gs]\n",
+        "\n",
+        "# Shuffle the dataset\n",
+        "import random\n",
+        "random.seed(1234)\n",
+        "random.shuffle(gs_cost_list)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "y5y3LsBcW7k-"
+      },
+      "source": [
+        "So far, we have created an equally balanced dataset that contains graphs\n",
+        "with a different number of nodes. We now use this dataset to train the\n",
+        "LSTM.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 44,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "vy_D1kNyW7k-",
+        "outputId": "f18c8edb-f4db-497c-c9f0-a7cb50ab48a0"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Epoch 1\n",
+            " > Graph 1/15 - Loss: [[-1.4876363]]\n",
+            " > Graph 6/15 - Loss: [[-1.5198656]]\n",
+            " > Graph 11/15 - Loss: [[-1.6226677]]\n",
+            " >> Mean Loss during epoch: -1.7711441119511921\n",
+            "Epoch 2\n",
+            " > Graph 1/15 - Loss: [[-1.8782666]]\n",
+            " > Graph 6/15 - Loss: [[-1.9808252]]\n",
+            " > Graph 11/15 - Loss: [[-1.8801485]]\n",
+            " >> Mean Loss during epoch: -2.116814629236857\n",
+            "Epoch 3\n",
+            " > Graph 1/15 - Loss: [[-1.8955324]]\n",
+            " > Graph 6/15 - Loss: [[-1.9727923]]\n",
+            " > Graph 11/15 - Loss: [[-1.8892248]]\n",
+            " >> Mean Loss during epoch: -2.1195581912994386\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Select an optimizer\n",
+        "opt = tf.keras.optimizers.Adam(learning_rate=0.2)\n",
+        "\n",
+        "# Set the number of training epochs\n",
+        "epochs = 3\n",
+        "\n",
+        "for epoch in range(epochs):\n",
+        "    print(f\"Epoch {epoch+1}\")\n",
+        "    total_loss = np.array([])\n",
+        "    for i, graph_cost in enumerate(gs_cost_list):\n",
+        "        loss = train_step(graph_cost)\n",
+        "        total_loss = np.append(total_loss, loss.numpy())\n",
+        "        # Log every 5 batches.\n",
+        "        if i % 5 == 0:\n",
+        "            print(f\" > Graph {i+1}/{len(gs_cost_list)} - Loss: {loss}\")\n",
+        "    print(f\" >> Mean Loss during epoch: {np.mean(total_loss)}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "wMFKWM26W7k-"
+      },
+      "source": [
+        "::: {.rst-class}\n",
+        "sphx-glr-script-out\n",
+        "\n",
+        "Out:\n",
+        "\n",
+        "``` {.none}\n",
+        "Epoch 1\n",
+        "> Graph 1/15 - Loss: [[-1.4876363]]\n",
+        "> Graph 6/15 - Loss: [[-1.8590403]]\n",
+        "> Graph 11/15 - Loss: [[-1.7644017]]\n",
+        ">> Mean Loss during epoch: -1.9704322338104248\n",
+        "Epoch 2\n",
+        "> Graph 1/15 - Loss: [[-1.8650053]]\n",
+        "> Graph 6/15 - Loss: [[-1.9578737]]\n",
+        "> Graph 11/15 - Loss: [[-1.8377447]]\n",
+        ">> Mean Loss during epoch: -2.092947308222453\n",
+        "Epoch 3\n",
+        "> Graph 1/15 - Loss: [[-1.9009062]]\n",
+        "> Graph 6/15 - Loss: [[-1.9726204]]\n",
+        "> Graph 11/15 - Loss: [[-1.8668792]]\n",
+        ">> Mean Loss during epoch: -2.1162660201390584\n",
+        "```\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "F0FTu2fLW7k-"
+      },
+      "source": [
+        "Let's check if this hybrid model eventually learned a good heuristic to\n",
+        "propose new updates for the parameters in the QAOA ansatz of the MaxCut\n",
+        "problem.\n",
+        "\n",
+        "To test this, we consider a new graph. In particular, we take a\n",
+        "graph with 10 nodes, a size that the recurrent network has\n",
+        "not seen before.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 45,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 516
+        },
+        "id": "yeKySIaCW7k-",
+        "outputId": "3ee22c49-6793-45ae-a5aa-f07bd94977e2"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "new_graph = nx.gnp_random_graph(10, p=3 / 7)\n",
+        "new_cost = qaoa_from_graph(new_graph)\n",
+        "\n",
+        "nx.draw(new_graph)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "S9T88hSRW7k-"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_Graph10.png){.align-center\n",
+        "width=\"70.0%\"}\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "wWLRpXx6W7k-"
+      },
+      "source": [
+        "We call the trained recurrent LSTM on this graph, saving not only the\n",
+        "last, but all intermediate guesses for the parameters.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 46,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 453
+        },
+        "id": "3IAC8P0ZW7k-",
+        "outputId": "eb9622c0-c532-4d8f-9c23-5c9ddb4f80cd"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 1 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "res = recurrent_loop(new_cost, intermediate_steps=True)\n",
+        "\n",
+        "# Extract all angle suggestions\n",
+        "start_zeros = tf.zeros(shape=(2 * n_layers, 1))\n",
+        "guess_0 = res[0]\n",
+        "guess_1 = res[1]\n",
+        "guess_2 = res[2]\n",
+        "guess_3 = res[3]\n",
+        "guess_4 = res[4]\n",
+        "final_loss = res[5]\n",
+        "\n",
+        "# Wrap them into a list\n",
+        "guesses = [start_zeros, guess_0, guess_1, guess_2, guess_3, guess_4]\n",
+        "\n",
+        "# Losses from the hybrid LSTM model\n",
+        "lstm_losses = [new_cost(tf.reshape(guess, shape=(2, n_layers))) for guess in guesses]\n",
+        "\n",
+        "fig, ax = plt.subplots()\n",
+        "\n",
+        "plt.plot(lstm_losses, color=\"blue\", lw=3, ls=\"-.\", label=\"LSTM\")\n",
+        "\n",
+        "plt.grid(ls=\"--\", lw=2, alpha=0.25)\n",
+        "plt.legend()\n",
+        "plt.ylabel(\"Cost function\", fontsize=12)\n",
+        "plt.xlabel(\"Iteration\", fontsize=12)\n",
+        "ax.set_xticks([0, 5, 10, 15, 20]);\n",
+        "plt.show()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_91cxwU4W7k-"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_LossGeneralization.png){.align-center\n",
+        "width=\"70.0%\"}\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "3gJmVDHnW7k-"
+      },
+      "source": [
+        "Again, we can confirm that the custom optimizer based on the LSTM\n",
+        "quickly reaches a good value of the loss function, and also achieves good\n",
+        "generalization performance, since it is able to initialize parameters\n",
+        "even for graphs not present in the training set.\n",
+        "\n",
+        "::: {.note}\n",
+        "::: {.title}\n",
+        "Note\n",
+        ":::\n",
+        "\n",
+        "To get the optimized weights of the LSTM use:\n",
+        "`optimized_weights = cell.get_weights()`. To set initial weights for the\n",
+        "LSTM cell, use instead: `cell.set_weights(optimized_weights)`.\n",
+        ":::\n",
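+        "\n",
+        "As a minimal sketch of this save-and-restore step, the next cell uses two\n",
+        "freshly created LSTM cells instead of the trained one. The helper\n",
+        "`build_cell` and the names `source_cell`, `restored_cell`, and `n_params`\n",
+        "are assumptions introduced only for this illustration. Each cell is called\n",
+        "once on dummy tensors so that its weights exist before `get_weights` or\n",
+        "`set_weights` is used.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import numpy as np\n",
+        "import tensorflow as tf\n",
+        "\n",
+        "n_params = 2  # assumed: 2 * n_layers for a single-layer QAOA ansatz\n",
+        "\n",
+        "\n",
+        "def build_cell():\n",
+        "    # Create an LSTM cell and call it once so that its weights are created\n",
+        "    new_cell = tf.keras.layers.LSTMCell(n_params)\n",
+        "    dummy_input = tf.zeros(shape=(1, n_params + 1))  # previous cost + previous parameters\n",
+        "    dummy_states = [tf.zeros(shape=(1, n_params)), tf.zeros(shape=(1, n_params))]\n",
+        "    new_cell(dummy_input, states=dummy_states)\n",
+        "    return new_cell\n",
+        "\n",
+        "\n",
+        "source_cell = build_cell()  # stands in for the trained cell\n",
+        "restored_cell = build_cell()  # fresh cell with its own random weights\n",
+        "\n",
+        "# Save the weights of one cell and load them into the other\n",
+        "saved_weights = source_cell.get_weights()\n",
+        "restored_cell.set_weights(saved_weights)\n",
+        "\n",
+        "same = all(\n",
+        "    np.allclose(a, b) for a, b in zip(saved_weights, restored_cell.get_weights())\n",
+        ")\n",
+        "print('Weights restored correctly:', same)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [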
+        "\n",
+        "Loss landscape in parameter space\n",
+        "=================================\n",
+        "\n",
+        "It may be interesting to plot the path suggested by the RNN in the space\n",
+        "of the parameters. Note that this is possible only if one layer is used\n",
+        "in the QAOA ansatz since in this case only two angles are needed and\n",
+        "they can be plotted on a 2D plane. Of course, if more layers are used,\n",
+        "you can always select a pair of them to reproduce a similar plot.\n",
+        "\n",
+        "::: {.note}\n",
+        "::: {.title}\n",
+        "Note\n",
+        ":::\n",
+        "\n",
+        "This cell takes approximately 1 minute to run with an 11 by 11 grid.\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 47,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 476
+        },
+        "id": "c26_87lAW7k-",
+        "outputId": "5da41d92-28e4-4986-f1ec-885991b07c50"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<Figure size 640x480 with 2 Axes>"
+            ],
+            "image/png": "\n"
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "# Evaluate the cost function on a grid in parameter space\n",
+        "dx = dy = np.linspace(-1.0, 1.0, 11)\n",
+        "dz = np.array([new_cost([[xx], [yy]]).numpy() for yy in dy for xx in dx])\n",
+        "Z = dz.reshape((11, 11))\n",
+        "\n",
+        "# Plot cost landscape\n",
+        "plt.contourf(dx, dy, Z)\n",
+        "plt.colorbar()\n",
+        "\n",
+        "# Extract optimizer steps\n",
+        "params_x = [0.0] + [res[i].numpy()[0, 0] for i in range(len(res[:-1]))]\n",
+        "params_y = [0.0] + [res[i].numpy()[0, 1] for i in range(len(res[:-1]))]\n",
+        "\n",
+        "# Plot steps\n",
+        "plt.plot(params_x, params_y, linestyle=\"--\", color=\"red\", marker=\"x\")\n",
+        "\n",
+        "plt.yticks(np.linspace(-1, 1, 5))\n",
+        "plt.xticks(np.linspace(-1, 1, 5))\n",
+        "plt.xlabel(r\"$\\alpha$\", fontsize=12)\n",
+        "plt.ylabel(r\"$\\gamma$\", fontsize=12)\n",
+        "plt.title(\"Loss Landscape\", fontsize=12)\n",
+        "plt.show()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "pfz2iKc4W7k-"
+      },
+      "source": [
+        "![](../demonstrations/learning2learn/rendered_LossLandscape.png){.align-center\n",
+        "width=\"70.0%\"}\n",
+        "\n",
+        "Ideas for creating a Keras Layer and Keras Model\n",
+        "================================================\n",
+        "\n",
+        "Definition of a `Keras Layer` containing a single pass through the LSTM\n",
+        "and the Quantum Circuit. That's equivalent to the function\n",
+        "`rnn_iteration` from before.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 48,
+      "metadata": {
+        "id": "zpCBWKxfW7k-"
+      },
+      "outputs": [],
+      "source": [
+        "class QRNN(tf.keras.layers.Layer):\n",
+        "    def __init__(self, p=1, graph=None):\n",
+        "        super(QRNN, self).__init__()\n",
+        "        # p is the number of layers in the QAOA ansatz\n",
+        "        self.cell = tf.keras.layers.LSTMCell(2 * p)\n",
+        "        self.expectation = qaoa_from_graph(graph, n_layers=p)\n",
+        "        self.qaoa_p = p\n",
+        "\n",
+        "    def call(self, inputs):\n",
+        "        prev_cost = inputs[0]\n",
+        "        prev_params = inputs[1]\n",
+        "        prev_h = inputs[2]\n",
+        "        prev_c = inputs[3]\n",
+        "\n",
+        "        # Concatenate the previous parameters and previous cost to create new input\n",
+        "        new_input = tf.keras.layers.concatenate([prev_cost, prev_params])\n",
+        "\n",
+        "        # New parameters obtained by the LSTM cell, along with new internal states h and c\n",
+        "        new_params, [new_h, new_c] = self.cell(new_input, states=[prev_h, prev_c])\n",
+        "\n",
+        "        # This part is used to feed the parameters to the PennyLane function\n",
+        "        _params = tf.reshape(new_params, shape=(2, self.qaoa_p))\n",
+        "\n",
+        "        # Cost evaluation, and reshaping to be consistent with other Keras tensors\n",
+        "        new_cost = tf.reshape(tf.cast(self.expectation(_params), dtype=tf.float32), shape=(1, 1))\n",
+        "\n",
+        "        return [new_cost, new_params, new_h, new_c]"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "wmjxo5wkW7k_"
+      },
+      "source": [
+        "Code for creating an actual `Keras Model` starting from the previous\n",
+        "layer definition.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 49,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "fGqHsQ33W7k_",
+        "outputId": "357100ac-d664-4df7-a3e3-1fbffcdc79dc"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Model: \"model_1\"\n",
+            "__________________________________________________________________________________________________\n",
+            " Layer (type)                   Output Shape         Param #     Connected to                     \n",
+            "==================================================================================================\n",
+            " input_5 (InputLayer)           [(None, 1)]          0           []                               \n",
+            "                                                                                                  \n",
+            " input_6 (InputLayer)           [(None, 2)]          0           []                               \n",
+            "                                                                                                  \n",
+            " input_7 (InputLayer)           [(None, 2)]          0           []                               \n",
+            "                                                                                                  \n",
+            " input_8 (InputLayer)           [(None, 2)]          0           []                               \n",
+            "                                                                                                  \n",
+            " qrnn_1 (QRNN)                  [(1, 1),             48          ['input_5[0][0]',                \n",
+            "                                 (None, 2),                       'input_6[0][0]',                \n",
+            "                                 (None, 2),                       'input_7[0][0]',                \n",
+            "                                 (None, 2)]                       'input_8[0][0]',                \n",
+            "                                                                  'qrnn_1[0][0]',                 \n",
+            "                                                                  'qrnn_1[0][1]',                 \n",
+            "                                                                  'qrnn_1[0][2]',                 \n",
+            "                                                                  'qrnn_1[0][3]',                 \n",
+            "                                                                  'qrnn_1[1][0]',                 \n",
+            "                                                                  'qrnn_1[1][1]',                 \n",
+            "                                                                  'qrnn_1[1][2]',                 \n",
+            "                                                                  'qrnn_1[1][3]']                 \n",
+            "                                                                                                  \n",
+            " tf.math.multiply_3 (TFOpLambda  (1, 1)              0           ['qrnn_1[0][0]']                 \n",
+            " )                                                                                                \n",
+            "                                                                                                  \n",
+            " tf.math.multiply_4 (TFOpLambda  (1, 1)              0           ['qrnn_1[1][0]']                 \n",
+            " )                                                                                                \n",
+            "                                                                                                  \n",
+            " tf.math.multiply_5 (TFOpLambda  (1, 1)              0           ['qrnn_1[2][0]']                 \n",
+            " )                                                                                                \n",
+            "                                                                                                  \n",
+            " average_295 (Average)          (1, 1)               0           ['tf.math.multiply_3[0][0]',     \n",
+            "                                                                  'tf.math.multiply_4[0][0]',     \n",
+            "                                                                  'tf.math.multiply_5[0][0]']     \n",
+            "                                                                                                  \n",
+            "==================================================================================================\n",
+            "Total params: 48\n",
+            "Trainable params: 48\n",
+            "Non-trainable params: 0\n",
+            "__________________________________________________________________________________________________\n"
+          ]
+        }
+      ],
+      "source": [
+        "_graph = nx.gnp_random_graph(7, p=3 / 7)\n",
+        "\n",
+        "# Instantiate the QRNN layer (which wraps the LSTM cell) for this graph\n",
+        "rnn0 = QRNN(graph=_graph)\n",
+        "\n",
+        "# Create some input layers to feed the data\n",
+        "inp_cost = tf.keras.layers.Input(shape=(1,))\n",
+        "inp_params = tf.keras.layers.Input(shape=(2,))\n",
+        "inp_h = tf.keras.layers.Input(shape=(2,))\n",
+        "inp_c = tf.keras.layers.Input(shape=(2,))\n",
+        "\n",
+        "# Manually creating the recurrent loops. In this case just three iterations are used.\n",
+        "out0 = rnn0([inp_cost, inp_params, inp_h, inp_c])\n",
+        "out1 = rnn0(out0)\n",
+        "out2 = rnn0(out1)\n",
+        "\n",
+        "# Loss driving the LSTM training: an average of the three costs, weighted so that later iterations count more\n",
+        "loss = tf.keras.layers.average([0.15 * out0[0], 0.35 * out1[0], 0.5 * out2[0]])\n",
+        "\n",
+        "# Definition of a Keras Model\n",
+        "model = tf.keras.Model(\n",
+        "    inputs=[inp_cost, inp_params, inp_h, inp_c], outputs=[out0[1], out1[1], out2[1], loss]\n",
+        ")\n",
+        "\n",
+        "model.summary()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "luGzuKmdW7k_"
+      },
+      "source": [
+        "A basic training routine for the `Keras Model` just created:\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 50,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "yEVE26ZhW7k_",
+        "outputId": "c945f4e6-5fa5-47bd-dddb-48f7111bfcd9"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Step 1 - Loss = [[-1.5563084]] - Cost = -4.762684301954698\n",
+            "Step 2 - Loss = [[-1.5649065]] - Cost = -4.799981173473759\n",
+            "Step 3 - Loss = [[-1.5741502]] - Cost = -4.840036354736864\n",
+            "Step 4 - Loss = [[-1.5841404]] - Cost = -4.883246618039306\n",
+            "Step 5 - Loss = [[-1.5948243]] - Cost = -4.929228955313207\n",
+            "Final Loss: [[-1.5948243]]\n",
+            "Final Outs:\n",
+            " >out0: [[-0.01041588  0.01016874]]\n",
+            " >out1: [[-0.04530389  0.38148248]]\n",
+            " >out2: [[-0.10258181  0.41341168]]\n",
+            " >Loss: [[-1.5948243]]\n"
+          ]
+        }
+      ],
+      "source": [
+        "p = 1\n",
+        "\n",
+        "inp_costA = tf.zeros(shape=(1, 1))\n",
+        "inp_paramsA = tf.zeros(shape=(1, 2 * p))\n",
+        "inp_hA = tf.zeros(shape=(1, 2 * p))\n",
+        "inp_cA = tf.zeros(shape=(1, 2 * p))\n",
+        "\n",
+        "inputs = [inp_costA, inp_paramsA, inp_hA, inp_cA]\n",
+        "\n",
+        "opt = tf.keras.optimizers.Adam(learning_rate=0.01)\n",
+        "step = 5\n",
+        "\n",
+        "for i in range(step):\n",
+        "    # Record the forward pass so the loss can be differentiated\n",
+        "    with tf.GradientTape() as tape:\n",
+        "        pred = model(inputs)\n",
+        "        loss = pred[3]\n",
+        "\n",
+        "    gradients = tape.gradient(loss, model.trainable_variables)\n",
+        "    opt.apply_gradients(zip(gradients, model.trainable_variables))\n",
+        "    # pred[2] holds the parameters proposed at the last iteration\n",
+        "    print(\n",
+        "        f\"Step {i+1} - Loss = {loss} - Cost = {qaoa_from_graph(_graph, n_layers=p)(np.reshape(pred[2].numpy(),(2, p)))}\"\n",
+        "    )\n",
+        "\n",
+        "print(\"Final Loss:\", loss.numpy())\n",
+        "print(\"Final Outs:\")\n",
+        "for t, s in zip(pred, [\"out0\", \"out1\", \"out2\", \"Loss\"]):\n",
+        "    print(f\" >{s}: {t.numpy()}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "dQcZkqAcW7k_"
+      },
+      "source": [
+        "::: {.note}\n",
+        "::: {.title}\n",
+        "Note\n",
+        ":::\n",
+        "\n",
+        "This code works for only one graph at a time, because the `QRNN`\n",
+        "`Keras Layer` named `rnn0` is built from a specific graph. To train the\n",
+        "RNN on multiple graphs, the training routine above must therefore be\n",
+        "modified. Alternatively, the model could be defined to accept a whole\n",
+        "dataset of graphs as input rather than a single one. This is likely to\n",
+        "be hard, however, since TensorFlow operates on tensors and cannot\n",
+        "directly handle other data structures such as graphs, or functions that\n",
+        "take graphs as input, such as `qaoa_from_graph`.\n",
+        ":::\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "seconds = time.time()\n",
+        "print(\"Time in seconds since end of run:\", seconds)\n",
+        "local_time = time.ctime(seconds)\n",
+        "print(local_time)"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 0
+        },
+        "id": "q47oDQ0JXZSS",
+        "outputId": "fd61e113-a0ba-4b8b-fc0f-b710a8ebdd74"
+      },
+      "execution_count": 51,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Time in seconds since end of run: 1701021807.473456\n",
+            "Sun Nov 26 18:03:27 2023\n"
+          ]
+        }
+      ]
+    }
+  ],
+  "metadata": {
+    "kernelspec": {
+      "display_name": "Python 3",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.9.17"
+    },
+    "colab": {
+      "provenance": []
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
\ No newline at end of file