In this tutorial, we will build a quantum machine learning algorithm that classifies handwritten digits from the MNIST dataset (deciding whether a digit is a 0 or a 1). We will use several dimensionality-reduction techniques, perform classical pre-processing, and initialize our own quantum feature maps.
The following diagram gives a brief overview of the Variational Quantum Classifier protocol.
First, we will install the dependencies required for this task.
First, download the training dataset from http://yann.lecun.com/exdb/mnist/ and save it in .csv format; we will name this file mnist_train.csv and place it in the dataset folder. Download the testing dataset as well and save it as mnist_test.csv. The Python file must be in the same directory as the training and testing datasets for the code to run. We initialize the image size to 28, since each image is 28 x 28 pixels.
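As a sketch of what loading looks like, the snippet below assumes each row of mnist_train.csv holds a label followed by 784 pixel values; since the real download is not bundled here, it fabricates two such rows before reading them back.

```python
import numpy as np
import pandas as pd

# Fabricate two rows in the assumed "label, pixel0, ..., pixel783" format
# so this snippet runs without the real MNIST download.
rng = np.random.default_rng(0)
fake_rows = np.hstack([np.array([[0], [1]]),             # digit labels
                       rng.integers(0, 256, (2, 784))])  # 28*28 pixel values
pd.DataFrame(fake_rows).to_csv("mnist_train.csv", header=False, index=False)

image_size = 28  # each image is 28 x 28 pixels
train_data = pd.read_csv("mnist_train.csv", header=None).values
labels = train_data[:, 0]                                # first column is the label
images = train_data[:, 1:].reshape(-1, image_size, image_size)
print(images.shape)  # (2, 28, 28)
```

The same pattern, with the real file, yields a (60000, 28, 28) array for the training set.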
The testing dataset has the same form, except that it contains 10000 data points. To get a look at a digit image inside the dataset, we will run the following code.
The dimension of the data determines the number of qubits required to encode it in the quantum feature maps we will initialize later. Since today's quantum computers can manipulate only on the order of 50 qubits, we cannot work with a number of qubits as large as 784, so encoding data of dimension 784 is not viable.
Therefore, we will use truncated Singular Value Decomposition (SVD) and t-distributed stochastic neighbor embedding (t-SNE) to reduce the dimension first to 10 and then to 2. If you're interested in learning about dimensionality reduction for the MNIST dataset, you can read Colah's blog.
We will first truncate the dataset to 10000 data points so that it becomes easier to apply the TruncatedSVD and TSNE techniques.
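The two-stage reduction can be sketched as follows; a small random matrix stands in here for the 10000 x 784 pixel matrix, so only the shapes and the TruncatedSVD/TSNE calls carry over to the real data.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import TSNE

# Random stand-in for the truncated 10000 x 784 pixel matrix
# (kept small so the snippet runs quickly).
rng = np.random.default_rng(0)
X = rng.random((200, 784))

# Stage 1: linear reduction 784 -> 10 with truncated SVD.
X_svd = TruncatedSVD(n_components=10, random_state=42).fit_transform(X)

# Stage 2: nonlinear reduction 10 -> 2 with t-SNE.
X_tsne = TSNE(n_components=2, random_state=42, init="random",
              perplexity=30).fit_transform(X_svd)
print(X_tsne.shape)  # (200, 2)
```

Running t-SNE on the SVD output rather than the raw pixels is what keeps the second stage tractable.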
We will plot the reduced dataset to see whether the digit classes form clusters. We will create a pandas dataframe and use seaborn to plot the data.
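A minimal sketch of that plot, assuming `X_tsne` is the (n, 2) t-SNE output and `labels` the matching digit labels (random stand-ins are used below so the snippet is self-contained):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the snippet runs anywhere
import matplotlib.pyplot as plt
import seaborn as sns

# Stand-ins for the t-SNE coordinates and digit labels computed earlier.
rng = np.random.default_rng(0)
X_tsne = rng.random((100, 2))
labels = rng.integers(0, 10, 100)

df = pd.DataFrame({"tsne-1": X_tsne[:, 0],
                   "tsne-2": X_tsne[:, 1],
                   "digit": labels})
sns.scatterplot(data=df, x="tsne-1", y="tsne-2", hue="digit",
                palette="tab10", legend="full")
plt.savefig("tsne_digits.png")
```

With the real data, each colour corresponds to one digit class, which makes the clustering visible at a glance.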
0 and 1 are well separated in opposite corners, as they are easily distinguishable; however, 4 and 9 (the purple and blue data points) overlap.
We will extract the data points corresponding to the digits 0 and 1 from the reduced dataset and normalize their features to lie between 0 and 2.
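The extraction and normalization can be sketched like this, with random stand-ins for the t-SNE output; `MinMaxScaler` is one straightforward way to map each feature into the [0, 2] range stated above.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Stand-ins for the 2-D t-SNE coordinates and their digit labels.
rng = np.random.default_rng(0)
X_tsne = rng.normal(size=(100, 2))
labels = rng.integers(0, 10, 100)

# Keep only the rows labelled 0 or 1.
mask = (labels == 0) | (labels == 1)
X01, y01 = X_tsne[mask], labels[mask]

# Rescale each feature into [0, 2] before feeding it to the feature map.
X01 = MinMaxScaler(feature_range=(0, 2)).fit_transform(X01)
print(X01.min(), X01.max())
```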
We need to normalize the data because the values will be inserted into a quantum feature map.
We will follow the Variational Quantum Classifier (VQC) method proposed in the paper by Havlíček et al. to classify the digits using concepts from quantum mechanics.
Similar to classical supervised machine learning algorithms, the VQC has a training stage (where data points with labels are provided and learning takes place) and a testing stage (where new data points without labels are provided which are then classified).
The main steps of this algorithm are:
1. Load the data onto the quantum computer by applying a quantum feature map Φ(x).
2. Build and apply a short-depth variational circuit W(θ).
The Quantum Feature Map of depth d is implemented by the following circuit.
We will first learn how to configure the built-in quantum feature maps in Qiskit Terra (Circuit Library), and then how to build a custom feature map. A feature map is a parameterized circuit whose parameters are set by the input data point.
Some of the feature maps include:
Let us first import the required libraries. We will test out the PauliFeatureMap first with
paulis=['Z', 'Y', 'ZZ'].
These are the feature maps available in Qiskit. However, they may not perform well on every dataset. For a particular dataset, finding a quantum feature map that spreads the data points in Hilbert space in such a way that a separating hyperplane can be drawn between the classes is important for achieving higher model accuracy (this is the basic idea behind support vector machines).
We also want the corresponding quantum feature map circuit to be shallow (i.e., to have a small circuit depth), as this reduces quantum decoherence and leads to higher accuracy. If you are interested in learning about errors, decoherence, and error mitigation techniques, you can look at this section of the Qiskit textbook.
Generally, we want to construct custom feature maps for increasing the accuracy of classification.
In this step we will append a variational circuit to the feature map. The parameters of this variational circuit are trained using classical optimizers until it classifies the data points correctly. This is the training stage of the algorithm, and the accuracy of the model depends on the variational circuit one chooses.
Constructing with RealAmplitudes:
Let us create a variational circuit using the built-in RealAmplitudes method. Check out the documentation page to understand how RealAmplitudes works.
We can also use the EfficientSU2 method to create the variational circuit.
The following steps need to be done in order to create a custom feature map:
"How" a custom feature map is created is still not clear and in research.
Let's apply the VQC method from Qiskit Aqua to classify the digits 0 and 1. We will take a very small subset of 20 training data points and 10 testing data points, and also keep 5 points per label as a validation set. We will first define the training and testing inputs based on the dataset we initialized earlier.
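Aqua's VQC expects the inputs as a dict mapping each class label to an array of its sample points; the sketch below builds such dicts, with deterministic stand-in data in place of the normalised 0/1 points prepared earlier.

```python
import numpy as np

# Stand-ins for the normalised two-class data prepared earlier.
rng = np.random.default_rng(0)
X01 = rng.random((40, 2)) * 2          # features in [0, 2]
y01 = np.array([0, 1] * 20)            # alternating labels, 20 of each

def split_by_label(X, y, n_per_label):
    """Group the first n_per_label points of each class into a dict."""
    return {str(label): X[y == label][:n_per_label] for label in (0, 1)}

training_input = split_by_label(X01[:20], y01[:20], 10)  # 20 training points
test_input = split_by_label(X01[20:], y01[20:], 5)       # 10 testing points
print({k: v.shape for k, v in training_input.items()})
```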
We have not used a custom feature map in this implementation; however, the custom feature map we defined earlier will still work, and you can try different parameter combinations to see which feature map gives the highest accuracy.
A classical optimization routine updates the parameters of our variational circuit and repeats the whole process. This is the classical loop that trains the parameters until the cost function is minimized. You can look at the code and the optimization step in the Qiskit Textbook to understand the classical optimization process.
VQC Aqua provides a few classical optimizer methods:
We will use the COBYLA optimizer method.
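Conceptually, the classical loop works like the sketch below, which uses SciPy's general-purpose COBYLA in place of Aqua's wrapper: the optimizer proposes new parameters theta, the cost is evaluated, and the loop repeats until the cost stops decreasing. The quadratic cost here is a stand-in for the real classification loss returned by the quantum circuit.

```python
import numpy as np
from scipy.optimize import minimize

def cost(theta):
    # Stand-in for the classification loss the quantum circuit would return;
    # its minimum sits at theta = [0.5, 1.5].
    return np.sum((theta - np.array([0.5, 1.5])) ** 2)

# COBYLA is gradient-free, which suits noisy quantum cost evaluations.
result = minimize(cost, x0=np.zeros(2), method='COBYLA',
                  options={'maxiter': 500})
print(result.x)
```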
This is what our final circuit will look like.
If you run this, the testing accuracy output is 1.0, meaning our variational quantum classifier is 100% accurate on this small test set. We can check this by printing the actual and predicted values: wherever the actual label is 0 we predicted 0, and likewise for 1.