Quantum Finance Application on Portfolio Diversification

Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved.

Overview

Finance problems are mainly tackled by three families of quantum algorithms: quantum simulation, quantum optimization, and quantum machine learning [1,2]. Many financial problems are essentially combinatorial optimization problems, for which the known classical algorithms usually have high time complexity and are difficult to implement. Quantum algorithms are expected to help solve these complex problems in the future.

The Quantum Finance module of Paddle Quantum focuses on quantum optimization, i.e., how to apply quantum algorithms to real finance optimization problems. This tutorial shows how to use quantum algorithms to solve the portfolio diversification problem.

Portfolio Diversification Problem

Investors who lack professional knowledge and market experience often prefer a passive investment strategy. Index investing is a typical example of passive investing, e.g. an investor buys and holds the Standard & Poor’s 500 (S&P 500) for a long period of time. If you do not want to invest in an existing index portfolio, you can also create your own index portfolio by picking representative stocks from the market.

An important way to balance risk and return in an investment portfolio is to diversify your assets. Portfolio diversification can be described as follows: the number of investable stocks is $n$ and the number of stocks included in the portfolio is $K$. Based on some criteria, you need to divide all the stocks into $K$ categories and select from each category the stock that best represents it. Adding a representative of each category to the index portfolio makes the investment easier to manage.

Encoding Portfolio Diversification Problem

To transform the portfolio diversification problem into a problem that parameterized quantum circuits can handle, we need to encode it into a Hamiltonian.

To model the problem, two issues need to be clarified. The first is how to classify different stocks, and the second is what criteria are used to select representative stocks. To address both, we first define the similarity $\rho_{ij}$ between stock $i$ and stock $j$:

  • $\rho_{ii}=1$: a stock is similar to itself with a similarity of 1.
  • $\rho_{ij}\le 1$: the larger $\rho_{ij}$, the higher the similarity between stock $i$ and stock $j$.

Because the returns of stocks are correlated, the similarity between their time series can be measured on the basis of the covariance matrix. Dynamic Time Warping (DTW) is a common method to measure the similarity of two time series. In this tutorial, the DTW algorithm is used to calculate the similarity between two stocks. Based on the similarity between different stocks, we can classify the stocks and select a representative stock in each category. For each stock we define $n$ binary variables $x_{ij}$ and $1$ binary variable $y_j$. Therefore, given $n$ stocks, there are $n^2+n$ binary variables. For the variable $x_{ij}$, $i$ denotes the index of the stock, and $j$ denotes the position among the $n$ binary variables corresponding to that stock. If two stocks have $x_{ij}=1$ for the same index $j$, they are classified in the same category. Meanwhile, the stock with $i=j$ is the most representative one in that category and is selected for the index portfolio:

$$
x_{ij}=\begin{cases}1, & \text{stock } j \text{ is in the portfolio and it has the highest similarity to stock } i\\ 0, & \text{otherwise,}\end{cases}
$$

$$
y_{j}=\begin{cases}1, & \text{stock } j \text{ is selected to the index portfolio}\\ 0, & \text{otherwise.}\end{cases}
$$

The model can be written as follows:

$$M=\max_{x_{ij}}\sum_{i=1}^{n}\sum_{j=1}^{n}\rho_{ij}x_{ij}. \tag{1}$$

The model needs to satisfy the following constraints:

  • Clustering constraint: the index portfolio only includes $K$ stocks
    • $\sum_{j=1}^{n}y_j=K$
  • Integer constraint: a stock is either in the index portfolio or not
    • $x_{ij},y_j\in\{0,1\},\ \forall i=1,\ldots,n;\ j=1,\ldots,n$
  • Consistency constraint: if a stock can represent another stock, it must be in the index portfolio
    • $\sum_{j=1}^{n}x_{ij}=1,\ \forall i=1,\ldots,n$
    • $x_{ij}\le y_j,\ \forall i=1,\ldots,n;\ j=1,\ldots,n$
    • $x_{jj}=y_j,\ \forall j=1,\ldots,n$

The objective of the model is to maximize the similarity between the $n$ stocks and the selected index stock portfolio.
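
For a handful of stocks, the model in Eq. (1) and its constraints can be checked classically by plain enumeration. The following is a minimal sketch, not part of Paddle Quantum: the function name `solve_by_enumeration` and the 4-stock similarity matrix `rho_demo` are hypothetical and chosen only for illustration. Each stock is assigned to exactly one representative, so the row-sum constraint holds by construction, and the remaining constraints are tested explicitly.

```python
from itertools import product


def solve_by_enumeration(rho, K):
    """Return (best value, x, y) for Eq. (1) by trying every assignment."""
    n = len(rho)
    best_value, best_x, best_y = None, None, None
    for y in product((0, 1), repeat=n):
        if sum(y) != K:  # clustering constraint: exactly K selected stocks
            continue
        # assign[i] = j means stock i is represented by stock j,
        # so every row of x contains exactly one 1.
        for assign in product(range(n), repeat=n):
            x = [[1 if assign[i] == j else 0 for j in range(n)] for i in range(n)]
            if any(x[i][j] > y[j] for i in range(n) for j in range(n)):
                continue  # x_ij <= y_j violated
            if any(x[j][j] != y[j] for j in range(n)):
                continue  # x_jj = y_j violated
            value = sum(rho[i][j] * x[i][j] for i in range(n) for j in range(n))
            if best_value is None or value > best_value:
                best_value, best_x, best_y = value, x, y
    return best_value, best_x, best_y


# Hypothetical similarities: stocks 1-3 form one group with stock 2 in the
# middle, while stock 4 is dissimilar to all the others.
rho_demo = [
    [1.0, 0.9, 0.3, 0.1],
    [0.9, 1.0, 0.8, 0.1],
    [0.3, 0.8, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
best_value, best_x, best_y = solve_by_enumeration(rho_demo, K=2)
print(best_value, best_y)
```

With these made-up similarities, the second and fourth stocks are picked as representatives. The search space grows exponentially with $n$, so this check is only practical for very small instances.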

Since the loss function is to be optimized using the gradient descent method, the constrained model is rewritten as an unconstrained cost function in which each constraint becomes a penalty term:

$$C_x=-\sum_{i=1}^{n}\sum_{j=1}^{n}\rho_{ij}x_{ij}+A\left(K-\sum_{j=1}^{n}y_j\right)^2+\sum_{i=1}^{n}A\left(1-\sum_{j=1}^{n}x_{ij}\right)^2+\sum_{j=1}^{n}A\left(x_{jj}-y_j\right)^2+\sum_{i=1}^{n}\sum_{j=1}^{n}A\left(x_{ij}(1-y_j)\right). \tag{2}$$

The first term represents similarity maximization, and the next four terms are penalty terms that enforce the constraints. $A$ is the penalty parameter, which is usually set to a sufficiently large number so that the final binary string representing the index portfolio satisfies the constraints.
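
The effect of the penalty terms can be seen by evaluating the cost function directly. The sketch below is an illustration only: the function name `cost` and the 3-stock matrix `rho_demo` are hypothetical, and the row penalty is written as $A(1-\sum_j x_{ij})^2$ so that it vanishes exactly when $\sum_j x_{ij}=1$.

```python
def cost(rho, x, y, K, A):
    """Evaluate the penalized cost of Eq. (2) for binary x (n x n) and y (n)."""
    n = len(rho)
    similarity = sum(rho[i][j] * x[i][j] for i in range(n) for j in range(n))
    size_penalty = A * (K - sum(y)) ** 2                      # sum_j y_j = K
    row_penalty = sum(A * (1 - sum(x[i])) ** 2 for i in range(n))  # sum_j x_ij = 1
    diag_penalty = sum(A * (x[j][j] - y[j]) ** 2 for j in range(n))  # x_jj = y_j
    member_penalty = sum(                                      # x_ij <= y_j
        A * x[i][j] * (1 - y[j]) for i in range(n) for j in range(n)
    )
    return -similarity + size_penalty + row_penalty + diag_penalty + member_penalty


# Hypothetical similarity matrix for three stocks.
rho_demo = [
    [1.0, 0.2, 0.3],
    [0.2, 1.0, 0.9],
    [0.3, 0.9, 1.0],
]
x_demo = [[1, 0, 0], [0, 0, 1], [0, 0, 1]]

# Feasible: stocks 1 and 3 are selected, stock 2 is represented by stock 3.
feasible_cost = cost(rho_demo, x_demo, [1, 0, 1], K=2, A=3)
# Infeasible: all three stocks are flagged as selected although K = 2.
infeasible_cost = cost(rho_demo, x_demo, [1, 1, 1], K=2, A=3)
print(feasible_cost, infeasible_cost)
```

The feasible assignment pays no penalty, so its cost is just the negative total similarity, while the infeasible one is pushed up by two violated constraints.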

We now need to transform the cost function $C_x$ into a Hamiltonian to realize the encoding of the portfolio diversification problem. Each variable $x_{ij}$ has two possible values, $0$ and $1$, corresponding to quantum states $|0\rangle$ and $|1\rangle$. Note that every variable corresponds to a qubit, so $n^2+n$ qubits are needed for solving the portfolio diversification problem. The Pauli $Z$ operator has two eigenstates, which are the same as the states $|0\rangle$ and $|1\rangle$. Their corresponding eigenvalues are $1$ and $-1$, respectively. So we consider encoding the cost function as a Hamiltonian using the Pauli $Z$ matrix.

Now we would like to consider the mapping

$$x_{ij}\mapsto\frac{I-Z_{ij}}{2}, \tag{3}$$

where $Z_{ij}=I\otimes I\otimes\ldots\otimes Z\otimes\ldots\otimes I$ with $Z$ operating on the qubit at position $ij$. Under this mapping, the state of qubit $ij$ determines the value of $x_{ij}$. If qubit $ij$ is in state $|1\rangle$, then $x_{ij}|1\rangle=\frac{I-Z_{ij}}{2}|1\rangle=1|1\rangle$, which means stock $j$ is in the index portfolio and represents stock $i$. If qubit $ij$ is in state $|0\rangle$, then $x_{ij}|0\rangle=\frac{I-Z_{ij}}{2}|0\rangle=0|0\rangle$. The variables $y_j$ are mapped in the same way on their own qubits.

Thus, using the above mapping, we can transform the cost function $C_x$ into a Hamiltonian $H_C$ for the system of $n^2+n$ qubits and realize the quantization of the portfolio diversification problem. The ground state of $H_C$ is then the optimal solution to the portfolio diversification problem. In the following section, we will show how to use a parameterized quantum circuit to find the ground state, i.e., the eigenvector with the smallest eigenvalue.
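
To make the encoding concrete, the following worked expansion shows how two of the terms of Eq. (2) turn into Pauli operators under Eq. (3). It assumes that $y_j$ is mapped in the same way as $x_{ij}$, and it writes $Z_j$ for the Pauli $Z$ acting on the qubit that stores $y_j$; this symbol is introduced here only for illustration.

```latex
\begin{aligned}
-\rho_{ij}\,x_{ij} &\mapsto -\frac{\rho_{ij}}{2}\left(I-Z_{ij}\right),\\
1-y_j &\mapsto I-\frac{I-Z_j}{2}=\frac{I+Z_j}{2},\\
A\,x_{ij}(1-y_j) &\mapsto A\,\frac{I-Z_{ij}}{2}\cdot\frac{I+Z_j}{2}
=\frac{A}{4}\left(I-Z_{ij}+Z_j-Z_{ij}Z_j\right).
\end{aligned}
```

Every term contains only $I$ and $Z$ operators, so the resulting Hamiltonian is diagonal in the computational basis, and each basis state $|z\rangle$ is an eigenstate whose eigenvalue is the classical cost of the bit string $z$.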

Paddle Quantum Implementation

To investigate the portfolio diversification problem using Paddle Quantum, there are some required packages to import, which are shown below.

In [1]:
# Import packages needed
import numpy as np
import pandas as pd
import datetime

# Import related modules from Paddle Quantum and PaddlePaddle
import paddle
import paddle_quantum
from paddle_quantum.ansatz import Circuit
from paddle_quantum.finance import DataSimulator, portfolio_diversification_hamiltonian

Prepare experimental data

In this tutorial, we choose stocks as investment assets. For the data used in the experimental tests, two options are provided:

  • The first method is to generate random data according to certain requirements, e.g. number of assets.

If the user prepares data using this method, then when initializing the data, it is necessary to give the list of parameters: a list of names of investable stocks (assets), the start date, and the end date of the trading data.

In [2]:
num_assets = 3  # Number of investable assets
stocks = [("TICKER%s" % i) for i in range(num_assets)]
data = DataSimulator(stocks=stocks, start=datetime.datetime(2016, 1, 1), end=datetime.datetime(2016, 1, 30))
data.randomly_generate() # Generate random data
  • The second method is to supply your own data, e.g. real stock data that you have collected. Because the file may contain many stocks, you can specify the number of stocks used for this experiment, i.e., num_assets as initialized above.

We collect the closing prices of $12$ stocks for $35$ trading days into the realStockData_12.csv file, and choose to read only the first $3$ stocks.

In this tutorial, we choose to read real data as experimental data.

In [3]:
df = pd.read_csv('realStockData_12.csv') 
dt = []
for i in range(num_assets):
    mylist = df['closePrice'+str(i)].tolist()
    dt.append(mylist)
# Output the closing prices of the num_assets stocks read from the file
print(dt)   
# Specify the experimental data as a local file read by the user
data.set_data(dt)  
[[16.87, 17.18, 17.07, 17.15, 16.66, 16.79, 16.69, 16.99, 16.76, 16.52, 16.33, 16.39, 16.45, 16.0, 16.09, 15.54, 13.99, 14.6, 14.63, 14.77, 14.62, 14.5, 14.79, 14.77, 14.65, 15.03, 15.37, 15.2, 15.24, 15.59, 15.58, 15.23, 15.04, 14.99, 15.11, 14.5], [32.56, 32.05, 31.51, 31.76, 31.68, 32.2, 31.46, 31.68, 31.39, 30.49, 30.53, 30.46, 29.87, 29.21, 30.11, 28.98, 26.63, 27.62, 27.64, 27.9, 27.5, 28.67, 29.08, 29.08, 29.95, 30.8, 30.42, 29.7, 29.65, 29.85, 29.25, 28.9, 29.33, 30.11, 29.67, 29.59], [5.4, 5.48, 5.46, 5.49, 5.39, 5.47, 5.46, 5.53, 5.5, 5.47, 5.39, 5.35, 5.37, 5.24, 5.26, 5.08, 4.57, 4.44, 4.5, 4.56, 4.52, 4.59, 4.66, 4.67, 4.66, 4.72, 4.84, 4.81, 4.84, 4.88, 4.89, 4.82, 4.74, 4.84, 4.79, 4.63]]

Encoding Hamiltonian

Here we construct the Hamiltonian $H_C$ of Eq. (2) with the replacement in Eq. (3).

To encode the Hamiltonian, we first need to calculate the similarity matrix $\rho$ between the returns of each stock. This calculation is available in the finance module and can be called directly.

In [4]:
rho = data.get_similarity_matrix()
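
The tutorial relies on Dynamic Time Warping (DTW) to compare two price series. As a rough illustration of the underlying idea, here is a minimal textbook DTW distance in pure Python. It is a sketch under stated assumptions, not the Paddle Quantum implementation: the function name `dtw_distance` is hypothetical, the local cost is the absolute difference, and how the library turns such a distance into a similarity $\rho_{ij}$ is not described here and is not reproduced.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])  # second series is stretched
shifted = dtw_distance([0, 0], [1, 1])              # constant offset of 1
print(same_shape, shifted)
```

A stretched copy of a series has distance zero because DTW may align one point with several points of the other series, whereas a constant offset cannot be warped away.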

Based on the provided and calculated parameters, the Hamiltonian is constructed below. Here we set the penalty parameter to the number of investable stocks.

In [5]:
q = 2 # Number of stocks in the index portfolio (K in Eq. (2))
penalty = num_assets # penalty parameter 
hamiltonian = portfolio_diversification_hamiltonian(penalty, rho, q)

Calculating the loss function

We adopt a parameterized quantum circuit consisting of $U_3(\vec{\theta})$ and $\text{CNOT}$ gates, which can be constructed by calling the built-in method complex_entangled_layer().

After running the quantum circuit, we obtain the circuit output $|\vec{\theta}\rangle$. From the output state of the circuit we can calculate the objective function, which is also the loss function of the portfolio diversification problem:

$$L(\vec{\theta})=\langle\vec{\theta}|H_C|\vec{\theta}\rangle. \tag{4}$$

We then use a classical optimization algorithm to minimize this function and find the optimal parameters $\vec{\theta}^*$. The following code shows a complete network built with Paddle Quantum and PaddlePaddle.

In [6]:
class PDNet(paddle.nn.Layer):

    def __init__(self, num_qubits, p, dtype="float64"):
        super(PDNet, self).__init__()
        self.num_qubits = num_qubits
        self.depth = p
        self.cir = Circuit(self.num_qubits)
        self.cir.complex_entangled_layer(depth=self.depth)

    def forward(self):
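        """
        Forward propagation: run the circuit and evaluate the loss.
        Relies on the module-level init_state and loss_func defined in the training cell below.
        """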
        state = self.cir(init_state)
        loss = loss_func(state)

        return loss, self.cir
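
Because the Hamiltonian built from Eq. (3) contains only $I$ and $Z$ operators, it is diagonal in the computational basis, and the loss in Eq. (4) reduces to a probability-weighted average of classical costs, $L=\sum_z p(z)\,C(z)$. The toy sketch below illustrates this with two qubits; the function name `expected_cost` and all numbers are hypothetical and unrelated to the Hamiltonian constructed above.

```python
def expected_cost(probabilities, cost_of):
    """Expectation of a diagonal Hamiltonian: sum over z of p(z) * C(z)."""
    return sum(p * cost_of[z] for z, p in probabilities.items())


# Made-up classical costs for the four 2-bit strings.
toy_costs = {"00": -1.0, "01": 2.0, "10": 2.0, "11": 3.0}

spread = {"00": 0.5, "11": 0.5}  # output state spread over two bit strings
peaked = {"00": 1.0}             # output state concentrated on the optimum

spread_loss = expected_cost(spread, toy_costs)
peaked_loss = expected_cost(peaked, toy_costs)
print(spread_loss, peaked_loss)
```

The loss can never fall below the smallest classical cost, and it reaches that value only when the output distribution is concentrated on optimal bit strings, which is why minimizing Eq. (4) steers the circuit toward a solution.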

Training the quantum neural network

After defining the quantum neural network, we use the gradient descent method to update the parameters to minimize the expectation value in Eq. (4).

In [7]:
SEED = 1100   # Set a global RNG seed 
p = 2        # Number of layers in the quantum circuit
ITR = 150    # Number of training iterations
LR = 0.4     # Learning rate of the optimization method based on gradient descent

Here, we optimize the network defined above in PaddlePaddle.

In [13]:
# Number of investable stocks
n = len(rho)
# Fix paddle random seed
paddle.seed(SEED)
# Number of qubits needed in the circuit: n^2 + n
num_qubits = n * (n+1)
# Building Quantum Neural Networks
net = PDNet(num_qubits, p)
# Define the initial state
init_state = paddle_quantum.state.zero_state(num_qubits)
# Define loss function
loss_func = paddle_quantum.loss.ExpecVal(hamiltonian)
# Use Adam optimizer
opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())

# Gradient descent iteration
for itr in range(1, ITR + 1):
    # run circuit
    loss, cir = net()
    # compute gradient and optimize
    loss.backward()
    opt.minimize(loss)
    opt.clear_grad()
    if itr % 10 == 0:
        print("iter:", itr, "    loss:", "%.4f"% loss.numpy())
iter: 10     loss: 7.7805
iter: 20     loss: 5.4413
iter: 30     loss: 3.6023
iter: 40     loss: 3.2911
iter: 50     loss: 1.9415
iter: 60     loss: 0.3871
iter: 70     loss: 0.1342
iter: 80     loss: 0.0774
iter: 90     loss: 0.0122
iter: 100     loss: 0.0068
iter: 110     loss: -0.0001
iter: 120     loss: -0.0019
iter: 130     loss: -0.0025
iter: 140     loss: -0.0028
iter: 150     loss: -0.0028

Decoding the quantum solution

After obtaining the minimum value of the loss function and the corresponding set of parameters $\vec{\theta}^*$, our task is not yet complete. To obtain an approximate solution to the portfolio diversification problem, we need to decode the solution of the classical optimization problem from the quantum state $|\vec{\theta}^*\rangle$ output by the circuit. Physically, to decode a quantum state, we need to measure it and then calculate the probability distribution of the measurement results:

$$p(z)=|\langle z|\vec{\theta}^*\rangle|^2. \tag{5}$$

If the parameterized quantum circuit is sufficiently expressive, the greater the probability of a certain bit string, the greater the probability that it corresponds to an optimal solution to the portfolio diversification problem.

Paddle Quantum provides a function to read the probability distribution of the measurement results of the state output by the quantum circuit:

In [15]:
# Repeat the simulated measurement of the circuit output state 2048 times
final_state = cir(init_state)
prob_measure = final_state.measure(shots=2048)
investment = max(prob_measure, key=prob_measure.get)
print("The bit string form of the solution: ", investment)
The bit string form of the solution:  100001001101

After measurement, we take the bit string with the highest probability of occurrence as the index portfolio. In the result above, 100001001101, we have $n=3$ investable stocks and choose two for the index portfolio. The first $n^2=9$ bits, 100001001, represent $x_{ij}$, and every $3$ bits form one group, with one group per stock $i$. In the first group, 100, the first bit is set to $1$, so the first stock forms a class by itself. In the second group, 001, and the third group, 001, the third bit is set to $1$, so the second and third stocks are classified into the same class. Moreover, the positions of $1$ in the first and third groups satisfy $i=j$, i.e., these two stocks are the most representative stocks of their respective classes. Across the three groups, $1$ appears only at $j=1$ and $j=3$, i.e., at two distinct positions, which matches our choice of two stocks in the index portfolio. The last $3$ bits, 101, represent $y_j$ and indicate that the first stock and the third stock are selected for the index portfolio. If the final result is not a valid solution of this kind, users can try to obtain a better training result by adjusting the training settings, such as the circuit depth p, the number of iterations ITR, and the learning rate LR.
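
The decoding described above can be written as a short helper. This is a minimal sketch that assumes the bit layout stated in the text (the first $n^2$ bits are $x_{ij}$ in row-major order, the last $n$ bits are $y_j$); the function name `decode_portfolio` and the returned dictionary are hypothetical and not part of Paddle Quantum. Stocks are numbered from 1 to match the prose.

```python
def decode_portfolio(bits, n, K):
    """Split a measured bit string into x and y and check the constraints."""
    if len(bits) != n * n + n:
        raise ValueError("expected n*n + n bits")
    x = [[int(bits[i * n + j]) for j in range(n)] for i in range(n)]
    y = [int(b) for b in bits[n * n:]]
    valid = (
        sum(y) == K                                        # exactly K selected
        and all(sum(row) == 1 for row in x)                # one class per stock
        and all(x[i][j] <= y[j] for i in range(n) for j in range(n))
        and all(x[j][j] == y[j] for j in range(n))         # representatives
    )
    clusters = {}  # representative stock -> list of stocks it represents
    for i in range(n):
        for j in range(n):
            if x[i][j] == 1:
                clusters.setdefault(j + 1, []).append(i + 1)
    selected = [j + 1 for j in range(n) if y[j] == 1]
    return {"valid": valid, "selected": selected, "clusters": clusters}


result = decode_portfolio("100001001101", n=3, K=2)
print(result)
# {'valid': True, 'selected': [1, 3], 'clusters': {1: [1], 3: [2, 3]}}
```

Running the validity check on every measured bit string is a cheap way to tell whether training has converged to a feasible portfolio before interpreting the result.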

Conclusion

In this tutorial, we focus on how to classify investable stocks and how to select representative ones for our portfolio. In this problem, each investment item requires $n$ qubits to represent the classification and $1$ qubit to represent whether it is selected for the portfolio or not. Due to the limitation of the number of qubits, the number of investment items that can be handled is still small.
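
The qubit requirement can be made concrete with a little arithmetic. The sketch below uses hypothetical helper names and simply counts the $n^2+n$ qubits and the $2^{n^2+n}$ amplitudes that a full state-vector simulation has to store.

```python
def num_qubits(n):
    """Qubits needed for n stocks: n*n for x_ij plus n for y_j."""
    return n * n + n


def statevector_amplitudes(n):
    """Number of complex amplitudes in a full state-vector simulation."""
    return 2 ** num_qubits(n)


for n_stocks in (3, 4, 5):
    print(n_stocks, num_qubits(n_stocks), statevector_amplitudes(n_stocks))
# 3 12 4096
# 4 20 1048576
# 5 30 1073741824
```

Going from three to five stocks already raises the state-vector size from a few thousand to roughly a billion amplitudes, which illustrates why only small instances are tractable with this encoding.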


References

[1] Orus, Roman, Samuel Mugel, and Enrique Lizaso. "Quantum computing for finance: Overview and prospects." Reviews in Physics 4 (2019): 100028.

[2] Egger, Daniel J., et al. "Quantum computing for Finance: state of the art and future prospects." IEEE Transactions on Quantum Engineering (2020).