
Matrix Nearness Problems for Sums of Kronecker Products and Preconditioning

I’ve asked Alex to contribute today’s article to give people a taste of what kinds of problems IZENTE helps to solve. I first met Alex when I saw him give a talk on semi-separable matrices. He is one of the brightest programmers I have ever worked with. I think this article really shows the value that IZENTE can offer to companies. — David Lee

I was asked to write up a bit on what kinds of things I have been working on. One of the more interesting problems that came up during the last year was preconditioning a large, sparse-ish system. The client had a very large 3-D system of equations whose dimensions were weakly coupled. The big problem they were running into was that the solver they were using wouldn't always converge, or would take far too long (multiple days).

Introduction

I had actually dealt with a similar issue before as part of my studies in grad school, so I had some ideas about how to solve this. The basic setup was that they had some operator like:

T \otimes I \otimes I + I \otimes T \otimes I + I  \otimes I \otimes T + ...

Where \otimes (the circle with the x in it) denotes the Kronecker product, and I is the usual identity matrix. Unfortunately, WordPress doesn't work well with LaTeX, so I apologize for any formatting issues.

Now this is basically a sum of matrices that look like:

A \otimes B \otimes C

And if we had a matrix of that form, we could invert it very quickly:

(A \otimes B \otimes C)^{-1} = A^{-1} \otimes B^{-1} \otimes C^{-1}

If we take all of the matrices to be N by N, the full operator is N^3 by N^3, so inverting it naively would take O(N^9) operations, but because of the above property we only need O(N^3) operations. That is tremendously quicker: if N = 10, it is a factor of one million, which can take the problem from impossible to possible.
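To make that concrete, here is a small numpy sketch (the matrix names and sizes are purely illustrative, not taken from the client's system) comparing the direct inverse of the full Kronecker product with the inverse built from the three small factors:

# Inverse-of-a-Kronecker-product identity, checked numerically.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A, B, C = (rng.standard_normal((n, n)) + n * np.eye(n) for _ in range(3))

K = np.kron(np.kron(A, B), C)   # the full n^3 x n^3 operator

# Invert the full operator directly (the O(N^9) route)...
K_inv_direct = np.linalg.inv(K)

# ...versus inverting the three small factors (the O(N^3) route).
K_inv_fast = np.kron(np.kron(np.linalg.inv(A), np.linalg.inv(B)), np.linalg.inv(C))

print(np.allclose(K_inv_direct, K_inv_fast))   # True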

One quick note for those of you not well versed in linear algebra: a sum of Kronecker products is not equally easy to invert.

(A_1 \otimes B_1 + A_2 \otimes B_2)^{-1}  \neq A_1^{-1} \otimes B_1^{-1} + A_2^{-1} \otimes B_2^{-1}
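A quick numerical check makes the point; the matrices here are small random ones chosen purely for illustration:

# Inverting a sum of Kronecker products term by term does NOT give the inverse of the sum.
import numpy as np

rng = np.random.default_rng(1)
n = 3
A1, B1, A2, B2 = (rng.standard_normal((n, n)) + n * np.eye(n) for _ in range(4))

S = np.kron(A1, B1) + np.kron(A2, B2)
termwise = (np.kron(np.linalg.inv(A1), np.linalg.inv(B1))
            + np.kron(np.linalg.inv(A2), np.linalg.inv(B2)))

print(np.allclose(np.linalg.inv(S), termwise))   # False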

Okay so we know what form our problem takes, and we know what form we want our solution to be in. How can we find the optimal approximation of that type?

Approximation of Sums of Kronecker Products

One of the very first questions we should be asking is "optimal in what norm?" There are infinitely many norms we could pose this problem in; the good news is that very few of them make much sense here, and if the error in the approximation goes to zero in any reasonable norm, it goes to zero in most of them. For this case we will use the Frobenius norm, which is induced by an inner product, a property that will be very useful. Let us write:

L = T \otimes I \otimes I + I \otimes T \otimes I + I  \otimes I \otimes T + ...

As our operator we seek to approximate, and

M = (aI+bT) \otimes (aI+bT) \otimes (aI+bT)

As the approximation we are looking for. So we seek to find the optimal ‘a’ and ‘b’ that solve the problem.

Now, writing R = L - M for the residual, in the Frobenius norm we have:

||R||^2_F = \langle L-M,L-M\rangle _F = \langle L,L\rangle _F - 2\langle L,M\rangle _F + \langle M,M\rangle _F

Using the fact that

||A\otimes B \otimes C ||_F = ||A||_F||B||_F||C||_F

(see appendix) we can compute ||M||^2_F quickly since it is just the Kronecker product of three matrices.

||M||^2_F = ||a I + b T||^6_F = (a^2 \langle I,I\rangle _F + 2 a b \langle I,T \rangle _F + b^2 ||T||^2_F)^3

||M||^2_F = ( a^2 N + 2 a b \, trace(T) + b^2 ||T||^2_F )^3

Where here I assume every matrix is N by N. Now it should be clear that we just have an optimization problem in two unknowns. The remaining inner products, \langle L,L\rangle _F and \langle L,M\rangle _F, can be worked out in the same way, and the resulting two-variable problem can be solved numerically.
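To sketch how this looks in practice, here is a minimal numpy/scipy example. It forms L and M explicitly for a small N so everything can be checked directly; in the actual application the inner products above would be evaluated in closed form and L would never be built. The test matrix T, the starting guess, and the use of Nelder-Mead are all just illustrative choices.

# Fit M = (aI + bT) (x) (aI + bT) (x) (aI + bT) to L in the Frobenius norm.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
N = 5
T = rng.standard_normal((N, N))
T = T + T.T + N * np.eye(N)      # an illustrative symmetric stand-in for T
I = np.eye(N)

# L = T (x) I (x) I + I (x) T (x) I + I (x) I (x) T
L = (np.kron(np.kron(T, I), I)
     + np.kron(np.kron(I, T), I)
     + np.kron(np.kron(I, I), T))

def residual(params):
    a, b = params
    F = a * I + b * T
    M = np.kron(np.kron(F, F), F)
    return np.linalg.norm(L - M, 'fro') ** 2

result = minimize(residual, x0=[1.0, 1.0], method='Nelder-Mead')
a_opt, b_opt = result.x
print(a_opt, b_opt, np.sqrt(result.fun))   # optimal a, b and the residual norm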

In the end this approach wound up stabilizing the solver and dramatically improving the running time of the simulation.

Appendix

I used the fact that:

\langle A\otimes B \otimes C, D\otimes E \otimes F\rangle _F = \langle A,D\rangle _F\langle B,E\rangle _F\langle C,F\rangle _F

I have not seen a proof of this written down anywhere, so I will give one below.

Lemma 1. Given matrices of compatible dimensions, such that the inner products below make sense:

\langle A_1\otimes \dots \otimes A_n,B_1\otimes \dots \otimes B_n\rangle _F = \langle A_1,B_1\rangle _F\dots \langle A_n,B_n\rangle _F

Proof. It is enough to prove the claim for \langle A_1\otimes A_2,B_1\otimes B_2 \rangle _F, because A_1\otimes \dots \otimes A_n = A_1 \otimes C with C = A_2 \otimes \dots \otimes A_n, so the property can be applied recursively. Let A_1, B_1 \in \mathbb{R}^{N \times P} and A_2, B_2 \in \mathbb{R}^{M \times Q}; then

\langle A_1 \otimes A_2, B_1 \otimes B_2\rangle _F = \sum_{i=1}^{NM} \sum_{j=1}^{PQ} (A_1\otimes A_2)_{ij} (B_1 \otimes B_2)_{ij}

If we write i = (k-1)M + m and j = (l-1)Q + n, where k ranges from 1 to N, l from 1 to P, m from 1 to M, and n from 1 to Q, then we have:

\langle A_1 \otimes A_2, B_1 \otimes B_2\rangle _F = \sum_{k=1}^{N} \sum_{m=1}^{M}\sum_{l=1}^{P} \sum_{n=1}^{Q} (A_1)_{kl} (A_2)_{mn} (B_1)_{kl} (B_2)_{mn}

\langle A_1 \otimes A_2, B_1 \otimes B_2\rangle _F = \left(\sum_{k=1}^{N} \sum_{l=1}^{P}(A_1)_{kl} (B_1)_{kl}\right) \left(\sum_{m=1}^{M}\sum_{n=1}^{Q} (A_2)_{mn} (B_2)_{mn} \right)

\langle A_1 \otimes A_2, B_1 \otimes B_2\rangle _F = \langle A_1,B_1\rangle _F \langle A_2,B_2\rangle _F
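For anyone who would rather trust a computer than the algebra, here is a quick numerical check of the two-factor case; the matrix sizes are arbitrary and chosen only for illustration.

# Verify <A1 (x) A2, B1 (x) B2>_F = <A1,B1>_F <A2,B2>_F on random matrices,
# using <X, Y>_F = sum of the entrywise products.
import numpy as np

rng = np.random.default_rng(3)
A1, B1 = rng.standard_normal((2, 4, 3))   # A1, B1 in R^{4 x 3}
A2, B2 = rng.standard_normal((2, 2, 5))   # A2, B2 in R^{2 x 5}

lhs = np.sum(np.kron(A1, A2) * np.kron(B1, B2))
rhs = np.sum(A1 * B1) * np.sum(A2 * B2)
print(np.isclose(lhs, rhs))               # True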

Hopefully that is useful for someone.
