Biconjugate gradient stabilized method

In numerical linear algebra, the biconjugate gradient stabilized method, often abbreviated as BiCGSTAB, is an iterative method developed by H. A. van der Vorst for the numerical solution of nonsymmetric linear systems. It is a variant of the biconjugate gradient method (BiCG) and has faster and smoother convergence than the original BiCG as well as other variants such as the conjugate gradient squared method (CGS). It is a Krylov subspace method. Unlike the original BiCG method, it doesn't require multiplication by the transpose of the system matrix.

Algorithmic steps

edit

Unpreconditioned BiCGSTAB

edit

In the following sections, (x,y) = xT y denotes the dot product of vectors. To solve a linear system Ax = b, BiCGSTAB starts with an initial guess x0 and proceeds as follows:

  1. r0 = bAx0
  2. Choose an arbitrary vector 0 such that (0, r0) ≠ 0, e.g., 0 = r0
  3. ρ0 = (0, r0)
  4. p0 = r0
  5. For i = 1, 2, 3, …
    1. v = Api−1
    2. α = ρi−1/(0, v)
    3. h = xi−1 + αpi−1
    4. s = ri−1αv
    5. If h is accurate enough, i.e., if s is small enough, then set xi = h and quit
    6. t = As
    7. ω = (t, s)/(t, t)
    8. xi = h + ωs
    9. ri = sωt
    10. If xi is accurate enough, i.e., if ri is small enough, then quit
    11. ρi = (0, ri)
    12. β = (ρi/ρi−1)(α/ω)
    13. pi = ri + β(pi−1ωv)

In some cases, choosing the vector 0 randomly improves numerical stability.[1]

Preconditioned BiCGSTAB

edit

Preconditioners are usually used to accelerate convergence of iterative methods. To solve a linear system Ax = b with a preconditioner K = K1K2A, preconditioned BiCGSTAB starts with an initial guess x0 and proceeds as follows:

  1. r0 = bAx0
  2. Choose an arbitrary vector 0 such that (0, r0) ≠ 0, e.g., 0 = r0
  3. ρ0 = (0, r0)
  4. p0 = r0
  5. For i = 1, 2, 3, …
    1. y = K −1
      2
       
      K −1
      1
       
      pi−1
    2. v = Ay
    3. α = ρi−1/(0, v)
    4. h = xi−1 + αy
    5. s = ri−1αv
    6. If h is accurate enough then xi = h and quit
    7. z = K −1
      2
       
      K −1
      1
       
      s
    8. t = Az
    9. ω = (K −1
      1
       
      t, K −1
      1
       
      s)/(K −1
      1
       
      t, K −1
      1
       
      t)
    10. xi = h + ωz
    11. ri = sωt
    12. If xi is accurate enough then quit
    13. ρi = (0, ri)
    14. β = (ρi/ρi−1)(α/ω)
    15. pi = ri + β(pi−1ωv)

This formulation is equivalent to applying unpreconditioned BiCGSTAB to the explicitly preconditioned system

Ãx̃ =

with à = K −1
1
 
AK −1
2
 
, = K2x and = K −1
1
 
b
. In other words, both left- and right-preconditioning are possible with this formulation.

Derivation

edit

BiCG in polynomial form

edit

In BiCG, the search directions pi and i and the residuals ri and i are updated using the following recurrence relations:

pi = ri−1 + βipi−1,
i = i−1 + βii−1,
ri = ri−1αiApi,
i = i−1αiATi.

The constants αi and βi are chosen to be

αi = ρi/(i, Api),
βi = ρi/ρi−1

where ρi = (i−1, ri−1) so that the residuals and the search directions satisfy biorthogonality and biconjugacy, respectively, i.e., for ij,

(i, rj) = 0,
(i, Apj) = 0.

It is straightforward to show that

ri = Pi(A)r0,
i = Pi(AT)0,
pi+1 = Ti(A)r0,
i+1 = Ti(AT)0

where Pi(A) and Ti(A) are ith-degree polynomials in A. These polynomials satisfy the following recurrence relations:

Pi(A) = Pi−1(A) − αiATi−1(A),
Ti(A) = Pi(A) + βi+1Ti−1(A).

Derivation of BiCGSTAB from BiCG

edit

It is unnecessary to explicitly keep track of the residuals and search directions of BiCG. In other words, the BiCG iterations can be performed implicitly. In BiCGSTAB, one wishes to have recurrence relations for

i = Qi(A)Pi(A)r0

where Qi(A) = (Iω1A)(Iω2A)⋯(IωiA) with suitable constants ωj instead of ri = Pi(A)r0 in the hope that Qi(A) will enable faster and smoother convergence in i than ri.

It follows from the recurrence relations for Pi(A) and Ti(A) and the definition of Qi(A) that

Qi(A)Pi(A)r0 = (IωiA)(Qi−1(A)Pi−1(A)r0αiAQi−1(A)Ti−1(A)r0),

which entails the necessity of a recurrence relation for Qi(A)Ti(A)r0. This can also be derived from the BiCG relations:

Qi(A)Ti(A)r0 = Qi(A)Pi(A)r0 + βi+1(IωiA)Qi−1(A)Pi−1(A)r0.

Similarly to defining i, BiCGSTAB defines

i+1 = Qi(A)Ti(A)r0.

Written in vector form, the recurrence relations for i and i are

i = i−1 + βi(Iωi−1A)i−1,
i = (IωiA)(i−1αiAi).

To derive a recurrence relation for xi, define

si = i−1αiAi.

The recurrence relation for i can then be written as

i = i−1αiAiωiAsi,

which corresponds to

xi = xi−1 + αii + ωisi.

Determination of BiCGSTAB constants

edit

Now it remains to determine the BiCG constants αi and βi and choose a suitable ωi.

In BiCG, βi = ρi/ρi−1 with

ρi = (i−1, ri−1) = (Pi−1(AT)0, Pi−1(A)r0).

Since BiCGSTAB does not explicitly keep track of i or ri, ρi is not immediately computable from this formula. However, it can be related to the scalar

ρ̃i = (Qi−1(AT)0, Pi−1(A)r0) = (0, Qi−1(A)Pi−1(A)r0) = (0, ri−1).

Due to biorthogonality, ri−1 = Pi−1(A)r0 is orthogonal to Ui−2(AT)0 where Ui−2(AT) is any polynomial of degree i − 2 in AT. Hence, only the highest-order terms of Pi−1(AT) and Qi−1(AT) matter in the dot products (Pi−1(AT)0, Pi−1(A)r0) and (Qi−1(AT)0, Pi−1(A)r0). The leading coefficients of Pi−1(AT) and Qi−1(AT) are (−1)i−1α1α2αi−1 and (−1)i−1ω1ω2ωi−1, respectively. It follows that

ρi = (α1/ω1)(α2/ω2)⋯(αi−1/ωi−1)ρ̃i,

and thus

βi = ρi/ρi−1 = (ρ̃i/ρ̃i−1)(αi−1/ωi−1).

A simple formula for αi can be similarly derived. In BiCG,

αi = ρi/(i, Api) = (Pi−1(AT)0, Pi−1(A)r0)/(Ti−1(AT)0, ATi−1(A)r0).

Similarly to the case above, only the highest-order terms of Pi−1(AT) and Ti−1(AT) matter in the dot products thanks to biorthogonality and biconjugacy. It happens that Pi−1(AT) and Ti−1(AT) have the same leading coefficient. Thus, they can be replaced simultaneously with Qi−1(AT) in the formula, which leads to

αi = (Qi−1(AT)0, Pi−1(A)r0)/(Qi−1(AT)0, ATi−1(A)r0) = ρ̃i/(0, AQi−1(A)Ti−1(A)r0) = ρ̃i/(0, Ap̃i).

Finally, BiCGSTAB selects ωi to minimize i = (IωiA)si in 2-norm as a function of ωi. This is achieved when

((IωiA)si, Asi) = 0,

giving the optimal value

ωi = (Asi, si)/(Asi, Asi).

Generalization

edit

BiCGSTAB can be viewed as a combination of BiCG and GMRES where each BiCG step is followed by a GMRES(1) (i.e., GMRES restarted at each step) step to repair the irregular convergence behavior of CGS, as an improvement of which BiCGSTAB was developed. However, due to the use of degree-one minimum residual polynomials, such repair may not be effective if the matrix A has large complex eigenpairs. In such cases, BiCGSTAB is likely to stagnate, as confirmed by numerical experiments.

One may expect that higher-degree minimum residual polynomials may better handle this situation. This gives rise to algorithms including BiCGSTAB2[1] and the more general BiCGSTAB(l)[2]. In BiCGSTAB(l), a GMRES(l) step follows every l BiCG steps. BiCGSTAB2 is equivalent to BiCGSTAB(l) with l = 2.

See also

edit

References

edit
  1. ^ Schoutrop, Chris; Boonkkamp, Jan ten Thije; Dijk, Jan van (July 2022). "Reliability Investigation of BiCGStab and IDR Solvers for the Advection-Diffusion-Reaction Equation". Communications in Computational Physics. 32 (1): 156–188. doi:10.4208/cicp.oa-2021-0182. ISSN 1815-2406.