Iterative method used to solve a linear system of equations
Modified Richardson iteration is an iterative method for solving a system of linear equations. Richardson iteration was proposed by Lewis Fry Richardson in his work dated 1910. It is similar to the Jacobi and Gauss–Seidel methods.
We seek the solution to a set of linear equations, expressed in matrix terms as
$$ Ax = b. $$
The Richardson iteration is
$$ x^{(k+1)} = x^{(k)} + \omega \left( b - A x^{(k)} \right), $$
where $\omega$ is a scalar parameter that has to be chosen such that the sequence $x^{(k)}$ converges.
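The update can be coded directly. The following Python sketch is only an illustration: the matrix, right-hand side, tolerance, and the choice of $\omega$ are arbitrary example values, not part of the original description.

```python
import numpy as np

def richardson(A, b, omega, x0=None, tol=1e-10, max_iter=10_000):
    """Plain Richardson iteration: x_{k+1} = x_k + omega * (b - A x_k)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = b - A @ x                    # current residual
        if np.linalg.norm(r) < tol:      # stop once the residual is small
            break
        x = x + omega * r                # Richardson update
    return x

# Small symmetric positive definite example (matrix and data are arbitrary).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
omega = 1.0 / np.linalg.eigvalsh(A).max()   # a safe choice; see the convergence discussion below
x = richardson(A, b, omega)
print(np.allclose(A @ x, b))                # True: the iterate solves Ax = b
```

In practice the step size $\omega$ is the only tuning parameter, and the discussion below shows how it controls convergence.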
It is easy to see that the method has the correct fixed points, because if it converges, then $x^{(k+1)} \approx x^{(k)}$ and $x^{(k)}$ has to approximate a solution of $Ax = b$.
Subtracting the exact solution $x$, and introducing the notation for the error $e^{(k)} = x^{(k)} - x$, we get the equality for the errors
$$ e^{(k+1)} = e^{(k)} - \omega A e^{(k)} = \left( I - \omega A \right) e^{(k)}. $$
Thus,
$$ \left\| e^{(k+1)} \right\| = \left\| \left( I - \omega A \right) e^{(k)} \right\| \leq \left\| I - \omega A \right\| \left\| e^{(k)} \right\|, $$
for any vector norm and the corresponding induced matrix norm. Thus, if $\| I - \omega A \| < 1$, the method converges.
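The contraction bound can be checked numerically. In the sketch below, the matrix, the exact solution, and the step size are arbitrary illustrative choices; each step prints the new error norm next to the bound $\| I - \omega A \| \, \| e^{(k)} \|$.

```python
import numpy as np

# Illustrative check of ||e^(k+1)|| <= ||I - omega*A|| * ||e^(k)||.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
omega = 0.2
x_exact = np.array([1.0, -1.0])
b = A @ x_exact

x = np.zeros(2)                       # start from the zero vector
M = np.eye(2) - omega * A
contraction = np.linalg.norm(M, 2)    # induced 2-norm of I - omega*A (< 1 here)
print("||I - omega A|| =", contraction)

for k in range(5):
    e_old = np.linalg.norm(x - x_exact)
    x = x + omega * (b - A @ x)       # one Richardson step
    e_new = np.linalg.norm(x - x_exact)
    print(k, e_new, "<=", contraction * e_old)  # bound holds at every step
```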
Suppose that $A$ is symmetric positive definite and that $(\lambda_j)_j$ are the eigenvalues of $A$. The error converges to $0$ if $| 1 - \omega \lambda_j | < 1$ for all eigenvalues $\lambda_j$. If, e.g., all eigenvalues are positive, this can be guaranteed if $\omega$ is chosen such that $0 < \omega < \omega_\text{max}$, $\omega_\text{max} := 2 / \lambda_\text{max}(A)$. The optimal choice, minimizing all $| 1 - \omega \lambda_j |$, is $\omega_\text{opt} := 2 / (\lambda_\text{min}(A) + \lambda_\text{max}(A))$, which gives the simplest Chebyshev iteration. This optimal choice yields a spectral radius of
$$ \min_{\omega \in (0, \omega_\text{max})} \rho \left( I - \omega A \right) = \rho \left( I - \omega_\text{opt} A \right) = 1 - \frac{2}{\kappa(A) + 1}, $$
where $\kappa(A)$ is the condition number.
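These quantities are straightforward to compute for a concrete matrix. The following sketch uses an arbitrary symmetric positive definite example to evaluate $\omega_\text{max}$, $\omega_\text{opt}$, and the resulting spectral radius, and confirms it equals $1 - 2/(\kappa(A)+1)$.

```python
import numpy as np

# Sketch: admissible range for omega, optimal omega, and resulting spectral radius
# for a symmetric positive definite example matrix.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
eigs = np.linalg.eigvalsh(A)          # eigenvalues of the symmetric matrix
lam_min, lam_max = eigs[0], eigs[-1]

omega_max = 2.0 / lam_max
omega_opt = 2.0 / (lam_min + lam_max)
kappa = lam_max / lam_min             # 2-norm condition number for SPD A

rho_opt = np.max(np.abs(1.0 - omega_opt * eigs))   # spectral radius of I - omega_opt*A
print(omega_max, omega_opt)
print(rho_opt, 1.0 - 2.0 / (kappa + 1.0))          # the last two values agree
```

The closer $\kappa(A)$ is to $1$, the smaller the spectral radius and the faster the iteration converges.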
If there are both positive and negative eigenvalues, the method will diverge for any $\omega$ if the initial error $e^{(0)}$ has nonzero components in the corresponding eigenvectors.
Consider minimizing the function $F(x) = \frac{1}{2} \left\| \tilde{A} x - \tilde{b} \right\|_2^2$. Since this is a convex function, a sufficient condition for optimality is that the gradient is zero ($\nabla F(x) = 0$), which gives rise to the equation
$$ \tilde{A}^T \tilde{A} x = \tilde{A}^T \tilde{b}. $$
Define $A = \tilde{A}^T \tilde{A}$ and $b = \tilde{A}^T \tilde{b}$. Because of the form of $A$, it is a positive semi-definite matrix, so it has no negative eigenvalues.
A step of gradient descent is
$$ x^{(k+1)} = x^{(k)} - t \nabla F(x^{(k)}) = x^{(k)} - t \left( A x^{(k)} - b \right), $$
which is equivalent to the Richardson iteration by making $t = \omega$.
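The least-squares connection can be demonstrated with a small experiment. In the sketch below, $\tilde{A}$, $\tilde{b}$, the step size $t$, and the iteration count are illustrative choices; gradient descent on $F$ (equivalently, Richardson iteration on the normal equations) recovers the least-squares solution.

```python
import numpy as np

# Gradient descent on F(x) = 0.5*||A_tilde x - b_tilde||^2 is Richardson iteration
# applied to the normal equations A x = b with A = A_tilde^T A_tilde, b = A_tilde^T b_tilde.
rng = np.random.default_rng(0)
A_tilde = rng.standard_normal((20, 3))          # arbitrary overdetermined system
b_tilde = rng.standard_normal(20)

A = A_tilde.T @ A_tilde                          # positive semi-definite
b = A_tilde.T @ b_tilde

t = 1.0 / np.linalg.eigvalsh(A).max()            # safe step size (t < 2 / lambda_max)
x = np.zeros(3)
for _ in range(5000):
    x = x - t * (A @ x - b)                      # gradient step == Richardson step

print(np.allclose(x, np.linalg.lstsq(A_tilde, b_tilde, rcond=None)[0]))  # True
```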