# p?gels

Solves overdetermined or underdetermined linear systems involving a matrix of full rank.

## Syntax

## Include Files

- `mkl_scalapack.h`

## Description

The `p?gels` function solves overdetermined or underdetermined real/complex linear systems involving an `m`-by-`n` matrix `sub(A) = A(ia:ia+m-1, ja:ja+n-1)`, or its transpose/conjugate-transpose, using a QR or LQ factorization of `sub(A)`. It is assumed that `sub(A)` has full rank.

The following options are provided:

1. If `trans = 'N'` and `m ≥ n`: find the least squares solution of an overdetermined system, that is, solve the least squares problem minimize `||sub(B) - sub(A)*X||`.
2. If `trans = 'N'` and `m < n`: find the minimum norm solution of an underdetermined system `sub(A)*X = sub(B)`.
3. If `trans = 'T'` and `m ≥ n`: find the minimum norm solution of an underdetermined system `sub(A)^T*X = sub(B)`.
4. If `trans = 'T'` and `m < n`: find the least squares solution of an overdetermined system, that is, solve the least squares problem minimize `||sub(B) - sub(A)^T*X||`.

Here `sub(B)` denotes `B(ib:ib+m-1, jb:jb+nrhs-1)` when `trans = 'N'` and `B(ib:ib+n-1, jb:jb+nrhs-1)` otherwise. Several right-hand side vectors `b` and solution vectors `x` can be handled in a single call. When `trans = 'N'`, the solution vectors are stored as the columns of the `n`-by-`nrhs` right-hand side matrix `sub(B)`; otherwise, they are stored as the columns of the `m`-by-`nrhs` right-hand side matrix `sub(B)`.
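
The four options above can be summarized in a small lookup. The following is a minimal sketch in plain C; the type and function names (`gels_problem`, `gels_case`, `gels_classify`) are hypothetical helpers for illustration and are not part of oneMKL.

```c
#include <assert.h>

/* Hypothetical helper types for illustration; not part of oneMKL. */
typedef enum { LEAST_SQUARES, MINIMUM_NORM } gels_problem;

typedef struct {
    gels_problem kind; /* which problem p?gels solves for these inputs */
    int rhs_rows;      /* rows of sub(B) holding right-hand sides on entry */
    int sol_rows;      /* leading rows of sub(B) holding the solution on exit */
} gels_case;

/* Map (trans, m, n) to one of the four documented options. */
gels_case gels_classify(char trans, int m, int n)
{
    gels_case c;
    assert(trans == 'N' || trans == 'T');
    assert(m >= 0 && n >= 0);
    if (trans == 'N') {
        /* sub(A) is m-by-n: overdetermined when m >= n. */
        c.kind = (m >= n) ? LEAST_SQUARES : MINIMUM_NORM;
        c.rhs_rows = m;
        c.sol_rows = n;
    } else {
        /* sub(A)^T is n-by-m: underdetermined when m >= n. */
        c.kind = (m >= n) ? MINIMUM_NORM : LEAST_SQUARES;
        c.rhs_rows = n;
        c.sol_rows = m;
    }
    return c;
}
```

For example, `gels_classify('T', 5, 3)` reports a minimum norm problem whose right-hand sides occupy 3 rows of `sub(B)` and whose solution occupies 5 rows.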

## Input Parameters

- `trans`: (global) Must be `'N'` or `'T'`.
  If `trans = 'N'`, the linear system involves matrix `sub(A)`.
  If `trans = 'T'`, the linear system involves the transposed matrix `A^T` (for real flavors only).
- `m`: (global) The number of rows in the distributed matrix `sub(A)` (`m ≥ 0`).
- `n`: (global) The number of columns in the distributed matrix `sub(A)` (`n ≥ 0`).
- `nrhs`: (global) The number of right-hand sides; the number of columns in the distributed submatrices `sub(B)` and `X` (`nrhs ≥ 0`).
- `a`: (local) Pointer into the local memory to an array of size `lld_a*LOCc(ja+n-1)`. On entry, contains the `m`-by-`n` matrix `A`.
- `ia`, `ja`: (global) The row and column indices in the global matrix `A` indicating the first row and the first column of the submatrix `sub(A)`, respectively.
- `desca`: (global and local) Array of size `dlen_`. The array descriptor for the distributed matrix `A`.
- `b`: (local) Pointer into the local memory to an array of local size `lld_b*LOCc(jb+nrhs-1)`. On entry, this array contains the local pieces of the distributed matrix `B` of right-hand side vectors, stored columnwise; `sub(B)` is `m`-by-`nrhs` if `trans = 'N'`, and `n`-by-`nrhs` otherwise.
- `ib`, `jb`: (global) The row and column indices in the global matrix `B` indicating the first row and the first column of the submatrix `sub(B)`, respectively.
- `descb`: (global and local) Array of size `dlen_`. The array descriptor for the distributed matrix `B`.
- `work`: (local) Workspace array of size `lwork`.
- `lwork`: (local or global) The size of the array `work`. `lwork` is local input and must satisfy `lwork ≥ ltau + max(lwf, lws)`, where:

  If `m ≥ n`:
  - `ltau = numroc(ja + min(m, n) - 1, nb_a, MYCOL, csrc_a, NPCOL)`
  - `lwf = nb_a*(mpa0 + nqa0 + nb_a)`
  - `lws = max((nb_a*(nb_a-1))/2, (nrhsqb0 + mpb0)*nb_a) + nb_a*nb_a`

  Otherwise:
  - `ltau = numroc(ia + min(m, n) - 1, mb_a, MYROW, rsrc_a, NPROW)`
  - `lwf = mb_a*(mpa0 + nqa0 + mb_a)`
  - `lws = max((mb_a*(mb_a-1))/2, (npb0 + max(nqa0 + numroc(numroc(n + iroffb, mb_a, 0, 0, NPROW), mb_a, 0, 0, lcmp), nrhsqb0))*mb_a) + mb_a*mb_a`

  Here `lcmp = lcm/NPROW` with `lcm = ilcm(NPROW, NPCOL)`, and:
  - `iroffa = mod(ia-1, mb_a)`
  - `icoffa = mod(ja-1, nb_a)`
  - `iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW)`
  - `iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL)`
  - `mpa0 = numroc(m + iroffa, mb_a, MYROW, iarow, NPROW)`
  - `nqa0 = numroc(n + icoffa, nb_a, MYCOL, iacol, NPCOL)`
  - `iroffb = mod(ib-1, mb_b)`
  - `icoffb = mod(jb-1, nb_b)`
  - `ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW)`
  - `ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL)`
  - `mpb0 = numroc(m + iroffb, mb_b, MYROW, ibrow, NPROW)`
  - `nqb0 = numroc(n + icoffb, nb_b, MYCOL, ibcol, NPCOL)`

  `mod(x, y)` is the integer remainder of `x/y`.

  `ilcm`, `indxg2p`, and `numroc` are ScaLAPACK tool functions; `MYROW`, `MYCOL`, `NPROW`, and `NPCOL` can be determined by calling the function `blacs_gridinfo`.

  If `lwork = -1`, then `lwork` is global input and a workspace query is assumed; the function only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by `pxerbla`.
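
To make the `lwork` bound concrete, the sketch below evaluates it in plain C. It is illustrative only and rests on several assumptions: `numroc_c`, `indxg2p_c`, and `ilcm_c` are local re-implementations of the usual behavior of the ScaLAPACK tool functions (not the library routines themselves, and the unused `iproc` argument of `indxg2p` is dropped), `blk_desc` is a hypothetical struct holding only the descriptor fields the formula needs, and `npb0` and `nrhsqb0`, which the text above uses without defining, are assumed to follow the definitions in the reference ScaLAPACK `PDGELS` comments. Process coordinates are 0-based.

```c
#include <assert.h>

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Local count of n items dealt in blocks of nb to nprocs processes. */
int numroc_c(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    int mydist = (nprocs + iproc - isrcproc) % nprocs;
    int nblocks = n / nb;
    int count = (nblocks / nprocs) * nb;
    int extra = nblocks % nprocs;
    if (mydist < extra) count += nb;
    else if (mydist == extra) count += n % nb;
    return count;
}

/* Process coordinate that owns 1-based global index iglob. */
int indxg2p_c(int iglob, int nb, int isrcproc, int nprocs)
{
    return (isrcproc + (iglob - 1) / nb) % nprocs;
}

/* Least common multiple of two positive integers. */
int ilcm_c(int a, int b)
{
    int x = a, y = b;
    while (y != 0) { int t = x % y; x = y; y = t; } /* x becomes gcd(a, b) */
    return a / x * b;
}

/* Hypothetical subset of an array descriptor: block sizes and source process. */
typedef struct { int mb, nb, rsrc, csrc; } blk_desc;

/* Evaluate ltau + max(lwf, lws) for the calling process. */
int gels_min_lwork(int m, int n, int nrhs, int ia, int ja, blk_desc da,
                   int ib, int jb, blk_desc db,
                   int myrow, int mycol, int nprow, int npcol)
{
    assert(nprow > 0 && npcol > 0);
    assert(da.mb > 0 && da.nb > 0 && db.mb > 0 && db.nb > 0);
    int lcmp = ilcm_c(nprow, npcol) / nprow;
    int iroffa = (ia - 1) % da.mb, icoffa = (ja - 1) % da.nb;
    int iarow = indxg2p_c(ia, da.mb, da.rsrc, nprow);
    int iacol = indxg2p_c(ja, da.nb, da.csrc, npcol);
    int mpa0 = numroc_c(m + iroffa, da.mb, myrow, iarow, nprow);
    int nqa0 = numroc_c(n + icoffa, da.nb, mycol, iacol, npcol);
    int iroffb = (ib - 1) % db.mb, icoffb = (jb - 1) % db.nb;
    int ibrow = indxg2p_c(ib, db.mb, db.rsrc, nprow);
    int ibcol = indxg2p_c(jb, db.nb, db.csrc, npcol);
    int mpb0 = numroc_c(m + iroffb, db.mb, myrow, ibrow, nprow);
    /* Assumed definitions, see the lead-in above. */
    int npb0 = numroc_c(n + iroffb, db.mb, myrow, ibrow, nprow);
    int nrhsqb0 = numroc_c(nrhs + icoffb, db.nb, mycol, ibcol, npcol);
    int ltau, lwf, lws;
    if (m >= n) { /* QR branch */
        ltau = numroc_c(ja + imin(m, n) - 1, da.nb, mycol, da.csrc, npcol);
        lwf = da.nb * (mpa0 + nqa0 + da.nb);
        lws = imax((da.nb * (da.nb - 1)) / 2, (nrhsqb0 + mpb0) * da.nb)
              + da.nb * da.nb;
    } else {      /* LQ branch */
        int inner = numroc_c(numroc_c(n + iroffb, da.mb, 0, 0, nprow),
                             da.mb, 0, 0, lcmp);
        ltau = numroc_c(ia + imin(m, n) - 1, da.mb, myrow, da.rsrc, nprow);
        lwf = da.mb * (mpa0 + nqa0 + da.mb);
        lws = imax((da.mb * (da.mb - 1)) / 2,
                   (npb0 + imax(nqa0 + inner, nrhsqb0)) * da.mb)
              + da.mb * da.mb;
    }
    return ltau + imax(lwf, lws);
}
```

In practice the `lwork = -1` workspace query is likely the simpler route, since it lets the library report the size; a hand evaluation like this is mainly useful for checking or for sizing buffers before the process grid is active.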

## Output Parameters

- `a`: On exit, if `m ≥ n`, `sub(A)` is overwritten by the details of its QR factorization as returned by `p?geqrf`; if `m < n`, `sub(A)` is overwritten by details of its LQ factorization as returned by `p?gelqf`.
- `b`: On exit, `sub(B)` is overwritten by the solution vectors, stored columnwise:
  - If `trans = 'N'` and `m ≥ n`, rows 1 to `n` of `sub(B)` contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of elements `n+1` to `m` in that column.
  - If `trans = 'N'` and `m < n`, rows 1 to `n` of `sub(B)` contain the minimum norm solution vectors.
  - If `trans = 'T'` and `m ≥ n`, rows 1 to `m` of `sub(B)` contain the minimum norm solution vectors.
  - If `trans = 'T'` and `m < n`, rows 1 to `m` of `sub(B)` contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of elements `m+1` to `n` in that column.
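
The residual rule for the least squares cases can be sketched as follows. This is a simplified illustration, assuming one full column of `sub(B)` has already been gathered into a contiguous local array after the call; in a real distributed run the rows are spread across processes, so partial sums would have to be combined. The function name `gels_residual_ss` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Residual sum of squares for one right-hand side when trans = 'N', m >= n.
 * col: one full column of sub(B) after the call, gathered locally (length m).
 * For trans = 'T' with m < n, call it with the roles of m and n swapped.
 */
double gels_residual_ss(const double *col, int m, int n)
{
    double ss = 0.0;
    assert(col != NULL && n >= 0 && m >= n);
    /* 0-based indices n..m-1 correspond to 1-based elements n+1..m. */
    for (int i = n; i < m; ++i)
        ss += col[i] * col[i];
    return ss;
}
```

When `m == n` the loop body never runs and the residual is zero, which matches a square full-rank system being solved exactly.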
- `work[0]`: On exit, `work[0]` contains the minimum value of `lwork` required for optimum performance.
- `info`: (global)
  - `= 0`: the execution is successful.
  - `< 0`: if the `i`-th argument is an array and the `j`-th entry had an illegal value, then `info = -(i*100 + j)`; if the `i`-th argument is a scalar and had an illegal value, then `info = -i`.
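
The `info` encoding can be unpacked with a few lines of integer arithmetic. The sketch below is illustrative; `gels_info` and `gels_decode_info` are hypothetical names, and it assumes that scalar errors always have magnitude below 100, which holds as long as the routine has fewer than 100 arguments.

```c
#include <assert.h>

/* Hypothetical decoded form of info; not part of oneMKL. */
typedef struct {
    int arg;   /* 1-based position of the offending argument, 0 on success */
    int entry; /* 1-based entry within an array argument, 0 for scalars */
} gels_info;

/* Decode info according to the rule info = -(i*100 + j) or info = -i. */
gels_info gels_decode_info(int info)
{
    gels_info d = {0, 0};
    assert(info <= 0); /* only the documented values are handled */
    if (info == 0)
        return d;
    int code = -info;
    if (code >= 100) {        /* array argument: i*100 + j */
        d.arg = code / 100;
        d.entry = code % 100;
    } else {                  /* scalar argument: i */
        d.arg = code;
    }
    return d;
}
```

Counting arguments in the order listed above, `desca` is the eighth, so `info = -805` would decode to argument 8, entry 5, while `info = -3` would point at the scalar `n`.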
