![]() |
LAPACK
3.4.1
LAPACK: Linear Algebra PACKage
|
00001 *> \brief \b ZLAQR5 00002 * 00003 * =========== DOCUMENTATION =========== 00004 * 00005 * Online html documentation available at 00006 * http://www.netlib.org/lapack/explore-html/ 00007 * 00008 *> \htmlonly 00009 *> Download ZLAQR5 + dependencies 00010 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zlaqr5.f"> 00011 *> [TGZ]</a> 00012 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zlaqr5.f"> 00013 *> [ZIP]</a> 00014 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zlaqr5.f"> 00015 *> [TXT]</a> 00016 *> \endhtmlonly 00017 * 00018 * Definition: 00019 * =========== 00020 * 00021 * SUBROUTINE ZLAQR5( WANTT, WANTZ, KACC22, N, KTOP, KBOT, NSHFTS, S, 00022 * H, LDH, ILOZ, IHIZ, Z, LDZ, V, LDV, U, LDU, NV, 00023 * WV, LDWV, NH, WH, LDWH ) 00024 * 00025 * .. Scalar Arguments .. 00026 * INTEGER IHIZ, ILOZ, KACC22, KBOT, KTOP, LDH, LDU, LDV, 00027 * $ LDWH, LDWV, LDZ, N, NH, NSHFTS, NV 00028 * LOGICAL WANTT, WANTZ 00029 * .. 00030 * .. Array Arguments .. 00031 * COMPLEX*16 H( LDH, * ), S( * ), U( LDU, * ), V( LDV, * ), 00032 * $ WH( LDWH, * ), WV( LDWV, * ), Z( LDZ, * ) 00033 * .. 00034 * 00035 * 00036 *> \par Purpose: 00037 * ============= 00038 *> 00039 *> \verbatim 00040 *> 00041 *> ZLAQR5, called by ZLAQR0, performs a 00042 *> single small-bulge multi-shift QR sweep. 00043 *> \endverbatim 00044 * 00045 * Arguments: 00046 * ========== 00047 * 00048 *> \param[in] WANTT 00049 *> \verbatim 00050 *> WANTT is logical scalar 00051 *> WANTT = .true. if the triangular Schur factor 00052 *> is being computed. WANTT is set to .false. otherwise. 00053 *> \endverbatim 00054 *> 00055 *> \param[in] WANTZ 00056 *> \verbatim 00057 *> WANTZ is logical scalar 00058 *> WANTZ = .true. if the unitary Schur factor is being 00059 *> computed. WANTZ is set to .false. otherwise. 00060 *> \endverbatim 00061 *> 00062 *> \param[in] KACC22 00063 *> \verbatim 00064 *> KACC22 is integer with value 0, 1, or 2. 00065 *> Specifies the computation mode of far-from-diagonal 00066 *> orthogonal updates. 00067 *> = 0: ZLAQR5 does not accumulate reflections and does not 00068 *> use matrix-matrix multiply to update far-from-diagonal 00069 *> matrix entries. 00070 *> = 1: ZLAQR5 accumulates reflections and uses matrix-matrix 00071 *> multiply to update the far-from-diagonal matrix entries. 00072 *> = 2: ZLAQR5 accumulates reflections, uses matrix-matrix 00073 *> multiply to update the far-from-diagonal matrix entries, 00074 *> and takes advantage of 2-by-2 block structure during 00075 *> matrix multiplies. 00076 *> \endverbatim 00077 *> 00078 *> \param[in] N 00079 *> \verbatim 00080 *> N is integer scalar 00081 *> N is the order of the Hessenberg matrix H upon which this 00082 *> subroutine operates. 00083 *> \endverbatim 00084 *> 00085 *> \param[in] KTOP 00086 *> \verbatim 00087 *> KTOP is integer scalar 00088 *> \endverbatim 00089 *> 00090 *> \param[in] KBOT 00091 *> \verbatim 00092 *> KBOT is integer scalar 00093 *> These are the first and last rows and columns of an 00094 *> isolated diagonal block upon which the QR sweep is to be 00095 *> applied. It is assumed without a check that 00096 *> either KTOP = 1 or H(KTOP,KTOP-1) = 0 00097 *> and 00098 *> either KBOT = N or H(KBOT+1,KBOT) = 0. 00099 *> \endverbatim 00100 *> 00101 *> \param[in] NSHFTS 00102 *> \verbatim 00103 *> NSHFTS is integer scalar 00104 *> NSHFTS gives the number of simultaneous shifts. NSHFTS 00105 *> must be positive and even. 00106 *> \endverbatim 00107 *> 00108 *> \param[in,out] S 00109 *> \verbatim 00110 *> S is COMPLEX*16 array of size (NSHFTS) 00111 *> S contains the shifts of origin that define the multi- 00112 *> shift QR sweep. On output S may be reordered. 00113 *> \endverbatim 00114 *> 00115 *> \param[in,out] H 00116 *> \verbatim 00117 *> H is COMPLEX*16 array of size (LDH,N) 00118 *> On input H contains a Hessenberg matrix. On output a 00119 *> multi-shift QR sweep with shifts SR(J)+i*SI(J) is applied 00120 *> to the isolated diagonal block in rows and columns KTOP 00121 *> through KBOT. 00122 *> \endverbatim 00123 *> 00124 *> \param[in] LDH 00125 *> \verbatim 00126 *> LDH is integer scalar 00127 *> LDH is the leading dimension of H just as declared in the 00128 *> calling procedure. LDH.GE.MAX(1,N). 00129 *> \endverbatim 00130 *> 00131 *> \param[in] ILOZ 00132 *> \verbatim 00133 *> ILOZ is INTEGER 00134 *> \endverbatim 00135 *> 00136 *> \param[in] IHIZ 00137 *> \verbatim 00138 *> IHIZ is INTEGER 00139 *> Specify the rows of Z to which transformations must be 00140 *> applied if WANTZ is .TRUE.. 1 .LE. ILOZ .LE. IHIZ .LE. N 00141 *> \endverbatim 00142 *> 00143 *> \param[in,out] Z 00144 *> \verbatim 00145 *> Z is COMPLEX*16 array of size (LDZ,IHI) 00146 *> If WANTZ = .TRUE., then the QR Sweep unitary 00147 *> similarity transformation is accumulated into 00148 *> Z(ILOZ:IHIZ,ILO:IHI) from the right. 00149 *> If WANTZ = .FALSE., then Z is unreferenced. 00150 *> \endverbatim 00151 *> 00152 *> \param[in] LDZ 00153 *> \verbatim 00154 *> LDZ is integer scalar 00155 *> LDA is the leading dimension of Z just as declared in 00156 *> the calling procedure. LDZ.GE.N. 00157 *> \endverbatim 00158 *> 00159 *> \param[out] V 00160 *> \verbatim 00161 *> V is COMPLEX*16 array of size (LDV,NSHFTS/2) 00162 *> \endverbatim 00163 *> 00164 *> \param[in] LDV 00165 *> \verbatim 00166 *> LDV is integer scalar 00167 *> LDV is the leading dimension of V as declared in the 00168 *> calling procedure. LDV.GE.3. 00169 *> \endverbatim 00170 *> 00171 *> \param[out] U 00172 *> \verbatim 00173 *> U is COMPLEX*16 array of size 00174 *> (LDU,3*NSHFTS-3) 00175 *> \endverbatim 00176 *> 00177 *> \param[in] LDU 00178 *> \verbatim 00179 *> LDU is integer scalar 00180 *> LDU is the leading dimension of U just as declared in the 00181 *> in the calling subroutine. LDU.GE.3*NSHFTS-3. 00182 *> \endverbatim 00183 *> 00184 *> \param[in] NH 00185 *> \verbatim 00186 *> NH is integer scalar 00187 *> NH is the number of columns in array WH available for 00188 *> workspace. NH.GE.1. 00189 *> \endverbatim 00190 *> 00191 *> \param[out] WH 00192 *> \verbatim 00193 *> WH is COMPLEX*16 array of size (LDWH,NH) 00194 *> \endverbatim 00195 *> 00196 *> \param[in] LDWH 00197 *> \verbatim 00198 *> LDWH is integer scalar 00199 *> Leading dimension of WH just as declared in the 00200 *> calling procedure. LDWH.GE.3*NSHFTS-3. 00201 *> \endverbatim 00202 *> 00203 *> \param[in] NV 00204 *> \verbatim 00205 *> NV is integer scalar 00206 *> NV is the number of rows in WV agailable for workspace. 00207 *> NV.GE.1. 00208 *> \endverbatim 00209 *> 00210 *> \param[out] WV 00211 *> \verbatim 00212 *> WV is COMPLEX*16 array of size 00213 *> (LDWV,3*NSHFTS-3) 00214 *> \endverbatim 00215 *> 00216 *> \param[in] LDWV 00217 *> \verbatim 00218 *> LDWV is integer scalar 00219 *> LDWV is the leading dimension of WV as declared in the 00220 *> in the calling subroutine. LDWV.GE.NV. 00221 *> \endverbatim 00222 * 00223 * Authors: 00224 * ======== 00225 * 00226 *> \author Univ. of Tennessee 00227 *> \author Univ. of California Berkeley 00228 *> \author Univ. of Colorado Denver 00229 *> \author NAG Ltd. 00230 * 00231 *> \date November 2011 00232 * 00233 *> \ingroup complex16OTHERauxiliary 00234 * 00235 *> \par Contributors: 00236 * ================== 00237 *> 00238 *> Karen Braman and Ralph Byers, Department of Mathematics, 00239 *> University of Kansas, USA 00240 * 00241 *> \par References: 00242 * ================ 00243 *> 00244 *> K. Braman, R. Byers and R. Mathias, The Multi-Shift QR 00245 *> Algorithm Part I: Maintaining Well Focused Shifts, and Level 3 00246 *> Performance, SIAM Journal of Matrix Analysis, volume 23, pages 00247 *> 929--947, 2002. 00248 *> 00249 * ===================================================================== 00250 SUBROUTINE ZLAQR5( WANTT, WANTZ, KACC22, N, KTOP, KBOT, NSHFTS, S, 00251 $ H, LDH, ILOZ, IHIZ, Z, LDZ, V, LDV, U, LDU, NV, 00252 $ WV, LDWV, NH, WH, LDWH ) 00253 * 00254 * -- LAPACK auxiliary routine (version 3.4.0) -- 00255 * -- LAPACK is a software package provided by Univ. of Tennessee, -- 00256 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..-- 00257 * November 2011 00258 * 00259 * .. Scalar Arguments .. 00260 INTEGER IHIZ, ILOZ, KACC22, KBOT, KTOP, LDH, LDU, LDV, 00261 $ LDWH, LDWV, LDZ, N, NH, NSHFTS, NV 00262 LOGICAL WANTT, WANTZ 00263 * .. 00264 * .. Array Arguments .. 00265 COMPLEX*16 H( LDH, * ), S( * ), U( LDU, * ), V( LDV, * ), 00266 $ WH( LDWH, * ), WV( LDWV, * ), Z( LDZ, * ) 00267 * .. 00268 * 00269 * ================================================================ 00270 * .. Parameters .. 00271 COMPLEX*16 ZERO, ONE 00272 PARAMETER ( ZERO = ( 0.0d0, 0.0d0 ), 00273 $ ONE = ( 1.0d0, 0.0d0 ) ) 00274 DOUBLE PRECISION RZERO, RONE 00275 PARAMETER ( RZERO = 0.0d0, RONE = 1.0d0 ) 00276 * .. 00277 * .. Local Scalars .. 00278 COMPLEX*16 ALPHA, BETA, CDUM, REFSUM 00279 DOUBLE PRECISION H11, H12, H21, H22, SAFMAX, SAFMIN, SCL, 00280 $ SMLNUM, TST1, TST2, ULP 00281 INTEGER I2, I4, INCOL, J, J2, J4, JBOT, JCOL, JLEN, 00282 $ JROW, JTOP, K, K1, KDU, KMS, KNZ, KRCOL, KZS, 00283 $ M, M22, MBOT, MEND, MSTART, MTOP, NBMPS, NDCOL, 00284 $ NS, NU 00285 LOGICAL ACCUM, BLK22, BMP22 00286 * .. 00287 * .. External Functions .. 00288 DOUBLE PRECISION DLAMCH 00289 EXTERNAL DLAMCH 00290 * .. 00291 * .. Intrinsic Functions .. 00292 * 00293 INTRINSIC ABS, DBLE, DCONJG, DIMAG, MAX, MIN, MOD 00294 * .. 00295 * .. Local Arrays .. 00296 COMPLEX*16 VT( 3 ) 00297 * .. 00298 * .. External Subroutines .. 00299 EXTERNAL DLABAD, ZGEMM, ZLACPY, ZLAQR1, ZLARFG, ZLASET, 00300 $ ZTRMM 00301 * .. 00302 * .. Statement Functions .. 00303 DOUBLE PRECISION CABS1 00304 * .. 00305 * .. Statement Function definitions .. 00306 CABS1( CDUM ) = ABS( DBLE( CDUM ) ) + ABS( DIMAG( CDUM ) ) 00307 * .. 00308 * .. Executable Statements .. 00309 * 00310 * ==== If there are no shifts, then there is nothing to do. ==== 00311 * 00312 IF( NSHFTS.LT.2 ) 00313 $ RETURN 00314 * 00315 * ==== If the active block is empty or 1-by-1, then there 00316 * . is nothing to do. ==== 00317 * 00318 IF( KTOP.GE.KBOT ) 00319 $ RETURN 00320 * 00321 * ==== NSHFTS is supposed to be even, but if it is odd, 00322 * . then simply reduce it by one. ==== 00323 * 00324 NS = NSHFTS - MOD( NSHFTS, 2 ) 00325 * 00326 * ==== Machine constants for deflation ==== 00327 * 00328 SAFMIN = DLAMCH( 'SAFE MINIMUM' ) 00329 SAFMAX = RONE / SAFMIN 00330 CALL DLABAD( SAFMIN, SAFMAX ) 00331 ULP = DLAMCH( 'PRECISION' ) 00332 SMLNUM = SAFMIN*( DBLE( N ) / ULP ) 00333 * 00334 * ==== Use accumulated reflections to update far-from-diagonal 00335 * . entries ? ==== 00336 * 00337 ACCUM = ( KACC22.EQ.1 ) .OR. ( KACC22.EQ.2 ) 00338 * 00339 * ==== If so, exploit the 2-by-2 block structure? ==== 00340 * 00341 BLK22 = ( NS.GT.2 ) .AND. ( KACC22.EQ.2 ) 00342 * 00343 * ==== clear trash ==== 00344 * 00345 IF( KTOP+2.LE.KBOT ) 00346 $ H( KTOP+2, KTOP ) = ZERO 00347 * 00348 * ==== NBMPS = number of 2-shift bulges in the chain ==== 00349 * 00350 NBMPS = NS / 2 00351 * 00352 * ==== KDU = width of slab ==== 00353 * 00354 KDU = 6*NBMPS - 3 00355 * 00356 * ==== Create and chase chains of NBMPS bulges ==== 00357 * 00358 DO 210 INCOL = 3*( 1-NBMPS ) + KTOP - 1, KBOT - 2, 3*NBMPS - 2 00359 NDCOL = INCOL + KDU 00360 IF( ACCUM ) 00361 $ CALL ZLASET( 'ALL', KDU, KDU, ZERO, ONE, U, LDU ) 00362 * 00363 * ==== Near-the-diagonal bulge chase. The following loop 00364 * . performs the near-the-diagonal part of a small bulge 00365 * . multi-shift QR sweep. Each 6*NBMPS-2 column diagonal 00366 * . chunk extends from column INCOL to column NDCOL 00367 * . (including both column INCOL and column NDCOL). The 00368 * . following loop chases a 3*NBMPS column long chain of 00369 * . NBMPS bulges 3*NBMPS-2 columns to the right. (INCOL 00370 * . may be less than KTOP and and NDCOL may be greater than 00371 * . KBOT indicating phantom columns from which to chase 00372 * . bulges before they are actually introduced or to which 00373 * . to chase bulges beyond column KBOT.) ==== 00374 * 00375 DO 140 KRCOL = INCOL, MIN( INCOL+3*NBMPS-3, KBOT-2 ) 00376 * 00377 * ==== Bulges number MTOP to MBOT are active double implicit 00378 * . shift bulges. There may or may not also be small 00379 * . 2-by-2 bulge, if there is room. The inactive bulges 00380 * . (if any) must wait until the active bulges have moved 00381 * . down the diagonal to make room. The phantom matrix 00382 * . paradigm described above helps keep track. ==== 00383 * 00384 MTOP = MAX( 1, ( ( KTOP-1 )-KRCOL+2 ) / 3+1 ) 00385 MBOT = MIN( NBMPS, ( KBOT-KRCOL ) / 3 ) 00386 M22 = MBOT + 1 00387 BMP22 = ( MBOT.LT.NBMPS ) .AND. ( KRCOL+3*( M22-1 ) ).EQ. 00388 $ ( KBOT-2 ) 00389 * 00390 * ==== Generate reflections to chase the chain right 00391 * . one column. (The minimum value of K is KTOP-1.) ==== 00392 * 00393 DO 10 M = MTOP, MBOT 00394 K = KRCOL + 3*( M-1 ) 00395 IF( K.EQ.KTOP-1 ) THEN 00396 CALL ZLAQR1( 3, H( KTOP, KTOP ), LDH, S( 2*M-1 ), 00397 $ S( 2*M ), V( 1, M ) ) 00398 ALPHA = V( 1, M ) 00399 CALL ZLARFG( 3, ALPHA, V( 2, M ), 1, V( 1, M ) ) 00400 ELSE 00401 BETA = H( K+1, K ) 00402 V( 2, M ) = H( K+2, K ) 00403 V( 3, M ) = H( K+3, K ) 00404 CALL ZLARFG( 3, BETA, V( 2, M ), 1, V( 1, M ) ) 00405 * 00406 * ==== A Bulge may collapse because of vigilant 00407 * . deflation or destructive underflow. In the 00408 * . underflow case, try the two-small-subdiagonals 00409 * . trick to try to reinflate the bulge. ==== 00410 * 00411 IF( H( K+3, K ).NE.ZERO .OR. H( K+3, K+1 ).NE. 00412 $ ZERO .OR. H( K+3, K+2 ).EQ.ZERO ) THEN 00413 * 00414 * ==== Typical case: not collapsed (yet). ==== 00415 * 00416 H( K+1, K ) = BETA 00417 H( K+2, K ) = ZERO 00418 H( K+3, K ) = ZERO 00419 ELSE 00420 * 00421 * ==== Atypical case: collapsed. Attempt to 00422 * . reintroduce ignoring H(K+1,K) and H(K+2,K). 00423 * . If the fill resulting from the new 00424 * . reflector is too large, then abandon it. 00425 * . Otherwise, use the new one. ==== 00426 * 00427 CALL ZLAQR1( 3, H( K+1, K+1 ), LDH, S( 2*M-1 ), 00428 $ S( 2*M ), VT ) 00429 ALPHA = VT( 1 ) 00430 CALL ZLARFG( 3, ALPHA, VT( 2 ), 1, VT( 1 ) ) 00431 REFSUM = DCONJG( VT( 1 ) )* 00432 $ ( H( K+1, K )+DCONJG( VT( 2 ) )* 00433 $ H( K+2, K ) ) 00434 * 00435 IF( CABS1( H( K+2, K )-REFSUM*VT( 2 ) )+ 00436 $ CABS1( REFSUM*VT( 3 ) ).GT.ULP* 00437 $ ( CABS1( H( K, K ) )+CABS1( H( K+1, 00438 $ K+1 ) )+CABS1( H( K+2, K+2 ) ) ) ) THEN 00439 * 00440 * ==== Starting a new bulge here would 00441 * . create non-negligible fill. Use 00442 * . the old one with trepidation. ==== 00443 * 00444 H( K+1, K ) = BETA 00445 H( K+2, K ) = ZERO 00446 H( K+3, K ) = ZERO 00447 ELSE 00448 * 00449 * ==== Stating a new bulge here would 00450 * . create only negligible fill. 00451 * . Replace the old reflector with 00452 * . the new one. ==== 00453 * 00454 H( K+1, K ) = H( K+1, K ) - REFSUM 00455 H( K+2, K ) = ZERO 00456 H( K+3, K ) = ZERO 00457 V( 1, M ) = VT( 1 ) 00458 V( 2, M ) = VT( 2 ) 00459 V( 3, M ) = VT( 3 ) 00460 END IF 00461 END IF 00462 END IF 00463 10 CONTINUE 00464 * 00465 * ==== Generate a 2-by-2 reflection, if needed. ==== 00466 * 00467 K = KRCOL + 3*( M22-1 ) 00468 IF( BMP22 ) THEN 00469 IF( K.EQ.KTOP-1 ) THEN 00470 CALL ZLAQR1( 2, H( K+1, K+1 ), LDH, S( 2*M22-1 ), 00471 $ S( 2*M22 ), V( 1, M22 ) ) 00472 BETA = V( 1, M22 ) 00473 CALL ZLARFG( 2, BETA, V( 2, M22 ), 1, V( 1, M22 ) ) 00474 ELSE 00475 BETA = H( K+1, K ) 00476 V( 2, M22 ) = H( K+2, K ) 00477 CALL ZLARFG( 2, BETA, V( 2, M22 ), 1, V( 1, M22 ) ) 00478 H( K+1, K ) = BETA 00479 H( K+2, K ) = ZERO 00480 END IF 00481 END IF 00482 * 00483 * ==== Multiply H by reflections from the left ==== 00484 * 00485 IF( ACCUM ) THEN 00486 JBOT = MIN( NDCOL, KBOT ) 00487 ELSE IF( WANTT ) THEN 00488 JBOT = N 00489 ELSE 00490 JBOT = KBOT 00491 END IF 00492 DO 30 J = MAX( KTOP, KRCOL ), JBOT 00493 MEND = MIN( MBOT, ( J-KRCOL+2 ) / 3 ) 00494 DO 20 M = MTOP, MEND 00495 K = KRCOL + 3*( M-1 ) 00496 REFSUM = DCONJG( V( 1, M ) )* 00497 $ ( H( K+1, J )+DCONJG( V( 2, M ) )* 00498 $ H( K+2, J )+DCONJG( V( 3, M ) )*H( K+3, J ) ) 00499 H( K+1, J ) = H( K+1, J ) - REFSUM 00500 H( K+2, J ) = H( K+2, J ) - REFSUM*V( 2, M ) 00501 H( K+3, J ) = H( K+3, J ) - REFSUM*V( 3, M ) 00502 20 CONTINUE 00503 30 CONTINUE 00504 IF( BMP22 ) THEN 00505 K = KRCOL + 3*( M22-1 ) 00506 DO 40 J = MAX( K+1, KTOP ), JBOT 00507 REFSUM = DCONJG( V( 1, M22 ) )* 00508 $ ( H( K+1, J )+DCONJG( V( 2, M22 ) )* 00509 $ H( K+2, J ) ) 00510 H( K+1, J ) = H( K+1, J ) - REFSUM 00511 H( K+2, J ) = H( K+2, J ) - REFSUM*V( 2, M22 ) 00512 40 CONTINUE 00513 END IF 00514 * 00515 * ==== Multiply H by reflections from the right. 00516 * . Delay filling in the last row until the 00517 * . vigilant deflation check is complete. ==== 00518 * 00519 IF( ACCUM ) THEN 00520 JTOP = MAX( KTOP, INCOL ) 00521 ELSE IF( WANTT ) THEN 00522 JTOP = 1 00523 ELSE 00524 JTOP = KTOP 00525 END IF 00526 DO 80 M = MTOP, MBOT 00527 IF( V( 1, M ).NE.ZERO ) THEN 00528 K = KRCOL + 3*( M-1 ) 00529 DO 50 J = JTOP, MIN( KBOT, K+3 ) 00530 REFSUM = V( 1, M )*( H( J, K+1 )+V( 2, M )* 00531 $ H( J, K+2 )+V( 3, M )*H( J, K+3 ) ) 00532 H( J, K+1 ) = H( J, K+1 ) - REFSUM 00533 H( J, K+2 ) = H( J, K+2 ) - 00534 $ REFSUM*DCONJG( V( 2, M ) ) 00535 H( J, K+3 ) = H( J, K+3 ) - 00536 $ REFSUM*DCONJG( V( 3, M ) ) 00537 50 CONTINUE 00538 * 00539 IF( ACCUM ) THEN 00540 * 00541 * ==== Accumulate U. (If necessary, update Z later 00542 * . with with an efficient matrix-matrix 00543 * . multiply.) ==== 00544 * 00545 KMS = K - INCOL 00546 DO 60 J = MAX( 1, KTOP-INCOL ), KDU 00547 REFSUM = V( 1, M )*( U( J, KMS+1 )+V( 2, M )* 00548 $ U( J, KMS+2 )+V( 3, M )*U( J, KMS+3 ) ) 00549 U( J, KMS+1 ) = U( J, KMS+1 ) - REFSUM 00550 U( J, KMS+2 ) = U( J, KMS+2 ) - 00551 $ REFSUM*DCONJG( V( 2, M ) ) 00552 U( J, KMS+3 ) = U( J, KMS+3 ) - 00553 $ REFSUM*DCONJG( V( 3, M ) ) 00554 60 CONTINUE 00555 ELSE IF( WANTZ ) THEN 00556 * 00557 * ==== U is not accumulated, so update Z 00558 * . now by multiplying by reflections 00559 * . from the right. ==== 00560 * 00561 DO 70 J = ILOZ, IHIZ 00562 REFSUM = V( 1, M )*( Z( J, K+1 )+V( 2, M )* 00563 $ Z( J, K+2 )+V( 3, M )*Z( J, K+3 ) ) 00564 Z( J, K+1 ) = Z( J, K+1 ) - REFSUM 00565 Z( J, K+2 ) = Z( J, K+2 ) - 00566 $ REFSUM*DCONJG( V( 2, M ) ) 00567 Z( J, K+3 ) = Z( J, K+3 ) - 00568 $ REFSUM*DCONJG( V( 3, M ) ) 00569 70 CONTINUE 00570 END IF 00571 END IF 00572 80 CONTINUE 00573 * 00574 * ==== Special case: 2-by-2 reflection (if needed) ==== 00575 * 00576 K = KRCOL + 3*( M22-1 ) 00577 IF( BMP22 ) THEN 00578 IF ( V( 1, M22 ).NE.ZERO ) THEN 00579 DO 90 J = JTOP, MIN( KBOT, K+3 ) 00580 REFSUM = V( 1, M22 )*( H( J, K+1 )+V( 2, M22 )* 00581 $ H( J, K+2 ) ) 00582 H( J, K+1 ) = H( J, K+1 ) - REFSUM 00583 H( J, K+2 ) = H( J, K+2 ) - 00584 $ REFSUM*DCONJG( V( 2, M22 ) ) 00585 90 CONTINUE 00586 * 00587 IF( ACCUM ) THEN 00588 KMS = K - INCOL 00589 DO 100 J = MAX( 1, KTOP-INCOL ), KDU 00590 REFSUM = V( 1, M22 )*( U( J, KMS+1 )+ 00591 $ V( 2, M22 )*U( J, KMS+2 ) ) 00592 U( J, KMS+1 ) = U( J, KMS+1 ) - REFSUM 00593 U( J, KMS+2 ) = U( J, KMS+2 ) - 00594 $ REFSUM*DCONJG( V( 2, M22 ) ) 00595 100 CONTINUE 00596 ELSE IF( WANTZ ) THEN 00597 DO 110 J = ILOZ, IHIZ 00598 REFSUM = V( 1, M22 )*( Z( J, K+1 )+V( 2, M22 )* 00599 $ Z( J, K+2 ) ) 00600 Z( J, K+1 ) = Z( J, K+1 ) - REFSUM 00601 Z( J, K+2 ) = Z( J, K+2 ) - 00602 $ REFSUM*DCONJG( V( 2, M22 ) ) 00603 110 CONTINUE 00604 END IF 00605 END IF 00606 END IF 00607 * 00608 * ==== Vigilant deflation check ==== 00609 * 00610 MSTART = MTOP 00611 IF( KRCOL+3*( MSTART-1 ).LT.KTOP ) 00612 $ MSTART = MSTART + 1 00613 MEND = MBOT 00614 IF( BMP22 ) 00615 $ MEND = MEND + 1 00616 IF( KRCOL.EQ.KBOT-2 ) 00617 $ MEND = MEND + 1 00618 DO 120 M = MSTART, MEND 00619 K = MIN( KBOT-1, KRCOL+3*( M-1 ) ) 00620 * 00621 * ==== The following convergence test requires that 00622 * . the tradition small-compared-to-nearby-diagonals 00623 * . criterion and the Ahues & Tisseur (LAWN 122, 1997) 00624 * . criteria both be satisfied. The latter improves 00625 * . accuracy in some examples. Falling back on an 00626 * . alternate convergence criterion when TST1 or TST2 00627 * . is zero (as done here) is traditional but probably 00628 * . unnecessary. ==== 00629 * 00630 IF( H( K+1, K ).NE.ZERO ) THEN 00631 TST1 = CABS1( H( K, K ) ) + CABS1( H( K+1, K+1 ) ) 00632 IF( TST1.EQ.RZERO ) THEN 00633 IF( K.GE.KTOP+1 ) 00634 $ TST1 = TST1 + CABS1( H( K, K-1 ) ) 00635 IF( K.GE.KTOP+2 ) 00636 $ TST1 = TST1 + CABS1( H( K, K-2 ) ) 00637 IF( K.GE.KTOP+3 ) 00638 $ TST1 = TST1 + CABS1( H( K, K-3 ) ) 00639 IF( K.LE.KBOT-2 ) 00640 $ TST1 = TST1 + CABS1( H( K+2, K+1 ) ) 00641 IF( K.LE.KBOT-3 ) 00642 $ TST1 = TST1 + CABS1( H( K+3, K+1 ) ) 00643 IF( K.LE.KBOT-4 ) 00644 $ TST1 = TST1 + CABS1( H( K+4, K+1 ) ) 00645 END IF 00646 IF( CABS1( H( K+1, K ) ).LE.MAX( SMLNUM, ULP*TST1 ) ) 00647 $ THEN 00648 H12 = MAX( CABS1( H( K+1, K ) ), 00649 $ CABS1( H( K, K+1 ) ) ) 00650 H21 = MIN( CABS1( H( K+1, K ) ), 00651 $ CABS1( H( K, K+1 ) ) ) 00652 H11 = MAX( CABS1( H( K+1, K+1 ) ), 00653 $ CABS1( H( K, K )-H( K+1, K+1 ) ) ) 00654 H22 = MIN( CABS1( H( K+1, K+1 ) ), 00655 $ CABS1( H( K, K )-H( K+1, K+1 ) ) ) 00656 SCL = H11 + H12 00657 TST2 = H22*( H11 / SCL ) 00658 * 00659 IF( TST2.EQ.RZERO .OR. H21*( H12 / SCL ).LE. 00660 $ MAX( SMLNUM, ULP*TST2 ) )H( K+1, K ) = ZERO 00661 END IF 00662 END IF 00663 120 CONTINUE 00664 * 00665 * ==== Fill in the last row of each bulge. ==== 00666 * 00667 MEND = MIN( NBMPS, ( KBOT-KRCOL-1 ) / 3 ) 00668 DO 130 M = MTOP, MEND 00669 K = KRCOL + 3*( M-1 ) 00670 REFSUM = V( 1, M )*V( 3, M )*H( K+4, K+3 ) 00671 H( K+4, K+1 ) = -REFSUM 00672 H( K+4, K+2 ) = -REFSUM*DCONJG( V( 2, M ) ) 00673 H( K+4, K+3 ) = H( K+4, K+3 ) - 00674 $ REFSUM*DCONJG( V( 3, M ) ) 00675 130 CONTINUE 00676 * 00677 * ==== End of near-the-diagonal bulge chase. ==== 00678 * 00679 140 CONTINUE 00680 * 00681 * ==== Use U (if accumulated) to update far-from-diagonal 00682 * . entries in H. If required, use U to update Z as 00683 * . well. ==== 00684 * 00685 IF( ACCUM ) THEN 00686 IF( WANTT ) THEN 00687 JTOP = 1 00688 JBOT = N 00689 ELSE 00690 JTOP = KTOP 00691 JBOT = KBOT 00692 END IF 00693 IF( ( .NOT.BLK22 ) .OR. ( INCOL.LT.KTOP ) .OR. 00694 $ ( NDCOL.GT.KBOT ) .OR. ( NS.LE.2 ) ) THEN 00695 * 00696 * ==== Updates not exploiting the 2-by-2 block 00697 * . structure of U. K1 and NU keep track of 00698 * . the location and size of U in the special 00699 * . cases of introducing bulges and chasing 00700 * . bulges off the bottom. In these special 00701 * . cases and in case the number of shifts 00702 * . is NS = 2, there is no 2-by-2 block 00703 * . structure to exploit. ==== 00704 * 00705 K1 = MAX( 1, KTOP-INCOL ) 00706 NU = ( KDU-MAX( 0, NDCOL-KBOT ) ) - K1 + 1 00707 * 00708 * ==== Horizontal Multiply ==== 00709 * 00710 DO 150 JCOL = MIN( NDCOL, KBOT ) + 1, JBOT, NH 00711 JLEN = MIN( NH, JBOT-JCOL+1 ) 00712 CALL ZGEMM( 'C', 'N', NU, JLEN, NU, ONE, U( K1, K1 ), 00713 $ LDU, H( INCOL+K1, JCOL ), LDH, ZERO, WH, 00714 $ LDWH ) 00715 CALL ZLACPY( 'ALL', NU, JLEN, WH, LDWH, 00716 $ H( INCOL+K1, JCOL ), LDH ) 00717 150 CONTINUE 00718 * 00719 * ==== Vertical multiply ==== 00720 * 00721 DO 160 JROW = JTOP, MAX( KTOP, INCOL ) - 1, NV 00722 JLEN = MIN( NV, MAX( KTOP, INCOL )-JROW ) 00723 CALL ZGEMM( 'N', 'N', JLEN, NU, NU, ONE, 00724 $ H( JROW, INCOL+K1 ), LDH, U( K1, K1 ), 00725 $ LDU, ZERO, WV, LDWV ) 00726 CALL ZLACPY( 'ALL', JLEN, NU, WV, LDWV, 00727 $ H( JROW, INCOL+K1 ), LDH ) 00728 160 CONTINUE 00729 * 00730 * ==== Z multiply (also vertical) ==== 00731 * 00732 IF( WANTZ ) THEN 00733 DO 170 JROW = ILOZ, IHIZ, NV 00734 JLEN = MIN( NV, IHIZ-JROW+1 ) 00735 CALL ZGEMM( 'N', 'N', JLEN, NU, NU, ONE, 00736 $ Z( JROW, INCOL+K1 ), LDZ, U( K1, K1 ), 00737 $ LDU, ZERO, WV, LDWV ) 00738 CALL ZLACPY( 'ALL', JLEN, NU, WV, LDWV, 00739 $ Z( JROW, INCOL+K1 ), LDZ ) 00740 170 CONTINUE 00741 END IF 00742 ELSE 00743 * 00744 * ==== Updates exploiting U's 2-by-2 block structure. 00745 * . (I2, I4, J2, J4 are the last rows and columns 00746 * . of the blocks.) ==== 00747 * 00748 I2 = ( KDU+1 ) / 2 00749 I4 = KDU 00750 J2 = I4 - I2 00751 J4 = KDU 00752 * 00753 * ==== KZS and KNZ deal with the band of zeros 00754 * . along the diagonal of one of the triangular 00755 * . blocks. ==== 00756 * 00757 KZS = ( J4-J2 ) - ( NS+1 ) 00758 KNZ = NS + 1 00759 * 00760 * ==== Horizontal multiply ==== 00761 * 00762 DO 180 JCOL = MIN( NDCOL, KBOT ) + 1, JBOT, NH 00763 JLEN = MIN( NH, JBOT-JCOL+1 ) 00764 * 00765 * ==== Copy bottom of H to top+KZS of scratch ==== 00766 * (The first KZS rows get multiplied by zero.) ==== 00767 * 00768 CALL ZLACPY( 'ALL', KNZ, JLEN, H( INCOL+1+J2, JCOL ), 00769 $ LDH, WH( KZS+1, 1 ), LDWH ) 00770 * 00771 * ==== Multiply by U21**H ==== 00772 * 00773 CALL ZLASET( 'ALL', KZS, JLEN, ZERO, ZERO, WH, LDWH ) 00774 CALL ZTRMM( 'L', 'U', 'C', 'N', KNZ, JLEN, ONE, 00775 $ U( J2+1, 1+KZS ), LDU, WH( KZS+1, 1 ), 00776 $ LDWH ) 00777 * 00778 * ==== Multiply top of H by U11**H ==== 00779 * 00780 CALL ZGEMM( 'C', 'N', I2, JLEN, J2, ONE, U, LDU, 00781 $ H( INCOL+1, JCOL ), LDH, ONE, WH, LDWH ) 00782 * 00783 * ==== Copy top of H to bottom of WH ==== 00784 * 00785 CALL ZLACPY( 'ALL', J2, JLEN, H( INCOL+1, JCOL ), LDH, 00786 $ WH( I2+1, 1 ), LDWH ) 00787 * 00788 * ==== Multiply by U21**H ==== 00789 * 00790 CALL ZTRMM( 'L', 'L', 'C', 'N', J2, JLEN, ONE, 00791 $ U( 1, I2+1 ), LDU, WH( I2+1, 1 ), LDWH ) 00792 * 00793 * ==== Multiply by U22 ==== 00794 * 00795 CALL ZGEMM( 'C', 'N', I4-I2, JLEN, J4-J2, ONE, 00796 $ U( J2+1, I2+1 ), LDU, 00797 $ H( INCOL+1+J2, JCOL ), LDH, ONE, 00798 $ WH( I2+1, 1 ), LDWH ) 00799 * 00800 * ==== Copy it back ==== 00801 * 00802 CALL ZLACPY( 'ALL', KDU, JLEN, WH, LDWH, 00803 $ H( INCOL+1, JCOL ), LDH ) 00804 180 CONTINUE 00805 * 00806 * ==== Vertical multiply ==== 00807 * 00808 DO 190 JROW = JTOP, MAX( INCOL, KTOP ) - 1, NV 00809 JLEN = MIN( NV, MAX( INCOL, KTOP )-JROW ) 00810 * 00811 * ==== Copy right of H to scratch (the first KZS 00812 * . columns get multiplied by zero) ==== 00813 * 00814 CALL ZLACPY( 'ALL', JLEN, KNZ, H( JROW, INCOL+1+J2 ), 00815 $ LDH, WV( 1, 1+KZS ), LDWV ) 00816 * 00817 * ==== Multiply by U21 ==== 00818 * 00819 CALL ZLASET( 'ALL', JLEN, KZS, ZERO, ZERO, WV, LDWV ) 00820 CALL ZTRMM( 'R', 'U', 'N', 'N', JLEN, KNZ, ONE, 00821 $ U( J2+1, 1+KZS ), LDU, WV( 1, 1+KZS ), 00822 $ LDWV ) 00823 * 00824 * ==== Multiply by U11 ==== 00825 * 00826 CALL ZGEMM( 'N', 'N', JLEN, I2, J2, ONE, 00827 $ H( JROW, INCOL+1 ), LDH, U, LDU, ONE, WV, 00828 $ LDWV ) 00829 * 00830 * ==== Copy left of H to right of scratch ==== 00831 * 00832 CALL ZLACPY( 'ALL', JLEN, J2, H( JROW, INCOL+1 ), LDH, 00833 $ WV( 1, 1+I2 ), LDWV ) 00834 * 00835 * ==== Multiply by U21 ==== 00836 * 00837 CALL ZTRMM( 'R', 'L', 'N', 'N', JLEN, I4-I2, ONE, 00838 $ U( 1, I2+1 ), LDU, WV( 1, 1+I2 ), LDWV ) 00839 * 00840 * ==== Multiply by U22 ==== 00841 * 00842 CALL ZGEMM( 'N', 'N', JLEN, I4-I2, J4-J2, ONE, 00843 $ H( JROW, INCOL+1+J2 ), LDH, 00844 $ U( J2+1, I2+1 ), LDU, ONE, WV( 1, 1+I2 ), 00845 $ LDWV ) 00846 * 00847 * ==== Copy it back ==== 00848 * 00849 CALL ZLACPY( 'ALL', JLEN, KDU, WV, LDWV, 00850 $ H( JROW, INCOL+1 ), LDH ) 00851 190 CONTINUE 00852 * 00853 * ==== Multiply Z (also vertical) ==== 00854 * 00855 IF( WANTZ ) THEN 00856 DO 200 JROW = ILOZ, IHIZ, NV 00857 JLEN = MIN( NV, IHIZ-JROW+1 ) 00858 * 00859 * ==== Copy right of Z to left of scratch (first 00860 * . KZS columns get multiplied by zero) ==== 00861 * 00862 CALL ZLACPY( 'ALL', JLEN, KNZ, 00863 $ Z( JROW, INCOL+1+J2 ), LDZ, 00864 $ WV( 1, 1+KZS ), LDWV ) 00865 * 00866 * ==== Multiply by U12 ==== 00867 * 00868 CALL ZLASET( 'ALL', JLEN, KZS, ZERO, ZERO, WV, 00869 $ LDWV ) 00870 CALL ZTRMM( 'R', 'U', 'N', 'N', JLEN, KNZ, ONE, 00871 $ U( J2+1, 1+KZS ), LDU, WV( 1, 1+KZS ), 00872 $ LDWV ) 00873 * 00874 * ==== Multiply by U11 ==== 00875 * 00876 CALL ZGEMM( 'N', 'N', JLEN, I2, J2, ONE, 00877 $ Z( JROW, INCOL+1 ), LDZ, U, LDU, ONE, 00878 $ WV, LDWV ) 00879 * 00880 * ==== Copy left of Z to right of scratch ==== 00881 * 00882 CALL ZLACPY( 'ALL', JLEN, J2, Z( JROW, INCOL+1 ), 00883 $ LDZ, WV( 1, 1+I2 ), LDWV ) 00884 * 00885 * ==== Multiply by U21 ==== 00886 * 00887 CALL ZTRMM( 'R', 'L', 'N', 'N', JLEN, I4-I2, ONE, 00888 $ U( 1, I2+1 ), LDU, WV( 1, 1+I2 ), 00889 $ LDWV ) 00890 * 00891 * ==== Multiply by U22 ==== 00892 * 00893 CALL ZGEMM( 'N', 'N', JLEN, I4-I2, J4-J2, ONE, 00894 $ Z( JROW, INCOL+1+J2 ), LDZ, 00895 $ U( J2+1, I2+1 ), LDU, ONE, 00896 $ WV( 1, 1+I2 ), LDWV ) 00897 * 00898 * ==== Copy the result back to Z ==== 00899 * 00900 CALL ZLACPY( 'ALL', JLEN, KDU, WV, LDWV, 00901 $ Z( JROW, INCOL+1 ), LDZ ) 00902 200 CONTINUE 00903 END IF 00904 END IF 00905 END IF 00906 210 CONTINUE 00907 * 00908 * ==== End of ZLAQR5 ==== 00909 * 00910 END