File: troubleshooting.html

<html>
<body BGCOLOR="FFFFFF">

      <h1>Docs:&nbsp;&nbsp; Troubleshooting</h1>



      <p align="left">Doing a search below will usually
lead straight
to the problem.</p>

      <ul>

      </ul>

      <p>We continually update this guide, so please click here
to get
the <a href="http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html">most
recent version of the troubleshooting guide</a>.</p>

      </td>

    </tr>

  </tbody>
</table>

<hr>
<ol start="1" type="1">

  <li><font color="#ff0000"><strong>Symptom: </strong></font><font face="Terminal">Little or no speedup: </font>
    <p><font color="#ff0000"><strong>Problem:&nbsp;</strong><span style="color: rgb(0, 0, 0);">This can be a result of not
using a</span> <a href="faq.html#computers">parallel
system that is suitable for sparse linear solvers.</a> </font></p>

  </li>

  <li><font color="#ff0000"><strong><a name="PetscSplitOwnership"></a>Symptom: </strong></font><font face="Terminal">Error detected&nbsp;in
PetscSplitOwnership() about "sum of local lengths ...": </font>
    <p><font color="#ff0000"><strong>Problem:&nbsp;</strong><span style="color: rgb(0, 0, 0);">In
a previous call to VecSetSizes(), MatSetSizes(), VecCreateXXX() or
MatCreateXXX() you passed in local and global sizes that do not make
sense for the number of processors you are running on. For example, if you pass in
a local size of 2 and a global size of 100 and run on two processors,
this cannot work since the sum of the local sizes is 4, not 100.</span>
    </font></p>
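    <p>As an illustration only, the following C sketch shows the consistency rule behind this error. It does not call PETSc: local_length() and sizes_consistent() are hypothetical helper names, and the even split they use is an assumption about what a reasonable default layout looks like. In PETSc itself, passing PETSC_DECIDE as the local size lets the library choose the local sizes for you.</p>

```c
#include <assert.h>

/* Hypothetical helper (not a PETSc routine): the local length that an
   even split of a global length N gives to process `rank` out of `size`.
   The first (N % size) processes receive one extra entry. */
static int local_length(int N, int size, int rank)
{
    assert(size > 0 && rank >= 0 && rank < size);
    return N / size + ((N % size) > rank ? 1 : 0);
}

/* Returns 1 if the local lengths (one per process) add up to the
   global length N, and 0 otherwise. */
static int sizes_consistent(const int *local, int size, int N)
{
    long sum = 0;
    for (int i = 0; i < size; i++) sum += local[i];
    return sum == (long)N;
}
```

    <p>For the example above, the local sizes {2, 2} fail the check against a global size of 100, while the even split gives 50 entries to each of the two processes.</p>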

  </li>

  <li><a name="Corrupt"></a><font color="#ff0000"><strong>Symptom: </strong></font><font face="Terminal">Corrupt argument: </font>
    <p><font color="#ff0000"><strong>Problem: </strong><span style="color: rgb(0, 0, 0);">
An argument to
a function is invalid. In Fortran this may be caused by forgetting to
list an argument in the call, especially the final ierr.<br>

&nbsp; &nbsp;&nbsp; <br>

Otherwise it
is usually caused by memory corruption; that is, somewhere the code is
writing outside of array bounds. To track this down, rerun the debug version
of the code with the option -malloc_debug. Occasionally the
code may crash only with the optimized version; in that case
run the optimized version with -malloc_debug. If you determine that the
problem
is from memory corruption, you can put the macro CHKMEMQ in the code
near the crash to determine exactly which line is causing the
problem.<br>

&nbsp; <br>
 If -malloc_debug does not help, on GNU/Linux you can try using&nbsp;<a href="http://valgrind.org">http://valgrind.org</a> to look for memory corruption;
on Apple systems, run "man libgmalloc" to see how to detect memory corruption.</span></font></p>
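    <p>The idea behind this kind of checking can be sketched in plain C. The code below is not PETSc's implementation, and none of its names (guarded_malloc, guarded_check, guarded_free, GUARD) come from PETSc. It only assumes the common technique of placing known guard values around each allocation and testing them later, which is roughly the role a check placed near the crash plays.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD 0xDEADBEEFUL  /* arbitrary marker value (assumption) */

/* Bookkeeping stored directly in front of the user's memory. */
typedef struct {
    size_t        n;      /* payload size in bytes */
    unsigned long guard;  /* marker just before the payload */
} Header;

/* Allocate n bytes with a guard value before and after the payload. */
static void *guarded_malloc(size_t n)
{
    Header *h = malloc(sizeof(Header) + n + sizeof(unsigned long));
    unsigned long g = GUARD;
    if (!h) return NULL;
    h->n = n;
    h->guard = g;
    /* The trailing guard may be unaligned, so copy it bytewise. */
    memcpy((unsigned char *)(h + 1) + n, &g, sizeof g);
    return h + 1;
}

/* Returns 1 if both guards are intact, 0 if something wrote
   outside the payload. */
static int guarded_check(const void *p)
{
    const Header *h;
    unsigned long tail;
    assert(p != NULL);
    h = (const Header *)p - 1;
    memcpy(&tail, (const unsigned char *)p + h->n, sizeof tail);
    return h->guard == GUARD && tail == GUARD;
}

/* Release memory obtained from guarded_malloc(). */
static void guarded_free(void *p)
{
    if (p) free((Header *)p - 1);
}
```

    <p>Calling guarded_check() before and after a suspect statement narrows down which line wrote past the end of the array, in the same spirit as moving a check closer and closer to the crash.</p>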
  </li>
  <li><font color="#ff0000"><span style="color: rgb(0, 0, 0);"><span style="font-weight: bold;"></span></span></font><font color="#ff0000"><strong><a name="signal"></a>Symptom</strong></font><font color="#ff0000"><span style="color: rgb(0, 0, 0);">: Caught signal</span></font><font face="Terminal">: </font>
    <p><font color="#ff0000"><strong>Problem:&nbsp;</strong></font><font color="#ff0000"><span style="color: rgb(0, 0, 0);">This is most likely due to memory corruption; see <a href="#Corrupt">Corrupt argument</a>.</span></font><font color="#ff0000"><strong><span style="color: rgb(0, 0, 0);"></span><br>
&nbsp;</strong><span style="color: rgb(0, 0, 0);"></span></font></p>
  </li>
  <li><a name="ZeroPivot"></a><font color="#ff0000"><strong>Symptom: </strong></font><font face="Terminal">Detected zero pivot in LU factorization: </font>
    <p><font color="#ff0000"><strong>Problem: </strong><span style="color: rgb(0, 0, 0);">
A zero pivot
in LU, ILU, Cholesky, or ICC sparse factorization does not always mean
that the matrix is singular. You can use -pc_factor_shift_nonzero,
-pc_factor_shift_positive_definite,
-[level]_pc_factor_shift_nonzero,&nbsp; or </span></font><font style="color: rgb(0, 0, 0);" color="#ff0000">-[level]_pc_factor_shift_positive_definite
    </font><font style="color: rgb(0, 0, 0);" color="#ff0000">to prevent the zero pivot. Here [level] is sub for the
lu, ilu, cholesky, or icc factorization of a block in the bjacobi or
ASM preconditioner, and mg_levels or mg_coarse for the
multigrid
smoothers or the coarse grid solver. See PCFactorSetShiftNonzero(),
PCFactorSetShiftPd().</font></p>

    <font style="color: rgb(0, 0, 0);" color="#ff0000">
    </font>
    <p style="color: rgb(0, 0, 0);">This error can also
happen if your matrix
is <strong>singular</strong>; see KSPSetNullSpace() for
how to
handle this.</p>

    <font style="color: rgb(0, 0, 0);" color="#ff0000">
    </font>
    <p><font color="#ff0000"><span style="color: rgb(0, 0, 0);">If this error occurs in
the </span><strong style="color: rgb(0, 0, 0);">zeroth
row </strong><span style="color: rgb(0, 0, 0);"> of
the matrix it is likely you have an error in
the code
that generates the matrix.</span> </font></p>
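    <p>To see why a zero pivot need not mean a singular matrix, and what a diagonal shift does, consider the following C sketch. It is not PETSc code: lu_nopivot() is a hypothetical routine, and replacing a tiny pivot by a fixed value is just one simple shifting rule chosen for brevity, not necessarily what the shift options do internally.</p>

```c
#include <assert.h>

/* In-place LU factorization WITHOUT pivoting of an n-by-n matrix stored
   by rows. A pivot whose magnitude is at most `tol` is replaced by
   `shift`; pass shift = 0.0 to disable shifting.
   Returns -1 on success, or the row index of the first zero pivot. */
static int lu_nopivot(double *a, int n, double tol, double shift)
{
    assert(n > 0 && tol >= 0.0);
    for (int k = 0; k < n; k++) {
        double p = a[k * n + k];
        if (p < 0.0) p = -p;              /* |pivot| */
        if (p <= tol) {
            if (shift == 0.0) return k;   /* report the zero pivot */
            a[k * n + k] = shift;         /* perturb the diagonal */
        }
        for (int i = k + 1; i < n; i++) {
            a[i * n + k] /= a[k * n + k];             /* multiplier */
            for (int j = k + 1; j < n; j++)
                a[i * n + j] -= a[i * n + k] * a[k * n + j];
        }
    }
    return -1;
}
```

    <p>The matrix with rows (0, 1) and (1, 0) is nonsingular, yet factoring it without pivoting stops at row 0 with a zero pivot. With a small shift the factorization completes; the factors are then those of a slightly perturbed matrix, which is often acceptable when they are only used as a preconditioner.</p>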

  </li>

  <li><font color="#ff0000"><strong>Symptom: </strong></font><font face="Terminal">alder&gt;make<br>

make: Warning: Can't find `../../bmake/': <br>

    </font>
    <p><font face="Terminal"><font color="#ff0000"><strong>Problem:
    </strong></font>You
have not set the variable PETSC_ARCH to the architecture of your
machine (e.g., sun4, rs6000). </font></p>

    <font face="Terminal"> </font>
    <p><font face="Terminal"><font color="#ff0000"><strong>Cure:
    </strong></font>Set PETSC_ARCH automatically in your .cshrc file,
or include PETSC_ARCH on the
command line every time you use make. For instance,
make PETSC_ARCH=sun4 example4 </font></p>

    <font face="Terminal"> </font></li>

  <font face="Terminal"> <li><font face="Terminal"><strong>Symptom:
alder&gt; make</strong>&nbsp;</font>
    <p><font face="Terminal"> makefile:12:
/bmake/common/base: No such file or directory<br>

makefile:13: /bmake/common/test: No such file or directory<br>

make: *** No rule to make target `/bmake/common/test'.&nbsp; Stop.</font>
    </p>

or makefile:12:
home/joe/bmake/common/base: No such file or directory<br>

makefile:13: home/joe/bmake/common/test: No such file or directory<br>

make: *** No rule to make target
`home/joe/bmake/common/test'.&nbsp; Stop.<br>

    <p><font color="#ff0000"><strong>Problem: </strong></font>The
variable&nbsp; PETSC_DIR is not set or does not point to the PETSc
directory; in this case it points to the directory /home/joe. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Make
sure the variable PETSC_DIR in the makefile points to the PETSc
directory. Be aware that at many sites, your home directory may
have different names on different machines so it is usually better to
make the path relative, rather than absolute. That is, use
PETSC_DIR = ../../petsc rather than PETSC_DIR =
/c/cafa/username/petsc. </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom: </strong></font><font face="Terminal">alder&gt; make <br>

ex1 f77 -g -o ex1 ex1.o affine2d.o \ ../../libs/libsg/sun4/domain.a
../../libs/libsg/sun4/Xtools.a ../../libs/libsg/sun4/tools.a
../../libs/libsg/sun4/liblapack.a ../../libs/libsg/sun4/blas.a
../../libs/libsg/sun4/system.a -lX11 -lm<br>

ld: ex1.o: bad magic number Compilation failed *** Error code 4 </font>
    <p><font color="#ff0000"><strong>Problem:</strong></font>
The
file ex1.o was compiled on a different architecture or with a
different compiler. </p>

    <p><font color="#ff0000"><strong>Cure:</strong></font>
Remove
all .o files and recompile from scratch.</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom: </strong></font>Using
MPICH: [0] Truncated message (in CHK_MSGLEN) <br>

[0] Aborting program! <br>

p0_8959: p4_error: (null): 1
    <p><font color="#ff0000"><strong>Problem:</strong></font>
This
is due to a bug in a call to an MPI routine. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Run
the program with the option -start_in_debugger. In the
debugger, type "break p4_error" (or "stop in p4_error" for
dbx); then type "cont". When the program aborts, use debugger commands
such as "where" to track down the problem with the call.</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong></font>
on
HP-UX <br>

Make: Unknown flag argument -. Stop. <br>

Make: Unknown flag argument -. Stop. <br>

Make: Unknown flag argument -. Stop.
    <p>We have seen this on HP-UX when using the native (vendor
provided) make. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Install
and use GNU make. To force PETSc to use an alternative make,
edit the file petsc/bmake/$PETSC_ARCH/base and change OMAKE to
your alternative. </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>on
IBM SP<br>

Could not load program
/afs/rpi.edu/big/00/0000/hongwl/petsc/petsc/src/ksp/examples/ex1<br>

Symbol XSetWMProperties in pmd2 is undefined Symbol XSetWMName in pmd2
is undefined <br>

Error was: Exec format error
    <p><font color="#ff0000"><strong>Problem:</strong></font>
The
libraries on the IBM SP front-end for X may be different than on the
nodes. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Get
your system administrator to make sure the dynamic libraries on
the nodes are IDENTICAL to those on the compiler server. </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>In
Fortran, while using VecGetArray(), MatGetArray(), or ISGetArray(): <br>

/usr/local/mpi/bin/mpirun.ch_p4: 17545 Breakpoint then program stops
    <p><font color="#ff0000"><strong>Problem:</strong></font>
You
have compiled some of your code with the option that checks for
out-of-bounds array accesses (on the IBM rs6000 this is the -C option). </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Recompile
all code, making sure that bounds checking is turned off.
The use of VecGetArray(), etc. requires accessing arrays out of
bounds; this is done safely.</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>You
create Draw windows or ViewerDraw windows or use options
-ksp_xmonitor or -snes_xmonitor and the program seems to run OK
but windows never open.
    <p><font color="#ff0000"><strong>Problem:</strong></font>
The
libraries were compiled without support for X windows. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Make
sure that config/configure.py was run with the option --with-x=1 </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>
    <p><font color="#ff0000"><strong>Problem:</strong>
    </font>PETSc
cannot work on a machine where the length of C integers does not equal
the length of Fortran integers. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Change
your compilers so that you use ones that have the same length
for integers. Or check compiler flags to see if you can change
the default integer lengths to match.</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom: </strong></font>On
DEC Alpha: Unaligned access pid=15199 va=140021674 pc=12001e8d8
ra=12001e8c0 type=ldt
    <p><font color="#ff0000"><strong>Problem: </strong></font>The
system has detected an unaligned variable. This is usually an unaligned
double. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Make
sure in Fortran that you always write double precision numbers
as 10.d0 etc., not just as 10., because otherwise the number is stored as a
single precision number and may not be properly aligned.</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom: </strong></font>PetscScalarAddressToFortran:C
and Fortran arrays are not commonly aligned <br>

or are too far apart to be indexed by an integer. <br>

Locations: C 1920156 Fortran 2438656 <br>

[0] MPI Abort by user Aborting program ! <br>

[0] Aborting program!
    <p><font color="#ff0000"><strong>Problem:</strong></font>
This
occurs when trying to access a PETSc array from Fortran. The array may
have been obtained with VecGetArray(), MatGetArray(), etc. On
the IRIX64 this is because the Fortran addresses are so far
away from the C addresses that you cannot move between them with an
integer offset (integers are just not big enough). On other machines
this is because the distance between the Fortran array starting
point and the C array starting point is not divisible by the
length of a double (or complex). Thus one cannot access the other with
an integer offset. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>1)
Rewrite the Fortran code to not use the particular XXXGetXXX()
routine. For example, use VecSetValues() instead of directly stuffing
the values into the array. 2) Determine how to force the
Fortran and/or C compiler to commonly align doubles or complex
numbers. That is, if all doubles are double aligned then this
won't be a problem; if all complex are quad aligned then it is not a
problem. If you determine how to do this for a particular machine,
please let us know so we can add it to PETSc. </p>
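    <p>The two failure conditions described here can be written down as a small C check. This is a sketch under stated assumptions, not the PETSc routine: fortran_offset() is a hypothetical name, and it simply tests whether the byte distance between two addresses is a whole number of elements and whether that number fits in an int.</p>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check (not PETSc's implementation): can the array that
   starts at c_addr be indexed from the Fortran base address f_base with
   an integer offset counted in elements of elem_size bytes?
   Returns 0 and stores the offset on success,
           1 if the two addresses are not commonly aligned,
           2 if they are too far apart for an int offset. */
static int fortran_offset(const void *f_base, const void *c_addr,
                          size_t elem_size, int *offset)
{
    long long diff, es;
    assert(elem_size > 0 && offset != NULL);
    diff = (long long)(intptr_t)c_addr - (long long)(intptr_t)f_base;
    es = (long long)elem_size;
    if (diff % es != 0) return 1;                      /* misaligned */
    if (diff / es > INT_MAX || diff / es < INT_MIN) return 2;
    *offset = (int)(diff / es);
    return 0;
}
```

    <p>If the two locations printed in the message above (C 1920156, Fortran 2438656) are byte addresses, their difference is 518500 bytes, which is not a multiple of 8, so this check reports the arrays as not commonly aligned for 8-byte doubles.</p>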

  </li>

  <li><font color="#ff0000"><strong>Symptom: </strong></font>On
rs6000 machines, the program encounters a segmentation fault
when initializing MPI. <br>

[light] mpirun ex1 <br>

/light_home2/lmcinnes/mpich/lib/rs6000/ch_p4/mpirun: 23817 Memory fault
    <br>

See, e.g., the following debugger session:<br>

[light] 525%gdb ex1 <br>

GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the
conditions. There is absolutely no warranty for GDB; type "show
warranty" for details. GDB 4.13 (rs6000-ibm-aix3.2), Copyright
1994 Free Software Foundation, Inc... <br>

(gdb) run -p4pg joe <br>

Starting program:
/light_home2/lmcinnes/petsc-2.0.15/src/sles/examples/tutorials/ex1
-p4pg joe <br>

Program received signal SIGSEGV, Segmentation fault. 0x10003750 in
getenv () <br>

(gdb) where <br>

#0 0x10003750 in getenv () <br>

#1 0x10001438 in MPIR_Init (=0x2ff7f630, =0x2ff7f634) <br>

#2 0x10001384 in MPI_Init (=0x2ff7f630, =0x2ff7f634) <br>

#3 0x100004d8 in main (argc=3, args=0x2ff7f65c) at ex1.c:37 <br>

#4 0x10000430 in __start ()
    <p><font color="#ff0000"><strong>Problem: </strong></font>As
the debugger session above
suggests, libxlf.a contains a Fortran routine getenv(), which is
being used instead of the UNIX routine that is really needed.
This seems to occur when using gcc/g++ instead of xlc. <br>

    </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Edit
the file petsc/bmake/rs6000/bpackages and define FC_LIB as
follows, making sure to list "-lbsd -lc" BEFORE libxlf.a and
any other Fortran libraries: FC_LIB = -lbsd -lc /usr/lib/libxlf.a</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>The
program seems to use more and more memory as it runs, even
though you don't think you are allocating more memory.
    <p><font color="#ff0000"><strong>Problem:</strong></font>
Possibly some of the following:</p>

    <ol>

      <li>You are creating new PETSc objects but never freeing
them.</li>

      <li>There is a memory leak in PETSc or your code. </li>

      <li>Something much more subtle: (if you are using Fortran).
When you declare a large array in Fortran, the operating
system does not allocate all the memory pages for that array until you
start using the different locations in the array. Thus, in a
code, if at each step you start using later values in the
array your virtual memory usage will "continue" to increase
as measured by ps or top. </li>

      <li>You are running with the -log, -log_mpe, or -log_all
option. Here a great deal of logging information is stored in
memory until the conclusion of the run.</li>

      <li>You are linking with the MPI profiling libraries; these
cause logging of all MPI activities. Another <font color="#ff0000"><strong>symptom</strong>
        </font>is that at the conclusion of the run the program may print a
message about writing log
files. </li>

    </ol>

    <p><font color="#ff0000"><strong>Cures:</strong></font></p>

    <ol>

      <li>For causes 1 and 2: run with the -malloc_debug and -malloc_dump options. Or
use
the
routines PetscMallocDump() and PetscMallocLogDump() sprinkled in
your code to track memory that is allocated and not later freed. Use
the routines PetscMallocSpace() and PetscGetResidentSetSize()
to monitor memory allocated and total memory used as the
code progresses. </li>

      <li>For cause 3: this is just the way Unix works and is harmless.</li>

      <li>For cause 4: do not use the -log, -log_mpe, or -log_all option, or
use PLogEventDeactivate() or PLogEventDeactivateClass(),
PLogEventMPEDeactivate() to turn off logging of specific events. </li>

      <li>For cause 5: make sure you do not link with the MPI profiling
libraries.&nbsp;</li>

    </ol>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>Under
Windows when Installing using g++ libfast in:
/users/petsc/petsc_prj/petsc/src/sys/src str.c: In function `void
PetscStrncpy(char *, char *, int)': str.c:36: warning: implicit
declaration of function `int strncpy(...)' ... ...
    <p><font color="#ff0000"><strong>Problem:</strong></font>
This is
due to the case insensitivity of Windows file systems. Instead of using
string.h , the compiler is picking up String.h - a C++
include-file, causing these errors. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>In
the gcc include dir do "cp string.h string_bak.h" - Edit
petsc/src/sys/src/str.c replace string.h with string_bak.h -
Edit petsc/src/sys/src/memc.c replace memory.h with string_bak.h -
recompile. </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>on
SGI (or Origin) using SGI MPI version 2.0 MPI Error, rank:0,
function:MPI_ERRHANDLER_SET, Invalid communicator MPI_Abort()
called, aborting program! Other, random crashes in MPI.
    <p><font color="#ff0000"><strong>Problem: </strong></font>A bug
in version 2.0 of SGI's implementation of MPI (confirmed by
SGI). </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Upgrade
to SGI's version 3.0 of MPI. </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>on
SGI Powerchallenge running 6.2 ld64: WARNING 134: weak
definition of __dcis in /usr/lib64/mips4/libftn.so preempts that weak
definition in /usr/lib64/mips4/libm.so. ld64: WARNING 134: weak
definition of __rcis in /usr/lib64/mips4/libftn.so preempts
that weak definition in /usr/lib64/mips4/libm.so.
    <p><font color="#ff0000"><strong>Problem:</strong></font>
The message seems to be harmless. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Change
the CLINKER and FLINKER in bmake/IRIX64/base to&nbsp; <br>

CLINKER = cc -64 ${COPTFLAGS} -Wl,-woff,84,-woff,85,-woff,134 -rpath
${LDIR}:${DYLIBPATH} <br>

FLINKER = f77 -64 ${FOPTFLAGS} -Wl,-woff,84,-woff,85,-woff,134 -rpath
${LDIR}:${DYLIBPATH}</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>&nbsp;</font>when
calling MatPartitioningApply() you get a message Error! Key 16615 not
found
    <p><font color="#ff0000"><strong>Problem:&nbsp;</strong></font>The
graph of the matrix you are using is not symmetric. </p>

    <p><font color="#ff0000"><strong>Cure:</strong></font>
You must use matrices with a symmetric graph for partitioning. </p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>With
GMRES, at restart the second residual norm printed does not
match the first: <br>

26 KSP Residual norm 3.421544615851e-04 <br>

27 KSP Residual norm 2.973675659493e-04 <br>

28 KSP Residual norm 2.588642948270e-04 <br>

29 KSP Residual norm 2.268190747349e-04 <br>

30 KSP Residual norm 1.977245964368e-04<br>

30 KSP Residual norm 1.994426291979e-04 &lt;----- At restart the
residual norm is printed a second time
    <p><font color="#ff0000"><strong>Problem:</strong></font>
Actually this is not surprising. GMRES computes the norm of the
residual at each iteration via a recurrence relation between
the norms of the residuals at the previous iterations and quantities
computed at the current iteration; it does not compute it directly as
|| b - A x^{n} ||. Sometimes, especially with an
ill-conditioned matrix, or computation of the matrix-vector product via
differencing, the residual norms computed by GMRES start to
"drift" from the correct values. At the restart, we compute the
residual norm directly, hence the difference between the two
values printed. The drifting, if it remains small, is harmless
(it doesn't affect the accuracy of the solution that GMRES
computes). </p>

    <p><font color="#ff0000"><strong>Cure:</strong></font>
There
really isn't a cure, but if you use a more powerful
preconditioner the drift will often be smaller and less noticeable. Or
if you are running matrix-free you may need to tune the
matrix-free parameters.</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>on
Cray T3E/T3D: My mixed Fortran/(C or C++) code works fine on
other machines but does not link (or links but crashes) on
the Cray T3D or T3E.
    <p><font color="#ff0000"><strong>Probable
problems</strong></font>:</p>

    <ol>

      <li>NOT LINKING: the Cray Fortran compiler changes all
Fortran routine names to all caps, so when you call them
from C/C++ with all small letters, the linker cannot find them. </li>

      <li>STRANGE CRASHING: the Cray Fortran compiler uses double
precision to denote quad precision and single precision to
denote "regular" double precision. </li>

    </ol>

    <p><font color="#ff0000"><strong>Cures:</strong></font>
    </p>

    <ol>

      <li>You must make sure that when you call Fortran routines
from C/C++ the name of the routine called (in C/C++) is in
all caps. The PETSc macro HAVE_FORTRAN_CAPS is defined on machines like
the Cray so you can use it in your C/C++ like this: <br>
#if defined(HAVE_FORTRAN_CAPS) <br>
#define myfortranroutine_ MYFORTRANROUTINE <br>
#elif !defined(HAVE_FORTRAN_UNDERSCORE) <br>
#define myfortranroutine_ myfortranroutine <br>
#endif <br>
/* some C code that calls Fortran */ <br>
myfortranroutine_(.....) <br>
See src/fortran/custom/zoptions.c for examples. </li>

      <li>To get the Fortran compiler to behave like a normal
Unix Fortran compiler you must make sure that all of your
Fortran routines are compiled with the -dp flag. If you use the PETSc
makefiles and the macro FC to compile your Fortran code, this is
handled automatically. </li>

    </ol>

  </li>

  <li><font color="#ff0000"><strong>Symptom:</strong>
    </font>[0]
PETSC ERROR: MatAssemblyBegin() line 1858 in
src/mat/interface/matrix.c Not for factored matrix
    <p><font color="#ff0000"><strong>Problem: </strong></font>You
are trying to assemble a matrix that has been factored.
Normally this does not make sense, unless you are using an in-place
factorization and want to reuse the space. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Call
MatSetUnfactored(Mat); before calling the MatSetValues()
routines.</p>

  </li>

  <li><strong><font color="#ff0000">Symptom:</font></strong>
Error
while running program<br>

    <br>

&gt; mpirun ex1<br>

p0_27137: p4_error: semget failed for setnum=%d: 0<br>

    <br>

    <strong><font color="#ff0000">Problem:</font></strong>
Improperly installed or configured MPICH. Often this results
from compiling the socket-based version of MPICH (device ch_p4) but
using the mpirun associated with the shared memory version, or
the other way around.<br>

    <br>

    <font color="#ff0000"><strong>Cure:</strong>
    </font>First,
make sure that you can run plain MPI programs (those
without PETSc). Make sure you are using the correct version of
mpirun for the installed version of MPICH, or reinstall MPICH.<br>

  </li>

  <li><strong><font color="#ff0000">Symptom</font></strong>:
Error when compiling PETSc examples<br>

    <br>

&gt; ld : -lg2c no such file<br>

    <br>

    <font color="#ff0000"><strong>Problem:</strong></font>
Your
Fortran compiler is probably using libf2c.a instead of libg2c.a<br>

    <br>

    <font color="#ff0000"><strong>Cure:</strong></font>
Edit
bmake/${PETSC_ARCH}/petscconf and replace -lg2c with -lf2c<br>

    <br>

  </li>

  <li>
    <p>&nbsp;<font color="#ff0000"><strong>Symptom</strong>:</font>
You get the following errors when using PETSc graphics on
Windows/Cygwin-X11: <br>

    <font face="Terminal"><br>

X Error of failed request: BadMatch (invalid parameter attributes)<br>

Major opcode of failed request: 78 (X_CreateColormap)<br>

Serial number of failed request: 8<br>

Current serial number in output stream: 9<br>

    <br>

    </font></p>

    <p><font face="Terminal"><font color="#ff0000"><strong>Problem:</strong></font>
This
problem might occur when using 256-color mode or 32-bit color
mode on Windows. </font></p>

    <font face="Terminal"> </font>
    <p><font face="Terminal"><font color="#ff0000"><strong>Cure:</strong></font>
This
can be fixed by changing the display settings on Windows to 16-bit
or 24-bit colors.<br>

    </font></p>

    <font face="Terminal"> </font>
    <p></p>

  </li>

  <li><font face="Terminal"><font color="#ff0000"><strong>Symptom</strong>:</font>
Some
Krylov methods seem to print two residual norms per iteration,
for example <br>

    <font face="Terminal"><br>

&gt; 1198 KSP Residual norm 1.366052062216e-04<br>

&gt; 1198 KSP Residual norm 1.931875025549e-04<br>

&gt; 1199 KSP Residual norm 1.366026406067e-04<br>

&gt; 1199 KSP Residual norm 1.931819426344e-04&nbsp;</font></font>
    <p><font color="#ff0000"><strong>Problem:</strong></font>
Some Krylov methods, for example tfqmr, actually have a
"sub-iteration"<br>

of size 2 inside the loop; each of the two substeps has its own matrix
vector<br>

product and application of the preconditioner and updates the residual<br>

approximations. This is why you get this "funny" output where it looks
like&nbsp;<br>

there are two residual norms per iteration. You can also think of it as
twice<br>

as many iterations. </p>

    <p></p>

  </li>

  <li><font color="#ff0000"><strong>Symptom</strong>:</font>
The
example compiles fine - but at runtime gives the following
error:
    <p>[0]PETSC ERROR: PetscInitialize_DynamicLibraries() line 63
in src/sys/src/dll/reg.c<br>

[0]PETSC ERROR: Unable to locate PETSc dynamic library
/home/balay/spetsc/lib/libg/linux/libpetsc&nbsp;<br>

You cannot move the dynamic libraries!<br>

    <br>

    </p>

    <p><font color="#ff0000"><strong>Problem:</strong></font>
When
using DYNAMIC libraries, the libraries cannot be moved after
they are installed. This can also happen on clusters where
the paths on the (run) nodes differ from those on the
(compile) front-end. </p>

    <p><font color="#ff0000"><strong>Cure:</strong></font>&nbsp;
Do not use dynamic or shared libraries. Run
config/configure.py with --with-shared=0 --with-dynamic=0</p>

  </li>

  <li><font color="#ff0000"><strong>Symptom</strong>:</font>
When
running with -start_in_debugger one gets the error message
    <p>PETSC: Attaching gdb to
/opt/procast_mpich/procast051003/./procast of pid 31603 on display
linux.:0.\
0 on machine linux.
: Can't get address for linux. Xt error: Can't open display: linux.:0.0
    </p>

    <p><font color="#ff0000"><strong>Problem:</strong></font>
The
remote nodes
do not know where to display the debugger window. </p>

    <p><font color="#ff0000"><strong>Cure: </strong></font>Run
with
the additional option
-display displayname, where displayname is something like mymachine:0.0 </p>

  </li>

  <li><font color="#ff0000"><strong></strong></font><font color="#ff0000"><strong>Symptom: </strong></font><br>

# make PETSC_ARCH=asterix-mpd test<br>

Running test examples to verify correct installation<br>

Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1
MPI process<br>

See
http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html<br>

mpdrun_asterix: cannot connect to local mpd (/tmp/mpd2.console_balay);
possible causes:<br>

&nbsp; 1. no mpd is running on this host<br>

&nbsp; 2. an mpd is running but was started without a "console" (-n
option)<br>

    <p><font color="#ff0000"><strong>Problem:</strong></font>&nbsp;
As
the error message indicates, the mpd daemon required by the
version of
MPICH you have installed has not been started. </p>

    <p><font color="#ff0000"><strong>Cure:</strong></font><span style="font-weight: bold;">&nbsp;</span> Start the
mpd daemon [should
be at MPI_DIR/bin/mpdboot].<br>

    </p>

  </li>


  </font>
</ol>

</body>
</html>