File: faq.html


 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
  <head> <link rel="canonical" href="http://www.mcs.anl.gov/petsc/petsc-current/docs/faq.html" />
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <title>Documentation: FAQ</title>
  </head>
  <body bgcolor="#ffffff">
   <div id="version" align=right><b>petsc-3.10.3 2018-12-18</b></div>
   <div id="bugreport" align=right><a href="mailto:petsc-maint@mcs.anl.gov?subject=Typo or Error in Documentation &body=Please describe the typo or error in the documentation: petsc-3.10.3 v3.10.3 docs/faq.html "><small>Report Typos and Errors</small></a></div>

    <h1>Documentation: FAQ</h1>

    

    <style type="text/css">
      #answers > h3 a[name] {
        color: red;
      }

      #answers > ol > li {
        list-style-type: lower-alpha;
        padding: .5em 0;
      }
    </style>

    <div id="main">

      <h4><a href="#general">General</a></h4>

      <ul>
        <li><a href="#petsc-mailing-list">How can I subscribe to the PETSc mailing lists?</a></li>
        <li><a href="#book">Any useful books on numerical computing?</a></li>
        <li><a href="#computers">What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?</a></li>
        <li><a href="#license">What kind of license is PETSc released under?</a></li>
        <li><a href="#why-c">Why is PETSc programmed in C, instead of Fortran or C++?</a> </li>
        <li><a href="#logging-overhead">Does all the PETSc error checking and logging reduce PETSc's efficiency?</a></li>
        <li><a href="#work-efficiently">How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?</a></li>
        <li><a href="#complex">For complex numbers will I get better performance using C or C++?</a></li>
        <li><a href="#different">How come when I run the same program on the same number of processes I get "different" answers?</a></li>
        <li><a href="#differentiterations">How come when I run the same linear solver with different number of processes it takes a different number of iterations?</a></li>
        <li><a href="#gpus">Can PETSc use GPUs to speed up the computation time?</a></li>
        <li><a href="#precision">Can I run PETSc with extended precision?</a></li>
        <li><a href="#qd">Why doesn't PETSc use QD to implement support for extended precision?</a></li>
      </ul>

      <h4><a href="#installation">Installation</a></h4>

      <ul>
        <li><a href="#already-installed">How do I begin using PETSc if the software has already been completely built and installed by someone else?</a></li>
        <li><a href="#reduce-disk-space">The PETSc distribution is SO large. How can I reduce my disk space usage?</a></li>
        <li><a href="#petsc-uni">I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?</a></li>
        <li><a href="#no-x">Can I install PETSc to not use X windows (either under Unix or Windows with gcc, the gnu compiler)?</a></li>
        <li><a href="#use-mpi">Why do you use MPI</a>?</li> <li><a href="#mpi-compilers">What do I do if my MPI compiler wrappers are invalid</a>?</li>
        <li><a href="#with-64-bit-indices">When should/can I use the ./configure option --with-64-bit-indices</a>?</li>
        <li><a href="#compilererror">What if I get an internal compiler error?</a></li>
        <li><a href="#install-petsc4py-dev">How do I install petsc4py with the development PETSc</a>?</li>
        <li><a href="#gfortran">What Fortran compiler do you recommend for the Apple Mac OS X?</a></li>
        <li><a href="#packagelocations">How can I find the URL locations of the packages you install using --download-PACKAGE?</a></li>
        <li><a href="#multiplempis">How to fix the problem: PETSc was configured with one MPICH or OpenMPI mpi.h version but now appears to be compiling using a different MPICH OpenMPI mpi.h version</a></li>
        <li><a href="#PetscOptionsInsertFile">What does it mean when make test gives:</a>
            <pre>
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
            </pre></li>
      </ul>

      <h4><a href="#usage">Usage</a></h4>

      <ul>
        <li><a href="#redirectstdout">How can I redirect PETSc's stdout and stderr when programming with a GUI interface in Windows Developer Studio or to C++ streams?</a></li>
        <li><a href="#hypre">I want to use hypre boomerAMG without GMRES but when I run -pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly I don't get a very accurate answer!</a></li>
        <li><a href="#nosaij">You have AIJ and BAIJ matrix formats, and SBAIJ for symmetric storage, how come no SAIJ?</a></li>
        <li><a href="#domaindecomposition">How do I use PETSc for domain decomposition?</a></li>
        <li><a href="#blocks">Can I create BAIJ matrices with different size blocks for different block rows?</a></li>
        <li><a href="#mpi-vec-access">How do I access values from a parallel PETSc vector on a different process than the one that owns the values?</a></li>
        <li><a href="#mpi-vec-to-seq-vec">How do I collect all the values from a parallel PETSc vector into a sequential vector on each processor?</a></li>
        <li><a href="#mpi-vec-to-mpi-vec">How do I collect all the values from a parallel PETSc vector into a vector on the zeroth (or any particular) processor?</a></li>
        <li><a href="#sparse-matrix-ascii-format">How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, SLAPC or other ASCII format?</a></li>
        <li><a href="#setfromoptions">Does TSSetFromOptions(), SNESSetFromOptions() or KSPSetFromOptions() reset all the parameters I set or how come TS/SNES/KSPSetXXX() don't seem to work?</a></li>
        <li><a href="#makefiles">Can I use my own makefiles or rules for compiling code, rather than PETSc's?</a></li>
        <li><a href="#cmake">Can I use CMake to build my own project that depends on PETSc?</a></li>
        <li><a href="#carriagereturns">How can I put carriage returns in PetscPrintf() statements from Fortran?</a></li>
        <li><a href="#cxxmethod">How can I implement callbacks using C++ class methods?</a></li>
        <li><a href="#functionjacobian">Everyone knows that when you code Newton's method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?</a></li>
        <li><a href="#jacobianfree">How can I use Newton's method Jacobian free? Can I difference a different function than provided with SNESSetFunction()?</a></li>
        <li><a href="#conditionnumber">How can I determine the condition number of a matrix?</a></li>
        <li><a href="#invertmatrix">How can I compute the inverse of a PETSc matrix?</a></li>
        <li><a href="#schurcomplement">How can I compute a Schur complement: Kbb - Kba *inverse(Kaa)*Kab?</a></li>
        <li><a href="#fem">Do you have examples of doing unstructured grid finite element computations (FEM) with PETSc?</a></li>
        <li><a href="#da_mpi_cart">The PETSc DMDA object decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?</a></li>
        <li><a href="#redistribute">When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions and yet keep the matrix symmetric?</a></li>
        <li><a href="#matlab">How can I use PETSc with MATLAB? How can I get PETSc Vecs and Mats to MATLAB or vice versa?</a></li>
        <li><a href="#cython">How do I get started with Cython so that I can extend petsc4py?</a></li>
        <li><a href="#customnormforksp">I would like to compute a custom norm for KSP to use as a convergence test or for monitoring?</a></li>
        <li><a href="#parallelsolvesequentialcode">If I have a sequential program can I use a parallel direct solver?</a></li>
      </ul>

      <h4><a href="#Execution">Execution</a></h4>

      <ul>
        <li><a href="#long-link-time">PETSc executables are SO big and take SO long to link.</a></li>
        <li><a href="#petsc-options">PETSc has so many options for my program that it is hard to keep them straight.</a></li>
        <li><a href="#petsc-log-info">PETSc automatically handles many of the details in parallel PDE solvers.  How can I understand what is really happening within my program? </a></li>
        <li><a href="#efficient-assembly">Assembling large sparse matrices takes a long time. What can I do make this process faster? or MatSetValues() is <b>so slow</b>, what can I do to make it faster?</a></li>
        <li><a href="#log-view">How can I generate performance summaries with PETSc?</a></li>
        <li><a href="#parallel-roundoff">Why do I get different answers on a different numbers of processors?</a></li>
        <li><a href="#mg-log">How do I know the amount of time spent on each level of the solver in multigrid (PCType of PCMG) -pc_type mg.</a></li>
        <li><a href="#datafiles">Where do I get the input matrices for the examples?</a></li>
        <li><a href="#info">When I dump some matrices and vectors to binary, I seem to be generating some empty files with .info extensions. What's the deal with these?</a></li>
        <li><a href="#slowerparallel">Why is my parallel <b>solver slower</b> than the sequential solver, or I have poor speed-up?</a></li>
        <li><a href="#pipelined">What steps are necessary to make the pipelined solvers execute efficiently?</a></li>
        <li><a href="#singleprecision">When using PETSc in single precision mode (--with-precision=single when running ./configure) are the operations done in single or double precision?</a></li>
        <li><a href="#newton">Why is Newton's method (SNES) not converging, or converges slowly?</a></li>
        <li><a href="#kspdiverged">Why is the linear solver (KSP) not converging, or converges slowly?</a></li>
      </ul>

      <h4><a href="#debugging">Debugging</a></h4>

      <ul>
        <li><a href="#debug-ibmfortran">How do I turn off PETSc signal handling so I can use the -C option on xlF?</a></li>
        <li><a href="#start_in_debugger-doesnotwork">How do I debug if -start_in_debugger does not work on my machine?</a></li>
        <li><a href="#debug-hang">How can I see where my code is hanging?</a></li>
        <li><a href="#debug-inspect">How can I inspect Vec and Mat values when in the debugger?</a></li>
        <li><a href="#debug-fp">How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?</a></li>
        <li><a href="#libimf">Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory.</a></li>
        <li><a href="#objecttypenotset">What does Object Type not set: Argument # n mean?</a></li>
        <li><a href="#split">What does Error detected in PetscSplitOwnership() about "sum of local lengths ...": mean?</a></li>
        <li><a href="#valgrind">What does Corrupt argument or Caught signal or SEQV or segmentation violation or bus error mean? Can I use valgrind to debug memory corruption issues?</a></li>
        <li><a href="#zeropivot">What does Detected zero pivot in LU factorization mean?</a></li>
        <li><a href="#xwindows">You create Draw windows or ViewerDraw windows or use options -ksp_monitor or_draw -snes_monitor_lg_residualnorm and the program seems to run OK but windows never open.</a></li>
        <li><a href="#memory">The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory.</a></li>
        <li><a href="#key">When calling MatPartitioningApply() you get a message <code>Error! Key 16615 not found</code></a></li>
        <li><a href="#gmres">With GMRES At restart the second residual norm printed does not match the first </a></li>
        <li><a href="#doubleits">Why do some Krylov methods seem to print two residual norms per iteration?</a></li>
        <li><a href="#dylib">Unable to locate PETSc dynamic library /home/balay/spetsc/lib/libg/linux/libpetsc</a></li>
        <li><a href="#bisect">How do I determine what update to PETSc broke my code?</a></li>
      </ul>

      <h4><a href="#shared-libraries">Shared Libraries</a></h4>

      <ul>
        <li><a href="#install-shared">Can I install PETSc libraries as shared libraries?</a></li>
        <li><a href="#why-use-shared">Why should I use shared libraries?</a></li>
        <li><a href="#link-shared">How do I link to the PETSc shared libraries?</a></li>
        <li><a href="#link-regular-lib">What if I want to link to the regular .a library files?</a></li>
        <li><a href="#move-shared-exec">What do I do if I want to move my executable to a different machine?</a></li>
      </ul>

    </div>

    <hr>

    <div id="answers">

      <h1><a name="general">General</a></h1>

      <h3><a name="petsc-mailing-list">How can I subscribe to the PETSc mailing lists?</a></h3>

      See <a href="http://www.mcs.anl.gov/petsc/miscellaneous/mailing-lists.html">http://www.mcs.anl.gov/petsc/miscellaneous/mailing-lists.html</a>

      <h3><a name="book">Any useful books on numerical computing?</a></h3>

      <a href="http://ebooks.cambridge.org/ebook.jsf?bid=CBO9780511617973">Writing Scientific Software: A Guide to Good Style</a>

      <h3><a name="computers">What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?</a></h3>

      PETSc can be used with any kind of parallel system that supports MPI
      <b>BUT</b> for any decent performance one needs

      <ul>
        <li>
          a <b>fast, low-latency interconnect</b>; any Ethernet, even 10 GigE,
          simply cannot provide the needed performance.
        </li>
        <li>
          <b>high per-CPU MEMORY performance</b>. Each CPU (core in multi-core
          systems) needs to have its <b>own</b> memory bandwidth of at least 2 or
          more gigabytes/second. For example, standard dual-processor "PCs" will
          <b>not</b> provide better performance when the second processor is
          used, that is, you will not see speed-up when using the second
          processor. This is because the speed of sparse matrix computations is
          almost totally determined by the speed of the memory, not the speed of
          the CPU.
        </li>
     </ul>


    <p>
    To obtain good performance it is important that you know your
    machine: how many nodes, how many memory sockets per node, how
    many cores per memory socket, and how much memory bandwidth you
    obtain per core, memory socket, and node. If you do not know this
    and can run MPI programs with mpiexec (that is, you don't have a
    batch system), then in ${PETSC_DIR} run</p>

   <p>make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]</p>

   <p>which will provide a summary of the bandwidth received with different numbers of MPI processes and potential speedups. If you have a batch system:</p>
     <ul>
      <li> cd src/benchmarks/streams</li>
      <li> make MPIVersion</li>
      <li> submit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes, collecting all of the output from the runs into a single file scaling.log.</li>
      <li> copy scaling.log into the src/benchmarks/streams directory</li>
      <li> ./process.py createfile ; process.py </li>
     </ul>


           Even if you have enough memory bandwidth, performance can degrade
          if the OS switches processes between cores. Smart
          process to core/socket binding (this just means locking a
          process to a particular core or memory socket) may help
          you. For example, consider using fewer processes than cores
          and binding processes to separate sockets so that each
          process uses a different memory bus:

      <ul>
      <li>
            <dt><a href="http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding">MPICH2 binding with the Hydra process manager</a></dt>
            <dd><code>mpiexec.hydra -n 4 --binding cpu:sockets</code></dd>
      </li>
      <li>
            <dt><a href="http://www.open-mpi.org/doc/v1.5/man1/mpiexec.1.php#sect8">Open MPI binding</a></dt>
            <dd><code>mpiexec -n 4 --bysocket --bind-to-socket --report-bindings</code></dd>
      </li>
      <li>
            <dt><tt>taskset</tt>, part of the <tt>util-linux</tt> package</dt>
            <dd>
              <tt>Usage: taskset [options] [mask | cpu-list] [pid|cmd [args...]]</tt>, type <tt>man taskset</tt> for details.
              Make sure to set affinity for your program, not for the <tt>mpiexec</tt> program.
            </dd>
      </li>
      <li>
            <dt><tt>numactl</tt></dt>
            <dd>In addition to task affinity, this tool also allows changing the default memory affinity policy.
              On Linux, the default policy is to attempt to find memory on the same memory bus that serves the core that a thread is running on at whatever time the memory is faulted (<em>not</em> when <tt>malloc()</tt> is called).
              If local memory is not available, it is found elsewhere, possibly leading to serious memory imbalances.
              The option <tt>--localalloc</tt> allocates memory on the local NUMA node, similar to the <tt>numa_alloc_local()</tt> function in the <tt>libnuma</tt> library.
              The option <tt>--cpunodebind=nodes</tt> binds the process to a given NUMA node (note that this can be larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).
              The option <tt>--physcpubind=cpus</tt> binds the process to a given processor core (numbered according to <tt>/proc/cpuinfo</tt>, therefore including logical cores if Hyper-threading is enabled).
              With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your machine to calculate the correct NUMA node or processor number given the environment variable <tt>OMPI_COMM_WORLD_LOCAL_RANK</tt>.
              In most cases, it is easier to make <tt>mpiexec</tt> or a resource manager set affinities.
            </dd>
      </li>
     </ul>
        </p>


          The software <a href="http://open-mx.gforge.inria.fr">http://open-mx.gforge.inria.fr</a>
          provides faster communication for Ethernet systems. We have not tried it, but it
          claims it can dramatically reduce latency and increase bandwidth on
          Linux systems. You must first install this software and then install
          MPICH or Open MPI to use it.
        </p>




      <h3><a name="license">What kind of license is PETSc released under?</a></h3>

      See the <a href="license.html">licensing notice.</a>

      <h3><a name="why-c">Why is PETSc programmed in C, instead of Fortran or C++?</a></h3>

      C enables us to build data structures for storing sparse matrices, solver
      information, etc. in ways that Fortran simply does not allow. ANSI C is
      a complete standard that all modern C compilers support. The language is
      identical on all machines. C++ is still evolving and compilers on different
      machines are not identical. Using C function pointers to provide data
      encapsulation and polymorphism allows us to get many of the advantages of
      C++ without using such a large and more complicated language. It would be
      natural and reasonable to have coded PETSc in C++; we opted to use
      C instead.

      <h3><a name="logging-overhead">Does all the PETSc error checking and logging reduce PETSc's efficiency?</a></h3>

      No.

      <h3><a name="work-efficiently">How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?</a></h3>

      <ol>
        <li>We work very efficiently.
          <ol>
            <li>
              We use Emacs for all editing; the etags feature makes navigating
              and changing our source code very easy.
            </li>
            <li>
              Our manual pages are generated automatically from
              formatted comments in the code, thus alleviating the
              need for creating and maintaining manual pages.
            </li>
            <li>
              We employ automatic nightly tests of PETSc on several
              different machine architectures. This process helps us
              to discover problems the day after we have introduced
              them rather than weeks or months later.
            </li>
          </ol>
        </li>

        <li>
          We are very careful in our design (and are constantly
          revising our design) to make the package easy to use,
          write, and maintain.
        </li>

        <li>
          We are willing to do the grunt work of going through
          all the code regularly to make sure that <u>all</u> code
          conforms to our interface design. We will <u>never</u>
          keep in a bad design decision simply because changing it
          will require a lot of editing; we do a lot of editing.
        </li>

        <li>
          We constantly seek out and experiment with new design
          ideas; we retain the useful ones and discard the rest.
          All of these decisions are based on <u>practicality</u>.
        </li>

        <li>
          Function and variable names are chosen to be very
          consistent throughout the software. Even the rules about
          capitalization are designed to make it easy to figure out
          the name of a particular object or routine. Our memories
          are terrible, so careful consistent naming puts less
          stress on our limited human RAM.
        </li>

        <li>
          The PETSc directory tree is carefully designed to make
          it easy to move throughout the entire package.
        </li>

        <li>
          Our bug reporting system, based on email to <a
            href="../documentation/bugreporting.html">petsc-maint@mcs.anl.gov</a>,
          makes it very simple to keep track of what bugs have been found and
          fixed. In addition, the bug report system retains an archive of all
          reported problems and fixes, so it is easy to refind fixes to
          previously discovered problems.
        </li>

        <li>
          We contain the complexity of PETSc by using object-oriented programming
          techniques including data encapsulation (this is why your program
          cannot, for example, look directly at what is inside the object Mat)
          and polymorphism (you call MatMult() regardless of whether your matrix
          is dense, sparse, parallel or sequential; you don't call a different
          routine for each format).
        </li>

        <li>We try to provide the functionality requested by our users.</li>

        <li>We never sleep.</li>
      </ol>

      <h3><a name="complex">For complex numbers will I get better performance with C++?</a></h3>

      To use PETSc with complex numbers, <code>./configure</code> with
      the option <code>--with-scalar-type=complex</code> and either
      <code>--with-clanguage=c++</code> or, the default,
      <code>--with-clanguage=c</code>. In our experience they deliver very
      similar performance (speed), but if one is concerned one should just try
      both and see which is faster.

      <h3><a name="different">How come when I run the same program on the same number of processes I get a "different" answer?</a></h3>

      <p>
        Inner products and norms in PETSc are computed using the MPI_Allreduce()
        command. In different runs the order in which values arrive at a given
        process (via MPI) can differ, thus the order in which some
        floating point arithmetic operations are performed will be different. Since
        floating point arithmetic is not associative, the computed
        quantity may be (slightly) different. Over a run the many slight
        differences in the inner products and norms will affect all the computed
        results. It is important to realize that none of the computed answers are
        any less right or wrong (in fact the sequential computation is no more
        right than the parallel ones); they are all equally valid.
      </p>

      <p>
        The discussion above assumes that the exact same algorithm is being used on
        the different numbers of processes. When the algorithm differs with the
        number of processes (almost all preconditioner algorithms except
        Jacobi do) then one expects to
        see (and does) a greater difference in results for different numbers of
        processes.  In some cases (for example the block Jacobi preconditioner) it may
        be that the algorithm works for some numbers of processes and does not work
        for others.
      </p>

      <h3><a name="differentiterations">How come when I run the same linear solver on a different number of processes it takes a different number of iterations?</a></h3>

      The convergence of many of the preconditioners in PETSc, including the
      default parallel preconditioner block Jacobi, depends on the number of
      processes. The more processes, the (slightly) slower the convergence.
      This is the nature of iterative solvers: more parallelism means
      more "older" information is used in the solution process, hence slower
      convergence.

      <h3><a name="gpus">Can PETSc use GPUs to speedup computations?</a></h3>

      The <a href="https://bitbucket.org/petsc/petsc">PETSc developer repository</a> has some support for running portions of the computation on
      GPUs. See <a href="http://www.mcs.anl.gov/petsc/features/gpus.html">PETSc GPUs</a> for
      more information. PETSc has Vec classes VECCUDA and VECVIENNACL which perform almost all
      the vector operations on the GPU. The Mat classes AIJCUSPARSE and AIJVIENNACL perform
      matrix-vector products on the GPU but do not have matrix assembly on the
      GPU yet. Both of these classes run in parallel with MPI. All KSP methods,
      except KSPIBCGS, run all their vector operations on the GPU; thus, for
      example, Jacobi preconditioned Krylov methods run completely on the GPU.
      Preconditioners are a problem; we could do with some help for these. The
      example <a href="http://www.mcs.anl.gov/petsc/petsc-master/src/snes/examples/tutorials/ex47cu.cu.html">src/snes/examples/tutorials/ex47cu.cu</a>
      demonstrates how the nonlinear function evaluation can be done on the
      GPU.

      <h3><a name="precision">Can I run PETSc with extended precision?</a></h3>

      Yes, with gcc 4.6 and later (and gfortran 4.6 and later)
      <code>./configure</code> PETSc using the options
      <code>--with-precision=__float128 --download-f2cblaslapack</code>.
      External packages cannot be used in this mode.

      <h3><a name="qd">Why doesn't PETSc use QD to implement support for exended precision?</a></h3>

      We tried really hard but could not. The problem is that the QD C++ classes,
      though they try to mimic the built-in data types such as double, are not
      native types and cannot "just be used" in a general piece of numerical
      source code; rather, the code has to be rewritten to live within the limitations
      of the QD classes. But note, above, that PETSc can be built to use quad precision.

      <hr>

      <h2><a name="installation">Installation</a></h2>

      <h3><a name="already-installed">How do I begin using PETSc if the software has already been completely built and installed by someone else?</a></h3>

      Assuming that the PETSc libraries have been successfully built for
      a particular architecture and level of optimization, a new user must
      merely:

      <ol>
        <li>
          Set the environmental variable PETSC_DIR to the full
          path of the PETSc home directory (for example,
          /home/username/petsc).
        </li>

        <li>
          Set the environmental variable PETSC_ARCH, which indicates the
          configuration on which PETSc will be used. Note that the PETSC_ARCH is
          simply a name the installer used when installing the libraries. There
          may be several on a single system, like mylinux-g for the debug
          versions of the library and mylinux-O for the optimized version, or
          petscdebug for the debug version and petscopt for the optimized
          version.
        </li>

        <li>
          Begin by copying one of the many PETSc examples (in, for example,
          petsc/src/ksp/examples/tutorials) and its corresponding makefile.
        </li>

        <li>
          See the introductory section of the PETSc users manual for tips on
          documentation.
        </li>
      </ol>

      <h3><a name="reduce-disk-space">The PETSc distribution is SO large. How can I reduce my disk space usage?</a></h3>

      <ol>
        <li>
          The directory /usr/share/doc/petsc3.9-doc contains a set of HTML manual pages
          for use with a browser. You can delete these pages to save some disk space.
        </li>

        <li>
          The PETSc users manual is provided in PDF in
          <a href="http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf">manual.pdf</a>.
        </li>

        <li>
          The PETSc test suite contains sample output for many of the examples.
          These are contained in the PETSc directories
          /usr/lib/petscdir/petsc*/*/share/petsc/examples/src/*/*/examples/tutorials/output and
          examples/tests/output.
        </li>

        <li>
          The debugging versions of the libraries are larger than the optimized
          versions. In a pinch you can work with the optimized version although
          we do not recommend it generally because finding bugs is much easier
          with the debug version.
        </li>
      </ol>

      <h3><a name="petsc-uni">I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?</a></h3>

      No, run ./configure with the option <code>--with-mpi=0</code>

      <h3><a name="no-x">Can I install PETSc to not use X windows (either under Unix or Windows with gcc, the gnu compiler)?</a></h3>

      Yes. Run ./configure with the additional flag <code>--with-x=0</code>

      <h3><a name="use-mpi">Why do you use MPI</a>?</h3>

      MPI is the message-passing standard. Because it is a standard, it will not
      change over time; thus, we do not have to change PETSc every time the
      provider of the message-passing system decides to make an interface change.
      MPI was carefully designed by experts from industry, academia, and
      government labs to provide the highest quality performance and capability.
      For example, the careful design of communicators in MPI allows the easy
      nesting of different libraries; no other message-passing system provides
      this support. All of the major parallel computer vendors were involved in
      the design of MPI and have committed to providing quality implementations.
      In addition, since MPI is a standard, several different groups have already
      provided complete free implementations. Thus, one does not have to rely on
      the technical skills of one particular group to provide the message-passing
      libraries. Today, MPI is the only practical, portable approach to writing
      efficient parallel numerical software.

      <h3><a name="mpi-compilers">What do I do if my MPI compiler wrappers are invalid</a>?</h3>

      Most MPI implementations provide compiler wrappers (such as mpicc) which
      give the include and link options necessary to use that version of MPI with
      the underlying compilers. These wrappers are either absent or broken in
      the MPI pointed to by --with-mpi-dir. You can rerun ./configure with the
      additional option --with-mpi-compilers=0, which will try to auto-detect
      working compilers; however, these compilers may be incompatible with the
      particular MPI build. If this fix does not work, run with
      --with-cc=c_compiler where you know c_compiler works with this particular
      MPI, and likewise for C++ and Fortran.

      <h3><a name="with-64-bit-indices">When should/can I use the ./configure option --with-64-bit-indices?</a></h3>

      By default the type that PETSc uses to index into arrays and keep sizes of
      arrays is a PetscInt defined to be a 32 bit int. If your problem

      <ul>
        <li>involves more than 2^31 - 1 unknowns (around 2 billion) OR</li>
        <li>your matrix might contain more than 2^31 - 1 nonzeros on a single process</li>
      </ul>

      then you need to use this option. Otherwise you will get strange crashes.

      <p>
        This option can be used when you are using either 32 bit or 64 bit
        pointers. You do not need to use this option if you are using 64 bit
        pointers unless the two conditions above hold.
      </p>

      <h3><a name="compilererror">What if I get an internal compiler error?</a></h3>

      You can rebuild the offending file individually with a lower optimization level. Then make sure to complain to the
      compiler vendor and file a bug report. For example, if the compiler chokes on
      src/mat/impls/baij/seq/baijsolvtrannat.c you can run

      <p>make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o</p>

      and then invoke 'make all' again.

      <h3><a name="install-petsc4py-dev">How do I install petsc4py with the development PETSc?</a></h3>

      You can follow these steps

      <ol>
        <li>install Cython</li>
        <li>install PETSc with --download-petsc4py</li>
      </ol>

      <h3><a name="gfortran">What Fortran compiler do you recommend for the Apple Mac OS X?</a></h3>

      <p>
        We recommend using brew (http://brew.sh) to install gfortran.
      </p>

      <p>
        Please contact Apple at <a
          href="http://www.apple.com/feedback">http://www.apple.com/feedback</a>
        and urge them to bundle gfortran with future versions of Xcode.
      </p>

      <h3><a name="packagelocations">How can I find the URL locations of the packages you install using --download-PACKAGE??</a></h3>

      <p>
        grep "self.download "  config/BuildSystem/config/packages/*.py
      </p>

      <hr>

      <h3><a name="multiplempis">How to fix the problem: PETSc was configured with one MPICH (or OpenMPI) mpi.h version but now appears to be compiling using a different MPICH (or OpenMPI) mpi.h version</a></h3>

<p>This generally happens for one of two reasons:
  <ul>
    <li>You previously ran ./configure with the option --download-mpich (or --download-openmpi) but later ran ./configure to use a version of MPI already installed on the machine. Solution: do "rm -rf ${PETSC_ARCH}" and run ./configure again.</li>
    <li>You have two versions of MPI installed on your machine in different locations and both are being picked up in different parts of the configuration/make process. For example you may have MPI installed in /usr/local/mpi and in /Apps/OpenMPI. Solution: remove one of the two installations.</li>
  </ul>


      <h3><a name="PetscOptionsInsertFile">What does it mean when make test gives:</a></h3>
            <pre>
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
</pre>

      <p>
        The machine has a funky network configuration and for some reason MPICH is unable to communicate between processes with the socket connections it has established. This can happen even if you are running MPICH on just one machine. Often you will find that ping `hostname` fails with this network configuration; that is, processes on the machine cannot even connect to the same machine. You can try completely disconnecting your machine from the network and seeing if make test then works, or speak with your system administrator. You can also use the ./configure options --download-mpich --download-mpich-device=ch3:nemesis
      </p>

      <hr>

      <h2><a name="using">Using</a></h2>

      <h3><a name="redirectstdout">How can I redirect PETSc's stdout and stderr when programming with a GUI interface in Windows Developer Studio or too C++ streams?</a> </h3>

      <p>
        To overload just the error messages, write your own MyPrintError() function
        that does whatever you want (including popping up windows, etc.) and use it as
        shown below.
      </p>

      <pre>
extern "C" {
  int PASCAL WinMain(HINSTANCE inst,HINSTANCE dumb,LPSTR param,int show);
};

#include &lt;petscsys.h&gt;
#include &lt;mpi.h&gt;

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...) {
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[]) {
  char buf[256];
  int i;
  HINSTANCE inst;
  PetscErrorCode ierr;

  inst=(HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0]=0;
  for (i=1; i&lt;ac; i++) {
    strcat(buf,av[i]);
    strcat(buf," ");
  }

  PetscInitialize(&amp;ac, &amp;av, NULL, help);

  return WinMain(inst,NULL,buf,SW_SHOWNORMAL);
}
    </pre>

      <p>
        Add this file to the project and compile with these preprocessor definitions:
        <code>WIN32,_DEBUG,_CONSOLE,_MBCS,USE_PETSC_LOG,USE_PETSC_BOPT_g,USE_PETSC_STACK,_AFXDLL</code>
      </p>

      <p>
        And these link options: <code>/nologo /subsystem:console /incremental:yes
          /debug /machine:I386 /nodefaultlib:"libcmtd.lib"
          /nodefaultlib:"libcd.lib" /nodefaultlib:"mvcrt.lib"
          /pdbtype:sept</code>
      </p>

      <p>
        Note that it is compiled and linked as if it was a console program. The
        linker will search for a main, and then from it the WinMain will start.
        This works with MFC templates and derived classes too.
      </p>

      <p>
        Note: When writing a Windows console application you do not need to do
        anything; stdout and stderr are automatically output to the console
        window.
      </p>

      <p>
        To change where all PETSc stdout and stderr output goes, you can reassign
        PetscVFPrintf() to handle stdout and stderr any way
        you like. For example, write the following function:
      </p>

      <pre>
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp) {
  PetscErrorCode ierr;

  PetscFunctionBegin;
  if (fd != stdout &amp;&amp; fd != stderr) { /* handle regular files */
    ierr = PetscVFPrintfDefault(fd,format,Argp);CHKERRQ(ierr);
  } else {
    char buff[BIG];   /* BIG is a buffer size of your choosing, e.g. 1024 */
    int length;
    ierr = PetscVSNPrintf(buff,BIG,format,&amp;length,Argp);CHKERRQ(ierr);

    /* now send buff to whatever stream or whatever you want */
  }
  PetscFunctionReturn(0);
}
    </pre>

      and assign <code>PetscVFPrintf = mypetscvfprintf;</code> before
      <code>PetscInitialize()</code> in your main program.

      <h3><a name="hypre">I want to use hypre boomerAMG without GMRES but when I run <code>-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly</code> I don't get a very accurate answer!</a> </h3>

      You should run with -ksp_type richardson to have PETSc run several V or
      W cycles. -ksp_type preonly causes boomerAMG to use only one V/W cycle.
      You can control how many cycles are used in a single application of the
      boomerAMG preconditioner with <code>-pc_hypre_boomeramg_max_iter
        &lt;it&gt;</code> (the default is 1). You can also control the tolerance
      boomerAMG uses to decide whether to stop before max_iter with
      <code>-pc_hypre_boomeramg_tol &lt;tol&gt;</code> (the default is 1.e-7).
      Run with <code>-ksp_view</code> to see all the hypre options used and
      <code>-help | grep boomeramg</code> to see all the command line options.

      <h3><a name="nosaij">You have AIJ and BAIJ matrix formats, and SBAIJ for symmetric storage, how come no SAIJ?</a></h3>

        Just for historical reasons; the SBAIJ format with blocksize one is just as
      efficient as an SAIJ would be.

      <h3><a name="domaindecomposition">How do I use PETSc for Domain Decomposition?</a></h3>

      <p>
        PETSc includes Additive Schwarz methods in the suite of preconditioners.
        These may be activated with the runtime option <code>-pc_type asm.</code>
        Various other options may be set, including the degree of overlap
        <code>-pc_asm_overlap &lt;number&gt;</code>, the type of restriction/extension
        <code>-pc_asm_type [basic,restrict,interpolate,none]</code>,
        and several others. You may see the available ASM options by using
        <code>-pc_type asm -help</code>. Also, see the procedural interfaces in the
        manual pages, with names <code>PCASMxxxx()</code>, and check the index of the
        <a href="manual.pdf">users manual</a>
        for <code>PCASMxxx</code>().
      </p>

      <p>
        PETSc also contains a domain decomposition inspired wirebasket or face
        based two level method where the coarse mesh to fine mesh interpolation
        is defined by solving specific local subdomain problems. It currently
        only works for 3D scalar problems on structured grids created with PETSc
        DMDAs. See the manual page for PCEXOTIC and
        src/ksp/ksp/examples/tutorials/ex45.c for an example.
      </p>

      <p>
        PETSc also contains a balancing Neumann-Neumann type preconditioner, see the
        manual page for PCBDDC. This requires matrices be constructed with
        <code>MatCreateIS()</code> via the finite element method. See src/ksp/ksp/examples/tests/ex59.c
      </p>

      <h3><a name="blocks">Can I create BAIJ matrices with different size blocks for different block rows?</a></h3>

      Sorry, this is not possible, the BAIJ format only supports a single fixed
      block size on the entire matrix. But the AIJ format automatically searches
      for matching rows and thus still takes advantage of the natural blocks in
      your matrix to obtain good performance. Unfortunately you cannot use
      <code>MatSetValuesBlocked()</code> in that case.

      <h3><a name="mpi-vec-access">How do I access the values of a parallel PETSc vector on a different process than owns them?</a></h3>

      <ul>
        <li>On each process create a local vector large enough to hold all the values it wishes to access</li>
        <li>Create a VecScatter that scatters from the parallel vector into the local vectors</li>
        <li>Use VecGetArray() to access the values in the local vector</li>
      </ul>
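
      <p>
        A minimal sketch of these steps (assuming <code>v</code> is your existing parallel Vec;
        the global indices in <code>idx[]</code> are placeholders you would supply, and error
        checking is omitted for brevity):
      </p>

      <pre>
Vec        lv;                               /* local vector that will receive the values */
IS         from;
VecScatter ctx;
PetscInt   nidx = 3, idx[3] = {0, 5, 11};    /* hypothetical global indices this process wants */

VecCreateSeq(PETSC_COMM_SELF,nidx,&amp;lv);
ISCreateGeneral(PETSC_COMM_SELF,nidx,idx,PETSC_COPY_VALUES,&amp;from);
VecScatterCreate(v,from,lv,NULL,&amp;ctx);       /* NULL: fill all entries of lv, in order */
VecScatterBegin(ctx,v,lv,INSERT_VALUES,SCATTER_FORWARD);
VecScatterEnd(ctx,v,lv,INSERT_VALUES,SCATTER_FORWARD);
/* VecGetArray(lv,...) to read the values; VecRestoreArray() when done */
VecScatterDestroy(&amp;ctx);
ISDestroy(&amp;from);
VecDestroy(&amp;lv);
      </pre>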

      <h3><a name="mpi-vec-to-seq-vec">How do I collect all the values from a parallel PETSc vector into a sequential vector on each processor?</a></h3>

      <ul>
        <li>Create the scatter context that will do the communication</li>
        <li> <a href="manualpages/Vec/VecScatterCreateToAll.html">VecScatterCreateToAll</a>(v,&amp;ctx,&amp;w);</li>
        <li>
          Actually do the communication; this can be done repeatedly as needed
          <ul>
            <li><a href="manualpages/Vec/VecScatterBegin.html">VecScatterBegin</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
            <li> <a href="manualpages/Vec/VecScatterEnd.html">VecScatterEnd</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
          </ul>
        </li>
        <li>
          Remember to free the scatter context when no longer needed
          <ul>
            <li> <a href="manualpages/Vec/VecScatterDestroy.html">VecScatterDestroy</a>(ctx);</li>
          </ul>
        </li>
      </ul>

      Note that this simply concatenates in the parallel ordering of the vector.
      If you are using a vector from DMCreateGlobalVector() you likely want to
      first call DMDAGlobalToNaturalBegin/End() to scatter the original vector
      into the natural ordering in a new global vector before calling
      VecScatterBegin/End() to scatter the natural vector onto all processes.
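
      <p>
        Put together, a minimal sketch (assuming <code>v</code> is your existing parallel Vec;
        error checking omitted):
      </p>

      <pre>
Vec        w;     /* sequential vector holding a full copy of v on every process */
VecScatter ctx;

VecScatterCreateToAll(v,&amp;ctx,&amp;w);   /* use VecScatterCreateToZero() instead to gather only onto rank 0 */
VecScatterBegin(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);
VecScatterEnd(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);
/* ... use w ... */
VecScatterDestroy(&amp;ctx);
VecDestroy(&amp;w);
      </pre>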

      <h3><a name="mpi-vec-to-mpi-vec">How do I collect all the values from a parallel PETSc vector into a vector on the zeroth processor?</a></h3>

      <ul>
        <li>
          Create the scatter context that will do the communication
          <ul>
            <li><a href="manualpages/Vec/VecScatterCreateToZero.html">VecScatterCreateToZero</a>(v,&amp;ctx,&amp;w);</li>
          </ul>
        </li>
        <li>
          Actually do the communication; this can be done repeatedly as needed
          <ul>
            <li><a href="manualpages/Vec/VecScatterBegin.html">VecScatterBegin</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
            <li> <a href="manualpages/Vec/VecScatterEnd.html">VecScatterEnd</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
          </ul>
        </li>
        <li>
          Remember to free the scatter context when no longer needed
          <ul>
            <li> <a href="manualpages/Vec/VecScatterDestroy.html">VecScatterDestroy</a>(ctx);</li>
          </ul>
        </li>
      </ul>

      Note that this simply concatenates in the parallel ordering of the vector.
      If you are using a vector from DMCreateGlobalVector() you likely want to
      first call DMDAGlobalToNaturalBegin/End() to scatter the original vector
      into the natural ordering in a new global vector before calling
      VecScatterBegin/End() to scatter the natural vector onto process 0.

      <h3><a name="sparse-matrix-ascii-format">How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, SLAPC or other ASCII format?</a></h3>

      See the examples in src/mat/examples/tests, specifically ex72.c, ex78.c,
      and ex32.c. You will likely need to modify the code slightly to match your
      required ASCII format. Note: never read or write an ASCII
      matrix file in parallel. Instead, for reading: read it in sequentially with a standalone
      code based on ex72.c, ex78.c, or ex32.c, then save the matrix with the
      binary viewer PetscViewerBinaryOpen() and load the matrix in parallel in
      your "real" PETSc program with MatLoad(); for writing, save with the binary
      viewer and then load with the sequential code to store it as ASCII.

      <h3><a name="setfromoptions">Does TSSetFromOptions(), SNESSetFromOptions() or KSPSetFromOptions() reset all the parameters I previously set or how come my TS/SNES/KSPSetXXX() does not seem to work?</a></h3>

      <p>
        If XXXSetFromOptions() is used (with -xxx_type aaaa) to change the type of
        the object then all parameters associated with the previous type are
        removed.  Otherwise it does not reset parameters.
      </p>

      <p>
        TS/SNES/KSPSetXXX() commands that set properties for a particular type of
        object (such as KSPGMRESSetRestart()) ONLY work if the object is ALREADY
        of that type. For example, with
        <code>KSPCreate(PETSC_COMM_WORLD,&amp;ksp); KSPGMRESSetRestart(ksp,10);</code>
        the restart will be ignored since the type has not yet been set to GMRES.
        To have those values take effect you should do one of the following (see the code sketch after this list):
      </p>

      <ul style="list-style-type: none; padding: .5em;">
        <li><code>XXXCreate(..,&amp;obj);</code></li>
        <li>
          <code>XXXSetFromOptions(obj)</code>; allow setting the
          type from the command line, if it is not on the
          command line then the default type is automatically
          set
        </li>
        <li>
          <code>XXXSetYYYYY(obj,...)</code>;  if the obj is the
          appropriate type then the operation takes place
        </li>
        <li>
          <code>XXXSetFromOptions(obj)</code>; allow user to
          overwrite options hardwired in code (optional)<br>
        </li>
        <li>
          Alternatively, replace the first
          <code>XXXSetFromOptions()</code> with <code>XXXSetType(obj,type)</code>
          and hardwire the type at that point.
        </li>
      </ul>
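
      <p>
        A hedged sketch of this sequence using KSP and GMRES (the Mat A and Vecs b, x are assumed
        to exist already; the restart value 30 is arbitrary):
      </p>

      <pre>
KSP ksp;
KSPCreate(PETSC_COMM_WORLD,&amp;ksp);
KSPSetOperators(ksp,A,A);
KSPSetFromOptions(ksp);      /* type is now set, either the default or from -ksp_type */
KSPGMRESSetRestart(ksp,30);  /* takes effect only if the type really is GMRES */
KSPSetFromOptions(ksp);      /* optional: let the command line override the hardwired value */
KSPSolve(ksp,b,x);
KSPDestroy(&amp;ksp);
    </pre>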

      <h3><a name="makefiles">Can I use my own makefiles or rules for compiling code, instead of using PETSc's?</a></h3>

      Yes, see the section of the <a href="manual.pdf">users manual</a> called Makefiles.

      <h3><a name="cmake">Can I use CMake to build my own project that depends on PETSc?</a></h3>

      Use the FindPETSc.cmake module from <a
        href="https://github.com/jedbrown/cmake-modules/">this repository</a>.
      See the CMakeLists.txt from <a
        href="https://github.com/jedbrown/dohp">Dohp</a> for example usage.

      <h3><a name="carriagereturns">How can I put carriage returns in PetscPrintf() statements from Fortran?</a></h3>

      <p>
        You can use the same notation as in C, just put a \n in the string. Note
        that no other C format instruction is supported.
      </p>

      <p>
        Or you can use the Fortran concatenation operator // together with char(10); for example
        'some string'//char(10)//'another string on the next line'
      </p>

      <h3><a name="cxxmethod">How can I implement callbacks using C++ class methods?</a></h3>

      <p>
        Declare the class method <tt>static</tt>. Static methods do not have a <tt>this</tt> pointer, but the <tt>void*</tt> context parameter will usually be cast to a pointer to the class where it can serve the same function. Note that all PETSc callbacks return <tt>PetscErrorCode</tt>.
      </p>

      <h3><a name="functionjacobian">Everyone knows that when you code Newton's method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?</a></h3>

      <p>
        The update in Newton's method is computed as u^{n+1} = u^n - lambda
        * approx-inverse[J(u^n)] * F(u^n). The reason PETSc doesn't default to
        computing both the function and Jacobian at the same time is<br>
      </p>

      <ol>
        <li>
          In order to do the line search, F(u^n - lambda * step) may need to be
          computed for several values of lambda. The Jacobian is not needed for
          those evaluations, and since one does not know in advance which lambda
          will finally be accepted, computing the Jacobian along with each
          function evaluation would waste many extra Jacobians.
        </li>
        <li>
          In the final step if || F(u^p)|| satisfies the convergence criteria
          then a Jacobian need not be computed.
        </li>
      </ol>

      <p>
        You are free to have your "FormFunction" compute as
        much of the Jacobian at that point as you like, keep
        the information in the user context (the final
        argument to FormFunction and FormJacobian), and then
        retrieve the information in your FormJacobian()
        function.
      </p>
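
      <p>
        A hedged sketch of this pattern (the AppCtx structure and its field are hypothetical):
      </p>

      <pre>
typedef struct {
  PetscScalar *saved;   /* hypothetical data computed in FormFunction() and reused in FormJacobian() */
} AppCtx;

PetscErrorCode FormFunction(SNES snes,Vec x,Vec f,void *ptr)
{
  AppCtx *user = (AppCtx*)ptr;
  /* compute f(x) and, as a by-product, fill user-&gt;saved */
  return 0;
}

PetscErrorCode FormJacobian(SNES snes,Vec x,Mat J,Mat Jpre,void *ptr)
{
  AppCtx *user = (AppCtx*)ptr;
  /* assemble Jpre (and J if different) from user-&gt;saved instead of recomputing everything */
  MatAssemblyBegin(Jpre,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(Jpre,MAT_FINAL_ASSEMBLY);
  if (J != Jpre) {
    MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY);
  }
  return 0;
}
    </pre>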

      <h3><a name="jacobianfree">How can I use Newton's method Jacobian free? Can I difference a different function than provided
             with <a href="manualpages/SNES/SNESSetFunction.html">SNESSetFunction()?</a></h3>

      <p>
         The simplest way is with the option -snes_mf; this will use finite differencing of the function provided with SNESSetFunction() to approximate the action of the Jacobian. Since no matrix representation of the
         Jacobian is provided, the only preconditioner that can be used with this option is -pc_type none.  You can provide a custom preconditioner with SNESGetKSP(), KSPGetPC(),
         PCSetType(pc,<a href="manualpages/PC/PCSHELL.html">PCSHELL</a>).
      </p>

      <p>
        The option -snes_mf_operator will use Jacobian free to apply the Jacobian (in the Krylov solvers) but will use whatever matrix you provided with SNESSetJacobian() (assuming you set one) to compute the preconditioner.
      </p>

      <p>
       To write the code (rather than use the options above) use <a href="manualpages/SNES/MatCreateSNESMF.html">MatCreateSNESMF()</a> and pass the resulting matrix object to
       SNESSetJacobian(). For purely matrix-free (like -snes_mf) pass the matrix object for both matrix arguments and pass the function MatMFFDComputeJacobian(). To provide your own approximate Jacobian matrix to compute
       the preconditioner (like -snes_mf_operator), pass this other matrix as the second matrix argument to SNESSetJacobian(). Make sure your provided computejacobian() function calls MatAssemblyBegin/End() separately on BOTH
       matrix arguments to this function. See <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tests/ex7.c.html">src/snes/examples/tests/ex7.c</a>
      </p>

      <p>
       To difference a different function than the one passed to SNESSetFunction() for computing the matrix-free Jacobian-vector product, call
       <a href="manualpages/Mat/MatMFFDSetFunction.html">MatMFFDSetFunction()</a> to set that other function.
       See <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tests/ex7.c.html">src/snes/examples/tests/ex7.c</a>
      </p>
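
      <p>
        In code, a hedged sketch of the two variants described above (B and user are placeholders
        for your own preconditioning matrix and application context):
      </p>

      <pre>
Mat J;
MatCreateSNESMF(snes,&amp;J);

/* purely matrix-free (like -snes_mf): pass J as both matrix arguments */
SNESSetJacobian(snes,J,J,MatMFFDComputeJacobian,NULL);

/* like -snes_mf_operator: J applies the Jacobian, your matrix B builds the preconditioner */
/* SNESSetJacobian(snes,J,B,FormJacobian,&amp;user); */
    </pre>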

      <h3><a name="conditionnumber">How can I determine the condition number of a matrix?</a></h3>

      For small matrices, the condition number can be reliably computed using
      <code>-pc_type svd -pc_svd_monitor</code>.  For larger matrices, you can
      run with <code>-pc_type none -ksp_type gmres -ksp_monitor_singular_value
        -ksp_gmres_restart 1000</code> to get approximations to the condition
      number of the operator. This will generally be accurate for the largest
      singular values, but may overestimate the smallest singular value unless
      the method has converged. Make sure to avoid restarts. To estimate the
      condition number of the preconditioned operator, use <code>-pc_type
        somepc</code> in the last command.

      <h3><a name="invertmatrix">How can I compute the inverse of a matrix in PETSc?</a></h3>

      It is very expensive to compute the inverse of a matrix and very rarely
      needed in practice. We highly recommend avoiding algorithms that need it.
      The inverse of a matrix (dense or sparse) is essentially always dense, so
      begin by creating a dense matrix B and fill it with the identity matrix
      (ones along the diagonal), also create a dense matrix X of the same size
      that will hold the solution. Then factor the matrix you wish to invert with
      MatLUFactor() or MatCholeskyFactor(), call the result A. Then call MatMatSolve(A,B,X)
      to compute the inverse into X. <a href="#schurcomplement">See also</a>.
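
      <p>
        A hedged sketch of this procedure for a sequential n-by-n matrix A that has already been
        assembled (n and the variable names are illustrative):
      </p>

      <pre>
Mat           B,X;       /* dense identity and dense inverse */
IS            isrow,iscol;
MatFactorInfo info;

MatCreateSeqDense(PETSC_COMM_SELF,n,n,NULL,&amp;B);
MatCreateSeqDense(PETSC_COMM_SELF,n,n,NULL,&amp;X);
MatZeroEntries(B);
MatShift(B,1.0);                                   /* B = identity */

MatFactorInfoInitialize(&amp;info);
MatGetOrdering(A,MATORDERINGNATURAL,&amp;isrow,&amp;iscol);
MatLUFactor(A,isrow,iscol,&amp;info);                  /* A now holds its LU factors */
MatMatSolve(A,B,X);                                /* X = inverse of the original A */
    </pre>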

      <h3><a name="schurcomplement">How can I compute the Schur complement, Kbb - Kab * inverse(Kbb) * Kba in PETSc?</a></h3>

      <p>
        It is very expensive to compute the Schur complement of a matrix and very
        rarely needed in practice. We highly recommend avoiding algorithms that
        need it. The Schur complement of a matrix (dense or sparse) is essentially
        always dense, so begin by
      </p>

      <ul>
        <li>forming a dense matrix Kba,</li>
        <li>also create another dense matrix T of the same size.</li>
        <li>
          Then either factor the matrix Kaa directly with MatLUFactor() or
          MatCholeskyFactor(), or use MatGetFactor() followed by
          MatLUFactorSymbolic() and MatLUFactorNumeric() if you wish to
          use an external solver package like SuperLU_Dist.  Call the result A.
        </li>
        <li>Then call MatMatSolve(A,Kba,T).</li>
        <li>Then call MatMatMult(Kab,T,MAT_INITIAL_MATRIX,1.0,&amp;S).</li>
        <li>Now call MatAXPY(S,-1.0,Kbb,SUBSET_NONZERO_PATTERN).</li>
        <li>Followed by MatScale(S,-1.0);</li>
      </ul>

      For computing Schur complements like this it does not make sense to use the
      KSP iterative solvers, since for the many moderate-size systems involved
      a direct factorization is much faster than an iterative solver.  As you can
      see, this requires a great deal of work space and computation, so it is best
      avoided.  However, it is not necessary to assemble the Schur complement
      S in order to solve systems with it.
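
      <p>
        A hedged sketch of the explicit computation outlined in the list above (sequential, with
        dense Kab, Kba, Kbb, and Kaa factored in place):
      </p>

      <pre>
Mat           T,S;
IS            isrow,iscol;
MatFactorInfo info;

MatFactorInfoInitialize(&amp;info);
MatGetOrdering(Kaa,MATORDERINGNATURAL,&amp;isrow,&amp;iscol);
MatLUFactor(Kaa,isrow,iscol,&amp;info);                     /* Kaa now holds its LU factors */

MatDuplicate(Kba,MAT_DO_NOT_COPY_VALUES,&amp;T);            /* dense work matrix, same size as Kba */
MatMatSolve(Kaa,Kba,T);                                 /* T = inverse(Kaa) * Kba */
MatMatMult(Kab,T,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&amp;S);  /* S = Kab * inverse(Kaa) * Kba */
MatAXPY(S,-1.0,Kbb,SUBSET_NONZERO_PATTERN);             /* S = S - Kbb */
MatScale(S,-1.0);                                       /* S = Kbb - Kab * inverse(Kaa) * Kba */
    </pre>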

      Use MatCreateSchurComplement(Kaa,Kaa_pre,Kab,Kba,Kbb,&amp;S) to create
      a matrix that applies the action of S (using Kaa_pre to solve with Kaa),
      but does not assemble.  Alternatively, if you already have a block matrix
      K = [Kaa, Kab; Kba, Kbb] (in some ordering), then you can create index sets
      (IS) isa and isb to address each block, then use MatGetSchurComplement() to
      create the Schur complement and/or an approximation suitable for
      preconditioning.  Since S is generally dense, standard preconditioning
      methods cannot typically be applied directly to Schur complements.  There
      are many approaches to preconditioning Schur complements including using
      the SIMPLE approximation K_bb - Kba inv(diag(Kaa)) Kab to create a sparse
      matrix that approximates the Schur complement (this is returned by default
      for the optional "preconditioning" matrix in MatGetSchurComplement()).  An
      alternative is to interpret the matrices as differential operators and
      apply approximate commutator arguments to find a spectrally equivalent
      operation that can be applied efficiently (see the "PCD" preconditioners
      from Elman, Silvester, and Wathen).  A variant of this is the least squares
      commutator, which is closely related to the Moore-Penrose pseudoinverse,
      and is available in PCLSC which operates on matrices of type
      MATSCHURCOMPLEMENT.

      <h3><a name="fem">Do you have examples of doing unstructured grid finite element computations (FEM) with PETSc?</a></h3>

      There are at least two ways to write a finite element code using PETSc

      <ol>
        <li>
          use DMPlex, which is a high level approach to manage your mesh and
          discretization. <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex62.c.html">SNES
          ex62</a> solves the Stokes equation using this approach.
        </li>
        <li>
          manage the grid data structure yourself and use
          PETSc IS and VecScatter to communicate the required
          ghost point communication. See <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex10d/ex10.c.html">src/snes/examples/tutorials/ex10d/ex10.c</a>
        </li>
      </ol>

      <h3><a name="da_mpi_cart">The PETSc DA object decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?</a></h3>

      <p>
        The MPI_Cart_create() first divides the mesh along the z direction, then
        the y, then the x. DMDA divides along the x, then y, then z. Thus, for
        example, rank 1 of the processes will be in a different part of the mesh
        for the two schemes. To resolve this you can create a new MPI
        communicator that you pass to DMDACreate() that renumbers the process
        ranks so that each physical process shares the same part of the mesh with
        both the DMDA and the MPI_Cart_create(). The code to determine the new
        numbering was provided by Rolf Kuiper.
      </p>

      <pre>
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm NewComm;
int MPI_Rank, NewRank, x, y, z;

// get rank from MPI ordering:
MPI_Comm_rank(MPI_COMM_WORLD, &amp;MPI_Rank);

// calculate coordinates of cpus in MPI ordering:
x = MPI_Rank / (z_procs*y_procs);
y = (MPI_Rank % (z_procs*y_procs)) / z_procs;
z = (MPI_Rank % (z_procs*y_procs)) % z_procs;

// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create communicator with new ranks according to PETSc ordering:
MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &amp;NewComm);

// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
    </pre>

      <h3><a name="redistribute">When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?</a></h3>

      <p>
        For nonsymmetric systems put the appropriate boundary solutions in the
        x vector and use <a href="manualpages/Mat/MatZeroRows.html">MatZeroRows()</a> followed by <a href="manualpages/KSP/KSPSetOperators.html">KSPSetOperators()</a>. For symmetric
        problems use <a href="manualpages/Mat/MatZeroRowsColumns.html">MatZeroRowsColumns()</a> instead.  If you have many Dirichlet
        locations you can use MatZeroRows() (not MatZeroRowsColumns()) with
        -ksp_type preonly -pc_type redistribute (see <a href="manualpages/PC/PCREDISTRIBUTE.html">
        PCREDISTRIBUTE</a>) and PETSc will repartition the parallel matrix for load
        balancing; in this case the new matrix solved remains symmetric even though
        MatZeroRows() is used.
      </p>
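
      <p>
        For the symmetric case, a hedged sketch (the Mat A, the KSP ksp, the array rows[] of nrows
        global Dirichlet row indices, and the Vecs x and b are assumed to exist already; x holds
        the boundary values):
      </p>

      <pre>
/* zero the Dirichlet rows and columns, put 1.0 on their diagonal, and
   adjust b using the boundary values stored in x so the system stays consistent */
MatZeroRowsColumns(A,nrows,rows,1.0,x,b);
KSPSetOperators(ksp,A,A);
KSPSolve(ksp,b,x);
    </pre>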

      <p>
        An alternative approach is, when assembling the matrix (generating values
        and passing them to the matrix), to never include locations for the Dirichlet
        grid points in the vector and matrix, instead accounting for them as you
        put the other values into the right-hand side.
      </p>

      <h3><a name="matlab">How can I get PETSc Vecs and Mats to MATLAB or vice versa?</a></h3>

      There are five ways to work with PETSc and MATLAB

      <ol>
        <li>
          Using the MATLAB Engine, allowing PETSc to automatically call MATLAB
          to perform some specific computations. This does not allow MATLAB to be
          used interactively by the user. See the <a
            href="manualpages/Sys/PetscMatlabEngine.html">PetscMatlabEngine</a>.
        </li>

        <li>
          To save PETSc Mats and Vecs to files that can be read from MATLAB use <a
            href="manualpages/Viewer/PetscViewerBinaryOpen.html">PetscViewerBinaryOpen()</a>
          viewer and VecView() or MatView() to save objects for MATLAB and
          VecLoad() and MatLoad() to get the objects that MATLAB has saved. See
          PetscBinaryRead.m and PetscBinaryWrite.m in share/petsc/matlab for loading and
          saving the objects in MATLAB.
        </li>

        <li>
          You can open a socket connection between MATLAB and PETSc to allow
          sending objects back and forth between an interactive MATLAB session
          and a running PETSc program. See
          <a href="manualpages/Viewer/PetscViewerSocketOpen.html">PetscViewerSocketOpen</a>()
          for access from the PETSc side and PetscBinaryRead.m in share/petsc/matlab for
          access from the MATLAB side.
        </li>

        <li>
          You can save PETSc Vecs (not Mats) with the <a
            href="manualpages/Viewer/PetscViewerMatlabOpen.html">PetscViewerMatlabOpen</a>()
          viewer, which saves .mat files that can then be loaded into MATLAB.
        </li>
        <li>
          We are just beginning to develop in <a
            href="../developers/index.html">petsc master (branch in git)</a> an API to call most of
          the PETSc functions directly from MATLAB; we could use help in
          developing this. See share/petsc/matlab/classes/PetscInitialize.m
        </li>
      </ol>

      <h3><a name="cython">How do I get started with Cython so that I can extend petsc4py?</a></h3>

      Steps I used:

      <ol>
        <li>
          Learn how to <a href="http://docs.cython.org/src/quickstart/build.html">build a Cython module</a>
        </li>
        <li>
          Go through the simple example provided by Denis <a href="http://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython">here</a>.
          Note also the next comment that shows how to create numpy arrays in the Cython and pass them back.
        </li>
        <li>
          Check out <a href="http://docs.cython.org/src/tutorial/numpy.html">this page</a> which tells you how to get fast indexing
        </li>
        <li>
          Have a look at the petsc4py <a href="http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi">array source</a>
        </li>
      </ol>

     <h3><a name="customnormforksp">I would like to compute a custom norm for KSP to use as a convergence test or for monitoring?</a></h3>

     You need to call <a href="manualpages/KSP/KSPBuildResidual.html">KSPBuildResidual</a>() on your KSP object and then
      compute the appropriate norm on the resulting residual. Note that depending on the
      <a href="manualpages/KSP/KSPSetNormType.html">KSPSetNormType</a>() of the method, the norm you compute may not
      match the norm reported by the method. See also
      <a href="manualpages/KSP/KSPSetPCSide.html">KSPSetPCSide</a>()

      <h3><a name="parallelsolvesequentialcode">If I have a sequential program can I use a parallel direct solver?</a></h3>

      <p> Yes, you must ./configure PETSc with MPI (even though you will not use MPI) and the options --download-superlu_dist
        --download-parmetis --download-metis --with-openmp</p>

      <p> Your compiler must support OpenMP. To have the linear solver run in parallel run your program with</p>

      <p> OMP_NUM_THREADS=n ./programname -pc_type lu -pc_factor_mat_solver_type superlu_dist</p>

      <p> where n is the number of threads and should be less than or equal to the number of cores available. Do not expect to get
        great speedups but you should see some reduction in the time for the linear solvers.</p>

      <p> If your code is MPI parallel you can also use these same options to have SuperLU_DIST utilize multiple threads per MPI process for
        the direct solver. Make sure that the OMP_NUM_THREADS you use per MPI process is less than or equal to the number of cores available
        for each MPI process. For example if your compute nodes have 6 cores and you use 2 MPI processes per node then set OMP_NUM_THREADS to 2 or 3.</p>

      <hr>

      <h2><a name="execution">Execution</a></h2>

      <h3><a name="long-link-time">PETSc executables are SO big and take SO long to link.</a></h3>

      We find this annoying as well. On most machines PETSc can use shared
      libraries, so executables should be much smaller; run ./configure with the
      additional option --with-shared-libraries (this is the default). Also, if you have room,
      compiling and linking PETSc on your machine's /tmp disk or similar local
      disk, rather than over the network, will be much faster.

      <h3><a name="petsc-options">PETSc has so many options for my program that it is hard to keep them straight.</a></h3>

      Running the PETSc program with the option -help will print many of the
      options. To see which options were specified for a run, use -options_left:
      it prints any options the user specified that were not actually used by the
      program, as well as all options that were used; this is helpful for
      detecting typos.

      <h3><a name="petsc-log-info">PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?</a></h3>

      You can use the option -info to get more details about the solution
      process. The option -log_view provides details about the distribution of
      time spent in the various phases of the solution process. You can run with
      -ts_view or -snes_view or -ksp_view to see what solver options are being
      used. Run with -ts_monitor -snes_monitor or -ksp_monitor to watch
      convergence of the methods.  -snes_converged_reason and
      -ksp_converged_reason will indicate why and if the solvers have converged.

      <h3><a name="efficient-assembly">Assembling large sparse matrices takes a long time. What can I do make this process faster? or MatSetValues() is so slow, what can I do to speed it up?</a></h3>

      See the <a href="manual.pdf#nameddest=ch_performance">Performance chapter of the users manual</a> for many tips on this.

      <ol>
        <li>
          Preallocate enough space for the sparse matrix. For example, rather
          than calling MatCreateSeqAIJ(comm,n,n,0,NULL,&amp;mat); call
          MatCreateSeqAIJ(comm,n,n,rowmax,NULL,&amp;mat); where rowmax is
          the maximum number of nonzeros expected per row. Or if you know the
          number of nonzeros per row, you can pass this information in instead of
          the NULL argument. See the manual pages for each of the
          MatCreateXXX() routines.
        </li>

        <li>
          Insert blocks of values into the matrix, rather than individual components.
        </li>
      </ol>


      <p>Preallocation of matrix memory is crucial for good performance for large problems, see:</p>

      <ul>
        <li><a href="manual.pdf#sec_matsparse">manual.pdf#sec_matsparse</a></li>
        <li><a href="manualpages/Mat/MatCreateMPIAIJ.html">manualpages/Mat/MatCreateMPIAIJ.html</a></li>
      </ul>

      <p>
        If you can set several nonzeros in a block at the same time, this is faster
        than calling MatSetValues() for each individual matrix entry.
      </p>
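
      <p>
        A hedged sketch combining both ideas, preallocation and block insertion (the sizes, the
        estimate of 5 nonzeros per row, and the inserted 2x2 block are illustrative):
      </p>

      <pre>
Mat         A;
PetscInt    n = 100,i,*nnz;
PetscInt    rows[2] = {0,1},cols[2] = {0,1};
PetscScalar vals[4] = {4.0,-1.0,-1.0,4.0};      /* 2x2 block, stored row-major */

PetscMalloc1(n,&amp;nnz);
for (i=0; i&lt;n; i++) nnz[i] = 5;                 /* estimated nonzeros per row */
MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,0,nnz,&amp;A);  /* preallocated: no mallocs during assembly */
PetscFree(nnz);

MatSetValues(A,2,rows,2,cols,vals,INSERT_VALUES);  /* insert a whole block in one call */

MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
    </pre>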

      <p>
        It is best to generate most matrix entries on the process they belong to
        (so they do not have to be stashed and then shipped to the owning process).
        Note: it is fine to have some entries generated on the "wrong" process,
        just not many.
      </p>

      <h3><a name="log-view">How can I generate performance summaries with PETSc?</a></h3>

      Use the runtime option -log_view. See the <a href="manual.pdf#nameddest=ch_performance">Performance chapter of the users manual</a>
      for information on interpreting the summary data. If using the PETSc
      (non)linear solvers, one can also specify -snes_view or -ksp_view for
      a printout of solver info. Only the highest level PETSc object used needs
      to specify the view option.

      <h3><a name="parallel-roundoff">Why do I get different answers on a different numbers of processors?</a></h3>

      Most commonly, you are using a preconditioner which behaves differently
      based upon the number of processors, such as Block-Jacobi which is the
      PETSc default. However, since computations are reordered in parallel, small
      roundoff errors will still be present with identical mathematical
      formulations. If you set a tighter linear solver tolerance (using
      -ksp_rtol), the differences will decrease.

      <h3><a name="mg-log">How do I know the amount of time spent on each level of the multigrid solver/preconditioner?</a></h3>

      Run with <code>-log_view</code> and <code>-pc_mg_log</code>

      <h3><a name="datafiles">Where do I get the input matrices for the examples?</a></h3>

      Some examples use ${DATAFILESPATH}/matrices/medium and other files. These
      test matrices in PETSc binary format can be found with anonymous ftp from
      <a href="http://ftp.mcs.anl.gov">ftp.mcs.anl.gov</a> in the directory
      pub/petsc/Datafiles/matrices. They are not included with the PETSc distribution in the
      interest of reducing the distribution size. You can obtain many of the files in pub/petsc/Datafiles.tar.gz

      <h3><a name="info">When I dump some matrices and vectors to binary, I seem to be generating some empty files with .info extensions. What's the deal with these?</a></h3>

      PETSc binary viewers put some additional information, such as the matrix
      block size, into .info files. This is harmless, but if you really don't
      want the files you can use the option -viewer_binary_skip_info or call
      PetscViewerBinarySkipInfo(). Note that you need to call
      PetscViewerBinarySkipInfo() before PetscViewerFileSetName(); in
      other words you cannot use PetscViewerBinaryOpen() directly.
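
      <p>
        A hedged sketch of the longer calling sequence that allows skipping the .info file
        (A is assumed to be an already assembled Mat; the file name is illustrative):
      </p>

      <pre>
PetscViewer viewer;
PetscViewerCreate(PETSC_COMM_WORLD,&amp;viewer);
PetscViewerSetType(viewer,PETSCVIEWERBINARY);
PetscViewerBinarySkipInfo(viewer);              /* must precede PetscViewerFileSetName() */
PetscViewerFileSetMode(viewer,FILE_MODE_WRITE);
PetscViewerFileSetName(viewer,"matrix.dat");
MatView(A,viewer);
PetscViewerDestroy(&amp;viewer);
    </pre>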

      <h3><a name="slowerparallel">Why is my parallel solver slower than my sequential solver, or I have poor speed-up?</a></h3>

      This can happen for many reasons:

      <ul>
        <li>
          First make sure it is truly the time in KSPSolve() that is slower (by
          running the code with <a href="#log-view">-log_view</a>).
          Often the slower time is in <a href="#slow">generating the matrix</a>
          or some other operation.
        </li>

        <li>
          There must be enough work for each process to outweigh the
          communication time. We recommend an absolute minimum of about 10,000
          unknowns per process; 20,000 or more is better.
        </li>

        <li>
          Make sure the <a href="#computers">communication speed of the parallel computer</a>
          is good enough for parallel solvers.
        </li>

        <li>
          Check the number of solver iterations of the parallel solver against
          the sequential solver. Most preconditioners require more iterations
          when used on more processes; this is particularly true for block
          Jacobi, the default parallel preconditioner. You can try -pc_type asm
          (<a href="manualpages/PC/PCASM.html">PCASM</a>),
          whose iterations scale a bit better with more processes.  You may also consider
          multigrid preconditioners like <a href="manualpages/PC/PCMG.html">PCMG</a>
          or BoomerAMG in <a href="manualpages/PC/PCHYPRE.html">PCHYPRE</a>.
        </li>
      </ul>

      <h3><a name="pipelined">What steps are necessary to make the pipelined solvers execute efficiently?</a></h3>

      Pipelined solvers like
      <a href="manualpages/KSP/KSPPGMRES.html">KSPPGMRES</a>,
      <a href="manualpages/KSP/KSPPIPECG.html">KSPPIPECG</a>,
      <a href="manualpages/KSP/KSPPIPECR.html">KSPPIPECR</a>, and
      <a href="manualpages/KSP/KSPGROPPCG.html">KSPGROPPCG</a>
      may require special MPI configuration to effectively overlap reductions with computation.
      In general, this requires an MPI-3 implementation, an implementation that supports multiple threads, and use of a "progress thread".
      Consult the documentation from your vendor or computing facility for more.

      <dl>
        <dt>MPICH</dt>
        <dd>
          MPICH version 3.0 and later implements the MPI-3 standard and the default configuration supports use of threads.
          Use of a progress thread is configured by setting the environment variable <tt>MPICH_ASYNC_PROGRESS=1</tt>.
        </dd>
        <dt>Cray MPI</dt>
        <dd>
          Cray MPT-5.6 supports MPI-3, but requires the environment variable <tt>MPICH_MAX_THREAD_SAFETY=multiple</tt> for threads, plus either <tt>MPICH_ASYNC_PROGRESS=1</tt> or <tt>MPICH_NEMESIS_ASYNC_PROGRESS=1</tt>.
        </dd>
      </dl>

      <h3><a name="singleprecision">When using PETSc in single precision mode (--with-precision=single when running ./configure) are the operations done in single or double precision?</a></h3>

      PETSc does NOT do any explicit conversion of single precision to double
      before performing computations; thus what happens depends on the hardware
      and the compiler. For example, the compiler could choose to put the
      single precision numbers into the usual double precision registers and then
      use the usual double precision floating point unit, or it could use SSE2
      instructions that work directly on the single precision numbers. It is
      a bit of a mystery what decisions get made sometimes. There may be compiler
      flags in some circumstances that can affect this.

      <h3><a name="newton">Why is Newton's method (SNES) not converging, or converges slowly?</a></h3>

      Newton's method may not converge for many reasons, here are some of the most common.

      <ul>
        <li>The Jacobian is wrong (or correct in sequential but not in parallel).</li>
        <li>The linear system <a href="#kspdiverged">is not solved</a> or is not solved accurately enough.</li>
        <li>The Jacobian system has a singularity that the linear solver is not handling.</li>
        <li>There is a bug in the function evaluation routine.</li>
        <li>The function is not continuous or does not have continuous first derivatives (e.g. phase change or TVD limiters).</li>
        <li>
          The equations may not have a solution (e.g. limit cycle instead of
          a steady state) or there may be a "hill" between the initial guess and
          the steady state (e.g. reactants must ignite and burn before reaching
          a steady state, but the steady-state residual will be larger during
          combustion).
        </li>
      </ul>

      Here are some of the ways to help debug lack of convergence of Newton.

      <ul>
        <li>Run with the options <code>-snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason</code>.
          <ul>
            <li>
              If the linear solve does not converge, check if the Jacobian is
              correct, then see <a href="#kspdiverged">this question</a>.
            </li>
            <li>
              If the preconditioned residual converges, but the true residual
              does not, the preconditioner may be singular.
            </li>
            <li>
              If the linear solve converges well, but the line search fails, the
              Jacobian may be incorrect.
            </li>
          </ul>
        </li>

        <li>
          Run with <code>-pc_type lu</code> or <code>-pc_type svd</code> to see
          if the problem is a poor linear solver
        </li>

        <li>
          Run with <code>-mat_view</code> or <code>-mat_view draw</code> to see
          if the Jacobian looks reasonable
        </li>

        <li>
          Run with <code>-snes_test_jacobian -snes_test_jacobian_view</code> to see if the
          Jacobian you are using is wrong. Compare the output when you add
          <code>-mat_fd_type ds</code> to see if the result is sensitive to the
          choice of differencing parameter.
        </li>

        <li>
          Run with <code>-snes_mf_operator -pc_type lu</code> to see if the
          Jacobian you are using is wrong. If the problem is too large for
          a direct solve, try <code>-snes_mf_operator -pc_type ksp -ksp_ksp_rtol
            1e-12</code>. Compare the output when you add <code>-mat_mffd_type
            ds</code> to see if the result is sensitive to choice of differencing
          parameter.
        </li>

        <li>Run on one processor to see if the problem is only in parallel.</li>

        <li>
          Run with <code>-snes_linesearch_monitor</code> to see if the line search
          is failing (this is usually a sign of a bad Jacobian). Use -info in PETSc 3.1
          and older versions, <code>-snes_ls_monitor</code> in PETSc 3.2
          and <code>-snes_linesearch_monitor</code> in PETSc 3.3 and later.
        </li>

        <li>
          Run with <code>-info</code> to get more detailed information on the
          solution process.
        </li>
      </ul>

      Here are some ways to help the Newton process if everything above checks out

      <ul>
        <li>
          Run with grid sequencing (<code>-snes_grid_sequence</code> if working
          with a DM is all you need) to generate better initial guess on your
          finer mesh
        </li>

        <li>
          Run with quad precision (./configure with --with-precision=__float128
          --download-f2cblaslapack with PETSc 3.2 and later and recent versions
          of the GNU compilers)
        </li>

        <li>
          Change the units (nondimensionalization), boundary condition scaling,
          or formulation so that the Jacobian is better conditioned. See <a href="http://en.wikipedia.org/wiki/Buckingham_π_theorem">http://en.wikipedia.org/wiki/Buckingham_π_theorem</a>
        </li>

        <li>
          Mollify features in the function that do not have continuous first
          derivatives (often occurs when there are "if" statements in the
          residual evaluation, e.g.  phase change or TVD limiters). Use
          a variational inequality solver (SNESVINEWTONRSLS) if the discontinuities are
          of fundamental importance.
        </li>

        <li>
          Try a trust region method (<code>-snes_type newtontr</code>; you may have to adjust
          parameters).
        </li>

        <li>
          Run with some continuation parameter from a point where you know the
          solution, see TSPSEUDO for steady-states.
        </li>

        <li>
          There are homotopy solver packages like PHCpack that can get you all
          possible solutions (and tell you that it has found them all) but those
          are not scalable and cannot solve anything but small problems.
        </li>
      </ul>

      <h3><a name="kspdiverged">Why is the linear solver (KSP) not converging, or converges slowly?</a></h3>

      <p>
        Always run with <code>-ksp_converged_reason -ksp_monitor_true_residual</code>
        when trying to learn why a method is not converging. Common reasons for
        KSP not converging are
      </p>

      <ul>
        <li>
          The equations are singular by accident (e.g. forgot to impose boundary
          conditions). Check this for a small problem using <code>-pc_type svd
            -pc_svd_monitor</code>.
        </li>

        <li>
          The equations are intentionally singular (e.g.  constant null space),
          but the Krylov method was not informed, see MatSetNullSpace().
        </li>

        <li>
          The equations are intentionally singular and MatSetNullSpace() was
          used, but the right hand side is not consistent. You may have to call
          MatNullSpaceRemove() on the right hand side before calling KSPSolve().
          See MatSetTransposeNullSpace()
        </li>

        <li>
          The equations are indefinite so that standard preconditioners don't
          work. Usually you will know this from the physics, but you can check
          with <code>-ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none</code>.
          For simple saddle point problems, try <code>-pc_type fieldsplit
            -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point</code>.
          For more difficult problems, read the literature to find robust methods
          and ask petsc-users@mcs.anl.gov or petsc-maint@mcs.anl.gov if you want
          advice about how to implement them.
        </li>

        <li>
          If the method converges in preconditioned residual, but not in true
          residual, the preconditioner is likely singular or nearly so. This is
          common for saddle point problems (e.g. incompressible flow) or strongly
          nonsymmetric operators (e.g. low-Mach hyperbolic problems with large
          time steps).
        </li>

        <li>
          The preconditioner is too weak or is unstable. See if <code>-pc_type
            asm -sub_pc_type lu</code> improves the convergence rate. If GMRES is
          losing too much progress in the restart, see if longer restarts help
          <code>-ksp_gmres_restart 300</code>. If a transpose is available, try
          <code>-ksp_type bcgs</code> or other methods that do not require
          a restart. (Note that convergence with these methods is frequently
          erratic.)
        </li>

        <li>
          The preconditioner is nonlinear (e.g. a nested iterative solve), try
          <code>-ksp_type fgmres</code> or <code>-ksp_type gcr</code>.
        </li>

        <li>
          You are using geometric multigrid, but some equations (often boundary
          conditions) are not scaled compatibly between levels. Try
          <code>-pc_mg_galerkin both</code> to algebraically construct a correctly
          scaled coarse operator or make sure that all the equations are scaled
          in the same way if you want to use rediscretized coarse levels.
        </li>

        <li>
          The matrix is very ill-conditioned. Check the <a href="#conditionnumber">condition number</a>.
          <ul>
            <li> Try to improve it by choosing the relative scaling of components/boundary conditions.</li>
            <li>Try <code>-ksp_diagonal_scale -ksp_diagonal_scale_fix</code>.</li>
            <li>Perhaps change the formulation of the problem to produce more friendly algebraic equations.</li>
          </ul>
        </li>

        <li>
          The matrix is nonlinear (e.g. evaluated using finite differencing of
          a nonlinear function). Try different differencing parameters,
          <code>./configure --with-precision=__float128 --download-f2cblaslapack</code>,
          check if it converges in "easier" parameter regimes.
        </li>

        <li>A symmetric method is being used for a non-symmetric problem.</li>

        <li>
          Classical Gram-Schmidt is becoming unstable, try <code>-ksp_gmres_modifiedgramschmidt</code>
          or use a method that orthogonalizes differently, e.g. <code>-ksp_type gcr</code>.
        </li>
      </ul>



      <hr>

      <h2><a name="debugging">Debugging</a></h2>

      <h3><a name="debug-ibm">How do I turn off PETSc signal handling so I can use the -C option on xlF?</a></h3>

      <p>Immediately after calling PetscInitialize() call PetscPopSignalHandler()</p>

      <p>
        Some Fortran compilers including the IBM xlf, xlF etc compilers have
        a compile option (-C for IBM's) that causes all array access in Fortran
        to be checked that they are in-bounds. This is a great feature but does
        require that the array dimensions be set explicitly, not with a *.
      </p>

      <h3><a name="start_in_debugger-doesnotwork">How do I debug if -start_in_debugger does not work on my machine?</a></h3>

      <p> On newer Mac OS X machines one has to be in the admin group to be able to use the debugger.</p>

      <p>
        On newer Ubuntu Linux machines one has to disable <a href="https://wiki.ubuntu.com/Security/Features#ptrace">ptrace_scope</a>,
        for example with "echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope", to get
        -start_in_debugger working.
      </p>

      <p>
        If start_in_debugger does not really work on your OS, for a uniprocessor
        job, just try the debugger directly, for example: gdb ex1. You can also
        use Totalview which is a good graphical parallel debugger.
      </p>

      <h3><a name="debug-hang">How do I see where my code is hanging?</a></h3>

      You can use the -start_in_debugger option to start all processes in the
      debugger (each will come up in its own xterm) or run in Totalview. Then use
      cont (for continue) in each xterm. Once you are sure that the program is
      hanging, hit control-c in each xterm and then use 'where' to print a stack
      trace for each process.

      <h3><a name="debug-inspect">How can I inspect Vec and Mat values when in the debugger?</a></h3>

      <p>
        I will illustrate this with gdb, but it should be similar on other
        debuggers. You can look at local Vec values directly by obtaining the
        array. For a Vec v, we can print all local values using
      </p>

      <pre>
(gdb) p ((Vec_Seq*) v-&gt;data)-&gt;array[0]@v-&gt;map.n
    </pre>

      <p>
        However, this becomes much more complicated for a matrix.  Therefore, it
        is advisable to use the default viewer to look at the object. For a Vec
        v and a Mat m, this would be
      </p>

      <pre>
  (gdb) call VecView(v, 0)
  (gdb) call MatView(m, 0)
    </pre>

      <p>or with a communicator other than MPI_COMM_WORLD,</p>

      <pre>
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m-&gt;comm))
    </pre>

      <p>
        Totalview 8.8.0 has a new feature that allows libraries to provide their
        own code to display objects in the debugger. Thus in theory each PETSc
        object, Vec, Mat etc could have custom code to print values in the
        object. We have only done this for the most elementary display of Vec and
        Mat. See the routine TV_display_type() in src/vec/vec/interface/vector.c
        for an example of how these may be written. Contact us if you would like
        to add more.
      </p>

      <h3><a name="debug-fp">How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?</a></h3>
      <p>
        The best way to locate floating point exceptions is to use a debugger.
        On supported architectures (including Linux and glibc-based systems), just run in a debugger and pass -fp_trap to the PETSc application.
        This will activate signaling exceptions and the debugger will break on the line that first divides by zero or otherwise generates an exception.
        Without a debugger, running with -fp_trap in debug mode will only identify the function in which the error occurred, but not the line or the type of exception.
        If -fp_trap is not supported on your architecture, consult the documentation for your debugger since there is likely a way to have it catch exceptions.
      </p>

      <h3><a name="libimf">Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory.</a></h3>

      <p>
        The Intel compilers use shared libraries (like libimf) that cannot be
        found, by default, at run time. When using the Intel compilers (and running the
        resulting code) you must make sure that the proper Intel initialization
        scripts are run.  This is usually done by putting some code into your
        .cshrc, .bashrc, .profile etc file. Sometimes on batch systems that
        do not access your initialization files (like .cshrc) you must include
        the initialization calls in your batch file submission.
      </p>

      For example, on my Mac using csh I have the following in my .cshrc file

      <pre>
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
    </pre>

      in my .profile I have

      <pre>
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
    </pre>

      <h3><a name="objecttypenotset">What does Object Type not set: Argument # n mean?</a></h3>

      Many operations on PETSc objects require that the specific type of the
      object be set before the operation is performed. You must call
      XXXSetType() or XXXSetFromOptions() before you make the offending call. For
      example, MatCreate(comm,&amp;A); MatSetValues(A,....); will not work.  You
      must add MatSetType(A,...) or MatSetFromOptions(A); before the call to
      MatSetValues();
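
      <p>
        For example, a working sequence might look like this (n and the inserted entry are illustrative):
      </p>

      <pre>
Mat         A;
PetscInt    n = 10,row = 0,col = 0;
PetscScalar one = 1.0;

MatCreate(PETSC_COMM_WORLD,&amp;A);
MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);
MatSetFromOptions(A);                               /* sets the type (default or from -mat_type) */
MatSetUp(A);                                        /* or provide preallocation here */
MatSetValues(A,1,&amp;row,1,&amp;col,&amp;one,INSERT_VALUES);   /* now legal: the type is set */
MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
    </pre>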

      <h3><a name="split">What does <code>Error detected in PetscSplitOwnership() about "sum of local lengths ...":</code> mean?</a></h3>

      In a previous call to VecSetSizes(), MatSetSizes(), VecCreateXXX() or
      MatCreateXXX() you passed in local and global sizes that do not make sense
      for the correct number of processors. For example if you pass in a local
      size of 2 and a global size of 100 and run on two processors, this cannot
      work since the sum of the local sizes is 4, not 100.

      <h3><a name="valgrind">What does <code>Corrupt argument or Caught signal or SEQV or segmentation violation or bus error</code> mean? Can I use valgrind to debug memory corruption issues?</a></h3>

      <p>
        Sometimes it can mean an argument to a function is invalid. In Fortran
        this may be caused by forgetting to list an argument in the call,
        especially the final ierr.
      </p>

      <p>
        Otherwise it is usually caused by memory corruption; that is somewhere
        the code is writing out of array bounds. To track this down rerun the
        debug version of the code with the option -malloc_debug. Occasionally the
        code may crash only with the optimized version, in that case run the
        optimized version with -malloc_debug. If you determine the problem is
        from memory corruption you can put the macro CHKMEMQ in the code near the
        crash to determine exactly what line is causing the problem.
      </p>

      If -malloc_debug does not help: on GNU/Linux and Apple Mac OS X machines
      you can try using <a href="http://valgrind.org/">valgrind</a>
      to look for memory corruption; make sure valgrind is installed first.

      You might also consider using <a href="http://drmemory.org">http://drmemory.org</a>, which has
      support for GNU/Linux, Apple Mac OS and Microsoft Windows machines. (Note we haven't tried this ourselves.)

      <ul>
        <li>Recommend building PETSc with <code>--download-mpich --with-debugging</code> [debugging is enabled by default]</li>
        <li>Compile application code with this build of PETSc</li>
        <li>run with valgrind using: <code>${PETSC_DIR}/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS</code></li>
        <li>or invoke valgrind directly with: <code>mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS</code></li>
      </ul>

      Notes:

      <ul>
        <li>option <code>--with-debugging</code> enables valgrind to give stack trace with additional source-file:line-number info.</li>
        <li>option <code>--download-mpich</code> gives valgrind clean MPI - hence the recommendation.</li>
        <li>Regarding other MPI implementations: Open MPI should also work, but MPICH1 will not.</li>
        <li>if <code>--download-mpich</code> is used - mpiexec will be in PETSC_ARCH/bin</li>
        <li><code>--log-file=valgrind.log.%p</code> option tells valgrind to store the output from each proc in a different file [as %p i.e PID, is different for each MPI proc].</li>
        <li>On Apple you need the additional valgrind option <code>--dsymutil=yes</code></li>
        <li>memcheck will not find certain array accesses that violate static array declarations, so if memcheck runs clean you can try <code>--tool=exp-ptrcheck</code> instead.</li>
      </ul>

      <h3><a name="zeropivot">What does <code>Detected zero pivot in LU factorization</code> mean?</a></h3>

      <p>
        A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not
        always mean that the matrix is singular. You can use '-pc_factor_shift_type
        NONZERO -pc_factor_shift_amount [amount]' or '-pc_factor_shift_type
        POSITIVE_DEFINITE'; '-[level]_pc_factor_shift_type NONZERO
        -pc_factor_shift_amount [amount]'   or '-[level]_pc_factor_shift_type
        POSITIVE_DEFINITE' to prevent the zero pivot. [level] is "sub" when lu,
        ilu, cholesky, or icc are employed in each individual block of the bjacobi
        or ASM preconditioner; and [level] is "mg_levels" or "mg_coarse" when lu,
        ilu, cholesky, or icc are used inside multigrid smoothers or to the coarse
        grid solver.  See PCFactorSetShiftType(), PCFactorSetShiftAmount().
      </p>

      <p>This error can also happen if your matrix is singular, see MatSetNullSpace() for how to handle this.</p>

      <p>
        If this error occurs in the zeroth row of the matrix, it is likely you have
        an error in the code that generates the matrix.
      </p>

      <h3><a name="xwindows">You create <code>Draw</code> windows or <code>ViewerDraw</code> windows or use options <code>-ksp_monitor_lg_residualnorm</code> or <code>-snes_monitor_lg_residualnorm</code> and the program seems to run OK but windows never open.</a></h3>

      The libraries were compiled without support for X windows. Make sure that
      ./configure was run with the option <code>--with-x</code>

      <h3><a name="memory">The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory.</a></h3>

      Problem: Possibly some of the following:

      <ol>
        <li>You are creating new PETSc objects but never freeing them.</li>

        <li>There is a memory leak in PETSc or your code.</li>

        <li>
          Something much more subtle: (if you are using Fortran). When you
          declare a large array in Fortran, the operating system does not
          allocate all the memory pages for that array until you start using the
          different locations in the array. Thus, in a code, if at each step you
          start using later values in the array your virtual memory usage will
          "continue" to increase as measured by <code>ps</code> or <code>top</code>.
        </li>

        <li>
          You are running with the -log, -log_mpe, or -log_all option. With these options, a great
          deal of logging information is stored in memory until the conclusion of
          the run.
        </li>

        <li>
          You are linking with the MPI profiling libraries; these cause logging
          of all MPI activities. Another symptom is that at the conclusion of the run
          it may print a message about writing log files.
        </li>
      </ol>

      Cures:

      <ol>
        <li>
          Run with the -malloc_debug option and -malloc_dump. Or use the commands
          PetscMallocDump() and PetscMallocLogDump() sprinkled in your code to
          track memory that is allocated and not later freed. Use the commands
          PetscMallocGetCurrentUsage() and PetscMemoryGetCurrentUsage() to
          monitor memory allocated and PetscMallocGetMaximumUsage() and PetscMemoryGetMaximumUsage()
          for total memory used as the code progresses.
        </li>

        <li>This is just the way Unix works and is harmless.</li>

        <li>
          Do not use the -log, -log_mpe, or -log_all option, or use
          PetscLogEventDeactivate() or PetscLogEventDeactivateClass()
          to turn off logging of specific events.
        </li>

        <li>Make sure you do not link with the MPI profiling libraries.</li>
      </ol>

      <h3><a name="key">When calling MatPartitioningApply() you get a message Error! Key 16615 not found</a></h3>

      The graph of the matrix you are using is not symmetric. You must use symmetric matrices for partitioning.

      <h3><a name="gmres">With GMRES At restart the second residual norm printed does not match the first.</a></h3>

      <pre>
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 &lt;----- At restart the residual norm is printed a second time
    </pre>

      <p>
        <b>Problem</b>: Actually this is not surprising. GMRES computes the norm of the
        residual at each iteration via a recurrence relation between the norms of
        the residuals at the previous iterations and quantities computed at the
        current iteration; it does not compute it directly as || b - A x^{n} ||.
        Sometimes, especially with an ill-conditioned matrix, or computation of the
        matrix-vector product via differencing, the residual norms computed by
        GMRES start to "drift" from the correct values. At the restart, we compute
        the residual norm directly, hence the "strange stuff," that is, the difference
        printed. The drifting, if it remains small, is harmless (it does not affect the
        accuracy of the solution that GMRES computes).
      </p>

      <p>
        <b>Cure</b>: There really isn't a cure, but if you use a more powerful
        preconditioner the drift will often be smaller and less noticeable. Or if
        you are running matrix-free you may need to tune the matrix-free
        parameters.
      </p>

      <h3><a name="doubleits">Why do some Krylov methods seem to print two residual norms per iteration?</a></h3>

      <pre>
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
    </pre>

      <p>
        Some Krylov methods, for example tfqmr, actually have a "sub-iteration"
        of size 2 inside the loop; each of the two substeps has its own matrix
        vector product and application of the preconditioner and updates the
        residual approximations. This is why you get this "funny" output where it
        looks like there are two residual norms per iteration. You can also think
        of it as twice as many iterations.
      </p>

      <h3><a name="dylib">Unable to locate PETSc dynamic library /home/balay/spetsc/lib/libg/linux/libpetsc</a></h3>

      <p>
        When using DYNAMIC libraries, the libraries cannot be moved after they are
        installed. This can also happen on clusters where the paths are
        different on the (run) nodes than on the (compile) front-end. Do not use
        dynamic libraries &amp; shared libraries: run ./configure with
        <code>--with-shared-libraries=0 --with-dynamic-loading=0</code>.
        Note that the --with-dynamic-loading option was removed in petsc-3.5.
      </p>

      <h3><a name="bisect">How do I determine what update to PETSc broke my code?</a></h3>

      If at some point [in petsc code history] you had a working code, but the
      latest petsc code broke it, it is possible to determine the petsc code change
      that might have caused this behavior. This is achieved by:

      <ul>
        <li>using <a href="http://git-scm.com/">Git</a> to access PETSc sources</li>

        <li>knowing the Git commit for the <b>known working</b> version of PETSc</li>

        <li>knowing the Git commit for the <b>known broken</b> version of PETSc</li>

        <li>using the <a href="https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html">bisect</a> functionality of Git</li>
      </ul>

      This process can be as follows:

      <ul>
        <li>get petsc development (master branch in git) sources:
          <ul>
            <li><code>git clone https://bitbucket.org/petsc/petsc</code></li>
          </ul>
        </li>

        <li>
          Find the <b>good</b> and <b>bad</b> markers to
          start the bisection process. This can be done either by checking
          <code>git log</code> or <code>gitk</code> or <a href="https://bitbucket.org/petsc/petsc">https://bitbucket.org/petsc/petsc</a>
          or the web history of petsc-release clones. Let's say the known bad
          commit is 21af4baa815c and the known good commit is 5ae5ab319844.
        </li>

        <li>
          Now start the bisection process with these known revisions. [build PETSc, and test your code to confirm known good/bad behavior]
          <ul>
            <li><code>git bisect start 21af4baa815c 5ae5ab319844</code></li>
            <li>&lt;build/test/confirm-bad&gt;</li>
            <li><code>git bisect bad</code></li>
            <li>&lt;build/test/confirm-good&gt;</li>
            <li><code>git bisect good</code></li>
          </ul>
        </li>

        <li>
          Now until done - keep bisecting, building PETSc, and testing your code with it and determine if the code is working or not.
          After something like 5-15 iterations, <code>git bisect</code> will
          pin-point the exact code change that resulted in the difference in
          application behavior.
        </li>

        <li>
          See <a href="https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html">git-bisect(1)</a>
          and the <a href="http://git-scm.com/book/en/Git-Tools-Debugging-with-Git">debugging
          section of the Git Book</a> for more debugging tips.
        </li>
      </ul>

      <hr>

      <h2><a name="shared-libraries">Shared Libraries</a></h2>

      <h3><a name="install-shared">Can I install PETSc libraries as shared libraries?</a></h3>

      Yes. Use <code>./configure --with-shared-libraries</code>.

      <h3><a name="why-use-shared">Why should I use shared libraries?</a></h3>

      When you link to shared libraries, the function symbols from the shared
      libraries are not copied in the executable. This way the size of the
      executable is considerably smaller than when using regular libraries.  This
      helps in a couple of ways:

      <ol>
        <li>saves disk space when more than one executable is created, and</li>
        <li>improves the compile/link time immensely, because a much smaller file (the executable) has to be written to the disk.</li>
      </ol>

      <h3><a name="link-shared">How do I link to the PETSc shared libraries?</a></h3>

      By default, the compiler should pick up the shared libraries instead of the regular ones. Nothing special should be done for this.

      <h3><a name="link-regular-lib">What If I want to link to the regular .a library files?</a></h3>

      You must run ./configure with the option --with-shared-libraries=0 (you
      can use a different PETSC_ARCH for this build so you can easily switch
      between the two).

      <h3><a name="move-shared-exec">What do I do if I want to move my executable to a different machine?</a></h3>

      <p>
        You would also need to have access to the shared libraries on this new
        machine. The other alternative is to build the executable without shared
        libraries by first deleting the shared libraries, and then creating the
        executable.
      </p>

    </div>

    
  </body>
</html>