<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head> <link rel="canonical" href="http://www.mcs.anl.gov/petsc/petsc-current/docs/faq.html" />
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>Documentation: FAQ</title>
</head>
<body bgcolor="#ffffff">
<div id="version" align=right><b>petsc-3.10.3 2018-12-18</b></div>
<div id="bugreport" align=right><a href="mailto:petsc-maint@mcs.anl.gov?subject=Typo or Error in Documentation &body=Please describe the typo or error in the documentation: petsc-3.10.3 v3.10.3 docs/faq.html "><small>Report Typos and Errors</small></a></div>
<h1>Documentation: FAQ</h1>
<style type="text/css">
#answers > h3 a[name] {
color: red;
}
#answers > ol > li {
list-style-type: lower-alpha;
padding: .5em 0;
}
</style>
<div id="main">
<h4><a href="#general">General</a></h4>
<ul>
<li><a href="#petsc-mailing-list">How can I subscribe to the PETSc mailing lists?</a></li>
<li><a href="#book">Any useful books on numerical computing?</a></li>
<li><a href="#computers">What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?</a></li>
<li><a href="#license">What kind of license is PETSc released under?</a></li>
<li><a href="#why-c">Why is PETSc programmed in C, instead of Fortran or C++?</a> </li>
<li><a href="#logging-overhead">Does all the PETSc error checking and logging reduce PETSc's efficiency?</a></li>
<li><a href="#work-efficiently">How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?</a></li>
<li><a href="#complex">For complex numbers will I get better performance using C or C++?</a></li>
<li><a href="#different">How come when I run the same program on the same number of processes I get "different" answers?</a></li>
<li><a href="#differentiterations">How come when I run the same linear solver with different number of processes it takes a different number of iterations?</a></li>
<li><a href="#gpus">Can PETSc use GPUs to speed up the computation time?</a></li>
<li><a href="#precision">Can I run PETSc with extended precision?</a></li>
<li><a href="#qd">Why doesn't PETSc use QD to implement support for extended precision?</a></li>
</ul>
<h4><a href="#installation">Installation</a></h4>
<ul>
<li><a href="#already-installed">How do I begin using PETSc if the software has already been completely built and installed by someone else?</a></li>
<li><a href="#reduce-disk-space">The PETSc distribution is SO large. How can I reduce my disk space usage?</a></li>
<li><a href="#petsc-uni">I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?</a></li>
<li><a href="#no-x">Can I install PETSc to not use X windows (either under Unix or Windows with gcc, the gnu compiler)?</a></li>
<li><a href="#use-mpi">Why do you use MPI</a>?</li> <li><a href="#mpi-compilers">What do I do if my MPI compiler wrappers are invalid</a>?</li>
<li><a href="#with-64-bit-indices">When should/can I use the ./configure option --with-64-bit-indices</a>?</li>
<li><a href="#compilererror">What if I get an internal compiler error?</a></li>
<li><a href="#install-petsc4py-dev">How do I install petsc4py with the development PETSc</a>?</li>
<li><a href="#gfortran">What Fortran compiler do you recommend for the Apple Mac OS X?</a></li>
<li><a href="#packagelocations">How can I find the URL locations of the packages you install using --download-PACKAGE?</a></li>
<li><a href="#multiplempis">How to fix the problem: PETSc was configured with one MPICH or OpenMPI mpi.h version but now appears to be compiling using a different MPICH OpenMPI mpi.h version</a></li>
<li><a href="#PetscOptionsInsertFile">What does it mean when make test gives:</a>
<pre>
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
</pre></li>
</ul>
<h4><a href="#usage">Usage</a></h4>
<ul>
<li><a href="#redirectstdout">How can I redirect PETSc's stdout and stderr when programming with a GUI interface in Windows Developer Studio or to C++ streams?</a></li>
<li><a href="#hypre">I want to use hypre boomerAMG without GMRES but when I run -pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly I don't get a very accurate answer!</a></li>
<li><a href="#nosaij">You have AIJ and BAIJ matrix formats, and SBAIJ for symmetric storage, how come no SAIJ?</a></li>
<li><a href="#domaindecomposition">How do I use PETSc for domain decomposition?</a></li>
<li><a href="#blocks">Can I create BAIJ matrices with different size blocks for different block rows?</a></li>
<li><a href="#mpi-vec-access">How do I access values from a parallel PETSc vector on a different process than the one that owns the values?</a></li>
<li><a href="#mpi-vec-to-seq-vec">How do I collect all the values from a parallel PETSc vector into a sequential vector on each processor?</a></li>
<li><a href="#mpi-vec-to-mpi-vec">How do I collect all the values from a parallel PETSc vector into a vector on the zeroth (or any particular) processor?</a></li>
<li><a href="#sparse-matrix-ascii-format">How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, SLAPC or other ASCII format?</a></li>
<li><a href="#setfromoptions">Does TSSetFromOptions(), SNESSetFromOptions() or KSPSetFromOptions() reset all the parameters I set or how come TS/SNES/KSPSetXXX() don't seem to work?</a></li>
<li><a href="#makefiles">Can I use my own makefiles or rules for compiling code, rather than PETSc's?</a></li>
<li><a href="#cmake">Can I use CMake to build my own project that depends on PETSc?</a></li>
<li><a href="#carriagereturns">How can I put carriage returns in PetscPrintf() statements from Fortran?</a></li>
<li><a href="#cxxmethod">How can I implement callbacks using C++ class methods?</a></li>
<li><a href="#functionjacobian">Everyone knows that when you code Newton's method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?</a></li>
<li><a href="#jacobianfree">How can I use Newton's method Jacobian free? Can I difference a different function than provided with SNESSetFunction()?</a></li>
<li><a href="#conditionnumber">How can I determine the condition number of a matrix?</a></li>
<li><a href="#invertmatrix">How can I compute the inverse of a PETSc matrix?</a></li>
<li><a href="#schurcomplement">How can I compute a Schur complement: Kbb - Kba *inverse(Kaa)*Kab?</a></li>
<li><a href="#fem">Do you have examples of doing unstructured grid finite element computations (FEM) with PETSc?</a></li>
<li><a href="#da_mpi_cart">The PETSc DMDA object decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?</a></li>
<li><a href="#redistribute">When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions and yet keep the matrix symmetric?</a></li>
<li><a href="#matlab">How can I use PETSc with MATLAB? How can I get PETSc Vecs and Mats to MATLAB or vice versa?</a></li>
<li><a href="#cython">How do I get started with Cython so that I can extend petsc4py?</a></li>
<li><a href="#customnormforksp">I would like to compute a custom norm for KSP to use as a convergence test or for monitoring?</a></li>
<li><a href="#parallelsolvesequentialcode">If I have a sequential program can I use a parallel direct solver?</a></li>
</ul>
<h4><a href="#Execution">Execution</a></h4>
<ul>
<li><a href="#long-link-time">PETSc executables are SO big and take SO long to link.</a></li>
<li><a href="#petsc-options">PETSc has so many options for my program that it is hard to keep them straight.</a></li>
<li><a href="#petsc-log-info">PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program? </a></li>
<li><a href="#efficient-assembly">Assembling large sparse matrices takes a long time. What can I do make this process faster? or MatSetValues() is <b>so slow</b>, what can I do to make it faster?</a></li>
<li><a href="#log-view">How can I generate performance summaries with PETSc?</a></li>
<li><a href="#parallel-roundoff">Why do I get different answers on a different numbers of processors?</a></li>
<li><a href="#mg-log">How do I know the amount of time spent on each level of the solver in multigrid (PCType of PCMG) -pc_type mg.</a></li>
<li><a href="#datafiles">Where do I get the input matrices for the examples?</a></li>
<li><a href="#info">When I dump some matrices and vectors to binary, I seem to be generating some empty files with .info extensions. What's the deal with these?</a></li>
<li><a href="#slowerparallel">Why is my parallel <b>solver slower</b> than the sequential solver, or I have poor speed-up?</a></li>
<li><a href="#pipelined">What steps are necessary to make the pipelined solvers execute efficiently?</a></li>
<li><a href="#singleprecision">When using PETSc in single precision mode (--with-precision=single when running ./configure) are the operations done in single or double precision?</a></li>
<li><a href="#newton">Why is Newton's method (SNES) not converging, or converges slowly?</a></li>
<li><a href="#kspdiverged">Why is the linear solver (KSP) not converging, or converges slowly?</a></li>
</ul>
<h4><a href="#debugging">Debugging</a></h4>
<ul>
<li><a href="#debug-ibmfortran">How do I turn off PETSc signal handling so I can use the -C option on xlF?</a></li>
<li><a href="#start_in_debugger-doesnotwork">How do I debug if -start_in_debugger does not work on my machine?</a></li>
<li><a href="#debug-hang">How can I see where my code is hanging?</a></li>
<li><a href="#debug-inspect">How can I inspect Vec and Mat values when in the debugger?</a></li>
<li><a href="#debug-fp">How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?</a></li>
<li><a href="#libimf">Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory.</a></li>
<li><a href="#objecttypenotset">What does Object Type not set: Argument # n mean?</a></li>
<li><a href="#split">What does Error detected in PetscSplitOwnership() about "sum of local lengths ...": mean?</a></li>
<li><a href="#valgrind">What does Corrupt argument or Caught signal or SEQV or segmentation violation or bus error mean? Can I use valgrind to debug memory corruption issues?</a></li>
<li><a href="#zeropivot">What does Detected zero pivot in LU factorization mean?</a></li>
<li><a href="#xwindows">You create Draw windows or ViewerDraw windows or use options -ksp_monitor or_draw -snes_monitor_lg_residualnorm and the program seems to run OK but windows never open.</a></li>
<li><a href="#memory">The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory.</a></li>
<li><a href="#key">When calling MatPartitioningApply() you get a message <code>Error! Key 16615 not found</code></a></li>
<li><a href="#gmres">With GMRES At restart the second residual norm printed does not match the first </a></li>
<li><a href="#doubleits">Why do some Krylov methods seem to print two residual norms per iteration?</a></li>
<li><a href="#dylib">Unable to locate PETSc dynamic library /home/balay/spetsc/lib/libg/linux/libpetsc</a></li>
<li><a href="#bisect">How do I determine what update to PETSc broke my code?</a></li>
</ul>
<h4><a href="#shared-libraries">Shared Libraries</a></h4>
<ul>
<li><a href="#install-shared">Can I install PETSc libraries as shared libraries?</a></li>
<li><a href="#why-use-shared">Why should I use shared libraries?</a></li>
<li><a href="#link-shared">How do I link to the PETSc shared libraries?</a></li>
<li><a href="#link-regular-lib">What if I want to link to the regular .a library files?</a></li>
<li><a href="#move-shared-exec">What do I do if I want to move my executable to a different machine?</a></li>
</ul>
</div>
<hr>
<div id="answers">
<h1><a name="general">General</a></h1>
<h3><a name="petsc-mailing-list">How can I subscribe to the PETSc mailing lists?</a></h3>
See <a href="http://www.mcs.anl.gov/petsc/miscellaneous/mailing-lists.html">http://www.mcs.anl.gov/petsc/miscellaneous/mailing-lists.html</a>
<h3><a name="book">Any useful books on numerical computing?</a></h3>
<a href="http://ebooks.cambridge.org/ebook.jsf?bid=CBO9780511617973">Writing Scientific Software: A Guide to Good Style</a>
<h3><a name="computers">What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?</a></h3>
PETSc can be used with any kind of parallel system that supports MPI
<b>BUT</b> for any decent performance one needs
<ul>
<li>
a <b>fast, low-latency interconnect</b>; any ethernet, even 10 gigE
simply cannot provide the needed performance.
</li>
<li>
<b>high per-CPU MEMORY performance</b>. Each CPU (core in multi-core
systems) needs to have its <b>own</b> memory bandwidth of at least 2 or
more gigabytes/second. For example, standard dual processor "PC's" will
<b>not</b> provide better performance when the second processor is
used, that is, you will not see speed-up when you use the second
processor. This is because the speed of sparse matrix computations is
almost totally determined by the speed of the memory, not the speed of
the CPU.
</li>
</ul>
To obtain good performance it is important that you know your
machine: how many nodes, how many memory sockets per node, how many
cores per memory socket, and how much memory bandwidth you obtain per
core, memory socket, and node. If you do not know this and can run
MPI programs with mpiexec (that is, you do not have a batch system),
then in ${PETSC_DIR} run
<pre>make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]</pre>
This will provide a summary of the bandwidth received with different numbers of MPI processes and the potential speedups. If you have a batch system:
<ul>
<li> cd src/benchmarks/streams</li>
<li> make MPIVersion</li>
<li> submit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes, collecting all of the output from the runs into a single file scaling.log.</li>
<li> copy scaling.log into the src/benchmarks/streams directory</li>
<li> ./process.py createfile ; process.py </li>
</ul>
Even if you have enough memory bandwidth, if the OS switches
processes between cores, performance can degrade. Smart
process to core/socket binding (this just means locking a
process to a particular core or memory socket) may help
you. For example, consider using fewer processes than cores
and binding processes to separate sockets so that each
process uses a different memory bus:
<ul>
<li>
<dt><a href="http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding">MPICH2 binding with the Hydra process manager</a></dt>
<dd><code>mpiexec.hydra -n 4 --binding cpu:sockets</code></dd>
</li>
<li>
<dt><a href="http://www.open-mpi.org/doc/v1.5/man1/mpiexec.1.php#sect8">Open MPI binding</a></dt>
<dd><code>mpiexec -n 4 --bysocket --bind-to-socket --report-bindings</code></dd>
</li>
<li>
<dt><tt>taskset</tt>, part of the <tt>util-linux</tt> package</dt>
<dd>
<tt>Usage: taskset [options] [mask | cpu-list] [pid|cmd [args...]]</tt>, type <tt>man taskset</tt> for details.
Make sure to set affinity for your program, not for the <tt>mpiexec</tt> program.
</dd>
</li>
<li>
<dt><tt>numactl</tt></dt>
<dd>In addition to task affinity, this tool also allows changing the default memory affinity policy.
On Linux, the default policy is to attempt to find memory on the same memory bus that serves the core that a thread is running on at whatever time the memory is faulted (<em>not</em> when <tt>malloc()</tt> is called).
If local memory is not available, it is found elsewhere, possibly leading to serious memory imbalances.
The option <tt>--localalloc</tt> allocates memory on the local NUMA node, similar to the <tt>numa_alloc_local()</tt> function in the <tt>libnuma</tt> library.
The option <tt>--cpunodebind=nodes</tt> binds the process to a given NUMA node (note that this can be larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).
The option <tt>--physcpubind=cpus</tt> binds the process to a given processor core (numbered according to <tt>/proc/cpuinfo</tt>, therefore including logical cores if Hyper-threading is enabled).
With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your machine to calculate the correct NUMA node or processor number given the environment variable <tt>OMPI_COMM_WORLD_LOCAL_RANK</tt>.
In most cases, it is easier to make <tt>mpiexec</tt> or a resource manager set affinities.
</dd>
</li>
</ul>
</p>
The software <a href="http://open-mx.gforge.inria.fr">http://open-mx.gforge.inria.fr</a>
provides faster speed for ethernet systems, we have not tried it but it
claims it can dramatically reduce latency and increase bandwidth on
Linux system. You must first install this software and then install
MPICH or Open MPI to use it.
</p>
<h3><a name="license">What kind of license is PETSc released under?</a></h3>
See the <a href="license.html">licensing notice.</a>
<h3><a name="why-c">Why is PETSc programmed in C, instead of Fortran or C++?</a></h3>
C enables us to build data structures for storing sparse matrices, solver
information, etc. in ways that Fortran simply does not allow. ANSI C is
a complete standard that all modern C compilers support. The language is
identical on all machines. C++ is still evolving and compilers on different
machines are not identical. Using C function pointers to provide data
encapsulation and polymorphism allows us to get many of the advantages of
C++ without using such a large and more complicated language. It would be
natural and reasonable to have coded PETSc in C++; we opted to use
C instead.
<h3><a name="logging-overhead">Does all the PETSc error checking and logging reduce PETSc's efficiency?</a></h3>
No.
<h3><a name="work-efficiently">How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?</a></h3>
<ol>
<li>We work very efficiently.
<ol>
<li>
We use Emacs for all editing; the etags feature makes navigating
and changing our source code very easy.
</li>
<li>
Our manual pages are generated automatically from
formatted comments in the code, thus alleviating the
need for creating and maintaining manual pages.
</li>
<li>
We employ automatic nightly tests of PETSc on several
different machine architectures. This process helps us
to discover problems the day after we have introduced
them rather than weeks or months later.
</li>
</ol>
</li>
<li>
We are very careful in our design (and are constantly
revising our design) to make the package easy to use,
write, and maintain.
</li>
<li>
We are willing to do the grunt work of going through
all the code regularly to make sure that <u>all</u> code
conforms to our interface design. We will <u>never</u>
keep in a bad design decision simply because changing it
will require a lot of editing; we do a lot of editing.
</li>
<li>
We constantly seek out and experiment with new design
ideas; we retain the useful ones and discard the rest.
All of these decisions are based on <u>practicality</u>.
</li>
<li>
Function and variable names are chosen to be very
consistent throughout the software. Even the rules about
capitalization are designed to make it easy to figure out
the name of a particular object or routine. Our memories
are terrible, so careful consistent naming puts less
stress on our limited human RAM.
</li>
<li>
The PETSc directory tree is carefully designed to make
it easy to move throughout the entire package.
</li>
<li>
Our bug reporting system, based on email to <a
href="../documentation/bugreporting.html">petsc-maint@mcs.anl.gov</a>,
makes it very simple to keep track of what bugs have been found and
fixed. In addition, the bug report system retains an archive of all
reported problems and fixes, so it is easy to find fixes to
previously discovered problems.
</li>
<li>
We contain the complexity of PETSc by using object-oriented programming
techniques including data encapsulation (this is why your program
cannot, for example, look directly at what is inside the object Mat)
and polymorphism (you call MatMult() regardless of whether your matrix
is dense, sparse, parallel or sequential; you don't call a different
routine for each format).
</li>
<li>We try to provide the functionality requested by our users.</li>
<li>We never sleep.</li>
</ol>
<h3><a name="complex">For complex numbers will I get better performance with C++?</a></h3>
To use PETSc with complex numbers, <code>./configure</code> with
the option <code>--with-scalar-type=complex</code> and either
<code>--with-clanguage=c++</code> or, the default,
<code>--with-clanguage=c</code>. In our experience they deliver very
similar performance (speed); if you are concerned, just try
both and see which is faster.
<h3><a name="different">How come when I run the same program on the same number of processes I get a "different" answer?</a></h3>
<p>
Inner products and norms in PETSc are computed using the MPI_Allreduce()
command. In different runs the values can arrive at a given
process (via MPI) in a different order, thus the order in which some
floating point arithmetic operations are performed will be different. Since
floating point arithmetic is not associative, the computed
quantity may be (slightly) different. Over a run the many slight
differences in the inner products and norms will affect all the computed
results. It is important to realize that none of the computed answers are
any less right or wrong (in fact the sequential computation is no more
right than the parallel ones); they are all equally valid.
</p>
<p>
The discussion above assumes that the exact same algorithm is being used on
the different number of processes. When the algorithm is different for the
different number of processes (almost all preconditioner algorithms except
Jacobi are different for different number of processes) then one expects to
see (and does) a greater difference in results for different numbers of
processes. In some cases (for example block Jacobi preconditioner) it may
be that the algorithm works for some number of processes and does not work
for others.
</p>
<h3><a name="differentiterations">How come when I run the same linear solver on a different number of processes it takes a different number of iterations?</a></h3>
The convergence of many of the preconditioners in PETSc, including the
default parallel preconditioner block Jacobi, depends on the number of
processes: the more processes, the (slightly) slower the convergence.
This is the nature of iterative solvers; more parallelism means
more "older" information is used in the solution process, hence slower
convergence.
<h3><a name="gpus">Can PETSc use GPUs to speedup computations?</a></h3>
The <a href="https://bitbucket.org/petsc/petsc">PETSc developer repository</a> has some support for running portions of the computation on
GPUs. See <a href="http://www.mcs.anl.gov/petsc/features/gpus.html">PETSc GPUs</a> for
more information. PETSc has Vec classes VECCUDA and VECVIENNACL which perform almost all
the vector operations on the GPU. The Mat classes AIJCUSPARSE and AIJVIENNACL perform
matrix-vector products on the GPU but do not have matrix assembly on the
GPU yet. Both of these classes run in parallel with MPI. All KSP methods,
except KSPIBCGS, run all their vector operations on the GPU; thus, for
example, Jacobi preconditioned Krylov methods run completely on the GPU.
Preconditioners are a problem; we could do with some help for these. The
example <a href="http://www.mcs.anl.gov/petsc/petsc-master/src/snes/examples/tutorials/ex47cu.cu.html">src/snes/examples/tutorials/ex47cu.cu</a>
demonstrates how the nonlinear function evaluation can be done on the
GPU.
<h3><a name="precision">Can I run PETSc with extended precision?</a></h3>
Yes, with gcc 4.6 and later (and gfortran 4.6 and later)
<code>./configure</code> PETSc using the options
<code>--with-precision=__float128 --download-f2cblaslapack</code>.
External packages cannot be used in this mode.
<h3><a name="qd">Why doesn't PETSc use QD to implement support for exended precision?</a></h3>
We tried really hard but could not. The problem is that the QD c++ classes,
though they try to implement the built-in data types of double etc are not
native types and cannot "just be used" in a general piece of numerical
source code rather the code has to rewritten to live within the limitations
of QD classes. But note, above, that PETSc can be built to use quad precision.
<hr>
<h2><a name="installation">Installation</a></h2>
<h3><a name="already-installed">How do I begin using PETSc if the software has already been completely built and installed by someone else?</a></h3>
Assuming that the PETSc libraries have been successfully built for
a particular architecture and level of optimization, a new user must
merely:
<ol>
<li>
Set the environmental variable PETSC_DIR to the full
path of the PETSc home directory (for example,
/home/username/petsc).
</li>
<li>
Set the environmental variable PETSC_ARCH, which indicates the
configuration on which PETSc will be used. Note that the PETSC_ARCH is
simply a name the installer used when installing the libraries. There
many be several on a single system, like mylinux-g for the debug
versions of the library and mylinux-O for the optimized version, or
petscdebug for the debug version and petscopt for the optimized
version.
</li>
<li>
Begin by copying one of the many PETSc examples (in, for example,
petsc/src/ksp/examples/tutorials) and its corresponding makefile.
</li>
<li>
See the introductory section of the PETSc users manual for tips on
documentation.
</li>
</ol>
<h3><a name="reduce-disk-space">The PETSc distribution is SO large. How can I reduce my disk space usage?</a></h3>
<ol>
<li>
The directory /usr/share/doc/petsc3.9-doc contains a set of HTML manual pages
for use with a browser. You can delete these pages to save some disk space.
</li>
<li>
The PETSc users manual is provided in PDF in
<a href="http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf">manual.pdf</a>.
</li>
<li>
The PETSc test suite contains sample output for many of the examples.
These are contained in the PETSc directories
/usr/lib/petscdir/petsc*/*/share/petsc/examples/src/*/*/examples/tutorials/output and
examples/tests/output.
</li>
<li>
The debugging versions of the libraries are larger than the optimized
versions. In a pinch you can work with the optimized version although
we do not recommend it generally because finding bugs is much easier
with the debug version.
</li>
</ol>
<h3><a name="petsc-uni">I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?</a></h3>
No, run ./configure with the option <code>--with-mpi=0</code>.
<h3><a name="no-x">Can I install PETSc to not use X windows (either under Unix or Windows with gcc, the gnu compiler)?</a></h3>
Yes. Run ./configure with the additional flag <code>--with-x=0</code>
<h3><a name="use-mpi">Why do you use MPI</a>?</h3>
MPI is the message-passing standard. Because it is a standard, it will not
change over time; thus, we do not have to change PETSc every time the
provider of the message-passing system decides to make an interface change.
MPI was carefully designed by experts from industry, academia, and
government labs to provide the highest quality performance and capability.
For example, the careful design of communicators in MPI allows the easy
nesting of different libraries; no other message-passing system provides
this support. All of the major parallel computer vendors were involved in
the design of MPI and have committed to providing quality implementations.
In addition, since MPI is a standard, several different groups have already
provided complete free implementations. Thus, one does not have to rely on
the technical skills of one particular group to provide the message-passing
libraries. Today, MPI is the only practical, portable approach to writing
efficient parallel numerical software.
<h3><a name="mpi-compilers">What do I do if my MPI compiler wrappers are invalid</a>?</h3>
Most MPI implementations provide compiler wrappers (such as mpicc) which
give the include and link options necessary to use that version of MPI to
the underlying compilers. These wrappers are either absent or broken in
the MPI pointed to by --with-mpi-dir. You can rerun the configure with the
additional option --with-mpi-compilers=0, which will try to auto-detect
working compilers; however, these compilers may be incompatible with the
particular MPI build. If this fix does not work, run with
--with-cc=c_compiler where you know c_compiler works with this particular
MPI, and likewise for C++ and Fortran.
<h3><a name="with-64-bit-indices">When should/can I use the ./configure option --with-64-bit-indices?</a></h3>
By default the type that PETSc uses to index into arrays and keep sizes of
arrays is a PetscInt defined to be a 32 bit int. If your problem
<ul>
<li>involves more than 2^31 - 1 unknowns (around 2 billion) OR</li>
<li>your matrix might contain more than 2^31 - 1 nonzeros on a single process</li>
</ul>
then you need to use this option. Otherwise you will get strange crashes.
<p>
This option can be used when you are using either 32 bit or 64 bit
pointers. You do not need to use this option if you are using 64 bit
pointers unless the two conditions above hold.
</p>
<h3><a name="compilererror">What if I get an internal compiler error?</a></h3>
You can rebuild the offending file individually with a lower optimization level. Then make sure to complain to the
compiler vendor and file a bug report. For example, if the compiler chokes on
src/mat/impls/baij/seq/baijsolvtrannat.c you can run
<p>make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o</p>
and then invoke 'make all' again.
<h3><a name="install-petsc4py-dev">How do I install petsc4py with the development PETSc?</a></h3>
You can follow these steps
<ol>
<li>install Cython</li>
<li>install PETSc with --download-petsc4py</li>
</ol>
<h3><a name="gfortran">What Fortran compiler do you recommend for the Apple Mac OS X?</a></h3>
<p>
We recommend using brew (http://brew.sh) to install gfortran.
</p>
<p>
Please contact Apple at <a
href="http://www.apple.com/feedback">http://www.apple.com/feedback</a>
and urge them to bundle gfortran with future versions of Xcode.
</p>
<h3><a name="packagelocations">How can I find the URL locations of the packages you install using --download-PACKAGE??</a></h3>
<p>
grep "self.download " config/BuildSystem/config/packages/*.py
</p>
<hr>
<h3><a name="multiplempis">How to fix the problem: PETSc was configured with one MPICH (or OpenMPI) mpi.h version but now appears to be compiling using a different MPICH (or OpenMPI) mpi.h version</a></h3>
<p>This generally happens for one of two reasons:</p>
<ul>
<li>You previously ran ./configure with the option --download-mpich (or --download-openmpi) but later ran ./configure to use a version of MPI already installed on the machine. Solution: do "rm -rf ${PETSC_ARCH}" and run ./configure again.</li>
<li>You have two versions of MPI installed on your machine in different locations and both are being picked up in different parts of the configuration/make process. For example you may have MPI installed in /usr/local/mpi and in /Apps/OpenMPI. Solution: remove one of the two installations.</li>
</ul>
<h3><a name="PetscOptionsInsertFile">What does it mean when make test gives:</a></h3>
<pre>
Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
See http://www.mcs.anl.gov/petsc/documentation/faq.html
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
</pre>
<p>
The machine has a funky network configuration and for some reason MPICH is unable to communicate between processes with the socket connections it has established. This can happen even if you are running MPICH on just one machine. Often you will find that ping `hostname` fails with this network configuration; that is, processes on the machine cannot even connect to the same machine. You can try completely disconnecting your machine from the network and see if make test then works, or speak with your system administrator. You can also use the ./configure options --download-mpich --download-mpich-device=ch3:nemesis
</p>
<hr>
<h2><a name="using">Using</a></h2>
<h3><a name="redirectstdout">How can I redirect PETSc's stdout and stderr when programming with a GUI interface in Windows Developer Studio or too C++ streams?</a> </h3>
<p>
To overload just the error messages, write your own MyPrintError() function
that does whatever you want (including popping up windows etc.) and use it as
shown below.
</p>
<pre>
extern "C" {
int PASCAL WinMain(HINSTANCE inst,HINSTANCE dumb,LPSTR param,int show);
};
#include <petscsys.h>
#include <mpi.h>
const char help[] = "Set up from main";
int MyPrintError(const char error[], ...) {
printf("%s", error);
return 0;
}
int main(int ac, char *av[]) {
char buf[256];
int i;
HINSTANCE inst;
PetscErrorCode ierr;
inst=(HINSTANCE)GetModuleHandle(NULL);
PetscErrorPrintf = MyPrintError;
buf[0]=0;
for (i=1; i<ac; i++) {
strcat(buf,av[i]);
strcat(buf," ");
}
PetscInitialize(&ac, &av, NULL, help);
return WinMain(inst,NULL,buf,SW_SHOWNORMAL);
}
</pre>
<p>
Add this file to your project and compile with these preprocessor definitions:
<code>WIN32,_DEBUG,_CONSOLE,_MBCS,USE_PETSC_LOG,USE_PETSC_BOPT_g,USE_PETSC_STACK,_AFXDLL</code>
</p>
<p>
And these link options: <code>/nologo /subsystem:console /incremental:yes
/debug /machine:I386 /nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib" /nodefaultlib:"mvcrt.lib"
/pdbtype:sept</code>
</p>
<p>
Note that it is compiled and linked as if it was a console program. The
linker will search for a main, and then from it the WinMain will start.
This works with MFC templates and derived classes too.
</p>
<p>
Note: When writing a Windows console application you do not need to do
anything; stdout and stderr are automatically output to the console
window.
</p>
<p>
To change where all PETSc stdout and stderr go, reassign PetscVFPrintf() to
handle them any way you like by writing the following function:
</p>
<pre>
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp) {
  PetscErrorCode ierr;
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    ierr = PetscVFPrintfDefault(fd,format,Argp);CHKERRQ(ierr);
  } else {
    char   buff[1024];  /* or whatever size you need */
    size_t length;
    ierr = PetscVSNPrintf(buff,1024,format,&length,Argp);CHKERRQ(ierr);
    /* now send buff to whatever stream or destination you want */
  }
  PetscFunctionReturn(0);
}
</pre>
and assign <code>PetscVFPrintf = mypetscvfprintf;</code> before
<code>PetscInitialize()</code> in your main program.
<h3><a name="hypre">I want to use hypre boomerAMG without GMRES but when I run <code>-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly</code> I don't get a very accurate answer!</a> </h3>
You should run with -ksp_type richardson to have PETSc run several V or
W cycles. -ksp_type preonly causes boomerAMG to use only one V/W cycle.
You can control how many cycles are used in a single application of the
boomerAMG preconditioner with <code>-pc_hypre_boomeramg_max_iter
&lt;it&gt;</code> (the default is 1). You can also control the tolerance
boomerAMG uses to decide whether to stop before max_iter with
<code>-pc_hypre_boomeramg_tol &lt;tol&gt;</code> (the default is 1.e-7).
Run with <code>-ksp_view</code> to see all the hypre options used and
<code>-help | grep boomeramg</code> to see all the command line options.
<h3><a name="nosaij">You have AIJ and BAIJ matrix formats, and SBAIJ for symmetric storage, how come no SAIJ?</a></h3>
Just for historical reasons; the SBAIJ format with blocksize one is just as
efficient as an SAIJ would be.
<h3><a name="domaindecomposition">How do I use PETSc for Domain Decomposition?</a></h3>
<p>
PETSc includes Additive Schwarz methods in the suite of preconditioners.
These may be activated with the runtime option <code>-pc_type asm</code>.
Various other options may be set, including the degree of overlap
<code>-pc_asm_overlap &lt;number&gt;</code> and the type of restriction/extension
<code>-pc_asm_type [basic,restrict,interpolate,none]</code>, among others. You may see
the available ASM options by using <code>-pc_type asm -help</code>. Also, see the
procedural interfaces in the manual pages, with names <code>PCASMxxxx()</code>,
and check the index of the <a href="manual.pdf">users manual</a>
for <code>PCASMxxx()</code>.
</p>
<p>
PETSc also contains a domain decomposition inspired wirebasket or face
based two level method where the coarse mesh to fine mesh interpolation
is defined by solving specific local subdomain problems. It currently
only works for 3D scalar problems on structured grids created with PETSc
DMDAs. See the manual page for PCEXOTIC and
src/ksp/ksp/examples/tutorials/ex45.c for an example.
</p>
<p>
PETSc also contains a balancing Neumann-Neumann type preconditioner, see the
manual page for PCBDDC. This requires matrices be constructed with
<code>MatCreateIS()</code> via the finite element method. See src/ksp/ksp/examples/tests/ex59.c
</p>
<h3><a name="blocks">Can I create BAIJ matrices with different size blocks for different block rows?</a></h3>
Sorry, this is not possible, the BAIJ format only supports a single fixed
block size on the entire matrix. But the AIJ format automatically searches
for matching rows and thus still takes advantage of the natural blocks in
your matrix to obtain good performance. Unfortunately you cannot then use
<code>MatSetValuesBlocked()</code>.
<h3><a name="mpi-vec-access">How do I access the values of a parallel PETSc vector on a different process than owns them?</a></h3>
<ul>
<li>On each process create a local vector large enough to hold all the values it wishes to access</li>
<li>Create a VecScatter that scatters from the parallel vector into the local vectors</li>
<li>Use VecGetArray() to access the values in the local vector</li>
</ul>
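<p>
A rough sketch of these steps follows; the vector x is assumed to already exist, and the global indices in idx (and the local positions in loc) are only illustrative.
</p>
<pre>
Vec         x;                      /* existing parallel vector */
Vec         localv;                 /* sequential vector that will hold the wanted entries */
IS          from, to;
VecScatter  ctx;
PetscInt    nwant   = 3;
PetscInt    idx[3]  = {0, 17, 42};  /* hypothetical global indices this process wants */
PetscInt    loc[3]  = {0, 1, 2};    /* where they land in the local vector */
const PetscScalar *vals;

VecCreateSeq(PETSC_COMM_SELF, nwant, &localv);
ISCreateGeneral(PETSC_COMM_SELF, nwant, idx, PETSC_COPY_VALUES, &from);
ISCreateGeneral(PETSC_COMM_SELF, nwant, loc, PETSC_COPY_VALUES, &to);
VecScatterCreate(x, from, localv, to, &ctx);
VecScatterBegin(ctx, x, localv, INSERT_VALUES, SCATTER_FORWARD);
VecScatterEnd(ctx, x, localv, INSERT_VALUES, SCATTER_FORWARD);
VecGetArrayRead(localv, &vals);     /* vals[0..nwant-1] now holds the requested entries */
/* ... use the values ... */
VecRestoreArrayRead(localv, &vals);
ISDestroy(&from); ISDestroy(&to);
VecScatterDestroy(&ctx); VecDestroy(&localv);
</pre>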
<h3><a name="mpi-vec-to-seq-vec">How do I collect all the values from a parallel PETSc vector into a sequential vector on each processor?</a></h3>
<ul>
<li>
Create the scatter context that will do the communication
<ul>
<li><a href="manualpages/Vec/VecScatterCreateToAll.html">VecScatterCreateToAll</a>(v,&ctx,&w);</li>
</ul>
</li>
<li>
Actually do the communication; this can be done repeatedly as needed
<ul>
<li><a href="manualpages/Vec/VecScatterBegin.html">VecScatterBegin</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
<li> <a href="manualpages/Vec/VecScatterEnd.html">VecScatterEnd</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
</ul>
</li>
<li>
Remember to free the scatter context when no longer needed
<ul>
<li> <a href="manualpages/Vec/VecScatterDestroy.html">VecScatterDestroy</a>(ctx);</li>
</ul>
</li>
</ul>
Note that this simply concatenates in the parallel ordering of the vector.
If you are using a vector from DMCreateGlobalVector() you likely want to
first call DMDAGlobalToNaturalBegin/End() to scatter the original vector
into the natural ordering in a new global vector before calling
VecScatterBegin/End() to scatter the natural vector onto all processes.
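<p>
Put together, a minimal sketch of these steps (v is assumed to be an existing parallel vector; the names are only illustrative) looks like:
</p>
<pre>
Vec        v;      /* existing parallel vector */
Vec        w;      /* sequential vector created by the scatter on every process */
VecScatter ctx;

VecScatterCreateToAll(v, &ctx, &w);
VecScatterBegin(ctx, v, w, INSERT_VALUES, SCATTER_FORWARD);
VecScatterEnd(ctx, v, w, INSERT_VALUES, SCATTER_FORWARD);
/* every process can now read all entries of v through w, e.g. with VecGetArrayRead() */
VecScatterDestroy(&ctx);
VecDestroy(&w);
</pre>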
<h3><a name="mpi-vec-to-mpi-vec">How do I collect all the values from a parallel PETSc vector into a vector on the zeroth processor?</a></h3>
<ul>
<li>
Create the scatter context that will do the communication
<ul>
<li><a href="manualpages/Vec/VecScatterCreateToZero.html">VecScatterCreateToZero</a>(v,&ctx,&w);</li>
</ul>
</li>
<li>
Actually do the communication; this can be done repeatedly as needed
<ul>
<li><a href="manualpages/Vec/VecScatterBegin.html">VecScatterBegin</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
<li> <a href="manualpages/Vec/VecScatterEnd.html">VecScatterEnd</a>(ctx,v,w,INSERT_VALUES,SCATTER_FORWARD);</li>
</ul>
</li>
<li>
Remember to free the scatter context when no longer needed
<ul>
<li> <a href="manualpages/Vec/VecScatterDestroy.html">VecScatterDestroy</a>(ctx);</li>
</ul>
</li>
</ul>
Note that this simply concatenates in the parallel ordering of the vector.
If you are using a vector from DMCreateGlobalVector() you likely want to
first call DMDAGlobalToNaturalBegin/End() to scatter the original vector
into the natural ordering in a new global vector before calling
VecScatterBegin/End() to scatter the natural vector onto process 0.
<h3><a name="sparse-matrix-ascii-format">How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, SLAPC or other ASCII format?</a></h3>
See the examples in src/mat/examples/tests, specifically ex72.c, ex78.c,
and ex32.c. You will likely need to modify the code slightly to match your
required ASCII format. Note: Never read or write an ASCII matrix file
in parallel. Instead, for reading: read the file sequentially with a standalone
code based on ex72.c, ex78.c, or ex32.c, then save the matrix with the
binary viewer PetscViewerBinaryOpen() and load the matrix in parallel in
your "real" PETSc program with MatLoad(); for writing: save with the binary
viewer and then load with the sequential code to store it as ASCII.
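<p>
For the binary save and load steps (the ASCII parsing itself is up to your standalone converter), a rough sketch might be as follows; the file name matrix.dat and the matrix names are only illustrative.
</p>
<pre>
/* In the sequential standalone converter, after assembling Mat A from the ASCII file: */
PetscViewer viewer;
PetscViewerBinaryOpen(PETSC_COMM_SELF, "matrix.dat", FILE_MODE_WRITE, &viewer);
MatView(A, viewer);
PetscViewerDestroy(&viewer);

/* In the parallel "real" PETSc program: */
Mat B;
PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.dat", FILE_MODE_READ, &viewer);
MatCreate(PETSC_COMM_WORLD, &B);
MatSetType(B, MATAIJ);
MatLoad(B, viewer);
PetscViewerDestroy(&viewer);
</pre>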
<h3><a name="setfromoptions">Does TSSetFromOptions(), SNESSetFromOptions() or KSPSetFromOptions() reset all the parameters I previously set or how come my TS/SNES/KSPSetXXX() does not seem to work?</a></h3>
<p>
If XXXSetFromOptions() is used (with -xxx_type aaaa) to change the type of
the object then all parameters associated with the previous type are
removed. Otherwise it does not reset parameters.
</p>
<p>
TS/SNES/KSPSetXXX() commands that set properties for a particular type of
object (such as KSPGMRESSetRestart()) ONLY work if the object is ALREADY
of that type. For example, with
<code>KSPCreate(PETSC_COMM_WORLD,&ksp); KSPGMRESSetRestart(ksp,10);</code>
the restart will be ignored since the type has not yet been set to GMRES.
To have those values take effect you should do one of the following:
</p>
<ul style="list-style-type: none; padding: .5em;">
<li><code>XXXCreate(..,&obj);</code></li>
<li>
<code>XXXSetFromOptions(obj)</code>; allow setting the
type from the command line, if it is not on the
command line then the default type is automatically
set
</li>
<li>
<code>XXXSetYYYYY(obj,...)</code>; if the obj is the
appropriate type then the operation takes place
</li>
<li>
<code>XXXSetFromOptions(obj)</code>; allow user to
overwrite options hardwired in code (optional)<br>
</li>
<li>
The other approach is to replace the first
<code>XXXSetFromOptions()</code> with <code>XXXSetType(obj,type)</code>
and hardwire the type at that point.
</li>
</ul>
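<p>
For example, with a KSP this sequence might look like the sketch below (the restart value 50 is only an example):
</p>
<pre>
KSP ksp;
KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetFromOptions(ksp);        /* sets the type from the options database (or the default) */
KSPGMRESSetRestart(ksp, 50);   /* takes effect only if the type is now GMRES */
KSPSetFromOptions(ksp);        /* optional: let the command line override the hardwired value */
</pre>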
<h3><a name="makefiles">Can I use my own makefiles or rules for compiling code, instead of using PETSc's?</a></h3>
Yes, see the section of the <a href="manual.pdf">users manual</a> called Makefiles
<h3><a name="cmake">Can I use CMake to build my own project that depends on PETSc?</a></h3>
Use the FindPETSc.cmake module from <a
href="https://github.com/jedbrown/cmake-modules/">this repository</a>.
See the CMakeLists.txt from <a
href="https://github.com/jedbrown/dohp">Dohp</a> for example usage.
<h3><a name="carriagereturns">How can I put carriage returns in PetscPrintf() statements from Fortran?</a></h3>
<p>
You can use the same notation as in C, just put a \n in the string. Note
that no other C format instruction is supported.
</p>
<p>
Or you can use the Fortran concatenation // and char(10); for example
'some string'//char(10)//'another string on the next line'
</p>
<h3><a name="cxxmethod">How can I implement callbacks using C++ class methods?</a></h3>
<p>
Declare the class method <tt>static</tt>. Static methods do not have a <tt>this</tt> pointer, but the <tt>void*</tt> context parameter will usually be cast to a pointer to the class where it can serve the same function. Note that all PETSc callbacks return <tt>PetscErrorCode</tt>.
</p>
<h3><a name="functionjacobian">Everyone knows that when you code Newton's method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?</a></h3>
<p>
The update in Newton's method is computed as u^{n+1} = u^n - lambda
* approx-inverse[J(u^n)] * F(u^n). The reason PETSc doesn't default to
computing both the function and Jacobian at the same time is<br>
</p>
<ol>
<li>
In order to do the line search, F(u^n - lambda * step) may need to be
computed for several values of lambda; the Jacobian is not needed for each of
those, and one does not know in advance which will be the final lambda
until after the function value is computed, so many extra Jacobians may
be computed.
</li>
<li>
In the final step if || F(u^p)|| satisfies the convergence criteria
then a Jacobian need not be computed.
</li>
</ol>
<p>
You are free to have your "FormFunction" compute as
much of the Jacobian at that point as you like, keep
the information in the user context (the final
argument to FormFunction and FormJacobian) and then
retrieve the information in your FormJacobian()
function.
</p>
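<p>
A rough sketch of this pattern is below; the AppCtx fields and what you choose to cache in them are entirely up to you and purely illustrative here.
</p>
<pre>
typedef struct {
  /* ... your problem data, plus whatever Jacobian pieces FormFunction() saves ... */
  PetscReal cached_data;   /* hypothetical quantity reused by FormJacobian() */
} AppCtx;

PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ptr)
{
  AppCtx *user = (AppCtx*)ptr;
  PetscFunctionBegin;
  /* compute f = F(x) and, while the intermediate quantities are at hand,
     stash whatever parts of the Jacobian you like in user->... */
  PetscFunctionReturn(0);
}

PetscErrorCode FormJacobian(SNES snes, Vec x, Mat J, Mat Jpre, void *ptr)
{
  AppCtx *user = (AppCtx*)ptr;
  PetscFunctionBegin;
  /* retrieve the information saved by FormFunction() and finish assembling J and Jpre */
  MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);
  if (J != Jpre) { MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY); }
  PetscFunctionReturn(0);
}
</pre>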
<h3><a name="jacobianfree">How can I use Newton's method Jacobian free? Can I difference a different function than provided
with <a href="manualpages/SNES/SNESSetFunction.html">SNESSetFunction()?</a></h3>
<p>
The simplest way is with the option -snes_mf; this will use finite differencing of the function provided to SNESComputeFunction() to approximate the action of the Jacobian. Since no matrix representation of the
Jacobian is provided, the -pc_type used with this option must be -pc_type none. You can provide a custom preconditioner with SNESGetKSP(), KSPGetPC(),
PCSetType(pc,<a href="manualpages/PC/PCSHELL.html">PCSHELL</a>).
</p>
<p>
The option -snes_mf_operator will use Jacobian free to apply the Jacobian (in the Krylov solvers) but will use whatever matrix you provided with SNESSetJacobian() (assuming you set one) to compute the preconditioner.
</p>
<p>
To write the code (rather than use the options above) use <a href="manualpages/SNES/MatCreateSNESMF.html">MatCreateSNESMF()</a> and pass the resulting matrix object to
SNESSetJacobian(). For purely matrix-free (like -snes_mf) pass the matrix object for both matrix arguments and pass the function MatMFFDComputeJacobian(). To provide your own approximate Jacobian matrix to compute
the preconditioner (like -snes_mf_operator), pass this other matrix as the second matrix argument to SNESSetJacobian(). Make sure your provided computejacobian() function calls MatAssemblyBegin/End() separately on BOTH
matrix arguments to this function. See <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tests/ex7.c.html">src/snes/examples/tests/ex7.c</a>
</p>
<p>
To difference a different function than that passed to SNESSetJacobian() to compute the matrix-free Jacobian multiply call
<a href="manualpages/Mat/MatMFFDSetFunction.html">MatMFFDSetFunction()</a> to set that other function.
See <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tests/ex7.c.html">src/snes/examples/tests/ex7.c.h</a>
</p>
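<p>
A rough sketch of the coded (rather than command-line) approach follows; r, user, FormFunction, FormJacobian and P stand in for your own vector, context, callbacks and approximate Jacobian.
</p>
<pre>
SNES snes;
Mat  J;     /* matrix-free Jacobian created below */
Mat  P;     /* your approximate Jacobian used to build the preconditioner (optional) */

SNESCreate(PETSC_COMM_WORLD, &snes);
SNESSetFunction(snes, r, FormFunction, &user);
MatCreateSNESMF(snes, &J);
/* purely matrix-free, like -snes_mf: pass J as both matrix arguments */
SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian, NULL);
/* or, like -snes_mf_operator: matrix-free action but your own P for the preconditioner */
/* SNESSetJacobian(snes, J, P, FormJacobian, &user); */
</pre>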
<h3><a name="conditionnumber">How can I determine the condition number of a matrix?</a></h3>
For small matrices, the condition number can be reliably computed using
<code>-pc_type svd -pc_svd_monitor</code>. For larger matrices, you can
run with <code>-pc_type none -ksp_type gmres -ksp_monitor_singular_value
-ksp_gmres_restart 1000</code> to get approximations to the condition
number of the operator. This will generally be accurate for the largest
singular values, but may overestimate the smallest singular value unless
the method has converged. Make sure to avoid restarts. To estimate the
condition number of the preconditioned operator, replace <code>-pc_type none</code>
with the preconditioner of interest in the command above.
<h3><a name="invertmatrix">How can I compute the inverse of a matrix in PETSc?</a></h3>
It is very expensive to compute the inverse of a matrix and very rarely
needed in practice. We highly recommend avoiding algorithms that need it.
The inverse of a matrix (dense or sparse) is essentially always dense, so
begin by creating a dense matrix B filled with the identity matrix
(ones along the diagonal), and also create a dense matrix X of the same size
that will hold the solution. Then factor the matrix you wish to invert with
MatLUFactor() or MatCholeskyFactor(); call the result A. Finally call MatMatSolve(A,B,X)
to compute the inverse into X. <a href="#schurcomplement">See also</a>.
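<p>
A minimal sketch of these steps for a sequential matrix A (error checking omitted; the natural ordering and in-place LU used here are just one possible choice):
</p>
<pre>
Mat           A, B, X;          /* A is the assembled n-by-n matrix to invert */
IS            rperm, cperm;
MatFactorInfo info;
PetscInt      n, i;

MatGetSize(A, &n, NULL);
/* B = identity, X will hold the inverse */
MatCreateDense(PETSC_COMM_SELF, PETSC_DECIDE, PETSC_DECIDE, n, n, NULL, &B);
for (i = 0; i < n; i++) MatSetValue(B, i, i, 1.0, INSERT_VALUES);
MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, &X);

/* factor A in place, then solve A X = B so that X = inverse(A) */
MatFactorInfoInitialize(&info);
MatGetOrdering(A, MATORDERINGNATURAL, &rperm, &cperm);
MatLUFactor(A, rperm, cperm, &info);
MatMatSolve(A, B, X);
</pre>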
<h3><a name="schurcomplement">How can I compute the Schur complement, Kbb - Kab * inverse(Kbb) * Kba in PETSc?</a></h3>
<p>
It is very expensive to compute the Schur complement of a matrix and very
rarely needed in practice. We highly recommend avoiding algorithms that
need it. The Schur complement of a matrix (dense or sparse) is essentially
always dense, so begin by
</p>
<ul>
<li>forming a dense matrix Kba,</li>
<li>creating another dense matrix T of the same size,</li>
<li>
then either factoring the matrix Kaa directly with MatLUFactor() or
MatCholeskyFactor(), or using MatGetFactor() followed by
MatLUFactorSymbolic() and MatLUFactorNumeric() if you wish to
use an external solver package like SuperLU_Dist; call the result A,
</li>
<li>then calling MatMatSolve(A,Kba,T),</li>
<li>then calling MatMatMult(Kab,T,MAT_INITIAL_MATRIX,1.0,&S),</li>
<li>then calling MatAXPY(S,-1.0,Kbb,SUBSET_NONZERO_PATTERN),</li>
<li>followed by MatScale(S,-1.0); these steps are sketched in code below.</li>
</ul>
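<p>
A minimal sketch of these steps, assuming Kaa, Kab, Kba, and Kbb are already assembled and that Kba and Kbb are dense (error checking omitted):
</p>
<pre>
Mat           T, S;
IS            rperm, cperm;
MatFactorInfo info;

MatFactorInfoInitialize(&info);
MatGetOrdering(Kaa, MATORDERINGNATURAL, &rperm, &cperm);
MatLUFactor(Kaa, rperm, cperm, &info);                      /* factor Kaa in place */
MatDuplicate(Kba, MAT_DO_NOT_COPY_VALUES, &T);
MatMatSolve(Kaa, Kba, T);                                   /* T = inverse(Kaa) * Kba */
MatMatMult(Kab, T, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &S);  /* S = Kab * inverse(Kaa) * Kba */
MatAXPY(S, -1.0, Kbb, SUBSET_NONZERO_PATTERN);              /* S = S - Kbb */
MatScale(S, -1.0);                                          /* S = Kbb - Kab * inverse(Kaa) * Kba */
</pre>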
For computing Schur complements like this it does not make sense to use the
KSP iterative solvers, since for many moderate size problems
a direct factorization is much faster than an iterative solver. As you can
see, this requires a great deal of work space and computation so is best
avoided. However, it is not necessary to assemble the Schur complement
S in order to solve systems with it.
Use MatCreateSchurComplement(Kaa,Kaa_pre,Kab,Kba,Kbb,&S) to create
a matrix that applies the action of S (using Kaa_pre to solve with Kaa),
but does not assemble. Alternatively, if you already have a block matrix
K = [Kaa, Kab; Kba, Kbb] (in some ordering), then you can create index sets
(IS) isa and isb to address each block, then use MatGetSchurComplement() to
create the Schur complement and/or an approximation suitable for
preconditioning. Since S is generally dense, standard preconditioning
methods cannot typically be applied directly to Schur complements. There
are many approaches to preconditioning Schur complements including using
the SIMPLE approximation Kbb - Kab inv(diag(Kaa)) Kba to create a sparse
matrix that approximates the Schur complement (this is returned by default
for the optional "preconditioning" matrix in MatGetSchurComplement()). An
alternative is to interpret the matrices as differential operators and
apply approximate commutator arguments to find a spectrally equivalent
operation that can be applied efficiently (see the "PCD" preconditioners
from Elman, Silvester, and Wathen). A variant of this is the least squares
commutator, which is closely related to the Moore-Penrose pseudoinverse,
and is available in PCLSC which operates on matrices of type
MATSCHURCOMPLEMENT.
<h3><a name="fem">Do you have examples of doing unstructured grid finite element computations (FEM) with PETSc?</a></h3>
There are at least two ways to write a finite element code using PETSc
<ol>
<li>
use DMPlex, which is a high level approach to manage your mesh and
discretization. <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex62.c.html">SNES
ex62</a> solves the Stokes equation using this approach.
</li>
<li>
manage the grid data structure yourself and use
PETSc IS and VecScatter to perform the required
ghost point communication. See <a href="http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex10d/ex10.c.html">src/snes/examples/tutorials/ex10d/ex10.c</a>
</li>
</ol>
<h3><a name="da_mpi_cart">The PETSc DA object decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?</a></h3>
<p>
The MPI_Cart_create() first divides the mesh along the z direction, then
the y, then the x. DMDA divides along the x, then y, then z. Thus, for
example, rank 1 of the processes will be in a different part of the mesh
for the two schemes. To resolve this you can create a new MPI
communicator that you pass to DMDACreate() that renumbers the process
ranks so that each physical process shares the same part of the mesh with
both the DMDA and the MPI_Cart_create(). The code to determine the new
numbering was provided by Rolf Kuiper.
</p>
<pre>
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)
MPI_Comm NewComm;
int MPI_Rank, NewRank, x,y,z;
// get rank from MPI ordering:
MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank);
// calculate coordinates of cpus in MPI ordering:
x = MPI_Rank / (z_procs*y_procs);
y = (MPI_Rank % (z_procs*y_procs)) / z_procs;
z = (MPI_Rank % (z_procs*y_procs)) % z_procs;
// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;
// create communicator with new ranks according to PETSc ordering:
MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm);
// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
</pre>
<h3><a name="redistribute">When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?</a></h3>
<p>
For nonsymmetric systems put the appropriate boundary solutions in the
x vector and use <a href="manualpages/Mat/MatZeroRows.html">MatZeroRows()</a> followed by <a href="manualpages/KSP/KSPSetOperators.html">KSPSetOperators()</a>. For symmetric
problems use <a href="manualpages/Mat/MatZeroRowsColumns.html">MatZeroRowsColumns()</a> instead. If you have many Dirichlet
locations you can use MatZeroRows() (not MatZeroRowsColumns()) and
-ksp_type preonly -pc_type redistribute (see <a href="manualpages/PC/PCREDISTRIBUTE.html">
PCREDISTRIBUTE</a>) and PETSc will repartition the parallel matrix for load
balancing; in this case the new matrix solved remains symmetric even though
MatZeroRows() is used.
</p>
<p>
An alternative approach is, when assembling the matrix (generating values
and passing them to the matrix), never to include locations for the Dirichlet
grid points in the vector and matrix, instead taking them into account as you
put the other values into the load.
</p>
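<p>
For example, a minimal sketch of the symmetric case; here nrows and rows[] are assumed to hold the global indices of the Dirichlet rows, x already contains the boundary values at those locations, and b is the right-hand side:
</p>
<pre>
/* keep the matrix symmetric: zero the Dirichlet rows AND columns, put 1.0 on
   the diagonal, and adjust the right-hand side b using the values already in x */
MatZeroRowsColumns(A, nrows, rows, 1.0, x, b);
KSPSetOperators(ksp, A, A);
KSPSolve(ksp, b, x);
</pre>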
<h3><a name="matlab">How can I get PETSc Vecs and Mats to MATLAB or vice versa?</a></h3>
There are five ways to work with PETSc and MATLAB
<ol>
<li>
Using the MATLAB Engine, allowing PETSc to automatically call MATLAB
to perform some specific computations. This does not allow MATLAB to be
used interactively by the user. See the <a
href="manualpages/Sys/PetscMatlabEngine.html">PetscMatlabEngine</a>.
</li>
<li>
To save PETSc Mats and Vecs to files that can be read from MATLAB use <a
href="manualpages/Viewer/PetscViewerBinaryOpen.html">PetscViewerBinaryOpen()</a>
viewer and VecView() or MatView() to save objects for MATLAB and
VecLoad() and MatLoad() to get the objects that MATLAB has saved. See
PetscBinaryRead.m and PetscBinaryWrite.m in share/petsc/matlab for loading and
saving the objects in MATLAB; a C-side sketch appears after this list.
</li>
<li>
You can open a socket connection between MATLAB and PETSc to allow
sending objects back and forth between an interactive MATLAB session
and a running PETSc program. See
<a href="manualpages/Viewer/PetscViewerSocketOpen.html">PetscViewerSocketOpen</a>()
for access from the PETSc side and PetscReadBinary.m in share/petsc/matlab for
access from the MATLAB side.
</li>
<li>
You can save PETSc Vecs (not Mats) with the <a
href="manualpages/Viewer/PetscViewerMatlabOpen.html">PetscViewerMatlabOpen</a>()
viewer, which saves .mat files that can then be loaded into MATLAB.
</li>
<li>
We are just beginning to develop in <a
href="../developers/index.html">petsc master (branch in git)</a> an API to call most of
the PETSc functions directly from MATLAB; we could use help in
developing this. See share/petsc/matlab/classes/PetscInitialize.m
</li>
</ol>
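<p>
For example, a minimal C-side sketch of option 2; the file name is illustrative:
</p>
<pre>
PetscViewer viewer;
PetscViewerBinaryOpen(PETSC_COMM_WORLD, "data.bin", FILE_MODE_WRITE, &viewer);
MatView(A, viewer);        /* write a Mat ... */
VecView(b, viewer);        /* ... and a Vec into the same file */
PetscViewerDestroy(&viewer);
</pre>
<p>
The objects can then be read back in MATLAB, in the same order they were written, with the PetscBinaryRead.m script from share/petsc/matlab.
</p>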
<h3><a name="cython">How do I get started with Cython so that I can extend petsc4py?</a></h3>
Steps I used:
<ol>
<li>
Learn how to <a href="http://docs.cython.org/src/quickstart/build.html">build a Cython module</a>
</li>
<li>
Go through the simple example provided by Denis <a href="http://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython">here</a>.
Note also the next comment that shows how to create numpy arrays in the Cython and pass them back.
</li>
<li>
Check out <a href="http://docs.cython.org/src/tutorial/numpy.html">this page</a> which tells you how to get fast indexing
</li>
<li>
Have a look at the petsc4py <a href="http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi">array source</a>
</li>
</ol>
<h3><a name="customnormforksp">I would like to compute a custom norm for KSP to use as a convergence test or for monitoring?</a></h3>
You need to call <a href="manualpages/KSP/KSPBuildResidual.html">KSPBuildResidual</a>() on your KSP object and then
compute the appropriate norm of the resulting residual. Note that, depending on the
<a href="manualpages/KSP/KSPSetNormType.html">KSPSetNormType</a>() of the method, the norm you compute may not
be the same norm the method uses internally. See also
<a href="manualpages/KSP/KSPSetPCSide.html">KSPSetPCSide</a>()
<h3><a name="parallelsolvesequentialcode">If I have a sequential program can I use a parallel direct solver?</a></h3>
<p> Yes, you must ./configure PETSc with MPI (even though you will not use MPI) and the options --download-superlu_dist
--download-parmetis --download-metis --with-openmp</p>
<p> Your compiler must support OpenMP. To have the linear solver run in parallel run your program with</p>
<p> OMP_NUM_THREADS=n ./programname -pc_type lu -pc_factor_mat_solver_type superlu_dist</p>
<p> where n is the number of threads and should be less than or equal to the number of cores available. Do not expect to get
great speedups but you should see some reduction in the time for the linear solvers.</p>
<p> If your code is MPI parallel you can also use these same options to have SuperLU_dist utilize multiple threads per MPI process for
the direct solver. Make sure that the OMP_NUM_THREADS you use per MPI process is less than or equal to the number of cores available
for each MPI process. For example if your compute nodes have 6 cores and you use 2 MPI processes per node then set OMP_NUM_THREADS to 2 or 3.</p>
<hr>
<h2><a name="execution">Execution</a></h2>
<h3><a name="long-link-time">PETSc executables are SO big and take SO long to link.</a></h3>
We find this annoying as well. On most machines PETSc can use shared
libraries, so executables should be much smaller, run ./configure with the
additional option --with-shared-libraries (this is the default). Also, if you have room,
compiling and linking PETSc on your machine's /tmp disk or similar local
disk, rather than over the network will be much faster.
<h3><a name="petsc-options">PETSc has so many options for my program that it is hard to keep them straight.</a></h3>
Running the PETSc program with the option -help will print many of the
options. To check the options used in a run, employ -options_left, which
prints, at the end of the run, any options that the user specified but that
were never actually used by the program; this is helpful for
detecting typos in option names.
<h3><a name="petsc-log-info">PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?</a></h3>
You can use the option -info to get more details about the solution
process. The option -log_view provides details about the distribution of
time spent in the various phases of the solution process. You can run with
-ts_view or -snes_view or -ksp_view to see what solver options are being
used. Run with -ts_monitor -snes_monitor or -ksp_monitor to watch
convergence of the methods. -snes_converged_reason and
-ksp_converged_reason will indicate why and if the solvers have converged.
<h3><a name="efficient-assembly">Assembling large sparse matrices takes a long time. What can I do make this process faster? or MatSetValues() is so slow, what can I do to speed it up?</a></h3>
See the <a href="manual.pdf#nameddest=ch_performance">Performance chapter of the users manual</a> for many tips on this.
<ol>
<li>
Preallocate enough space for the sparse matrix. For example, rather
than calling MatCreateSeqAIJ(comm,n,n,0,NULL,&mat); call
MatCreateSeqAIJ(comm,n,n,rowmax,NULL,&mat); where rowmax is
the maximum number of nonzeros expected per row. Or if you know the
number of nonzeros per row, you can pass this information in instead of
the NULL argument. See the manual pages for each of the
MatCreateXXX() routines.
</li>
<li>
Insert blocks of values into the matrix, rather than individual components.
</li>
</ol>
<p>Preallocation of matrix memory is crucial for good performance for large problems, see:</p>
<ul>
<li><a href="manual.pdf#sec_matsparse">manual.pdf#sec_matsparse</a></li>
<li><a href="manualpages/Mat/MatCreateMPIAIJ.html">manualpages/Mat/MatCreateMPIAIJ.html</a></li>
</ul>
<p>
If you can set several nonzeros in a block at the same time, this is faster
than calling MatSetValues() for each individual matrix entry.
</p>
<p>
It is best to generate most matrix entries on the process they belong to
(so they do not have to be stashed and then shipped to the owning process).
Note: it is fine to have some entries generated on the "wrong" process,
just not many.
</p>
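<p>
For example, a minimal sketch that preallocates a sequential AIJ matrix with a per-row nonzero count and then inserts a small block of values in one MatSetValues() call; the sizes and values are illustrative:
</p>
<pre>
Mat      A;
PetscInt n = 100, i, *nnz;

PetscMalloc1(n, &nnz);
for (i = 0; i < n; i++) nnz[i] = 5;             /* expected nonzeros in row i */
MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 0, nnz, &A);
PetscFree(nnz);

{
  PetscInt    rows[2] = {0, 1}, cols[2] = {0, 1};
  PetscScalar vals[4] = {4.0, -1.0, -1.0, 4.0};  /* 2x2 block, row-major */
  MatSetValues(A, 2, rows, 2, cols, vals, INSERT_VALUES);
}
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
</pre>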
<h3><a name="log-view">How can I generate performance summaries with PETSc?</a></h3>
Use the option -log_view at runtime. See the <a href="manual.pdf#nameddest=ch_performance">Performance chapter of the users manual</a>
for information on interpreting the summary data. If using the PETSc
(non)linear solvers, one can also specify -snes_view or -ksp_view for
a printout of solver info. Only the highest level PETSc object used needs
to specify the view option.
<h3><a name="parallel-roundoff">Why do I get different answers on a different numbers of processors?</a></h3>
Most commonly, you are using a preconditioner which behaves differently
based upon the number of processors, such as Block-Jacobi which is the
PETSc default. However, since computations are reordered in parallel, small
roundoff errors will still be present with identical mathematical
formulations. If you set a tighter linear solver tolerance (using
-ksp_rtol), the differences will decrease.
<h3><a name="mg-log">How do I know the amount of time spent on each level of the multigrid solver/preconditioner?</a></h3>
Run with <code>-log_view</code> and <code>-pc_mg_log</code>
<h3><a name="datafiles">Where do I get the input matrices for the examples?</a></h3>
Some examples use ${DATAFILESPATH}/matrices/medium and other files. These
test matrices in PETSc binary format can be found with anonymous ftp from
<a href="http://ftp.mcs.anl.gov">ftp.mcs.anl.gov</a> in the directory
pub/petsc/Datafiles/matrices. They are not included with the PETSc distribution in the
interest of reducing the distribution size. You can obtain many of the files in pub/petsc/Datafiles.tar.gz
<h3><a name="info">When I dump some matrices and vectors to binary, I seem to be generating some empty files with .info extensions. What's the deal with these?</a></h3>
PETSc binary viewers put some additional information, such as the matrix block
size, into .info files; it is harmless. If you really don't want the files you can
use -viewer_binary_skip_info or PetscViewerBinarySkipInfo(); note that you need
to call PetscViewerBinarySkipInfo() before PetscViewerFileSetName(), in
other words you cannot use PetscViewerBinaryOpen() directly.
<h3><a name="slowerparallel">Why is my parallel solver slower than my sequential solver, or I have poor speed-up?</a></h3>
This can happen for many reasons:
<ul>
<li>
First make sure it is truly the time in KSPSolve() that is slower (by
running the code with <a href="#log-view">-log_view</a>).
Often the slower time is in <a href="#slow">generating the matrix</a>
or some other operation.
</li>
<li>
There must be enough work for each process to overweigh the
communication time. We recommend an absolute minimum of about 10,000
unknowns per process, better is 20,000 or more.
</li>
<li>
Make sure the <a href="#computers">communication speed of the parallel computer</a>
is good enough for parallel solvers.
</li>
<li>
Check the number of solver iterations with the parallel solver against
the sequential solver. Most preconditioners require more iterations
when used on more processes; this is particularly true for block
Jacobi, the default parallel preconditioner. You can try -pc_type asm
(<a href="manualpages/PC/PCASM.html">PCASM</a>), whose
iterations scale somewhat better with the number of processes. You may also consider
multigrid preconditioners like <a href="manualpages/PC/PCMG.html">PCMG</a>
or BoomerAMG in <a href="manualpages/PC/PCHYPRE.html">PCHYPRE</a>.
</li>
</ul>
<h3><a name="pipelined">What steps are necessary to make the pipelined solvers execute efficiently?</a></h3>
Pipelined solvers like
<a href="manualpages/KSP/KSPPGMRES.html">KSPPGMRES</a>,
<a href="manualpages/KSP/KSPPIPECG.html">KSPPIPECG</a>,
<a href="manualpages/KSP/KSPPIPECR.html">KSPPIPECR</a>, and
<a href="manualpages/KSP/KSPGROPPCG.html">KSPGROPPCG</a>
may require special MPI configuration to effectively overlap reductions with computation.
In general, this requires an MPI-3 implementation, an implementation that supports multiple threads, and use of a "progress thread".
Consult the documentation from your vendor or computing facility for more.
<dl>
<dt>MPICH</dt>
<dd>
MPICH version 3.0 and later implements the MPI-3 standard and the default configuration supports use of threads.
Use of a progress thread is configured by setting the environment variable <tt>MPICH_ASYNC_PROGRESS=1</tt>.
</dd>
<dt>Cray MPI</dt>
<dd>
Cray MPT-5.6 supports MPI-3, but requires the environment variable <tt>MPICH_MAX_THREAD_SAFETY=multiple</tt> for threads, plus either <tt>MPICH_ASYNC_PROGRESS=1</tt> or <tt>MPICH_NEMESIS_ASYNC_PROGRESS=1</tt>.
</dd>
</dl>
<h3><a name="singleprecision">When using PETSc in single precision mode (--with-precision=single when running ./configure) are the operations done in single or double precision?</a></h3>
PETSc does NOT do any explicit conversion of single precision to double
before performing computations; thus it depends on the hardware and
compiler what happens. For example, the compiler could choose to put the
single precision numbers into the usual double precision registers and then
use the usual double precision floating point unit. Or it could use SSE2
instructions that work directly on the single precision numbers. It is
a bit of a mystery what decisions get made sometimes. There may be compiler
flags in some circumstances that can affect this.
<h3><a name="newton">Why is Newton's method (SNES) not converging, or converges slowly?</a></h3>
Newton's method may not converge for many reasons, here are some of the most common.
<ul>
<li>The Jacobian is wrong (or correct in sequential but not in parallel).</li>
<li>The linear system <a href="#kspdiverged">is not solved</a> or is not solved accurately enough.</li>
<li>The Jacobian system has a singularity that the linear solver is not handling.</li>
<li>There is a bug in the function evaluation routine.</li>
<li>The function is not continuous or does not have continuous first derivatives (e.g. phase change or TVD limiters).</li>
<li>
The equations may not have a solution (e.g. limit cycle instead of
a steady state) or there may be a "hill" between the initial guess and
the steady state (e.g. reactants must ignite and burn before reaching
a steady state, but the steady-state residual will be larger during
combustion).
</li>
</ul>
Here are some of the ways to help debug lack of convergence of Newton.
<ul>
<li>Run with the options <code>-snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason</code>.
<ul>
<li>
If the linear solve does not converge, check if the Jacobian is
correct, then see <a href="#kspdiverged">this question</a>.
</li>
<li>
If the preconditioned residual converges, but the true residual
does not, the preconditioner may be singular.
</li>
<li>
If the linear solve converges well, but the line search fails, the
Jacobian may be incorrect.
</li>
</ul>
</li>
<li>
Run with <code>-pc_type lu</code> or <code>-pc_type svd</code> to see
if the problem is a poor linear solver
</li>
<li>
Run with <code>-mat_view</code> or <code>-mat_view draw</code> to see
if the Jacobian looks reasonable
</li>
<li>
Run with <code>-snes_test_jacobian -snes_test_jacobian_view</code> to see if the
Jacobian you are using is wrong. Compare the output when you add
<code>-mat_fd_type ds</code> to see if the result is sensitive to the
choice of differencing parameter.
</li>
<li>
Run with <code>-snes_mf_operator -pc_type lu</code> to see if the
Jacobian you are using is wrong. If the problem is too large for
a direct solve, try <code>-snes_mf_operator -pc_type ksp -ksp_ksp_rtol
1e-12</code>. Compare the output when you add <code>-mat_mffd_type
ds</code> to see if the result is sensitive to choice of differencing
parameter.
</li>
<li>Run on one processor to see if the problem is only in parallel.</li>
<li>
Run with <code>-snes_linesearch_monitor</code> to see if the line search
is failing (this is usually a sign of a bad Jacobian). Use -info in PETSc 3.1
and older versions, <code>-snes_ls_monitor</code> in PETSc 3.2
and <code>-snes_linesearch_monitor</code> in PETSc 3.3 and later.
</li>
<li>
Run with <code>-info</code> to get more detailed information on the
solution process.
</li>
</ul>
Here are some ways to help the Newton process if everything above checks out
<ul>
<li>
Run with grid sequencing (<code>-snes_grid_sequence</code> is all you need
if working with a DM) to generate a better initial guess on your
finer mesh
</li>
<li>
Run with quad precision (./configure with --with-precision=__float128
--download-f2cblaslapack with PETSc 3.2 and later and recent versions
of the GNU compilers)
</li>
<li>
Change the units (nondimensionalization), boundary condition scaling,
or formulation so that the Jacobian is better conditioned. See <a href="http://en.wikipedia.org/wiki/Buckingham_π_theorem">http://en.wikipedia.org/wiki/Buckingham_π_theorem</a>
</li>
<li>
Mollify features in the function that do not have continuous first
derivatives (often occurs when there are "if" statements in the
residual evaluation, e.g. phase change or TVD limiters). Use
a variational inequality solver (SNESVINEWTONRSLS) if the discontinuities are
of fundamental importance.
</li>
<li>
Try a trust region method (<code>-snes_type newtontr</code>, may have to adjust
parameters).
</li>
<li>
Run with some continuation parameter from a point where you know the
solution, see TSPSEUDO for steady-states.
</li>
<li>
There are homotopy solver packages like PHCpack that can get you all
possible solutions (and tell you that it has found them all) but those
are not scalable and cannot solve anything but small problems.
</li>
</ul>
<h3><a name="kspdiverged">Why is the linear solver (KSP) not converging, or converges slowly?</a></h3>
<p>
Always run with <code>-ksp_converged_reason -ksp_monitor_true_residual</code>
when trying to learn why a method is not converging. Common reasons for
KSP not converging are
</p>
<ul>
<li>
The equations are singular by accident (e.g. forgot to impose boundary
conditions). Check this for a small problem using <code>-pc_type svd
-pc_svd_monitor</code>.
</li>
<li>
The equations are intentionally singular (e.g. constant null space),
but the Krylov method was not informed, see MatSetNullSpace().
</li>
<li>
The equations are intentionally singular and MatSetNullSpace() was
used, but the right hand side is not consistent. You may have to call
MatNullSpaceRemove() on the right hand side before calling KSPSolve().
See MatSetTransposeNullSpace()
</li>
<li>
The equations are indefinite so that standard preconditioners don't
work. Usually you will know this from the physics, but you can check
with <code>-ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none</code>.
For simple saddle point problems, try <code>-pc_type fieldsplit
-pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point</code>.
For more difficult problems, read the literature to find robust methods
and ask petsc-users@mcs.anl.gov or petsc-maint@mcs.anl.gov if you want
advice about how to implement them.
</li>
<li>
If the method converges in preconditioned residual, but not in true
residual, the preconditioner is likely singular or nearly so. This is
common for saddle point problems (e.g. incompressible flow) or strongly
nonsymmetric operators (e.g. low-Mach hyperbolic problems with large
time steps).
</li>
<li>
The preconditioner is too weak or is unstable. See if <code>-pc_type
asm -sub_pc_type lu</code> improves the convergence rate. If GMRES is
losing too much progress in the restart, see if longer restarts help
<code>-ksp_gmres_restart 300</code>. If a transpose is available, try
<code>-ksp_type bcgs</code> or other methods that do not require
a restart. (Note that convergence with these methods is frequently
erratic.)
</li>
<li>
The preconditioner is nonlinear (e.g. a nested iterative solve), try
<code>-ksp_type fgmres</code> or <code>-ksp_type gcr</code>.
</li>
<li>
You are using geometric multigrid, but some equations (often boundary
conditions) are not scaled compatibly between levels. Try
<code>-pc_mg_galerkin both</code> to algebraically construct a correctly
scaled coarse operator or make sure that all the equations are scaled
in the same way if you want to use rediscretized coarse levels.
</li>
<li>
The matrix is very ill-conditioned. Check the <a href="#conditionnumber">condition number</a>.
<ul>
<li> Try to improve it by choosing the relative scaling of components/boundary conditions.</li>
<li>Try <code>-ksp_diagonal_scale -ksp_diagonal_scale_fix</code>.</li>
<li>Perhaps change the formulation of the problem to produce more friendly algebraic equations.</li>
</ul>
</li>
<li>
The matrix is nonlinear (e.g. evaluated using finite differencing of
a nonlinear function). Try different differencing parameters,
<code>./configure --with-precision=__float128 --download-f2cblaslapack</code>,
check if it converges in "easier" parameter regimes.
</li>
<li>A symmetric method is being used for a non-symmetric problem.</li>
<li>
Classical Gram-Schmidt is becoming unstable, try <code>-ksp_gmres_modifiedgramschmidt</code>
or use a method that orthogonalizes differently, e.g. <code>-ksp_type gcr</code>.
</li>
</ul>
<hr>
<h2><a name="debugging">Debugging</a></h2>
<h3><a name="debug-ibm">How do I turn off PETSc signal handling so I can use the -C option on xlF?</a></h3>
<p>Immediately after calling PetscInitialize() call PetscPopSignalHandler()</p>
<p>
Some Fortran compilers including the IBM xlf, xlF etc compilers have
a compile option (-C for IBM's) that causes all array access in Fortran
to be checked that they are in-bounds. This is a great feature but does
require that the array dimensions be set explicitly, not with a *.
</p>
<h3><a name="start_in_debugger-doesnotwork">How do I debug if -start_in_debugger does not work on my machine?</a></h3>
<p>On newer Mac OS X machines, one has to be in the admin group to be able to use the debugger.</p>
<p>
On newer Ubuntu Linux machines, one has to disable <a href="https://wiki.ubuntu.com/Security/Features#ptrace">ptrace_scope</a>,
for example with "echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope", to get
-start_in_debugger working.
</p>
<p>
If -start_in_debugger does not really work on your OS, then for a uniprocessor
job just run the debugger directly, for example: gdb ex1. You can also
use TotalView, which is a good graphical parallel debugger.
</p>
<h3><a name="debug-hang">How do I see where my code is hanging?</a></h3>
You can use the -start_in_debugger option to start all processes in the
debugger (each will come up in its own xterm) or run in Totalview. Then use
cont (for continue) in each xterm. Once you are sure that the program is
hanging, hit control-c in each xterm and then use 'where' to print a stack
trace for each process.
<h3><a name="debug-inspect">How can I inspect Vec and Mat values when in the debugger?</a></h3>
<p>
I will illustrate this with gdb, but it should be similar on other
debuggers. You can look at local Vec values directly by obtaining the
array. For a Vec v, we can print all local values using
</p>
<pre>
(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
</pre>
<p>
However, this becomes much more complicated for a matrix. Therefore, it
is advisable to use the default viewer to look at the object. For a Vec
v and a Mat m, this would be
</p>
<pre>
(gdb) call VecView(v, 0)
(gdb) call MatView(m, 0)
</pre>
<p>or with a communicator other than MPI_COMM_WORLD,</p>
<pre>
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
</pre>
<p>
Totalview 8.8.0 has a new feature that allows libraries to provide their
own code to display objects in the debugger. Thus in theory each PETSc
object, Vec, Mat etc could have custom code to print values in the
object. We have only done this for the most elementary display of Vec and
Mat. See the routine TV_display_type() in src/vec/vec/interface/vector.c
for an example of how these may be written. Contact us if you would like
to add more.
</p>
<h3><a name="debug-fp">How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?</a></h3>
<p>
The best way to locate floating point exceptions is to use a debugger.
On supported architectures (including Linux and glibc-based systems), just run in a debugger and pass -fp_trap to the PETSc application.
This will activate signaling exceptions and the debugger will break on the line that first divides by zero or otherwise generates an exceptions.
Without a debugger, running with -fp_trap in debug mode will only identify the function in which the error occurred, but not the line or the type of exception.
If -fp_trap is not supported on your architecture, consult the documentation for your debugger since there is likely a way to have it catch exceptions.
</p>
<h3><a name="libimf">Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory.</a></h3>
<p>
The Intel compilers use shared libraries (like libimf) that cannot be found by
default at run time. When using the Intel compilers (and running the
resulting code) you must make sure that the proper Intel initialization
scripts are run. This is usually done by putting some code into your
.cshrc, .bashrc, .profile etc. file. Sometimes on batch systems that
do not source your initialization files (like .cshrc) you must include
the initialization calls in your batch submission script.
</p>
For example, on my Mac using csh I have the following in my .cshrc file
<pre>
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
</pre>
in my .profile I have
<pre>
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
</pre>
<h3><a name="objecttypenotset">What does Object Type not set: Argument # n mean?</a></h3>
Many operations on PETSc objects require that the specific type of the
object be set before the operations is performed. You must call
XXXSetType() or XXXSetFromOptions() before you make the offending call. For
example, MatCreate(comm,&A); MatSetValues(A,....); will not work. You
must add MatSetType(A,...) or MatSetFromOptions(A,....); before the call to
MatSetValues();
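<p>
For example, a minimal sketch of the required order of calls; the sizes and values are illustrative:
</p>
<pre>
Mat A;
MatCreate(PETSC_COMM_WORLD, &A);
MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 100, 100);
MatSetFromOptions(A);                          /* or MatSetType(A, MATAIJ) */
MatSetUp(A);
MatSetValue(A, 0, 0, 1.0, INSERT_VALUES);      /* now this is legal */
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
</pre>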
<h3><a name="split">What does <code>Error detected in PetscSplitOwnership() about "sum of local lengths ...":</code> mean?</a></h3>
In a previous call to VecSetSizes(), MatSetSizes(), VecCreateXXX() or
MatCreateXXX() you passed in local and global sizes that do not make sense
for the correct number of processors. For example if you pass in a local
size of 2 and a global size of 100 and run on two processors, this cannot
work since the sum of the local sizes is 4, not 100.
<h3><a name="valgrind">What does <code>Corrupt argument or Caught signal or SEQV or segmentation violation or bus error</code> mean? Can I use valgrind to debug memory corruption issues?</a></h3>
<p>
Sometimes it can mean an argument to a function is invalid. In Fortran
this may be caused by forgetting to list an argument in the call,
especially the final ierr.
</p>
<p>
Otherwise it is usually caused by memory corruption; that is somewhere
the code is writing out of array bounds. To track this down rerun the
debug version of the code with the option -malloc_debug. Occasionally the
code may crash only with the optimized version, in that case run the
optimized version with -malloc_debug. If you determine the problem is
from memory corruption you can put the macro CHKMEMQ in the code near the
crash to determine exactly what line is causing the problem.
</p>
If -malloc_debug does not help: on GNU/Linux and Apple Mac OS X machines
you can try using <a href="http://valgrind.org/">valgrind</a>
to look for memory corruption (make sure valgrind is installed).
You might also consider using <a href="http://drmemory.org">Dr. Memory</a>, which has
support for GNU/Linux, Apple Mac OS, and Microsoft Windows machines. (Note we haven't tried this ourselves.)
<ul>
<li>Recommend building PETSc with <code>--download-mpich --with-debugging</code> [debugging is enabled by default]</li>
<li>Compile application code with this build of PETSc</li>
<li>run with valgrind using: <code>${PETSC_DIR}/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS</code></li>
<li>or invoke valgrind directly with: <code>mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS</code></li>
</ul>
Notes:
<ul>
<li>option <code>--with-debugging</code> enables valgrind to give stack trace with additional source-file:line-number info.</li>
<li>option <code>--download-mpich</code> gives valgrind clean MPI - hence the recommendation.</li>
<li>As for other MPI implementations, Open MPI should also work; MPICH1 will not.</li>
<li>if <code>--download-mpich</code> is used - mpiexec will be in PETSC_ARCH/bin</li>
<li><code>--log-file=valgrind.log.%p</code> option tells valgrind to store the output from each proc in a different file [as %p i.e PID, is different for each MPI proc].</li>
<li>On Apple you need the additional valgrind option <code>--dsymutil=yes</code></li>
<li>memcheck will not find certain array accesses that violate static array declarations, so if memcheck runs clean you can try <code>--tool=exp-ptrcheck</code> instead.</li>
</ul>
<h3><a name="zeropivot">What does <code>Detected zero pivot in LU factorization</code> mean?</a></h3>
<p>
A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not
always mean that the matrix is singular. You can use '-pc_factor_shift_type
NONZERO -pc_factor_shift_amount [amount]' or '-pc_factor_shift_type
POSITIVE_DEFINITE'; '-[level]_pc_factor_shift_type NONZERO
-pc_factor_shift_amount [amount]' or '-[level]_pc_factor_shift_type
POSITIVE_DEFINITE' to prevent the zero pivot. [level] is "sub" when lu,
ilu, cholesky, or icc are employed in each individual block of the bjacobi
or ASM preconditioner; and [level] is "mg_levels" or "mg_coarse" when lu,
ilu, cholesky, or icc are used inside multigrid smoothers or to the coarse
grid solver. See PCFactorSetShiftType() and PCFactorSetShiftAmount().
</p>
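<p>
For example, a minimal sketch of setting the shift from code rather than from the options database, assuming ksp is the solver whose factorization is failing:
</p>
<pre>
PC pc;
KSPGetPC(ksp, &pc);
PCFactorSetShiftType(pc, MAT_SHIFT_NONZERO);      /* or MAT_SHIFT_POSITIVE_DEFINITE */
PCFactorSetShiftAmount(pc, 1.e-10);
</pre>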
<p>This error can also happen if your matrix is singular, see MatSetNullSpace() for how to handle this.</p>
<p>
If this error occurs in the zeroth row of the matrix, it is likely you have
an error in the code that generates the matrix.
</p>
<h3><a name="xwindows">You create <code>Draw</code> windows or <code>ViewerDraw</code> windows or use options <code>-ksp_monitor_lg_residualnorm</code> or <code>-snes_monitor_lg_residualnorm</code> and the program seems to run OK but windows never open.</a></h3>
The libraries were compiled without support for X windows. Make sure that
./configure was run with the option <code>--with-x</code>
<h3><a name="memory">The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory.</a></h3>
Problem: Possibly some of the following:
<ol>
<li>You are creating new PETSc objects but never freeing them.</li>
<li>There is a memory leak in PETSc or your code.</li>
<li>
Something much more subtle: (if you are using Fortran). When you
declare a large array in Fortran, the operating system does not
allocate all the memory pages for that array until you start using the
different locations in the array. Thus, in a code, if at each step you
start using later values in the array your virtual memory usage will
"continue" to increase as measured by <code>ps</code> or <code>top</code>.
</li>
<li>
You are running with the -log, -log_mpe, or -log_all option. With these options, a great
deal of logging information is stored in memory until the conclusion of
the run.
</li>
<li>
You are linking with the MPI profiling libraries; these cause logging
of all MPI activities. Another Symptom is at the conclusion of the run
it may print some message about writing log files.
</li>
</ol>
Cures:
<ol>
<li>
Run with the -malloc_debug option and -malloc_dump. Or use the commands
PetscMallocDump() and PetscMallocLogDump() sprinkled in your code to
track memory that is allocated and not later freed. Use the commands
PetscMallocGetCurrentUsage() and PetscMemoryGetCurrentUsage() to
monitor memory allocated and PetscMallocGetMaximumUsage() and PetscMemoryGetMaximumUsage()
for total memory used as the code progresses.
</li>
<li>This is just the way Unix works and is harmless.</li>
<li>
Do not use the -log, -log_mpe, or -log_all option, or use
PetscLogEventDeactivate() or PetscLogEventDeactivateClass()
to turn off logging of specific events.
</li>
<li>Make sure you do not link with the MPI profiling libraries.</li>
</ol>
<h3><a name="key">When calling MatPartitioningApply() you get a message Error! Key 16615 not found</a></h3>
The graph of the matrix you are using is not symmetric. You must use symmetric matrices for partitioning.
<h3><a name="gmres">With GMRES At restart the second residual norm printed does not match the first.</a></h3>
<pre>
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
</pre>
<p>
<b>Problem</b>: Actually this is not surprising. GMRES computes the norm of the
residual at each iteration via a recurrence relation between the norms of
the residuals at the previous iterations and quantities computed at the
current iteration; it does not compute it directly via || b - A x^{n} ||.
Sometimes, especially with an ill-conditioned matrix, or computation of the
matrix-vector product via differencing, the residual norms computed by
GMRES start to "drift" from the correct values. At the restart, we compute
the residual norm directly, hence the difference in the printed values. The
drifting, if it remains small, is harmless (it does not affect the
accuracy of the solution that GMRES computes).
</p>
<p>
<b>Cure</b>: There really isn't a cure, but if you use a more powerful
preconditioner the drift will often be smaller and less noticeable. Or if
you are running matrix-free you may need to tune the matrix-free
parameters.
</p>
<h3><a name="doubleits">Why do some Krylov methods seem to print two residual norms per iteration?</a></h3>
<pre>
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
</pre>
<p>
Some Krylov methods, for example tfqmr, actually have a "sub-iteration"
of size 2 inside the loop; each of the two substeps has its own matrix
vector product and application of the preconditioner and updates the
residual approximations. This is why you get this "funny" output where it
looks like there are two residual norms per iteration. You can also think
of it as twice as many iterations.
</p>
<h3><a name="dylib">Unable to locate PETSc dynamic library /home/balay/spetsc/lib/libg/linux/libpetsc</a></h3>
<p>
When using dynamic libraries, the libraries cannot be moved after they are
installed. This can also happen on clusters where the paths are
different on the (run) nodes than on the (compile) front-end. In that case
do not use dynamic and shared libraries: run ./configure with
<code>--with-shared-libraries=0 --with-dynamic-loading=0</code>
(the --with-dynamic-loading option has been removed in petsc-3.5).
</p>
<h3><a name="bisect">How do I determine what update to PETSc broke my code?</a></h3>
If at some point [in petsc code history] you had a working code, but the
latest petsc code broke it, it is possible to determine the petsc code change
that might have caused this behavior. This is achieved by:
<ul>
<li>using <a href="http://git-scm.com/">Git</a> to access PETSc sources</li>
<li>knowing the Git commit for the <b>known working</b> version of PETSc</li>
<li>knowing the Git commit for the <b>known broken</b> version of PETSc</li>
<li>using the <a href="https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html">bisect</a> functionality of Git</li>
</ul>
This process can be as follows:
<ul>
<li>get petsc development (master branch in git) sources:
<ul>
<li><code>git clone https://bitbucket.org/petsc/petsc</code></li>
</ul>
</li>
<li>
Find the <b>good</b> and <b>bad</b> markers to
start the bisection process. This can be done either by checking
<code>git log</code> or <code>gitk</code> or <a href="https://bitbucket.org/petsc/petsc">https://bitbucket.org/petsc/petsc</a>
or the web history of petsc-release clones. Let's say the known bad
commit is 21af4baa815c and the known good commit is 5ae5ab319844
</li>
<li>
Now start the bisection process with these known revisions. [build PETSc, and test your code to confirm known good/bad behavior]
<ul>
<li><code>git bisect start 21af4baa815c 5ae5ab319844</code></li>
<li><build/test/confirm-bad></li>
<li><code>git bisect bad</code></li>
<li><build/test/confirm-good></li>
<li><code>git bisect good</code></li>
</ul>
</li>
<li>
Now until done - keep bisecting, building PETSc, and testing your code with it and determine if the code is working or not.
After something like 5-15 iterations, <code>git bisect</code> will
pin-point the exact code change that resulted in the difference in
application behavior.
</li>
<li>
See <a href="https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html">git-bisect(1)</a>
and the <a href="http://git-scm.com/book/en/Git-Tools-Debugging-with-Git">debugging
section of the Git Book</a> for more debugging tips.
</li>
</ul>
<hr>
<h2><a name="shared-libraries">Shared Libraries</a></h2>
<h3><a name="install-shared">Can I install PETSc libraries as shared libraries?</a></h3>
Yes. Use the <code>./configure</code> option <code>--with-shared-libraries</code>.
<h3><a name="why-use-shared">Why should I use shared libraries?</a></h3>
When you link to shared libraries, the function symbols from the shared
libraries are not copied in the executable. This way the size of the
executable is considerably smaller than when using regular libraries. This
helps in a couple of ways:
<ol>
<li>saves disk space when more than one executable is created, and</li>
<li>improves the link time immensely, because the linker has to write a much smaller file (executable) to the disk.</li>
</ol>
<h3><a name="link-shared">How do I link to the PETSc shared libraries?</a></h3>
By default, the compiler should pick up the shared libraries instead of the regular ones. Nothing special should be done for this.
<h3><a name="link-regular-lib">What If I want to link to the regular .a library files?</a></h3>
You must run ./configure with the option --with-shared-libraries=0 (you
can use a different PETSC_ARCH for this build so you can easily switch
between the two).
<h3><a name="move-shared-exec">What do I do if I want to move my executable to a different machine?</a></h3>
<p>
You would also need to have access to the shared libraries on this new
machine. The alternative is to build the executable without shared
libraries by first deleting the shared libraries and then creating the
executable.
</p>
</div>
</body>
</html>
|