1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753

This is the README file for GMPECM. (See INSTALLecm for installing GMPECM
and the ecm library, and README.lib for using the ecm library.)
Table of contents of this file:
1. Basic usage.
2. How to use P1, P+1, and ECM efficiently?
3. Extra factors and BrentSuyama's extension.
4. Memory usage.
5. Expression syntax reference for GMPECM's syntax parser.
6. Options param, sigma, x0, y0, A, torsion.
7. Options save, resume and chkpnt.
8. How to get the best of GMPECM?
9. GMPECM and GPU.
11. Record factors.
11. Known problems.
##############################################################################
1. Basic usage
GMPECM reads the numbers to be factored from stdin (one number on each
line) and requires a numerical parameter, the stage 1 bound B1. A reasonable
stage 2 bound B2 for the given B1 is chosen by default, but can be overridden
by a second numerical parameter. By default, GMPECM uses the ECM factoring
algorithm.
Example: To run one curve of ECM with B1=1000000 on each number in the file
"composites", run
ecm 1000000 < composites
To use a B2 value of ~5*10^8 instead of the default value of ~10^9, run
ecm 1000000 5e8 < composites
Scientific notation is accepted for B1 and B2 values. The actual B2 value
used may be larger than the specified value to let parameters satisfy some
conditions imposed by the stage 2 algorithm.
To run one curve with B1=11e7 on M1061, simply do:
echo "2^10611"  ecm 11e7
To run more than one ECM curve on each input number, use the c parameter.
Example: to run 100 curves with B1=1000000 and default B2 on each number
in "composites", run
ecm c 100 1000000 < composites
To use the P1 or P+1 factoring methods, use the pm1 or pp1 parameter,
respectively.
Example: to use the P1 method with B1=10^9 on all numbers in the file
"composites", run
ecm pm1 1e9 < composites
Note that, unlike for ECM, using the same B1,B2 bounds on one number is
quite useless for P1, and of limited use for P+1. See "2. How to use P1,
P+1, and ECM efficiently?"
##############################################################################
2. How to use P1, P+1, and ECM efficiently?
The P1 method works well when the input number has a prime factor P such
that P1 is "smooth", i.e., has all its prime factor less or equal the
step 1 bound B1, except one which may be less or equal the second step
bound B2. For P=67872792749091946529, we have P1 = 2^5 * 11 * 17 * 19 *
43 * 149 * 8467 * 11004397, so this factor will be found as long as B1 >= 8467
and B2 >= 11004397:
$ echo 67872792749091946529  ./ecm pm1 x0 2809890345 8467 11004397
GMPECM ... [powered by GMP ...] [P1]
Input number is 67872792749091946529 (20 digits)
Using B1=8467, B2=671019370830, x0=2809890345
Step 1 took 3ms
Step 2 took 14ms
********** Factor found in step 2: 67872792749091946529
Found input number N
There is no need to run P1 several times with the same B1 and B2 as there
is for ECM, since a factor found with one seed will (almost always) be found
by another one.
Note: if a factor of P1 if known, you can specify it with go. For example
it is known that the Fermat number 2^(2^n)+1 has factors of the form
k*2^(n+2)+1, thus 2^(n+2) is a known factor of P1. The go expression has
a special placeholder 'N' which stands for the input number, for example:
$ echo "2^(2^12)+1"  ecm go "N1" pm1 1e6
The P+1 method works well when the input number has a prime factor P such
that P+1 is "smooth". For P=4190453151940208656715582382315221647, we have
P+1 = 2^4 * 283 * 2423 * 21881 * 39839 * 1414261 * 2337233 * 132554351, so
this factor will be found as long as B1 >= 2337233 and B2 >= 132554351:
$ echo 4190453151940208656715582382315221647  ./ecm pp1 x0 7 2337233 132554351
GMPECM ... [powered by GMP ...] [P+1]
Input number is 4190453151940208656715582382315221647 (37 digits)
Using B1=2337233, B2=2324738343958122, x0=7
Step 1 took 750ms
Step 2 took 120ms
********** Factor found in step 2: 4190453151940208656715582382315221647
Found input number N
However not all seeds will succeed: only half of the seeds 'x0' work for P+1
(namely those where the Jacobi symbol of x0^24 and P is 1.) Unfortunately,
since P is usually not known in advance, there is no way to ensure that this
holds. However, if the seed is chosen randomly, there is a probability of
about 1/2 that it will give a Jacobi symbol of 1 (i.e., the factor P will
be found if P+1 is smooth enough). A rule of thumb is to run 3 times P+1
with different random seeds. The seeds 2/7 and 6/5 have a slightly higher
chance of success than average as they lead to a group order divisible by
6 or 4, respectively. When factoring Fibonacci numbers F_n or Lucas
numbers L_n, using the seed 23/11 ensures that the group order is
divisible by 2n, making other P+1 (and probably P1) work unnecessary.
As of version 6.2, a new stage 2 for the P1 and P+1 algorithms is
implemented. It uses less memory and is faster than the previous code,
thus allowing larger B2 values. If GMPECM is configured with the
"enableopenmp" flag and is compiled with a compiler that implements
OpenMP, it uses multithreading for computation of polynomial roots and
NTT multiplication. When not using the NTT, it benefits from multithreading
only in the computation of roots phase. The number of threads to use can
be controlled with the OMP_NUM_THREADS environment variable. Unlike the
previous generic stage 2, the new stage 2 cannot use the BrentSuyama
extension (power and dickson parameters). Specifying these options on
the command line forces use of the generic stage 2. Note: the notation of
the parameters follows that in the paper, the number of multipoint
evaluations (similar to "blocks") is given by s_2. You can specify a lower
limit for s_2 by the k command line parameter.
The ECM method is a probabilistic method, and can be viewed in some sense
as a generalization of the P1 and P+1 method, where we only require that
P+t+1 is smooth, where t depends on the curve we use and satisfies
t <= 2*P^(1/2) (Hasse's theorem). The optimal B1 and B2 bounds have
to be chosen according to the (usually unknown) size of P. The following
table gives a set of nearly optimal B1 and B2 pairs, with the corresponding
expected number of curves to find a factor of given size (column "power 1"
does not take into account the extra factors found by BrentSuyama's exten
sion, whereas column "default poly" takes them into account, with the poly
nomial used by default: D(n) means Dickson's polynomial of degree n):
digits D optimal B1 default B2 expected curves
N(B1,B2,D)
power 1 default poly
20 11e3 1.9e6 74 74 [x^1]
25 5e4 1.3e7 221 214 [x^2]
30 25e4 1.3e8 453 430 [D(3)]
35 1e6 1.0e9 984 904 [D(6)]
40 3e6 5.7e9 2541 2350 [D(6)]
45 11e6 3.5e10 4949 4480 [D(12)]
50 43e6 2.4e11 8266 7553 [D(12)]
55 11e7 7.8e11 20158 17769 [D(30)]
60 26e7 3.2e12 47173 42017 [D(30)]
65 85e7 1.6e13 77666 69408 [D(30)]
Table 1: optimal B1 and expected number of curves to find a
factor of D digits with GMPECM.
After performing the expected number of curves from Table 1, the
probability that a factor of D digits was missed is exp(1), i.e.,
about 37%. After twice the expected number of curves, it is exp(2),
i.e., about 14%, and so on.
Example: after performing 8266 curves with B1=43e6 and B2=2.4e11
(or 7553 curves with dickson 12), the probability to miss a 50digit
factor is about 37%.
From version 6.0 on, GMPECM prints the expected number of curves and expected
time to find factors of different sizes in verbose mode (option v). This
makes it easy to further optimize parameters for a certain factor size if
desired: simply try to minimize the expected time.
(lengthy NOTE: The order of an elliptic curve with Montgomery parameterization
is known to be divisible by 4. Depending on the parametrization that is used
(see Section 6) this number is increased in different ways. Let us call T the
known torsion (depending on the parametrization). One can assume that the
probability that the order is B1,B2 smooth should be about as great as for a
random integer 1/Tth in value. However, Montgomery observed that the order
behaves even nicer than that: heuristically, it seems that the order is as more
likely to be smooth than a random integer about 1/Tth. In "Finding ECMFriendly
Curves through a Study of Galois Properties", Barbulescu, Bos, Bouvier,
Kleinjung and Montgomery showed how to computed the average valuation that
should be used instead of the average valuation of random number. This is the
values we use in GMPECM and the computed probabilities match those observed in
experiments very well. This however means that the so computed values for the
expected number of curves for given B1,B2 values and factor sizes may not match
those published in the previous literature.
The factor GMPECM uses is defined as ECM_EXTRA_SMOOTHNESS in rho.c,
EXTRA_SMOOTHNESS_SQUARE and EXTRA_SMOOTHNESS 32BITS_D in ecm.c depending on the
parametrization. You can change it to
ECM_EXTRA_SMOOTHNESS 3.0
EXTRA_SMOOTHNESS_SQUARE 0.333333333
EXTRA_SMOOTHNESS 32BITS_D 0.333333333
if you want to reproduce the more pessimistic values found in the literature.)
In summary, we advise the following method:
0  choose a target factor size of D digits
1  choose optimal B1 and B2 values to find factors of D digits (cf Table 1)
2  run once P1 with 10*B1, and the default B2 chosen by GMPECM
3  (optional) run 3 times P+1 with 5*B1, and the default B2
4  run N(B1,B2,D) times ECM with those B1 and B2, where N(B1,B2,D) is the
expected number of ECM curves with step 1 bound B1, step 2 bound B2,
to find a factor of D digits (cf above table).
5  if no factor is found, either increase D by 5 digits and go to 0, or use
another factorization method (MPQS, NFS)
Note: if a factor is found in steps 2, 3 or 4, simply continue the current
step with the remaining cofactor (if composite). There is no need
to start again from 0, since the cofactor was already tested, too.
##############################################################################
3. Extra factors and BrentSuyama's extension.
GMPECM may sometimes find some "extra" factors, such that one factor of P1,
P+1 or P+t+1 exceeds the step 2 bound B2, thanks to BrentSuyama's extension.
Let's explain how it works for P1, since it's simpler. The classical step 2
(without BrentSuyama's extension) considers s^(j*d) mod N and s^i mod N,
where N is the number to factor, and s is the residue computed in stage 1.
Here, d is fixed, and the integers i and j vary in two sets so that j*di
covers all primes in [B1, B2]. Now consider a polynomial f(x), and compute
s^f(j*d) and s^f(i) instead of s^(j*d) and s^i [thus the classical step 2
corresponds to f(x)=x^1]. Then P will be found whenever all but one of the
factors of P1 are <= B1, and one factor divides some f(j*d)  f(i):
$ echo 1207946164033269799036708918081  ./ecm pm1 k 3 power 12 286493 25e6
GMPECM ... [powered by GMP ...] [P1]
Input number is 1207946164033269799036708918081 (31 digits)
Using B1=286493, B2=30806172, polynomial x^12, x0=1548711558
Step 1 took 320ms
Step 2 took 564ms
********** Factor found in step 2: 1207946164033269799036708918081
Found input number N
Here the largest factor of P1 is 83957197, which is 3.35 times larger than B2.
Warning: these "extra" factorizations may not be reproducible from one
version of GMPECM to another one, since they depend on some internal
parameters that may change.
For P1 with the generic stage 2, the degree of the BrentSuyama polynomial
should be even. Since i^2k  (j*d)^2k = (i^k  (j*d)^k)(i^k + (j*d)^k),
this allows testing two values symmetric around a multiple of d
simultaneously, halving the amount of computation required in stage 2.
P+1 with the generic stage 2 and ECM do this inherently.
The new fast stage 2 for P1 and P+1 does not support the BrentSuyama
extension. By default, the fast stage 2 is used for P1 and P+1;
giving a power or dickson parameter on the command line forces use of
the previous, generic stage 2. It is recommended to use the new stage 2
(from version 6.2) for P1 and P+1, which is the default: it is so much
faster that it largely compensates the few extra factors that are not found
because BrentSuyama's extension is not available.
The default polynomial used for ECM with a given B2 should be near optimal,
i.e., give only a marginal overhead in step 2, while enabling extra factors.
##############################################################################
4. Memory usage.
Step 1 does not require much memory: O(n) for an input number of n digits.
Step 2 may be quite memory expensive, especially for large B2, since its
efficient algorithms use some large tables. To reduce the memory usage of
step 2, you may increase the 'k' parameter, which controls the number of
"blocks" performed in step 2. Multiplying the default value of k by 4 will
decrease the memory usage by a factor of 2. For example with B2=1e10 and
a 155digit number, step 2 requires about 55MB with the default k=4, but
only 27MB with k=16. Increasing k does, however, slightly increase the time
required for step 2 (see section "How to get the best of GMPECM?").
An estimation of the memory usage is given at the start of stage 2:
$ echo 95209938255048826235189575712705128366296557149606415206280987204268594538412191641776798249266895999715600261737863698825644292938050707507901970225804581 > c155
$ ecm v k 4 10 1e10 < c155
...
Estimated memory usage: 55M
...
Step 2 took 18649ms
$ ecm v k 16 10 1e10 < c155
...
Estimated memory usage: 27M
...
Step 2 took 26972ms
Another way is to use the treefile parameter, which causes some of the
tables to be stored on disk instead of in memory. Using the option
"treefile /var/tmp/ecmtree" will create the files "/var/tmp/ecmtree.1",
"/var/tmp/ecmtree.2" etc. The files are deleted upon completion of stage 2:
$ ecm v treefile /tmp/ecmtree k 4 10 1e10 < c155
...
Estimated memory usage: 36M
...
Step 2 took 18648ms
Due to time consuming disk I/O, this will cause stage 2 to take somewhat
longer. How much memory is saved depends on stage 2 parameters, but a
typical value is that memory use is reduced by a factor of about 1.5.
Increasing the number of blocks with k also reduces the amount of data that
needs to get written to disk, thus reducing disk I/O time. Combining these
parameters is a very effective way of reducing memory use.
Up from version 6.1, there is still another (better) possibility, with the
maxmem option. The commandline maxmem nnn option tells GMPECM to use at
most nnn MB in stage 2. It is better than k because it takes into account
the size of the number to be factored, and automatically adjusts the number
of blocks to use:
$ ./ecm v maxmem 40 10 1e10 < c155
...
dF=8192, k=15, d=79170, d2=11, i0=10
...
Estimated memory usage: 27M
...
Step 2 took 25456ms
##############################################################################
5. Expression syntax reference for GMPECM's syntax parser.
GMPECM can handle several kinds of expressions as input numbers. Here is the
syntax that is handled:
1. Raw decimal numbers like 123456789
2. Comments can be placed in the file. The C++ "one line comment" //
is used. Everything after the // on a line (including the //) is ignored.
Warning: no input number should appear on such a comment line.
3. Line continuation. If a line ends with a backslash character '\',
it is considered it continues on the next line (ignoring the '\').
4. Any white space (space, tab, end of line) is ignored. However, the "end of
line" is used to end the expression (unless of course there is a '\'
character before the end of line). For example, processing this:
1 2 3 4 5 6 7 8 9
would be the same as processing
123456789
5. "common" arithmetic expressions (* / +  %), the period '.' might be used
in place of * for multiply, and  can be unary minus (e.g., 55555551).
Example: echo "3*5+2"  ./ecm 100
6. Grouping ( [ { for start of group (which symbol is used does not matter)
and ) ] } to end a group (again all 3 symbols mean the SAME thing).
7. Exponentiation with the ^ character (i.e., 2^24 is the same as 16777216).
Example: echo "2^24+1"  ./ecm 100
8. Simple factorial using the exclamation point ! character. Example is
53! == 1*2*3*4...*52*53. Example: echo '53!+1'  ./ecm 1e2
9. Multifactorial as in: n!m with an example: 15!3 == 15.12.9.6.3.
10. Simple Primorial using the # character with example of 11# == 2*3*5*7*11
11. Reduced Primorial n#m with example of 17#5 == 5.7.11.13.17
12. Functions are possible with the expression parser. Currently, the only
available function is Phi(x,n), however other functions should be easy
to add in the future.
Note: Expressions are maintained as much as possible (even if the expression
becomes longer than the decimal expansion). Expressions are output as cofactors
(if the input was an expression), and are stored into save/resume files
(again if and only if the original input was an expression, and not an expanded
decimal number). When a factor is found, the cofactor expression is of the
form (original_expression)/factor_found (see however option cofdec):
$ echo "3*2^210+1"  ./ecm sigma 4007218240 2500
GMPECM ... [powered by GMP ...] [ECM]
Input number is 3*2^210+1 (64 digits)
Using B1=2500, B2=186156, polynomial x^1, sigma=4007218240
Step 1 took 16ms
Step 2 took 16ms
********** Factor found in step 2: 1358437
Found probable prime factor of 7 digits: 1358437
Probable prime cofactor (3*2^210+1)/1358437 has 58 digits
##############################################################################
6. Options param, sigma, x0, y0, A, torsion.
This section explains the behavior of the options 'param', 'sigma', 'x0', 'y0',
'A' and 'torsion' when GMPECM is used to run ECM.
The parameters A, x0 and y0 can have integer or rational values, for example
A 22/7 x0 1/3 y0 2/7.
a) with param <= 4 (default value is 0): curves in Montgomery form

In stage 1, the elliptic curves used are of the following Montgomery form:
b*y^2 = x^3 + A*x^2 + x. The starting point in ( x0 : : 1).
In this case we do not need the value of y0.
Either sigma is given, in which case the values of A and x0 are deduced from
sigma and the params option (see below).
Or A and x0 are given (in which case both must be given).
The param option is used to choose the parametrization from which the curves
are taken. It can take 4 values (s is the 'sigma' parameter):
0: use Suyama parametrization. It was the parametrization used previous
versions of GMPECM.
u = s^25, v = 4*s,
A = (vu)^3*(3*u+v)/(4*u^3*v)2,
x0=u^3/v^3.
1: (only for 64bit processors)
A = 4*s^2/2^642,
x0 = 2.
2: use an elliptic parametrization to compute d(s) such that the curve has
a 6torsion point.
A = 4*d(s)2,
x0 = 2.
3: it is mostly used for the gpu computation.
A = 4*s/2^322,
x0 = 2.
Generally, the parameter s is chosen randomly but the sigma option can be
used to choose its value.
In the case where only "sigma s" is used, "param 0" is assumed in order to be
compatible with older version of GMPECM. The above is also true when resuming
from a file (see Section 7). In the case where only "param p" is used, the
parameter is chosen at random by the ecm library. In the case where neither
"sigma" nor "param" is used, the choice of the parametrization is left to ecm
library and the parameter is chosen at random. In the case where the two
options are used, "param p sigma s" can be shorten in "sigma p:s".
The curve can also be chosen with the A option which directly specifies the
values of the coefficient a of the curve. In this case the x0 option is
mandatory.
The x0 option can be used to override the x0 value in the case where param=0
and is mandatory in the case the curve is given with the A option. In the case
where param=1, param=2 or param=3, x0 must be equal to 2, so this should not be
used.
b) with param = 5: curves in Weierstrass form

In this case we use curves in Weierstrass form:
E: y^2 = x^3 + A*x + b
The user must provide all of A, x0 and y0. GMPECM will consider that
(x0, y0) is a point on E, where b = y0^2x0^3A*x0 mod N. Step 2 is performed
as usual, since it already uses the Weierstrass form.
The options of Section 7 apply.
Applications include using curves with some exotic prescribed torsion
subgroup (see d below), or CM (Complex Multiplication) curves.
c) with param = 6: curves in Hessian form

This is to use curves in Hessian form: X^3+Y^3+1=3*A*X*Y, A^3 <> 1 mod N.
These curves have torsion group Z3xZ3 over Q(sqrt(3)) and are useful when the
prime factor p we are looking for is known to satisfy p = 1 mod 3. It should be
used with
A <D> x0 <x0> y0 <y0>
where (x0, y0) is a base point on the curve. Since p = 1 mod 3 is quite
frequent, a try at this version is not unthinkable in general.
d) torsion:

This option enables the user to ask for a given curve with specific torsion
group (over the rationals). For instance
torsion Z5 sigma 2
will generate a curve E over the rationals with torsion group Z/5Z and a point
of infinite order, both generated from parameter sigma = 2.
Note that in the present case, param is overriden by whatever "good" form
the curves are built with that prescribed torsion.
Available torsion groups (over the rationals) are:
* Z5: for sigma != 0, 1/2, 1/3, 1/4, cf. [1]
* Z7: [1]
* Z9: [1]
* Z10: [1]
* Z2xZ8: [1]
* Z3xZ3
* Z4xZ4
References: [1] Atkin/Morain, Math. Comp., 1993.
##############################################################################
7. Options save, resume and chkpnt.
These save and resume options are useful to save the current state of the
computation after step 1, or to exchange data with other software. It allows
to perform step 1 with GMPECM, and step 2 with another software (or
viceversa).
Note: the residue from the end of stage 1 gets written to the file only
after stage 2, if stage 2 is performed in the same program run. This way,
if a factor is found, the save file entry will contain the new cofactor
(if composite) or will be omitted (if cofactor is a probable prime). For
periodic saving during stage 1 for crash recovery, use chkpnt,
described below.
Here is an example how to reuse some P1 computation:
$ cat c71
13155161912808540373988986448257115022677318870175067553764004308210487
$ ./ecm save toto pm1 mpzmod x0 2 5000000 < c71
GMPECM ... [powered by GMP ...] [P1]
Input number is 13155161912808540373988986448257115022677318870175067553764004308210487 (71 digits)
Using B1=5000000, B2=352526802, polynomial x^24, x0=2
Step 1 took 3116ms
Step 2 took 2316ms
The file "toto" now contains some information about the method used, the step
1 bound, the number to factor, the value X at the end of step 1 (in hexa
decimal), and a checksum to verify that no data was corrupted:
$ cat toto
METHOD=P1; B1=5000000; N=13155161912808540373988986448257115022677318870175067553764004308210487; X=0x12530157ae22ae14d54d6a5bc404ae9458e54032c1bb2ab269837d1519f; CHECKSUM=2287710189; PROGRAM=GMPECM 6.2; X0=0x2; WHO=zimmerma@clafoutis.loria.fr; TIME=Sat Apr 12 13:41:01 2008;
Then one can resume the computation with larger B1 and/or B2 as follows:
$ ./ecm resume toto 1e7
GMPECM ... [powered by GMP ...] [ECM]
Resuming P1 residue saved by zimmerma@clafoutis.loria.fr with GMPECM 6.2 on Sat Apr 12 13:41:01 2008
Input number is 13155161912808540373988986448257115022677318870175067553764004308210487 (71 digits)
Using B1=500000010000000, B2=99468481326917772,
Step 1 took 3076ms
Step 2 took 4304ms
********** Factor found in step 2: 1448595612076564044790098185437
Found probable prime factor of 31 digits: 1448595612076564044790098185437
Probable prime cofactor 9081321110693270343633073697474256143651 has 40 digits
The second run only considered the primes in [5e610e6] in step 1,
which saved half the time of step 1.
The format used is the following:
 each line corresponds to a composite (expression ARE saved in the save file)
 a line contains assignments <id>=<value> separated by semicolons ';'
 possible values for <id> are
 METHOD (value = ECM or P1 or P+1)
 PARAM (value = ECM parametrization) [ECM only] (see Section 6)
 SIGMA (value = ECM sigma parameter) [ECM only] (see Section 6)
 B1 (first step bound)
 N (composite number to factor)
 X (value at the end of step 1)
 A (Aparameter of the elliptic curve) [ECM only]
 CHECKSUM (internal value to check correctness of the format)
 PROGRAM (program used to perform step 1, useful for factor credits)
 X0 (initial point for ECM, or initial residue for P1/P+1) [optional]
 WHO (who performed step 1)
 TIME (date and time of first step)
PARAM, SIGMA and X0 would be optional, and would be mainly be used in case of
a factor is found, to be able to reproduce the factorization.
For ECM, one of the SIGMA or A values must be present, so that the
computation can be continued on the correct curve.
The B1 and X values satisfy the condition that X is a lcm(1,2,...,B1)th
power in the (multiplicatively written) group.
If consecutive lines in a save file being resumed contain the same number to
be factored, say when many ECM curves on one number have been saved, factors
discovered by GMPECM are carried over from one attempt to the next so that
factors will be reported only once. If the cofactor is a probable prime, or
if the one option was given and a factor was found, the remaining
consecutive lines for that number will be skipped.
Note: it is allowed to have both save f1 and resume f2 for the same run,
however the files f1 and f2 should be different.
Remark: you should not perform in parallel several resume runs on the same
input with the same B1/B2 values, since those runs will do the same
computations. Options save/resume are useful in the following cases:
(a) somebody did a previous step 1 computation with another software
which is faster than GMPECM, and wants to perform Step 2 with
GMPECM which is faster for that task.
(b) somebody did a previous step 1 for P1 or P+1 up to a given bound
B1, and you want to extend that computation with B1' > B1, without
restarting from scratch. Note: this does not apply to ECM, where
the smoothness property depends on the (random) curve chosen, not
only on the input number.
(c) you did a huge step 1 P1 or P+1 computation on a given machine, and you
want to perform a huge step 2 in parallel on several
machines. For example machine 1 tests the range B2_1B2_2, machine
2 tests B2_2B2_3, ... This also decreases the memory usage for
each machine, which is function of the range width B2maxB2min.
For the same reason as (b), this does not apply to ECM.
The chkpnt option causes GMPECM to write the current residue periodically
during the stage 1 computation. This is useful as a safeguard in case the
GMPECM process is terminated, or the computer loses power, etc. The
checkpoint is written every ten minutes, when a signal (SIGINT, SIGTERM) is
received, and at the end of stage 1. The format of the checkpoint file is
very similar to that of regular save files, and checkpoints can be resumed
with the resume option.
For example:
$ ecm chkpnt pm1chkpoint pm1 1e10 1 < largenumber.txt
[Computer crashes during computation]
$ ecm resume pm1chkpoint 1e10 1
Note: if an existing file is specified as the checkpoint file, it will be
silently overwritten!
Note 2: When resuming a checkpoint file, additional small primes may be
processed in stage 1 when the checkpoint file is resumed, so the
endofstage 1 residues of an uninterrupted run and a checkpointed run
may not match. The extra primes do not reduce the probability of finding
factors, however.
##############################################################################
8. How to get the best of GMPECM?
After configuring GMPECM, and "make", do:
$ make ecmparams
This will optimize parameters for your machine and put them in ecmparams.h.
The ecm program automatically selects what it thinks is the best
arithmetic for the given input number. If that choice is not optimal, you may
force the use of a certain arithmetic by trying options modmulm, mpzmod,
redc. (The best choice should depend on B1 and B2 only very little, so long
as B1 is not too small, say >= 1000.)
Number of step 2 blocks. The step 2 range [B1,B2] is divided into k
"big blocks". The default value of k is chosen to be near to optimal.
However, it may be that for a given (B1,B2) pair, another value of k
may be better. Try for this to give the option k <value> to ecm,
where <value> is 1, 2, 3, ... This will force ecm to divide step 2
in at least <value> blocks.
Changing the value of the number of blocks will not modify the chance
of finding a factor (except for extra factors, but some will be lost,
and some will be won, so the balance should be nearly even). However it
will change the time spent in Step 2 and modify the memory used by Step 2
(see the section "Memory usage").
Optimal thresholds. The thresholds for the algorithms used in ecm are defined
in ecmparams.h, which tries to select a good "params.h" file.
Several params.h files are included in the distribution
and the configure script will select one matching your machine if it exists
(see the output of ecm printconfig).
If there is no params.h for your machine then see the section "How to get the
best of GMPECM?" to generate one optimized for your machine.
Stage 2 now uses NumberTheoretic Transforms (NTT) for polynomial arithmetic
by default for numbers of at most 30 machine words (NTT_SIZE_THRESHOLD in
ecmecm.h). The NTT code forces dF to be a power of 2; it can be disabled by
passing the commandline option nontt and unconditionally enabled by ntt.
Performance of NTT is dependent on:
 Architecture. NTT seems to give the greatest improvement on Athlons,
and the least improvement on Pentiums without SSE2.
 Thresholds. It is vital to have ecmparams.h properly tuned for your machine.
 C compiler. The SSE2 assembly code for 32 bit and the assembly code for
64 bit only work for x86 using gcc or Intel cc, so it is compiler dependent.
Note on factoring Fermat numbers:
GMPECM features Sch�nhageStrassen multiplication for polynomials in
stage 2 when factoring Fermat numbers (not in the new, fast stage 2 for
P+1 and P1. This is to be implemented.) This greatly reduces the number of
modular multiplications required, thus improving speed. It does, however,
restrict the length of the polynomials to powers of two, so that for a given
number of blocks (k parameter), the B2 value can only increase by factors of
approximately 4.
For the number of blocks, choices of 2, 3 or 4 usually give best performance.
However, if the polynomial degree becomes too large, relatively expensive
Karatsuba or ToomCoom methods are required to split the polynomial before
Sch�nhageStrassen's method can handle them. That can make a larger number
of blocks worthwhile. When factoring the mth Fermat number F_m = 2^(2^m)+1,
degrees up to dF=2^(m+1) can be handled directly. If your B2 choice
requires a degree much larger than this (dF is printed with the v
parameter), try increasing the number of blocks with k and see if
performance improves.
The BrentSuyama extension should not be used when factoring Fermat numbers,
it is more efficient to simply increase B2. Therefore, power 1 for P+1 and
ECM, and power 2 for P1 are the default for Fermat numbers. (Larger
degrees for BrentSuyama may possibly become worthwhile for P1 runs on
smaller Fermat numbers and extremely large B2, when Karatsuba and ToomCook
are used extensively.)
Factoring Fermat numbers uses a lot of memory, depending on the size of the
Fermat number and on dF. For dF=65536 and F_12, the memory used is about
1700MB. If your system does not have enough memory, you will have to use a
larger number of blocks to reach the desired B2 value with a smaller poly
degree dF, which sacrifices some performance. Additionally, you may use the
treefile option (see 4. Memory usage)
k=1 k=2 k=3 k=4
dF=256 582132 1222002 1864182 2504052
dF=512 2443992 5008092 7572192 10131672
dF=1024 10016172 20263332 30519732 40766892
dF=2048 42634420 85689250 128744080 171798910
dF=4096 173259252 347500242 521780502 696021492
dF=8192 711738310 1425139180 2138540050 2851940920
dF=16384 2850278350 5703881830 8557643650 11411247130
dF=32768 11702792020 23412731170 35122670320 46832609470
dF=65536 48071333326 96165459406 144259585486 192353711566
dF=131072 194020810630 388069884940 582118959250 776168033560
Table 2: Stage 2 interval length B2B2min, for dF
a power of 2 and small values of k.
For example, if you'd like to run stage 2 on F_12 with B2 ~= 40G, try
parameters "k 1 <B1> 48e9", "k 3 <B1> 35e9" or "k 4 <B1> 46e9".
##############################################################################
9. GMPECM and GPU
The stage 1 of ECM can be computed on a Nvidia GPU.
More details are given in README.gpu.
##############################################################################
10. Record factors.
If you find a very large factor, the program will print a message like:
Report your potential champion to <email address>
(see <url for record factors>)
This means that your factor might be a champion, i.e., one of the topten
largest factors ever found by the corresponding method (P1, P+1 or ECM).
Cf the following URLs:
ECM: http://wwwmaths.anu.edu.au/~brent/ftp/champs.txt
P1: http://www.loria.fr/~zimmerma/records/Pminus1.html
P+1: http://www.loria.fr/~zimmerma/records/Pplus1.html
##############################################################################
11. Known problems.
On some machines, GMPECM uses the clock() function to measure the cpu time
used by step 1 and 2. Since that function returns a 32bit integer, there
is a possible wraparound effect when the clock() value goes back from 2^321
to 0, which may produce negative timings.
The NTT code uses primes that fit in one machine word and that are
congruent to 1 (mod l), where l is the largest transform length required
for the desired stage 2 parameters. For very large B2 on 32bit machines,
there may not be enough suitable primes, which may limit the possible
transform length to less than what available memory would permit. This
problem occurs mostly in the fast stage 2 for P1 and P+1, as the generic
stage 2 uses far more memory for a given polynomial degree, so that memory
on a 32bit machine will be exhausted long before suitable NTT primes are.
The maximal transform length depends on the size of the input number.
For a transform length l on a 32 bit machine, N must satisfy:
l=2^11:N<2^756200, l=2^12:N<2^379353, l=2^13:N<2^190044, l=2^14:N<2^94870,
l=2^15:N<2^47414, l=2^16:N<2^23322, l=2^17:N<2^11620, l=2^18:N<2^5891,
l=2^19:N<2^2910, l=2^20:N<2^1340, l=2^21:N<2^578, l=2^22:N<2^228.
Since log(N)*l is approximately constant, this limits the amount of memory
that can be used to about 600MB for P1, and 1200MB for P+1.
