<!-- #################################################################### -->
<!-- ## ## -->
<!-- ## methods.xml FactInt documentation Stefan Kohl ## -->
<!-- ## ## -->
<!-- #################################################################### -->
<Chapter Label="ch:Methods">
<Heading>The Routines for Specific Factorization Methods</Heading>
Descriptions of the factoring methods implemented in this package can
be found in <Cite Key="Bressoud89"/> and <Cite Key="Cohen93"/>.
Cohen's book also contains descriptions of the other methods mentioned
in the preface.
<!-- #################################################################### -->
<Section Label="sec:TrialDivision">
<Heading>Trial division</Heading>
<Index Key="trial division">trial division</Index>
<ManSection>
<Func Name="FactorsTD" Arg="n [, Divisors ]" Label="trial division"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> by trial division.
The optional argument <A>Divisors</A> is the list of trial divisors.
If not given, it defaults to the list of primes <M>p &lt; 1000</M>.
<Example>
<![CDATA[
gap> FactorsTD(12^25+25^12);
[ [ 13, 19, 727 ], [ 5312510324723614735153 ] ]
]]>
</Example>
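<P/>
The basic loop behind trial division can be sketched in a few lines.
The helper <C>TrialDivideSketch</C> below is hypothetical and serves
only as an illustration; it is not part of this package, and the actual
implementation of <C>FactorsTD</C> may differ. The sketch assumes that
<C>n</C> is a positive integer and that all entries of <C>divisors</C>
are integers greater than 1.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
# Hypothetical helper, not part of FactInt: divide n by every entry of
# divisors as often as possible and collect the divisors found.
TrialDivideSketch := function ( n, divisors )
  local  found, rest, d;
  found := [ ];
  for d in divisors do
    while n mod d = 0 do
      Add( found, d );
      n := n / d;
    od;
  od;
  # Mirror the documented return format: factors found, then leftovers.
  if n > 1 then rest := [ n ]; else rest := [ ]; fi;
  return [ found, rest ];
end;

# Primes is GAP's built-in list of the primes less than 1000.
TrialDivideSketch( 12^25 + 25^12, Primes );
]]>
</Listing>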
</Description>
</ManSection>
</Section>
<!-- #################################################################### -->
<Section Label="sec:PollardsPminus1">
<Heading>Pollard's <M>p-1</M></Heading>
<Index Key="Pollard's p-1">Pollard's <M>p-1</M></Index>
<ManSection>
<Func Name="FactorsPminus1" Arg="n [, [ a, ] Limit1 [, Limit2 ] ]"
Label="Pollard's p-1"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> using Pollard's <M>p-1</M>.
It uses <A>a</A> as base for exponentiation, <A>Limit1</A> as first
stage limit and <A>Limit2</A> as second stage limit.
If the function is called with three arguments, these arguments are
interpreted as <A>n</A>, <A>Limit1</A> and <A>Limit2</A>. Defaults are
chosen for all arguments which are omitted. <P/>
Pollard's <M>p-1</M> is based on the fact that exponentiation
(mod <M>n</M>) can be done efficiently enough to compute
<M>a^{k!}</M> mod <M>n</M> for sufficiently large <M>k</M>
in a reasonable amount of time. Assume that <M>p</M> is a prime factor
of <M>n</M> which does not divide <M>a</M>, and that <M>k!</M>
is a multiple of <M>p-1</M>. Then
<Index Key="Lagrange's Theorem">Lagrange's Theorem</Index>
Lagrange's Theorem states that <M>a^{k!} \equiv 1</M>
(mod <M>p</M>). If <M>k!</M> is not a multiple of <M>q-1</M> for
another prime factor <M>q</M> of <M>n</M>, it is likely that the
factor <M>p</M> can be determined by computing
<M>\gcd(a^{k!}-1,n)</M>. A prime factor <M>p</M> is usually found if
the largest prime factor of <M>p-1</M> is not larger than <A>Limit2</A>,
and the second-largest one is not larger than <A>Limit1</A>.
(Compare with <Ref Func="FactorsPplus1" Label="Williams' p+1"/>
and <Ref Func="FactorsECM" Label="Elliptic Curves Method, ECM"/>.)
<Example>
<![CDATA[
gap> FactorsPminus1( Factorial(158) + 1, 100000, 1000000 );
[ [ 2879, 5227, 1452486383317, 9561906969931, 18331561438319,
4837142997094837608115811103417329505064932181226548534006749213\
4508231090637045229565481657130504121732305287984292482612133314325471\
3674832962773107806789945715570386038565256719614524924705165110048148\
7161609649806290811760570095669 ], [ ] ]
gap> List( last[ 1 ]{[ 3, 4, 5 ]}, p -> Factors( p - 1 ) );
[ [ 2, 2, 3, 3, 81937, 492413 ], [ 2, 3, 3, 3, 5, 7, 7, 1481, 488011 ]
, [ 2, 3001, 7643, 399613 ] ]
]]>
</Example>
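<P/>
The first stage described above can be sketched as follows. The helper
<C>PminusOneStageOneSketch</C> is hypothetical and not part of this
package; it uses <M>k!</M> as exponent exactly as in the description,
has no second stage, and is meant as a minimal illustration rather than
as the method used by <C>FactorsPminus1</C>.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
# Hypothetical helper, not part of FactInt: compute a^(k!) (mod n) by
# successive powering and return Gcd( a^(k!) - 1, n ).
# The result may be 1, n itself, or a proper divisor of n.
PminusOneStageOneSketch := function ( n, a, k )
  local  x, i;
  x := a mod n;
  for i in [ 2 .. k ] do
    x := PowerModInt( x, i, n );    # now x = a^(i!) (mod n)
  od;
  return GcdInt( x - 1, n );
end;

# 1403 = 23 * 61.  Here 61 - 1 = 60 divides 5! = 120, while 2 has
# order 11 modulo 23 and 11 does not divide 120; the result is 61.
PminusOneStageOneSketch( 1403, 2, 5 );
]]>
</Listing>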
</Description>
</ManSection>
</Section>
<!-- #################################################################### -->
<Section Label="sec:WilliamsPplus1">
<Heading>Williams' <M>p+1</M></Heading>
<Index Key="Williams' p+1">Williams' <M>p+1</M></Index>
<ManSection>
<Func Name="FactorsPplus1" Arg="n [, [ Residues, ] Limit1 [, Limit2 ] ]"
Label="Williams' p+1"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> using Williams' <M>p+1</M>.
It tries <A>Residues</A> different residues, and uses <A>Limit1</A>
as first stage limit and <A>Limit2</A> as second stage limit.
If the function is called with three arguments, these arguments are
interpreted as <A>n</A>, <A>Limit1</A> and <A>Limit2</A>. Defaults are
chosen for all arguments which are omitted. <P/>
Williams' <M>p+1</M> is very similar to Pollard's <M>p-1</M>
(see <Ref Func="FactorsPminus1" Label="Pollard's p-1"/>).
The difference is that the underlying group here can either have order
<M>p+1</M> or <M>p-1</M>, and that the group operation takes more time.
A prime factor <M>p</M> is usually found if the largest prime factor
of the group order is at most <A>Limit2</A> and the second-largest one
is not larger than <A>Limit1</A>.
(Compare also with <Ref Func="FactorsECM"
Label="Elliptic Curves Method, ECM"/>.)
<Example>
<![CDATA[
gap> FactorsPplus1( Factorial(55) - 1, 10, 10000, 100000 );
[ [ 73, 39619, 277914269, 148257413069 ],
[ 106543529120049954955085076634537262459718863957 ] ]
gap> List( last[ 1 ], p -> [ Factors( p - 1 ), Factors( p + 1 ) ] );
[ [ [ 2, 2, 2, 3, 3 ], [ 2, 37 ] ],
[ [ 2, 3, 3, 31, 71 ], [ 2, 2, 5, 7, 283 ] ],
[ [ 2, 2, 2207, 31481 ], [ 2, 3, 5, 9263809 ] ],
[ [ 2, 2, 47, 788603261 ], [ 2, 3, 5, 13, 37, 67, 89, 1723 ] ] ]
]]>
</Example>
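<P/>
One common way to formulate Williams' <M>p+1</M> uses the Lucas sequence
<M>V_0 = 2</M>, <M>V_1 = A</M>, <M>V_{m+1} = A V_m - V_{m-1}</M>,
which satisfies <M>V_{ml}(A) = V_m(V_l(A))</M>. The following sketch
rests on that formulation; the helper names are hypothetical, the
functions are not part of this package, and nothing here is meant to
describe how <C>FactorsPplus1</C> proceeds internally. Only a first
stage with exponent <M>k!</M> is shown.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
# Hypothetical helper: V_m(A) (mod n) for m >= 1 by a binary ladder
# which keeps the pair ( V_j, V_(j+1) ).
LucasVSketch := function ( A, m, n )
  local  bits, x, y, b;
  bits := [ ];
  while m > 0 do                       # binary digits, lowest first
    Add( bits, m mod 2 );
    m := QuoInt( m, 2 );
  od;
  bits := Reversed( bits );            # highest digit first
  x := A mod n;  y := ( A^2 - 2 ) mod n;          # ( V_1, V_2 )
  for b in bits{[ 2 .. Length( bits ) ]} do
    if b = 1 then                      # j -> 2j + 1
      x := ( x * y - A ) mod n;  y := ( y^2 - 2 ) mod n;
    else                               # j -> 2j
      y := ( x * y - A ) mod n;  x := ( x^2 - 2 ) mod n;
    fi;
  od;
  return x;
end;

# Hypothetical helper: first stage, returns Gcd( V_(k!)(A) - 2, n ).
PplusOneStageOneSketch := function ( n, A, k )
  local  v, i;
  v := A;
  for i in [ 2 .. k ] do
    v := LucasVSketch( v, i, n );      # now v = V_(i!)(A) (mod n)
  od;
  return GcdInt( v - 2, n );
end;

# 187 = 11 * 17.  For A = 5 the group modulo 11 has order 11 + 1 = 12,
# which divides 4! = 24; the result is 11.
PplusOneStageOneSketch( 187, 5, 4 );
]]>
</Listing>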
</Description>
</ManSection>
</Section>
<!-- #################################################################### -->
<Section Label="sec:ECM">
<Heading>The Elliptic Curves Method (ECM)</Heading>
<Index Key="Elliptic Curves Method (ECM)">
Elliptic Curves Method (ECM)
</Index>
<ManSection>
<Func Name="FactorsECM"
Arg="n [, Curves [, Limit1 [, Limit2 [, Delta ] ] ] ]"
Label="Elliptic Curves Method, ECM"/>
<Func Name="ECM"
Arg="n [, Curves [, Limit1 [, Limit2 [, Delta ] ] ] ]"
Label="shorthand for FactorsECM"/>
<Returns>
a list of two lists: The first list contains the prime factors found,
and the second list contains remaining unfactored parts of <A>n</A>,
if there are any.
</Returns>
<Description>
This function tries to factor <A>n</A> using the Elliptic Curves
Method (ECM).
The argument <A>Curves</A> is the number of curves to be tried.
The argument <A>Limit1</A> is the initial
<Index Key="first stage limit">first stage limit</Index>
first stage limit, and <A>Limit2</A> is the initial
<Index Key="second stage limit">second stage limit</Index>
second stage limit.
The argument <A>Delta</A> is the increment per curve for the first stage
limit. The second stage limit is adjusted appropriately. Defaults are
chosen for all arguments which are omitted. <P/>
<C>FactorsECM</C> recognizes the option <A>ECMDeterministic</A>.
If set, the choice of the curves is deterministic.
This means that in repeated runs of <C>FactorsECM</C> the same
curves are used, and hence for the same <M>n</M> the same
factors are found after the same number of trials. <P/>
The Elliptic Curves Method is based on the fact that exponentiation
in the
<Index Key="elliptic curve groups">elliptic curve groups</Index>
elliptic curve groups <M>E(a,b)/n</M> can be performed fast enough
to compute for example <M>g^{k!}</M> for <M>k</M> large enough
(e.g. 100000 or so) in a reasonable amount of time and without
using much memory, and on Lagrange's Theorem.
Assume that <M>p</M> is a prime divisor of <M>n</M>.
Then Lagrange's Theorem states that if <M>k!</M> is a multiple of
<M>|E(a,b)/p|</M>, then for any
<Index Key="elliptic curve point">elliptic curve point</Index>
elliptic curve point <M>g</M>, the power <M>g^{k!}</M> is the
identity element of <M>E(a,b)/p</M>.
In this situation -- under reasonable circumstances --
the factor <M>p</M> can be determined by taking an appropriate gcd.
<P/>
In practice, the algorithm does not use <M>k!</M> as exponents, but
products <M>P_k</M> of small primes which are in some sense <Q>better</Q>.
After reaching the first stage limit with <M>P_{Limit1}</M>, it
considers further products <M>P_{Limit1}q</M> for primes <M>q</M>
up to the second stage limit <A>Limit2</A>, which is usually set equal
to something like 100 times the first stage limit.
The prime <M>q</M> corresponds to the largest prime factor of the
order of the group <M>E(a,b)/p</M>. <P/>
A prime divisor <M>p</M> is usually found if the largest prime factor
of the order of one of the examined elliptic curve groups <M>E(a,b)/p</M>
is at most <A>Limit2</A> and the second-largest one is at most
<A>Limit1</A>. Thus the chance of factoring <A>n</A> increases both
with the number of curves tried and with larger values of
<A>Limit1</A> and/or <A>Limit2</A>. Neither trying a large number of
curves with very small <A>Limit1</A> and <A>Limit2</A> nor trying only
a few curves with very large limits turns out to be optimal.
(Compare with <Ref Func="FactorsPminus1"
Label="Pollard's p-1"/>.)
<P/>
The elements of the group <M>E(a,b)/n</M> are the points <M>(x,y)</M>
given by the solutions of <M>y^2 = x^3 + ax + b</M> in the residue
class ring (mod <M>n</M>), and an additional point <M>\infty</M>
at infinity, which serves as identity element.
To turn this set into a group, define the product (although elliptic
curve groups are usually written additively, I prefer using the
multiplicative notation here to retain the analogy
to <Ref Func="FactorsPminus1" Label="Pollard's p-1"/> and
<Ref Func="FactorsPplus1" Label="Williams' p+1"/>)
of two points <M>p_1</M> and <M>p_2</M> as follows:
If <M>p_1 \neq p_2</M>, let <M>l</M> be the line through <M>p_1</M>
and <M>p_2</M>, otherwise let <M>l</M> be the tangent to the
curve <M>C</M> given by the above equation in the point
<M>p_1 = p_2</M>. The line <M>l</M> intersects <M>C</M> in a third
point, say <M>p_3</M>. If <M>l</M> does not intersect the curve in
a third affine point, then set <M>p_3 := \infty</M>.
Define <M>p_1 \cdot p_2</M> by the image of <M>p_3</M> under
the reflection across the <M>x</M>-axis.
Define the product of any curve point <M>p</M> and <M>\infty</M> by
<M>p</M> itself. This -- more or less obviously, checking associativity
requires some calculation -- turns the set of points on the given curve
into an abelian group <M>E(a,b)/n</M>. <P/>
However, the calculations are done in
<Index Key="projective coordinates">projective coordinates</Index>
projective coordinates to have an explicit representation of the
identity element and to avoid calculating inverses (mod <M>n</M>)
for the group operation. Otherwise this would require using an
<M>O((log \ n)^3)</M>-algorithm, while multiplication (mod <M>n</M>)
is only <M>O((log \ n)^2)</M>. The corresponding equation is given by
<M>bY^2Z = X^3 + aX^2Z + XZ^2</M>. This form allows even more efficient
computations than the
<Index Key="Weierstrass model">Weierstrass model</Index>
Weierstrass model <M>Y^2Z = X^3 + aXZ^2 + bZ^3</M>, which is the
projective equivalent of the affine representation
<M>y^2 = x^3 + ax + b</M> mentioned above. The algorithm only keeps
track of two of the three coordinates, namely <M>X</M> and <M>Z</M>.
The curves are chosen in a way that ensures the order of the
corresponding group to be divisible by 12. This increases the chance
that it is smooth enough to find a factor of <M>n</M>.
The implementation follows the description of R. P. Brent given in
<Cite Key="Brent96"/>, pp. 5 -- 8. In terms of this paper,
for the second stage the <Q>improved standard continuation</Q> is used.
<Example>
<![CDATA[
gap> FactorsECM(2^256+1,100,10000,1000000,100);
[ [ 1238926361552897,
93461639715357977769163558199606896584051237541638188580280321 ]
, [ ] ]
]]>
</Example>
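<P/>
The group law described above, together with the way a factor shows up
when an inverse (mod <M>n</M>) does not exist, can be illustrated by the
following sketch. It deliberately uses affine coordinates on the
Weierstrass model and <M>k!</M> as exponent, so it is not the projective
arithmetic or the continuation used by <C>FactorsECM</C>. All helper
names are hypothetical and not part of this package. The coefficient
<M>b</M> is fixed implicitly by the choice of the starting point.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
# Hypothetical helpers, not part of FactInt.  A point is a list [ x, y ]
# of residues (mod n) or the string "infinity".  If a denominator is not
# invertible (mod n), rec( divisor := d ) with d > 1 is returned instead.

ECProductSketch := function ( P, Q, a, n )
  local  num, den, e, s, x, y;
  if P = "infinity" then return Q; fi;
  if Q = "infinity" then return P; fi;
  if P[1] = Q[1] and ( P[2] + Q[2] ) mod n = 0 then
    return "infinity";                        # the line is vertical
  fi;
  if P = Q then                               # slope of the tangent
    num := 3 * P[1]^2 + a;  den := 2 * P[2];
  else                                        # slope of the secant
    num := Q[2] - P[2];     den := Q[1] - P[1];
  fi;
  e := Gcdex( den mod n, n );                 # extended gcd
  if e.gcd > 1 then return rec( divisor := e.gcd ); fi;
  s := ( num * e.coeff1 ) mod n;              # coeff1 inverts den
  x := ( s^2 - P[1] - Q[1] ) mod n;
  y := ( s * ( P[1] - x ) - P[2] ) mod n;     # reflected third point
  return [ x, y ];
end;

ECPowerSketch := function ( P, k, a, n )      # P^k by repeated squaring
  local  R;
  R := "infinity";
  while k > 0 do
    if k mod 2 = 1 then
      R := ECProductSketch( R, P, a, n );
      if IsRecord( R ) then return R; fi;
    fi;
    k := QuoInt( k, 2 );
    if k > 0 then
      P := ECProductSketch( P, P, a, n );
      if IsRecord( P ) then return P; fi;
    fi;
  od;
  return R;
end;

# Raise P to the power k! and report a divisor of n (possibly n itself)
# as soon as an inversion fails; return fail if none shows up.
ECMStageOneSketch := function ( n, P, a, k )
  local  i;
  for i in [ 2 .. k ] do
    P := ECPowerSketch( P, i, a, n );
    if IsRecord( P ) then return P.divisor; fi;
  od;
  return fail;
end;

# The point [ 1, 1 ] lies on y^2 = x^3 + 5x - 5; 455839 = 599 * 761.
ECMStageOneSketch( 455839, [ 1, 1 ], 5, 10 );
]]>
</Listing>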
</Description>
</ManSection>
</Section>
<!-- #################################################################### -->
<Section Label="sec:CFRAC">
<Heading>The Continued Fraction Algorithm (CFRAC)</Heading>
<Index Key="Continued Fraction Algorithm (CFRAC)">
Continued Fraction Algorithm (CFRAC)
</Index>
<ManSection>
<Func Name="FactorsCFRAC" Arg="n"
Label="Continued Fraction Algorithm, CFRAC"/>
<Func Name="CFRAC" Arg="n"
Label="shorthand for FactorsCFRAC"/>
<Returns>
a list of the prime factors of <A>n</A>.
</Returns>
<Description>
This function tries to factor <A>n</A> using the Continued Fraction
Algorithm (CFRAC), also known as Brillhart-Morrison Algorithm.
In case of failure an error is signalled. <P/>
Caution: The run time of this function depends only on the size
of <A>n</A>, and not on the size of the factors.
Thus if a small factor is not found during the preprocessing
which is done before invoking the sieving process, the run time is
as long as if <A>n</A> had two prime factors of roughly
equal size. <P/>
The Continued Fraction Algorithm tries to find integers <M>x</M>
and <M>y</M> such that <M>x^2 \equiv y^2</M> (mod <M>n</M>),
but not <M>\pm x \equiv \pm y</M> (mod <M>n</M>).
In this situation, taking <M>\gcd(x - y,n)</M> yields a nontrivial
divisor of <M>n</M>. For determining such a pair <M>(x,y)</M>,
the algorithm uses the continued fraction expansion of the square root
of <M>n</M>. If <M>x_i/y_i</M> is a
<Index Key="continued fraction approximation">
continued fraction approximation
</Index>
continued fraction approximation of the square root of <M>n</M>,
then <M>c_i := x_i^2 - ny_i^2</M> is bounded by a small constant times
the square root of <M>n</M>.
The algorithm tries to find as many <M>c_i</M> as possible which factor
completely over a chosen
<Index Key="factor base">factor base</Index>
factor base (a list of small primes) or with only one factor not in the
factor base. The latter can be used if and only if a second
<M>c_i</M> with the same <Q>large</Q> factor is found.
<Index Key="factor base" Subkey="large factors">factor base</Index>
Once enough values <M>c_i</M> have been factored, as a final stage
<Index Key="Gaussian Elimination">Gaussian Elimination</Index>
Gaussian Elimination over GF(2) is used to determine which of the
congruences <M>x_i^2 \equiv c_i</M> (mod <M>n</M>) have to be
multiplied together to obtain a congruence of the desired form
<M>x^2 \equiv y^2</M> (mod <M>n</M>).
Let <M>M</M> be the corresponding matrix. Then the entries
of <M>M</M> are given by <M>M_{ij} = 1</M> if an odd power of
the <M>j</M>-th element of the factor base divides the <M>i</M>-th
usable factored value, and <M>M_{ij} = 0</M> otherwise.
To obtain the desired congruence, it is necessary that the rows
of <M>M</M> are linearly dependent.
In other words, this means that the number of factored <M>c_i</M>
needs to be larger than the rank of <M>M</M>, which is
approximately given by the size of the factor base. (Compare
with <Ref Func="FactorsMPQS"
Label="Multiple Polynomial Quadratic Sieve, MPQS"/>.)
<Example>
<![CDATA[
gap> FactorsCFRAC( Factorial(34) - 1 );
[ 10398560889846739639, 28391697867333973241 ]
]]>
</Example>
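<P/>
The congruences <M>x_i^2 \equiv c_i</M> (mod <M>n</M>) used above can
be generated with the standard recurrence for the continued fraction
expansion of the square root of <M>n</M>. The helper
<C>CFRACRelationsSketch</C> below is hypothetical and not part of this
package; it assumes that <C>n</C> is not a perfect square and leaves out
the factor base, the large factors and the linear algebra stage.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
# Hypothetical helper, not part of FactInt: return a list of pairs
# [ x, c ] with x^2 = c (mod n), where x is the numerator of a continued
# fraction approximation of Sqrt(n), reduced (mod n), and the absolute
# value of c is less than 2*Sqrt(n).
CFRACRelationsSketch := function ( n, count )
  local  a0, m, d, a, hprev, h, t, sign, rels, i;
  a0 := RootInt( n, 2 );
  m := 0;  d := 1;  a := a0;       # complete quotient ( Sqrt(n) + m ) / d
  hprev := 1;  h := a0 mod n;      # numerators of the approximations
  sign := -1;
  rels := [ ];
  for i in [ 1 .. count ] do
    m := d * a - m;
    d := ( n - m^2 ) / d;          # always an exact division
    Add( rels, [ h, sign * d ] );  # h^2 - n*k^2 = sign * d
    a := QuoInt( a0 + m, d );      # next partial quotient
    t := ( a * h + hprev ) mod n;
    hprev := h;  h := t;
    sign := -sign;
  od;
  return rels;
end;

# For n = 13 the first pairs are [ 3, -4 ], [ 4, 3 ] and [ 7, -3 ].
CFRACRelationsSketch( 13, 3 );
]]>
</Listing>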
</Description>
</ManSection>
</Section>
<!-- #################################################################### -->
<Section Label="sec:MPQS">
<Heading>The Multiple Polynomial Quadratic Sieve (MPQS)</Heading>
<Index Key="Multiple Polynomial Quadratic Sieve (MPQS)">
Multiple Polynomial Quadratic Sieve (MPQS)
</Index>
<ManSection>
<Func Name="FactorsMPQS" Arg="n"
Label="Multiple Polynomial Quadratic Sieve, MPQS"/>
<Func Name="MPQS" Arg="n"
Label="shorthand for FactorsMPQS"/>
<Returns>
a list of the prime factors of <A>n</A>.
</Returns>
<Description>
This function tries to factor <A>n</A> using the Single Large Prime
Variation of the Multiple Polynomial Quadratic Sieve (MPQS).
In case of failure an error is signalled. <P/>
Caution: The run time of this function depends only on the size
of <A>n</A>, and not on the size of the factors.
Thus if a small factor is not found during the preprocessing
which is done before invoking the sieving process, the run time is
as long as if <A>n</A> had two prime factors of roughly
equal size. <P/>
The intermediate results of a computation can be saved by
interrupting it with <C>[Ctrl][C]</C> and calling <C>Pause();</C>
from the break loop. This causes all data needed for resuming
the computation again to be pushed as a record <A>MPQSTmp</A>
on the options stack.
When called again with the same argument <A>n</A>, <C>FactorsMPQS</C>
takes the record from the options stack and continues with the previously
computed factorization data. For continuing the factorization process
in another session, one needs to write this record to a file.
This can be done by the function <C>SaveMPQSTmp(<A>filename</A>)</C>.
The file written by this function can be read by the standard
<C>Read</C>-function of &GAP;. <P/>
The Multiple Polynomial Quadratic Sieve tries to find integers <M>x</M>
and <M>y</M> such that <M>x^2 \equiv y^2</M> (mod <M>n</M>),
but not <M>\pm x \equiv \pm y</M> (mod <M>n</M>).
In this situation, taking <M>\gcd(x - y,n)</M> yields a nontrivial
divisor of <M>n</M>. For determining such a pair <M>(x,y)</M>,
the algorithm chooses polynomials <M>f_a</M> of the form
<M>f_a(r) = ar^2 + 2br + c</M> with suitably chosen coefficients
<M>a</M>, <M>b</M> and <M>c</M> which satisfy <M>b^2 \equiv n</M>
(mod <M>a</M>) and <M>c = (b^2 - n)/a</M>.
The identity <M>a \cdot f_a(r) = (ar + b)^2 - n</M> yields
a congruence (mod <M>n</M>) with a perfect square on one side
and <M>a \cdot f_a(r)</M> on the other. The algorithm uses a sieving
technique similar to the Sieve of Eratosthenes over an appropriately
chosen
<Index Key="sieving interval">sieving interval</Index>
sieving interval to search for factorizations of values <M>f_a(r)</M>
over a chosen factor base. Any two factorizations with the same single
<Q>large</Q> factor which does not belong to the factor base can also be
used. Taking more polynomials and hence shorter sieving intervals has
the advantage of having to factor smaller values <M>f_a(r)</M> over the
factor base. <P/>
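The conditions on the coefficients can be checked on a small example.
The helpers below are hypothetical and not part of this package; how
<C>FactorsMPQS</C> actually selects <M>a</M>, <M>b</M> and <M>c</M> is
not described here, and the sketch simply takes a given <M>a</M> and
computes a matching <M>b</M> with the &GAP; library function
<C>RootMod</C>.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
# Hypothetical helper, not part of FactInt: for a given a, return
# [ a, b, c ] with b^2 = n (mod a) and c = ( b^2 - n ) / a, or fail
# if n has no square root modulo a.
MPQSPolynomialSketch := function ( n, a )
  local  b, c;
  b := RootMod( n, a );
  if b = fail then return fail; fi;
  c := ( b^2 - n ) / a;            # an integer by the choice of b
  return [ a, b, c ];
end;

# Hypothetical helper: the value f_a(r) = a*r^2 + 2*b*r + c.
MPQSValueSketch := function ( coeffs, r )
  return coeffs[1] * r^2 + 2 * coeffs[2] * r + coeffs[3];
end;

coeffs := MPQSPolynomialSketch( 1649, 5 );
# The identity a * f_a(r) = ( a*r + b )^2 - n holds for every r:
ForAll( [ -10 .. 10 ], r -> coeffs[1] * MPQSValueSketch( coeffs, r )
                            = ( coeffs[1] * r + coeffs[2] )^2 - 1649 );
]]>
</Listing>
<P/>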
Once enough values <M>f_a(r)</M> have been factored, as a final stage
Gaussian Elimination over GF(2) is used to determine which congruences
have to be multiplied together to obtain a congruence of the desired
form <M>x^2 \equiv y^2</M> (mod <M>n</M>). Let <M>M</M> be the
corresponding matrix. Then the entries of <M>M</M> are given by
<M>M_{ij} = 1</M> if an odd power of the <M>j</M>-th element of the
factor base divides the <M>i</M>-th usable factored value, and
<M>M_{ij} = 0</M> otherwise.
To obtain the desired congruence, it is necessary that the rows
of <M>M</M> are linearly dependent.
In other words, this means that the number of usable factorizations of
values <M>f_a(r)</M> needs to be larger than the rank of <M>M</M>.
The latter is approximately equal to the size of the factor base.
(Compare with <Ref Func="FactorsCFRAC"
Label="Continued Fraction Algorithm, CFRAC"/>.)
<Example>
<![CDATA[
gap> FactorsMPQS( Factorial(38) + 1 );
[ 14029308060317546154181, 37280713718589679646221 ]
]]>
</Example>
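<P/>
The final linear algebra stage, which this method shares with CFRAC, can
be illustrated on a toy example. The code below is a sketch with
hand-picked values and variable names chosen for illustration; it uses
only &GAP; library functions and does not reflect how the package
organizes this step. A dependency found in this way does not always give
a nontrivial divisor; in that case another dependency has to be tried.
<Listing Type="GAP (illustrative sketch)">
<![CDATA[
n := 1649;;                       # 1649 = 17 * 97
base := [ 2, 5 ];;                # a tiny factor base
xs := [ 41, 43 ];;                # 41^2 = 32 and 43^2 = 200 (mod n)
cs := List( xs, x -> x^2 mod n );;

# Exponent vectors of the values in cs over the factor base.
exps := List( cs, c -> List( base,
          p -> Number( FactorsInt( c ), q -> q = p ) ) );;

# The matrix M over GF(2): an entry is 1 if the exponent is odd.
M := List( exps, row -> List( row, e -> e * One( GF(2) ) ) );;

# A vector v with v * M = 0 marks rows whose product is a square.
v := NullspaceMat( M )[1];;
sel := Filtered( [ 1 .. Length( xs ) ], i -> v[i] = One( GF(2) ) );;

x := Product( xs{sel} ) mod n;;              # 41 * 43 = 114 (mod n)
y := RootInt( Product( cs{sel} ), 2 ) mod n;;  # 32 * 200 = 80^2
GcdInt( x - y, n );                          # the divisor 17
]]>
</Listing>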
</Description>
</ManSection>
<Alt Only="HTML"> </Alt>
</Section>
<!-- #################################################################### -->
</Chapter>
<!-- #################################################################### -->