Quadratic Sieve

The quadratic sieve is a general purpose factoring method invented by Carl Pomerance in 1981. Invention of quadratic sieve greatly advanced the science of factorization of integers, particularly those considered difficult, those large integers containing exactly two prime factors of approximately the same precision.

Running time of quadratic sieve is dependent on size of integer to be factored, and the quadratic sieve has accomplished some impressive records in the field of factorization of integers.

This page presents some simple examples of the factorization of small integers. Although using the quadratic sieve on small integers is like driving a thumb tack with a sledge hammer, the examples illustrate how the quadratic sieve is implemented.

Introduction

The quadratic sieve attempts to build quadratic congruences of form $x^{2} \equiv y^{2} (\mod N)$ where $N$ is number to be factored.

Many congruences are created:

$x_{1}^{2} \equiv y_{1} (\mod N)$

$x_{2}^{2} \equiv y_{2} (\mod N)$

$x_{3}^{2} \equiv y_{3} (\mod N)$

$\dots \dots \dots$

$x_{10000}^{2} \equiv y_{10000} (\mod N)$

Suitable congruences are combined to produce congruent squares:

$X^{2} \equiv Y^{2} (\mod N)$ where

$X^{2} = x_{2}^{2} \cdot x_{3}^{2} \cdot x_{5}^{2} \dots x_{9999}^{2}$

$Y^{2} = y_{2} \cdot y_{3} \cdot y_{5} \dots y_{9999}$

Then $N ∣ (X^{2} - Y^{2})$ and, with a little luck,

${factor}_{1} = igcd (X + Y, N)$ and

${factor}_{2} = igcd (X - Y, N)$ where $igcd$ is function $integer greatest common divisor.$

Implementation

This exercise calculates prime factors of $N = 5137851827 .$

Build Factor base

Let the Factor Base contain 40 small primes, the first 40 for which $N$ is a quadratic residue.

[2, 11, 13, 17, 19, 23, 37, 41, 43, 59, 73, 103, 107, 113, 127, 131, 137, 149, 163, 179, 181, 
191, 197, 199, 211, 223, 233, 241, 251, 271, 281, 293, 307, 311, 317, 347, 367, 373, 379, 383]

Dictionary with logarithms of primes

Create a table or dictionary containing primes and logarithms of primes. Logarithms to base 2 are calculated. The reason for choosing base $2$ will become apparent later.

$x$	${log}_{2} (x) = \frac{\ln (x)}{\ln (2)}$
$2$	$1.0$
$11$	$3.4594316186372973$
$13$	$3.700439718141092$
$17$	$4.087462841250339$
$19$	$4.247927513443585$
$23$	$4.523561956057013$
$37$	$5.20945336562895$
$41$	$5.357552004618084$
$43$	$5.426264754702098$
$59$	$5.882643049361842$
$73$	$6.189824558880018$
$103$	$6.6865005271832185$
$107$	$6.741466986401147$
$113$	$6.820178962415188$
$127$	$6.9886846867721655$
$131$	$7.03342300153745$
$137$	$7.098032082960526$
$149$	$7.219168520462161$
$163$	$7.348728154231077$
$179$	$7.483815777264256$
$181$	$7.499845887083206$
$191$	$7.577428828035749$
$197$	$7.622051819456376$
$199$	$7.636624620543649$
$211$	$7.721099188707185$
$223$	$7.800899899920305$
$233$	$7.864186144654281$
$241$	$7.912889336229962$
$251$	$7.971543553950772$
$271$	$8.08214904135387$
$281$	$8.134426320220927$
$293$	$8.194756854422248$
$307$	$8.262094845370179$
$311$	$8.280770770130603$
$317$	$8.308339030139408$
$347$	$8.43879185257826$
$367$	$8.519636252843213$
$373$	$8.543031820255237$
$379$	$8.566054038171092$
$383$	$8.581200581924957$

Initialize sieve

In a regular sieve fine material falls through sieve and coarse material is retained. In this sieve coarse material falls through sieve and fine material is retained.

The expression "coarse material" refers to integers containing a small number of large primes, the great majority of integers.

The expression "fine material" refers to integers containing a large number of small primes, primes that are members of this factor base, or "smooth" according to this factor base.

The base or root of operations is the smallest value of $x$ for which $y = x^{2} - N$ is a positive number.

$r o o t = x = 71679, y = 27214 .$

# python code:

root = 71679

sieve = dict()

sieve[root] = [2]

for prime in factorBase[1:] :
    count = 0
    for x in range (root, root+2*(prime+2)) :
        y = x**2 - N
        if (y % prime) == 0 :
            if x in sieve : sieve[x] += [prime]
            else :  sieve[x] = [prime]
            count += 1
        if count >= 2 : break
    if count != 2 : print ('Internal error 1.')

L1 = sorted([p for p in sieve])

print()
print ('Initialized sieve contains', len(sieve),
       'keys in range from', L1[0], 'through', L1[-1],
       'or', L1[-1]-L1[0]+1, 'possibilities.')

Initialized sieve contains 65 keys in range from 71679 through 71954 or 276 possibilities.

for key in L1[::-1] :
    print (key, '|', sieve[key])

$x$	$Factors of y = x^{2} - N$
$71954$	$[379]$
$71946$	$[293]$
$71939$	$[311]$
$71917$	$[293]$
$71915$	$[373]$
$71906$	$[281]$
$71876$	$[211, 307]$
$71864$	$[317]$
$71852$	$[191]$
$71850$	$[241]$
$71849$	$[223, 383]$
$71847$	$[179]$
$71841$	$[347]$
$71839$	$[251]$
$71830$	$[271]$
$71824$	$[163]$
$71820$	$[149]$
$71817$	$[347]$
$71816$	$[149]$
$71815$	$[211]$
$71813$	$[181]$
$71806$	$[367]$
$71804$	$[131]$
$71800$	$[271, 307]$
$71786$	$[241]$
$71780$	$[191]$
$71779$	$[163]$
$71777$	$[113]$
$71776$	$[383]$
$71773$	$[199]$
$71772$	$[131]$
$71771$	$[233]$
$71763$	$[223]$
$71760$	$[107, 127]$
$71758$	$[103]$
$71757$	$[233]$
$71751$	$[73]$
$71750$	$[127]$
$71743$	$[311]$
$71737$	$[317]$
$71733$	$[113, 137, 251]$
$71727$	$[107]$
$71721$	$[43, 103]$
$71720$	$[181]$
$71717$	$[59]$
$71716$	$[41]$
$71715$	$[37]$
$71713$	$[197]$
$71712$	$[59]$
$71711$	$[179]$
$71706$	$[137, 199]$
$71703$	$[197]$
$71702$	$[41]$
$71697$	$[37]$
$71694$	$[73]$
$71692$	$[23]$
$71691$	$[367]$
$71690$	$[17, 23, 373]$
$71688$	$[17, 19]$
$71687$	$[379]$
$71686$	$[19]$
$71685$	$[281]$
$71684$	$[11, 13, 43]$
$71680$	$[13]$
$71679$	$[2, 11]$

Activate sieve

While information in table above is accurate, it is incomplete. Processing of sieve attempts to find values for which information is complete.

Processing locations in sequence

Step 1

The code examines location 71679. Product of factors 2,11 = 22, not enough to provide complete factorization of $y = x^{2} - N .$

Location 71679+2 or 71681 is initialized with [2] and location 71679+11 or 71690 is appended by [11]. Location 71679 is deleted.

$x$	$Factors of y = x^{2} - N$
$71697$	$[37]$
$71694$	$[73]$
$71692$	$[23]$
$71691$	$[367]$
$71690$	$[17, 23, 373, 11] # Factor 11 appended.$
$71688$	$[17, 19]$
$71687$	$[379]$
$71686$	$[19]$
$71685$	$[281]$
$71684$	$[11, 13, 43]$
$71681$	$[2] # Location initialized.$
$71680$	$[13]$

Step 2

The code examines location 71680. Factor 13 is not enough to provide complete factorization of $y = x^{2} - N .$

Location 71680+13 or 71693 is initialized with [13]. Location 71680 is deleted.

$x$	$Factors of y = x^{2} - N$
$71697$	$[37]$
$71694$	$[73]$
$71693$	$[13] # Location initialized.$
$71692$	$[23]$
$71691$	$[367]$
$71690$	$[17, 23, 373, 11]$
$71688$	$[17, 19]$
$71687$	$[379]$
$71686$	$[19]$
$71685$	$[281]$
$71684$	$[11, 13, 43]$
$71681$	$[2]$

Step 3

The code examines location 71681. Factor 2 is not enough to provide complete factorization of $y = x^{2} - N .$

Location 71681+2 or 71683 is initialized with [2]. Location 71681 is deleted.

$x$	$Factors of y = x^{2} - N$
$71697$	$[37]$
$71694$	$[73]$
$71693$	$[13]$
$71692$	$[23]$
$71691$	$[367]$
$71690$	$[17, 23, 373, 11]$
$71688$	$[17, 19]$
$71687$	$[379]$
$71686$	$[19]$
$71685$	$[281]$
$71684$	$[11, 13, 43]$
$71683$	$[2] # Location initialized.$

Step 4

Location 71682 does not exist in sieve. The code examines location 71683. Factor 2 is not enough to provide complete factorization of $y = x^{2} - N .$

Location 71683+2 or 71685 is appended by [2]. Location 71683 is deleted.

$x$	$Factors of y = x^{2} - N$
$71697$	$[37]$
$71694$	$[73]$
$71693$	$[13]$
$71692$	$[23]$
$71691$	$[367]$
$71690$	$[17, 23, 373, 11]$
$71688$	$[17, 19]$
$71687$	$[379]$
$71686$	$[19]$
$71685$	$[281, 2] # Factor 2 appended.$
$71684$	$[11, 13, 43]$

Processing continues in this way until location 71690 is examined. Factors [17, 23, 373, 11] are complete factorization of $y = x^{2} - N .$ Values [ 71690 , [17, 23, 373, 11] ] are appended to array "smooth."

Sieving continues until array "smooth" contains 45 members, at which time sieving is complete.

To improve speed

There are very few values of $y$ that are smooth. Therefore code is designed to discard unsuitable values of $x$ quickly, and to make other decisions quickly.

Decisions to discard unsuitable candidates are:

if x does not exist in sieve.

if there is only 1 prime for location x. Complete factorization of y requires at least 2 primes.

Code intended to reduce processing time: Template:RoundBoxTop $y = x^{2} - N$

$y = (r o o t + a)^{2} - N$

$y = r o o t^{2} + 2 \cdot a \cdot r o o t + a^{2} - N$

$y = r o o t^{2} - N + 2 \cdot a \cdot r o o t + a^{2}$

$y = rootSquaredMinusN + a(twoRoot + a)$

Multiplication in last statement above is of smaller numbers than calculation of $y = x^{2} - N .$ As size of $N$ increases, last statement above becomes more efficient. Template:RoundBoxBottom

Template:RoundBoxTop Divisions and multiplications are time consuming. Instead of multiplying primes or dividing $y = x^{2} - N$ to determine smoothness, logarithms of primes are added and compared to $\log (y) .$ Calculation of $\ln (y)$ or $\log (y)$ to base $10$ is time consuming. Because $y$ is type $int,$ a good approximation of $\log (y)$ to base $2$ can be calculated quickly. As it happens, in this application "close enough" is good enough.

$\log_{2} (37)$ for example:

As binary, $37 =^{'} 10010 1^{'}$

Length of $10010 1^{'}$ is $6 .$ Therefore $\log_{2} (37) = 5 .$

$\log_{2} (53)$ for example:

As binary, $53 =^{'} 11010 1^{'}$

Length of $11010 1^{'}$ is $6 .$ Therefore $\log_{2} (53) = 5 .$

However, second most significant bit is set. Therefore $\log_{2} (53) = 5 + 1 = 6 .$

$x$	$l o g_{2} (x)$
$1$	$0$
$2$	$1$
$6 > x \geq 3$	$2$
$12 > x \geq 6$	$3$
$24 > x \geq 12$	$4$
$48 > x \geq 24$	$5$
$96 > x \geq 48$	$6$
$(3 < < (k + 1)) > x \geq (3 < < k)$	$k + 2$

Template:RoundBoxBottom

Sieving complete

Python code that accomplishes the sieving is:

# python code.

limit = 10000
rootSquaredMinusN = root**2 - N
twoRoot = root << 1
notin = count1 = count = 0
smooth = []
numberOfSmoothDesired = 45
for x in range (root, root+limit) :
    if x & 0xFF == 0 :
        '''
This code is added to provide information about sieve
during sieving.
'''
        L1 = sorted([ p for p in sieve ])
        range_ = L1[-1] - L1[0] + 1
        print ('For x =',x, 'Range of sieve =',range_, 'number of empty locations =', range_ - len(L1) )
    if x not in sieve :
        notin += 1
        continue

    primes = sieve[x]
    for prime in primes :
        v = x + prime
        if v in sieve : sieve[v] += [prime]
        else : sieve[v] = [prime]
    del (sieve[x])

    if len(primes) == 1 :
        count1 += 1 ; continue                 
    a = x - root
    y = rootSquaredMinusN + a*(twoRoot + a)

    sumOfLogs = log[primes[0]] + log[primes[1]]
    for prime in primes[2:] : sumOfLogs += log[prime]

    log_y = logBase2(y)
    if sumOfLogs >= 0.95*log_y :
        count += 1
        product = primes[0] * primes[1]
        for prime in primes[2:] : product *= prime
        if y == product :
            smooth += [[x,primes]]
            if len(smooth) >= numberOfSmoothDesired : break
            
print ('Range of values of x =', x-root+1)
print ('Number of empty locations in sieve =', notin)
print ('Number of locations with only 1 prime =', count1)
print ('Number of locations tested =', x-root+1-notin-count1)
print ('Number of locations chosen for close examination =', count)
print ('Number of values of x retained after close examination =', len(smooth))

$x$	$Range of sieve$	$Number of empty locations$
$71680$	$275$	$210$
$71936$	$353$	$288$
$72192$	$350$	$287$
$72448$	$377$	$312$
$72704$	$331$	$266$
$72960$	$349$	$286$
$73216$	$366$	$305$
$73472$	$378$	$314$
$73728$	$347$	$276$
$73984$	$356$	$287$
$74240$	$368$	$306$
$74496$	$345$	$280$
$74752$	$358$	$292$
$75008$	$358$	$293$
$75264$	$342$	$278$
$75520$	$336$	$272$

Thousands of values of $x$ are tested. At any one time the range of values of $x$ in sieve is $< 400$ and sieve is mostly empty.

Range of values of x = 3922
Number of empty locations in sieve = 585
Number of locations with only 1 prime = 1331
Number of locations tested = 2006
Number of locations chosen for close examination = 45
Number of values of x retained after close examination = 45

Array of smooth values

Array "smooth" contains 45 members:

[71690, [17, 23, 373, 11]]
[71706, [137, 199, 13, 11]]
[71733, [113, 137, 251, 2]]
[71771, [233, 59, 37, 13, 2]]
[71800, [271, 307, 19, 11]]
[71824, [163, 103, 73, 17]]
[71849, [223, 383, 13, 11, 2]]
[71876, [211, 307, 23, 19]]
[72030, [103, 41, 37, 19, 17]]
[72066, [379, 131, 59, 19]]
[72071, [271, 59, 43, 41, 2]]
[72237, [233, 211, 43, 19, 2]]
[72268, [241, 163, 127, 17]]
[72372, [271, 43, 41, 19, 11]]
[72425, [367, 191, 59, 13, 2]]
[72509, [241, 211, 107, 11, 2]]
[72659, [211, 41, 37, 17, 13, 2]]
[72750, [241, 113, 23, 19, 13]]
[72809, [373, 281, 41, 19, 2]]
[72863, [149, 113, 23, 17, 13, 2]]
[73246, [163, 113, 59, 19, 11]]
[73279, [197, 179, 23, 13, 11, 2]]
[73322, [317, 179, 19, 17, 13]]
[73411, [307, 293, 127, 11, 2]]
[73472, [113, 107, 103, 19, 11]]
[73576, [347, 73, 43, 23, 11]]
[73868, [233, 131, 73, 13, 11]]
[73898, [163, 137, 37, 23, 17]]
[73900, [131, 107, 59, 23, 17]]
[73968, [293, 271, 19, 17, 13]]
[74176, [73, 41, 37, 23, 13, 11]]
[74427, [311, 127, 23, 17, 13, 2]]
[74435, [281, 181, 107, 37, 2]]
[74544, [127, 59, 23, 17, 13, 11]]
[74583, [293, 137, 37, 13, 11, 2]]
[74671, [127, 113, 73, 19, 11, 2]]
[74890, [199, 181, 179, 73]]
[75052, [271, 197, 127, 73]]
[75247, [251, 163, 149, 43, 2]]
[75252, [233, 211, 181, 59]]
[75365, [379, 163, 107, 41, 2]]
[75433, [293, 181, 127, 41, 2]]
[75462, [293, 113, 43, 23, 17]]
[75554, [223, 199, 43, 23, 13]]
[75600, [191, 37, 23, 19, 17, 11]]

If you look closely at contents of array "smooth," you notice that primes $383, 367, 317, 347, 311$ are used exactly 1 time for each prime. Therefore, locations $71849, 72425, 73322, 73576, 74427$ are not usable.

This section removes members containing these values of $x$ from array "smooth." Unfortunately, the removal of these values of $x$ causes other members to become unusable. The process continues until array "smooth" is ready, meaning that each prime in array "smooth" appears at least 2 times.

Removing unusable primes from smooth:
unusable = [383, 367, 317, 347, 311]
unusable = [223, 191]
len(smooth) = 38

The final version of array "smooth" contains 38 members:

[71690, [17, 23, 373, 11]]
[71706, [137, 199, 13, 11]]
[71733, [113, 137, 251, 2]]
[71771, [233, 59, 37, 13, 2]]
[71800, [271, 307, 19, 11]]
[71824, [163, 103, 73, 17]]
[71876, [211, 307, 23, 19]]
[72030, [103, 41, 37, 19, 17]]
[72066, [379, 131, 59, 19]]
[72071, [271, 59, 43, 41, 2]]
[72237, [233, 211, 43, 19, 2]]
[72268, [241, 163, 127, 17]]
[72372, [271, 43, 41, 19, 11]]
[72509, [241, 211, 107, 11, 2]]
[72659, [211, 41, 37, 17, 13, 2]]
[72750, [241, 113, 23, 19, 13]]
[72809, [373, 281, 41, 19, 2]]
[72863, [149, 113, 23, 17, 13, 2]]
[73246, [163, 113, 59, 19, 11]]
[73279, [197, 179, 23, 13, 11, 2]]
[73411, [307, 293, 127, 11, 2]]
[73472, [113, 107, 103, 19, 11]]
[73868, [233, 131, 73, 13, 11]]
[73898, [163, 137, 37, 23, 17]]
[73900, [131, 107, 59, 23, 17]]
[73968, [293, 271, 19, 17, 13]]
[74176, [73, 41, 37, 23, 13, 11]]
[74435, [281, 181, 107, 37, 2]]
[74544, [127, 59, 23, 17, 13, 11]]
[74583, [293, 137, 37, 13, 11, 2]]
[74671, [127, 113, 73, 19, 11, 2]]
[74890, [199, 181, 179, 73]]
[75052, [271, 197, 127, 73]]
[75247, [251, 163, 149, 43, 2]]
[75252, [233, 211, 181, 59]]
[75365, [379, 163, 107, 41, 2]]
[75433, [293, 181, 127, 41, 2]]
[75462, [293, 113, 43, 23, 17]]

The Matrix

Prime factors of y = x^2 - N

Assign a unique bit to each prime number in factor base:

$Factor$	$Unique bit assigned.$
$2$	$0000000000000000000000000000000000000001$
$11$	$0000000000000000000000000000000000000010$
$13$	$0000000000000000000000000000000000000100$
$17$	$0000000000000000000000000000000000001000$
$19$	$0000000000000000000000000000000000010000$
$23$	$0000000000000000000000000000000000100000$
$\dots \dots \dots$	$\dots \dots \dots$
$317$	$0000010000000000000000000000000000000000$
$347$	$0000100000000000000000000000000000000000$
$367$	$0001000000000000000000000000000000000000$
$373$	$0010000000000000000000000000000000000000$
$379$	$0100000000000000000000000000000000000000$
$383$	$1000000000000000000000000000000000000000$

Values of x in smooth array

Assign a unique bit to each value of $x$ in smooth array:

$Value of x$	$Unique bit assigned.$
$71690$	$00000000000000000000000000000000000001$
$71706$	$00000000000000000000000000000000000010$
$71733$	$00000000000000000000000000000000000100$
$71771$	$00000000000000000000000000000000001000$
$71800$	$00000000000000000000000000000000010000$
$71824$	$00000000000000000000000000000000100000$
$\dots \dots \dots$	$\dots \dots \dots$
$75052$	$00000100000000000000000000000000000000$
$75247$	$00001000000000000000000000000000000000$
$75252$	$00010000000000000000000000000000000000$
$75365$	$00100000000000000000000000000000000000$
$75433$	$01000000000000000000000000000000000000$
$75462$	$10000000000000000000000000000000000000$

Create Matrix

The matrix contains patterns representing values of $x$ and patterns representing the corresponding prime factors of $y = x^{2} - N .$

$Pattern representing values of x$	$Pattern representing factors of y .$
$00000000000000000000000000000000000001$	$0010000000000000000000000000000000101010$
$00000000000000000000000000000000000010$	$0000000000000000100000010000000000000110$
$00000000000000000000000000000000000100$	$0000000000010000000000010010000000000001$
$00000000000000000000000000000000001000$	$0000000000000100000000000000001001000101$
$00000000000000000000000000000000010000$	$0000000100100000000000000000000000010010$
$00000000000000000000000000000000100000$	$0000000000000000000001000000110000001000$
$00000000000000000000000000000001000000$	$0000000100000001000000000000000000110000$
$00000000000000000000000000000010000000$	$0000000000000000000000000000100011011000$
$00000000000000000000000000000100000000$	$0100000000000000000000001000001000010000$
$00000000000000000000000000001000000000$	$0000000000100000000000000000001110000001$
$00000000000000000000000000010000000000$	$0000000000000101000000000000000100010001$
$00000000000000000000000000100000000000$	$0000000000001000000001000100000000001000$
$00000000000000000000000001000000000000$	$0000000000100000000000000000000110010010$
$00000000000000000000000010000000000000$	$0000000000001001000000000001000000000011$
$00000000000000000000000100000000000000$	$0000000000000001000000000000000011001101$
$00000000000000000000001000000000000000$	$0000000000001000000000000010000000110100$
$00000000000000000000010000000000000000$	$0010000001000000000000000000000010010001$
$00000000000000000000100000000000000000$	$0000000000000000000000100010000000101101$
$00000000000000000001000000000000000000$	$0000000000000000000001000010001000010010$
$00000000000000000010000000000000000000$	$0000000000000000010010000000000000100111$
$00000000000000000100000000000000000000$	$0000000110000000000000000100000000000011$
$00000000000000001000000000000000000000$	$0000000000000000000000000011100000010010$
$00000000000000010000000000000000000000$	$0000000000000100000000001000010000000110$
$00000000000000100000000000000000000000$	$0000000000000000000001010000000001101000$
$00000000000001000000000000000000000000$	$0000000000000000000000001001001000101000$
$00000000000010000000000000000000000000$	$0000000010100000000000000000000000011100$
$00000000000100000000000000000000000000$	$0000000000000000000000000000010011100110$
$00000000001000000000000000000000000000$	$0000000001000000000100000001000001000001$
$00000000010000000000000000000000000000$	$0000000000000000000000000100001000101110$
$00000000100000000000000000000000000000$	$0000000010000000000000010000000001000111$
$00000001000000000000000000000000000000$	$0000000000000000000000000110010000010011$
$00000010000000000000000000000000000000$	$0000000000000000100110000000010000000000$
$00000100000000000000000000000000000000$	$0000000000100000010000000100010000000000$
$00001000000000000000000000000000000000$	$0000000000010000000001100000000100000001$
$00010000000000000000000000000000000000$	$0000000000000101000100000000001000000000$
$00100000000000000000000000000000000000$	$0100000000000000000001000001000010000001$
$01000000000000000000000000000000000000$	$0000000010000000000100000100000010000001$
$10000000000000000000000000000000000000$	$0000000010000000000000000010000100101000$

Process matrix

                   Pattern representing values of x   |           Pattern representing factors of y
------------------------------------------------------|--------------------------------------------------
Line 1:   x1 = 00000000000000000000000000000000000001 |  f1 = 001000000000000000000000000000000010 1 010
               00000000000000000000000000000000000010 |       000000000000000010000001000000000000 0 110
               00000000000000000000000000000000000100 |       000000000001000000000001001000000000 0 001
               00000000000000000000000000000000001000 |       000000000000010000000000000000100100 0 101
               00000000000000000000000000000000010000 |       000000010010000000000000000000000001 0 010
Line 2:   x2 = 00000000000000000000000000000000100000 |  f2 = 000000000000000000000100000011000000 1 000
               00000000000000000000000000000001000000 |       000000010000000100000000000000000011 0 000
Line 3:   x3 = 00000000000000000000000000000010000000 |  f3 = 000000000000000000000000000010001101 1 000
               00000000000000000000000000000100000000 |       010000000000000000000000100000100001 0 000
               00000000000000000000000000001000000000 |       000000000010000000000000000000111000 0 001
               00000000000000000000000000010000000000 |       000000000000010100000000000000010001 0 001
Line 4:   x4 = 00000000000000000000000000100000000000 |  f4 = 000000000000100000000100010000000000 1 000
               00000000000000000000000001000000000000 |       000000000010000000000000000000011001 0 010
               00000000000000000000000010000000000000 |       000000000000100100000000000100000000 0 011
Line 5:   x5 = 00000000000000000000000100000000000000 |  f5 = 000000000000000100000000000000001100 1 101
               00000000000000000000001000000000000000 |       000000000000100000000000001000000011 0 100
               00000000000000000000010000000000000000 |       001000000100000000000000000000001001 0 001
Line 6:   x6 = 00000000000000000000100000000000000000 |  f6 = 000000000000000000000010001000000010 1 101
               00000000000000000001000000000000000000 |       000000000000000000000100001000100001 0 010
               00000000000000000010000000000000000000 |       000000000000000001001000000000000010 0 111
               00000000000000000100000000000000000000 |       000000011000000000000000010000000000 0 011
               00000000000000001000000000000000000000 |       000000000000000000000000001110000001 0 010
               00000000000000010000000000000000000000 |       000000000000010000000000100001000000 0 110
Line 7:   x7 = 00000000000000100000000000000000000000 |  f7 = 000000000000000000000101000000000110 1 000
Line 8:   x8 = 00000000000001000000000000000000000000 |  f8 = 000000000000000000000000100100100010 1 000
Line 9:   x9 = 00000000000010000000000000000000000000 |  f9 = 000000001010000000000000000000000001 1 100
               00000000000100000000000000000000000000 |       000000000000000000000000000001001110 0 110
               00000000001000000000000000000000000000 |       000000000100000000010000000100000100 0 001
Line 10: x10 = 00000000010000000000000000000000000000 | f10 = 000000000000000000000000010000100010 1 110
               00000000100000000000000000000000000000 |       000000001000000000000001000000000100 0 111
               00000001000000000000000000000000000000 |       000000000000000000000000011001000001 0 011
               00000010000000000000000000000000000000 |       000000000000000010011000000001000000 0 000
               00000100000000000000000000000000000000 |       000000000010000001000000010001000000 0 000
               00001000000000000000000000000000000000 |       000000000001000000000110000000010000 0 001
               00010000000000000000000000000000000000 |       000000000000010100010000000000100000 0 000
               00100000000000000000000000000000000000 |       010000000000000000000100000100001000 0 001
               01000000000000000000000000000000000000 |       000000001000000000010000010000001000 0 001
Line 11: x11 = 10000000000000000000000000000000000000 | f11 = 000000001000000000000000001000010010 1 000
                                                                                                   ^
                                           Lines designated with a line number all have bit 3 set: ^

Last line of matrix $Line 11$ becomes the source of the next operation. $Line 11$ has $bit 3$ set. Targets of the operation are all other lines with $bit 3$ set, lines designated as $Line 1$ through $Line 10.$

$Line 11$ is combined in turn with $Line 10, Line 9, Line 8, \dots Line 2, Line 1,$ by means of operation exclusive or, thus:

x10 = x10 ^ x11
f10 = f10 ^ f11

x9 = x9 ^ x11
f9 = f9 ^ f11

...............

x1 = x1 ^ x11
f1 = f1 ^ f11

$Line 11$ is deleted and $bit 3$ has been removed from all values on Right Hand Side of matrix.

The process is repeated. When a zero value is found on Right Hand Side of matrix, it means that this value is a perfect square.

Produce factors of N

Processing the matrix revealed 7 values of X that produce a perfect square on Right Hand Side of congruence.

Perfect square #1

00000000010000000001001000100000000000

The 4 bits set in this pattern represent values of x : [72268, 72750, 73246, 74544]

X = 72268 * 72750 * 73246 * 74544 = 28706195569530528000

These 4 values of x produce a value on RHS of congruence containing factors:

[11, 11, 13, 13, 17, 17, 19, 19, 23, 23, 59, 59, 113, 113, 127, 127, 163, 163, 241, 241]

Every factor in this list appears an even number of times.

Therefore, Y is product of factors: [11, 13, 17, 19, 23, 59, 113, 127, 163, 241]

Y = 35335010025681509

Using $p1 = iGCD(X+Y, N)$ and $p2 = iGCD(X-Y, N)$

p1,p2 = 1, 5137851827

This congruence produced trivial factors of N.

Perfect square #2

01010000010010001000001111010010000000

The 11 bits set in this pattern represent values of x :

[72030, 72237, 72372, 72509, 72659, 72750, 73472, 73968, 74544, 75252, 75433]

X = product of values of x = 331906592471738688342554821695566634067059049758720000

From these 11 values of x:

Y = product of factors:

[2, 2, 11, 11, 13, 13, 17, 17, 19, 19, 19, 23, 37, 41, 41, 43, 
59, 103, 107, 113, 127, 181, 211, 211, 233, 241, 271, 293]

Y = 3343990074727707345948316157765706289327953268

Using $p1 = iGCD(X+Y, N)$ and $p2 = iGCD(X-Y, N)$

p1,p2 = 89123, 57649

This congruence produced non-trivial factors of N.

Example #2

This example presents a more advanced implementation of the quadratic sieve because it includes the following features:

Use of powers of primes.

Use of primes not in factor base, "large" primes.

Participation in a distributed computing effort.

Past attempts to factor large integers were successful only because many computers contributed to the production of "smooth" values of $y .$

One way to organize a distributed computing effort is to assign different tasks to many different computers so that no computer duplicates the work of another. For example, different computers can be assigned the following tasks:

Smooth values for $y = x^{2} - N$ with $y$ positive.

Smooth values for $y = x^{2} - N$ with $y$ negative.

Smooth values for $y = x^{2} - 2 \cdot N$ with $y$ positive.

Smooth values for $y = x^{2} - 2 \cdot N$ with $y$ negative.

This example assumes the role of computer assigned:

Smooth values for $y = x^{2} - 4 \cdot N$ with $y$ negative.

To ensure that this task does not duplicate other work, values of $x$ are all odd.

Let $n = 147573952589676412927 .$

Then $N = 4 \cdot n .$

$base = 24296004001$ where $base$ is an odd integer very close to $\sqrt{N} .$

Template:RoundBoxTop Because $N$ is even and $x$ is always odd, $y = x^{2} - N$ is always odd.

Consequently, prime number $2$ is not in factor base. Template:RoundBoxBottom

Factor base

Create the factor base, a list containing 230 small primes for which $N$ is a quadratic residue:

$[3, 7, 13, 23, 37, 41, 53, 61, 67, \dots \dots 3121, 3137, 3167, 3203, 3217, 3251]$

You may notice that primes $5, 11, 17, 19, \dots$ do not exist in this factor base.

Tables of logarithms

Create a dictionary of logarithms, called $logs$ :

$x$	${log}_{2} (x) = \frac{\ln (x)}{\ln (2)}$
$3$	$1.58496$
$7$	$2.80735$
$13$	$3.70044$
$23$	$4.52356$
$\dots$	$\dots \dots$
$3167$	$11.6289$
$3203$	$11.64521$
$3217$	$11.6515$
$3251$	$11.66667$

For example, ${log}_{2} (3251) = logs[3251] = 11.66667 .$

Create a dictionary of antilogarithms, called $aLogs$ :

${log}_{2} (x)$	$x$
$1.58496$	$3$
$2.80735$	$7$
$3.70044$	$13$
$4.52356$	$23$
$\dots \dots$	$\dots$
$11.6289$	$3167$
$11.64521$	$3203$
$11.6515$	$3217$
$11.66667$	$3251$

For example, desired prime $= aLogs[11.6515] = 3217 .$

Create sieve

Preparation

The following code adds to sieve information concerning powers of primes.

A closer look at this information will be presented later.

# python code

sieve = dict()

def powers_of_primes (p,power,L1) :
    '''
x1,x2 = powers_of_primes (p,power,[xa,xb])
This function may return None.

If power == 2 :
L1 contains 2 values of x that satisfy: x**2 /// N (mod p)
This function calculates the values of x for which:
x**2 /// N (mod (p**2))

If power == 3 :
L1 contains 2 values of x that satisfy: x**2 /// N (mod (p**2))
This function calculates the values of x for which:
x**2 /// N (mod (p**3))

and so on.

The method presented here, while simple and accurate, is
suitable for only small primes. As the size of p increases,
this method becomes impractical.
'''
    thisName = 'powers_of_primes() :'
    P_ = p**(power-1)
    P = P_ * p

    if P > 1_000_000 : return None

    L2 = []
    for x_ in L1 :
# Verify that x_ is valid.
        y = x_**2 - N
        if y % P_ :
            print (thisName,'Error 1 for x,p =',x_,p)
            return None
        for k in range(0,p) :
            x = x_ - k * P_
            y = x**2-N
            if y%P : continue
            if not (x&1) : x -= P # x must be odd.
            if not (x&1) :
                print ('error1a') ; continue
            y = x**2 - N
            if y%P :
                print ('error2') ; continue
# Every increment that goes into sieve must be even because:
# Odd x + even increment gives odd x.
            if x in sieve :
# Value of prime p is not included here. It will be recovered later.
# Hence the requirement for dictionary aLogs.
                sieve[x] += [(P<<1,logs[p])]
            else :
                sieve[x] = [(P<<1,logs[p])]
            L2 += [x]
            break
    if len(L2) != 2 :
        print (thisName, 'error2a, len(L2) =', len(L2))
        return None
    return L2

Initialize sieve

# python code

for p in factorBase :
    modulo = N%p
    for q in range (p>>1, 0, -1) :
        y = q**2 - modulo
        if y%p : continue

        base1 = base - (base%p)
        L1 = []
        for x in (base1 + q, base1 - q) :
            if not (x&1) : x -= p # Make x odd.
            if not (x&1) : print ('error3')
            y = x**2 - N
            if y%p :
                print ('error4') ; continue
# Every increment that goes into sieve must be even because:
# Odd x + even increment gives odd x.
            if x in sieve :
                sieve[x] += [(p<<1,logs[p])]
            else :
                sieve[x] = [(p<<1,logs[p])]
            L1 += [x]
        break
        
    last = L1
    for power in range (2,14) :
        this = powers_of_primes(p,power,last)
        if this == None : break
        last = this

Initialized sieve contains 624 valid locations.

Check sieve

It is wise to check entries in sieve because an error in initialized sieve can cause many problems later.

First entry in sieve is :

$sieve [24296004819] = [(6166, 11.59012)]$

Value $11.59012 = {log}_{2} (3083)$ and $6166 = 2 * 3083 .$

Value $6166$ is the decrement, always even.

Decrement must be exactly divisible by associated prime.

Value $24296004819 = x$ and $(x^{2} - N) % prime$ must be $0 .$

The other entry containing this prime is:

$sieve [24296002923] = [(3902, 10.93), (6166, 11.59012)]$

There must be exactly 2 instances of each decrement, and difference between associated values of $x$ must not be exactly divisible by decrement.

Here is an entry for prime $3 :$

$sieve [24294998077] = [(1062882, 1.58496)] .$ Value $1.58496 = {log}_{2} (3)$

Decrement $= 1062882 = 2 * 531441$ and $531441 = 3^{12} .$

This decrement appears in the sieve exactly 2 times, at $24294998077$ and $24295715435,$ and $(24294998077 - 24295715435) % 1062882$ is non-zero, correct.

Check: $(2429499807 7^{2} - N) % (3^{12}) .$ This calculation $= 0,$ which is correct.

In this way, all entries in sieve are checked.

Activate sieve

# python code.
print ('''
Activate sieve
''')
log_of_maximum_prime2 = 0.9*2*logs[factorBase[-1]]
smooth = []
last_log = 0

# (x_ + a)**2 = x_**2 + 2ax_ + a**2
# (x_ + a)**2 - N = x_**2-N + 2ax_ + a**2
# (x_ + a)**2 - N = x_**2-N + a(2x_ + a)
# x_ = base
base_squared_minus_N = base**2 - N
two_times_base = 2*base

count1 = count2 = count3 = 0

for x in range (max, max-(2*(10**5)), -2) :
    if (x & 0x7FFF) == 1 :
# This piece of code is not normally required.
# It is included here to provide some statistics about sieve during sieving.
        L1 = sorted([ p for p in sieve ])
        min_ = L1[0] ; max_ = L1[-1] ; used = len(L1)
        print ('''
max = {}
min = {}
range of sieve = {}
number of values used = {}
'''.format(max_, min_, 1+max_-min_, len(sieve))
)

    if x not in sieve :
        count1 += 1 ; continue

    values = list(sieve[x]) ; del (sieve[x])

    sum_of_logs = 0
    for v in values :
        increment, log2_ = v
        next = x-increment
        if next in sieve : sieve[next] += [v]
        else : sieve[next] = [v]
        sum_of_logs += log2_

    if sum_of_logs < last_log :
        count2 += 1 ; continue

    a = x - base ;
    y = base_squared_minus_N + a*(two_times_base + a)
    logy = logBase2 (abs(y))
# this_log is < logy to allow inclusion of "large" prime.
    last_log = this_log = logy-log_of_maximum_prime2
    if sum_of_logs < this_log :
        count3 += 1 ; continue
    smooth += [[x,values]]

The following data are representative of statistics collected during sieving.

max = 24295997441
min = 24294230499
range of sieve = 1766943
number of values used = 601

Range of values in sieve was about 1.75 million, and number of entries in sieve at any time was about 600. Template:RoundBoxTop Within this range of values of $x$ only odd values are used. Template:RoundBoxBottom

# python code.

print ('''
Sieving complete:

not in sieve = count1 = {}
(sum_of_logs < last_log) = count2 = {}
(sum_of_logs >= last_log), but (sum_of_logs < this_log) =  count3 = {}
length of ( smooth ) = {}
total = {}
'''.format(count1,count2,count3,len(smooth), count1+count2+count3+len(smooth))
)

Sieving complete:

not in sieve = count1 = 8238
(sum_of_logs < last_log) = count2 = 89543
(sum_of_logs >= last_log), but (sum_of_logs < this_log) =  count3 = 2
length of ( smooth ) = 2217
total = 100000

8,238 values of $x$ were discarded because they did not exist in sieve.

89,543 values of $x$ were discarded because they were not "smooth." Note that decision to discard did not depend on calculation of $y$ or ${log}_{2} (y) .$

2,219 values of $x$ were examined for smoothness. Of these, 2 were discarded.

Array "smooth" contains 2,217 members.

Process array "smooth."

Array "smooth" contains entries similar to this example:

[
24296003881, 
[(106, 5.72792), (386, 7.59246), (562, 8.13443), (54, 1.58496), (18, 1.58496), (14, 2.80735), (6, 1.58496)]
]

$24296003881$ is value of $x .$

Antilogs are replaced by the corresponding prime.

[(106, 53), (386, 193), (562, 281), (54, 3), (18, 3), (14, 7), (6, 3)]

decrement $106 = 2 \cdot 53 .$

decrement $386 = 2 \cdot 193 .$

decrement $562 = 2 \cdot 281 .$

decrement $54 = 2 \cdot 27 = 2 \cdot 3^{3} .$

decrement $18 = 2 \cdot 9 = 2 \cdot 3^{2} .$

decrement $14 = 2 \cdot 7 .$

decrement $6 = 2 \cdot 3 .$

If there is entry for $3^{3},$ there must be entry for $3^{2} .$

If there is entry for $3^{2},$ there must be entry for $3 .$

# python code.
>>> N,x
(590295810358705651708, 24296003881)
>>> y = abs(x**2 - N) ; y
5773138589547
>>> q,remainder = divmod (y, 53*193*281*3*3*7*3);q,remainder
(10627, 0)
>>>

Product $53 * 193 * 281 * 3 * 3 * 7 * 3$ divides $y$ exactly.

Quotient $10627$ is a "large" prime. Without the powers of $3,$ this entry may have been missed.

This entry becomes :

[
24296003881, 
[53, 193, 281, 3, 7, 10627]
]

When a prime is included here, it was used an odd number of times above.

Template:RoundBoxTop Here is an example of a smooth value of $y :$

[24295805845, [3203, 641, 577, 547, 127, 13, 3, 3]]

There is no large prime. This entry becomes:

[24295805845, [3203, 641, 577, 547, 127, 13]]

Prime $3$ is not used. It did not appear an odd number of times. Template:RoundBoxBottom The smooth array is processed as necessary to eliminate any entry containing a prime used only once. Any prime in smooth is used at least twice.

Here is an example deleted because the large prime was used only once:

[24295822145, [619, 181, 157, 83, 3], 2017529]

Here is an example retained because the large prime was used twice:

[24296002963, [797, 83, 3, 3, 23, 3, 3], 408803]

This entry becomes:

[24296002963, [797, 83, 23, 408803]]

Recall that the range of values of $x$ examined by the sieve was 200,000.

Yet, here we see large prime 408,803 used twice. How is this possible?

Because prime $1732331$ was found twice, the same question can be asked for these entries:

[24295960025, [863, 547, 67, 13, 3, 1732331]]
[24295924525, [1277, 857, 97, 7, 3, 1732331]]

Here is an example of a large prime used 3 times:

[24295844383, [1259, 983, 863, 67, 12043]]
[24295854167, [2179, 1093, 443, 191, 3, 12043]]
[24295964813, [2347, 661, 641, 53, 3, 12043]]

If you look closely, $24295964813 - 24295844383 = 120430 = 12043 \cdot 10 .$ Why only 3 entries for prime $12043 ?$

Result is array smooth containing 286 entries. Of these 96 are smooth. 190 contain large primes.

The total number of primes is 281. Of these 91 are large primes. You can see that large primes are almost a third of all primes used.

Note that number of entries is more than number of primes. This indicates that the probability of finding perfect squares is good. Template:RoundBoxTop Although this array is called "smooth," the numbers show that it is not smooth. Perhaps a name such as "useful" would be better.

$91 * 2 = 182,$ less than $190$ containing large primes. This shows that some large primes must be used at least $3$ times. Template:RoundBoxBottom

Produce results

Create and process matrix as in example above.

Remember that all values of $y$ are negative. Use last entry of matrix as source and combine this source with all other lines in matrix. This eliminates factor $- 1$ from matrix.

6 exact squares were produced:

0x441c794316262b96cba6611846d4d08f140a9112094634a0a4005ca0a55049f3242a9a
0x191d545b83a46c88439a5c3aca122ec85583100679c867ecb95c568398b33ae37c35f49
0x2108304080466a0a6b086014a1d2fb252586943d48ac34aa938041431a58210034283a4
0x421979c5e5689e36142c1b3ee2158332eab3a2f21ba1ba9bab1fff1ce2cabeec565245d
0x89e6c569cd412504035f785fb66f3cb43542eb37a09f8572451618e98153cc889520475
0x1093981134aa745002d0c560f80a3686d00c76b3d3e7e44aefbf81350998e106ad09d833

First exact square produced the trivial solution. All others produced the non-trivial solution.

Here are details about solution number 2:

0x191d545b83a46c88439a5c3aca122ec85583100679c867ecb95c568398b33ae37c35f49

Bits in this solution represent values of $x :$

24295805611  24295808157  24295809107  24295810993  24295813401  24295814777
24295815203  24295815539  24295817177  24295817523  24295818143  24295822355
24295822809  24295823345  24295823809  24295824387  24295826851  24295829685
24295834427  24295834433  24295834903  24295835917  24295836679  24295840049
24295840069  24295842793  24295845053  24295850555  24295851025  24295852037
24295856707  24295856797  24295858601  24295859223  24295860675  24295864007
24295868039  24295869383  24295870685  24295872787  24295874983  24295875361
24295875395  24295877125  24295882233  24295887633  24295888671  24295889123
24295889763  24295891739  24295891827  24295894117  24295896665  24295897715
24295897757  24295898563  24295898639  24295900473  24295900529  24295902379
24295903131  24295903735  24295903845  24295906899  24295907743  24295909587
24295909633  24295915013  24295915591  24295925549  24295928591  24295928827
24295933043  24295933489  24295935911  24295937075  24295938625  24295941777
24295944341  24295946505  24295947657  24295947869  24295948051  24295949051
24295950065  24295950821  24295955895  24295956829  24295958335  24295959525
24295960117  24295962247  24295963279  24295963477  24295965661  24295965791
24295965927  24295966189  24295967757  24295969127  24295970477  24295972089
24295972745  24295973099  24295976427  24295978303  24295981347  24295981883
24295982671  24295982999  24295983685  24295984517  24295985429  24295985575
24295987391  24295987487  24295992373  24295993279  24295993381  24295994227
24295994959  24295996271  24295997723  24295997941  24295998495  24295998895
24296000299  24296001081  24296001127  24296002045  24296002715  24296002963

$x = 776937957 \dots \dots 4216461181640625,$ integer containing $1371$ decimal digits.

$y = 128050773 \dots \dots 5131604744727049,$ integer containing $1021$ decimal digits.

# python code.
    gcd1 = iGCD (x+y,n)
    print ('gcd1 =',gcd1)
    gcd2 = iGCD (x-y,n)
    print ('gcd2 =',gcd2)

gcd1 = 193707721
gcd2 = 761838257287

Review

Example #2 above began with 100,000 simple operations on the sieve. Yes, the range of the sieve was about 1.75 million. This means that there were entries in bottom of sieve that were never used.

The 100,000 simple operations produced about 2,000 items of data worth investigating, This investigation produced a matrix of length 286 from which 5 non-trivial congruences were derived.

The original factor base contained 230 primes. Of these, 190 were used, leaving 40 unused.

If it seems that there was "too much data," this is not a problem. If you find yourself with not enough data, this really is a problem. Template:RoundBoxTop $n$ is in fact $2^{67} - 1,$ called $M_{67},$ a number made famous by Professor Frank Nelson Cole in 1903.

Before the era of computers he produced the factors of $n$ with pencil on paper, a magnificent achievement. Template:RoundBoxBottom

Links to Related Topics

Professor Cole's Paper

Quadratic Sieve

Contents

Introduction

Implementation

Build Factor base

Dictionary with logarithms of primes

Initialize sieve

Activate sieve

Processing locations in sequence

Step 1

Step 2

Step 3

Step 4

To improve speed

Sieving complete

Array of smooth values

The Matrix

Prime factors of y = x^2 - N

Values of x in smooth array

Create Matrix

Process matrix

Produce factors of N

Perfect square #1

Perfect square #2

Example #2

Factor base

Tables of logarithms

Create sieve

Preparation

Initialize sieve

Check sieve

Activate sieve

Process array "smooth."

Produce results

Review

Links to Related Topics

Navigation menu

Quadratic Sieve

Introduction

Implementation

Build Factor base

Dictionary with logarithms of primes

Initialize sieve

Activate sieve

Processing locations in sequence

Step 1

Step 2

Step 3

Step 4

To improve speed

Sieving complete

Array of smooth values

The Matrix

Prime factors of y = x^2 - N

Values of x in smooth array

Create Matrix

Process matrix

Produce factors of N

Perfect square #1

Perfect square #2

Example #2

Factor base

Tables of logarithms

Create sieve

Preparation

Initialize sieve

Check sieve

Activate sieve

Process array "smooth."

Produce results

Review

Links to Related Topics

Navigation menu

Search