Smalltalk › Usenets › Dolphin Smalltalk

Some Loose Integer Methods

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

9 messages Options

Randy Manning

Some Loose Integer Methods

Add these methods to Class Integer, and experiment. These are very fast
methods for working with prime numbers.

isPrime
"Answer whether the receiver is a prime number.
Note, hit Ctrl+Break to interrupt this calculation loop.

Example: 123456791 isPrime "

| selfSqrt divisor addendArray indexArray index |

(self < 2) ifTrue: [^false].
"The following two arrays are used to generate the divisor sequence:
2,3,5,7,9,11,13,17,19,21,23,27,... i.e., after divide checking by 2,3, and
5, we
only divide check by the odd numbers greater than 5, which do not end in
5."
addendArray := #( 1 2 2 2 2 2 4 ).
indexArray := #( 2 3 4 5 6 7 4 ).
selfSqrt := self sqrt. divisor := 2. index := 1.
[divisor > selfSqrt]
whileFalse: [
(self \\ divisor) = 0 "modulus, remainder division"
ifFalse: ["the divisor is not a factor of self"
divisor := divisor + (addendArray at: index).
index := indexArray at: index
]
ifTrue: ["the divisor is a factor of self"
^false
].
].
^true

primeFactors
"Answer an Array of Integers containing the prime factors of the receiver.
Note, hit Ctrl+Break to interrupt this calculation loop.

Example: 1234567890 primeFactors "

| col factor divisor addendArray indexArray index |

(self < 2) ifTrue: [self error: 'Receiver must be greater than or equal to
2'. ^nil].
"The following two arrays are used to generate the divisor sequence:
2,3,5,7,9,11,13,17,19,21,23,27,... i.e., after divide checking by 2,3, and
5, we
only divide check by the odd numbers greater than 5, which do not end in
5."
addendArray := #( 1 2 2 2 2 2 4 ).
indexArray := #( 2 3 4 5 6 7 4 ).
col := OrderedCollection new.
factor := self. divisor := 2. index := 1.
[divisor > (factor sqrt)]
whileFalse: [
(factor \\ divisor) = 0 "modulus, remainder division"
ifFalse: ["the divisor is not a prime factor of factor"
divisor := divisor + (addendArray at: index).
index := indexArray at: index.
]
ifTrue: ["the divisor is a prime factor of factor"
col add: divisor.
factor := factor // divisor "integer division"
].
].
col add: factor.
^col asArray

primesTo: anInteger
"Answer an Array of Integers containing the prime numbers ranging from the
receiver to anInteger.
Note, hit Ctrl+Break to interrupt this calculation loop.

Example: 12345000 primesTo: 12346000 "

(self < 2) ifTrue: [self error:'Receiver must be greater than or equal to
2'. ^nil ].
(anInteger isKindOf: Integer)
ifFalse: [self error: 'Argument must be an Integer'. ^nil ].
self > anInteger ifTrue: [self error: 'Argument must be greater than or
equal to receiver'. ^nil ].
^((self to: anInteger) select: [ :num | num isPrime]) asArray

Good Luck
Randy

Ian Bartholomew-18

Re: Some Loose Integer Methods

Randy,

> Add these methods to Class Integer, and experiment. These are very fast
> methods for working with prime numbers.

You might want to consider giving #primeTo: a special test and, if the end
value is below a predetermined limit, using a sieve rather than a integer by
integer enumeration. For example my standard sieve (without any special
optimisation) finds all the primes below 1,000,000 in ~1.4 seconds compared
to the ~18 seconds using a #select loop.

Just a thought.

--
Ian Bartholomew

Joerg Beekmann

Re: Some Loose Integer Methods

How many primes are there below 1,000,000? I suspect there may not be too
many to type into a method as a literal array. That would certainly let you
"find" them the fastest way possible.

--
Joerg Beekmann
jbeekmann<at>telus.net
"Ian Bartholomew" <[hidden email]> wrote in message
news:DFlR9.8345$4k6.792230@wards...
> Randy,
>
> > Add these methods to Class Integer, and experiment. These are very fast
> > methods for working with prime numbers.
>
> You might want to consider giving #primeTo: a special test and, if the end
> value is below a predetermined limit, using a sieve rather than a integer
by
> integer enumeration. For example my standard sieve (without any special
> optimisation) finds all the primes below 1,000,000 in ~1.4 seconds
compared
> to the ~18 seconds using a #select loop.
>
> Just a thought.
>
> --
> Ian Bartholomew
>
>

Ian Bartholomew-18

Re: Some Loose Integer Methods

Joerg Beekmann wrote:
> How many primes are there below 1,000,000? I suspect there may not be too
> many to type into a method as a literal array. That would certainly let
> you "find" them the fastest way possible.

You must be a faster typist than me :-)

There are 78498 primes below 1000000 and the literal string needed would be
538471 characters long.

--
Ian Bartholomew

Randy Manning

Re: Some Loose Integer Methods

Ian,

I clicked "Reply" instead of "Reply Group" in my last message to you...
OOPS!!
The Greek sieve method is indeed very fast! Thanks for your source!. I'd
like to post it here, if it's OK with you? However, I think that I would
need to implement it in such a way as to know the memory and speed
capabilities of the machine that it is currently running on. I'm still not
sure if the sieve (from 2 to N) method is faster than my #primesTo: method
for integers within a small range of very large valued integers; Say, from
within the range of: 300,000,000. to 300,000,999.

By the way, my #primesTo: method was just a quick afterthought after I had
created #isPrime and #primeFactors methods.

But, I have a feeling that, somehow, the Greek sieve still rules! Long
live Athens!

~Randy

Ian Bartholomew-18

Re: Some Loose Integer Methods

Randy,

> The Greek sieve method is indeed very fast! Thanks for your source!. I'd
> like to post it here, if it's OK with you?

Sure. It's nothing special though, just a basic unoptimised sieve. You can
probably find better implementations if you do a bit of browsing - the UIUC
archive maybe?

> However, I think that I would
> need to implement it in such a way as to know the memory and speed
> capabilities of the machine that it is currently running on. I'm still not
> sure if the sieve (from 2 to N) method is faster than my #primesTo: method
> for integers within a small range of very large valued integers; Say, from
> within the range of: 300,000,000. to 300,000,999.

Yes. That's why I suggested a test to use the faster sieve if the
parameters were small and switch to your original method if not. The test
could then take into account the speed/memory of the target machine.

If your application is _only_ ever going to want large primes then it's
probably not worth bothering about the sieve.

> By the way, my #primesTo: method was just a quick afterthought after I
> had created #isPrime and #primeFactors methods.

FWIW, there is another method that I have used in the past and that is by
creating a file that contains the complete range of primes that you might
need. So you would use the sieve to create a file containing the primes
from, say, 2 to 1000000. You could then set up a sieve for the range
1000001 to 2000000 and use the contents of the first file to set the
prime/non prime flags. Append these new primes to the first file, set up
the sieve for 2000001 to 3000000 and so on...

You end up with a *big* *BIG* file (which you can easily access randomly if
you keep file offset pointers after each generation iteration) but as you
only have to do this once it can be a lot quicker in the long run.

--
Ian Bartholomew

Randy Manning

Re: Some Loose Integer Methods

"Ian Bartholomew" <[hidden email]> wrote in message
news:orJR9.8694$4k6.843031@wards...
> Randy,
>
> > The Greek sieve method is indeed very fast! Thanks for your source!. I'd
> > like to post it here, if it's OK with you?
>
> Sure. It's nothing special though, just a basic unoptimised sieve. You
can
> probably find better implementations if you do a bit of browsing - the
UIUC
> archive maybe?
>

Thanks Ian,
Here's Ian's sieve implementation mail message to me:

Randy,

> Excellent Idea. I think Integer class>>primesUpTo: implements a sieve -
> Correct me if I'm wrong please!

That method is actually a very good example of the way _not_ to do it. It
works OK for small ranges but is extremely (to say the least) inefficient
for larger ones.

My simple sieve implementation is -

=====

primesFrom: start to: stop
| flags |
flags := (Array new: stop + 1) atAllPut: true.
flags at: 1 put: false.
1 to: flags size do: [:index1 |
(flags at: index1)
ifTrue: [
index1 + index1 to: flags size by: index1 do: [:index2 |
flags at: index2 put: false]]].
^(start to: stop) select: [:each | flags at: each]

=====

This method (sieve) is memory intensive but is also quite efficient for
small to medium ranges (say < 10000000). If you want large ranges of primes
and need to reduce the amount of memory used you can start playing with the
coding. The first couple of steps are normally -

- adjusting the Array access to only store flags for odd integers. Just
involves dividing by 2
- adjusting the array to use a single bit, rather than a byte, for each odd
value. This involves dividing accesses by 16 and doing a bit of bit
twiddling.

These reduce the memory use but obviously increase the execution time.
There are also a few mathmatical tweaks that can be implemented as well. For
example, I think the first loop can safely be changed to read

1 to: flags size // 2 + 1 do: [:index1 |

but I haven't tested it!

Let me know if you want any further pointers

Regards
Ian

Randy Manning

Re: Some Loose Integer Methods

Ian,

Check this out, it seems rather intriguing to me:

Inspect the following: (100 factorial) primeFactors

The resulting inspector contains all the prime numbers from 2 to 97 in ever
decreasing numbers of occurrences.

I wonder if there is some kind of relation between the number of each prime
factor and its position within the #primeFactors array?

~Randy

Chris Uppal-3

Re: Some Loose Integer Methods

Randy Manning wrote:

> The resulting inspector contains all the prime numbers from 2 to 97
> in ever decreasing numbers of occurrences.
>
> I wonder if there is some kind of relation between the number of
> each prime factor and its position within the #primeFactors array?

Well, I think you are searching for factors in numeric order, so what that's
really saying is that the smaller primes more factors than larger ones.

One way to think about it is to consider random large numbers. They will have
2 as a factor (on average) half the time, but then the remainder after dividing
by two is also a large number, so they'll have another factor of two (on
average) half the time, and so on. So a random large number will (on average)
have 1/2 + 1/4 + 1/8 +... = 1 occurrences of 2 in its list of factors. Whereas
(for example) the same reasoning suggests that it will have 1/7 + 1/49 + 1/343
= 1/6 occurences of 7 amongst its list of prime factors. In general a prime,
p, will occur 1/(p-1) times on average.

That's actually a bit of a handwaving argument -- i.e. wrong -- since the
factors aren't really independent of each other (after all the number has to
have *some* factors or it wouldn't be large!), but it is probably not too far
off for large numbers when we consider the small(ish) primes. If it has any
explanatory power at all, then it suggests that prime factors should (on
average) decrease in frequency as the size of prime increases.

Incidently, 100 factorial is a rather odd number when it comes to prime
factors. The way it is built guarantees that the number will have many more
small factors (and no large ones) than the typical number of about the same
size. You *could* try (100 factorial + 1) primeFactors to see what I mean (but
expect it to take a *very* long time ;-). In general N factorial has about (N
/ (p-1)) occurences of each prime factor p (where p <= N), so you a similar
pattern of decreasing occurences to the "random" case, but with each prime
occuring many more (in fact N) times than is typical.

-- chris