the djb way

scientific djb


in prime time: primegen


Link: http://cr.yp.to/primegen.html
Version: primegen-0.97 (1999.03.04)
Download: primegen-0.97.tar.gz
MD5 (primegen-0.97.tar.gz) = 7f2a260e5d0c0a4f9dd2e10cc6c8b984
Build type: djb classic (make setup check)
errno patch: yes
Theory: Prime Sieves Using Binary Quadratic Forms (pdf)

introduction

What student of computer science hasn't coded up a prime number algorithm at least one time or another? I still remember a Scientific American article from long, long ago, "One Million Primes through the Sieve", describing an implementation of the classic sieve of Eratosthenes method for finding primes. It inspired an entertaining diversion with the Turbo Pascal compiler one weekend, running on an original 8086-class IBM PC, circa 1985.

Sure enough, one million primes eventually trickled through that sieve --with run times of several hours, if not days. I guess life was slower back then...

Prime numbers are serious business in computer science and number theory. This is especially true in the field of cryptography, where public-key encryption protocols are based on the use of large primes, and security is premised on the difficulty of finding the prime factors of very large numbers.

Primes are useful for other purposes as well. For example, hash tables are best initialized with a prime number of "buckets" in order to minimize collisions.

So we might expect Bernstein to have taken a crack at a prime number enumerator of his own, and we would be right. He has developed a package based on his academic research in this field, called primegen.

The primegen programs are based on an algorithm called "the sieve of Atkin", described in this paper (pdf). Of course I could explain it to you, but then I would have to kill you.

The speed of this thing is astonishing. On a P4-class, 2.4ghz workstation, you will find this: over 5 million primes blow through this sieve in less than 2 seconds! I'm not sure if this is the fastest algorithm yet devised, but one thing is certain: Atkin kicks Eratosthenes butt.

The package comes with a couple command-line utilities that demonstrate the algorithm, as well as a primegen(3) manual that describes how to include the primegen library in your own applications.

This page includes the following sections:


installation

The build/install procedure for primegen is the usual make setup check.

For systems that require the errno patch, first edit the file named error.h and change the declaration of errno from:

extern int errno

To:

#include <error.h>

Then continue with the usual:

# make
# make setup check

The build/install procedure will then result in the following files installed on your system:

Programmers working with the primegen library may want to setup a symlink according to the conventions expected by the linker:

# ln -s /usr/local/lib/primegen.a /usr/local/lib/libprimegen.a

Compiling primegen applications will then look something like this:

$ gcc -o myapp myapp.c -I/usr/local/include -L/usr/local/lib -lprimegen

Be aware that, if you have the BSD "games" installed --as in /usr/games-- it will include its own primes utility. To use the primegen version instead:


usage

To generate all the primes within a range of numbers, use the primes utility:

primes low high

For example, generate all the primes between 1 and 100:

$ primes 1 100
2
3
5
7
<snip>
83
89
97

That was easy. This should be harder: checksum the 5 million (and change) primes between 1 and 1 billion:

$  primes 1 1000000000 | md5
92c178cc5bb85e06366551c0ae7e18f6

Amazingly quick! For an unscientific comparison, I used the time(1) utility on both the Bernstein version and the BSD "games" version:

  djb BSD
real: 0m16.577s 0m35.107s
user: 0m16.039s 0m34.486s
sys: 0m0.501s 0m0.500s

This shows the primegen version running a little more than twice as fast as the BSD version. Of course, we are measuring a lot of other factors here beyond the sieve algorithm (such as putchar() versus printf() for the output.) But it seems clear that Atkin easily pastes the old Greek.

In other words, Eratosthenes is past his prime.


interface

You don't have to understand the Atkin sieve to use it. It is very easy to link all the power of this algorithm into your own programs.

For example, let's say we want a utility to find the largest prime less than or equal to any given number. We might call this utility primen, for "prime nearest", and could write it like this:


/*
primen.c
find largest prime less than or equal to target
use djb primegen library ("atkin's sieve")
wcm, 2004.02.19 - 2004.02.19
===
*/
#include <stdlib.h>
#include <stdio.h>

#include <primegen.h>

int
main(int argc, char **argv)
{
  primegen  pg;
  uint64    target = 0L;
  uint64    last = 0L;

  if(argv[1])
      target = strtoull(argv[1],NULL,10);
  if(target < 2)
      exit(EXIT_FAILURE);

  primegen_init(&pg);

  while(primegen_peek(&pg) <= target)
      last = primegen_next(&pg);

  printf("%llu\n", last);

  return 0;
}
/* that's all, folks! */

Copy the uint32.h and uint64.h platform-specific header files from the primegen build into your own project directory. Then compile and link the program:

$ gcc -Wall -o primen primen.c -I. -lprimegen

Try it out:

$ ./primen 65535
65521

This is a primal scream! See the simple primegen(3) man page for anything else you ever wanted to know about programming with this interface.


Copyright © 2003, 2004, Wayne Marshall.
All rights reserved.

Last edit 2004.02.19, wcm.