the djb way

crypto


hash127


Link: http://cr.yp.to/hash127.html
Version: hash127-0.70 (2000.03.21, beta)
Download: hash127-0.70.tar.gz
MD5 (hash127-0.70.tar.gz) = a574929fb61425a429c18228e44a15d4
Build type: djb classic (make setup check)
Notes: Public domain.

introduction

The hash127 package provides a one-way hash function designed specifically for use in message authentication applications. The implementation is particularly innovative in its use of floating point numerical methods for fixed-precision calculations on large integers. And it is fast.

This page includes the following sections:


background

A cryptographic hash function H() is applied to a message M of arbitrary length for the purpose of obtaining a digital signature s of some fixed length:

s = H(M)

A good hash function H() has the following essential properties:

A good hash function also has the following desirable property:

One example of a good one-way hash function is MD5. We have seen a lot of 128-bit MD5 signatures in these pages, useful for verifying the integrity of distribution files. MD5 is also among the fastest of secure, one-way hashing functions.

Another good hash function is SHA, an implementation of which is included in the OpenSSL package. SHA is slower than MD5, but creates a 160-bit digital signature:

$ openssl dgst -sha1 hash127-0.70.tar.gz
SHA1(hash127-0.70.tar.gz)= 9482d3c1b2f698fc2b9feadbe22a5052c9da6ab1

Any good hash function may also be used in the context of message authentication. In this case, a secret key k is typically concatenated with the message, and the hash function is applied to the result:

s = H(M + k)

When used in this way, signature s is also known as the Message Authentication Code, or MAC, for the message.

The protocol for usage is simple. The sender Sandra computes s from message M and secret key k as above. Sandra may then send the message and signature to her friend Rodney, with whom she has shared the secret k. If Rodney computes the same signature s, he is assured the message he received is really the message Sandra sent.

Another simple example of a message authentication protocol is the APOP command in the Post Office Protocol (RFC 1939). Whenever a client contacts a POP3 server, she first receives a banner that looks something like this:

+OK <23168.1072502584@popcorn.example.edu>

The syntax of the banner constructed by the server should be unique for each session and is derived from:

<pid.clock@fqdn>

The client then computes a MAC digest using MD5, concatenating the banner with the secret passphrase for her account on the POP3 server:

s = MD5(banner + secret)

Or, for the specific example:

s = MD5("<23168.1072502584@popcorn.example.edu>mypassword")
s = 0e211108dc98710ab7c793add61ec123

The client can now pass this message authentication code back to the server, with the account name for her mailbox:

APOP paula 0e211108dc98710ab7c793add61ec123

The server will make the same MD5 computation on its side, and compare the result. If they are the same, the client will be allowed access to her mailbox.

The hash127 package is Bernstein's library for message authentication. He describes the hash127 algorithms and security in his paper, Floating-Point Arithmetic and Message Authentication. The central innovation of this work is the use of floating-point numerical methods in fixed-precision computations on large integers. These methods make hash127 substantially faster than MD5 on the general-purpose processors in most hardware commonly available today.

The hash127 library is built around an internal hash function of the form:

Hr(M) = (r^n+1 + m[0]r^n + m[1]r^n-1 + ... + m[n-1]r) mod (2^127 - 1)

In the above function, r is a random 128-bit integer, M is comprised of m[0..(n-1)] 32-bit elements, and n is the number of elements in M. Note that the last term is modulo prime and distributes the result of the summation over a full 127 bits.

The hash127 package uses Hr(M) to provide a message authentication function of the form:

s = (Hr(M) + k) mod (2^127 - 1)

In this function, the secret key k is a 128-bit integer. In usage of hash127, both r and k are assumed to be secret. The digital signature s that results is 127 bits. (Note that, for k = 0, this function is simply Hr(M), that is, a hash without an authentication key.)

The protocol for using hash127 authentication is similar to that described earlier. Sandra and Rodney share secrets r and k in advance. Sandra computes s from message M using the secret r and k. She then sends M and s to Rodney. Rodney recomputes s from M, using secret integers r and k. If the signature agrees, he is assured the message is authentic.

In practice, the most secure usage of hash127 authentication assumes key k will be unique for each message. One way to do this is for Sandra and Rodney to share a set of randomly distributed secret keys in advance, to be used like one-time pads. Sandra then sends s, M, as well as the message number j (also called the nonce) in the clear. Rodney then selects key k[j] to authenticate the message. Afterward, the key k[j] is discarded; the next message will use another key k[j].

Applications may use other methods for providing unique keys. The APOP protocol sketched above demonstrates one method that processes may use to create unique keys/nonces for each session. See the FAQ for hash127, What if I can't afford to share 128 truly random bits per message? for other possible approaches.


installation

The build/install procedure for hash127 is the usual make setup check.

Before compiling:

The build/install procedure will then result in the following files installed on your system:

On most platforms, users may want to symlink this file according to the conventions expected by the linker:

# ln -s /usr/local/lib/hash127.a /usr/local/lib/libhash127.a

Compiling hash127 applications may then look something like this:

$ gcc -o myapp myapp.c -I/usr/local/include -L/usr/local/lib -lhash127

Note that Bernstein also advises the use of compiler option -malign-double on x86 platforms.


interface/usage

Application usage of the hash127 library looks something like this:


#include <hash127.h>

int32 *M;
unsigned int n;
int32  r[4];
int32  k[4];
int32  s[4];
struct hash127 big_r;

// assign r, assign k, read M, ...

hash127_expand(&big_r, r);
hash127(s, M, n, &big_r, k);

// ...

The hash127_expand() function is called first to initialize the library with a 128-bit representation of r (hence big_r) from the array of four 32-bit integers in r. Then hash127() itself may be called. The arguments to hash127() are:

s
the resulting 127-bit signature, will be stored in s[0..3]
M
the message, as an array of 32-bit integers
n
the number of 32-bit elements in the array M
&big_r
structure containing 128-bit representation of r
k
the 128-bit secret key, stored in array k[0..3]

Note the following implications of this interface:

The hash127() function returns its result in array s[0..3], representing the 127-bit hash signature:

2^96(2^31+s[3]) + 2^64(2^31+s[2]) + 2^32(2^31+s[1]) + (2^31+s[0])

The library also includes an alternative function, hash127_little(), for hashing strings of bytes:

//...
char *M;
//...
hash127_little(s, (int32 *)M, n, &big_r, k);

Note that n here is still the number of 32-bit elements in M, and not the number of bytes. On x86 (and other little-endian platforms), the hash127_little() function is identical to hash127().


questions for readers

We would like to improve the usefulness and accuracy of the information on this page. If you can help with any answers to the questions below, please write to us at wcm@guinix.com. Thanks!

  1. Has any independent cryptanalysis been published on hash127?

  2. Are there any publicly available applications now using hash127 authentication?

  3. Have any other publicly available cryptographic applications been written (or rewritten) to take advantage of the floating-point numerical methods for large integers developed for hash127?

  4. Protocols for specific applications will need to agree on the selection of certain r. Bernstein's paper (section 10, p. 15) describes Hr(M): The crucial property of hr, for random r, is that it has low collision probability. A theorem (8.1) is provided for the assertion. Are there some r that are weak for the purposes of low collison probablity (r=0 is obvious; r=1 doesn't look too hot, either...)? Can weak r candidates be readily identified and discarded, prior to usage?

  5. The hash127(s, ...) function returns the result in array s[0..3], representing the 127-bit hash signature:

    2^96(2^31+s[3]) + 2^64(2^31+s[2]) + 2^32(2^31+s[1]) + (2^31+s[0])

    Does the following simple printf() construction properly and portably represent the hexadecimal string for a hash127 signature:

    printf("s = %08l%08l%08l%08l\n",
      (2147483648UL + s[3]),
      (2147483648UL + s[2]),
      (2147483648UL + s[1]),
      (2147483648UL + s[0]));
    
  6. One can imagine applications where the size of the message is not known in advance (as when reading from stdin or a socket), and/or may be too large to fit in core. Now consider a solution that iterates hash127() calls over fixed-size chunks of the message, perhaps substituting the original secret key k on iteration i>0 with s(i-1) on each successive iteration, and padding the end of the message (to the chunk size, appended with message length) as in MD5/SHA. Would such an implementation adversely affect the properties of hash127?

  7. Schneier [1996] states (p. 456): Any MAC can be turned into a one-way hash function by making the key public. Is hash127 viable (cryptographically speaking) for message digest/file fingerprint applications, given public r and k?


Copyright © 2002, 2003, 2004, Wayne Marshall.
All rights reserved.

Last edit 2004.02.19, wcm.