crosstool-ng for the Maverick Crunch processors

Here you can download a version of the wonderful crosstool-ng that generates a toolchain optimized for the Maverick Crunch architecture. The toolchain is compatible with armel, so you can mix its output with binaries and libraries downloaded from Debian armel (tested with lenny). It is already configured to build in the /opt/toolchains/ directory. This work is based on patches by Martin Guy and has been tested both on the Cirrus demo board for the EP9302 processor and on a custom board.

Benchmarks

As a reference, here are the results obtained with the nbench-byte (BYTEmark) benchmark compiled with the common CodeSourcery armel toolchain:

CFLAGS = -Wall -O3 -ffast-math -s -static -march=armv4t

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :           61.83  :       1.59  :       0.52
STRING SORT         :          3.9683  :       1.77  :       0.27
BITFIELD            :      1.7835e+07  :       3.06  :       0.64
FP EMULATION        :          12.769  :       6.13  :       1.41
FOURIER             :          63.135  :       0.07  :       0.04
ASSIGNMENT          :          0.5321  :       2.02  :       0.53
IDEA                :          181.82  :       2.78  :       0.83
HUFFMAN             :          90.777  :       2.52  :       0.80
NEURAL NET          :        0.091946  :       0.15  :       0.06
LU DECOMPOSITION    :          2.8249  :       0.15  :       0.11
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 2.573
FLOATING-POINT INDEX: 0.116
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : 
L2 Cache            : 
OS                  : Linux 2.6.31.1
C compiler          : /opt/maverick/arm-2009q1/bin/arm-none-linux-gnueabi-gcc
libc                : static
MEMORY INDEX        : 0.452
INTEGER INDEX       : 0.836
FLOATING-POINT INDEX: 0.064
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

And with this toolchain:

CFLAGS = -Wall -mcpu=ep9312 -mfpu=maverick -mfloat-abi=softfp -O3 -ffast-math -s -static

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          54.684  :       1.40  :       0.46
STRING SORT         :          6.5741  :       2.94  :       0.45
BITFIELD            :      1.4297e+07  :       2.45  :       0.51
FP EMULATION        :          14.043  :       6.74  :       1.55
FOURIER             :          450.91  :       0.51  :       0.29
ASSIGNMENT          :         0.54074  :       2.06  :       0.53
IDEA                :          171.24  :       2.62  :       0.78
HUFFMAN             :           111.9  :       3.10  :       0.99
NEURAL NET          :         0.45648  :       0.73  :       0.31
LU DECOMPOSITION    :          11.748  :       0.61  :       0.44
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 2.733
FLOATING-POINT INDEX: 0.612
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : 
L2 Cache            : 
OS                  : Linux 2.6.31.1
C compiler          : /opt/toolchains/arm-crunch-linux-gnueabi/bin/arm-crunch-linux-gnueabi-gcc
libc                : static
MEMORY INDEX        : 0.499
INTEGER INDEX       : 0.862
FLOATING-POINT INDEX: 0.339
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

A little trick to gain some more performance on some operations is to link with the softfloat-crunch library by Nicolas Pitre:

CFLAGS = -Wall -mcpu=ep9312 -mfpu=maverick -mfloat-abi=softfp -O3 -ffast-math -s -static
LDFLAGS: libfloat.a added

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :          61.659  :       1.58  :       0.52
STRING SORT         :          6.5436  :       2.92  :       0.45
BITFIELD            :      1.4308e+07  :       2.45  :       0.51
FP EMULATION        :          14.043  :       6.74  :       1.55
FOURIER             :          450.91  :       0.51  :       0.29
ASSIGNMENT          :         0.53996  :       2.05  :       0.53
IDEA                :          171.69  :       2.63  :       0.78
HUFFMAN             :          111.77  :       3.10  :       0.99
NEURAL NET          :         0.45634  :       0.73  :       0.31
LU DECOMPOSITION    :          11.601  :       0.60  :       0.43
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX       : 2.779
FLOATING-POINT INDEX: 0.609
Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU                 : 
L2 Cache            : 
OS                  : Linux 2.6.31.1
C compiler          : /opt/toolchains/arm-crunch-linux-gnueabi/bin/arm-crunch-linux-gnueabi-gcc
libc                : static
MEMORY INDEX        : 0.498
INTEGER INDEX       : 0.888
FLOATING-POINT INDEX: 0.338
Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

But the gain is small, and it wasn't worth integrating into crosstool-ng for our application.

Correctness

I tested the correctness of floating point with the paranoia tool. With -ffast-math, none of the toolchains (not even the CodeSourcery one) finished the test. Without -ffast-math there are still some minor problems. Here is the summary:
.....

FLAW:  X = 3.05947655544740190e-308
	is not equal to Z = 2.22507385850720138e-308 .
yet X - Z yields 0.00000000000000000e+00 .
    Should this NOT signal Underflow, this is a SERIOUS DEFECT

.....


FLAW:  Underflow can stick at an allegedly positive
value PseudoZero that prints out as 1.11254e-308 .
Since comparison denies Z = 0, evaluating (Z + Z) / Z should be safe.
What the machine gets for (Z + Z) / Z is  0.00000000000000000e+00 .
This is a VERY SERIOUS DEFECT!

.....

Overflow threshold is V  = 1.79769313486231571e+308 .
Overflow saturates at V0 = inf .
No Overflow should be signaled for V * 1 = 1.79769313486231571e+308
                           nor for V / 1 = 1.79769313486231571e+308 .
Any overflow signal separating this * from the one
above is a DEFECT.

.....

The number of  SERIOUS DEFECTs  discovered = 1.
The number of  FLAWs  discovered =           2.

.....
As you can see, they all concern underflow/overflow (or nearby) conditions. Normal calculations (I tested some linear algebra programs) don't show problems.
chripell@gmail.com