
Toby Corkindale writes:
I know it's a rather naive benchmark, but I tried running a tiny Perl program to find all the prime numbers under 100,000 on both the Raspberry Pi and the BeagleBone.
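The Perl program itself is not shown in the thread. For anyone wanting to try a comparable test, here is a minimal sketch in Python. It is not the original code: the choice of a Sieve of Eratosthenes and the names `primes_below` and `time_primes` are assumptions made for illustration.

```python
import time


def primes_below(limit):
    """Return all primes strictly below `limit` (Sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    n = 2
    while n * n < limit:
        if is_prime[n]:
            # Cross off multiples of n, starting at n*n.
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = 0
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def time_primes(limit=100_000):
    """Return (prime count, elapsed seconds) for a single run."""
    start = time.perf_counter()
    count = len(primes_below(limit))
    return count, time.perf_counter() - start


if __name__ == "__main__":
    count, seconds = time_primes()
    print(f"{count} primes below 100,000 in {seconds:.3f} s")
```

A single run like this mostly measures the interpreter and memory system, so the timings are only meaningful when compared between boards running the same interpreter build.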
FWIW I use this synthetic benchmark:
http://homepages.cwi.nl/~steven/dry.c

On a TF101 (tegra2), with cc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, I get

    cc -DPASS2 -O dry.c dry1.o -o dry2o
    [...]
    Trying 50000000 runs through Dhrystone:
    Microseconds for one run through Dhrystone: 0.4
    Dhrystones per Second: 2596054

On an i7-870, with gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, I get

    Trying 500000000 runs through Dhrystone:
    Microseconds for one run through Dhrystone: 0.0
    Dhrystones per Second: 25163562

IIUC the cool kids use SPECint (not synthetic) but you gotta pay for it.
https://en.wikipedia.org/wiki/SPECint
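The "Microseconds for one run" line rounds to one decimal place, which is why the i7 shows 0.0. As a rough sketch, the per-run time and the ratio between the two machines can be recovered from the quoted Dhrystones-per-second figures. The function names here are illustrative only, and the two builds used different compiler versions, so the ratio is indicative at best.

```python
# Dhrystones-per-second figures as quoted in the output above.
TEGRA2_DPS = 2596054    # TF101 (tegra2)
I7_870_DPS = 25163562   # i7-870


def microseconds_per_run(dhrystones_per_second):
    """Time for one pass through Dhrystone, in microseconds."""
    return 1_000_000 / dhrystones_per_second


def speed_ratio(faster_dps, slower_dps):
    """How many times more Dhrystones per second the faster machine does."""
    return faster_dps / slower_dps


if __name__ == "__main__":
    print(f"tegra2: {microseconds_per_run(TEGRA2_DPS):.3f} us per run")
    print(f"i7-870: {microseconds_per_run(I7_870_DPS):.3f} us per run")
    print(f"ratio:  {speed_ratio(I7_870_DPS, TEGRA2_DPS):.1f}x")
```

On these numbers the i7-870 comes out a little under ten times faster per run.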
They're both 700 MHz ARM CPUs, but the Raspberry runs on the older v6 spec CPU. Surprisingly, this seems to make a huge difference to performance.
Equivalent to comparing a 3GHz Pentium III and a 3GHz Pentium 4. You're running Debian armhf on both, and while that *supports* v6 (unlike Ubuntu arm/armhf), it may still be optimized for v7. Generating benchmark numbers is easy, interpreting them is hard :-)
I thought I'd try it quickly in Scala, but it seems the JVM isn't very well optimised on ARM yet :(
ARMv6 implements some JVM bytecodes directly in hardware. FOSS JVMs cannot use them.
https://en.wikipedia.org/wiki/Jazelle

I'm a bit hazy on the current state of play WRT ARMv7.

Re "how do I make it go faster", all of the usual funroll-loops.org discussion applies.