> I blindly assume this has been done to death thousands of times
> (literally). Somebody somewhere probably has oodles of bigint routines in
> various HLLs languages, optimized for various cpus. Check
> news://comp.lang.misc for potential contributors.
>
> Also, I do wonder how such things are verified (especially on slow/old
> hardware). Do they brute force it? Prove it via mathematics? Or just
> (slowly, partially) run a fraction daily until they double-check all their
> ranges output valid data? How would somebody verify correctness on an old
> 286/386/486?
>
> I'm sure it's been done many times over. I just don't know the details.
> Probably lots of papers, books, source code. (FPU? BCD? SIMD?)
You don't need to verify the entire 64-bit or even 32-bit range, although the latter is already doable on a regular PC for many simple algorithms.
First, you can reduce your data format to fewer bits, which makes it brute-forceable. You don't need all 52 mantissa bits and 11 exponent bits of a double to simulate floating-point behavior qualitatively: IEEE half precision exhibits all the same behaviors in 10+5=15 bits (plus a sign bit). Likewise with integers: exhaustively validate your algorithm with 8-bit ones; there's no need to attempt it with, say, 32-bit ones.
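To make the reduced-width idea concrete, here's a minimal sketch (the saturating-add routine is a made-up example, not from the post): at 8 bits, *all* 65,536 input pairs can be checked against a trivial reference in well under a second.

```python
# Hypothetical example: exhaustively validate an 8-bit saturating add
# against a straightforward reference, covering all 65,536 input pairs.

def sat_add_u8(a: int, b: int) -> int:
    """Candidate implementation: clamp the 8-bit sum at 255."""
    s = (a + b) & 0x1FF          # 9-bit result; overflow lands in bit 8
    return 255 if s > 255 else s

def reference(a: int, b: int) -> int:
    return min(a + b, 255)

for a in range(256):
    for b in range(256):
        assert sat_add_u8(a, b) == reference(a, b), (a, b)
print("all 65536 cases OK")
```

The same loop at 32 bits would take ~2^64 iterations; shrinking the width is what turns "impossible" into "trivial".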
Second, you are mostly interested in what happens inside large subranges and at the much smaller boundaries between them (say, in one subrange everything is OK, while in the neighboring one you get an overflow or a NaN; then those two subranges, and especially the boundary between them, are the interesting places). Bugs typically cluster at the boundaries.
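As a sketch of boundary-focused testing (the overflow predicate here is illustrative, not from the post): probe a handful of values on each side of the point where behavior changes, instead of sweeping the whole range.

```python
# Illustrative sketch: signed 16-bit addition overflows exactly when the
# mathematical sum exceeds INT16_MAX. Probe densely around that boundary.
INT16_MAX = 32767

def add_overflows(a: int, b: int) -> bool:
    """Does a + b overflow a signed 16-bit result (positive side)?"""
    return a + b > INT16_MAX

# A few probes straddling the boundary catch off-by-one bugs that a
# sparse sweep of the full range would likely miss.
for a in range(INT16_MAX - 3, INT16_MAX + 1):
    assert add_overflows(a, 1) == (a == INT16_MAX)
```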
Third, you may generate various bit patterns instead of trying to cover a large (sub)range. Say, a single bit set to 1 with zeroes in all other positions; move that one bit across all bit positions. Repeat the same with the bits inverted. Repeat again with 2 bits set to 1, then with 3. You can also replace single 1 bits with small groups of consecutive 1's.
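A small generator along these lines (names are my own) yields walking-1s, their complements (walking-0s), and all two-bit combinations; for 8-bit inputs that's only 8 + 8 + 28 = 44 patterns:

```python
from itertools import combinations

def bit_patterns(width: int):
    """Yield walking-1s, walking-0s, and all 2-bit patterns of `width` bits."""
    mask = (1 << width) - 1
    for i in range(width):                      # a single 1 bit
        yield 1 << i
    for i in range(width):                      # the same, inverted: a single 0 bit
        yield mask ^ (1 << i)
    for i, j in combinations(range(width), 2):  # exactly two 1 bits
        yield (1 << i) | (1 << j)

patterns = list(bit_patterns(8))   # 8 + 8 + 28 = 44 test inputs
```

Extending to 3-bit combinations or runs of consecutive 1's is a matter of adding analogous loops.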
Fourth, to get away from the regularity of your test data patterns (and possibly some systematically missed cases), generate random inputs as well.
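Random testing is most useful when it's reproducible, so seed the generator; a sketch (the function under test is a stand-in of my own):

```python
import random

def candidate_abs(x: int) -> int:
    """Stand-in for some routine under test."""
    return x if x >= 0 else -x

rng = random.Random(12345)   # fixed seed: a failure can be replayed exactly
for _ in range(10_000):
    x = rng.randint(-(1 << 31), (1 << 31) - 1)
    assert candidate_abs(x) == abs(x), x
```

Logging the seed (or the failing input) is what makes a randomly found bug debuggable later.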
Fifth, yes, any degree of parallelization is good: multiple CPU cores, SIMD instructions, etc.
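For multi-core parallelization, the usual shape is to split the input range into chunks and farm them out; a sketch using Python's standard `multiprocessing` pool (the branchless popcount under test is an invented example):

```python
# Sketch: distribute an exhaustive sweep across CPU cores.
from multiprocessing import Pool

def popcount8(x: int) -> int:
    """Candidate under test: branchless 8-bit population count."""
    x = (x & 0x55) + ((x >> 1) & 0x55)
    x = (x & 0x33) + ((x >> 2) & 0x33)
    return (x & 0x0F) + ((x >> 4) & 0x0F)

def check_chunk(bounds):
    """Verify one subrange against a reference; return the first failure, if any."""
    lo, hi = bounds
    for x in range(lo, hi):
        if popcount8(x) != bin(x).count("1"):
            return x
    return None

if __name__ == "__main__":
    chunks = [(i, i + 64) for i in range(0, 256, 64)]
    with Pool() as pool:
        failures = [r for r in pool.map(check_chunk, chunks) if r is not None]
    print("failures:", failures)
```

The same chunking carries over directly to SIMD: instead of one core per chunk, you check 8 or 16 inputs per instruction within a chunk.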