
Entry #100197: Don't Trust the Hardwareby Jennifer Elaan
When all you have is Verilog, everything starts looking like programmable hardware.
This code assumes the underlying CPU has a very primitive ALU that supports only bit operations (bit shift, XOR, AND and OR), lacking any form of integer arithmetic. There are no instances of +,,*,or /, used as operators, in the entire body of CalcFunc.cpp. In addition to standard barrelshift multiply and divide, and floating point emulation, there are two especially interesting blocks of code in this system, add() and binlog(): add() is actually rather fast, considering the limitations listed above. It's an O(log N) algorithm, a direct software translation of a multilevel carry lookahead add unit. It does a lot of work in parallel, and is completely branchfree, which helps performance on most superscalar CPUs. binlog() is taken directly from the crossplatform implementation of the binary logarithm from a library I wrote. While most CPUs have a singlecycle version of this instruction, including x86, it is always useful to define a crossplatform implementation. This one runs much faster than any of the O(n) loopbased implementations. It is also a concatenation of two simpler functions, neither of which has a standard instruction: binceil (round a number up to the next power of two) and countbits (count the number of 1 bits set in an integer), both of which are O(log N) algorithms. Download • 0 Comments 