In the previous post, we discussed the performance space of the minimum function, which was implemented both via a simple ternary operator and with the help of bit magic. Now we continue to talk about performance and bit hacks. In particular, we will divide an unsigned integer by three:

```
uint Div3Simple(uint n) => n / 3;
uint Div3BitHacks(uint n) => (uint)((n * (ulong)0xAAAAAAAB) >> 33);
```
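It may help to see why the multiply-and-shift version returns the same value as `n / 3`. The following is a short sketch of the arithmetic, not something taken from the post itself. The constant satisfies

$$
\texttt{0xAAAAAAAB} = 2863311531 = \frac{2^{33} + 1}{3},
\qquad\text{so}\qquad
\frac{n \cdot \texttt{0xAAAAAAAB}}{2^{33}} = \frac{n}{3} + \frac{n}{3 \cdot 2^{33}}.
$$

For a 32-bit unsigned value, $0 \le n < 2^{32}$, the extra term is below $\tfrac{1}{6}$, while the fractional part of $\tfrac{n}{3}$ is at most $\tfrac{2}{3}$. Their sum stays below 1, so the shift (which rounds down) gives

$$
\left\lfloor \frac{n}{3} + \frac{n}{3 \cdot 2^{33}} \right\rfloor = \left\lfloor \frac{n}{3} \right\rfloor .
$$

The cast to `ulong` matters here: the product $n \cdot \texttt{0xAAAAAAAB}$ is below $2^{64}$ but can exceed $2^{32}$, so it has to be computed in 64 bits.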

As usual, it's hard to say in advance which method is faster because the performance depends on the environment. Here are some interesting results:

| | Simple | BitHacks |
|---|---|---|
| LegacyJIT-x86 | ≈8.3ns | ≈2.6ns |
| LegacyJIT-x64 | ≈2.6ns | ≈1.7ns |
| RyuJIT-x64 | ≈6.9ns | ≈1.5ns |
| Mono4.6.2-x86 | ≈8.5ns | ≈14.4ns |
| Mono4.6.2-x64 | ≈8.3ns | ≈2.8ns |

Performance is tricky, especially when you are working with very fast operations. In today's benchmarking exercise, we will try to measure the performance of two simple methods which calculate the minimum of two numbers. Sounds easy? OK, let's do it. Here are our guinea pigs for today:

```
int MinTernary(int x, int y) => x < y ? x : y;
int MinBitHacks(int x, int y) => x & ((x - y) >> 31) | y & (~(x - y) >> 31);
```
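To make the bit-hack version less opaque, here is a small sketch that runs both methods on a few inputs. The `MinDemo` class and its `Main` method are hypothetical scaffolding added for illustration. The methods themselves are copied from above. The last case assumes the default unchecked arithmetic context, where `x - y` silently wraps around.

```csharp
using System;

static class MinDemo
{
    // The two methods from the post, unchanged.
    public static int MinTernary(int x, int y) => x < y ? x : y;
    public static int MinBitHacks(int x, int y) => x & ((x - y) >> 31) | y & (~(x - y) >> 31);

    static void Main()
    {
        // x < y: (x - y) >> 31 is -1 (all bits set), so the first term keeps x
        // and the second term is masked to 0.
        Console.WriteLine(MinBitHacks(3, 5));            // 3

        // x >= y: (x - y) >> 31 is 0 and ~(x - y) >> 31 is -1, so y is kept.
        Console.WriteLine(MinBitHacks(5, 3));            // 3

        // When x - y wraps around, the sign bit no longer reflects x < y,
        // and the two methods disagree.
        Console.WriteLine(MinTernary(int.MinValue, 1));  // -2147483648
        Console.WriteLine(MinBitHacks(int.MinValue, 1)); // 1
    }
}
```

So the bit-hack version is branch-free, but it matches the ternary version only when `x - y` fits in an `int`.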

And here are some results:

| | Random: Ternary | Random: BitHacks | Const: Ternary | Const: BitHacks |
|---|---|---|---|---|
| LegacyJIT-x86 | ≈643µs | ≈227µs | ≈160µs | ≈226µs |
| LegacyJIT-x64 | ≈450µs | ≈123µs | ≈68µs | ≈123µs |
| RyuJIT-x64 | ≈594µs | ≈241µs | ≈180µs | ≈241µs |
| Mono-x64 | ≈203µs | ≈283µs | ≈204µs | ≈282µs |

What's going on here? Let's discuss it in detail.
