## 65535 interfaces ought to be enough for anybody

It was a bright, sunny morning. There were no signs of trouble. I came to work, opened Slack, and received many messages from my coworkers about failed tests.

After a few hours of investigation, the situation became clear:

• I’m responsible for the unit tests subsystem in Rider, and only tests from this subsystem were failing.
• I hadn’t committed anything to the subsystem for a week because I was working in a local branch. Other developers hadn’t touched this code either.
• The unit tests subsystem is completely independent. It’s hard to imagine a situation in which only the corresponding tests fail, thousands of other tests pass, and there are no changes in the source code.
• git blame helped to find the “bad commit”: it didn’t include anything suspicious, only a few additional classes in other subsystems.
• Only tests on Linux and macOS were red. On Windows, everything was OK.
• Stack traces in failed tests were completely random. Each failed test had a different stack trace, pointing into a different subsystem. There was no connection between these stack traces, the unit tests source code, and the changes in the “bad commit.” There was no clue where we should look for the problem.

So, what was special about this “bad commit”? Spoiler: after these changes, we sometimes have more than 65535 interface implementations at runtime.

## A bug story about named mutex on Mono

When you write some multithreading magic on .NET, you can use a cool synchronization primitive called Mutex. You can also make it named (and share the mutex between processes), which works perfectly on Windows:

    var mutex = new Mutex(false, "Global\\MyNamedMutex");

However, today .NET is cross-platform, so this code should work on any operating system. What happens if you use a named mutex on Linux or macOS with the help of Mono or CoreCLR? Is it possible to create some tricky bug based on this case? Of course it is. Today I want to tell you a story about such a bug in Rider, which was a headache for several weeks.

## InvalidDataException in Process.GetProcesses

Consider the following program:

    public static void Main(string[] args)
    {
        try
        {
            Process.GetProcesses();
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
    }


It seems that all exceptions should be caught. However, sometimes, I had the following exception on Linux with dotnet cli-1.0.0-preview2:

    $ dotnet run
    System.IO.InvalidDataException: Found invalid data while decoding.
       at System.IO.StringParser.ParseNextChar()
       at Interop.procfs.TryParseStatFile(String statFilePath, ParsedStat& result, ReusableTextReader reusableReader)
       at System.Diagnostics.ProcessManager.GetProcessInfos(String machineName)
       at System.Diagnostics.Process.GetProcesses(String machineName)
       at System.Diagnostics.Process.GetProcesses()
       at DotNetCoreConsoleApplication.Program.Main(String[] args) in /home/akinshin/Program.cs:line 12


How is that possible?

## Why is NuGet search in Rider so fast?

I’m the guy who develops the NuGet manager in Rider. It’s not ready yet, and there are some bugs here and there, but it already works pretty well. The feature I am most proud of is the smart and fast search.

Today I want to share with you some technical details about how it was implemented.

## NuGet2 and a DirectorySeparatorChar bug

In Rider, we care a lot about performance. I like to improve the application responsiveness and do interesting optimizations all the time. Rider is already well-optimized, and it’s often hard to make significant performance improvements, so usually I do micro-optimizations which do not have a very big impact on the whole application. However, sometimes it’s possible to improve the speed of a feature 100 times with just a few lines of code.

Rider is based on ReSharper, so we have a lot of cool features out of the box. One of these features is Solution-Wide Analysis, which lets you constantly keep track of issues in your solution. Sometimes, solution-wide analysis takes a lot of time to run because there are many files to analyze. Of course, it works super fast on small projects.

Let’s talk about a performance bug (#RIDER-3742) that we recently had.

• Repro: Open Rider, create a new “ASP.NET MVC Application”, enable solution-wide analysis.
• Expected: The analysis should take 1 second.
• Actual: The analysis takes 1 second on Windows and 2 minutes on Linux and macOS.

## Performance exercise: Division

In the previous post, we discussed the performance space of the minimum function which was implemented via a simple ternary operator and with the help of bit magic. Now we continue to talk about performance and bit hacks. In particular, we will divide a positive number by three:

    uint Div3Simple(uint n)   => n / 3;
    uint Div3BitHacks(uint n) => (uint)((n * (ulong)0xAAAAAAAB) >> 33);
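
Why do the two methods agree? The constant 0xAAAAAAAB is exactly (2^33 + 1) / 3, so `(n * 0xAAAAAAAB) >> 33` computes the integer part of n/3 + n/(3 * 2^33). For any 32-bit n the extra term is below 1/6, while the fractional part of n/3 is at most 2/3, so the sum never reaches the next integer. The sketch below is only a sanity check of that argument, not code from the benchmark. It is written as a C# 9+ top-level program, and the helper name `AgreesOn` is hypothetical.

```csharp
using System;

// The two methods from the post, as local functions.
static uint Div3Simple(uint n) => n / 3;
static uint Div3BitHacks(uint n) => (uint)((n * (ulong)0xAAAAAAAB) >> 33);

// Hypothetical helper: true when both methods return the same quotient for n.
static bool AgreesOn(uint n) => Div3Simple(n) == Div3BitHacks(n);

// 0xAAAAAAAB is exactly (2^33 + 1) / 3.
Console.WriteLine(3UL * 0xAAAAAAAB == (1UL << 33) + 1); // True

// Boundary values at both ends of the uint range.
foreach (uint n in new uint[] { 0, 1, 2, 3, 4, 5, 6, uint.MaxValue - 1, uint.MaxValue })
    Console.WriteLine($"{n}: {AgreesOn(n)}");

// Strided sweep over the whole uint range (ulong counter avoids wrap-around).
bool allAgree = true;
for (ulong n = 0; n <= uint.MaxValue; n += 9973)
    allAgree &= AgreesOn((uint)n);
Console.WriteLine(allAgree); // True
```

The multiplication is done in `ulong` on purpose: the product of two 32-bit values needs up to 64 bits, and the shift by 33 then discards the scaled fractional part.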


As usual, it’s hard to say in advance which method is faster because the performance depends on the environment. Here are some interesting results:

| | Simple | BitHacks |
|---|---|---|
| LegacyJIT-x86 | ≈8.3ns | ≈2.6ns |
| LegacyJIT-x64 | ≈2.6ns | ≈1.7ns |
| RyuJIT-x64 | ≈6.9ns | ≈1.5ns |
| Mono4.6.2-x86 | ≈8.5ns | ≈14.4ns |
| Mono4.6.2-x64 | ≈8.3ns | ≈2.8ns |

## Performance exercise: Minimum

Performance is tricky, especially if you are working with very fast operations. In today’s benchmarking exercise, we will try to measure the performance of two simple methods that calculate the minimum of two numbers. Sounds easy? OK, let’s do it. Here are our guinea pigs for today:

    int MinTernary(int x, int y)  => x < y ? x : y;
    int MinBitHacks(int x, int y) => x & ((x - y) >> 31) | y & (~(x - y) >> 31);
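
The bit-hack version relies on the arithmetic right shift: `(x - y) >> 31` is all ones when `x - y` is negative and zero otherwise, so the first mask keeps `x` exactly when `x < y`, and the complementary mask keeps `y` in the other case. Here is a minimal sketch of that reasoning, assuming the default unchecked arithmetic. It is a C# 9+ top-level program, and `SelectXMask` is a hypothetical helper that is not part of the benchmark.

```csharp
using System;

// The two methods from the post, as local functions.
static int MinTernary(int x, int y) => x < y ? x : y;
static int MinBitHacks(int x, int y) => x & ((x - y) >> 31) | y & (~(x - y) >> 31);

// Hypothetical helper exposing the first mask: all ones (-1) when x - y is
// negative, zero otherwise. The second mask in MinBitHacks is its complement.
static int SelectXMask(int x, int y) => (x - y) >> 31;

Console.WriteLine(SelectXMask(3, 7)); // -1: x is kept, y is masked out
Console.WriteLine(SelectXMask(7, 3)); // 0: x is masked out, y is kept

// Compare both methods on random inputs whose difference cannot overflow.
var random = new Random(42);
int mismatches = 0;
for (int i = 0; i < 1_000_000; i++)
{
    int x = random.Next(-1_000_000_000, 1_000_000_000);
    int y = random.Next(-1_000_000_000, 1_000_000_000);
    if (MinTernary(x, y) != MinBitHacks(x, y))
        mismatches++;
}
Console.WriteLine(mismatches); // 0

// When x - y overflows, the sign bit no longer reflects the comparison.
Console.WriteLine(MinTernary(int.MinValue, 1));  // -2147483648
Console.WriteLine(MinBitHacks(int.MinValue, 1)); // 1
```

The last two lines show the caveat: the two methods are interchangeable only for inputs where `x - y` fits into an `int`.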


And here are some results:

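| | Random: Ternary | Random: BitHacks | Const: Ternary | Const: BitHacks |
|---|---|---|---|---|
| LegacyJIT-x86 | ≈643µs | ≈227µs | ≈160µs | ≈226µs |
| LegacyJIT-x64 | ≈450µs | ≈123µs | ≈68µs | ≈123µs |
| RyuJIT-x64 | ≈594µs | ≈241µs | ≈180µs | ≈241µs |
| Mono-x64 | ≈203µs | ≈283µs | ≈204µs | ≈282µs |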

What’s going on here? Let’s discuss it in detail.

## Stopwatch under the hood

Update: You can find an updated and significantly improved version of this post in my book “Pro .NET Benchmarking”.

In the previous post, we discussed DateTime. This structure can be used in situations when you don’t need a good level of precision. If you want to do high-precision time measurements, you need a better tool because DateTime has a low resolution and a high latency. Also, time is tricky: you can create wonderful bugs if you don’t understand how it works (see Falsehoods programmers believe about time and More falsehoods programmers believe about time).

In this post, we will briefly talk about the Stopwatch class:

• Which hardware timers can serve as a base for Stopwatch
• High-precision timestamp API on Windows and Linux
• Latency and resolution of Stopwatch in different environments
• Common pitfalls: which problems we can run into when measuring small time intervals

If you are not a .NET developer, you can also find a lot of useful information in this post: mainly we will discuss low-level details of high-resolution timestamping (probably your favorite language also uses the same API). As usual, you can also find useful links for further reading.
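
To get a first impression of these properties on your own machine, you can start with a rough probe like the one below. It is a minimal sketch, not the measurement methodology of the post: it prints what `Stopwatch` reports about itself and then looks for the smallest non-zero difference between two consecutive `Stopwatch.GetTimestamp()` calls. It is written as a C# 9+ top-level program, and the helper names `TicksToNanoseconds` and `MinObservedDelta` are hypothetical.

```csharp
using System;
using System.Diagnostics;

// Converts a Stopwatch tick count to nanoseconds using the reported frequency.
static double TicksToNanoseconds(long ticks) => ticks * 1_000_000_000.0 / Stopwatch.Frequency;

// Hypothetical helper: the smallest non-zero difference between two consecutive
// GetTimestamp() calls, in Stopwatch ticks. Returns long.MaxValue if every
// observed pair of timestamps was identical.
static long MinObservedDelta(int iterations)
{
    long min = long.MaxValue;
    for (int i = 0; i < iterations; i++)
    {
        long a = Stopwatch.GetTimestamp();
        long b = Stopwatch.GetTimestamp();
        long delta = b - a;
        if (delta > 0 && delta < min)
            min = delta;
    }
    return min;
}

Console.WriteLine($"IsHighResolution: {Stopwatch.IsHighResolution}");
Console.WriteLine($"Frequency: {Stopwatch.Frequency} ticks per second");
Console.WriteLine($"One tick: {TicksToNanoseconds(1)} ns");

long minDelta = MinObservedDelta(1_000_000);
Console.WriteLine($"Min observed delta: {minDelta} ticks = {TicksToNanoseconds(minDelta)} ns");
```

The numbers depend on the hardware timer, the operating system, and the runtime, so treat the output as a hint about your environment and not as a precise latency or resolution figure.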

## DateTime under the hood

Update: You can find an updated and significantly improved version of this post in my book “Pro .NET Benchmarking”.

DateTime is a widely used .NET type. A lot of developers use it all the time, but not all of them really know how it works. In this post, I discuss DateTime.UtcNow: how it’s implemented, what the latency and the resolution of DateTime are on Windows and Linux, how the resolution can be changed, and how it can affect your application. This post is an overview, so you probably will not see super detailed explanations of some topics, but you will find a lot of useful links for further reading.
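
A simple way to observe the resolution of `DateTime.UtcNow` is to spin until the returned value changes and look at the size of the step. The sketch below does exactly that; it is an illustration under the assumption that the system clock is not adjusted while it runs, and the helper name `ObserveUtcNowStep` is hypothetical. It is written as a C# 9+ top-level program.

```csharp
using System;

// Hypothetical helper: spins until DateTime.UtcNow changes and returns
// the size of the observed step in DateTime ticks (1 tick = 100 ns).
static long ObserveUtcNowStep()
{
    long start = DateTime.UtcNow.Ticks;
    long current;
    do
    {
        current = DateTime.UtcNow.Ticks;
    } while (current == start);
    return current - start;
}

// Take the smallest step over several observations to reduce noise
// from thread scheduling.
long minStep = long.MaxValue;
for (int i = 0; i < 20; i++)
    minStep = Math.Min(minStep, ObserveUtcNowStep());

Console.WriteLine($"Min observed step: {minStep} ticks = {minStep / 10.0} µs");
```

The observed step is an upper bound on the real resolution: each loop iteration also pays the latency of the `DateTime.UtcNow` call itself.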

## LegacyJIT-x86 and first method call

Today I’ll tell you about one of my favorite benchmarks (this method doesn’t return a useful value; we need it only as an example):

    [Benchmark]
    public string Sum()
    {
        double a = 1, b = 1;
        var sw = new Stopwatch();
        for (int i = 0; i < 10001; i++)
            a = a + b;
        return string.Format("{0}{1}", a, sw.ElapsedMilliseconds);
    }


An interesting fact: if you call Stopwatch.GetTimestamp() before the first call of the Sum method, Sum becomes several times faster (this works only with LegacyJIT-x86).