Performance gains when specifying a List’s capacity – Part 2


Performance gains when specifying a List’s capacity – Part 2

This is a continuation from our last post.

In our last post, we experimented a bit with creating and populating a new list, in order to discover the gains of using the capacity parameter. Following James Curran’s suggestion, let’s compare how our code would perform in cases that we try to guess our list’s size. The code below creates and populates a list 100 times.

var itemsToAdd = int.MaxValue / 100;
var stopwatch = new Stopwatch();
var timesToRepeat = 100;

for (int i = 0; i <= 100; i += 10)
{
    var listCapacity = i * itemsToAdd / 100;

    stopwatch.Restart();
    for (int k = 0; k < timesToRepeat; k++)
    {
        var list = new List<int>(listCapacity);
        for (int j = 0; j < itemsToAdd; j++)
            list.Add(j);
    }
    stopwatch.Stop();

    Console.WriteLine($"With {i}% ({listCapacity}) of total capacity: Total: {stopwatch.ElapsedMilliseconds}ms Avg. {stopwatch.ElapsedMilliseconds / timesToRepeat}ms");
}
With 0% (0) of total capacity: Total: 10816ms Avg. 108ms
With 10% (2147483) of total capacity: Total: 9691ms Avg. 96ms
With 20% (4294967) of total capacity: Total: 8617ms Avg. 86ms
With 30% (6442450) of total capacity: Total: 7482ms Avg. 74ms
With 40% (8589934) of total capacity: Total: 8619ms Avg. 86ms
With 50% (10737418) of total capacity: Total: 7154ms Avg. 71ms
With 60% (12884901) of total capacity: Total: 8360ms Avg. 83ms
With 70% (15032385) of total capacity: Total: 8262ms Avg. 82ms
With 80% (17179868) of total capacity: Total: 8613ms Avg. 86ms
With 90% (19327352) of total capacity: Total: 8928ms Avg. 89ms
With 100% (21474836) of total capacity: Total: 6389ms Avg. 63ms

The results might be a bit confusing. As our initial capacity gets closer to the final capacity, we should be seeing some kind of linear improvement, right? But then, it seems that between 10% and 90% we hardly get any difference, and even stranger, sometimes we get a worse performance, what’s going on? In order to address that, let’s tune up our code a bit. In our next code, we will monitor the times that we had to resize our list.

var itemsToAdd = int.MaxValue / 100;
var stopwatch = new Stopwatch();
var timesToRepeat = 100;

for (int i = 0; i <= 100; i += 10)
{
    var listCapacity = i * itemsToAdd / 100;
    var operations = 0;

    stopwatch.Restart();
    for (int k = 0; k < timesToRepeat; k++)
    {
        var list = new List<int>(listCapacity);
        var lastCapacity = listCapacity;
        for (int j = 0; j < itemsToAdd; j++)
        {
            list.Add(j);
            if (lastCapacity != list.Capacity)
            {
                operations++;
                lastCapacity = list.Capacity;
            }
        }
    }
    stopwatch.Stop();

    Console.WriteLine($"With {i}% ({listCapacity}) of total capacity: Total: {stopwatch.ElapsedMilliseconds}ms Avg. {stopwatch.ElapsedMilliseconds / timesToRepeat}ms Resized {operations / timesToRepeat} times per repetition");
}
With 0% (0) of total capacity: Total: 14008ms Avg. 140ms Resized 24 times per repetition
With 10% (2147483) of total capacity: Total: 12569ms Avg. 125ms Resized 4 times per repetition
With 20% (4294967) of total capacity: Total: 11840ms Avg. 118ms Resized 3 times per repetition
With 30% (6442450) of total capacity: Total: 10362ms Avg. 103ms Resized 2 times per repetition
With 40% (8589934) of total capacity: Total: 11678ms Avg. 116ms Resized 2 times per repetition
With 50% (10737418) of total capacity: Total: 10030ms Avg. 100ms Resized 1 times per repetition
With 60% (12884901) of total capacity: Total: 10252ms Avg. 102ms Resized 1 times per repetition
With 70% (15032385) of total capacity: Total: 11388ms Avg. 113ms Resized 1 times per repetition
With 80% (17179868) of total capacity: Total: 12111ms Avg. 121ms Resized 1 times per repetition
With 90% (19327352) of total capacity: Total: 12596ms Avg. 125ms Resized 1 times per repetition
With 100% (21474836) of total capacity: Total: 9817ms Avg. 98ms Resized 0 times per repetition

Update: more accurate data

James M Curran further investigated the numbers regarding this test and was kind enough to share it with us. You can find the data below:

Conclusion

As we can see, the amount of times that we had to resize our code is quite stable between 10% and 90%. That’s because each time the list needs to be resized, its size gets doubled, saving us a lot of resize operations. After this we can definitely say that if you had the slightest idea of the size of your list, you should trust your guts and go for it!