Collection Initializer Performance in C# - How To Get An 87% Boost!

After seeing some posts on LinkedIn discussing collection initializers, I became curious. There was a claim that using collection expressions would have better performance than collection initializers. As a result, I set out to measure collection initializer performance in C# using BenchmarkDotNet. And yes, while these might be micro-optimizations for many people, I thought it would be cool to explore.

Besides, maybe there's someone out there with something like this on their hot-path that needs to squeeze a bit more out of their application :)



What Are Collection Initializers and Collection Expressions in C#?

In one of my most recent articles, I explain the basics of collection initializers with some simple code examples. Simply put, instead of manually writing code like the following to initialize a collection:

List<string> devLeaderCoolList = new List<string>();
devLeaderCoolList.Add("Hello");
devLeaderCoolList.Add(", ");
devLeaderCoolList.Add("World!");

... we can instead reduce it to something more succinct like the following:

List<string> devLeaderCoolList = [ "Hello", ", ", "World!" ];

Pretty neat, right? And this collection expression syntax is even more lightweight than we've had access to in recent times.
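For comparison, here's a minimal sketch that puts the manual approach, the classic collection initializer (the curly-brace form that the benchmarks later in this article use as a baseline), and the collection expression side by side. The variable names are only for illustration, and the collection expression line assumes a C# 12 or newer compiler:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Manual: construct the list, then call Add for each item.
List<string> manual = new List<string>();
manual.Add("Hello");
manual.Add(", ");
manual.Add("World!");

// Classic collection initializer: curly braces after the constructor call.
List<string> classic = new List<string>() { "Hello", ", ", "World!" };

// Collection expression: square brackets, target-typed to List<string>.
List<string> expression = ["Hello", ", ", "World!"];

// All three lists hold the same items in the same order.
Console.WriteLine(manual.SequenceEqual(classic));     // True
Console.WriteLine(classic.SequenceEqual(expression)); // True
Console.WriteLine(string.Concat(expression));         // Hello, World!
```

The end result is identical in all three cases; what differs is the syntax and, as the benchmarks suggest, what gets executed to build the list.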

But syntax and readability aside (not to minimize the benefits of readable code, but I'm trying not to put you to sleep), what about the performance?! I bet you didn't even consider that, with all of the different collection initializer syntax we have, we'd see a performance difference!

Well, Dave Callan got me thinking about that when he posted this on LinkedIn:

[Image: Dave Callan - Collection Initializer Collection Expression Benchmarks]

This image was originally posted by Dave Callan on LinkedIn, and that has inspired this entire article. So let's jump into some benchmarks!


Exploring List Collection Initializer Performance in C#

This section will detail the benchmarks for initializing lists in C# in various ways. I'll provide coverage on different collection initializers, the newer collection expression syntax, and even compare it to doing it manually! Surely, adding everything by hand would be slower than setting ourselves up for success by doing it all with a collection initializer -- but we should cover our bases.

I will not be covering the spread operator in these benchmarks because I'd like to focus on that more for collection combination benchmarks. Admittedly, yes, it is still creating a collection... but I feel like the use case is different and I'd like to split it up.

I'll be using BenchmarkDotNet for all of these benchmarks, so if you're not familiar with it, you can check out my video on how to use BenchmarkDotNet and try it for yourself.

The List Benchmark Code

With the BenchmarkDotNet NuGet package installed, here's what I am using at the entry point to kick things off (for all benchmark examples in this article):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

using System.Reflection;

BenchmarkRunner.Run(
    Assembly.GetExecutingAssembly(),
    args: args);

It's not very exciting -- but I wanted to show you there's nothing fancy going on here. Just running all of the benchmarks we have access to. And here is the list benchmark code:

[MemoryDiagnoser]
[MediumRunJob]
public class ListBenchmarks
{
    private static readonly string[] _dataAsArray = new string[]
    {
        "Apple",
        "Banana",
        "Orange",
    };

    private static IEnumerable<string> GetDataAsIterator()
    {
        yield return "Apple";
        yield return "Banana";
        yield return "Orange";
    }

    [Benchmark(Baseline = true)]
    public List<string> ClassicCollectionInitializer_NoCapacity()
    {
        return new List<string>()
        {
            "Apple",
            "Banana",
            "Orange",
        };
    }

    [Benchmark]
    public List<string> ClassicCollectionInitializer_SetCapacity()
    {
        return new List<string>(3)
        {
            "Apple",
            "Banana",
            "Orange",
        };
    }

    [Benchmark]
    public List<string> CollectionExpression()
    {
        return
        [
            "Apple",
            "Banana",
            "Orange",
        ];
    }

    [Benchmark]
    public List<string> CopyConstructor_Array()
    {
        return new List<string>(_dataAsArray);
    }

    [Benchmark]
    public List<string> CopyConstructor_Iterator()
    {
        return new List<string>(GetDataAsIterator());
    }

    [Benchmark]
    public List<string> ManuallyAdd_NoCapacitySet()
    {
        List<string> list = [];
        list.Add("Apple");
        list.Add("Banana");
        list.Add("Orange");
        return list;
    }

    [Benchmark]
    public List<string> ManuallyAdd_CapacitySet()
    {
        List<string> list = new(3);
        list.Add("Apple");
        list.Add("Banana");
        list.Add("Orange");
        return list;
    }
}

Note that in the above code example, the baseline we will be comparing against is what I consider the traditional collection initializer:

return new List<string>()
{
    "Apple",
    "Banana",
    "Orange",
};

The List Benchmark Results

And of course, I wouldn't make you go compile and run these yourself, so let's look at the results below:

[Image: C# Collection Initializer and Collection Expression Benchmarks for List]

Let's go through the results from worst to best based on the Ratio column (Higher is worse):

  • 3.31X - Using a copy constructor where we pass in an iterator is the worst performing. This is likely due to the overhead of creating an iterator, especially for such a small and simple overall operation, AND because the list has no known capacity to work with when it's handed an iterator.
  • 1.76X - Using a copy constructor even with an array isn't great. If you know what you need to put into the collection, you're better off using a normal classic collection initializer. An argument for both of the copy constructors, though, is that if this isn't on the hot path, it might be more maintainable to copy a collection than to instantiate it with duplicated values manually.
  • 1.10X - Manually adding things to a collection without setting a capacity is only a little bit slower than using a collection initializer with no capacity! 10% slower based on these benchmarks.
  • 1.0X - The baseline here is a classic collection initializer with no capacity set.
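To make the "no known capacity" point about iterators a bit more concrete, here's a small sketch. It is not the code that List<T> runs internally; it's a stand-in that uses LINQ's TryGetNonEnumeratedCount (available in .NET 6 and newer) to show that an array can report its size up front while a yield-based iterator cannot. The dataAsArray variable and the local GetDataAsIterator function mirror the benchmark code but are redefined here so the snippet stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] dataAsArray = ["Apple", "Banana", "Orange"];

// An array can report how many items it holds without being enumerated.
bool arrayHasCount = dataAsArray.TryGetNonEnumeratedCount(out int arrayCount);
Console.WriteLine($"array:    known = {arrayHasCount}, count = {arrayCount}");

// A compiler-generated iterator has no count until something walks it.
bool iteratorHasCount = GetDataAsIterator().TryGetNonEnumeratedCount(out int iteratorCount);
Console.WriteLine($"iterator: known = {iteratorHasCount}, count = {iteratorCount}");

static IEnumerable<string> GetDataAsIterator()
{
    yield return "Apple";
    yield return "Banana";
    yield return "Orange";
}
```

When the size is known, a collection can allocate its storage once. When it isn't, the items have to be pulled one at a time and the storage grown as needed, which is a plausible contributor to the gap we see above.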

Here is where we start to see some speed up!

  • 0.64X - Using a collection expression took 64% of the baseline's time! That's a pretty dramatic improvement for what just looks like a syntax change, and that's very much in line with what Dave Callan's screenshot shows.
  • 0.61X - Manually adding things to a list that has an initial capacity is actually FASTER than the collection expression and the no-capacity collection initializer that we've seen so far!
  • 0.53X - Using a classic collection initializer but providing the capacity is almost HALF the time!

One of the common themes here is that providing a capacity is a BIG performance gain. We realized an ~87% gain over our baseline simply by providing it a capacity (at roughly half the time per call, that's nearly double the throughput). Side note: why couldn't the compiler do some kind of optimization here if we know the collection size in the braces?!
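If you want to see the resizing for yourself, here's a rough sketch that counts how many times a list's Capacity changes while items are added, with and without a capacity supplied up front. CountCapacityChanges is a hypothetical helper written for this illustration, and the exact growth pattern is a runtime implementation detail, so treat the printed numbers as observations on your machine rather than guarantees:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: adds itemCount items to the list and counts
// how many times the list's Capacity changes along the way.
static int CountCapacityChanges(List<int> list, int itemCount)
{
    int changes = 0;
    int lastCapacity = list.Capacity;

    for (int i = 0; i < itemCount; i++)
    {
        list.Add(i);
        if (list.Capacity != lastCapacity)
        {
            changes++;
            lastCapacity = list.Capacity;
        }
    }

    return changes;
}

Console.WriteLine($"3 items, no capacity:       {CountCapacityChanges(new List<int>(), 3)}");
Console.WriteLine($"3 items, capacity of 3:     {CountCapacityChanges(new List<int>(3), 3)}");
Console.WriteLine($"1000 items, no capacity:    {CountCapacityChanges(new List<int>(), 1000)}");
Console.WriteLine($"1000 items, capacity 1000:  {CountCapacityChanges(new List<int>(1000), 1000)}");
```

Each capacity change means a new backing array was allocated and the existing items were copied over, which is work that a correctly sized list gets to skip.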


Exploring Dictionary Collection Initializer Performance in C#

Dictionaries don't yet have a fancy collection expression that uses square brackets and removes even more bloat, but we do have several variations of collection initializers to use. These benchmarks will be very similar, also using BenchmarkDotNet, and they also use the same entry point program -- so I won't repeat it here.

I know dictionaries can have two types to work with, and I wanted to keep this similar to the list example -- not because they are similar implementations of collections, but because I didn't want to just pollute this article with more variations of things for no reason. I decided to go with a Dictionary<string, string> where the keys are what we already looked at, and the values are just some short strings to work with that are unique.

The Dictionary Benchmark Code

Here's the code for the dictionary benchmarks:

[MemoryDiagnoser]
[MediumRunJob]
public class DictionaryBenchmarks
{
    private static readonly Dictionary<string, string> _sourceData = new()
    {
        ["Apple"] = "The first value",
        ["Banana"] = "The next value",
        ["Orange"] = "The last value",
    };

    private static IEnumerable<KeyValuePair<string, string>> GetDataAsIterator()
    {
        foreach (var item in _sourceData)
        {
            yield return item;
        }
    }

    [Benchmark(Baseline = true)]
    public Dictionary<string, string> CollectionInitializer_BracesWithoutCapacity()
    {
        return new Dictionary<string, string>()
        {
            { "Apple", "The first value" },
            { "Banana", "The next value" },
            { "Orange",  "The last value" },
        };
    }

    [Benchmark]
    public Dictionary<string, string> CollectionInitializer_BracesWithCapacity()
    {
        return new Dictionary<string, string>(3)
        {
            { "Apple", "The first value" },
            { "Banana", "The next value" },
            { "Orange",  "The last value" },
        };
    }

    [Benchmark]
    public Dictionary<string, string> CollectionInitializer_BracketsWithoutCapacity()
    {
        return new Dictionary<string, string>()
        {
            ["Apple"] = "The first value",
            ["Banana"] = "The next value",
            ["Orange"] = "The last value",
        };
    }

    [Benchmark]
    public Dictionary<string, string> CollectionInitializer_BracketsWithCapacity()
    {
        return new Dictionary<string, string>(3)
        {
            ["Apple"] = "The first value",
            ["Banana"] = "The next value",
            ["Orange"] = "The last value",
        };
    }

    [Benchmark]
    public Dictionary<string, string> CopyConstructor_Dictionary()
    {
        return new Dictionary<string, string>(_sourceData);
    }

    [Benchmark]
    public Dictionary<string, string> CopyConstructor_Iterator()
    {
        return new Dictionary<string, string>(GetDataAsIterator());
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAdd_NoCapacitySet()
    {
        Dictionary<string, string> dict = [];
        dict.Add("Apple", "The first value");
        dict.Add("Banana", "The next value");
        dict.Add("Orange", "The last value");
        return dict;
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAdd_CapacitySet()
    {
        Dictionary<string, string> dict = new(3);
        dict.Add("Apple", "The first value");
        dict.Add("Banana", "The next value");
        dict.Add("Orange", "The last value");
        return dict;
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAssign_NoCapacitySet()
    {
        Dictionary<string, string> dict = [];
        dict["Apple"] = "The first value";
        dict["Banana"] = "The next value";
        dict["Orange"] = "The last value";
        return dict;
    }

    [Benchmark]
    public Dictionary<string, string> ManuallyAssign_CapacitySet()
    {
        Dictionary<string, string> dict = new(3);
        dict["Apple"] = "The first value";
        dict["Banana"] = "The next value";
        dict["Orange"] = "The last value";
        return dict;
    }
}

You'll notice two themes creeping up:

  • We have two different flavors of collection initializers: square brackets and curly braces
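  • We can manually instantiate a dictionary by adding or directly assigning (which are NOT the exact same behavior: Add throws if the key already exists, while assigning through the indexer overwrites the existing value)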

Otherwise, we still have capacity considerations just like the list benchmarks!
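Since adding and assigning don't behave the same way, here's a quick sketch of the difference when a key already exists. The keys and values are just the placeholder data from the benchmarks:

```csharp
using System;
using System.Collections.Generic;

Dictionary<string, string> dict = new();

// Assigning through the indexer adds the key if it's missing...
dict["Apple"] = "The first value";

// ...and silently overwrites the value if the key is already there.
dict["Apple"] = "A replacement value";
Console.WriteLine(dict["Apple"]); // A replacement value

// Add, on the other hand, throws when the key already exists.
bool addThrew = false;
try
{
    dict.Add("Apple", "Yet another value");
}
catch (ArgumentException)
{
    addThrew = true;
}

Console.WriteLine(addThrew);   // True
Console.WriteLine(dict.Count); // 1
```

With three unique keys, as in these benchmarks, both approaches end up with the same dictionary contents, so the comparison is about the cost of each call rather than a difference in results.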

The Dictionary Benchmark Results

The dictionary benchmarks are as follows:

[Image: C# Collection Initializer and Collection Expression Benchmarks for Dictionary]

Doing the same exercise of highest to lowest ratio:

  • 2.03X - Copy constructor with iterator strikes again! I suspect for similar reasons -- no count for knowing the capacity and the overhead of creating the iterator relative to the number of items.
  • 1.02X - Manually adding items to the dictionary WITH the capacity set was almost the exact same performance as using the collection initializer! This one to me is very surprising given the known capacity usually speeds things up a great deal.
  • 1.0X - Our classic collection initializer using curly braces and not setting the capacity is our baseline

Everything beyond here is technically faster according to our benchmarks:

  • 0.96X - In the case where we provide a capacity, we're a little bit faster than the baseline using the same syntax.
  • 0.95X - Manually assigning items without a known capacity is even faster, and about 5% faster than the baseline. I still wouldn't be making much of a fuss here.
  • 0.94X - Classic collection initializer but using square brackets and not setting a capacity is ~6% faster than the baseline.
  • 0.90X - Manually assigning items when the capacity is known is ~11% faster than the baseline, so we're starting to pick up steam a little bit here!
  • 0.87X - Classic collection initializer but using square brackets WITH a known capacity is ~15% faster than the baseline
  • 0.86X - Manually adding things to the dictionary but NOT setting a capacity is ~16% faster...

Okay, wait a second. Now we're going to see that doing a dictionary copy is the FASTEST with a ~35% speed boost? I'm not sure how we've started to see known capacities not helping and copy constructors being fastest.

Even I'm skeptical now. So I reran the benchmarks, this time adding a variant of each of the manual benchmarks that uses new() instead of an empty collection expression, [].

[Image: C# Collection Initializer and Collection Expression Benchmarks for Dictionary V2]

In this run of the benchmarks, things are much closer across the board. I don't think that discredits the previous benchmarks, because truly many of them were also relatively close with the iterator copy constructor remaining the huge outlier. But the other huge outlier that remains is the copy constructor using another dictionary!

My takeaway is this:

  • For the marginal boost in performance, I think you should opt for readability here -- especially when the collection sizes are this small.
  • If you know that you're trying to set up a dictionary to be just like another, apparently copying it is much faster. So if you have this kind of thing on a hot path, here's a fun little micro-optimization.
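If you do reach for the copy constructor, it's worth remembering what you get back. The sketch below (with illustrative variable names) shows that the copy is its own dictionary: changes made to the copy don't show up in the source. The values themselves are not cloned, which doesn't matter for strings but would for mutable reference types:

```csharp
using System;
using System.Collections.Generic;

Dictionary<string, string> source = new()
{
    ["Apple"] = "The first value",
    ["Banana"] = "The next value",
    ["Orange"] = "The last value",
};

// Copy constructor: builds a new dictionary with the same entries.
Dictionary<string, string> copy = new(source);

// Changing the copy leaves the source untouched.
copy["Apple"] = "Changed in the copy";
copy.Remove("Orange");

Console.WriteLine(source["Apple"]); // The first value
Console.WriteLine(source.Count);    // 3
Console.WriteLine(copy.Count);      // 2
```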

Are These Realistic Benchmarks?

I've written hundreds of articles, made hundreds of YouTube videos, and published more posts across social media platforms than I could ever count. There will be people who want to pick these benchmarks apart, and unfortunately, it will sometimes seem like the goal is just to discredit what's being presented.

However, I *DO* think it's important to discuss the context of the benchmarks and look at what's being considered in these scenarios:

  • In the grand scheme of things, I'd suspect it's unlikely that you're going to get huge performance gains from focusing on these collection initializers. There's probably bigger fish to fry in your code. But we do see there are some gains to be had!
  • When using an iterator example with a very small set of data or other very fast operations, the overhead of the iterator itself may dwarf some of the other actions taking place
  • The use case for these different ways of creating lists varies. For some, we're defining the full collection whereas for others we're using values from another collection. The use case isn't necessarily apples to apples for a comparison.
  • There seem to be big gains from knowing the capacity up front, which is likely helping reduce collection resizing behind the scenes. How might this change if we were dealing with larger data sets?
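On that last question about larger data sets, one way to explore it would be a parameterized benchmark. The following is a hypothetical sketch that I have not run for this article, so there are no results to report from it. The class name, the ItemCount values, and the choice of List<int> are all assumptions for illustration; it relies on BenchmarkDotNet's [Params] attribute to repeat each benchmark per size and would be picked up by the same entry point shown earlier:

```csharp
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;

// Hypothetical follow-up benchmark, not part of the runs shown in this article.
[MemoryDiagnoser]
[MediumRunJob]
public class ListCapacityBySizeBenchmarks
{
    // BenchmarkDotNet runs every benchmark once for each value listed here.
    [Params(3, 100, 10_000)]
    public int ItemCount { get; set; }

    [Benchmark(Baseline = true)]
    public List<int> ManuallyAdd_NoCapacitySet()
    {
        // No capacity: the list has to grow as items are added.
        List<int> list = new();
        for (int i = 0; i < ItemCount; i++)
        {
            list.Add(i);
        }

        return list;
    }

    [Benchmark]
    public List<int> ManuallyAdd_CapacitySet()
    {
        // Capacity provided up front: storage is allocated once.
        List<int> list = new(ItemCount);
        for (int i = 0; i < ItemCount; i++)
        {
            list.Add(i);
        }

        return list;
    }
}
```

Comparing the Ratio and Allocated columns across the different ItemCount values should give a sense of whether the capacity benefit holds, grows, or shrinks as the collection gets bigger.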

The goal of presenting these benchmarks is not to tell you that you must do things a certain way -- it's simply to show you some interesting information. Even if you are hyper-focused on performance, you should benchmark and profile your own code! Don't rely on my results here. Let these serve as a starting point: there may be things on your hot path that you didn't realize you could tune.

What other considerations can you think of? Feel free to share in the comments -- but be conversational, please.


Wrapping Up Collection Initializer Performance in C#

Overall, I consider most of what we see in this article on collection initializer performance in C# to be micro-optimizations. I wouldn't lose sleep over using one way over another, as long as you're optimizing for readability and your profiling results don't show you spending most of your time doing collection initialization. I hope that you had fun exploring this with me and saw that if you're ever curious, you can go set up some simple benchmarks to experiment!

If you found this useful and you're looking for more learning opportunities, consider subscribing to my free weekly software engineering newsletter and check out my free videos on YouTube! Meet other like-minded software engineers and join my Discord community!

Frequently Asked Questions: Collection Initializer Performance in C#

What are collection initializers in C#?

Collection initializers in C# allow you to populate a collection with a set of predefined elements in a concise syntax, enhancing readability and maintainability of code.

How do collection expressions improve C#?

Collection expressions, introduced in C# 12, offer a more streamlined syntax for collection initialization, including support for the spread operator, making code simpler and more expressive.

Are there performance benefits to using collection expressions in C#?

While collection expressions mainly improve code readability and brevity, they can also perform better. In the list benchmarks in this article, a collection expression took about 64% of the time of a classic collection initializer with no capacity set.
