내용 보기

작성자

관리자 (IP : 172.17.0.1)

날짜

2021-12-08 21:50

제목

[C#] LIST<T>를 열거하는 가장 빠른 방법


System.Collections.Generic.List may be the most used collection in .NET. It is a generic collection that can be used to store any type of data. In this post, we'll see the different ways to enumerate a List.

There are multiple ways to enumerate a List in .NET. Not all of them are equal to each other in term of behavior and performance. You can enumerate the list faster if you accept tradeoffs!

#Using the foreach statement

List list = new List();
foreach (var item in list)
{
    // Do something
}

This is the most common way to iterate on a collection. The compiler will replace this loop with a call to GetEnumerator and MoveNext() methods:

List list = new List();
List.Enumerator enumerator = list.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        T current = enumerator.Current;
    }
}
finally
{
    ((IDisposable)enumerator).Dispose();
}

The MoveNext method implementation ensure the list is not modified during the enumeration. If the list is modified, an InvalidOperationException with the message "Collection was modified after the enumerator was instantiated" is thrown by the MoveNext method. This mainly helps detecting multi-threading issues. However, this feature comes with a cost. Indeed, there is a check when accessing each item of the collection.

#Using the List.Foreach method

List list = new List();
list.ForEach(item =>
{
    // Do something
});

The ForEach method knows internals of the List and is able to optimize the enumeration. Indeed, it can iterate over the private array and the JIT may be able to remove bound checks. This method gives the same guarantee as when using the foreach keyword. So, it will throw an exception if the list is modified during the enumeration.

This method needs to call the delegate for each item. Calling a method has a low overhead, but it may be significant in a micro-benchmark compared to a foreach loop which would have its code inlined.

#Using the for statement

List list = new List();
for (int i = 0; i < list.Count; i++)
{
    var item = list[i];
    // Do something
}

Using a for loop, you don't ensure the collection is not modified during the enumeration. However, the indexer still need to check you don't access an invalid index. So, the performance should be a little bit better than the foreach loop as there is less checks.

#Using the CollectionsMarshal.AsSpan method

List list = new List();
foreach (var item in CollectionsMarshal.AsSpan(list))
{
    // Do something
}

The CollectionsMarshal.AsSpan unsafe and should be used only if you know what you're doing. CollectionsMarshal.AsSpan returns a Span on the private array of List. Iterating over a Span is fast as the JIT uses the same tricks as for optimizing arrays. Using this method, it won't check the list is not modified during the enumeration.

As said, this method is unsafe and you may get weird behaviors if you don't know what you are doing. For instance, you may iterate over items that are not part of the list anymore in some cases. Let's see an example:

var list = new List<int>(capacity: 1);
list.Add(0); // list: [0]

var span = CollectionsMarshal.AsSpan(list);
span[0] = 1; // list: [1], span: [1]

// To add a new item, the list needs to increase its capacity
// => A new array is created and replace the existing List._items
// But the span still references the old array
// => ⚠️ Accessing the span won't access the actual list items!
list.Add(3); // list: [1, 3], span: [1]
list[0] = 2; // ⚠️ list: [2, 3], span: [1]
span[0] = 0; // ⚠️ list: [2, 3], span: [0]

As long as you don't modify the list while enumerating it, you should be fine.

#Benchmark

Let's benchmark the 4 methods using BenchmarkDotnet:

BenchmarkRunner.Run();

public class ListBenchmark
{
    private List<int> _list = default!;

    [Params(10, 1_000, 10_000, 100_000, 1_000_000)]
    public int Size { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        _list = new List<int>(Size);
        for (var i = 0; i < Size; i++)
        {
            _list.Add(i);
        }
    }

    [Benchmark(Baseline = true)]
    public void Foreach()
    {
        foreach (var item in _list)
        {
        }
    }

    [Benchmark]
    public void List_Foreach()
    {
        _list.ForEach(_ => { });
    }

    [Benchmark]
    public void For()
    {
        for (var i = 0; i < _list.Count; i++)
        {
            _ = _list[i];
        }
    }

    [Benchmark]
    public void Foreach_Span()
    {
        foreach (var item in CollectionsMarshal.AsSpan(_list))
        {
        }
    }
}
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
AMD Ryzen 7 5800X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.100
  [Host]     : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT
  DefaultJob : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT
METHODSIZEMEANERRORSTDDEVMEDIANRATIO
Foreach106.538 ns0.1515 ns0.2313 ns6.473 ns1.00
List_Foreach1017.250 ns0.1969 ns0.1841 ns17.207 ns2.60
For103.972 ns0.1016 ns0.2144 ns3.915 ns0.61
Foreach_Span102.404 ns0.0675 ns0.0778 ns2.402 ns0.36
Foreach1000652.441 ns13.0433 ns13.9562 ns645.224 ns1.00
List_Foreach10001,532.082 ns29.8052 ns33.1284 ns1,513.077 ns2.35
For1000335.335 ns6.4555 ns8.6179 ns337.084 ns0.52
Foreach_Span1000216.830 ns4.2158 ns4.1405 ns214.779 ns0.33
Foreach100006,525.978 ns128.4690 ns200.0107 ns6,424.406 ns1.00
List_Foreach1000015,729.575 ns306.7648 ns537.2739 ns15,578.696 ns2.42
For100003,260.233 ns63.2148 ns79.9465 ns3,208.650 ns0.50
Foreach_Span100002,149.916 ns41.9624 ns61.5079 ns2,110.078 ns0.33
Foreach10000064,740.749 ns1,236.3309 ns1,156.4647 ns64,548.645 ns1.00
List_Foreach100000161,580.108 ns3,214.2802 ns8,689.9934 ns159,376.184 ns2.49
For10000032,876.553 ns632.3321 ns822.2103 ns32,999.438 ns0.51
Foreach_Span10000021,528.151 ns424.5114 ns536.8710 ns21,562.567 ns0.33
Foreach1000000649,103.544 ns12,557.2372 ns11,131.6637 ns645,519.531 ns1.00
List_Foreach10000001,539,384.342 ns30,577.8956 ns32,718.0058 ns1,546,489.551 ns2.37
For1000000325,415.127 ns6,401.9765 ns8,324.3778 ns320,766.748 ns0.50
Foreach_Span1000000213,845.618 ns4,275.2697 ns4,390.3872 ns213,240.796 ns0.33

Unsurprisingly CollectionsMarshal.AsSpan is the fatest way to enumerate a List when it's applicable. I was a bit surprised with the performance of List.Foreach which is about 2.5 times slower than a foreach, but it makes sense as it has to call the delegate. In this benchmark there is no operation applied on list items, but in real scenario, this overhead should be negligible.

note: For large collections, you may need to investigate Parallel.ForEach to improve the performance. As always, be sure to measure the performance before and after the change.

출처1

https://www.meziantou.net/fastest-way-to-enumerate-a-list-t.htm

출처2