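This post introduces another feature added in .NET 6: Chunk.
In this post
- Framework: .NET 6
- Feature: Chunk
Description
Chunk does one simple thing: it splits a sequence into several sequences of equal size (the last one holds whatever is left over, so it may be shorter than the others). The method returns an IEnumerable<T[]>.
For example, take the following sequence of thirteen elements: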
List<Person> people = new List<Person>
{
    new Person { Name = "John", Age = 21 },
    new Person { Name = "Alex", Age = 34 },
    new Person { Name = "Mary", Age = 29 },
    new Person { Name = "Sophia", Age = 24 },
    new Person { Name = "Michael", Age = 40 },
    new Person { Name = "Emma", Age = 26 },
    new Person { Name = "Daniel", Age = 33 },
    new Person { Name = "Olivia", Age = 28 },
    new Person { Name = "James", Age = 45 },
    new Person { Name = "Isabella", Age = 30 },
    new Person { Name = "Benjamin", Age = 31 },
    new Person { Name = "Mia", Age = 32 },
    new Person { Name = "Lucas", Age = 27 }
};
To split this List<Person> into groups of three and display the result:
IEnumerable<Person[]> chunkedPeople = people.Chunk(3);
int index = 1;
foreach (var chunk in chunkedPeople)
{
    Console.WriteLine($"Chunk {index}: {string.Join(", ", chunk.Select(p => p.Name))}");
    index++;
}
Chunk 1: John, Alex, Mary
Chunk 2: Sophia, Michael, Emma
Chunk 3: Daniel, Olivia, James
Chunk 4: Isabella, Benjamin, Mia
Chunk 5: Lucas
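A few edge cases are worth knowing alongside the example above. The following is a minimal sketch rather than an exhaustive specification; the `numbers` array is made up for illustration, and the program assumes .NET 6 or later with top-level statements.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4, 5 };

// Five elements in chunks of two: the last chunk holds the single leftover.
Console.WriteLine(string.Join(" | ", numbers.Chunk(2).Select(c => c.Length))); // 2 | 2 | 1

// A size equal to the source length yields exactly one chunk.
Console.WriteLine(numbers.Chunk(5).Count()); // 1

// A size larger than the source still yields one chunk, trimmed to 5 elements.
Console.WriteLine(numbers.Chunk(10).First().Length); // 5

// An empty source yields no chunks at all.
Console.WriteLine(Array.Empty<int>().Chunk(3).Any()); // False

// A size below 1 is rejected with ArgumentOutOfRangeException.
try
{
    numbers.Chunk(0).ToList();
}
catch (ArgumentOutOfRangeException)
{
    Console.WriteLine("size must be at least 1");
}
```

Each chunk is a freshly allocated array, so modifying a chunk does not touch the source sequence.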
Without Chunk, we might have to do something like this:
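static IEnumerable<T[]> CustomChunk<T>(IEnumerable<T> source, int size)
{
    if (source == null)
    {
        throw new ArgumentNullException(nameof(source));
    }
    // Guard against size < 1: a size of zero would loop forever yielding empty arrays.
    if (size < 1)
    {
        throw new ArgumentOutOfRangeException(nameof(size));
    }
    // Copy the whole source into a queue, then drain it one chunk at a time.
    var queue = new Queue<T>(source);
    while (queue.Count > 0)
    {
        // The last chunk only holds the remaining elements.
        var chunk = new T[Math.Min(size, queue.Count)];
        for (int i = 0; i < chunk.Length; i++)
        {
            chunk[i] = queue.Dequeue();
        }
        yield return chunk;
    }
}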
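As a quick sanity check, the hand-written version can be compared against the built-in one. This is only a sketch: the `numbers` input is invented for illustration, and `CustomChunk` is repeated here as a local function so the snippet runs on its own (.NET 6 or later, top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Thirteen elements, mirroring the size of the people list in the post.
int[] numbers = Enumerable.Range(1, 13).ToArray();

int[][] builtIn = numbers.Chunk(3).ToArray();
int[][] custom = CustomChunk(numbers, 3).ToArray();

// Same number of chunks, and every pair of chunks has the same elements in the same order.
bool same = builtIn.Length == custom.Length
    && builtIn.Zip(custom, (x, y) => x.SequenceEqual(y)).All(equal => equal);

Console.WriteLine($"Chunks: {builtIn.Length}, identical: {same}"); // Chunks: 5, identical: True

// Queue-based approach from the post, written as a local iterator function.
static IEnumerable<T[]> CustomChunk<T>(IEnumerable<T> source, int size)
{
    if (source == null)
    {
        throw new ArgumentNullException(nameof(source));
    }
    var queue = new Queue<T>(source);
    while (queue.Count > 0)
    {
        var chunk = new T[Math.Min(size, queue.Count)];
        for (int i = 0; i < chunk.Length; i++)
        {
            chunk[i] = queue.Dequeue();
        }
        yield return chunk;
    }
}
```

Because `CustomChunk` is an iterator, its argument check only runs once enumeration starts, whereas `Enumerable.Chunk` validates its arguments when it is called.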
Benchmark
While I was at it, I also wrote a benchmark:
using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

internal class Program
{
    static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<ChunkBenchmark>();
    }
}

[MemoryDiagnoser]
public class ChunkBenchmark
{
    private List<Person> _people;

    // Build 1000 people once, before any benchmark iteration runs.
    [GlobalSetup]
    public void Setup()
    {
        var random = new Random();
        _people = Enumerable.Range(1, 1000).Select(i => new Person
        {
            Name = $"Name_{i}",
            Age = random.Next(10, 81)
        }).ToList();
    }

    // Built-in Enumerable.Chunk.
    [Benchmark]
    [Arguments(3)]
    [Arguments(7)]
    [Arguments(43)]
    public void CallChunk(int size)
    {
        var result = _people.Chunk(size).ToList();
    }

    // Hand-written queue-based version.
    [Benchmark]
    [Arguments(3)]
    [Arguments(7)]
    [Arguments(43)]
    public void CallCustomChunk(int size)
    {
        var result = CustomChunk(_people, size).ToList();
    }

    static IEnumerable<T[]> CustomChunk<T>(IEnumerable<T> source, int size)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }
        var queue = new Queue<T>(source);
        while (queue.Count > 0)
        {
            var chunk = new T[Math.Min(size, queue.Count)];
            for (int i = 0; i < chunk.Length; i++)
            {
                chunk[i] = queue.Dequeue();
            }
            yield return chunk;
        }
    }
}

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}
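The result was a little surprising: my hand-rolled CustomChunk is faster (roughly 1.6 to 2 times in the runs below), but it allocates more memory (about 7 to 8 KB more per call). Perhaps my code does not cover every case, or perhaps it is because Microsoft's implementation uses Array.Resize.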
// * Summary *
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4751/23H2/2023Update/SunValley3)
12th Gen Intel Core i7-1265U, 1 CPU, 12 logical and 10 physical cores
.NET SDK 9.0.200-preview.0.25057.12
[Host] : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
DefaultJob : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
| Method | size | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---------------- |----- |----------:|----------:|----------:|-------:|-------:|----------:|
| CallChunk | 3 | 10.981 us | 0.1972 us | 0.2192 us | 3.9063 | 0.2899 | 23.98 KB |
| CallCustomChunk | 3 | 6.380 us | 0.1259 us | 0.1177 us | 5.1804 | 0.5112 | 31.77 KB |
| CallChunk | 7 | 8.072 us | 0.0886 us | 0.0829 us | 2.5330 | 0.1221 | 15.57 KB |
| CallCustomChunk | 7 | 4.426 us | 0.0506 us | 0.0448 us | 3.7994 | 0.3128 | 23.27 KB |
| CallChunk | 43 | 6.759 us | 0.0616 us | 0.0515 us | 1.6251 | 0.0534 | 10 KB |
| CallCustomChunk | 43 | 3.430 us | 0.0288 us | 0.0256 us | 2.7618 | 0.1717 | 16.91 KB |
// * Summary *
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4751/23H2/2023Update/SunValley3)
12th Gen Intel Core i7-1265U, 1 CPU, 12 logical and 10 physical cores
.NET SDK 9.0.200-preview.0.25057.12
[Host] : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
DefaultJob : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
| Method | size | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---------------- |----- |----------:|----------:|----------:|-------:|-------:|----------:|
| CallChunk | 3 | 10.532 us | 0.1145 us | 0.1071 us | 3.9063 | 0.2899 | 23.98 KB |
| CallCustomChunk | 3 | 6.557 us | 0.1219 us | 0.1140 us | 5.1804 | 0.5112 | 31.77 KB |
| CallChunk | 7 | 8.208 us | 0.1490 us | 0.2363 us | 2.5330 | 0.1221 | 15.57 KB |
| CallCustomChunk | 7 | 4.501 us | 0.0877 us | 0.1009 us | 3.7994 | 0.3128 | 23.27 KB |
| CallChunk | 43 | 6.835 us | 0.1349 us | 0.1385 us | 1.6251 | 0.0534 | 10 KB |
| CallCustomChunk | 43 | 3.436 us | 0.0654 us | 0.0778 us | 2.7618 | 0.1717 | 16.91 KB |
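One plausible contributor to the extra allocation in the tables above is that CustomChunk copies the entire source into a Queue<T> before producing its first chunk. The sketch below shows a hypothetical alternative that reads the source through its enumerator and never buffers more than one chunk. The names `StreamingChunk` and `Iterate` are invented here; this is not the framework's implementation, and it has not been benchmarked, so it makes no claim about speed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

foreach (int[] chunk in StreamingChunk(Enumerable.Range(1, 7), 3))
{
    Console.WriteLine(string.Join(", ", chunk));
}
// Prints:
// 1, 2, 3
// 4, 5, 6
// 7

// Hypothetical helper: validates eagerly, then hands off to the iterator.
static IEnumerable<T[]> StreamingChunk<T>(IEnumerable<T> source, int size)
{
    if (source == null)
    {
        throw new ArgumentNullException(nameof(source));
    }
    if (size < 1)
    {
        throw new ArgumentOutOfRangeException(nameof(size));
    }
    return Iterate(source, size);
}

// Pulls elements one at a time; only the current chunk is held in memory.
static IEnumerable<T[]> Iterate<T>(IEnumerable<T> source, int size)
{
    using IEnumerator<T> e = source.GetEnumerator();
    while (e.MoveNext())
    {
        var buffer = new T[size];
        buffer[0] = e.Current;
        int count = 1;
        while (count < size && e.MoveNext())
        {
            buffer[count++] = e.Current;
        }
        // The source ran out mid-chunk: trim the array to the elements actually read.
        if (count < size)
        {
            Array.Resize(ref buffer, count);
        }
        yield return buffer;
    }
}
```

One trade-off of this sketch: it always allocates a full `size` array first, so a very large `size` over a short source wastes memory until the final trim.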
The benchmark code is available here.