Introduction
Have you ever used LINQ? If you are a .NET developer, I'm sure you have, as it is probably the most well-known way to work with a data source. Sometimes your LINQ operations take more time than you expect. One possible solution is something called PLINQ. The letter "P" stands for one of the most powerful words in computer science (drumroll): PARALLEL. In this article I will show you how you can speed up a LINQ query by using PLINQ.
LINQ
If you have never used LINQ, let me give you a quick and simple explanation. Long story short, LINQ is a library that helps you write queries against a data source. You can use it with any collection of objects that supports IEnumerable or the generic IEnumerable<T> interface. There are two syntaxes you can use: query syntax and method syntax. To be honest, I find method syntax much more readable, but of course it's up to you which one you use (here are some docs regarding the syntax). LINQ offers many operations, such as filtering, ordering, grouping, joining and so on.
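To make the two syntaxes concrete, here is a minimal sketch that writes the same query both ways. The numbers array and the variable names are made up for illustration, and the snippet assumes C# 9 or later so that top-level statements are available.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, used only for this illustration.
int[] numbers = { 5, 3, 8, 1, 4 };

// Query syntax: reads a bit like SQL.
var queryResult = from n in numbers
                  where n > 2
                  orderby n
                  select n * 10;

// Method syntax: the same query as a chain of extension methods.
var methodResult = numbers
    .Where(n => n > 2)
    .OrderBy(n => n)
    .Select(n => n * 10);

Console.WriteLine(string.Join(", ", queryResult));  // 30, 40, 50, 80
Console.WriteLine(string.Join(", ", methodResult)); // 30, 40, 50, 80
```

Both forms compile down to the same method calls, so the choice is mostly a matter of taste.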
PLINQ
I won't try to give my own explanation, as the one from the docs is neat and concise: “Parallel LINQ (PLINQ) is a parallel implementation of the Language-Integrated Query (LINQ) pattern. PLINQ implements the full set of LINQ standard query operators as extension methods for the System.Linq namespace and has additional operators for parallel operations. PLINQ combines the simplicity and readability of LINQ syntax with the power of parallel programming.”
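Anything parallel sounds like a potential performance booster, but bear in mind the cost of parallelization: partitioning the source, scheduling work on multiple threads and merging the results all take time. It is entirely possible that a PLINQ query will run slower than its LINQ counterpart, especially when the collection is small or the work per element is cheap (I encourage you to read Understanding Speedup in PLINQ).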
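The overhead is easiest to see on a tiny workload. The following is only a rough sketch, not a proper benchmark: the array size, the trivial x + 1 projection and the variable names are assumptions chosen for illustration, and the timings will vary from machine to machine, so no expected numbers are shown.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical tiny workload: 1,000 elements with almost no work per element.
int[] small = Enumerable.Range(1, 1_000).ToArray();

// Warm up both paths so that JIT compilation is not counted in the timings.
_ = small.Select(x => x + 1).Sum();
_ = small.AsParallel().Select(x => x + 1).Sum();

var sw = Stopwatch.StartNew();
int sequentialSum = small.Select(x => x + 1).Sum();
sw.Stop();
TimeSpan sequentialTime = sw.Elapsed;

sw.Restart();
int parallelSum = small.AsParallel().Select(x => x + 1).Sum();
sw.Stop();
TimeSpan parallelTime = sw.Elapsed;

// Both queries compute the same value; only the elapsed time differs.
Console.WriteLine($"LINQ:  sum = {sequentialSum}, {sequentialTime.TotalMilliseconds:F3} ms");
Console.WriteLine($"PLINQ: sum = {parallelSum}, {parallelTime.TotalMilliseconds:F3} ms");
```

On a workload this small the parallel version often comes out no faster, and sometimes slower, because coordinating the threads costs more than the work itself.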
LINQ vs PLINQ – benchmark
Let’s consider the following code snippets:
public double SelectLINQ() =>
    _list
        .Select(x => Math.Sqrt(x))
        .Select(x => Math.Tan(x))
        .Select(x => Math.Cos(x))
        .Select(x => Math.Sin(x))
        .Select(x => Math.Log10(x))
        .Max();

public double SelectPLINQ(int degree) =>
    _list
        .AsParallel()
        .WithDegreeOfParallelism(degree)
        .Select(x => Math.Sqrt(x))
        .Select(x => Math.Tan(x))
        .Select(x => Math.Cos(x))
        .Select(x => Math.Sin(x))
        .Select(x => Math.Log10(x))
        .Max();
As you can see, switching to PLINQ is as simple as adding the .AsParallel() invocation. WithDegreeOfParallelism is the PLINQ-specific part: it caps the number of concurrently executing tasks, and so in effect the number of processors, that PLINQ may use to parallelize the query. Both methods compute exactly the same result. Let's look at the benchmark:
The benchmark was run on a collection of 10_000_000 elements and for different degrees of parallelism (the degree of parallelism concerns only PLINQ). With PLINQ at the best degree of parallelism, the query runs almost three times faster (1420.2 ms vs 496.9 ms). My machine has 4 physical and 4 logical cores (no hyperthreading), so it makes sense that degree = 4 gives the best result.
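If you want to try a comparison like this without setting up a benchmark project, a Stopwatch-based sketch along the following lines can give a first impression. It is not the harness used for the numbers above. The list contents, the smaller collection size, the degrees 1, 2 and 4, and the TimeMs helper are all assumptions made for illustration, and a dedicated benchmarking tool will produce far more reliable measurements.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Assumption: the input is a list of doubles; a smaller size keeps the sketch quick to run.
List<double> list = Enumerable.Range(1, 1_000_000).Select(i => (double)i).ToList();

double SelectLinq() =>
    list
        .Select(x => Math.Sqrt(x))
        .Select(x => Math.Tan(x))
        .Select(x => Math.Cos(x))
        .Select(x => Math.Sin(x))
        .Select(x => Math.Log10(x))
        .Max();

double SelectPlinq(int degree) =>
    list
        .AsParallel()
        .WithDegreeOfParallelism(degree)
        .Select(x => Math.Sqrt(x))
        .Select(x => Math.Tan(x))
        .Select(x => Math.Cos(x))
        .Select(x => Math.Sin(x))
        .Select(x => Math.Log10(x))
        .Max();

// Hypothetical helper: runs the query once and returns the elapsed milliseconds.
double TimeMs(Func<double> query, out double result)
{
    var sw = Stopwatch.StartNew();
    result = query();
    sw.Stop();
    return sw.Elapsed.TotalMilliseconds;
}

double linqMs = TimeMs(SelectLinq, out double linqResult);
Console.WriteLine($"LINQ:               {linqMs:F1} ms");

var plinqResults = new List<double>();
foreach (int degree in new[] { 1, 2, 4 })
{
    double plinqMs = TimeMs(() => SelectPlinq(degree), out double plinqResult);
    plinqResults.Add(plinqResult);
    Console.WriteLine($"PLINQ (degree = {degree}): {plinqMs:F1} ms");
}
```

A single run like this includes JIT and thread pool warm-up, so treat the output as a rough indication only and repeat the measurement several times before drawing conclusions.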
Summary
To sum up, PLINQ is an interesting option when you want to reduce LINQ execution time. The benchmark shows that such a tiny code adjustment can make a query run almost three times faster. Keep in mind that PLINQ is not always the answer: it introduces additional overhead that can make performance worse. PLINQ works well as long as it is used correctly. Probably the best approach is to try PLINQ and measure your own code with a simple benchmark.
As always, I've prepared an example and you can find it on my GitHub, project: LINQ-vs-PLINQ, test project: LINQ-vs-PLINQ-tests, benchmark project: LINQ-vs-PLINQ-benchmark. I encourage you to go through the code, debug it and test your own scenarios.
Have a nice day, bye!