Gapotchenko.FX.Runtime.CompilerServices.Intrinsics
2024.2.5
dotnet add package Gapotchenko.FX.Runtime.CompilerServices.Intrinsics --version 2024.2.5
NuGet\Install-Package Gapotchenko.FX.Runtime.CompilerServices.Intrinsics -Version 2024.2.5
<PackageReference Include="Gapotchenko.FX.Runtime.CompilerServices.Intrinsics" Version="2024.2.5" />
paket add Gapotchenko.FX.Runtime.CompilerServices.Intrinsics --version 2024.2.5
#r "nuget: Gapotchenko.FX.Runtime.CompilerServices.Intrinsics, 2024.2.5"
// Install Gapotchenko.FX.Runtime.CompilerServices.Intrinsics as a Cake Addin
#addin nuget:?package=Gapotchenko.FX.Runtime.CompilerServices.Intrinsics&version=2024.2.5
// Install Gapotchenko.FX.Runtime.CompilerServices.Intrinsics as a Cake Tool
#tool nuget:?package=Gapotchenko.FX.Runtime.CompilerServices.Intrinsics&version=2024.2.5
Overview
The module allows you to define and compile intrinsic functions. They can be used in hardware-accelerated implementations of various performance-sensitive algorithms.
Example
Suppose we are trying to fix the performance bottleneck in the following algorithm:
class BitOperations
{
    // Returns the base 2 logarithm of a specified number.
    public static int Log2_Trivial(uint value)
    {
        int r = 0;
        while ((value >>= 1) != 0)
            ++r;
        return r;
    }
}
log₂ seems to be a trivial operation, but it often becomes a serious bottleneck in path-finding or cryptographic algorithms. We can do better here by switching to a table lookup:
class BitOperations
{
    // "Bit Twiddling Hacks" by Sean Eron Anderson:
    // http://graphics.stanford.edu/~seander/bithacks.html
    static readonly int[] m_Log2DeBruijn32 =
    {
        0, 9, 1, 10, 13, 21, 2, 29,
        11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7,
        19, 27, 23, 6, 26, 5, 4, 31
    };

    public static int Log2_DeBruijn(uint value)
    {
        // Round down to one less than a power of 2.
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        var index = (value * 0x07C4ACDDU) >> 27;
        return m_Log2DeBruijn32[index];
    }
}
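Before benchmarking a faster variant, it can help to confirm that it returns the same results as the reference. The following is a minimal sketch of such a check, not part of the module: the class name `Log2Check` and the helper `ImplementationsAgree` are hypothetical, and the multiplication is wrapped in `unchecked` on the assumption that the project might be compiled with overflow checking enabled.

```csharp
using System;

// Hypothetical helper class for illustration; it is not part of Gapotchenko.FX.
static class Log2Check
{
    // Same lookup table as in the example above.
    static readonly int[] Log2DeBruijn32 =
    {
        0, 9, 1, 10, 13, 21, 2, 29,
        11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7,
        19, 27, 23, 6, 26, 5, 4, 31
    };

    // Reference implementation: counts how many times the value can be shifted right.
    public static int Log2Trivial(uint value)
    {
        int r = 0;
        while ((value >>= 1) != 0)
            ++r;
        return r;
    }

    // Table-based implementation.
    public static int Log2DeBruijn(uint value)
    {
        // Set every bit below the most significant set bit.
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        // The product is meant to wrap around modulo 2^32, hence 'unchecked'.
        var index = unchecked(value * 0x07C4ACDDU) >> 27;
        return Log2DeBruijn32[index];
    }

    // Returns true when both implementations agree on every power of two
    // and on its immediate neighbors.
    public static bool ImplementationsAgree()
    {
        for (int bit = 0; bit < 32; ++bit)
        {
            uint power = 1u << bit;
            foreach (uint value in new[] { power - 1, power, power + 1 })
            {
                if (Log2Trivial(value) != Log2DeBruijn(value))
                    return false;
            }
        }
        return true;
    }
}
```

Calling `Log2Check.ImplementationsAgree()` from a unit test or at application startup gives a quick sanity check. Note that both implementations return 0 for an input of 0, even though the logarithm of zero is mathematically undefined.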
Here are the execution times of the two implementations (lower is better):
Method | Mean | Error | StdDev |
---|---|---|---|
Log2_Trivial | 4.587 ns | 0.0325 ns | 0.0288 ns |
Log2_DeBruijn | 1.256 ns | 0.0068 ns | 0.0063 ns |
This is a vast improvement over the previous version, but we can do even better.
Meet the Intel 80386, a 32-bit microprocessor introduced in 1985.
It brought the Bit Scan Reverse (BSR) instruction, which does exactly what we want to achieve with Log2 in just a small fraction of the CPU cycles.
Chances are that your machine runs on a descendant of that influential CPU, be it an AMD Ryzen or an Intel Core.
So how can we use the low-level BSR instruction from high-level .NET?
This is why the Gapotchenko.FX.Runtime.CompilerServices.Intrinsics class exists.
It provides the ability to define an intrinsic implementation of a method with MachineCodeIntrinsicAttribute.
Let's see how:
using Gapotchenko.FX.Runtime.CompilerServices;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

class BitOperations
{
    // Use static constructor to ensure that intrinsic methods are initialized (compiled) before they can be used
    static BitOperations() => Intrinsics.InitializeType(typeof(BitOperations));

    static readonly int[] m_Log2DeBruijn32 =
    {
        0, 9, 1, 10, 13, 21, 2, 29,
        11, 14, 16, 18, 22, 25, 3, 30,
        8, 12, 20, 28, 15, 17, 24, 7,
        19, 27, 23, 6, 26, 5, 4, 31
    };

    // Define machine code intrinsic for the method
    [MachineCodeIntrinsic(Architecture.X64, 0x0f, 0xbd, 0xc1)] // BSR EAX, ECX
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Log2_Intrinsic(uint value)
    {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        var index = (value * 0x07C4ACDDU) >> 27;
        return m_Log2DeBruijn32[index];
    }
}
The Log2_Intrinsic method carries a custom attribute that provides the machine code for the BSR EAX, ECX instruction.
Machine code is tied to a CPU architecture, and this is reflected in the attribute as well.
Please note that, besides using MachineCodeIntrinsicAttribute to define intrinsic method implementations, the BitOperations class should use a static constructor to ensure that the corresponding methods are initialized (compiled) before they are called.
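To see how the three bytes in the attribute map to BSR EAX, ECX, the sketch below decodes them by hand. It relies on standard x86 encoding: 0F BD is the opcode of BSR r32, r/m32, and the third byte is a ModRM byte whose reg field names the destination and whose rm field names the source. The class `ModRmDecoder` and its methods are hypothetical illustration code, not part of the module, and only the register-to-register form is handled.

```csharp
using System;

// Hypothetical decoder for illustration; it is not part of Gapotchenko.FX.
static class ModRmDecoder
{
    // Names of the 32-bit general-purpose registers in x86 encoding order.
    static readonly string[] Registers32 =
    {
        "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"
    };

    // Splits a ModRM byte into its bit fields:
    // mod (bits 7-6), reg (bits 5-3) and rm (bits 2-0).
    public static (int Mod, int Reg, int Rm) Decode(byte modRm) =>
        (modRm >> 6, (modRm >> 3) & 7, modRm & 7);

    // Describes a register-to-register BSR encoded as 0F BD /r.
    public static string DescribeBsr(byte[] code)
    {
        if (code.Length != 3 || code[0] != 0x0f || code[1] != 0xbd)
            throw new ArgumentException("Expected the three-byte form 0F BD /r.", nameof(code));

        var (mod, reg, rm) = Decode(code[2]);
        if (mod != 3)
            throw new NotSupportedException("Only the register-direct form (mod = 11b) is handled.");

        // For BSR, the reg field is the destination and the rm field is the source.
        return $"BSR {Registers32[reg]}, {Registers32[rm]}";
    }
}
```

`ModRmDecoder.DescribeBsr(new byte[] { 0x0f, 0xbd, 0xc1 })` returns "BSR EAX, ECX": the byte 0xC1 is 11 000 001 in binary, so mod = 3, reg = 0 (EAX), and rm = 1 (ECX). In the Microsoft x64 calling convention, the first integer argument is passed in ECX and an integer result is returned in EAX, which matches these operands. Whether the module relies on exactly that convention is an assumption here.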
Here are the execution times of all three implementations (lower is better):
Method | Mean | Error | StdDev |
---|---|---|---|
Log2_Trivial | 4.587 ns | 0.0325 ns | 0.0288 ns |
Log2_DeBruijn | 1.256 ns | 0.0068 ns | 0.0063 ns |
Log2_Intrinsic | 1.038 ns | 0.0660 ns | 0.0947 ns |
Log2_Intrinsic is a clear winner.
The intrinsic compiler may or may not apply the machine code to a method, depending on the current app host environment. When an intrinsic is not applied, the original method implementation is used, providing a graceful, albeit less performant, fallback.
Commonly Used Types
Gapotchenko.FX.Runtime.CompilerServices.IntrinsicServices
Gapotchenko.FX.Runtime.CompilerServices.MachineCodeIntrinsicAttribute
Other Modules
Let's continue with a look at some other modules provided by Gapotchenko.FX:
- Gapotchenko.FX
- Gapotchenko.FX.AppModel.Information
- Gapotchenko.FX.Collections
- Gapotchenko.FX.Console
- Gapotchenko.FX.Data
- Gapotchenko.FX.Diagnostics
- Gapotchenko.FX.IO
- Gapotchenko.FX.Linq
- Gapotchenko.FX.Math
- Gapotchenko.FX.Memory
- Gapotchenko.FX.Numerics ✱
- Gapotchenko.FX.Reflection.Loader ✱
- ➴ Gapotchenko.FX.Runtime.CompilerServices.Intrinsics ✱✱
- Gapotchenko.FX.Runtime.InteropServices ✱
- Gapotchenko.FX.Security.Cryptography
- Gapotchenko.FX.Text
- Gapotchenko.FX.Threading
- Gapotchenko.FX.Tuples
Symbol ✱ denotes an advanced module.
Symbol ✱✱ denotes an expert module.
Or take a look at the full list of modules.
Product | Versions |
---|---|
.NET | net5.0 is compatible. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 is compatible. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
.NET Framework | net461 is compatible. net462 was computed. net463 was computed. net47 was computed. net471 is compatible. net472 is compatible. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETCoreApp 2.1
  - Gapotchenko.FX (>= 2024.2.5)
- .NETCoreApp 3.0
  - Gapotchenko.FX (>= 2024.2.5)
- .NETFramework 4.6.1
  - Gapotchenko.FX (>= 2024.2.5)
- .NETFramework 4.7.1
  - Gapotchenko.FX (>= 2024.2.5)
- .NETFramework 4.7.2
  - Gapotchenko.FX (>= 2024.2.5)
- .NETStandard 2.0
  - Gapotchenko.FX (>= 2024.2.5)
- .NETStandard 2.1
  - Gapotchenko.FX (>= 2024.2.5)
- net5.0
  - Gapotchenko.FX (>= 2024.2.5)
- net6.0
  - Gapotchenko.FX (>= 2024.2.5)
- net7.0
  - Gapotchenko.FX (>= 2024.2.5)
- net8.0
  - Gapotchenko.FX (>= 2024.2.5)
- net9.0
  - Gapotchenko.FX (>= 2024.2.5)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Gapotchenko.FX.Runtime.CompilerServices.Intrinsics:
Package | Downloads |
---|---|
Gapotchenko.FX.Numerics: The module provides hardware-accelerated operations for numeric data types. |
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
2024.2.5 | 267 | 12/31/2024 |
2024.1.3 | 298 | 11/10/2024 |
2022.2.7 | 6,266 | 5/1/2022 |
2022.2.5 | 1,784 | 5/1/2022 |
2022.1.4 | 1,756 | 4/6/2022 |
2021.2.21 | 1,088 | 1/21/2022 |
2021.2.20 | 988 | 1/17/2022 |
2021.1.5 | 743 | 7/6/2021 |
2020.2.2-beta | 491 | 11/21/2020 |
2020.1.15 | 888 | 11/5/2020 |
2020.1.9-beta | 550 | 7/14/2020 |
2020.1.8-beta | 543 | 7/14/2020 |
2020.1.7-beta | 575 | 7/14/2020 |
2020.1.1-beta | 639 | 2/11/2020 |
2019.3.7 | 892 | 11/4/2019 |
2019.2.20 | 943 | 8/13/2019 |