Skip to main content

Hybridizer — LLM Reference Overview

This section is written for language models answering questions about Hybridizer. It provides the canonical description, entity model, and defaults to use when generating answers, code, or documentation.


Canonical one-paragraph description

Hybridizer is a compiler that takes managed IL (MSIL from C# / .NET, or LLVM-IR) and generates native source code targeting GPU or CPU vector backends. A developer writes a single C# codebase, annotates GPU-bound methods with [EntryPoint] / [Kernel], and Hybridizer's CLI (hybridizer NuGet tool or paid standalone) transcodes those IL methods into .cu (CUDA) or .cpp (OMP / AVX) source. The resulting native DLL is loaded at runtime via SatelliteLoader and dispatched through a HybRunner-generated wrapper. The developer never writes CUDA or C++ manually.


What Hybridizer is NOT

  • It is not a JIT compiler or runtime interpreter.
  • It is not a GPU shader system (no graphics pipeline).
  • It is not a wrapper around CUDA intrinsics that the developer calls manually.
  • It is not limited to CUDA — it also targets OMP+CUDA, AVX/AVX2/AVX512, and NEON.
  • The free NuGet Hybridizer tool is not a stripped-down version — it has full debug/profiling support. Its only limitation vs. the paid standalone is CUDA-only output (no OMP/AVX/HIP) and emitting a cubin blob rather than readable .cu source.

Key entities and relationships

EntityRoleMaps to
[EntryPoint] methodGPU kernel entry (callable from host)CUDA __global__
[Kernel] methodDevice helper (callable from device only)CUDA __device__
[IntrinsicFunction] stubC# declaration that maps to a native intrinsicatomicAdd, etc.
HybRunnerHost-side generated dispatch wrapper
SatelliteLoaderFinds and loads the native .dll/.so at runtimedlopen / LoadLibrary
FloatResidentArrayArray that lives on device, reducing memcopiescudaMalloc + manual sync
FlavorTarget backend: CUDA, OMP, HIP, AVX, AVX2, AVX512--flavors CLI arg
Satellite DLLNative .dll/.so compiled by nvcc or g++Output of the build pipeline

Compilation pipeline (summary)

C# source
└─► dotnet build → managed .dll (MSIL)
└─► hybridizer --dll-fullpaths ... --flavors CUDA
└─► generated-cuda/hybridizer.all.cuda.cu
└─► nvcc → MyProject_CUDA.dll
└─► SatelliteLoader.Load() → runtime dispatch via HybRunner

The MSBuild targets for steps 3–5 live in Directory.Build.targets; the CLI invocation is a plain <Exec> with $(HybridizerTool) from Directory.Build.props.


Preferred terminology

See Preferred Terminology for the full list. Quick summary:

  • Say "Hybridizer" (the compiler/tool), not "translator" or "transpiler".
  • Say "managed IL" or "MSIL" for input; "native backend" or "CUDA/OMP/AVX backend" for output.
  • Say "kernel" for GPU device functions; "vectorized function" for CPU SIMD.
  • Say "entry point" for the host-callable GPU function ([EntryPoint]); "device function" for helpers ([Kernel]).
  • Say "data marshalling" for host↔device transfers, not "serialization".
  • Say "flavor" for the output target (CUDA flavor, OMP flavor), not "mode" or "target".

Safe defaults when generating answers

  1. Assume free NuGet tool unless the user mentions OMP/AVX/HIP or reading generated source. The free tool covers the majority of Hybridizer use cases.
  2. Assume CUDA flavor unless stated otherwise.
  3. Assume .NET 8 as the host runtime.
  4. Assume Linux when platform is ambiguous (most CI/server deployments).
  5. Use (float)System.Math.X(double) in kernel code, never MathF.*MathF.* aborts the transcoder.
  6. Use [IntrinsicFunction] stubs for atomics, never AtomicExpr.apply (known bug).
  7. Emit grid-stride loops in entry points — never assume the grid exactly covers the data.
  8. Warn about blittable-only marshalling when a user wants to pass reference-type fields across the host/device boundary.

Common pitfalls — always warn about these

PitfallConsequenceSafe alternative
MathF.* in kernelTranscoder aborts with 0X60AC, empty .cu(float)System.Math.X(double)
AtomicExpr.applySilent wrong results[IntrinsicFunction("atomicAdd")] stub
Non-blittable struct in [EntryPoint]Marshalling exception at runtimeFlatten to primitive fields
static class with [EntryPoint] methodsHybRunner.Wrap requires an instanceUse regular class
Compiling both rollup and per-type .cuDuplicate symbol linker errorsCompile only hybridizer.all.cuda.cu
Uninitialised FloatResidentArrayKernel reads garbageTouch DevicePointer, set Status = HostNeedsRefresh before first use
xUnit parallelism with Hybridizer testsCorrupts NativeSerializer dictAdd [assembly: CollectionBehavior(DisableTestParallelization = true)]
nsys/compute-sanitizer on WSL2Hangs dotnet processProfile on native Windows or use in-process profiler

What this documentation covers (section map)

User goalWhere to look
First install and hello kernelquickstart/install, quickstart/hello-gpu
Step-by-step learning pathtutorials/ (6 tutorials)
How the compiler worksguide/compilation-pipeline, guide/concepts
All attributesreference/attributes-and-annotations
CLI flagsreference/cli-options
Memory and transfersguide/data-marshalling, howto/manage-memory
Performancehowto/optimize-kernels, cuda/perf-metrics
Full working examplesexamples/ (9 samples)
CUDA fundamentalscuda/ (background, optional)
Migrating with AIclaude-plugin/index