Skip to main content

Glossary

Definitions of key terms used throughout the Hybridizer documentation.

A

AVX (Advanced Vector Extensions)

SIMD instruction set for x86 processors enabling 256-bit (AVX/AVX2) or 512-bit (AVX512) vector operations.

B

Block

A group of threads in CUDA that can share memory and synchronize. Maps to work-group in OpenCL.

Blittable

A data type with identical layout in managed and native memory, allowing direct memory copy without marshalling.

C

Coalescence

Memory access pattern where consecutive threads access consecutive memory addresses, enabling efficient GPU memory transactions.

CUDA

Compute Unified Device Architecture — NVIDIA's parallel computing platform and programming model.

D

Device

The GPU or accelerator that executes kernels. Contrasts with "Host" (CPU).

Device Function

A function executed on the device, callable from other device code. Marked with [Kernel] in Hybridizer.

E

Entry Point

A method marked for GPU execution, callable from host code. Marked with [EntryPoint] in Hybridizer.

F

Flavor

The target output format of the Hybridizer (CUDA, OMP, AVX, etc.). Each flavor generates code for a specific backend.

G

Global Memory

Main GPU memory accessible by all threads. High capacity but high latency.

Grid

The collection of all blocks launched for a kernel execution.

Grid-Stride Loop

A loop pattern where each thread processes multiple elements by striding through the data.

H

Host

The CPU that launches kernels and manages device memory.

HybRunner

The runtime class for invoking Hybridizer-generated kernels.

I

IL (Intermediate Language)

The bytecode format compiled from high-level languages. MSIL for .NET.

Intrinsic

A function or constant that maps directly to a hardware-specific operation.

K

Kernel

In Hybridizer, a device function marked with [Kernel]. In CUDA, synonymous with entry point.

M

Marshalling

The process of converting data between managed (.NET) and native (C/C++) representations.

MSIL

Microsoft Intermediate Language — the bytecode format for .NET assemblies.

N

NEON

SIMD instruction set for ARM processors, supporting 128-bit vector operations.

O

Occupancy

The ratio of active warps to maximum warps per SM. Higher occupancy can hide memory latency.

OMP (OpenMP)

A parallel programming standard for shared-memory multi-threading.

P

Phivect

Internal vector library used by Hybridizer for CPU vectorization backends.

Pinned Memory

Page-locked host memory enabling faster GPU transfers and async copies.

S

Shared Memory

Fast on-chip memory shared by threads within a block. ~100× faster than global memory.

SIMD

Single Instruction Multiple Data — executing the same operation on multiple data elements.

SIMT

Single Instruction Multiple Thread — NVIDIA's execution model where threads in a warp execute together.

SM (Streaming Multiprocessor)

The fundamental compute unit of an NVIDIA GPU. Each SM contains multiple CUDA cores.

Stream

A sequence of operations that execute in order on the GPU. Different streams can execute concurrently.

T

Template Concept

An interface marked with [HybridTemplateConcept] for C++ template specialization.

Thread

The smallest unit of parallel execution. In CUDA, threads within a warp execute in lockstep.

W

Warp

A group of 32 threads that execute the same instruction simultaneously on NVIDIA GPUs.

Work Item / Work Group

OpenCL terminology for thread and block, respectively.

Known Limitations

The following are not supported in Hybridizer kernel code:

CategoryUnsupported
MemoryHeap allocation, new for classes
Typesstring, object, dynamic
Control Flowtry/catch/finally, lock
Loopsforeach (use for instead)
GenericsGeneric methods (types OK)
RecursionDirect recursion (use interfaces)

See What is Hybridizer - Limitations for details.

Next Steps