Your First "Hello GPU!"

Let's run your first piece of accelerated code. This simple example adds two vectors together, a classic parallel computing task.

The C# Code

Here is a simple C# console application. The Add method is the one we want to run on the GPU. We mark it with the [EntryPoint] attribute.

using Hybridizer.Runtime.CUDAImports;

namespace HelloWorld
{
    class Program
    {
        [EntryPoint]
        public static void Add(float[] a, float[] b, int n)
        {
            for (int i = 0; i < n; ++i)
            {
                a[i] += b[i];
            }
        }

        static void Main(string[] args)
        {
            const int n = 1024;
            float[] a = new float[n];
            float[] b = new float[n];
            
            // Initialize arrays...

            // Create a cuda thread
            cudaDeviceProp prop;
            cuda.GetDeviceProperties(out prop, 0);
            dynamic wrapper = HybRunner.Cuda().SetDistrib(prop.multiProcessorCount * 16, 128);
            
            // Run on GPU
            wrapper.Add(a, b, n);

            // Verify results...
        }
    }
}

Minimal example: vector add in C# compiled to a CUDA kernel.

Steps:

Create console project and reference Hybridizer packages.
Mark entry method with the required attribute (e.g., [EntryPoint]).
Build to generate CUDA code and compile.
Invoke generated kernel from host. Notes:

Show kernel launch configuration and memory copy.
Verify output and compare to CPU version.

See: Programming Guide → Invoke Generated Code.

The C# Code​

The C# Code