CUDA/Ada is an Ada binding to NVIDIA’s CUDA parallel computing platform and programming model. The project was developed as part of the master seminar "Program Analysis and Transformation" at the University of Applied Sciences Rapperswil.

Licence

Copyright (C) 2011, 2012 Reto Buerki <reet@codelabs.ch>
Copyright (C) 2011, 2012 Adrian-Ken Rueegsegger <ken@codelabs.ch>
University of Applied Sciences Rapperswil

This program is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.

Documentation

The paper about CUDA/Ada can be found here: http://www.codelabs.ch/cuda-ada/cuda-ada-article.pdf.

Download

Release version

The current release version of CUDA/Ada is available at http://www.codelabs.ch/download/.

Verify a Release

To verify the integrity and authenticity of the distribution tarball, import the key http://www.codelabs.ch/keys/0xBB793815pub.asc and type the following command:

$ gpg --verify libcudaada-{version}.tar.bz2.sig

The key fingerprint of the public key (0xBB793815) is:

Key fingerprint = A2FB FF56 83FB 67D8 017B  C50C F8C5 F8B5 BB79 3815

Development version

The current development version of CUDA/Ada is available through its git repository:

$ git clone http://git.codelabs.ch/git/cuda-ada.git

A browsable version of the repository is also available here: http://git.codelabs.ch/?p=cuda-ada.git.

Build

To compile CUDA/Ada on your system, you need to have the following software installed:

Testing

CUDA/Ada contains a unit test suite, which can be run with the following command:

$ make tests

Running the tests requires a CUDA-capable GPU.

Benchmarks

CUDA/Ada provides benchmarking code that measures matrix addition and multiplication in Ada, CUDA/Ada and native CUDA C. The benchmarks can be run by issuing the following command:

$ make perf COUNT=20

This prints the cumulative execution times of twenty successive matrix operations for each implementation.
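
The reported times are sums over all runs of an operation. As a rough illustration of that measurement, the following Python sketch times a number of successive runs and adds up the per-run durations. It is not the project's benchmark code: the names `matrix_add` and `cumulated_time` are hypothetical, and the pure-Python matrix addition merely stands in for the Ada, CUDA/Ada and CUDA C implementations.

```python
import time


def matrix_add(a, b):
    """Placeholder operation: element-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def cumulated_time(operation, count):
    """Run `operation` `count` times and return the summed wall-clock time."""
    total = 0.0
    for _ in range(count):
        start = time.perf_counter()
        operation()
        total += time.perf_counter() - start
    return total


if __name__ == "__main__":
    size = 64  # small hypothetical size to keep the sketch fast
    a = [[1.0] * size for _ in range(size)]
    b = [[2.0] * size for _ in range(size)]
    elapsed = cumulated_time(lambda: matrix_add(a, b), count=20)
    print(f"20 matrix additions: {elapsed:.6f} s")
```

Summing per-run durations, rather than timing the whole loop at once, keeps loop overhead out of the result; whether the actual benchmarks do the same is not stated here.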

CUDA/Ada performance

The chart shows the cumulative execution times (in seconds) of multiplying 512 by 512 matrices 20 times. All CUDA implementations used the same kernel, a grid size of 32 and a block size of 16.
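
The launch configuration matches the matrix size exactly if the grid and block sizes are read as per-dimension values. That reading is an assumption, since the text does not say so: 32 blocks of 16 threads give 32 × 16 = 512 threads along each axis, one per matrix row or column. The short Python check below spells out this arithmetic; all names in it are illustrative.

```python
# Assumption: "grid size 32" and "block size 16" are per-dimension values,
# i.e. a 32 x 32 grid of 16 x 16 thread blocks.
GRID_DIM = 32
BLOCK_DIM = 16
MATRIX_DIM = 512


def threads_per_axis(grid_dim, block_dim):
    """Number of threads along one axis of the launch."""
    return grid_dim * block_dim


def total_threads(grid_dim, block_dim):
    """Number of threads in a square two-dimensional launch."""
    return threads_per_axis(grid_dim, block_dim) ** 2


if __name__ == "__main__":
    print(threads_per_axis(GRID_DIM, BLOCK_DIM))  # 512
    print(total_threads(GRID_DIM, BLOCK_DIM))     # 262144
```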

Slides

The slides of the presentation about CUDA/Ada held on January 12, 2012 as part of the master seminar at the University of Applied Sciences Rapperswil can be downloaded here.

Example

--
--  Copyright (C) 2011 Reto Buerki <reet@codelabs.ch>
--  Copyright (C) 2011 Adrian-Ken Rueegsegger <ken@codelabs.ch>
--  University of Applied Sciences Rapperswil
--
--  This program is free software: you can redistribute it and/or modify
--  it under the terms of the GNU General Public License as published by
--  the Free Software Foundation, either version 3 of the License, or
--  (at your option) any later version.
--
--  This program is distributed in the hope that it will be useful,
--  but WITHOUT ANY WARRANTY; without even the implied warranty of
--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
--  GNU General Public License for more details.
--
--  You should have received a copy of the GNU General Public License
--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
--

with Ada.Text_IO;
with Ada.Numerics.Real_Arrays;

with CUDA.Autoinit;
with CUDA.Compiler;

pragma Unreferenced (CUDA.Autoinit);

procedure Add
is
   use CUDA;

   package Real_Vector_Args is new Compiler.Arg_Creators
     (Data_Type => Ada.Numerics.Real_Arrays.Real_Vector);
   use Real_Vector_Args;

   N : constant := 32 * 1024;

   A      : Ada.Numerics.Real_Arrays.Real_Vector         := (1 .. N => 2.0);
   B      : Ada.Numerics.Real_Arrays.Real_Vector         := (1 .. N => 2.0);
   C      : aliased Ada.Numerics.Real_Arrays.Real_Vector := (1 .. N => 0.0);
   Src    : Compiler.Source_Module_Type;
   Func   : Compiler.Function_Type;
   Module : Compiler.Module_Type;
begin
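   --  Build the kernel source: the preamble defines N for the CUDA C code,
   --  the operation holds the kernel itself. Each block starts at its own
   --  index and advances by the grid size until all N elements are added.
   Src := Compiler.Create
     (Preamble  => "#define N" & N'Img,
      Operation => "__global__ void add( float *a, float *b, float *c ) {" &
      "   int tid = blockIdx.x;"                                           &
      "   while (tid < N) {"                                               &
      "        c[tid] = a[tid] + b[tid];"                                  &
      "        tid += gridDim.x;"                                          &
      "}}");

   --  Compile the source and look up the kernel by name.
   Module := Compiler.Compile (Source => Src);
   Func   := Compiler.Get_Function (Module => Module,
                                    Name   => "add");

   --  Call the kernel: A and B are inputs, C receives the result.
   Func.Call
     (Args =>
        (1 => In_Arg (Data => A),
         2 => In_Arg (Data => B),
         3 => Out_Arg (Data => C'Access)));

   --  Check the result on the host.
   for I in C'Range loop
      if C (I) /= A (I) + B (I) then
         raise Program_Error with "Results mismatch";
      end if;
   end loop;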

   Ada.Text_IO.Put_Line ("Calculation successful");
end Add;
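
The kernel in this example uses only blockIdx.x and gridDim.x: each block starts at the element matching its block index and then advances by the number of blocks until it reaches N. The Python sketch below imitates this access pattern on the host to show that every element is handled exactly once, whatever the grid size. It is a simulation only, not CUDA/Ada code; the function names and the grid size of 128 are assumptions chosen for illustration, as the example does not state the launch configuration.

```python
N = 32 * 1024


def block_indices(block_idx, grid_dim, n=N):
    """Indices one block visits: start at block_idx, step by grid_dim."""
    tid = block_idx
    visited = []
    while tid < n:
        visited.append(tid)
        tid += grid_dim
    return visited


def simulate_add(a, b, grid_dim):
    """Run all blocks one after another; on a GPU they would run in parallel."""
    n = len(a)
    c = [0.0] * n
    for block_idx in range(grid_dim):
        for tid in block_indices(block_idx, grid_dim, n):
            c[tid] = a[tid] + b[tid]
    return c


if __name__ == "__main__":
    a = [2.0] * N
    b = [2.0] * N
    c = simulate_add(a, b, grid_dim=128)  # 128 is a hypothetical grid size
    assert c == [4.0] * N
    print("Calculation successful")
```

Because the stride equals the number of blocks, the index sets of different blocks never overlap, so the blocks can write to c without any synchronisation.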