4.2 | 4 ratings | 71 downloads (last 30 days) | File Size: 49.01 KB | File ID: #20220

Fast 2D GPU-based convolution

by Alexander Huth

 

09 Jun 2008 (Updated 16 Jul 2008)

Code covered by the BSD License  

Graphics chip assisted fast 2d convolution


File Information
Description

cudaconv - Performs 2d convolution using an NVIDIA graphics chipset.

For large datasets (~1 million elements), and especially for large kernels (run time grows only slowly with kernel size), cudaconv can outperform conv2 by as much as 5000%.
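A rough way to check a speed claim like this on your own hardware is to time both functions on the same input. The sketch below is only an illustration: the matrix and kernel sizes are assumptions, and the two-argument call `cudaconv(A, K)` is the form used in the comments on this page. The GPU timing is skipped when cudaconv is not on the path.

```matlab
% Timing sketch; sizes are assumed values, not the author's benchmark.
A = rand(1024);                 % about one million elements
K = rand(15);                   % assumed kernel size

tic;
ref = conv2(A, K, 'same');      % CPU reference
tCpu = toc;
fprintf('conv2:    %.3f s\n', tCpu);

if exist('cudaconv', 'file')    % only run when the MEX file is available
    tic;
    out = cudaconv(A, K);
    tGpu = toc;
    fprintf('cudaconv: %.3f s (%.1fx faster)\n', tGpu, tCpu / tGpu);
end
```

A single tic/toc pair is noisy, and the first GPU call may include one-time setup cost, so repeating the measurement a few times gives a fairer number.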

I did not create this algorithm; it is adapted from an example included in the CUDA SDK and wrapped in MATLAB-compatible C code.

With very large data matrices, it can *completely* crash your computer (or the graphics driver?), so beware. In testing, I found an upper limit on convolution size of roughly 2^20 elements (set either by the size the CUDA FFT function can accept or by the maximum size of a 2D texture), so above that limit the code breaks the convolution into smaller pieces. If you are feeling adventurous, feel free to raise that limit, but be aware that at those sizes cudaconv is already roughly 50-100x faster than conv2.
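The description does not say how the large convolution is split, so the following is only a sketch of one standard approach (overlap-add), not necessarily what cudaconv does internally. `blockconv2` and `blockSize` are hypothetical names, and conv2 stands in for the GPU call so the sketch runs without CUDA.

```matlab
function out = blockconv2(A, K, blockSize)
% Hypothetical helper: full 2D convolution computed tile by tile (overlap-add).
% conv2 stands in for the GPU call; blockSize is an assumed tile edge length.
[am, an] = size(A);
[km, kn] = size(K);
out = zeros(am + km - 1, an + kn - 1);
for r = 1:blockSize:am
    rows = r:min(r + blockSize - 1, am);
    for c = 1:blockSize:an
        cols = c:min(c + blockSize - 1, an);
        piece = conv2(A(rows, cols), K);   % full convolution of one tile
        % Each tile's result extends past the tile by (kernel size - 1),
        % so neighbouring results overlap and are summed.
        oRows = rows(1):(rows(end) + km - 1);
        oCols = cols(1):(cols(end) + kn - 1);
        out(oRows, oCols) = out(oRows, oCols) + piece;
    end
end
end
```

Because convolution is linear, summing the per-tile results reproduces the full convolution, so `blockconv2(A, K, 32)` should agree with `conv2(A, K)` up to floating-point round-off.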

MATLAB release: MATLAB 7.4 (R2007a)
Other requirements: To compile and run this software, you need the NVIDIA CUDA Toolkit (http://www.nvidia.com/object/cuda_get.html) and a modern NVIDIA graphics card. Tested on OS X 10.5; assumed to work under any Linux distribution; no guarantees on Windows.
Zip File Content
Other Files: cudaconv/convolutionFFT2D_kernel.cu,
cudaconv/cudaconv.cu,
cudaconv/cudaconv.m,
cudaconv/cudaconv.mexmaci,
cudaconv/cutil.h,
cudaconv/Makefile,
cudaconv/nvmex,
cudaconv/nvopts.sh,
cudaconv/README,
cudaconv/testcudaconv.m
Comments and Ratings (8)
17 Jun 2008 Alex Huth

Sorry there was a missing header file -- all should be fixed when the update is posted.

22 Sep 2008 Simon Knight

Hi, I am only getting a matrix of zeros when I run this:
>> y = rand(64);
>> f = 1/9*ones(3);
>> z1 = conv2(y,f, 'same');
>> z2 = cudaconv(y,f);
>> any(any(z1))
ans =
     1
>> any(any(z2))
ans =
     0

I am using R2007a, and have tried on OSX. Is the latest zip file supplied above the one with the corrected header file?
This stuff looks promising, so I'd be very keen to try it.
Thanks!

01 Mar 2009 Bjorn Bjorno

To solve the problem with the all-zeros output (see the previous message by Simon Knight), run the NVIDIA CUDA toolkit installer again, opt for the customized installation, and check 'CUDAKext'. After rebooting, the cudaconv function should run perfectly.

09 Apr 2009 Yi Cao

It works as expected on my Geforce 8400 GPU.

24 Apr 2009 Bernd

The convolution is very fast and pretty accurate for the 'valid' part of a 2D signal (apart from the known double/single precision difference), but there are big differences near the edges when using the 'same' shape. Therefore I wrote a piece of shaping code to make it behave like conv2. Please test and report any coding mistakes!!!
____________________________________________________
function newimage = cudaconv2(image, filter, shape)
% Wrap cudaconv so that its output is shaped like the output of conv2.
if nargin == 2
    shape = 'full';   % same default as conv2
end

if strcmp(shape, 'full') % it's not a real 'full' convolution !!!!!
    [im, in] = size(image);
    [fm, fn] = size(filter);
    outM1 = 1;
    outN1 = 1;
    % Zero-pad the image to the 'full' output size and place it in the centre.
    image2 = zeros(im+fm-1, in+fn-1);
    image2(round(fm/2):round(im + fm/2 - 1), round(fn/2):round(in + fn/2 - 1)) = image;
    output = cudaconv(image2, filter);
    [outM2, outN2] = size(output);

elseif strcmp(shape, 'same') % large differences on the edges
    output = cudaconv(image, filter);
    [Am, An] = size(image);
    outM1 = 1;
    outN1 = 1;
    outM2 = Am;
    outN2 = An;

elseif strcmp(shape, 'valid') % very accurate
    output = cudaconv(image, filter);
    [Am, An] = size(image);
    [Cm, Rn] = size(filter);
    % Keep only the region where the kernel fits entirely inside the image.
    outM1 = round(Cm/2);
    outN1 = round(Rn/2);
    outM2 = round(Am - Cm/2);
    outN2 = round(An - Rn/2);
else
    % Raise an error instead of returning with the output unassigned.
    error('cudaconv2:shape', 'Shape type not valid: %s', shape);
end

% Crop the result to the requested shape.
newimage = output(outM1:outM2, outN1:outN2);
____________________________________________________
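To separate the double/single precision difference from the edge effects described above, one can compare only the interior of two results. The sketch below is an illustration under assumptions: a single-precision conv2 stands in for the GPU output so it runs without CUDA, and `inner` is a hypothetical helper that crops the border.

```matlab
% Stand-in check: single-precision conv2 plays the role of the GPU result.
y = rand(64);
f = ones(3) / 9;
ref = conv2(y, f, 'same');                              % double-precision reference
approx = double(conv2(single(y), single(f), 'same'));   % single-precision result

[fm, fn] = size(f);
hm = floor(fm / 2);
hn = floor(fn / 2);
inner = @(M) M(1+hm:end-hm, 1+hn:end-hn);               % drop the border rows/columns

err = max(max(abs(inner(ref) - inner(approx))));
fprintf('max interior error: %g\n', err);
```

With a real GPU result in place of `approx`, an interior error near single-precision round-off together with a much larger error on the full matrix would point to border handling rather than arithmetic as the source of the mismatch.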

24 Apr 2009 Jveer

finally functions that use the GPU!

04 May 2009 Don

I have not ventured outside of MATLAB yet. How do I compile this code so I can run it?

-D

03 Dec 2009 Alex

The documentation clearly states that Windows is not supported. I am trying to alter the MEX files to make this work. Has anyone had any luck getting this to work under Windows?

Updates
17 Jun 2008

Fixed missing header file, removed unnecessary file resource forks, reformatted m-file help.

16 Jul 2008

Updated help, included testing script and image of benchmarks.

 
