| File Information |
| Description |
cudaconv - Performs 2D convolution using an NVIDIA graphics chipset.
For large datasets (~1 million elements), and especially for large kernels (run time grows only slightly with kernel size), cudaconv can outperform conv2 by as much as 5000%.
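A rough way to see why kernel size matters so little for an FFT-based method is to count operations. The sketch below is only a back-of-the-envelope estimate, not a measurement of cudaconv: the function names are hypothetical, the padding to a power of two is an assumption about how the FFT size is chosen, and the constant factors are illustrative.

```python
import math

def next_pow2(n):
    """Smallest power of two greater than or equal to n."""
    return 1 << (n - 1).bit_length()

def direct_ops(n, k):
    """Multiply-adds for a direct full 2D convolution of n x n data with a k x k kernel."""
    return n * n * k * k

def fft_ops(n, k):
    """Rough operation estimate for FFT-based convolution (assumed model).

    Assumes data and kernel are zero-padded to p x p, where p is the next
    power of two at or above n + k - 1, and that the work is three 2D
    transforms (data, kernel, inverse) plus one pointwise product.
    """
    p = next_pow2(n + k - 1)
    return 3 * p * p * math.log2(p * p) + p * p

# 1000 x 1000 data (~1 million elements) with a small and a large kernel.
for k in (3, 25):
    print(k, direct_ops(1000, k), round(fft_ops(1000, k)))
```

Under these assumptions, both kernel sizes pad to the same 1024 x 1024 transform, so the FFT estimate is unchanged while the direct count grows with the square of the kernel width.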
I did not create this algorithm; it is adapted from an example included in the CUDA SDK and wrapped in MATLAB-compatible C code.
With very large data matrices, it can *completely* crash your computer (or possibly just the graphics driver), so beware. In testing, I found an upper limit on convolution size of roughly 2^20 elements, imposed either by the size the CUDA FFT function can accept or by the maximum size of a 2D texture, so above that limit the code breaks the convolution into smaller pieces. If you are feeling adventurous, feel free to raise that limit, but be aware that at those sizes cudaconv is already roughly 50-100x faster than conv2. |
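Splitting a large convolution into smaller pieces works because convolution is linear: each tile of the data can be convolved separately and the overlapping partial results summed (often called overlap-add). The sketch below illustrates that idea in plain Python. It is not the code shipped in cudaconv; the function names, the square-tile rule, and the `max_elems` parameter are hypothetical, and whether the real limit applies to the input tile or to the padded size is not stated here.

```python
import math

def conv2_full(a, k):
    """Direct full 2D convolution of nested lists a and k."""
    ar, ac = len(a), len(a[0])
    kr, kc = len(k), len(k[0])
    out = [[0] * (ac + kc - 1) for _ in range(ar + kr - 1)]
    for i in range(ar):
        for j in range(ac):
            for u in range(kr):
                for v in range(kc):
                    out[i + u][j + v] += a[i][j] * k[u][v]
    return out

def conv2_tiled(a, k, max_elems):
    """Overlap-add: convolve tiles of at most max_elems elements, then sum."""
    ar, ac = len(a), len(a[0])
    kr, kc = len(k), len(k[0])
    # Assumed tiling rule: square tiles whose area does not exceed max_elems.
    side = max(1, math.isqrt(max_elems))
    out = [[0] * (ac + kc - 1) for _ in range(ar + kr - 1)]
    for r0 in range(0, ar, side):
        for c0 in range(0, ac, side):
            tile = [row[c0:c0 + side] for row in a[r0:r0 + side]]
            # Stand-in for one size-limited GPU convolution call.
            part = conv2_full(tile, k)
            # Add the partial result at the tile's offset in the output.
            for i, row in enumerate(part):
                for j, val in enumerate(row):
                    out[r0 + i][c0 + j] += val
    return out

# Small integer example: the tiled result matches the direct one exactly.
data = [[(3 * r + 5 * c) % 7 for c in range(9)] for r in range(7)]
kernel = [[1, 2], [0, -1], [3, 1]]
assert conv2_tiled(data, kernel, max_elems=16) == conv2_full(data, kernel)
```

Each partial result is larger than its tile by the kernel size minus one in each dimension, which is why neighbouring tiles overlap in the output and must be added rather than simply concatenated.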
| MATLAB release |
MATLAB 7.4 (R2007a)
|
| Other requirements |
To compile and run this software, you need the NVIDIA CUDA Toolkit (http://www.nvidia.com/object/cuda_get.html) and a modern NVIDIA graphics card.
Tested on OS X 10.5; assumed to work under any Linux distribution; no guarantees on Windows. |
| Zip File Content |
|
| Other Files |
cudaconv/convolutionFFT2D_kernel.cu, cudaconv/cudaconv.cu, cudaconv/cudaconv.m, cudaconv/cudaconv.mexmaci, cudaconv/cutil.h, cudaconv/Makefile, cudaconv/nvmex, cudaconv/nvopts.sh, cudaconv/README, cudaconv/testcudaconv.m
| Updates |
| 17 Jun 2008 |
Fixed missing header file, removed unnecessary file resource forks, reformatted m-file help. |
| 16 Jul 2008 |
Updated help, included testing script and image of benchmarks. |