Hello,
The 64-bit version of the convolution filter, based on CUDA 2.3, is now available. Here are the benchmark results, measured in MATLAB R2008b (32-bit and 64-bit):
# signal elements: 100,000
# kernel elements: 100
***** 32 Bit *********
cuFilter: 0.014 s
MATLAB conv: 0.0433 s
***** 64 Bit ******
cuFilter: 0.007 s
MATLAB conv: 0.040 s
NVIDIA driver version 190.38 is required.
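For readers who want the ratios rather than the raw timings, here is a small sketch that works out the speedup from the numbers quoted above. The dictionary layout and the `speedup` helper are only illustrative names, not part of cuFilter.

```python
# Timings in seconds, copied from the benchmark above.
timings = {
    "32 Bit": {"cuFilter": 0.014, "conv": 0.0433},
    "64 Bit": {"cuFilter": 0.007, "conv": 0.040},
}


def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run was than the CPU run."""
    return cpu_seconds / gpu_seconds


speedups = {
    build: speedup(t["conv"], t["cuFilter"]) for build, t in timings.items()
}

for build, factor in speedups.items():
    print(f"{build}: cuFilter is about {factor:.1f}x faster than conv")
```

With these figures the ratio comes out at roughly 3.1x for the 32-bit build and roughly 5.7x for the 64-bit build.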
Best regards
Comment only
24 Jul 2009
CUDA Convolution filter
Convolution of an arbitrary 1D complex signal with an arbitrary filter kernel
Hello,
the following features were added:
+ support for CUDA 2.3
+ CUDA device selection for multiple Quadro/GeForce/Tesla devices
+ CUDA device listing
+ improved error handling
The user can now select the CUDA device to use through an optional third function parameter (0..N-1), where N is the number of CUDA devices in the computer.
If the third parameter is set to -1, all CUDA devices available on the computer are listed.
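To make the meaning of the third parameter concrete, here is a minimal sketch of the rule described above, written in Python rather than in the MEX code itself. The function name `resolve_device`, the returned tuples and the device names are all assumptions made for illustration; the actual cuFilter implementation may handle this differently, and the behaviour when the parameter is omitted is not covered here.

```python
def resolve_device(device_param, device_names):
    """Interpret the optional third parameter as described in the comment.

    Returns ("list", names) for -1 and ("select", index) for 0..N-1,
    where N is the number of devices. Anything else is rejected.
    """
    n = len(device_names)
    if device_param == -1:
        # -1 means: list every CUDA device found on the machine.
        return ("list", list(device_names))
    if isinstance(device_param, int) and 0 <= device_param < n:
        # A valid zero-based index selects that device.
        return ("select", device_param)
    raise ValueError(
        f"device index must be -1 or in 0..{n - 1}, got {device_param!r}"
    )


# Made-up device list, for illustration only.
devices = ["GeForce 8800 GTS", "Quadro FX 5600"]

print(resolve_device(-1, devices))  # lists both devices
print(resolve_device(1, devices))   # selects the second device
```

An index outside 0..N-1 (for example 2 on a machine with two devices) raises an error in this sketch instead of silently falling back to another device.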
Best regards
Comment only
03 Jul 2009
CUDA Convolution filter
Convolution of an arbitrary 1D complex signal with an arbitrary filter kernel
Hello,
Here are the benchmark results for "cuFilter" on an 8800 GTS GPU versus the MATLAB "conv" function on a Xeon E5345 CPU:
# signal elements: 100,000
# kernel elements: 100
cuFilter: 0.0155 s
conv: 0.0733 s
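As a reference for what is being timed, here is a plain Python sketch of a direct (non-FFT) full convolution of two complex sequences. It follows the same definition as MATLAB "conv", whose result has n + m - 1 elements; whether cuFilter returns the full-length result or a trimmed one is not stated here, so treat that part as an assumption. The function name `conv_full` is illustrative only.

```python
def conv_full(signal, kernel):
    """Direct full linear convolution of two complex sequences."""
    n, m = len(signal), len(kernel)
    if n == 0 or m == 0:
        return []
    out = [0j] * (n + m - 1)
    # Every signal element is multiplied with every kernel element,
    # and the product is accumulated at the shifted output position.
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


signal = [1 + 1j, 2 + 0j, 0 - 1j]
kernel = [1 + 0j, 0 + 1j]
result = conv_full(signal, kernel)
print(result)
```

For the benchmark sizes above (100,000 signal elements, 100 kernel elements) the direct method needs 10,000,000 complex multiply-adds and produces 100,099 output elements. Each output element can be computed independently of the others, which is what makes the problem a good fit for a GPU.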
Best regards
Comment only
02 Jul 2009
CUDA Convolution filter
Convolution of an arbitrary 1D complex signal with an arbitrary filter kernel
Good morning Adrien,
The 8500GT is not well suited to this (it has only 16 stream processors), and its CUDA memory is probably not sufficient for 100,000 signal elements.
The software has no workaround for a "low memory" case. Without the detailed error message I can't dig deeper into the problem.
I realize that telling you the algorithm works well on 8800-series GPUs doesn't help you much on an 8500GT.
Best regards
Comment only