-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow separate queue for fft and ifft #17
Changes from 6 commits
7836edc
140545f
767d1db
93169fa
1ade3d2
b81f9bc
f8c1bde
6af6505
ca04b50
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -155,20 +155,58 @@ def _make_config(self): | |
int(self.use_lut), int(self.keepShaderCode), | ||
n_batch, skipx, skipy, skipz) | ||
|
||
def fft(self, src: cla.Array, dest: cla.Array = None): | ||
def fft(self, src: cla.Array, dest: cla.Array = None, queue: cl.CommandQueue = None): | ||
""" | ||
Compute the forward FFT | ||
:param src: the source pyopencl Array | ||
:param dest: the destination pyopencl Array. Should be None for an inplace transform | ||
:param queue: the pyopencl CommandQueue to use for the transform. If not given, | ||
the queue given when constructing the application is used. | ||
:raises RuntimeError: in case of a GPU kernel launch error | ||
:return: the transformed array. For a R2C inplace transform, the complex view of the | ||
array is returned. | ||
""" | ||
if not queue: | ||
if dest is None: | ||
if src.queue is None: | ||
warnings.warn("queue is not given and the source array does not have a queue " | ||
"attached to it. Falling back to the queue given to the constructor " | ||
"of VkFFTApp. This is deprecated and will stop working in the future. " | ||
"Use src_array.with_queue(queue) to attach a queue to the array or " | ||
"pass a queue to this method", DeprecationWarning) | ||
queue = self.queue | ||
else: | ||
queue = src.queue | ||
elif dest.queue is None and src.queue is None: | ||
warnings.warn("queue is not given and the source/dest arrays do not have a queue " | ||
"attached to it. Falling back to the queue given to the constructor " | ||
"of VkFFTApp. This is deprecated and will stop working in the future. " | ||
"Use src_array.with_queue(queue) to attach a queue to the array or " | ||
"pass a queue to this method", DeprecationWarning) | ||
queue = self.queue | ||
elif dest.queue != src.queue: | ||
if dest.queue is None: | ||
queue = src.queue | ||
elif src.queue is None: | ||
queue = dest.queue | ||
else: | ||
warnings.warn("queue is not given and the source/dest arrays are not equal. " | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will switch to a RuntimeError instead, no queue supplied and different src and destination queues is a recipe for disasters |
||
"Falling back to the queue given to the constructor. This is deprecated " | ||
"and will stop working in the future. " | ||
"Use src_array.with_queue(queue) to attach a queue to the array or " | ||
"pass a queue to this method", DeprecationWarning) | ||
queue = self.queue | ||
else: | ||
queue = src.queue | ||
|
||
if queue != self.queue: | ||
self.queue.finish() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think it's a good idea to include a finish() here , as it would break asynchronous execution. |
||
|
||
if self.inplace: | ||
if dest is not None: | ||
if src.data.int_ptr != dest.data.int_ptr: | ||
raise RuntimeError("VkFFTApp.fft: dest is not None but this is an inplace transform") | ||
res = _vkfft_opencl.fft(self.app, int(src.data.int_ptr), int(src.data.int_ptr), int(self.queue.int_ptr)) | ||
res = _vkfft_opencl.fft(self.app, int(src.data.int_ptr), int(src.data.int_ptr), int(queue.int_ptr)) | ||
check_vkfft_result(res, src.shape, src.dtype, self.ndim, self.inplace, self.norm, self.r2c, | ||
self.dct, backend="opencl") | ||
if self.norm == "ortho": | ||
|
@@ -186,27 +224,65 @@ def fft(self, src: cla.Array, dest: cla.Array = None): | |
raise RuntimeError("VkFFTApp.fft: dest and src are identical but this is an out-of-place transform") | ||
if self.r2c: | ||
assert (dest.size == src.size // src.shape[-1] * (src.shape[-1] // 2 + 1)) | ||
res = _vkfft_opencl.fft(self.app, int(src.data.int_ptr), int(dest.data.int_ptr), int(self.queue.int_ptr)) | ||
res = _vkfft_opencl.fft(self.app, int(src.data.int_ptr), int(dest.data.int_ptr), int(queue.int_ptr)) | ||
check_vkfft_result(res, src.shape, src.dtype, self.ndim, self.inplace, self.norm, self.r2c, | ||
self.dct, backend="opencl") | ||
if self.norm == "ortho": | ||
dest *= self._get_fft_scale(norm=0) | ||
return dest | ||
|
||
def ifft(self, src: cla.Array, dest: cla.Array = None): | ||
def ifft(self, src: cla.Array, dest: cla.Array = None, queue: cl.CommandQueue = None): | ||
""" | ||
Compute the backward FFT | ||
:param src: the source pyopencl.Array | ||
:param dest: the destination pyopencl.Array. Can be None for an inplace transform | ||
:param queue: the pyopencl CommandQueue to use for the transform. If not given, | ||
the queue given when constructing the application is used. | ||
:raises RuntimeError: in case of a GPU kernel launch error | ||
:return: the transformed array. For a C2R inplace transform, the float view of the | ||
array is returned. | ||
""" | ||
if not queue: | ||
if dest is None: | ||
if src.queue is None: | ||
warnings.warn("queue is not given and the source array does not have a queue " | ||
"attached to it. Falling back to the queue given to the constructor " | ||
"of VkFFTApp. This is deprecated and will stop working in the future. " | ||
"Use src_array.with_queue(queue) to attach a queue to the array or " | ||
"pass a queue to this method", DeprecationWarning) | ||
queue = self.queue | ||
else: | ||
queue = src.queue | ||
elif dest.queue is None and src.queue is None: | ||
warnings.warn("queue is not given and the source/dest arrays do not have a queue " | ||
"attached to it. Falling back to the queue given to the constructor " | ||
"of VkFFTApp. This is deprecated and will stop working in the future. " | ||
"Use src_array.with_queue(queue) to attach a queue to the array or " | ||
"pass a queue to this method", DeprecationWarning) | ||
queue = self.queue | ||
elif dest.queue != src.queue: | ||
if dest.queue is None: | ||
queue = src.queue | ||
elif src.queue is None: | ||
queue = dest.queue | ||
else: | ||
warnings.warn("queue is not given and the source/dest arrays are not equal. " | ||
"Falling back to the queue given to the constructor. This is deprecated " | ||
"and will stop working in the future. " | ||
"Use src_array.with_queue(queue) to attach a queue to the array or " | ||
"pass a queue to this method", DeprecationWarning) | ||
queue = self.queue | ||
else: | ||
queue = src.queue | ||
|
||
if queue != self.queue: | ||
self.queue.finish() | ||
|
||
if self.inplace: | ||
if dest is not None: | ||
if src.data.int_ptr != dest.data.int_ptr: | ||
raise RuntimeError("VkFFTApp.fft: dest!=src but this is an inplace transform") | ||
res = _vkfft_opencl.ifft(self.app, int(src.data.int_ptr), int(src.data.int_ptr), int(self.queue.int_ptr)) | ||
res = _vkfft_opencl.ifft(self.app, int(src.data.int_ptr), int(src.data.int_ptr), int(queue.int_ptr)) | ||
check_vkfft_result(res, src.shape, src.dtype, self.ndim, self.inplace, self.norm, self.r2c, | ||
self.dct, backend="opencl") | ||
if self.norm == "ortho": | ||
|
@@ -227,10 +303,10 @@ def ifft(self, src: cla.Array, dest: cla.Array = None): | |
# Special case, src and dest buffer sizes are different, | ||
# VkFFT is configured to go back to the source buffer | ||
res = _vkfft_opencl.ifft(self.app, int(dest.data.int_ptr), int(src.data.int_ptr), | ||
int(self.queue.int_ptr)) | ||
int(queue.int_ptr)) | ||
else: | ||
res = _vkfft_opencl.ifft(self.app, int(src.data.int_ptr), int(dest.data.int_ptr), | ||
int(self.queue.int_ptr)) | ||
int(queue.int_ptr)) | ||
check_vkfft_result(res, src.shape, src.dtype, self.ndim, self.inplace, self.norm, self.r2c, | ||
self.dct, backend="opencl") | ||
if self.norm == "ortho": | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,6 +48,7 @@ def ascent(): | |
try: | ||
import pyopencl as cl | ||
import pyopencl.array as cla | ||
from pyvkfft.opencl import VkFFTApp as clVkFFTApp | ||
|
||
has_pyopencl = True | ||
except ImportError: | ||
|
@@ -412,6 +413,46 @@ def test_pycuda_streams(self): | |
dn = np.roll(d, i * 7, axis=1) | ||
self.assertTrue(np.allclose(dn, vd[i].get(), rtol=rtol, atol=abs(dn).max() * rtol)) | ||
|
||
@unittest.skipIf(not has_pyopencl, "pyopencl is not available") | ||
def test_pyopencl_queues(self): | ||
""" | ||
Test multiple FFT with queues different from the queue used | ||
in creating the VkFFTApp | ||
""" | ||
for dtype in (np.complex64, np.complex128): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You need to check for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand. I'm already checking for complex128. |
||
with self.subTest(dtype=np.dtype(dtype)): | ||
init_ctx("pyopencl", gpu_name=self.gpu, verbose=False) | ||
ctx = gpu_ctx_dic["pyopencl"][2].context | ||
if dtype == np.complex64: | ||
rtol = 1e-6 | ||
else: | ||
rtol = 1e-12 | ||
sh = (256, 256) | ||
d = (np.random.uniform(-0.5, 0.5, sh) + 1j * np.random.uniform(-0.5, 0.5, sh)).astype(dtype) | ||
n_queues = 5 | ||
vd = [] | ||
vapp = [] | ||
|
||
with cl.CommandQueue(ctx) as cq: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Creating the queue using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I agree that it's dangerous to keep persisting the queue in |
||
for i in range(n_queues): | ||
vapp.append(clVkFFTApp(d.shape, d.dtype, ndim=2, norm=1, queue=cq)) | ||
|
||
queues = [cl.CommandQueue(ctx) for _ in range(n_queues)] | ||
for i in range(n_queues): | ||
vd.append(cla.to_device(queues[i], np.roll(d, i * 7, axis=1))) | ||
|
||
for i in range(n_queues): | ||
vapp[i].fft(vd[i], queue=queues[i]) | ||
for i in range(n_queues): | ||
dn = fftn(np.roll(d, i * 7, axis=1)) | ||
self.assertTrue(np.allclose(dn, vd[i].get(), rtol=rtol, atol=abs(dn).max() * rtol)) | ||
|
||
for i in range(n_queues): | ||
vapp[i].ifft(vd[i], queue=queues[i]) | ||
for i in range(n_queues): | ||
dn = np.roll(d, i * 7, axis=1) | ||
self.assertTrue(np.allclose(dn, vd[i].get(), rtol=rtol, atol=abs(dn).max() * rtol)) | ||
|
||
|
||
# The class parameters are written in pyvkfft_test.main() | ||
class TestFFTSystematic(unittest.TestCase): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the array also carries a queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which can be None right? Any suggestions on what to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can be
None
, but I think it's reasonable to expect an array passed in here to have a valid queue.If you were designing this API from scratch, I think the following fallback order would make sense:
queue
(if passed)src.queue
anddest.queue
(if given, must be the same)self.queue
.For backward compatibility, I would warn if
self.queue
disagrees with queue resulting from the above and say thatself.queue
is being used for now, but that that behavior is deprecated, with a certain fuse.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the 'simple' fft API (
pyvkfft.fft.fftn
,pyvkfft.fft.ifftn
...) already allows to transform just by passing an array - its queue will be used as default, and a different one can be supplied. This already allows to use different queues for different transforms.There can be some strange results when different queues are used, e.g. when using a different queue from the array's, the get() method will not wait for the fft to be finished.
I would still keep
VkFFTApp
'sself.queue
as the default (assuming it's a deliberate choice from the developer), but a warning about when different queues are used will be useful.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The queue choice is indeed debatable.. Preferring the array queues could also make sense.. It really depends on the use case.
The main use case is probably not a single array which is transformed with different queues at different times, but rather different arrays which are transformed by the same
VkFFTApp
. In which case using the array queues would indeed be easier...