The examples given on the front page (README.md) are not equivalent. That is, the examples showing `native_simd` and compiler intrinsics are not vectorizations of the original `scalar_product` function because:
1. Their `Vec3D` array contains a different number of elements. In the original code, there are 3 `float`s in the array, whereas the modified examples use vector types for elements, and each vector type itself holds a number of `float`s. The vectorized functions should operate on the same input data as the original. Yes, that means you should probably show how your library helps deal with tail data that doesn't fit in a native vector.
2. The effect of the `scalar_product` function in the modified examples is different, as it does not produce a single `float` that is the sum of products of the input arrays. The vectorized code is missing the final reduction step.
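To make the two points concrete, here is a rough sketch of the shape I would expect an equivalent version to have. It deliberately does not use the library's API: `kLanes`, `scalar_product_ref` and `scalar_product_chunked` are hypothetical names of mine, and a plain `std::array` stands in for a native vector type. It only illustrates the three parts that I think a fair comparison needs: a main loop over full vectors, a final reduction, and tail handling.

```cpp
#include <array>
#include <cstddef>

// Hypothetical lane count, standing in for the width of a native SIMD vector.
constexpr std::size_t kLanes = 4;

// Scalar reference: the sum of element-wise products of two float arrays.
float scalar_product_ref(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

// Same input data and same single-float result as the reference,
// but structured the way a SIMD implementation would be.
float scalar_product_chunked(const float* a, const float* b, std::size_t n) {
    std::array<float, kLanes> acc{};  // per-lane partial sums, zero-initialized
    std::size_t i = 0;

    // Main loop: process full chunks of kLanes elements.
    for (; i + kLanes <= n; i += kLanes)
        for (std::size_t l = 0; l < kLanes; ++l)
            acc[l] += a[i + l] * b[i + l];

    // Final reduction: collapse the per-lane partial sums into one float.
    float sum = 0.0f;
    for (float x : acc) sum += x;

    // Tail: leftover elements that do not fill a whole chunk.
    for (; i < n; ++i) sum += a[i] * b[i];
    return sum;
}
```

With 3 `float`s per array, as in the original example, and an assumed width of 4, the main loop never runs and everything goes through the tail, which is exactly the case the README currently avoids. Note also that the chunked version adds the products in a different order than the scalar loop, so results can differ by rounding for general inputs.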
I understand that you are trying to be concise in your front page examples, but I still believe the examples should give the user a realistic comparison of the two ways to implement the same functionality. Without this equivalence, the comparison is pointless, as you are comparing apples to oranges.