Hello, I am studying the AdaptivFloat paper, including the number system, and I have some questions about the hybrid float-integer processing element (PE).
Are the values in the weight and input buffers quantized with AdaptivFloat quantization before being fed to the PE?
When performing the right shift, the paper mentions adding the exponent bias value, which can be negative in the paper. How should this be calculated?
Why is the activation function applied after truncation?
The integer-to-float module uses the activation exponent bias, but that bias was already consumed during the earlier right shift. Why is it used again?
I really appreciate your effort in creating AdaptivFloat.
Thanks in advance.
You can see an example of the right shift calculation here. Typically, there is an offset value in the hardware to make sure the right-shift amount ends up non-negative.
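To make that concrete, here is a minimal Python sketch of the offset trick. `SHIFT_OFFSET`, `right_shift_amount`, and the bias variable names are my own assumptions for illustration, not names from the paper's hardware description.

```python
# Minimal sketch (assumed names): computing the right-shift amount in the
# hybrid float-integer PE. exp_bias_w and exp_bias_a are the tensor-wide
# AdaptivFloat exponent biases of the weights and input activations; both
# may be negative. SHIFT_OFFSET is a hypothetical hardware constant sized
# so the shift never goes negative for any allowed bias configuration.

SHIFT_OFFSET = 16  # assumed constant, not a value fixed by the paper

def right_shift_amount(exp_w, exp_a, exp_bias_w, exp_bias_a):
    """Shift that aligns a weight*activation mantissa product to the
    accumulator's fixed-point scale. The product's true scale is
    2**((exp_w + exp_bias_w) + (exp_a + exp_bias_a)); since the biases
    can be negative, SHIFT_OFFSET keeps the shifter unidirectional."""
    shift = SHIFT_OFFSET - ((exp_w + exp_bias_w) + (exp_a + exp_bias_a))
    assert shift >= 0, "SHIFT_OFFSET too small for this bias configuration"
    return shift

# Example: exp_bias_w = -3, exp_bias_a = -2, exp_w = 4, exp_a = 1
print(right_shift_amount(4, 1, -3, -2))  # 16 - (1 + (-1)) = 16
```

With a constant offset like this the PE can use a simple right-only shifter even when the exponent biases are negative; the offset presumably has to be compensated for downstream when the accumulator is rescaled.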
Performing the activation function using the high-precision accumulation datatype (e.g., 32-bit) can be overkill for many DNNs. We can truncate the post-accumulation sums down to 16 bits or below without appreciable accuracy loss.
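As a rough sketch of that truncation step (the 32-to-16-bit split, the choice to drop the low 16 bits, and the saturation policy are assumptions for illustration):

```python
# Minimal sketch (assumed bit widths): truncating a 32-bit post-accumulation
# sum to 16 bits before the activation function, so the activation datapath
# only has to handle the narrow type.

def truncate_accumulator(acc32: int, drop_bits: int = 16) -> int:
    """Drop the low-order bits of the accumulator (arithmetic right shift),
    then saturate the result to the signed 16-bit range."""
    acc16 = acc32 >> drop_bits  # Python's >> is an arithmetic shift for ints
    return max(-(1 << 15), min((1 << 15) - 1, acc16))

def relu(x: int) -> int:
    """Activation applied on the truncated value rather than the full
    32-bit accumulator."""
    return max(0, x)

print(relu(truncate_accumulator(3_000_000_000)))  # 32767 (saturated)
print(relu(truncate_accumulator(-123_456_789)))   # 0 (negative after truncation)
```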
The right shift uses the exponent biases from the weight and the input activation. The integer-to-float module uses the exponent bias from the output activation.
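Here is a minimal sketch of how that integer-to-float conversion might look. All names, the normalization details, and the clamping policy are assumptions for illustration, not the repo's implementation.

```python
# Minimal sketch (assumed encoding details): converting the activated
# fixed-point value back to an AdaptivFloat encoding using the *output*
# activation's exponent bias, which is distinct from the weight/input
# biases consumed by the earlier right shift.

import math

def int_to_adaptivfloat(x, frac_bits, exp_bits, man_bits, exp_bias_out):
    """Encode a fixed-point value (real magnitude x * 2**-frac_bits) as
    (sign, exponent_field, mantissa_field). exp_bias_out is the tensor-wide
    output-activation bias, subtracted here so the stored exponent field
    stays within its exp_bits range."""
    if x == 0:
        return 0, 0, 0
    sign = 1 if x < 0 else 0
    mag = abs(x) * 2.0 ** (-frac_bits)
    e_field = math.floor(math.log2(mag)) - exp_bias_out
    e_field = max(0, min((1 << exp_bits) - 1, e_field))  # clamp to the field
    # Re-derive the mantissa at the (possibly clamped) exponent and round it.
    m = mag / 2.0 ** (e_field + exp_bias_out)  # in [1, 2) when not clamped
    m_field = max(0, min((1 << man_bits) - 1,
                         round((m - 1.0) * (1 << man_bits))))
    return sign, e_field, m_field

# Example: 1000 with 8 fractional bits (= 3.90625), 3 exponent bits,
# 4 mantissa bits, output exponent bias of -2 -> (0, 3, 15),
# which decodes to 2**(3 - 2) * (1 + 15/16) = 3.875.
print(int_to_adaptivfloat(1000, 8, 3, 4, -2))
```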