Add an RFC for fixed point types. #41
General comments:

- We designed `lib.data` and `lib.wiring` to be imported as entire modules: `from amaranth.lib import data, wiring`. This library should be designed (naming-wise) to work that way too. `fixed.Shape` and `fixed.Value` are one option, though I can see why others may object to it.
- I feel like implicit conversions from floats are potentially a source of bugs severe enough that maybe we shouldn't have them at all.
- How does one perform computation on fixed point values during compilation? I think "you just use floats" is an unsatisfying answer.
text/0041-fixed-point.md (outdated)
The following operations are defined on it:

- `FixedPoint(f_width, /, *, signed)`: Create a `FixedPoint` with `f_width` fractional bits.
- `FixedPoint(i_width, f_width, /, *, signed)`: Create a `FixedPoint` with `i_width` integer bits and `f_width` fractional bits.
These two lines seem mutually incompatible.
The idea is that the underlying implementation would be `FixedPoint(i_or_f_width, f_width=None, /, *, signed)`, but I found it clearer to express it like that. Consider how you can do both `range(stop)` and `range(start, stop)`.
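A minimal sketch of how such a range-style overload could dispatch on its positional arguments (plain Python; the class body and the `signed` default are illustrative, not the proposed implementation):

```python
# Illustrative only: dispatch on the number of positional arguments, mirroring
# how range(stop) and range(start, stop) are handled.
class FixedPoint:
    def __init__(self, i_or_f_width, f_width=None, /, *, signed=False):
        if f_width is None:
            # Single-argument form: the whole width goes to the fractional part.
            i_or_f_width, f_width = 0, i_or_f_width
        self.i_width = i_or_f_width
        self.f_width = f_width
        self.signed = signed

FixedPoint(8, signed=True)      # 8 fractional bits, no integer bits
FixedPoint(4, 8, signed=True)   # 4 integer bits, 8 fractional bits
```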
Makes sense. I wasn't sure if it was that or a typo. Does the first constructor add a sign bit when `signed` is true, and otherwise create a number whose range spans from 0 to some value below 1?
Yes, although the exact semantics wrt. the sign bit depends on which Q notation we settle on.
The dominant notation in the industry, at least in the audio ASIC world, is to include the sign bit in the number of integer bits. For example, -1 to 1 is represented as a Q1.23. That would be my vote.
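A quick worked example of that convention (my own illustration, assuming the sign bit is counted in the integer bits): a Q1.23 value is 24 bits wide and covers [-1, 1) in steps of 2**-23.

```python
# Q1.23 under the "sign bit counted in the integer bits" convention: 24 bits total.
WIDTH, FRAC = 24, 23
step = 2 ** -FRAC
lo = -(2 ** (WIDTH - 1)) * step       # -1.0
hi = (2 ** (WIDTH - 1) - 1) * step    # 0.99999988...
print(lo, hi, step)
```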
@samimia-swks would you happen to have some references we can cite to show that this is the dominant Q notation in the audio ASIC world? It will be useful evidence that may be worth appending to the RFC document itself.
text/0041-fixed-point.md (outdated)
## Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

This RFC proposes a library addition `amaranth.lib.fixedpoint` with the following contents:
Bikeshed: `lib.fixed`, `lib.fixnum`.
I would lean toward `lib.fixed`.
`lib.fixed` seems good, looking through the usage and discussion; `fixed.Shape` and `fixed.Value` are pretty natural to write.
text/0041-fixed-point.md (outdated)

- `FixedPoint(f_width, /, *, signed)`: Create a `FixedPoint` with `f_width` fractional bits.
- `FixedPoint(i_width, f_width, /, *, signed)`: Create a `FixedPoint` with `i_width` integer bits and `f_width` fractional bits.
- `FixedPoint.cast(shape)`: Cast `shape` to a `FixedPoint` instance.
What are the semantics of this operation?
I just updated it to `.cast(shape, f_width=0)` locally. The idea is to determine `i_width` automatically, so `FixedPoint.cast(unsigned(8))` would result in `UQ(8)` and `FixedPoint.cast(unsigned(8), f_width=4)` would result in `UQ(4, 4)`.
Why `UQ(4, 4)`?
`FixedPoint.cast()` is a building block for `FixedPointValue.cast()`. The idea is to be able to say «here's a value with n fractional bits, please turn it into an appropriately sized `FixedPointValue`». An `unsigned(8)` with four fractional bits would have four integer bits left and thus cast to `UQ(4, 4)`.
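A hedged sketch of those semantics (not the draft implementation; the helper name is mine): derive the integer width from the total width of the cast shape minus the requested number of fractional bits.

```python
from amaranth.hdl import Shape, unsigned

def fixedpoint_cast(shape, f_width=0):
    # Whatever width is left after the fractional bits becomes the integer part.
    shape = Shape.cast(shape)
    i_width = shape.width - f_width
    return (i_width, f_width, shape.signed)

assert fixedpoint_cast(unsigned(8)) == (8, 0, False)             # ~ UQ(8)
assert fixedpoint_cast(unsigned(8), f_width=4) == (4, 4, False)  # ~ UQ(4, 4)
```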
This API seems confusing and difficult to use. Should we expose it at all?
text/0041-fixed-point.md (outdated)
- `FixedPoint(i_width, f_width, /, *, signed)`: Create a `FixedPoint` with `i_width` integer bits and `f_width` fractional bits.
- `FixedPoint.cast(shape)`: Cast `shape` to a `FixedPoint` instance.
- `.i_width`, `.f_width`, `.signed`: Width and signedness properties.
- `.const(value)`: Create a fixed point constant from an `int` or `float`, rounded to the closest representable value.
`Decimal` support as well?
And probably `Fraction` as well.
text/0041-fixed-point.md (outdated)
- `FixedPoint(f_width, /, *, signed)`: Create a `FixedPoint` with `f_width` fractional bits.
- `FixedPoint(i_width, f_width, /, *, signed)`: Create a `FixedPoint` with `i_width` integer bits and `f_width` fractional bits.
- `FixedPoint.cast(shape)`: Cast `shape` to a `FixedPoint` instance.
- `.i_width`, `.f_width`, `.signed`: Width and signedness properties.
I feel like the `i_width` and `f_width` names are difficult enough to read that coming up with something more readable matters more than bikeshedding. `.int_bits`, `.frac_bits`?

Cursed option: `int, frac = x.width`?
+1 on `int_bits`, `frac_bits`...
text/0041-fixed-point.md (outdated)
- `.as_shape()`: Return the underlying `Shape`.
- `.__call__(target)`: Create a `FixedPointValue` over `target`.

`Q(*args)` is an alias for `FixedPoint(*args, signed=True)`.
I understand Q notation is an established one; however, since Amaranth defaults to unsigned, I feel that having `SQ` and `UQ` (without having just `Q`) would serve Amaranth better.
text/0041-fixed-point.md (outdated)
- `.as_value()`: Return the underlying value.
- `.eq(value)`: Assign `value`.
  - If `value` is a `FixedPointValue`, the precision will be extended or rounded as required.
  - If `value` is an `int` or `float`, the value will be rounded to the closest representable value.
What happens if you assign 1024 to a Q8.8 value?
`Signal(Q(8, 8)).eq(1024)` would effectively be `Signal(signed(17)).eq(1024 << 8)`. (Or `signed(16)`, depending on Q notation.)

Speaking of overflow, I figure fixed point overflow should behave like regular integer overflow. Saturating math feels orthogonal to this RFC and could be added through a later RFC that handles both integer and fixed point values.
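A plain-Python sketch of that wrap-around behaviour (my own illustration; it assumes the variant where Q8.8 means a sign bit plus 8 integer and 8 fractional bits, i.e. 17 bits of storage):

```python
WIDTH, FRAC = 17, 8

def wrap_signed(raw, width):
    # Emulate two's-complement wrap-around of an assignment to a signed signal.
    raw &= (1 << width) - 1
    return raw - (1 << width) if raw & (1 << (width - 1)) else raw

print(wrap_signed(1024 << FRAC, WIDTH) / 2 ** FRAC)  # 0.0: 1024 wraps around, like integer overflow
print(wrap_signed(260 << FRAC, WIDTH) / 2 ** FRAC)   # -252.0: just past +256 wraps negative
```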
I think for making constants of a specific shape it's reasonable to have implicit float conversion. As the
An adjacent question is «what will …

Are there good fixed point libraries for Python we could use here?
- `Decimal` and/or `Fraction` support?
  - This could make sense to have, but both can represent values that are not representable as binary fixed point.
    On the other hand, a Python `float` can perfectly represent any fixed point value up to a total width of 53 bits, and any `float` value is perfectly representable as fixed point.
OK, this is a compelling argument for support of floats. I am convinced that it is a good idea to have floats supported in at least some way. But we need to be very careful around overflows.
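A quick plain-Python check of the 53-bit claim quoted above (my own illustration): a fixed-point value whose raw representation fits in 53 bits round-trips exactly through a `float`, because dividing by a power of two only changes the exponent.

```python
from fractions import Fraction

raw, f_bits = (1 << 52) + 1, 20                 # a 53-bit raw value with 20 fractional bits
exact = Fraction(raw, 2 ** f_bits)              # the true value as a rational
assert Fraction(raw / 2 ** f_bits) == exact     # the float conversion loses nothing here
```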
- Simply truncating is free, rounds towards negative infinity.
- IEEE 754 defaults to round to nearest, ties to even, which is more expensive to implement.
- Should we make it user selectable?
- We still need a default mode used when a higher precision number is passed to `.eq()`.
I wonder if we could leave it unspecified and implementation-defined until we can get a numerics expert to chime in.
In most DSP applications, simple truncating is done (bit picking, which is equivalent to a floor()) because it's free. I would vote for that being the default behavior at least.
Truncating can also be a big footgun in some DSP applications that is hard to track down: you can end up with a DC spur that grows over time or over your pipeline with no obvious explanation. A middle ground is rounding towards zero or symmetrically towards positive/negative infinity (this is nearly free on the output of the Xilinx DSP48E1/2, and I believe cheap/free on the ECP5 and friends' DSPs). The way I personally would handle it would be to make the rounding mode part of the `fixed.Shape` type, making it `fixed.Shape(i_width, f_width, /, *, signed, rounding_mode)`. Truncate is still a reasonable default for most applications though.
Hm, making `rounding_mode` part of the signature is a pretty big action; what happens if you add two numbers with differing rounding modes, for example?
To be clear, the semantics I'm imagining here are: if you have `a` and `b`, and `b` has more fractional bits than `a`, then when you assign `a.eq(b)` the rounding mode that is used is that of `a`, so that downstream DSP modules can enforce a rounding mode on their inputs.

The scenarios where I can imagine what you propose causing issues are (I'm guessing there are more outside the bounds of my imagination):

1. A user adds `a` and `b` and then `.round`s them.
2. A user adds `a` and `b` and utilizes the result as a port on their module (I believe this is allowed, but I'm still learning the language).

Solutions that solve 1 but not 2, in no particular order:

- Don't actually resolve the rounding until the signal is assigned to something with a concrete value (this seems like a bad idea and probably doesn't solve the problem).
- Make the rounding mode undefined and require an explicit rounding mode to `.round`, but this doesn't solve the problem of getting a top-level port with an undefined rounding mode.

Solutions that I believe would solve both:

1. Choose the rounding mode of the left term.
2. Choose the less footgunny rounding mode (TOEVEN, TOODD > SYMZERO, SYMINF > TRUNC), with ties going to either a defined order or the left term.
3. Ban arithmetic operations between numbers with different rounding modes and require an explicit cast of one of the arguments with something like `.with_mode(other.rounding_mode)`.
4. Make this undefined behavior and pick one of the above for now.

My naive inclination would be 3; it seems like a fairly rare scenario, and if you end up in it, it's probably worth making the user think about what they want the result to be. But I'm not a language designer, so take that with a grain of salt.
Further thoughts on 3 as an option:
Operations with a constant fixed point value should probably inherit the rounding mode of the non-constant value without requiring an explicit cast because I think the alternative is annoying to deal with (requiring a shape to be passed any time a constant is declared).
I was playing around with this a while back, and I no longer think making it a part of the signature is a good idea, as there is no platform-independent way of providing rounding that infers well in the context of common DSP use cases. My understanding is that that would mean lowering would require adding a new primitive to the AST, which doesn't feel like a good strategy. I think a better approach would be to leave rounding and several other common operations that require platform-dependent lowering to a subsequent RFC. Currently my biggest stumbling block was that getting reasonable performance required manually instantiating DSPs, which meant the design wasn't simulatable in pysim.
text/0041-fixed-point.md (outdated)

- `fixed.Shape(f_width, /, *, signed)`: Create a `fixed.Shape` with zero integer bits and `f_width` fractional bits.
- `fixed.Shape(i_width, f_width, /, *, signed)`: Create a `fixed.Shape` with `i_width` integer bits and `f_width` fractional bits.
  - The sign bit is not included in `i_width` or `f_width`, so a `fixed.Shape(7, 8, signed=True)` will be 16 bits wide.
This means that `fixed.Shape(7, 0, signed=True)` has the same width as `signed(8)`, which seems counterintuitive.
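A tiny sketch of the width rule under discussion (the helper is mine, not part of the draft): under this convention the sign bit is added on top of `i_width + f_width`.

```python
def fixed_width(i_width, f_width, signed):
    # The sign bit is *not* counted in i_width/f_width in this draft of the RFC.
    return i_width + f_width + (1 if signed else 0)

assert fixed_width(7, 8, signed=True) == 16
assert fixed_width(7, 0, signed=True) == 8   # same total width as signed(8)
```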
I just managed to hit this myself even though I've read through this a few times
I've thought about this since I wrote the last draft and concluded that this is probably the wrong decision, so I'm intending to change it when I find time to revise the RFC again.
- If `value` is a `float` and `shape` is not specified, the smallest shape that gives a perfect representation will be selected.
  If `shape` is specified, `value` will be rounded to the closest representable value first.
- `.as_integer_ratio()`: Return the value represented as an integer ratio `tuple`.
- `.as_float()`: Return the value represented as a `float`.
What happens if it's not representable? At least we need to have `c.as_float(exact=True)`, which gives an error in that case.
Agree with this. I would even consider having `exact=True` be the default.
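A hedged sketch of what such an exactness check could look like (plain Python; the function and its `exact` flag are illustrative, not the proposed API):

```python
from fractions import Fraction

def as_float(raw, f_bits, exact=True):
    # The true value is raw / 2**f_bits; compare the rounded float against it.
    true_value = Fraction(raw, 2 ** f_bits)
    result = raw / 2 ** f_bits
    if exact and Fraction(result) != true_value:
        raise ValueError("value is not exactly representable as a float")
    return result

as_float(3, 2)                 # 0.75, exact
# as_float((1 << 60) + 1, 8)   # would raise: needs more than 53 significand bits
```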
  If `shape` is specified, `value` will be rounded to the closest representable value first.
- `.as_integer_ratio()`: Return the value represented as an integer ratio `tuple`.
- `.as_float()`: Return the value represented as a `float`.
- Operators are extended to return a `fixed.Const` if all operands are constant.
This is inconsistent with how `Const()` works.
This is not addressed yet but will need to be before the next revision.
I have nothing concrete here, just mind-dumping stuff I wrote down in text files locally. I assume that you can get as much precision as you'd like from dividing fixed point numbers by left-shifting a numerator (increasing its …
On the other hand, we can left-shift the integer part to get a non-zero result from this divide:
I arbitrarily chose to shift by 7 decimal places (replace with binary for this RFC), but it's not obvious to me that any given left shift will be universally good for all use cases to prevent divides from returning 0 for non-zero numerators. Additionally, you may only want the extra precision during the divide; e.g. I could truncate the Q(6,7) in the second example back down to Q(0,7), saving 6 bits, and still have a perfectly valid result to pipe to the rest of a design. Whether/how much to truncate depends on the dynamic range of fixed point values you expect your design to see. Will fixed point divide be rare enough that divide should be completely ignored? Maybe there should be a divide function that allows the user to tweak:
I do not have a concrete use case in mind right now (I could probably shoehorn division into my Celsius to Fahrenheit conversion experiments). Just something I mulled over today.

You could always multiply by reciprocal for an application like unit conversion.
Multiply-by-reciprocal only works if you can calculate it ahead of time, i.e. one of the multiply inputs is constant. Otherwise you'd have to find the reciprocal while your design runs before doing the multiply, which... requires a divide. The thing you're trying to avoid.
This is exactly the case for Celsius to Fahrenheit conversion, no?
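A small plain-Python sketch of that point (the fractional width and the rounding of the constant are my own illustrative choices): the 9/5 scale factor is known at elaboration time, so no runtime divider is needed.

```python
FRAC = 12
SCALE = round((9 / 5) * 2 ** FRAC)   # constant scale factor, computed ahead of time

def c_to_f(c_raw):
    # c_raw is Celsius in Q.FRAC; the product has 2*FRAC fractional bits,
    # so shift back down (truncating) and add 32 in the same format.
    return ((c_raw * SCALE) >> FRAC) + (32 << FRAC)

print(c_to_f(100 << FRAC) / 2 ** FRAC)   # ~212.0 (small error from rounding the constant)
```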
Indeed, I will need to find a different way to shoehorn a division into my Celsius to Fahrenheit experiments :). Divide by non-constant factor for fixed point is probably uncommon. But I hesitate to say it's "so uncommon it's not worth supporting in some way at all, even if a …

The question of whether … Independent of that, there's already nothing that prevents you from making a divider component with two fixed-point inputs and a fixed-point output.

Do FPGAs have divider IPs? I thought they only had multiplier IP. I assumed Amaranth provides floor division/mod for simulation purposes, and not as something you'd want to synthesize.
Yosys will actually synthesize a divider for you if you do …

Even if dividers synthesize, I don't think … In fixed point, there's no remainder, so if … Even if …

If a perfect … To avoid scope creep, I'm inclined to leave inferred division out of this RFC. We could instead do a separate RFC later for that.

That's fine, mind putting your comment re: ratio objects under Future Possibilities / making a mention that division is out of scope?
text/0041-fixed-point.md (outdated)

- `fixed.Shape(f_width, /, *, signed)`: Create a `fixed.Shape` with zero integer bits and `f_width` fractional bits.
- `fixed.Shape(i_width, f_width, /, *, signed)`: Create a `fixed.Shape` with `i_width` integer bits and `f_width` fractional bits.
  - The sign bit is not included in `i_width` or `f_width`, so a `fixed.Shape(7, 8, signed=True)` will be 16 bits wide.
Is having `i_width` positive and `f_width` negative, or the converse, in scope for this RFC? If it is, then I think `.round` should take `i_width` as an optional argument.
What would that mean?
Two scenarios where I think you could reasonably end up in this position with the current RFC:

- You start off with, say, UQ0.16 [0, 1) and right shift by 2. The natural representation is still 16 bits but with the range [0, 0.25), and I believe the representation is going to end up being UQ-2.18 in this notation?
- This is less reasonable, but say you throw a 4096 (1 << 12) point FFT at values with the format SQ7.10. If you are naive about the scaling and keep the bit width to 18 bits, you end up with SQ19.-2 (i.e. a step of 4 between every point). This is equivalent in terms of the bits on the wire to starting with SQ0.17 and ending with SQ12.5, but it feels more silly, and you can end up there naturally, so it should probably be supported.

I'm currently working on tracking down some DSP bugs in a project at work that nobody ever simulated. In my fixed point emulation code the number format I chose was basically (n_bits, exponent), which handles these cases more naturally but is probably overall less intelligible. I'd imagine this looking like `fixed.Shape(shape, exponent)` in Amaranth (i.e. `fixed.Shape(signed(18), 2)` for SQ19.-2), but that would be a pretty large change and I'm not sure it would be a net positive anyways (also don't want to bikeshed here).
OK, I've played with what the implementation does here; this might be a bit long. The implementation does allow the creation of the above types of representations (i.e. Q-4.20, Q20.-4) and appears to operate on them correctly; however, it for the most part won't generate them on its own. Left shift and right shift preserve the number of bits until either `i_width` or `f_width` would go negative, at which point it pads things out.

Right shift works as expected:

```
In [2]: fixed.SQ(8, 8), fixed.SQ(-4, 20), Signal(fixed.SQ(8, 8)) >> 8, Signal(fixed.SQ(8, 8)) >> 12
Out[2]:
(fixed.Shape(8, 8, signed=True),
 fixed.Shape(-4, 20, signed=True),
 (fixedpoint SQ0.16 (sig $signal)),
 (fixedpoint SQ0.20 (sig $signal)))
```
Left shift seems to get converted to unsigned, which I think is not the correct behavior:

```
In [4]: Signal(fixed.SQ(8, 8)) << 8, Signal(fixed.SQ(8, 8)) << 9, Signal(fixed.SQ(8, 8)) << 10, Signal(fixed.SQ(8, 8)) << 12
Out[4]:
((fixedpoint SQ16.0 (sig $signal)),
 (fixedpoint UQ18.0 (cat (const 1'd0) (sig $signal))),
 (fixedpoint UQ19.0 (cat (const 2'd0) (sig $signal))),
 (fixedpoint UQ21.0 (cat (const 4'd0) (sig $signal))))
```
I personally think the rounding semantics are a bit difficult to reason about as they stand:

```
In [5]: (Signal(fixed.SQ(8, 8)) << 12).round(-4), (Signal(fixed.SQ(8, 8)) >> 12).round(16)
Out[5]:
((fixedpoint UQ21.-4 (+ (slice (cat (const 4'd0) (sig $signal)) 4:21) (slice (cat (const 4'd0) (sig $signal)) 3:4))),
 (fixedpoint SQ0.16 (+ (slice (sig $signal) 4:17) (slice (sig $signal) 3:4))))
```
Rounding with negative fractional bits produces the sensible result (modulo the signed-to-unsigned bug), but rounding off the fractional bits in a right shift loses precision when it does not necessarily need to. I think this is a decent argument for either always preserving the number of bits in a (constant) shift, or including an `i_width` argument to round. Personally I would advocate for always preserving the number of bits, because I think that is easier to reason about and the argument to round can just be `width`, not `i_width` or `f_width`.

Addition between a number with no fractional bits and one with no integer bits also produces a number 2 bits wider than I believe is necessary:

```
In [7]: Signal(fixed.SQ(-4, 20)) + Signal(fixed.SQ(20, -4))
Out[7]: (fixedpoint SQ22.20 (+ (sig $signal) (cat (const 24'd0) (sig $signal))))
```
- The sign bit is not included in `i_width` or `f_width`, so a `fixed.Shape(7, 8, signed=True)` will be 16 bits wide.
- `fixed.Shape.cast(shape, f_width=0)`: Cast `shape` to a `fixed.Shape` instance.
- `.i_width`, `.f_width`, `.signed`: Width and signedness properties.
- `.const(value)`: Create a `fixed.Const` from `value`.
Would a `.max()` and `.min()` method or property make sense here, to get a const with the max/min representable value? Or is that outside the scope of the `Shape` API, given that the base signed and unsigned types don't have that?
We could definitely have additional functionality on specific shapes; e.g. enum shapes are pretty magical, and so are layouts. `.min()`/`.max()` seem completely reasonable to me.
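A hedged sketch of what `.min()`/`.max()` could return (my own helper, assuming the draft's convention that the sign bit is not counted in `i_width`), with the results expressed in real units:

```python
def fixed_min_max(i_width, f_width, signed):
    # Smallest step and the extreme representable values for this shape.
    step = 2 ** -f_width
    hi = 2 ** i_width - step
    lo = -(2 ** i_width) if signed else 0
    return lo, hi

print(fixed_min_max(8, 8, signed=True))   # (-256, 255.99609375)
print(fixed_min_max(4, 4, signed=False))  # (0, 15.9375)
```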
- If `other` is an `int`, it'll be cast to a `fixed.Const` first.
- If `other` is a `float`: TBD
- The result will be a new `fixed.Value` with enough precision to hold any resulting value without rounding or overflowing.
- `.__lshift__(other)`, `.__rshift__(other)`: Bit shift operators.
In the implementation, at least, the semantics of these seem to be different from the underlying Amaranth values. If the shift amount is an integer, these act like `.shift_left()` and `.shift_right()`. Those operators should probably be added here, and the semantics of `__lshift__` and `__rshift__` adjusted to match those on `unsigned()` and `signed()`.

Note: I may be wrong here, still learning the language.
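For illustration only (this is my reading of the constant-shift behaviour explored earlier in this thread, not the draft API): a constant shift on a fixed-point value can be a pure re-labelling of the binary point, keeping the same underlying bits.

```python
def shift_left_shape(i_width, f_width, amount):
    # Move the binary point: integer bits grow, fractional bits shrink, total width unchanged.
    return i_width + amount, f_width - amount

print(shift_left_shape(8, 8, 4))   # (12, 4) -- still 16 underlying bits
print(shift_left_shape(8, 8, -4))  # (4, 12) -- equivalent to a right shift by 4
```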
- The result will be a new `fixed.Value` with enough precision to hold any resulting value without rounding or overflowing.
- `.__lshift__(other)`, `.__rshift__(other)`: Bit shift operators.
- `.__neg__()`, `.__pos__()`, `.__abs__()`: Unary arithmetic operators.
- `.__lt__(other)`, `.__le__(other)`, `.__eq__(other)`, `.__ne__(other)`, `.__gt__(other)`, `.__ge__(other)`: Comparison operators.
For the comparison operators, the rounding behavior seems underdefined when the types don't have the same precision. The options seem to be:

1. Round to the precision of the left argument. ~~(what the draft implementation does)~~ EDIT: the draft implementation does not have an opinion on this.
2. Round to the precision of the least precise argument (easiest for the hardware).
3. Round to the most precise argument (excluding constants maybe?).
4. Ban comparison between values (but maybe not const and value?) with different precision.
- seems bad, I'm going to implement 3 for my FFT testing and see if it ends up being a problem
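A small plain-Python sketch of option 3 (aligning both operands to the most precise one before comparing; my own illustration on raw integer representations), which needs no rounding at all since widening the fractional part is exact:

```python
def fixed_lt(raw_a, f_a, raw_b, f_b):
    # Align both raw values to the larger number of fractional bits, then compare.
    f = max(f_a, f_b)
    return (raw_a << (f - f_a)) < (raw_b << (f - f_b))

print(fixed_lt(0b101, 2, 0b1011, 3))   # 1.25 < 1.375 -> True
```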
- Should we make it user selectable?
- We still need a default mode used when a higher precision number is passed to `.eq()`.

- Are there any other operations that would be good to have?
`.as_value()` returns a value with a signedness that doesn't depend on the signedness of the fixed-point shape. Should there be a `.lower()` or `.numerator()` method that returns `.as_value()` cast to the appropriate signedness?
Is this still being worked on? I'm familiar with the Python FxpMath module and so like the idea of denoting Qs.m or SQs.m for significand and mantissa signed, UQs.m for unsigned. The syntax @ld-cd proposed looks good to me. I'm not so into the …

It looks like there's still a question about what to do if the other value is a … The same can be said for passing any constant as a fixed-point value: I'd rather require the constant to be presented as … In the worst case, you can always add implicit conversion later without it being a breaking change.

Use case: I frequently use lookup tables generated by … I'm using an Abed-Siferd predictor to produce an accurate … To do this, I need to add dissimilar fixed-point values with proper sign extension: adding a negative …

I'm looking forward to built-in fixed-point math, the lack of which is a deficiency in VHDL and Verilog that I would very much like to escape.
We're definitely still very interested in it! The RFC will need several people driving it forward; usually two is enough (I am usually one of them and the author is another) but this one will likely need three. At the moment I'm seriously ill and will probably continue to be mostly bedridden until January (chronic pain that became worse recently). I think it's definitely worth it to work out a bunch of kinks before then, and I'll try to participate however I can.

@schnommus Could you list your modifications?
@whitequark I have tried to keep things in line with this draft and @zyp's original PR to reduce the risk of churn, so there is not so much that deviates. In general I like the current direction of the RFC. Some small things I changed:
That's all I touched. On some other bikeshedding points from above:
In case this is useful, I have developed a fixed-point library for SystemVerilog which we have open-sourced here: https://github.com/SkyworksSolutionsInc/fplib. The conventions used in this library are widespread, at least in the audio ASIC industry.

@schnommus These all look reasonable to me, although I am not a DSP expert so I may well be missing something. From what @zyp says, it might be meaningful for you to pick up the RFC yourself; what do you think?
@whitequark sure, I'm happy to take this. As a first action I would try to condense all comments above into the existing markdown from @zyp. Ideally we keep the changes in this PR to retain the comment history. I can't commit to a hard deadline due to work commitments, but my personal goal would be to get a new RFC revision out in the coming week.

I propose that instead of specifying the integer and fractional parts individually, we specify the total width and the shift amount. E.g. a …
Check my math on the `fixed.Shape` ones. I went over them a few times and got a few wrong; honestly, I spent several seconds thinking about it, but I'm not rigorously confirming they're right, it's effort.

Making this table, three things are obvious:

It seems like the SQ/UQ format is far more readable and also far more writable. Going both directions with the …
I'm a fan of knowing what size you're working with. In programming languages, we don't get an extra bit for a signed 64-bit int versus an unsigned; we actually have things like int32 vs uint32 or whatever, depending on the language. I don't see why we need to hide the details here. It seems even worse to hide those details here since we're specifying hardware.

Throw an exception. Shifting by a non-integer is absurd, and attempting to handle it would just enable programmers being clever by using non-obvious design details of the language. You can always set up a signal consisting of some set of bits in the fixed-width signal.
Actually I think I got the table wrong on the second line in both. The signed version needs to be shifted left one more than the unsigned version. |
The …

As in the existing draft RFC, I think there is room for both forms:

As long as we don't have to jump through hoops, I think we should try to expose the industry standard Q notation. (Exactly which kind of Q notation is still TBD.)

Agree, I think @zyp's suggestion above is a neat solution to this.

Agree, to me it is obvious this operation should not be permitted.
Based on all the helpful feedback above, I have an updated draft RFC -
@whitequark - would you prefer we try to stick to this PR (to preserve comment history), or would you prefer I open a new PR and we continue discussion there? I'm fine with either, although for the former I'd need the permissions. FYI @zyp
This is looking good!
One note on the rounding topic:
Simple rounding is also known as a mid-tread quantizer. If the desired output format is I.Q (total length: I+Q) we add 0.5 / 2^(Q) and truncate. The cost is a real adder and a clipper if you wanna be safe. Not trivial if you didn’t care about the DC offset error that simple truncation introduces.
But you could also do a mid rise quantizer. You truncate to I.Q, but to get rid of the negative DC offset you add that 0.5 / 2^(Q) after truncation. This is technically an adder too but in HW you can just append an LSB of 1 which is free!
But now you end up with I+Q+1 bits. If these bits were going across a bus you would transport just the I.Q but the receiver has to remember to pad the LSB.
(Note that in integer land, mid-tread is like rounding to nearest integer and mid-rise is like rounding to the nearest midpoint between integers, so you cannot represent a hard zero)
With all that said, if you want to resize a number to I+Q bits without adding a DC offset AND you don’t want the cost of an adder, you can simply truncate to I.(Q-1) bits and then pad an LSB of 1!
So I would advocate for 3 options with truncate being default:
1. truncate (floors to -inf, free, adds dc offset)
2. mid-tread (rounds, costs an adder, can represent zero)
3. mid-rise (rounds, free, cannot represent zero)
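For concreteness, here is a plain-Python sketch of those three strategies on raw integer representations (my own illustration, resizing a value from FRAC_IN to FRAC_OUT fractional bits; not the proposed API):

```python
FRAC_IN, FRAC_OUT = 8, 4
SHIFT = FRAC_IN - FRAC_OUT

def truncate(raw):
    # Drop the low bits; floors towards -inf and introduces a negative DC offset.
    return raw >> SHIFT

def mid_tread(raw):
    # Add half an output LSB before truncating; costs a real adder in hardware.
    return (raw + (1 << (SHIFT - 1))) >> SHIFT

def mid_rise(raw):
    # Truncate to FRAC_OUT - 1 fractional bits and append an LSB of 1; free in
    # hardware, but the result can never be exactly zero.
    return ((raw >> (SHIFT + 1)) << 1) | 1

x = -3  # raw value, i.e. -3 / 2**FRAC_IN
print(truncate(x), mid_tread(x), mid_rise(x))  # -1, 0, -1 (in units of 2**-FRAC_OUT)
```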
- If `value` is an `int` or `float`, it'll be cast to a `fixed.Const` first.
- If `value` is a `fixed.Value`, the precision will be extended or truncated as required.
- `.reshape(f_bits)`: Return a new `fixed.Value` with `f_bits` fractional bits, truncating or extending precision as required.
- `.reshape(shape)`: Return a new `fixed.Value` with shape `shape`, truncating or extending precision as required.
The two signatures of `.reshape()` here are my error. They should be collapsed into one `.reshape()`, although I am also considering dropping the second form if I can't find a killer use-case for it.
In any case, I think this method could use further bikeshedding. I would be just as open to calling this method `truncate(f_bits=0)` or `trunc(f_bits=0)`, although that loses the notion that it can both increase and decrease precision, which was part of the reason for not calling things `round()` in the first place.
Thank you @samimia-swks for your input on the rounding details. This is all useful context, although at this point I am inclined to postpone rounding strategies for a future RFC. In the latest version of the RFC, I have de-scoped rounding completely.
Even without an opinionated …
(I haven't been able to track the changes in depth, but the overall approach sounds good. We have previously de-scoped controversial or poorly understood aspects of RFCs and it worked quite fine.)
Really happy to see progress on this RFC. I appreciate all the work that has gone into it. I'm particularly interested in supporting user-selectable rounding modes beyond simple truncation. As noted by @vk2seb, truncation can introduce DC biases in the signal that are problematic for some DSP applications. However, I'm okay with that being left out of this particular RFC. Besides that, I like the current proposal as it stands.