
different results with evalpoly #28

Open
JeffreySarnoff opened this issue Nov 24, 2021 · 7 comments
Comments

@JeffreySarnoff
Contributor

julia> a=1.5;aa=StaticFloat64(a); b=1/3; bb=StaticFloat64(b); a-aa, b-bb
(0.0, 0.0)

julia> x=0.875;xx=StaticFloat64(x); x-xx
0.0

julia> evalpoly(x, (a,b))
1.7916666666666667

julia> evalpoly(xx, (aa,bb))
1.7916666666666665
@JeffreySarnoff JeffreySarnoff changed the title unequal results with evalpoly different results with evalpoly Nov 24, 2021
@JeffreySarnoff
Contributor Author

It seems the issue is that muladd is not automatically replaced with fma when (some of) the inputs are Static, even on systems where that replacement happens for Float64s.
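
For a degree-1 polynomial, evalpoly performs a single Horner step, muladd(x, b, a), so whether that step fuses into an fma explains the two results above. A minimal sketch using the values from the report (the fused result assumes hardware fma, as in the REPL session above):

a, b, x = 1.5, 1/3, 0.875

muladd(x, b, a)  # 1.7916666666666667 where muladd compiles to fma(x, b, a)
x * b + a        # 1.7916666666666665: the multiply and add round separately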

@Tokazama
Collaborator

Do we just need to define muladd?

@JeffreySarnoff
Contributor Author

JeffreySarnoff commented Nov 25, 2021

Yes, assuming we do not care if an emulated fma is used where there is no hardware support for fma. I think that is fine when all args are Static; when one (e.g. the x used in evalpoly) is not static, we may or may not care that it could use the emulated fma. Many more systems running Julia have hardware-supported fma than do not.

@JeffreySarnoff
Contributor Author

Here is a quick test for hardware-supported fma (only run with v1.7.0-rc3 on a system where fma is supported in hardware):

a = prevfloat(1.0); b = nextfloat(1.0); c = -1.0;
const FastFMA = fma(a,b,c) === muladd(a,b,c)
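
Why this probe distinguishes the two paths: with a = 1 - eps()/2 and b = 1 + eps(), the exact product is 1 + eps()/2 - eps()^2/2, which rounds to exactly 1.0 in Float64, so the unfused form loses the residual while a true fma keeps it:

a * b + c     # 0.0: a*b rounds to 1.0 before the add
fma(a, b, c)  # nonzero: the exact residual eps()/2 - eps()^2/2 survives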

so we could use

if fma(prevfloat(1.0), nextfloat(1.0), -1.0) === muladd(prevfloat(1.0), nextfloat(1.0), -1.0)
    Base.muladd(a::StaticFloat64, b::StaticFloat64, c::StaticFloat64) = fma(a, b, c)
    Base.muladd(a::Float64, b::StaticFloat64, c::StaticFloat64) = fma(a, b, c)
    Base.muladd(a::StaticFloat64, b::Float64, c::StaticFloat64) = fma(a, b, c)
    Base.muladd(a::StaticFloat64, b::StaticFloat64, c::Float64) = fma(a, b, c)
    # others if needed
end

@chriselrod
Collaborator

chriselrod commented Nov 25, 2021

Here is a quick test for hardware-supported fma

This won't work in general, because the compiler has a lot of freedom in doing what it wants, meaning this will often return the wrong answer when actually placed in a module.
Base.FMA_NATIVE is false on every computer I've tried it on when building from source, even though I can copy/paste the definition into the REPL and get true (and the value really is true).

Also, see
JuliaLang/julia#43085

@JeffreySarnoff
Contributor Author

JeffreySarnoff commented Nov 25, 2021

That PR would give us an easy way forward, as I understand it.

Meanwhile, this rewrite (and shift in logic) seems to work when used within another module. Does this get us any nearer?

module FastFMA

export HasFastFMA

a = prevfloat(1.0); b = nextfloat(1.0); c = -1.0;

# Force the multiply and the add to round separately by hiding the
# multiply behind a @noinline barrier, so the compiler cannot fuse them.
function sequential_muladd(a, b, c)
    @noinline mul(a, b) = a * b
    mul(a, b) + c
end

# If muladd differs from the deliberately unfused version,
# muladd must have compiled to a fused fma.
const HasFastFMA = sequential_muladd(a, b, c) !== muladd(a, b, c)

end  # FastFMA
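
Combining the two sketches, the flag could gate the method definitions from earlier in the thread (hypothetical, and still subject to the precompilation caveat above):

using .FastFMA

if HasFastFMA
    Base.muladd(a::StaticFloat64, b::StaticFloat64, c::StaticFloat64) = fma(a, b, c)
    # ... plus the mixed Float64/StaticFloat64 methods, if needed
end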

@JeffreySarnoff
Contributor Author

JeffreySarnoff commented Nov 26, 2021

🖋️ This issue does not require fma testing to be resolved.
In Base/float.jl (around line 410) there are the functions
muladd(x::T, y::T, z::T) for each T in (Float64, Float32, Float16).

Base.muladd is defined using the Core.Intrinsics function muladd_float. The Core.Intrinsics float functions are:
[:abs_float, :add_float, :copysign_float, :div_float, :eq_float, :fma_float, :le_float, :lt_float, :mul_float, :muladd_float, :ne_float, :neg_float, :rem_float, :sub_float].

These intrinsics (those that are binary or ternary functions, so this does not apply to abs_float and neg_float) require their arguments to be values of a common concrete type. Our StaticFloat64 is a UnionAll type until it carries a specific value as its parameter; once realized, the types StaticFloat64{x} are distinct for each distinct value x.

zero = StaticFloat64(0)
one  = StaticFloat64(1)
typeof(zero) != typeof(one)      # true: each value has its own concrete type
# typeof(zero).name === typeof(one).name
typeof(zero) <: StaticFloat64    # true
typeof(one)  <: StaticFloat64    # true
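
As a sketch of the constraint, the intrinsic can be called directly (the second call is expected to error because the argument types differ; hypothetical REPL illustration):

Core.Intrinsics.muladd_float(1.5, 1/3, 0.875)      # ok: three Float64s
# Core.Intrinsics.muladd_float(1.5f0, 1/3, 0.875)  # errors: types must match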

For evalpoly to work with StaticFloat64 as it does with Float64, muladd(x::T1, y::T2, z::T3) where {T1<:SF, T2<:SF, T3<:SF} must be intercepted to let the args x, y, z become values of one shared type:

const SF = StaticFloat64
Base.muladd(x::T1, y::T2, z::T3) where {T1<:SF, T2<:SF, T3<:SF} =
    muladd(dynamic(x), dynamic(y), dynamic(z))

Now muladd_float will be used as we prefer (no difference between calculating with Float64s and with StaticFloat64s).
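
A quick check that the interception restores agreement with the report at the top of the thread (assuming Static.jl's dynamic is in scope):

xx, aa, bb = StaticFloat64(0.875), StaticFloat64(1.5), StaticFloat64(1/3)
evalpoly(xx, (aa, bb)) == evalpoly(0.875, (1.5, 1/3))  # expected true with the method above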
