-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PRP modular division bug #18
Comments
The problem is solved by a small rewrite of Lines 6893 to 6907 in 93b703d
The problem is that fquo may have a tiny round-off error so that instead of 1.00000000000000000000, it equals 0.99999999999999988898, and so in turn the conversion itmp64 equals 0 rather than 1. Ernst is aware of round-off-errors, so he makes two passes, but the problem is in the else limb of the second pass, because fquo still evaluates to just shy of an integer (0.99999999999999988898) and gets rounded down to zero again, which causes the modular division to fail since fquo is one short and returns a value of 0x0. One possible workaround (but I’m sure someone can think of something superior) is to just nudge fquo slightly higher in the else limb, since (uint64)fquo will convert it to an integer anyway:
fquo = rem64*fqinv + 0.00000000000001; // CXC: Give fquo a tiny push if we are in the 2nd pass, to ensure a 0.9999999999999 value actually gets to 1.0 |
Ouch. That kind of tricks with floating-point values is dangerous. |
It’s a nasty hack, which is why I hope someone can think of a more elegant method of dealing with the float to integer conversion. Edited to add: I nearly forgot. For some reason Mlucas really didn’t like trying the 31 base test of M44497 at 4K (so I substituted 3K which completed the test successfully):
|
I've had enough of implementing FP operations in our Arm simulator... I now hate FP :-) |
Enjoy your holiday, away from devilish FP implementations! I figure there’s no rush to solve this, given that the line 698 assertion in |
Thanks for debugging this and finding the source of the problem!
The typical solution to this problem in C is to add
That is the same error I got in #19. It looks like it affects multiple FFT lengths. |
On second thought, I think it is possible to make the fix much more targeted by making the “push” conditional on } else {
fquo = rem64*fqinv;
if(fquo >= 0.99999999999999 && fquo < 1.0) fquo += 0.00000000000001; // CXC: Give fquo a tiny push if we are in the 2nd pass, to ensure a 0.9999999999999 value actually gets to 1.0; see https://github.com/primesearch/Mlucas/issues/18
itmp64 += (uint64)fquo;
rem64 = rem64 - q*(uint64)fquo;
} |
Hmm, unless we are certain the bad value is going to be greater than |
I think we can be confident that the quotient will be evaluated to a fair number of significant digits. The main reason we are looking at this bit of code is because of a rounding error in b2 * b–2, as for some values of b we do not get 1.0, but very close to it; so I would be quite happy with ten nines (0.9999999999) as the lower bound. Contrarily I would not be happy with 0.5 since we know that this value should not be the result of a multiplication by a modular inverse where we expect to get divisibility by an integer, and if that is happening we should be getting an error. Mlucas seems happy using up to about 224–1 = 16777215 as a PRP base:
The rounding error for this base was evidently not problematic, since my debug printout showed
|
Thanks for the additional information. As you may know, this is a very common problem when using floating point numbers. The rounding error for the calculation is only going to get bigger as the base increases, especially as one gets near the limits for double precision floating point. Anyway, I will let @ldesnogu make the final decision on how we should fix this when they get back from their holiday. If you are certain that the value is going to be "very close" to 1.0, you could try adding |
Sorry for my inability to explain the problem and my concern. Generally the point of finding a modular inverse is that you don’t have to do division (which is slow) but instead a modular multiplication by the mod inverse should give the same result as a division by the number it inverts – the algorithm for obtaining a modular inverse is non-trivial but it’s still better than doing division. |
Thanks for the detailed explanation. My understanding is that all of the many algorithms implemented by Mlucas exclusively involve integers and ideally should be implemented entirely with the integer datatypes. However, modern CPUs and other hardware) are much faster when using the floating point datatypes. The SIMD instructions needed for good performance on CPUs can only be fully utilized when using the floating point datatypes. That is why all GIMPS programs, including Mlucas, are forced to use floating point datatypes, which requires constantly converting back and forth between floating point and integer numbers. However, as you know, in general, floating point datatypes cannot perform integer calculations precisely, so this causes inevitable roundoff errors (ROEs). For example, the double precision ( I understand what you are saying that the result of this calculation should be very close to an integer. However, it is multiplying a potentially very large number by a very small number, which could easily result in a huge ROE, as the floating point datatypes are not designed for working with numbers of different magnitudes. By using the floating point datatypes, we have very little control over the ROEs, so we just have to hope that they are less than 0.5 (or 0.4375) so they can be reliably detected. Currently any value between Regardless of how we decide to fix this, feel free to add a warning or assert statement here for the case that the result is not close to an integer, if you believe it is needed. |
After a little bit of thought I added a warning using |
As discovered in issue #15, a PRP test of a probable Mersenne prime was incorrectly tested, probably as a result of legacy code that was not correctly adapted from the case of Lucas–Lehmer or Pépin tests.
To PRP test a Mersenne prime efficiently with modular squaring, Mlucas evaluates b2p ≡ b2 (mod 2p – 1), and then as the final step undertakes a modular division by b2 to obtain the usual form of Fermat’s little theorem, where b2p–2 ≡ 1 (mod 2p – 1) indicates the Mersenne number is a probable prime to the base b. Any result other than 1 indicates the Mersenne number is composite.
The typical base for this PRP test is 3; of the other small primes, 2 is unsuitable as a base because any Mersenne number with an odd prime exponent is a 2–PRP pseudoprime, while bases 5 and 7 do not present problems. An assert at line 698 of Mlucas.c disables using PRP bases other than 3 (which was put in place probably owing to this issue).
So for base 3, Mlucas performs a modular division by 9, taking the 32p residue from the a[ ] array and storing the result of the division in arrtmp[ ]. (Pull request #17 fixes the primality checking loop that examines the arrtmp[ ] for any non-zero values other than the final 0x1 result.)
Bases 5 and 7 require modular division by 25 and 49 respectively as the final step of the test, and in testing the 5–PRP results were correct (0x1) while the 7–PRP results were not (0x31 ÷ 72 ≠ 0x0, the result returned).
Therefore there exists a substantial bug in the modular division code for untypical PRP bases such as 7 (and possibly others).
The text was updated successfully, but these errors were encountered: