Combined Sieve - a new program to find prime gaps.

A fast prime gap searching suite.

$ make
$ sqlite3 test.db < schema.sql
$ FN=809_1841070_11244000_1000_s15000_l200M.txt
$ ./combined_sieve --save --search-db test.db -u $FN
$ ./gap_stats --save --search-db test.db -u $FN --min-merit 20
$ ./gap_test.py --search-db test.db -u $FN
...
23142  30.1485  11244911 * 809#/1841070 -5788 to +17354	RECORD!
...

see [EXTENDED.md](for labourious notes and details)

Tools

See detailed instructions on each combined_sieve, gap_stats, gap_test, and more in EXTENDED.md tools.

Flow

combined_sieve
- Takes a set of parameters ([m_start, m_start + m_inc), P#, d, sieve-length, max-prime)
  - Use combined_sieve --mstart 1 --minc 100 --max-prime 1 -d 4 -p <P> for some insight on choose of d
- Performs combined sieve algorithm
- Produces <PARAMS>.txt, list of all numbers near m * P#/d with no small factors.
  - each row is m: -count_low, +count_high | -18 -50 -80 ... | +6 +10 +12 ...
- Writes initial range entry (with time_sieve) into range table.
gap_stats
- Calculates some statistics over each range in the interval
  - Expected gap in both directions
  - Probability of being a record gap, of being a missing gap, of being gap > --min-merit
- Store statistics in m_stats table
- Compute prob(gap) over all m and save in range_stats table
gap_test_simple / gap_test.py --unknown-filename <PARAM>.txt
- Runs PRP tests on most likely m's from <PARAM>.txt
- Stores results (as they are calculated in result & m_stats
  - Allows for saving of progress and graceful restart
- gap_test.py can produce graphs of distribution (via --plots)

Database

Often I run combined_sieve and gap_stats for a BUNCH of ranges not I never end up testing.

misc/show_ranges.sh updates and shows ranges from prime-gap-search.db.

misc/drop_range.sh deletes ranges when none of them have ever been tested

misc/record_check.py searches through prime-gap-search.db for records (over gaps.db)

misc/finaliz.py -p <P> -d <D> dumps m_stats to <UNKNOWN>.csv and <UNKNOWN>.stats.txt files. It also DELETES DATA from prime-gap-search.db use it when you are done searching that P#/d

misc/run.sh <UNKNOWN_FILENAME> runs combined_sieve, gap_stats, and prints the gap_test.py command

misc/status.py to check if a filename has been processed 100% (and can be deleted)

Benchmarks

See EXTENDED.md benchmark and slightly out of date BENCHMARKS.md

Theory

Theory and justification for some calculations in present in THEORY.md

Setup

#$ sudo apt install libgmp10 libgmp-dev
#$ sudo apt install mercurial build-essential automake autoconf bison make libtool texinfo m4
# Building gmp from source is required until 6.3
$ hg clone https://gmplib.org/repo/gmp/
$ cd gmp
$ ./.bootstrap
$ mkdir build
$ cd build
$ ../configure
$ make
$ make check
$ make install

$ sudo apt install sqlite3

$ sudo apt install libmpfr-dev libmpc-dev libomp-dev libsqlite3-dev libbenchmark-dev

# Make sure this is >= python3.7 and not python2
$ python -m pip install --user gmpy2==2.1.0b5 primegapverify

# For misc/double_check.py
$ sudo apt install gmp-ecm
$ sudo apt install sympy

$ git clone https://github.com/sethtroisi/prime-gap.git
$ cd prime-gap

$ mkdir -p logs/ unknowns/
$ sqlite3 prime-gap-search.db < schema.sql

# There are a handful of constants (MODULE_SEARCH_SECS, PRIME_RANGE_SEC, ...)
# in gap_common.cpp `combined_sieve_method2_time_estimate` that can be set for
# more accurate `combined_sieve` timing.

prime-gap-list

$ git clone --depth 10 https://github.com/primegap-list-project/prime-gap-list.git
$ sqlite3 gaps.db < prime-gap-list/allgaps.sql

Notes

gap_test_gpu is experimental using https://github.com/NVlabs/CGBN
- It's much faster for <2,048 bits.
Clang (CC=clang++-12) might speed up combined_sieve slightly.
There is some support for using OpenPFGW PrimeWiki SourceForge
- Unpack somewhere and link binary as pfgw64
- may also need to chmod a+x pfgw64
- modify is_prime in gap_utils.py
Multiple layers of verification of combined_sieve
- Can compare --method1 output with --method2
- Can add use make clean combined_sieve VALIDATE_FACTORS=1 or make clean combined_sieve VALIDATE_LARGE=1
- misc/double_check.py double checks using ecm, to check that small factor aren't found for numbers in unknown-file.txt
  - python misc/double_check.py --unknown-filename <unknown_file.txt> -c 10
- skipped PRP/s is checked in THEORY.md
Compression
- method1 doesn't support compression at this time so it also makes verifying combined_sieve slightly harder.
  - misc/convert_rle.py can convert existing files to RLE if you want to double check / shrink old files.
- Run Length Encoding
  - can be forced with --rle` saves ~60% of space
- --bitcompress
  - Saves ~95% of space, but makes code very hard
  - Stores 7 bits per byte (to avoid ascii control characters) of is_composite array
  - Reduces size of is_composite by removing any number coprime to K# or d

Quick test of all functions

This is somewhat automated with ./misc/tests.sh which should print a green Success!

quick test commands and output

$ PARAMS="-p 907 -d 2190 --mstart 1 --minc 200 --max-prime 100 --sieve-length 11000"
$ make combined_sieve gap_stats gap_test_simple
$ time ./combined_sieve --method1 -qqq --save-unknowns $PARAMS
$ time ./combined_sieve           -qqq --save-unknowns $PARAMS
$ md5sum unknowns/907_2190_1_200_s11000_l100M.{txt,m1.txt}
15a5cbff7301262caf047028c05f0525  unknowns/907_2190_1_200_s11000_l100M.txt
15a5cbff7301262caf047028c05f0525  unknowns/907_2190_1_200_s11000_l100M.m1.txt

$ ./gap_stats --unknown-filename 907_2190_1_200_s11000_l100M.txt
# Verify RECORD @ 1,7,13,101,137
# Verify "avg missing prob : 0.0000000"
# Verify "RECORD : top  50% (    26) sum(prob) = 1.50e-05 (avg: 5.77e-07)"
# Verify "RECORD : top 100% (    53) sum(prob) = 2.15e-05 (avg: 4.06e-07)"

# Optionally validate combined_sieve results with ecm
$ python misc/double_check.py --seed=123 --unknown-filename unknowns/907_2190_1_200_s11000_l100M.txt -c 5 --B1 1000
1 * 907# / 2190 	unknowns 218 + 212 = 430
			+1423  had trivial factor of 2 skipping
	+6570 	should have small factor
		ecm: 2099 (1*907#/2190+6570)/2099
			+1373  had trivial factor of 2 skipping
			-1439  had trivial factor of 2 skipping
			+7789  had trivial factor of 2 skipping
			-2306  had trivial factor of 3 skipping
			+4377  had trivial factor of 2 skipping
	-4222 	should have small factor
		ecm: 3325199 (1*907#/2190+-4222)/3325199
		factor 3325199: 1319 2521
...
17 * 907# / 2190 	unknowns 216 + 208 = 424
	+2722 	shouldn't have small factor
		ecm: 17*907#/2190+2722
...
4150 trivially composite, 86 unknowns, 179 known composites
ecm found 178 composites: known 171 composite, 7 were unknown


$ ./gap_test_simple --unknown-filename 907_2190_1_200_s11000_l100M.txt -q --min-merit 8
Testing m * 907#/2190, m = 1 + [0, 200)
Min Gap ~= 7734 (for merit > 9.0)

Reading unknowns from '907_2190_1_200_s11000_l100M.txt'

53 tests M_start(1) + mi(0 to 198)

7750  9.0185  1 * 907#/2190 -450 to +7300
	m=1  218 <- unknowns -> 212 	 450 <- gap -> 7300
	    tests     1          (15.29/sec)  0 seconds elapsed
	    unknowns  430        (avg: 430.00), 98.05% composite  50.70% <- % -> 49.30%
	    prp tests 144        (avg: 144.00) (2201.5 tests/sec)
7462  8.6548  17 * 907#/2190 -6982 to +480
6938  8.0460  19 * 907#/2190 -6722 to +216
	m=37  214 <- unknowns -> 234 	1296 <- gap -> 250
	    tests     10         (34.35/sec)  0 seconds elapsed
	    unknowns  4262       (avg: 426.20), 98.06% composite  48.97% <- % -> 51.03%
	    prp tests 643        (avg: 64.30) (2208.7 tests/sec)
8520  9.8689  53 * 907#/2190 -4084 to +4436
	m=113  207 <- unknowns -> 215 	  40 <- gap -> 972
	    tests     30         (41.87/sec)  1 seconds elapsed
	    unknowns  12676      (avg: 422.53), 98.08% composite  49.31% <- % -> 50.69%
	    prp tests 1550       (avg: 51.67) (2163.0 tests/sec)
8382  9.6979  143 * 907#/2190 -1800 to +6582
7678  8.8820  163 * 907#/2190 -432 to +7246
	m=199  216 <- unknowns -> 205 	 192 <- gap -> 30
	    tests     53         (43.88/sec)  1 seconds elapsed
	    unknowns  22377      (avg: 422.21), 98.08% composite  49.97% <- % -> 50.03%
	    prp tests 2590       (avg: 48.87) (2144.5 tests/sec)


$ python gap_test.py --unknown-filename 907_2190_1_200_s11000_l100M.txt --min-merit 8
...
7750  9.0185  1 * 907#/2190 -7300 to +450
	  1  218 <- unknowns ->  212	7300 <- gap ->  450
7462  8.6548  17 * 907#/2190 -480 to +6982
6938  8.0460  19 * 907#/2190 -216 to +6722
	 37  214 <- unknowns ->  234	 250 <- gap -> 1296
	    tests     10         (35.7/sec)  0 seconds elapsed
	    unknowns  4262       (avg: 426.20), 98.06% composite  48.97% <- % -> 51.03%
	    prp tests 643        (avg: 64.30) (2294.854 tests/sec)
	    best merit this interval: 9.018 (at m=1)
8520  9.8689  53 * 907#/2190 -4436 to +4084
	113  207 <- unknowns ->  215	 972 <- gap ->   40
	    tests     30         (43.1/sec)  1 seconds elapsed
	    unknowns  12676      (avg: 422.53), 98.08% composite  49.31% <- % -> 50.69%
	    prp tests 1550       (avg: 51.67) (2226.529 tests/sec)
	    best merit this interval: 9.869 (at m=53)
8382  9.6979  143 * 907#/2190 -6582 to +1800
7678  8.8820  163 * 907#/2190 -7246 to +432
	199  216 <- unknowns ->  205	  30 <- gap ->  192
	    tests     53         (45/sec)  1 seconds elapsed
	    unknowns  22377      (avg: 422.21), 98.08% composite  49.97% <- % -> 50.03%
	    prp tests 2590       (avg: 48.87) (2199.432 tests/sec)
	    best merit this interval: 9.698 (at m=143)

Name	Name	Last commit message	Last commit date
Latest commit sethtroisi Fix min GMP version logic Feb 3, 2025 b64c160 · Feb 3, 2025 History 633 Commits
.idea	.idea	combined_sieve: QoL tweaks	Feb 2, 2021
cmake	cmake	Project: working on cmake	Jan 19, 2021
data	data	gap_test_plotting: Fix plotting	Oct 28, 2020
misc	misc	Update some constants	Aug 9, 2024
.gitignore	.gitignore	project: Break out EXTENDED.md from README.md	Aug 25, 2023
BENCHMARKS.md	BENCHMARKS.md	project: Added some misc notes	Aug 25, 2023
CMakeLists.txt	CMakeLists.txt	Project: working on cmake	Jan 19, 2021
EXTENDED.md	EXTENDED.md	project: Break out EXTENDED.md from README.md	Aug 25, 2023
LICENSE	LICENSE	include LICENSE	Feb 5, 2020
Makefile	Makefile	gap_test_gpu: Auto-select BITS part 1	Jul 16, 2021
README.md	README.md	project: Break out EXTENDED.md from README.md	Aug 25, 2023
THEORY.md	THEORY.md	project: Added some misc notes	Aug 25, 2023
combined_sieve.cpp	combined_sieve.cpp	Add explicit (but undocumented uncompressed option)	Oct 1, 2021
gap_common.cpp	gap_common.cpp	Fix min GMP version logic	Feb 3, 2025
gap_common.h	gap_common.h	gap_test: mskip	Aug 3, 2021
gap_stats.cpp	gap_stats.cpp	Update some constants	Aug 9, 2024
gap_test.py	gap_test.py	README: updates	Aug 3, 2021
gap_test_common.cpp	gap_test_common.cpp	QoL improvements	Aug 5, 2021
gap_test_common.h	gap_test_common.h	gap_test_gpu: log prev_prime_only count	Aug 2, 2021
gap_test_gpu.cu	gap_test_gpu.cu	QoL improvements	Aug 5, 2021
gap_test_plotting.py	gap_test_plotting.py	project: Added some misc notes	Aug 25, 2023
gap_test_simple.cpp	gap_test_simple.cpp	Fix min GMP version logic	Feb 3, 2025
gap_test_stats.py	gap_test_stats.py	python: unify parse unknown and compressed line	Aug 2, 2021
gap_utils.py	gap_utils.py	python: unify parse unknown and compressed line	Aug 2, 2021
gap_utils_primes.py	gap_utils_primes.py	Allow newer GMP versions	Aug 9, 2024
miller_rabin.h	miller_rabin.h	gap_test_gpu: fix busy wait on cuda	Jul 28, 2021
modulo_search.cpp	modulo_search.cpp	project: Added some misc notes	Aug 25, 2023
modulo_search.h	modulo_search.h	Project: Handle m > uint32	May 18, 2021
schema.sql	schema.sql	misc: various TODO, comments, and self notes	Jul 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Combined Sieve - a new program to find prime gaps.

Table of Contents

Overview

Tools

Flow

Database

Benchmarks

Theory

Setup

Notes

Quick test of all functions

About

Languages

License

sethtroisi/prime-gap

Folders and files

Latest commit

History

Repository files navigation

Combined Sieve - a new program to find prime gaps.

Table of Contents

Overview

Tools

Flow

Database

Benchmarks

Theory

Setup

Notes

Quick test of all functions

About

Resources

License

Stars

Watchers

Forks

Languages