forked from rygorous/ryg_rans
This is a public-domain implementation of a byte-aligned rANS encoder.
rans_byte.h has all the good stuff and comments on how to use it.
rans64.h is a version for 64-bit architectures (faster and more accurate).
rans_word_sse41.h is a word-aligned SIMD decoder (I'll write more about this
later).
See my blog http://fgiesen.wordpress.com/ for some notes on the design.
I intend to write a blog post about the design soon. Until then, sneak preview!
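
For orientation, here is a minimal, self-contained sketch of what byte-aligned
rANS coding looks like. This is not the code from rans_byte.h: the sketch_*
names, the fixed 4-symbol model and the 12-bit frequency scale are all made up
for illustration, and the real header is more general. The idea it shows: the
state stays in [2^23, 2^31), the encoder walks the symbols in reverse and
writes bytes backwards, and the decoder reads those bytes forwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_L          (1u << 23) /* lower bound of the normalized state   */
#define SKETCH_SCALE_BITS 12         /* frequencies sum to 1 << 12 (assumed)  */

/* Hypothetical fixed model for symbols 0..3; starts are cumulative freqs. */
static const uint32_t sketch_freq[4]  = { 2048, 1024,  512,  512 };
static const uint32_t sketch_start[4] = {    0, 2048, 3072, 3584 };

/* Encode n symbols (last one first), writing bytes backwards from buf_end.
   Returns a pointer to the first byte of the finished stream. */
static uint8_t *sketch_encode(const uint8_t *syms, size_t n, uint8_t *buf_end)
{
    uint32_t x = SKETCH_L;
    uint8_t *p = buf_end;
    for (size_t i = n; i-- > 0; ) {
        uint32_t f = sketch_freq[syms[i]];
        uint32_t s = sketch_start[syms[i]];
        /* Renormalize: shift out low bytes until the update below cannot
           push the state past the top of its interval. */
        uint32_t x_max = ((SKETCH_L >> SKETCH_SCALE_BITS) << 8) * f;
        while (x >= x_max) {
            *--p = (uint8_t)(x & 0xff);
            x >>= 8;
        }
        /* The rANS state update. */
        x = ((x / f) << SKETCH_SCALE_BITS) + (x % f) + s;
    }
    /* Flush the final state as 4 little-endian bytes. */
    p -= 4;
    p[0] = (uint8_t)(x >> 0);
    p[1] = (uint8_t)(x >> 8);
    p[2] = (uint8_t)(x >> 16);
    p[3] = (uint8_t)(x >> 24);
    return p;
}

/* Decode n symbols from the stream starting at p. */
static void sketch_decode(const uint8_t *p, uint8_t *out, size_t n)
{
    const uint32_t mask = (1u << SKETCH_SCALE_BITS) - 1;
    uint32_t x = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    p += 4;
    for (size_t i = 0; i < n; i++) {
        uint32_t slot = x & mask;      /* cumulative frequency of the symbol */
        uint8_t s = 0;
        while (slot >= sketch_start[s] + sketch_freq[s])
            s++;                       /* linear search is fine for 4 symbols */
        out[i] = s;
        /* Undo the encoder's state update, then pull bytes back in. */
        x = sketch_freq[s] * (x >> SKETCH_SCALE_BITS) + slot - sketch_start[s];
        while (x < SKETCH_L)
            x = (x << 8) | *p++;
    }
}

/* Returns 1 if encoding and then decoding reproduces the input (n <= 64). */
static int sketch_roundtrip(const uint8_t *syms, size_t n)
{
    uint8_t buf[256], out[64];
    assert(n <= sizeof out);
    uint8_t *stream = sketch_encode(syms, n, buf + sizeof buf);
    sketch_decode(stream, out, n);
    return memcmp(syms, out, n) == 0;
}
```

Calling sketch_roundtrip on a short array of symbols in 0..3 should return 1.
For real use, follow the comments in rans_byte.h rather than this sketch.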
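Results on my machine (Sandy Bridge i7-2600K) with rans_byte in 64-bit mode.
In each group, the timings before the byte count are encode runs and the
timings after it are decode runs: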
----
rANS encode:
12896496 clocks, 16.8 clocks/symbol (192.8MiB/s)
12486912 clocks, 16.2 clocks/symbol (199.2MiB/s)
12511975 clocks, 16.3 clocks/symbol (198.8MiB/s)
12660765 clocks, 16.5 clocks/symbol (196.4MiB/s)
12550285 clocks, 16.3 clocks/symbol (198.2MiB/s)
rANS: 435113 bytes
17023550 clocks, 22.1 clocks/symbol (146.1MiB/s)
18081509 clocks, 23.5 clocks/symbol (137.5MiB/s)
16901632 clocks, 22.0 clocks/symbol (147.1MiB/s)
17166188 clocks, 22.3 clocks/symbol (144.9MiB/s)
17235859 clocks, 22.4 clocks/symbol (144.3MiB/s)
decode ok!
interleaved rANS encode:
9618004 clocks, 12.5 clocks/symbol (258.6MiB/s)
9488277 clocks, 12.3 clocks/symbol (262.1MiB/s)
9460194 clocks, 12.3 clocks/symbol (262.9MiB/s)
9582025 clocks, 12.5 clocks/symbol (259.5MiB/s)
9332017 clocks, 12.1 clocks/symbol (266.5MiB/s)
interleaved rANS: 435117 bytes
10687601 clocks, 13.9 clocks/symbol (232.7MB/s)
10637918 clocks, 13.8 clocks/symbol (233.8MB/s)
10909652 clocks, 14.2 clocks/symbol (227.9MB/s)
10947637 clocks, 14.2 clocks/symbol (227.2MB/s)
10529464 clocks, 13.7 clocks/symbol (236.2MB/s)
decode ok!
----
And here's rans64 in 64-bit mode:
----
rANS encode:
10256075 clocks, 13.3 clocks/symbol (242.3MiB/s)
10620132 clocks, 13.8 clocks/symbol (234.1MiB/s)
10043080 clocks, 13.1 clocks/symbol (247.6MiB/s)
9878205 clocks, 12.8 clocks/symbol (251.8MiB/s)
10122645 clocks, 13.2 clocks/symbol (245.7MiB/s)
rANS: 435116 bytes
14244155 clocks, 18.5 clocks/symbol (174.6MiB/s)
15072524 clocks, 19.6 clocks/symbol (165.0MiB/s)
14787604 clocks, 19.2 clocks/symbol (168.2MiB/s)
14736556 clocks, 19.2 clocks/symbol (168.8MiB/s)
14686129 clocks, 19.1 clocks/symbol (169.3MiB/s)
decode ok!
interleaved rANS encode:
7691159 clocks, 10.0 clocks/symbol (323.3MiB/s)
7182692 clocks, 9.3 clocks/symbol (346.2MiB/s)
7060804 clocks, 9.2 clocks/symbol (352.2MiB/s)
6949201 clocks, 9.0 clocks/symbol (357.9MiB/s)
6876415 clocks, 8.9 clocks/symbol (361.6MiB/s)
interleaved rANS: 435120 bytes
8133574 clocks, 10.6 clocks/symbol (305.7MB/s)
8631618 clocks, 11.2 clocks/symbol (288.1MB/s)
8643790 clocks, 11.2 clocks/symbol (287.7MB/s)
8449364 clocks, 11.0 clocks/symbol (294.3MB/s)
8331444 clocks, 10.8 clocks/symbol (298.5MB/s)
decode ok!
----
Note that these runs use "book1", which is a relatively short test file, and
the measurement setup is not great, so treat the numbers as rough.
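
To make the columns above easier to read, here is a small sketch of how the
derived figures relate to the raw counts. The helper names are hypothetical,
and the input size of 768771 bytes is an assumption: it is the usual size of
"book1" from the Calgary corpus and is not stated in this README. It is
consistent with the numbers above, though (12896496 clocks over 768771 symbols
comes out at about 16.8 clocks/symbol).

```c
#include <assert.h>
#include <stdint.h>

/* Average cycle cost per coded symbol (here one symbol is one input byte). */
static double clocks_per_symbol(uint64_t clocks, uint64_t n_symbols)
{
    assert(n_symbols > 0);
    return (double)clocks / (double)n_symbols;
}

/* Average compressed size per input symbol, in bits. */
static double bits_per_symbol(uint64_t compressed_bytes, uint64_t n_symbols)
{
    assert(n_symbols > 0);
    return 8.0 * (double)compressed_bytes / (double)n_symbols;
}

/* Throughput in MiB/s (1 MiB = 1048576 bytes) from a wall-clock duration. */
static double mib_per_second(uint64_t n_bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)n_bytes / (seconds * 1048576.0);
}
```

Under the same size assumption, bits_per_symbol(435113, 768771) comes out at
roughly 4.5 bits per input byte for the non-interleaved rans_byte run.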
-Fabian "ryg" Giesen, Feb 2014.