YCSB Benchmark

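Introduction

YCSB (Yahoo! Cloud Serving Benchmark) is an open-source specification and program suite for evaluating the retrieval and maintenance capabilities of computer programs. It is often used to compare the relative performance of NoSQL database management systems. This test uses a modified version of YCSB to load 38,508,221 Wikipedia articles into each of official Redis, Pika, and Pika on TerarkDB, and then measures the random read performance of the three under different memory limits.

YCSB supports several request distributions. One of them is uniform, where every record is equally likely to be requested, so the results mainly reflect random access performance. All tests in this document use the uniform distribution.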
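As a rough illustration of what a uniform request distribution means, the sketch below draws record indices so that each record is equally likely on every read. It is not YCSB's own generator; the function name, the seed, and the tiny record count are assumptions chosen for the example.

```python
import random
from collections import Counter


def uniform_key_indices(record_count, operation_count, seed=0):
    """Return operation_count record indices, each drawn uniformly from [0, record_count)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [rng.randrange(record_count) for _ in range(operation_count)]


# With 10 records and 100,000 reads, every record is hit roughly 10,000 times.
counts = Counter(uniform_key_indices(record_count=10, operation_count=100_000))
for index in sorted(counts):
    print(index, counts[index])
```

Because no record is favored, a cache holding a fraction of the data serves about that same fraction of the reads, which is why the uniform setting stresses random access.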

The test program is terark/YCSB 1.0.1, which adds to the original YCSB the ability to read a text file as the data source.

Databases tested: Redis, Pika, and Pika on TerarkDB.

Test Platform

CPU	Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz x2 (16 cores, 32 threads in total)
Memory	DDR4 16G @ 1866 MHz x 12 (192 G in total)
SSD	INTEL SSDSC2BP48 0420 IOPS 49000 (480 G in total)
OS	CentOS 7

The official Redis version used in the test is redis-4.0.9.

Below, G and GB denote 2³⁰ bytes, not 10⁹.

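Data Import

Stock YCSB can only load automatically generated data, which cannot show how well a compression algorithm performs, so we modified the YCSB source code to support loading data from a specified file.

After the import, the database sizes compare as follows:

Database	Database size	Compression ratio	Record count	Data size
Redis	65 G	63.1% or 1.58x	38,508,221	103 G
Pika	61 G	59.2% or 1.69x	38,508,221	103 G
Pika on TerarkDB	28 G	27.2% or 3.68x	38,508,221	103 G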
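The two figures in the compression ratio column follow from the sizes in the table: the percentage is the database size divided by the 103 G data size, and the multiple is the reciprocal. A small Python check of that arithmetic (the function and constant names are illustrative only):

```python
RAW_DATA_GIB = 103  # size of the source data, from the table
DB_SIZE_GIB = {"Redis": 65, "Pika": 61, "Pika on TerarkDB": 28}


def compression_ratio(db_gib, raw_gib=RAW_DATA_GIB):
    """Return (database size as a percentage of the raw data, raw data / database size)."""
    percent = 100 * db_gib / raw_gib
    factor = raw_gib / db_gib
    return percent, factor


for name, size in DB_SIZE_GIB.items():
    percent, factor = compression_ratio(size)
    print(f"{name}: {percent:.1f}% of the raw size, {factor:.2f}x")
```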

The YCSB command used to load the data:

bin/ycsb load redis -s -P wikipedia_load.conf -threads 32

The YCSB command used for the random reads:

bin/ycsb run redis -s -P wikipedia_read.conf -threads 32

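Test Results

We ran the random read test under memory limits of 192G, 32G, 24G, and 8G. The limits were enforced with a memory-hogging tool, which pins a set amount of memory (not swappable) so that the memory available to the database equals the value listed.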
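The memory-hogging tool itself is not shown here. As a minimal sketch of the idea, assuming Linux and the 192 G test machine, a process can map anonymous memory, touch every page, and pin it with `mlock` so it cannot be swapped out. The function names are hypothetical, and locking this much memory normally needs root or a raised `ulimit -l`.

```python
import ctypes
import mmap

GIB = 2**30  # "G" in this document means 2^30 bytes


def bytes_to_occupy(total_gib, limit_gib):
    """Memory the hog must hold so that only limit_gib remains for everything else."""
    if not 0 < limit_gib <= total_gib:
        raise ValueError("limit must be positive and no larger than total memory")
    return (total_gib - limit_gib) * GIB


def occupy(n_bytes, lock=True):
    """Map n_bytes of anonymous memory, touch each page, and optionally mlock it."""
    region = mmap.mmap(-1, n_bytes)  # anonymous mapping; n_bytes must be > 0
    for offset in range(0, n_bytes, mmap.PAGESIZE):
        region[offset] = 1  # writing forces the kernel to back the page
    if lock:
        libc = ctypes.CDLL(None, use_errno=True)  # assumption: a libc that exposes mlock
        address = ctypes.addressof(ctypes.c_char.from_buffer(region))
        if libc.mlock(ctypes.c_void_p(address), ctypes.c_size_t(n_bytes)) != 0:
            raise OSError(ctypes.get_errno(), "mlock failed (check ulimit -l or run as root)")
    return region  # keep a reference for as long as the memory should stay occupied


if __name__ == "__main__":
    print(bytes_to_occupy(192, 32) // GIB, "G would be pinned for the 32G run")
```

For the 32G run this means pinning 160 G. The 192G run needs no hog at all, since the limit equals the installed memory.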

Memory	Database	OPS
192G	Pika on TerarkDB	94,251
	Pika	68,638
	Redis	50,132
32G	Pika on TerarkDB	72,768
	Pika	37,445
	Redis	19,790
24G	Pika on TerarkDB	47,664
	Pika	30,673
	Redis	13,982
8G	Pika on TerarkDB	483
	Pika	9,148
	Redis
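To read the table as relative performance, each OPS value can be divided by a baseline database's value at the same memory limit. The sketch below does that with the numbers copied from the table. The Redis cell at 8G has no reported value, so it is kept as `None` rather than filled in, and the helper name is illustrative.

```python
# OPS values copied from the results table; None marks the cell with no reported value.
RESULTS = {
    "192G": {"Pika on TerarkDB": 94251, "Pika": 68638, "Redis": 50132},
    "32G": {"Pika on TerarkDB": 72768, "Pika": 37445, "Redis": 19790},
    "24G": {"Pika on TerarkDB": 47664, "Pika": 30673, "Redis": 13982},
    "8G": {"Pika on TerarkDB": 483, "Pika": 9148, "Redis": None},
}


def relative_ops(results, baseline):
    """Return OPS divided by the baseline's OPS for each memory limit (None if unavailable)."""
    table = {}
    for memory, row in results.items():
        base = row.get(baseline)
        table[memory] = {
            name: (ops / base if ops is not None and base else None)
            for name, ops in row.items()
        }
    return table


for memory, row in relative_ops(RESULTS, "Pika").items():
    for name, ratio in row.items():
        text = "n/a" if ratio is None else f"{ratio:.2f}x"
        print(f"{memory:>5}  {name:<17} {text}")
```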

Configuration Files

wikipedia_load.conf

recordcount=38508221

workload=com.yahoo.ycsb.workloads.FileWorkload

redis.host=127.0.0.1
redis.port=9221
table=data

datafile=/path/to/wikipedia.txt
fieldnames=cur_id,cur_namespace,cur_title,cur_text,cur_comment,cur_user,cur_user_text,cur_timestamp,cur_restrictions,cur_counter,cur_is_redirect,cur_minor_edit,cur_random,cur_touched,inverse_timestamp
delimiter=\t
usecustomkey=true
keyfield=0,1,2
fieldnum=15
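The file-related properties are not explained in this page. Taking their names at face value, a plausible reading is that each line of the data file is split on `delimiter` into `fieldnum` values named by `fieldnames`, and that `keyfield` lists the zero-based positions that form the record key. The Python sketch below illustrates that reading only. It is not the FileWorkload implementation, the function name is hypothetical, and how the key fields are actually combined is not stated, so they are returned as a tuple.

```python
FIELD_NAMES = (
    "cur_id,cur_namespace,cur_title,cur_text,cur_comment,cur_user,cur_user_text,"
    "cur_timestamp,cur_restrictions,cur_counter,cur_is_redirect,cur_minor_edit,"
    "cur_random,cur_touched,inverse_timestamp"
).split(",")


def split_record(line, delimiter="\t", fieldnum=15, key_positions=(0, 1, 2)):
    """Split one data-file line into (key fields, field-name -> value mapping)."""
    values = line.rstrip("\n").split(delimiter)
    if len(values) != fieldnum:
        raise ValueError(f"expected {fieldnum} fields, got {len(values)}")
    # Assumption: keyfield positions are zero-based indices into the split line.
    key = tuple(values[i] for i in key_positions)
    fields = dict(zip(FIELD_NAMES, values))
    return key, fields


# Made-up sample line: fifteen placeholder values joined by tabs.
sample = "\t".join(f"value{i}" for i in range(15))
key, fields = split_record(sample)
print(key)
print(fields["cur_title"])
```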

wikipedia_read.conf

operationcount=38508221
workload=com.yahoo.ycsb.workloads.FileWorkload

readallfields=true
sourcefromself=false

readproportion=1
writeproportion=0

redis.host=127.0.0.1
redis.port=9221
table=data
requestdistribution=uniform

keyfile=/disk2/data/wikipedia.flat.no.tab.key.shuf
datafile=/disk2/data/wikipedia.flat.no.tab

fieldnames=cur_id,cur_namespace,cur_title,cur_text,cur_comment,cur_user,cur_user_text,cur_timestamp,cur_restrictions,cur_counter,cur_is_redirect,cur_minor_edit,cur_random,cur_touched,inverse_timestamp
delimiter=\t
usecustomkey=true
keyfield=0,1,2
fieldnum=15

writeinread=false
writerate=0
batchread=1

Configuration File Notes
