Multiple KeyTuples support in Anna #46

authwork · 2020-05-16T10:30:41Z

@vsreekanti
Many thanks for your help.
I found that a KeyRequest can carry multiple KeyTuples in a single transmission, so I tried to extend the client accordingly:

string vget_async(const vector<Key>& keys, int size) {
	// Skip the pending_get_response_map_ check to keep the example simple.
	//if (pending_get_response_map_.find(keys[0])
	//			== pending_get_response_map_.end()) {
		KeyRequest request;
		request.set_request_id(get_request_id());
		request.set_response_address(ut_.response_connect_address());
		request.set_type(RequestType::GET);
		for (int i = 0; i < size; i++) {
			KeyTuple* tp = request.add_tuples();
			tp->set_key(keys[i]);
		}
		try_request(request);
		return request.request_id();
	//}
	//return "";
}

With size = 1 it works normally. With size greater than 1, the request IDs no longer match:

# size = 1
time: 164087
throughput: 6.09433e-06
time: 1348
throughput: 0.00074184
time: 318
throughput: 0.00314465
staleness of one key: 12
10.1.2.1:0_9=?10.1.2.1:0_9
number of keys: 1
10.1.2.1:0_11=?10.1.2.1:0_11


# size = 2
time: 103791
throughput: 1.92695e-05
time: 56
throughput: 0.0357143
time: 310
throughput: 0.00645161
staleness of one key: 0
10.1.2.1:0_11=?10.1.2.1:0_7
number of keys: 1
=?10.1.2.1:0_11
staleness: 15
[libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:1522] CHECK failed: (index) < (current_size_):
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: (index) < (current_size_):
Aborted

Update:
I found the cause of this bug. It may not be related to multiple KeyTuples but to the receive function I use for batched PUTs.
It looks like this:

void receive(KvsClientInterface *client, int number) {
	vector<KeyResponse> responses = client->receive_async();
	while (responses.size() < number) {
		responses = client->receive_async();
		number = number - responses.size();
		if (number == 0)
			break;
	}
}
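
For comparison, here is a minimal sketch of a receive loop that keeps every batch and counts responses cumulatively. It assumes, as the snippet above suggests, that receive_async() is non-blocking and may return zero or more responses per call. FakeResponse, FakeClient, receive_all and the max_polls bound are hypothetical names for illustration, not part of Anna's client API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for Anna's KeyResponse; only one illustrative field is modeled.
struct FakeResponse {
  std::string response_id;
};

// Hypothetical client that returns pre-set batches, mimicking a
// non-blocking receive_async() that may also return nothing.
struct FakeClient {
  std::vector<std::vector<FakeResponse>> batches;
  std::size_t next = 0;

  std::vector<FakeResponse> receive_async() {
    if (next < batches.size()) return batches[next++];
    return {};
  }
};

// Poll until `expected` responses have been collected in total, or until
// `max_polls` polls have been made. Each batch is appended to the result,
// so responses from earlier polls are never overwritten.
template <typename Client>
std::vector<FakeResponse> receive_all(Client& client, std::size_t expected,
                                      std::size_t max_polls) {
  std::vector<FakeResponse> collected;
  for (std::size_t poll = 0;
       poll < max_polls && collected.size() < expected; ++poll) {
    std::vector<FakeResponse> batch = client.receive_async();
    collected.insert(collected.end(), batch.begin(), batch.end());
  }
  return collected;
}
```

The caller compares the size of the returned vector with the expected count to tell a complete result from one that stopped at the poll limit.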

Update II:
A comment in the Python client says:

# PUT only supports one key operations, we only ever have to look at
# the first KeyTuple returned.

But user_request_handler iterates over all tuples in the request and handles each of them, so why does the comment say that PUT only supports single-key operations?

 for (const auto &tuple : request.tuples()) {
    // first check if the thread is responsible for the key
    Key key = tuple.key();
    string payload = tuple.payload();
    ...
    else if (request_type == RequestType::PUT) {
           ...
    }
    ...
}

I have extended the PUT operation in the same way as GET, but only the first key-value pair in the key vector is executed:

x = client->vput_async(keys, values, count, LatticeType);
receive(client)
x = client->vget_async(keys, count);

Only the value for keys[0] is returned.


vsreekanti · 2020-05-18T16:18:53Z

The Python client only currently supports a single PUT not for a fundamental reason but because we just haven't implemented putting multiple keys in parallel. Thanks for catching the receive bug. Please make a PR with that change.

With regard to how the sends and receives might work, keep in mind that Anna uses a DHT under the hood. So when you call a PUT with two keys, say k1 and k2, those two keys might be on different machines (i.e., k1 goes to node1 and k2 goes to node2). Those nodes will send separate responses, so you will have to look at not just the request ID but also the key to which each node is responding to make sure you have the correct request/response mapping. That's also why you will only see 1 KeyTuple in the response -- node1 doesn't know you also sent a request for k2 to node2, so it only sends one response. Hope this clears things up!
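
One way to picture the request/response mapping described above is to track each outstanding (request ID, key) pair and tick it off as responses arrive. This is only a sketch: PendingTracker and its methods are hypothetical names, and it assumes each response exposes the request ID it answers and the key of its single tuple.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Tracks outstanding (request ID, key) pairs for multi-key requests.
// Hypothetical helper, not part of Anna's client.
class PendingTracker {
 public:
  // Register every key carried by one logical request.
  void add_request(const std::string& request_id,
                   const std::vector<std::string>& keys) {
    for (const std::string& key : keys) {
      pending_.emplace(request_id, key);
    }
  }

  // Mark one response as received. Returns false if the pair was not
  // outstanding (unknown request ID, unexpected key, or a duplicate).
  bool complete(const std::string& request_id, const std::string& key) {
    return pending_.erase(std::make_pair(request_id, key)) == 1;
  }

  // True once every registered pair has been answered.
  bool done() const { return pending_.empty(); }

 private:
  std::set<std::pair<std::string, std::string>> pending_;
};
```

With k1 on node1 and k2 on node2, the client would register both keys under one request ID, call complete() once per incoming response, and treat the request as finished only when done() returns true.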

authwork · 2020-05-19T02:47:50Z

Many thanks for your explanation.
I want to be sure I understand:

Only the Python client of Anna supports getting multiple keys in parallel, and it does so on the client side by sending separate requests to different machines.
If some keys are located on the same machine, one request with multiple KeyTuples can be used, because user_request_handler handles every KeyTuple in a request. If that is the case, and I configure replica.memory=4 and replica.local=1 on 4 machines, does that mean each machine necessarily holds one replica of every key?

Currently, in my tests, I found that issuing many requests is costly. I compared:

100 PUT requests of 32 bytes each
1 PUT request of 3200 bytes
The first approach had 10 times the latency of the second.
I guess this is caused by the number of iterations of the PUT(key) / receive(client) loop.

============================================
An ideal case (sharding + replicas) would look like this:

The client divides a set of keys into groups based on their locations.
The client sends one request to each machine, carrying that machine's group of keys.
The client collects the responses and extracts the values.

I am also looking for a batched way to do this.
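
The grouping step in the ideal case above could be sketched as follows. The placement rule here is an assumption made purely for illustration: node_for_key hashes the key onto a list of node names, whereas Anna's actual hash ring and replication logic are not modeled. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical placement rule: hash the key onto one of `nodes`.
// `nodes` must be non-empty. This is NOT Anna's real hash ring.
std::string node_for_key(const std::string& key,
                         const std::vector<std::string>& nodes) {
  return nodes[std::hash<std::string>{}(key) % nodes.size()];
}

// Split `keys` into one batch per responsible node, so that each batch
// could be sent as a single request carrying several KeyTuples.
std::map<std::string, std::vector<std::string>> group_by_node(
    const std::vector<std::string>& keys,
    const std::vector<std::string>& nodes) {
  std::map<std::string, std::vector<std::string>> groups;
  for (const std::string& key : keys) {
    groups[node_for_key(key, nodes)].push_back(key);
  }
  return groups;
}
```

Each map entry then corresponds to one request: the map key names the destination and the vector lists the keys to add as tuples.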
