iterator works slow #12

gabrieldodan · 2014-08-28T10:27:52Z

Hi, I have made some performance tests and I noticed that the iterator works very slow. I think is because of callback for each iteration. On each iteration it call the callback function, that's expensive. I think a Sync version for next() function would be very useful, e.g. iterator.nextSync() that should return the current key/value pair. What you think ?

Besides that, the library is really fast !

The text was updated successfully, but these errors were encountered:

stephenmathieson · 2014-08-28T12:04:50Z

hmm yeah a sync version may be nice. I haven't hit performance issues with #next(), but we do jump from JS to c++ a lot, so I believe it may be slow.

the next addition I make to this lib should be node 0.11 support, but I may find time for sync iterarors. if you opened a pr, I'd happily merge, btw ;)

gabrieldodan · 2014-08-28T12:32:13Z

Thanks :) Most probably is slow because of callbacks. I have also compared the performance of Async get/set vs Sync get/set. Async versions are much slower.

Also, a new suggestion. Would be very useful to have batch operations, for example to read multiple keys in one shot , a functionality somehow similar to Redis MULTI. e.g. getMulti(['k1', 'k2', 'k3'], cb) where cb is cb(err, values) In this way you can avoid so much jumps from JS to C++, you call only one callback ;)

stephenmathieson · 2014-08-28T12:49:59Z

yeah, the async stuff has to:

cross from JS to c++
be added to the libuv pool
wait its turn for execution
actually run
callback js

whereas the sync versions:

cross from JS to c++
run

quite a few less things to do ;)

a mutli-get makes sense too. it lands out of scope of the iterator, unless we'd do something like:

var iterator = db.iterator()
iterator.next(5 /* get 5 the next keys */, function (err, values) {
  if (err) handleError(err)
  // values = [ ... ]
})

also, if you run $ make benchmarks in the root of this repo, you'll be able to see iterator performance vs LevelDB. i think you'll be impressed ;)

edit:

                      Sophist#iterator
              70 op/s » iterate 1000 keys

                      LevelDOWN#iterator
              46 op/s » iterate 1000 keys

gabrieldodan · 2014-08-28T13:38:15Z

I am already impressed by this library :) seems it is perfect for my needs. It is really fast except Async iterators, according to sophia benchmarks http://sphia.org/benchmarks.html reading data using iterators should by much much faster than using get(key)

iterator.next(5, cb) would be useful also but not as critical as Sync iterator. I have made many tests with thousands of databases and hundreds of keys per database, it reads data very fast, I am curious how fast it reads using a Sync iterator. Any idea how would take you to implement the Sync iterator ? If you are very busy I could take a look at C++ source maybe I can implement myself Sync iterator.

stephenmathieson · 2014-08-28T17:41:18Z

a sync #next() shouldn't be hard. the async method is here.

for sync, maybe something like (this has not been tested!):

NAN_METHOD(Iterator::NextSync) {
  NanScope();
  Iterator *self = ObjectWrap::Unwrap<Iterator>(args.This());
  sophia::IteratorResult *result = self->it->Next();
  if (result) {
    v8::Local<v8::Object> obj = NanNew<v8::Object>();
    obj->Set(NanNew("key"), NanNew(result->key));
    obj->Set(NanNew("value"), NanNew(result->value));
    NanReturnValue(obj);
  } else {
    NanReturnUndefined();
  }
}

See #12.

stephenmathieson · 2014-09-02T16:39:25Z

@gabrieldodan I've been working on a new release (1.0.0). I've added Iterator#nextSync() as well as Iterator#endSync(). If you've got time, checkout the wip-node-11 branch ;)

gabrieldodan · 2014-09-16T15:39:56Z

Thanks, will do some tests tomorrow :) Also if you could implement muti-read that will be great! The idea is to read batch of data in only one call, to avoid too many jumps from JS to C++. An iterator like this:

iterate({start: 'k1', end:'k100', reverse: false}, function (err, values) {
});

iterate({start: 'k1', limit:20, reverse: false}, function (err, values) { // read only 20 records starting from k1
});

It doesn't make sense to return object by object, do the iteration on C++ and give me only the results! Will be also useful an option to return only the keys or only the values, or both. Thanks.

stephenmathieson · 2014-09-17T13:06:24Z

hmm.. iterating like that over a large repository would cause significant performance issues when dealing with any sort of concurrency. assuming we'd only push one task into uv's queue (read everything), everything else would be blocked (http stuff, other iterators, etc.).

now, if we queued a job for every read in uv's pool, we'd avoid blocking the main thread, but the iterator would be slow (again). all we'd do is reduce the number of times we go from js <-> c++.

also: opened separate thread: #13.

stephenmathieson · 2014-09-18T17:58:14Z

version 1.0.0 now has iterator#nextSync() implemented.

Will address iterator#next(N, fn) later on.

stephenmathieson pushed a commit that referenced this issue Sep 2, 2014

iterator.cc: Add #NextSync and #EndSync

0fcc37d

See #12.

stephenmathieson mentioned this issue Sep 17, 2014

feature: read multiple values from iterator#next() #13

Open

stephenmathieson closed this as completed Sep 18, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

iterator works slow #12

iterator works slow #12

gabrieldodan commented Aug 28, 2014

stephenmathieson commented Aug 28, 2014

gabrieldodan commented Aug 28, 2014

stephenmathieson commented Aug 28, 2014

gabrieldodan commented Aug 28, 2014

stephenmathieson commented Aug 28, 2014

stephenmathieson commented Sep 2, 2014

gabrieldodan commented Sep 16, 2014

stephenmathieson commented Sep 17, 2014

stephenmathieson commented Sep 18, 2014

iterator works slow #12

iterator works slow #12

Comments

gabrieldodan commented Aug 28, 2014

stephenmathieson commented Aug 28, 2014

gabrieldodan commented Aug 28, 2014

stephenmathieson commented Aug 28, 2014

gabrieldodan commented Aug 28, 2014

stephenmathieson commented Aug 28, 2014

stephenmathieson commented Sep 2, 2014

gabrieldodan commented Sep 16, 2014

stephenmathieson commented Sep 17, 2014

stephenmathieson commented Sep 18, 2014