Efficiency measurements (how to gather?) #23
In terms of efficiency, in my opinion, it is fundamental to measure the size of the index. Should we impose specific hard limits (memory, disk space, CPUs, ...) on the running Docker instance? For example, by forcing the container to run on a single CPU we would ensure that ad-hoc retrieval runs on a single core too. Moreover, is efficiency only related to query processing? Is the efficiency of indexing relevant at all?
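To make the "hard limits" idea concrete, here is a minimal sketch of how the jig could pin a container to one CPU and cap its memory using standard docker run flags. The image name, limit values, and wrapper function are hypothetical and not part of any agreed jig interface.

```python
# Minimal sketch: launch a retrieval container with hard resource limits.
# The image name, limit values, and command are hypothetical placeholders.
import subprocess

def run_container(image, command, cpus="1", memory="8g"):
    """Run `command` inside `image`, limited to `cpus` CPUs and `memory` RAM."""
    return subprocess.run(
        ["docker", "run", "--rm",
         f"--cpus={cpus}",        # cap CPU so ad-hoc retrieval runs on one core
         f"--memory={memory}",    # cap RAM available to the container
         image] + command,
        check=True)

# Hypothetical usage:
# run_container("osirrc/some-engine", ["search", "--collection", "robust04"])
```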
Efficiency essentially breaks down into efficiency of space and efficiency of time.

For the indexer, I think we can just use the output of the UNIX time command to tell us how long it took to build the index. If the indexer also reports its own time, it would be interesting to see how the two compare. We can use the UNIX ls command to see how large the index is, but the indexer will need to tell us where to look.

For search, I think the 250 topics we have are far too few for measuring search time. The brief test I ran suggested that some of those topics will take near-enough to zero time, so I think we should use the 10,000 topics from the TREC Million Query Track (or 20,000 if we use both years). I'd like to compare what the search engine claims against what the UNIX time command claims. Sure, UNIX time will include start-up, shut-down, and index-load time, but that is why we also need to look at what the search engine claims.

So we need, I think, a "spec". For indexing there is nothing much to agree on (is there?), just a single line of output stating where the index can be found, so that we can start the container and "ls" to get the index size; we can easily change the jig to call the UNIX time command. For search, we need to agree on when we start the timer, when we stop it, and what we are measuring (throughput or latency). We can turn throughput into latency by setting the thread count to 1, so let's measure throughput. I think we start the timer at the last possible moment before the first query and stop it at the first possible moment after we complete the last query. As we all have the same I/O demands when it comes to producing the TREC run file, we could agree to either include or exclude that time - thoughts, please.
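As a rough illustration of what the jig-side measurement could look like, here is a minimal sketch. It assumes (this is not an agreed spec) that the indexer's last line of stdout names the index directory on a shared volume; the function names and the example docker command are placeholders.

```python
# Sketch of jig-side measurements, assuming (not agreed) that the indexer's
# last line of stdout is the index directory on a shared volume.
import os, subprocess, time

def timed_run(cmd):
    """Wall-clock time for a full container run (includes start-up, shut-down,
    and index-load), analogous to wrapping the run with the UNIX time command."""
    start = time.monotonic()
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return time.monotonic() - start, proc.stdout

def index_size_bytes(index_dir):
    """Total on-disk size of the index, analogous to ls/du on the reported path."""
    total = 0
    for root, _, files in os.walk(index_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Hypothetical usage:
# elapsed, out = timed_run(["docker", "run", "--rm", "-v", "data:/data", "some/indexer"])
# index_dir = out.strip().splitlines()[-1]   # assumed single-line path convention
# print(elapsed, index_size_bytes(index_dir))
```

The engine's own self-reported times could then be compared against the jig's wall-clock numbers, as suggested above.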
Hi, what about indexing time in the case of the ML stuff? Should we break it down into training, validation, ...? Also, do we need some breakdown of the idea of index size in this case? Nicola
Agreed - we need to measure the efficiency of the ML stuff. I'm hoping there's a chance to do the ML stuff before indexing, because I want to learn the best solution and then bake it into my index.
NVSM performs indexing before training and validation. I think indexing could be a separate step from training and testing for NeuIR models as well. Training, validation, and test are performed on different subsets of topics specified by the user (without cross-validation).
The (nuclear) alternative would be for efficiency to be measured by the jig itself, sending queries on stdin one by one. In any case, I agree that we should record the number of cores and threads involved in both retrieval and indexing, so that we get like-for-like comparisons.
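For comparison, here is a minimal sketch of that per-query ("nuclear") option, where the jig writes one topic at a time to the container's stdin and times each response itself. The one-line-in/one-line-out protocol and the stand-in command are assumptions for illustration only.

```python
# Sketch of the per-query option: the jig sends one topic at a time on stdin
# and measures each response. The topic format and the stand-in command are
# placeholders, not an agreed interface.
import subprocess, time

def per_query_latencies(container_cmd, topics):
    """Send topics one by one on stdin; return the jig-measured latency of each."""
    proc = subprocess.Popen(container_cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    latencies = []
    for topic in topics:
        start = time.monotonic()
        proc.stdin.write(topic + "\n")
        proc.stdin.flush()
        proc.stdout.readline()          # assumes one result line per topic
        latencies.append(time.monotonic() - start)
    proc.stdin.close()
    proc.wait()
    return latencies

# Hypothetical usage with a trivial stand-in for a search container:
# print(per_query_latencies(["cat"], ["topic 301", "topic 302"]))
```

This measures per-query latency rather than batch throughput, which is why recording the core and thread counts matters for like-for-like comparison.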
Hi @andrewtrotman, can you think about how you'd like the jig to report efficiency metrics? I see a few options:
Both have their advantages and disadvantages... thoughts?