build an assert for a linear algorithm

play with the thing, and print first results

O(x) algorithms : do a simple Ordinary Least Squares

disable GC for the moment

deal with warmup

find a way to know if my new algo is better than the previous one

have an arena, it generates validating samples, runs the 2 algos, then compares the distance on validation inputs

compute the rmse

split the classes to files

What parameters do I already have ?

distribution of input sizes

number of runs

GC or not

warmup or not

dump csvs to know how the error changes with the algos

detect whether warmup has a consequence

It Does. Particularly on small number of runs.

Even in the same process, we can spot a difference

Once we know if it is important, let’s just use it all the time

It might be depending on the algorithm ! the number of time some parts are looped over during the run.

[OK] On our linear search, around 60 warmup rounds are required to reduce variability

play with different counts of warmup rounds

O(1) : identify the flat slope

not sure a transformation is possible for a regression, otherwise, just do an average

predict is just returning the average

print and compare the results

try with a constant algorithm to test

need to rule out very small coefs, could find a quadratic with a small coef !

12 times on 55, found linear better than constant, only finds constant 80% of times

to compare RMSE corretly, they shouls be normalized, or using the same data points

reuse the same data for training and validation

If we had an idea of the max size we would like to use it for, we could check that at that size, coeff is still irrelevant

algo	slope	y intercept	at 1000
random 5	E-10	E-6	+10%
linear search	E-7	E-7	*1000

Compute the error we can expect from our data sample, and use it to determine what is 0 or not

Don’t know how to do it !

One thing is that negative slopes are zero !

See how it goes with quadratic

Could we use special assertion be_constant(within: constraints)

It’s a case of overfitting in fact !

We can give benefit of the doubt to the simpler model.

The more complex model needs to give an RMSE half the one of the simpler model to be selected

This works well on practice with Random5 and LinearSearch

O(x2) : pre-treat with an sqrt before OLS, compare RMSE of both linear and quadratic models to see which one is best

write a quadratic algorithm

add QuadraticComplexityModel

A = Qn2 + C

reg : sqrt(A) = Ln + D

reg : A = L2n2 + 2LDn + D2

select the best and see if it works

Needs to be twice as better as the next one !

have all rmse in order of complexity (constant, linear, quadratic, etc)

go through all starting from the first. Keep the min rmse

if a new rmse is found smaller than half the min, we’ll pick this one

Sometimes, if finds O(n) instead of O(n²). Maybe that’s not so much of an issue if the assertion checks that it’s at worst O(n²)

Add +7 pomodoro to the task title

write the basic rspec integration

define the API we want

class Algorithm; def generate_args(n) …; def run(args) …; end;

expect(Algorithm).to be_linear()

expect(Algorithm).to be_constant()

expect(Algorithm).to be_quadratic()

expect(Algorithm).to be_logarithmic()

expect(Algorithm).to be_in(N*LN(N))

expect(Algorithm, warmups: 30, rounds: 20, sizes: […]).to be_xxx() or expect(Algorithm).to be_xxx(warmups: 30, rounds: 20, sizes: […])

determine the condition

compare with constant

simple to do ! not sure it works :

algorithm	expect constant	expect linear
constant	regress both, find constant is better or linear with very small slope	will find a very small slope
linear	should fail	should pass, but might be quadratic !

checking constant is checking that is not worse than constant eg : linear or log

Files

DayBook.org

Latest commit

History

DayBook.org

File metadata and controls

build an assert for a linear algorithm

play with the thing, and print first results

O(x) algorithms : do a simple Ordinary Least Squares

disable GC for the moment

deal with warmup

find a way to know if my new algo is better than the previous one

have an arena, it generates validating samples, runs the 2 algos, then compares the distance on validation inputs

compute the rmse

split the classes to files

What parameters do I already have ?

distribution of input sizes

number of runs

GC or not

warmup or not

dump csvs to know how the error changes with the algos

detect whether warmup has a consequence

It Does. Particularly on small number of runs.

Even in the same process, we can spot a difference

Once we know if it is important, let’s just use it all the time

It might be depending on the algorithm ! the number of time some parts are looped over during the run.

[OK] On our linear search, around 60 warmup rounds are required to reduce variability

play with different counts of warmup rounds

O(1) : identify the flat slope

not sure a transformation is possible for a regression, otherwise, just do an average

predict is just returning the average

print and compare the results

try with a constant algorithm to test

need to rule out very small coefs, could find a quadratic with a small coef !

12 times on 55, found linear better than constant, only finds constant 80% of times

to compare RMSE corretly, they shouls be normalized, or using the same data points

reuse the same data for training and validation

If we had an idea of the max size we would like to use it for, we could check that at that size, coeff is still irrelevant

Compute the error we can expect from our data sample, and use it to determine what is 0 or not

Don’t know how to do it !

One thing is that negative slopes are zero !

See how it goes with quadratic

Could we use special assertion be_constant(within: constraints)

It’s a case of overfitting in fact !

We can give benefit of the doubt to the simpler model.

The more complex model needs to give an RMSE half the one of the simpler model to be selected

This works well on practice with Random5 and LinearSearch

O(x2) : pre-treat with an sqrt before OLS, compare RMSE of both linear and quadratic models to see which one is best

write a quadratic algorithm

add QuadraticComplexityModel

A = Qn2 + C

reg : sqrt(A) = Ln + D

reg : A = L2n2 + 2LDn + D2

select the best and see if it works

Needs to be twice as better as the next one !

have all rmse in order of complexity (constant, linear, quadratic, etc)

go through all starting from the first. Keep the min rmse

if a new rmse is found smaller than half the min, we’ll pick this one

Sometimes, if finds O(n) instead of O(n²). Maybe that’s not so much of an issue if the assertion checks that it’s at worst O(n²)

Add +7 pomodoro to the task title

write the basic rspec integration

define the API we want

class Algorithm; def generate_args(n) …; def run(args) …; end;

expect(Algorithm).to be_linear()

expect(Algorithm).to be_constant()

expect(Algorithm).to be_quadratic()

expect(Algorithm).to be_logarithmic()

expect(Algorithm).to be_in(N*LN(N))

expect(Algorithm, warmups: 30, rounds: 20, sizes: […]).to be_xxx() or expect(Algorithm).to be_xxx(warmups: 30, rounds: 20, sizes: […])

determine the condition

compare with constant

simple to do ! not sure it works :

checking constant is checking that is not worse than constant eg : linear or log

they can all be coded with transformations to the timings

constant : remove the size, keep the timing

linear : identity

logarithmic : some kind of exponential

quadratic : some kind of squar root

We just need to assert that it’s at worst O(something).

Get an internet access