Easy days: revisited #3661

Merged
merged 3 commits into from
Jan 8, 2025

jakeprobst
Contributor

#3643 was something of a temporary fix. However, the fact that an effectively disabled feature still affected scheduling to that degree hinted at a more fundamental problem.

I found that the existing system has two problems: it biases cards to the furthest day, and removes many reasonable days from the list of possible days to schedule to.

format for scheduler examples:

interval: cards_due_on_day, load_balancer_percent (raw_lb_weight) => after_easy_days_percent (raw_ed_weight)

Take the (fixed by the above PR, but kept for illustration of the issue) case where all days were set to normal:

good -> good graduating interval

   11: 92 0.16 (0.000010740678) => 0.00 (0)
   12: 84 0.18 (0.00001181028) => 0.00 (0)
   13: 76 0.20 (0.000013317708) => 0.04 (0.00001598121)
   14: 73 0.20 (0.000013403748) => 0.16 (0.000056295703)
   15: 61 0.27 (0.00001791633) => 0.80 (0.0002902445)

good -> easy graduating interval

   21: 56 0.14 (0.000015184647) => 0.00 (0)
   22: 55 0.14 (0.000015026295) => 0.00 (0)
   23: 55 0.13 (0.000014372978) => 0.00 (0)
   24: 48 0.16 (0.000018084493) => 0.32 (0.000064587504)
   25: 52 0.13 (0.0000147929) => 0.00 (0)
   26: 48 0.15 (0.000016693379) => 0.30 (0.000059619237)
   27: 47 0.15 (0.000016766426) => 0.38 (0.000076646545)

In the first case, the latest day is heavily favored. In the second, days that would be entirely reasonable to schedule to are zeroed out.


The case with a day marked as reduced fares similarly.

good -> good graduating interval

   11: 92 0.16 (0.000010740678) => 0.00 (0)
   12: 84 0.18 (0.00001181028) => 0.04 (0.000020996064)
   13: 76 0.20 (0.000013317708) => 0.22 (0.0001302176)
   14: 73 0.20 (0.000013403748) => 0.00 (0) # reduced day
   15: 61 0.27 (0.00001791633) => 0.75 (0.00044392687)

good -> easy graduating interval

   21: 56 0.14 (0.000015184647) => 0.00 (0) # reduced day
   22: 55 0.14 (0.000015026295) => 0.02 (0.000008091056)
   23: 55 0.13 (0.000014372978) => 0.02 (0.000007739271)
   24: 48 0.16 (0.000018084493) => 0.29 (0.00013632922)
   25: 52 0.13 (0.0000147929) => 0.11 (0.000052344083)
   26: 48 0.15 (0.000016693379) => 0.27 (0.00012584236)
   27: 47 0.15 (0.000016766426) => 0.30 (0.00014315946)

When at least one day is set as reduced it fares a bit better, but it still overpowers the existing fuzz by putting a very heavy bias on days with fewer cards. The existing load balancer already does this, though more gently, so fuzzing ends up overbiased by this one data point. For a feature that is about scheduling fewer cards on certain days, it sure does a lot more than that.


In this PR I take a different approach. The easy days modifiers are either 0.0 or 1.0. This effectively turns that day on or off as a possible day to schedule to.

Naturally, minimum is always 0.0 and normal is always 1.0. The issue is how to toggle between 0.0 and 1.0 for reduced days. Here I use a simple method:
If the number of cards due on a reduced day is below half the mean of all other days in the fuzz range, it is 1.0. Otherwise it is 0.0.

(note: in practice, the minimum modifier will be a small value such as 0.0001 to make it a rare occurrence and remove the need for a variety of special cases in the implementation)
(also note: the factor of one half is flexible; I can see 0.4 making more sense in practice)

Let's look at the same cases as above, but with this new approach.

With all days as normal:

good -> good graduating interval

   11: 92 0.16 (0.000010740678) => 0.16 (0.000010740678)
   12: 84 0.18 (0.00001181028) => 0.18 (0.00001181028)
   13: 76 0.20 (0.000013317708) => 0.20 (0.000013317708)
   14: 73 0.20 (0.000013403748) => 0.20 (0.000013403748)
   15: 61 0.27 (0.00001791633) => 0.27 (0.00001791633)

good -> easy graduating interval

   21: 56 0.14 (0.000015184647) => 0.14 (0.000015184647)
   22: 55 0.14 (0.000015026295) => 0.14 (0.000015026295)
   23: 55 0.13 (0.000014372978) => 0.13 (0.000014372978)
   24: 48 0.16 (0.000018084493) => 0.16 (0.000018084493)
   25: 52 0.13 (0.0000147929) => 0.13 (0.0000147929)
   26: 48 0.15 (0.000016693379) => 0.15 (0.000016693379)
   27: 47 0.15 (0.000016766426) => 0.15 (0.000016766426)

nothing changes, as it should be.


And with a single day reduced:

good -> good graduating interval

   11: 92 0.16 (0.000010740678) => 0.20 (0.000010740678)
   12: 84 0.18 (0.00001181028) => 0.22 (0.00001181028)
   13: 76 0.20 (0.000013317708) => 0.25 (0.000013317708)
   14: 73 0.20 (0.000013403748) => 0.00 (0.0000000013403748) # reduced day
   15: 61 0.27 (0.00001791633) => 0.33 (0.00001791633)

good -> easy graduating interval

   21: 56 0.14 (0.000015184647) => 0.00 (0.0000000015184647) # reduced day
   22: 55 0.14 (0.000015026295) => 0.16 (0.000015026295)
   23: 55 0.13 (0.000014372978) => 0.15 (0.000014372978)
   24: 48 0.16 (0.000018084493) => 0.19 (0.000018084493)
   25: 52 0.13 (0.0000147929) => 0.15 (0.0000147929)
   26: 48 0.15 (0.000016693379) => 0.17 (0.000016693379)
   27: 47 0.15 (0.000016766426) => 0.18 (0.000016766426)

The first case favors the day with fewer cards scheduled to it, but not excessively so. The second case has a fairly even distribution, which is expected as the number of cards due is fairly even.
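To make the before/after columns concrete, here is a minimal sketch of how the percentages in the "single day reduced" good -> good table appear to follow from the raw weights: each raw load balancer weight is multiplied by its easy days modifier and the results are normalized to sum to 1. The names `lb_weights`, `modifiers`, and `normalize` are made up for illustration and are not taken from the PR's code.

```python
# Raw load balancer weights for intervals 11..15, copied from the table above.
lb_weights = [0.000010740678, 0.00001181028, 0.000013317708,
              0.000013403748, 0.00001791633]

# Easy days modifiers: 1.0 keeps a day, 0.0001 stands in for "0.0" (off).
MINIMUM_MODIFIER = 0.0001
modifiers = [1.0, 1.0, 1.0, MINIMUM_MODIFIER, 1.0]  # interval 14 is the reduced day


def normalize(weights):
    """Scale weights so they sum to 1 (assumes at least one weight is > 0)."""
    total = sum(weights)
    return [w / total for w in weights]


# Apply the modifier to each raw weight, then turn the result into percentages.
ed_weights = [w * m for w, m in zip(lb_weights, modifiers)]
percents = [round(p, 2) for p in normalize(ed_weights)]
print(percents)  # [0.2, 0.22, 0.25, 0.0, 0.33]
```

Because the modifier only ever switches a day on or off, the remaining days keep their relative load balancer proportions.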


Other cases/contrived examples!

all days are minimum: the easy day factor is constant for all of them, so fuzzing occurs as normal
all days are reduced: days under the threshold will be 1.0 and higher-load days will be 0.0. If the loads are all close, every day gets the minimum load factor, so normal balancing occurs.

a case where two days are reduced:

normal  1: 60 cards
reduced 2: 35 cards
normal  3: 40 cards
reduced 4: 20 cards
normal  5: 30 cards

in this case, the threshold of reduced 2 is 18.75 and that of reduced 4 is 20.625. reduced 2 is above its threshold but reduced 4 is below, so the easy days modifier is: [1.0, 0.0, 1.0, 1.0, 1.0]
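The thresholds in this example can be reproduced with a short sketch of the rule as described here (half the mean of all other days in the range). The function name, the string labels for day kinds, and the `factor`/`minimum` parameters are hypothetical and only for illustration; the actual implementation may be structured differently.

```python
def reduced_day_modifiers(counts, kinds, factor=0.5, minimum=0.0001):
    """Return one modifier per day.

    counts: cards already due on each candidate day.
    kinds:  "normal", "reduced", or "minimum" per day (illustrative labels).
    A reduced day is switched on (1.0) only when its count is strictly below
    `factor` times the mean count of all the other days.
    """
    total = sum(counts)
    n_others = max(len(counts) - 1, 1)  # avoid dividing by zero for a single day
    modifiers = []
    for count, kind in zip(counts, kinds):
        if kind == "normal":
            modifiers.append(1.0)
        elif kind == "reduced":
            threshold = (total - count) / n_others * factor
            modifiers.append(1.0 if count < threshold else minimum)
        else:
            modifiers.append(minimum)
    return modifiers


counts = [60, 35, 40, 20, 30]
kinds = ["normal", "reduced", "normal", "reduced", "normal"]
print(reduced_day_modifiers(counts, kinds))  # [1.0, 0.0001, 1.0, 1.0, 1.0]
```

The 0.0001 in the output plays the role of the 0.0 in the list above.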

all reduced days:

reduced 1: 45 cards
reduced 2: 40 cards
reduced 3: 35 cards
day 1 => (120-45)/2*0.5 = 18.75 (45: over)
day 2 => (120-40)/2*0.5 = 20 (40: over)
day 3 => (120-35)/2*0.5 = 21.25 (35: over)

resulting modifier: [0.0, 0.0, 0.0]

But since the 0.0 is actually 0.0001, the fuzzing can proceed as normal.

If we add a normal day with 100 cards, we can see it will only schedule to the normal day.

day 1 => (220-45)/3*0.5 = 29.17 (45: over)
day 2 => (220-40)/3*0.5 = 30 (40: over)
day 3 => (220-35)/3*0.5 = 30.83 (35: over)
day 4 => the normal one

resulting modifier: [0.0, 0.0, 0.0, 1.0]

In this particular case, day 4 would need 126 cards scheduled before a reduced day would even have the option of being scheduled to.
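The figure of 126 can be checked with a small search, assuming the rule is a strict "below half the mean of the other days" comparison as described earlier. The helper name `any_reduced_day_enabled` is invented for this sketch.

```python
reduced_counts = [45, 40, 35]  # the three reduced days from the example


def any_reduced_day_enabled(normal_count, factor=0.5):
    """True if at least one reduced day falls strictly below its threshold."""
    counts = reduced_counts + [normal_count]
    total = sum(counts)
    n_others = len(counts) - 1
    return any(c < (total - c) / n_others * factor for c in reduced_counts)


# Start from the 100 cards in the example and add cards to the normal day
# until one of the reduced days becomes eligible.
normal_count = 100
while not any_reduced_day_enabled(normal_count):
    normal_count += 1
print(normal_count)  # 126
```

At 125 cards the threshold for day 3 is exactly 35, which is not strictly above its 35 cards, so 126 is the first value that switches a reduced day on.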



I would also like to convert the underlying configuration from being a series of floats to a series of enumerations of (Normal/Reduced/Minimal), but I am unsure it is worth the effort.

@L-M-Sherlock
Contributor

L-M-Sherlock commented Dec 24, 2024

Could you write a simulation for current easy days design? Like this: https://github.com/open-spaced-repetition/easy-days-simulator/blob/main/notebook.ipynb

Edit: nvm, I made one:

image

Looks good to me. I will implement it in the helper add-on.

Edit:

I found a problem when I set Mon to normal and the rest to reduced:

image

The PR's design will reduce too many reviews on reduced days.

Here is the previous design's result:

image

Edit:

When I only set one weekday to reduced, the workload seems more reasonable:

image

Compared with the previous design:

image

In summary, the current design in this pull request causes the workload on reduced days to fluctuate with the number of reduced days configured.

@jakeprobst
Contributor Author

newapproach
Multiplying reduced day counts by 2 (actually, dividing by EASY_DAY_REDUCED_MODIFIER) when computing average loads cleans it up.

-    total_review_count = sum(review_cnts)
+    total_review_count = sum([rc if easy_days_percentages[d.dayofweek % 7] == 1.0 else rc*2 for (d, rc) in zip(possible_dates, review_cnts)])
-    other_days_total = total_review_count - review_count
+    other_days_total = total_review_count - review_count*2

However, this sort of pathological case does occur and I'm not sure what exactly to do about it (note the cliff at ~50)
newapproach_pathological

This case happens when only Monday is normal. (The first graph is the case where only Wednesday is normal.)

But in actuality, should there be a limit on the number of days one can set as reduced? Perhaps a requirement that 3-4 days need to be normal? Is there a reasonable case for allowing users to have just 6 reduced/minimal days in the first place?

@L-M-Sherlock
Contributor

L-M-Sherlock commented Dec 25, 2024

Let's say there are six reduced days and one normal day, and the workload of one reduced day is 0.5.

The total workload of a week is 0.5 * 6 + 1 = 4.

If today is a reduced day, the total workload of the remaining days is 4 - 0.5 = 3.5. The avg is 3.5 / 6 = 0.58.

So today's workload (0.5) should be compared against that avg (0.58).

Here is the code:

    easy_days_modifier = []
    total_review_count = sum(review_cnts)
    EASY_DAYS_NORMAL_LOAD = 1.0
    EASY_DAYS_REDUCED_MODIFIER = 0.5
    EASY_DAYS_MINIMUM_LOAD = 0.0001
    date_percentages = [max(EASY_DAYS_MINIMUM_LOAD, easy_days_percentages[date.dayofweek % 7]) for date in possible_dates]
    for date, review_count in zip(possible_dates, review_cnts):
        if easy_days_percentages[date.dayofweek % 7] == 1:
            easy_days_modifier.append(EASY_DAYS_NORMAL_LOAD)
        elif easy_days_percentages[date.dayofweek % 7] == 0.5:
            other_days_count_total = total_review_count - review_count
            other_days_percentage_total = sum(date_percentages) - easy_days_percentages[date.dayofweek % 7]
            if review_count / EASY_DAYS_REDUCED_MODIFIER > other_days_count_total / other_days_percentage_total:
                easy_days_modifier.append(EASY_DAYS_MINIMUM_LOAD)
            else:
                easy_days_modifier.append(EASY_DAYS_NORMAL_LOAD)
        else:
            easy_days_modifier.append(EASY_DAYS_MINIMUM_LOAD)
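
For readers who want to run the snippet above on its own, here is a self-contained sketch of the same logic. It makes a few assumptions that are not in the original: it is wrapped in a hypothetical function `easy_days_modifiers`, it takes plain weekday indices (0 = Monday) instead of date objects with a `dayofweek` attribute, and it cross-multiplies the inequality instead of dividing, so that a single candidate day does not divide by zero.

```python
EASY_DAYS_NORMAL_LOAD = 1.0
EASY_DAYS_REDUCED_MODIFIER = 0.5
EASY_DAYS_MINIMUM_LOAD = 0.0001


def easy_days_modifiers(weekdays, review_cnts, easy_days_percentages):
    """weekdays: weekday index (0 = Monday) of each candidate date.
    review_cnts: reviews already due on each candidate date.
    easy_days_percentages: 7 values, one per weekday (1, 0.5, or 0).
    """
    date_percentages = [
        max(EASY_DAYS_MINIMUM_LOAD, easy_days_percentages[wd % 7]) for wd in weekdays
    ]
    total_review_count = sum(review_cnts)
    total_percentage = sum(date_percentages)
    modifiers = []
    for wd, review_count in zip(weekdays, review_cnts):
        pct = easy_days_percentages[wd % 7]
        if pct == 1:
            modifiers.append(EASY_DAYS_NORMAL_LOAD)
        elif pct == 0.5:
            other_count = total_review_count - review_count
            other_pct = total_percentage - pct
            # Same test as count / 0.5 > other_count / other_pct, rearranged
            # so it stays defined when other_pct is 0.
            if review_count * other_pct > other_count * EASY_DAYS_REDUCED_MODIFIER:
                modifiers.append(EASY_DAYS_MINIMUM_LOAD)
            else:
                modifiers.append(EASY_DAYS_NORMAL_LOAD)
        else:
            modifiers.append(EASY_DAYS_MINIMUM_LOAD)
    return modifiers


# One normal day (Monday) and six reduced days, as in the arithmetic above.
percentages = [1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
week = list(range(7))
print(easy_days_modifiers(week, [100, 50, 50, 50, 50, 50, 50], percentages))
print(easy_days_modifiers(week, [100, 51, 50, 50, 50, 50, 50], percentages))
```

With the counts exactly in a 1 : 0.5 ratio every day stays eligible; once Tuesday holds one extra review it is switched to the minimum load while the other reduced days remain eligible.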

It works well:

        1,  # Monday
        0.5,  # Tuesday
        0.5,  # Wednesday
        0.5,  # Thursday
        0.5,  # Friday
        0.5,  # Saturday
        0.5,  # Sunday

image


        1,  # Monday
        0,  # Tuesday
        0,  # Wednesday
        0,  # Thursday
        0,  # Friday
        0,  # Saturday
        0.5,  # Sunday

image


        1,  # Monday
        0,  # Tuesday
        0.5,  # Wednesday
        0,  # Thursday
        0.5,  # Friday
        0,  # Saturday
        0,  # Sunday

image

        1,  # Monday
        1,  # Tuesday
        0.5,  # Wednesday
        1,  # Thursday
        1,  # Friday
        1,  # Saturday
        1,  # Sunday

image

Contributor

@L-M-Sherlock L-M-Sherlock left a comment


LGTM

It was probably not worth the time I took to change this ^_^;
@dae
Member

dae commented Jan 8, 2025

Thank you both!

@dae dae merged commit aaf8b4d into ankitects:main Jan 8, 2025
1 check passed

3 participants