Calculation speed up for the Gregorian calendar #5849

Open: wants to merge 15 commits into base: main
1 change: 1 addition & 0 deletions Cargo.lock

149 changes: 51 additions & 98 deletions components/calendar/src/iso.rs
@@ -33,6 +33,7 @@ use crate::calendar_arithmetic::{ArithmeticDate, CalendarArithmetic};
 use crate::error::DateError;
 use crate::{types, Calendar, Date, DateDuration, DateDurationUnit, DateTime, RangeError, Time};
 use calendrical_calculations::helpers::I32CastError;
+use calendrical_calculations::iso::{day_of_year, is_leap_year, iso_from_year_day};
 use calendrical_calculations::rata_die::RataDie;
 use tinystr::tinystr;

@@ -61,12 +62,37 @@ pub struct IsoDateInner(pub(crate) ArithmeticDate<Iso>);
 impl CalendarArithmetic for Iso {
     type YearInfo = ();

-    fn month_days(year: i32, month: u8, _data: ()) -> u8 {
+    fn month_days(year: i32, month: u8, _: ()) -> u8 {
+        // Binary representation of `30` is `0b__11110`
+        // Month in 1..=12 represented as `0b__00001`..=`0b__01100`
+        // So:
+        // A. For any x in 0..31: `30 | x` = `30 + is_odd(x)`
+        //    | so `30 | (month ^ (month >> 3))` = `30 + is_odd(month ^ (month >> 3))`
+        // B. `month >> 3` is:
+        //    | * `0` for months in 1..=7,
+        //    | * `1` for months in 8..=12,
+        // C. From [B] => `is_odd(month ^ (month >> 3))` is
+        //    | * `is_odd(month)` for months in 1..=7,
+        //    | * `!is_odd(month)` for months in 8..=12,
+        //
+        // days:  | 31 | 28 | 31 | 30 | 31 | 30 | 31 | 31 | 30 | 31 | 30 | 31 |
+        // month: |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 12 |
+        // B:     |  0 |  0 |  0 |  0 |  0 |  0 |  0 |  1 |  1 |  1 |  1 |  1 |
+        // C:     |  1 |  0 |  1 |  0 |  1 |  0 |  1 |  1 |  0 |  1 |  0 |  1 |
+        // A:     | 31 |=30=| 31 | 30 | 31 | 30 | 31 | 31 | 30 | 31 | 30 | 31 |
+        //
+        //
+        // Average speed is about the same as a full match, because
+        // `30 | (month ^ (month >> 3))` still takes time to compute,
+        // but there are fewer jumps, which can help the
+        // branch predictor.
+        // It also uses less memory (less generated code).
         match month {
-            4 | 6 | 9 | 11 => 30,
-            2 if Self::is_leap_year(year, ()) => 29,
-            2 => 28,
-            1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
+            2 => 28 | (is_leap_year(year) as u8),
+            1..=12 => 30 | (month ^ (month >> 3)),
+            // Should we return `0` for an invalid month,
+            // or may we return any number?
+            // If any: delete the next line and change `1..=12` on the previous line to `_`.
             _ => 0,
         }
     }
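The bit trick described in the comment can be checked against a plain lookup table. The following is a minimal sketch, not the PR's code: `month_days_bitwise`, `month_days_table`, `implementations_agree`, and the local `is_leap_year` are hypothetical stand-ins written for illustration, and the leap-year rule is the ordinary Gregorian one rather than a copy of `calendrical_calculations::iso::is_leap_year`.

```rust
/// Assumed Gregorian leap-year rule (a stand-in for the crate's helper).
const fn is_leap_year(year: i32) -> bool {
    year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
}

/// Month length using `30 | (month ^ (month >> 3))`, as in the comment above.
/// Returns 0 for a month outside 1..=12.
const fn month_days_bitwise(year: i32, month: u8) -> u8 {
    match month {
        // 28 is 0b11100, so OR-ing in the leap flag yields 28 or 29.
        2 => 28 | (is_leap_year(year) as u8),
        // 30 is 0b11110, so OR-ing in the low bit yields 30 or 31.
        1..=12 => 30 | (month ^ (month >> 3)),
        _ => 0,
    }
}

/// Reference implementation: explicit table lookup.
fn month_days_table(year: i32, month: u8) -> u8 {
    const DAYS: [u8; 12] = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
    match month {
        2 if is_leap_year(year) => 29,
        1..=12 => DAYS[month as usize - 1],
        _ => 0,
    }
}

/// True when both versions agree for months 0..=13 in a few sample years.
fn implementations_agree() -> bool {
    [1900, 2000, 2023, 2024].iter().all(|&year| {
        (0u8..=13).all(|month| month_days_bitwise(year, month) == month_days_table(year, month))
    })
}
```

A check like `implementations_agree()` could serve as a unit test if the bitwise form is kept, since it also pins down the behavior for invalid months that the comment asks about.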
@@ -76,19 +102,15 @@ impl CalendarArithmetic for Iso {
     }

     fn is_leap_year(year: i32, _data: ()) -> bool {
-        calendrical_calculations::iso::is_leap_year(year)
+        is_leap_year(year)
     }

     fn last_month_day_in_year(_year: i32, _data: ()) -> (u8, u8) {
         (12, 31)
     }

-    fn days_in_provided_year(year: i32, _data: ()) -> u16 {
-        if Self::is_leap_year(year, ()) {
-            366
-        } else {
-            365
-        }
+    fn days_in_provided_year(year: i32, _: ()) -> u16 {
+        Self::days_in_year_direct(year)
     }
 }

@@ -132,51 +154,9 @@ impl Calendar for Iso {
     }

     fn day_of_week(&self, date: &Self::DateInner) -> types::IsoWeekday {
-        // For the purposes of the calculation here, Monday is 0, Sunday is 6
-        // ISO has Monday=1, Sunday=7, which we transform in the last step
-
-        // The days of the week are the same every 400 years
-        // so we normalize to the nearest multiple of 400
-        let years_since_400 = date.0.year.rem_euclid(400);
-        debug_assert!(years_since_400 >= 0); // rem_euclid returns positive numbers
-        let years_since_400 = years_since_400 as u32;
-        let leap_years_since_400 = years_since_400 / 4 - years_since_400 / 100;
-        // The number of days to the current year
-        // Can never cause an overflow because years_since_400 has a maximum value of 399.
-        let days_to_current_year = 365 * years_since_400 + leap_years_since_400;
-        // The weekday offset from January 1 this year and January 1 2000
-        let year_offset = days_to_current_year % 7;
-
-        // Corresponding months from
-        // https://en.wikipedia.org/wiki/Determination_of_the_day_of_the_week#Corresponding_months
-        let month_offset = if Self::is_leap_year(date.0.year, ()) {
-            match date.0.month {
-                10 => 0,
-                5 => 1,
-                2 | 8 => 2,
-                3 | 11 => 3,
-                6 => 4,
-                9 | 12 => 5,
-                1 | 4 | 7 => 6,
-                _ => unreachable!(),
-            }
-        } else {
-            match date.0.month {
-                1 | 10 => 0,
-                5 => 1,
-                8 => 2,
-                2 | 3 | 11 => 3,
-                6 => 4,
-                9 | 12 => 5,
-                4 | 7 => 6,
-                _ => unreachable!(),
-            }
-        };
-        let january_1_2000 = 5; // Saturday
-        let day_offset = (january_1_2000 + year_offset + month_offset + date.0.day as u32) % 7;
-
-        // We calculated in a zero-indexed fashion, but ISO specifies one-indexed
-        types::IsoWeekday::from((day_offset + 1) as usize)
+        let day_of_week =
+            calendrical_calculations::iso::day_of_week(date.0.year, date.0.month, date.0.day);
+        types::IsoWeekday::from(day_of_week as usize)
     }

     fn offset_date(&self, date: &mut Self::DateInner, offset: DateDuration<Self>) {
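The removed weekday computation can be kept as a standalone reference when testing the replacement. Below is a rough sketch that restates the removed logic as a free function; `iso_weekday_400_year_cycle` and the local `is_leap_year` are hypothetical names, and returning `None` for an invalid month (instead of `unreachable!()`) is a choice made for this sketch. The body of `calendrical_calculations::iso::day_of_week` is not shown in this diff, so nothing here assumes how it works.

```rust
/// Assumed Gregorian leap-year rule (a stand-in for the crate's helper).
const fn is_leap_year(year: i32) -> bool {
    year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
}

/// ISO weekday (Monday = 1 ..= Sunday = 7) using the 400-year cycle and
/// the "corresponding months" offsets from the removed code.
/// Returns `None` when `month` is outside 1..=12.
fn iso_weekday_400_year_cycle(year: i32, month: u8, day: u8) -> Option<u8> {
    // Weekdays repeat every 400 years, so only the position in the cycle matters.
    let years_since_400 = year.rem_euclid(400) as u32; // always in 0..=399
    let leap_years_since_400 = years_since_400 / 4 - years_since_400 / 100;
    let year_offset = (365 * years_since_400 + leap_years_since_400) % 7;

    // Merged form of the two `match` tables (leap and non-leap year).
    let month_offset: u32 = match (month, is_leap_year(year)) {
        (1, false) | (10, _) => 0,
        (5, _) => 1,
        (2, true) | (8, _) => 2,
        (2, false) | (3, _) | (11, _) => 3,
        (6, _) => 4,
        (9, _) | (12, _) => 5,
        (1, true) | (4, _) | (7, _) => 6,
        _ => return None,
    };

    let january_1_2000 = 5; // Saturday, with Monday counted as 0
    let zero_based = (january_1_2000 + year_offset + month_offset + day as u32) % 7;
    Some((zero_based + 1) as u8) // shift to ISO numbering
}
```

Comparing this function with the new call over a range of dates would show whether the delegated implementation preserves the old results.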
@@ -200,7 +180,7 @@ impl Calendar for Iso {
     }

     fn is_in_leap_year(&self, date: &Self::DateInner) -> bool {
-        Self::is_leap_year(date.0.year, ())
+        is_leap_year(date.0.year)
     }

     /// The calendar-specific month represented by `date`
@@ -291,22 +271,17 @@ impl Iso {
         Self
     }

-    /// Count the number of days in a given month/year combo
-    fn days_in_month(year: i32, month: u8) -> u8 {
-        match month {
-            4 | 6 | 9 | 11 => 30,
-            2 if Self::is_leap_year(year, ()) => 29,
-            2 => 28,
-            _ => 31,
-        }
-    }
+    // /// Count the number of days in a given month/year combo
+    // const fn days_in_month(year: i32, month: u8) -> u8 {
+    //     // see comment to `<impl CalendarArithmetic for Iso>::month_days`
+    //     match month {
+    //         2 => 28 | (is_leap_year(year) as u8),
+    //         _ => 30 | (month ^ (month >> 3)),
+    //     }
+    // }

-    pub(crate) fn days_in_year_direct(year: i32) -> u16 {
-        if Self::is_leap_year(year, ()) {
-            366
-        } else {
-            365
-        }
+    pub(crate) const fn days_in_year_direct(year: i32) -> u16 {
+        365 + (is_leap_year(year) as u16)
     }

     // Fixed is day count representation of calendars starting from Jan 1st of year 1.
@@ -316,19 +291,8 @@ impl Iso {
     }

     pub(crate) fn iso_from_year_day(year: i32, year_day: u16) -> Date<Iso> {
-        let mut month = 1;
-        let mut day = year_day as i32;
-        while month <= 12 {
-            let month_days = Self::days_in_month(year, month) as i32;
-            if day <= month_days {
-                break;
-            } else {
-                debug_assert!(month < 12); // don't try going to month 13
-                day -= month_days;
-                month += 1;
-            }
-        }
-        let day = day as u8; // day <= month_days < u8::MAX
+        let (month, day) = iso_from_year_day(year, year_day);
+        debug_assert!(month < 13);

         #[allow(clippy::unwrap_used)] // month in 1..=12, day <= month_days
         Date::try_new_iso(year, month, day).unwrap()
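For comparison with the delegated `iso_from_year_day`, the removed loop can be written as a small standalone function. This is a sketch under stated assumptions: `month_day_from_year_day`, `days_in_month`, and the local `is_leap_year` are hypothetical helpers, and out-of-range input returns `None` here, whereas the removed code relied on a `debug_assert!`.

```rust
/// Assumed Gregorian leap-year rule (a stand-in for the crate's helper).
const fn is_leap_year(year: i32) -> bool {
    year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
}

/// Month length for `month` in 1..=12 (same bit trick as `month_days`).
const fn days_in_month(year: i32, month: u8) -> u8 {
    match month {
        2 => 28 | (is_leap_year(year) as u8),
        _ => 30 | (month ^ (month >> 3)),
    }
}

/// Converts a 1-based day of the year into `(month, day)` by walking the
/// months in order. Returns `None` when `year_day` is 0 or past year end.
fn month_day_from_year_day(year: i32, year_day: u16) -> Option<(u8, u8)> {
    if year_day == 0 {
        return None;
    }
    let mut remaining = year_day;
    for month in 1..=12u8 {
        let len = days_in_month(year, month) as u16;
        if remaining <= len {
            // `remaining` is at most 31 here, so the cast cannot truncate.
            return Some((month, remaining as u8));
        }
        remaining -= len;
    }
    None
}
```

The loop runs at most 12 iterations, which is the cost the PR aims to avoid; a function like this is still handy as an oracle in tests.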
@@ -348,18 +312,7 @@ impl Iso {
     }

     pub(crate) fn day_of_year(date: IsoDateInner) -> u16 {
-        // Cumulatively how much are dates in each month
-        // offset from "30 days in each month" (in non leap years)
-        let month_offset = [0, 1, -1, 0, 0, 1, 1, 2, 3, 3, 4, 4];
-        #[allow(clippy::indexing_slicing)] // date.0.month in 1..=12
-        let mut offset = month_offset[date.0.month as usize - 1];
-        if Self::is_leap_year(date.0.year, ()) && date.0.month > 2 {
-            // Months after February in a leap year are offset by one less
-            offset += 1;
-        }
-        let prev_month_days = (30 * (date.0.month as i32 - 1) + offset) as u16;
-
-        prev_month_days + date.0.day as u16
+        day_of_year(date.0.year, date.0.month, date.0.day)
     }

     /// Wrap the year in the appropriate era code
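The offset table in the removed `day_of_year` encodes how far each month's start deviates from "30 days per month". A minimal sketch of that idea, checked against a plain cumulative sum, is given below; the function names and the local `is_leap_year` are hypothetical, and both functions assume `month` is in 1..=12.

```rust
/// Assumed Gregorian leap-year rule (a stand-in for the crate's helper).
const fn is_leap_year(year: i32) -> bool {
    year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
}

/// Day of year via the offset table from the removed code.
/// Panics if `month` is outside 1..=12.
fn day_of_year_offset_table(year: i32, month: u8, day: u8) -> u16 {
    // Days before each month minus 30 * (month - 1), for a non-leap year.
    const MONTH_OFFSET: [i32; 12] = [0, 1, -1, 0, 0, 1, 1, 2, 3, 3, 4, 4];
    let mut offset = MONTH_OFFSET[month as usize - 1];
    if is_leap_year(year) && month > 2 {
        offset += 1; // February 29 shifts every later month by one day
    }
    (30 * (month as i32 - 1) + offset) as u16 + day as u16
}

/// Reference: sum the lengths of all preceding months.
fn day_of_year_cumulative(year: i32, month: u8, day: u8) -> u16 {
    const DAYS: [u16; 12] = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
    let before: u16 = DAYS[..month as usize - 1].iter().sum();
    before + (is_leap_year(year) && month > 2) as u16 + day as u16
}

/// True when both methods agree on the first day of every month
/// in a leap year and a non-leap year.
fn methods_agree() -> bool {
    [2023, 2024].iter().all(|&year| {
        (1u8..=12).all(|m| day_of_year_offset_table(year, m, 1) == day_of_year_cumulative(year, m, 1))
    })
}
```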
9 changes: 4 additions & 5 deletions components/calendar/src/julian.rs
@@ -67,12 +67,11 @@ pub struct JulianDateInner(pub(crate) ArithmeticDate<Julian>);
 impl CalendarArithmetic for Julian {
     type YearInfo = ();

-    fn month_days(year: i32, month: u8, _data: ()) -> u8 {
+    fn month_days(year: i32, month: u8, data: ()) -> u8 {
+        // See `Iso::month_days`
         match month {
-            4 | 6 | 9 | 11 => 30,
-            2 if Self::is_leap_year(year, ()) => 29,
-            2 => 28,
-            1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
+            2 => 28 | (Self::is_leap_year(year, data) as u8),
+            1..=12 => 30 | (month ^ (month >> 3)),
             _ => 0,
         }
     }
19 changes: 15 additions & 4 deletions utils/calendrical_calculations/Cargo.toml
@@ -23,7 +23,7 @@ rust-version.workspace = true

 # This is a special exception: The algorithms in this crate are based on "Calendrical Calculations" by Reingold and Dershowitz
 # which has its lisp code published at https://github.com/EdReingold/calendar-code2/
-license = "Apache-2.0"
+license = "Apache-2.0" # TODO: Need to add `MIT`/`GNU`?
Member

No, the license was picked by a lawyer. Why would GNU be needed?

Author

@Nikita-str Nikita-str Dec 3, 2024

There is an implementation of the algorithm (by the author of the article) in C/C++, and its repo has no Apache-2.0 license.

Here is a comment mentioning it in the PR.

So I'm not sure whether it is necessary to add either of them.

Member

Ah. We'd have to talk to the lawyer again for this.

Typically algorithms themselves aren't copyrightable, however we would indeed need to check with our lawyer to pull in this code.

Member

I believe the policy is, we can redistribute MIT (but not GPL) code under the Apache-2.0 license, and all third-party code should retain its copyright comments inline in the code, similar to the Reingold code.

Member

@sffc sffc Dec 3, 2024

The first link goes to a file with the following comment at the top

// SPDX-License-Identifier: GPL-3.0-or-later

According to the terms of the GPL license, which are fairly strict, an Apache-licensed crate such as calendrical_calculations would not be able to redistribute that code.

Member

A workaround to this type of issue would be for any GPL-licensed code to live in its own crate, and then either icu_calendar or calendrical_calculations has an optional Cargo feature to consume it. Clients who would like the speedup and are okay consuming GPL-licensed code would need to manually enable the Cargo feature.

I do not know whether such a GPL crate could live in this repository or whether it would need its own repository.

Member

I'd rather avoid introducing GPL licensed code in our dep tree at all.

Author

@sffc @Manishearth
As I understand it, you can implement algorithms from an article without such license restrictions.
If you compare the code at the link with the code in this PR, it is pretty clear* that the PR code was inspired by the article and only afterwards matched against the author's code, as a reference to authority. So maybe I can just remove the link to the author's implementation and leave only the links to the article?

[*]: Because of the naming in the article author's repo: unnamed constants and very short names that say almost nothing (except for the y/m/d ones, which say something, but nothing about how or why). In the PR, some constants were even changed because we have a larger interval of valid dates; in the author's code they are just magic numbers, and how would you change such constants without understanding what they are for? And of course the PR code has plenty of comments explaining why and how.

Member

Yes, I understand that generally algorithms are not copyrightable, but either way, we will have to get approval from our lawyer for doing this, and they may choose to be more cautious about this. We were already quite cautious about the Reingold & Dershowitz algorithms.


 [package.metadata.workspaces]
 independent = true
@@ -37,9 +37,20 @@ displaydoc = { workspace = true }
 log = { workspace = true, optional = true }

 [features]
+bench = []
 logging = ["dep:log"]
 std = []

-[package.metadata.cargo-all-features]
-# Bench feature gets tested separately and is only relevant for CI
-denylist = ["bench"]
+# [package.metadata.cargo-all-features]
+# # Bench feature gets tested separately and is only relevant for CI
+# denylist = ["bench"]
+
+[target.'cfg(not(target_arch = "wasm32"))'.dev-dependencies]
+criterion = { workspace = true }
+
+[[bench]]
+name = "iso"
+harness = false
+
+# [profile.bench]
+# lto = false