Allow multiplication in number values #1036

Open
J-Vernay opened this issue Aug 11, 2024 · 3 comments

Comments

@J-Vernay

Hello,

I suggest a common solution for both issues #514 (Add a duration/timedelta type) and #912 (Add nicer syntax for file sizes): allow multiplication wherever an integer is expected, and raise an error if the result would exceed the 64-bit range.

cache-time-s = 10 * 3600             # 10 hours
cache-size   = 1024 * 1024 * 1024    # 1 GiB

This design would solve these points:

  • Whether K, M, G are based on 1000 or 1024 (stated explicitly by the multiplication)
  • Software that reads TOML documents which unit it expects (e.g. by using a key suffix: -ms, -days, etc.), and the user can use multiplication to express the value more clearly
  • Allowing only integer multiplication keeps parsers simple (no operator precedence, no parentheses, no precision loss)

One open question is whether floats should be allowed in multiplications. Which type should be used for the computation when both integers and floats are present? What about precision in float computations? The simpler choice would be to say that only integers can be multiplied.
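To make the integer-only rule concrete, here is a minimal sketch in Python of how a parser might evaluate such a value. It is not taken from any TOML library: `eval_int_product` and its regular expression are hypothetical names, it assumes decimal integers only (no hex, octal, or binary), and it assumes the signed 64-bit range that TOML integers are expected to fit in. Checking the range after every factor is one possible design choice; a parser could also check only the final result.

```python
import re

# Signed 64-bit bounds (assumed to be the range the proposal refers to).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

# Hypothetical pattern for a decimal integer: optional sign, no leading
# zeros, single underscores allowed between digits.
_DECIMAL_INT = re.compile(r"[+-]?(0|[1-9](_?[0-9])*)")


def eval_int_product(text: str) -> int:
    """Evaluate 'a * b * c' where every factor is a decimal integer."""
    result = 1
    for factor in text.split("*"):
        factor = factor.strip()
        if not _DECIMAL_INT.fullmatch(factor):
            raise ValueError(f"not a decimal integer: {factor!r}")
        result *= int(factor.replace("_", ""))
        # Python integers never overflow, so the range check is explicit.
        if not INT64_MIN <= result <= INT64_MAX:
            raise OverflowError("product exceeds the signed 64-bit range")
    return result


print(eval_int_product("10 * 3600"))           # 36000
print(eval_int_product("1024 * 1024 * 1024"))  # 1073741824
```

Because only `*` is supported, the whole evaluation is a split and a fold: there is no precedence to resolve and no rounding to worry about.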

Have a nice day,
Julien Vernay

@arp242
Contributor

arp242 commented Aug 11, 2024

The main reason I'd like unit suffixes is to clarify what unit something is. In your examples you still need to clarify that cache-time-s is in seconds with an awkward key name, and cache-size could conceivably be in bytes, K, or M, so that comment is more or less mandatory.

So you still need either a somewhat awkward key name (cache-time-s) and/or a comment to clarify. Especially for time durations, because nanoseconds, microseconds, milliseconds, and seconds are all fairly common, sometimes even in the same file.

The biggest issue with suffixes is one of compatibility. I haven't looked at it in depth since, so maybe that's resolvable, but if not, maybe multiplication is the best we can do. Although I'm not entirely convinced that this:

cache-size = 1024 * 1024 * 1024  # 1G

is strictly better than:

cache-size = 1_073_741_824  # 1G

Neither reads particularly naturally, and in both cases you need that "# 1G" comment, although I suppose you could eliminate that by doing:

cache-size-bytes = 1024 * 1024 * 1024

But meh; it's an ugly key name and I still need to do "bytes → kb → mb → gb" in my head.
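For reference, the "bytes → kb → mb → gb" step can be spelled out as a small Python sketch. `describe_bytes` is a hypothetical helper written for illustration only; it assumes 1024-based units and only reduces a value while it divides exactly.

```python
def describe_bytes(n: int) -> str:
    """Render a byte count with the largest binary unit that divides it exactly."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    index = 0
    while n >= 1024 and n % 1024 == 0 and index < len(units) - 1:
        n //= 1024
        index += 1
    return f"{n} {units[index]}"


# Both spellings from the comment above denote the same number.
assert 1024 * 1024 * 1024 == 1_073_741_824

print(describe_bytes(1_073_741_824))  # 1 GiB
print(describe_bytes(1_000_000_000))  # 1000000000 B (not a multiple of 1024)
```

Either way of writing the value needs this conversion to be done somewhere, whether by the reader or in a comment.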

So I don't think I'd be in favour of this, and I'd prefer to think of a way to solve the compatibility issue.

@eksortso
Contributor

Thanks for the suggestion, @J-Vernay.

I'm a little concerned about what arbitrary integer multiplication may lead to. It suggests that arbitrary arithmetic of any type will be viable later. We need to avoid that, unless we can point to several more use cases where different multipliers are involved. And even then, it wouldn't address @arp242's concerns regarding compatibility of units of measure, which is a much more complicated matter.

@tintin10q

I think that this is a good suggestion and summary, but I want to argue against having expressions in TOML.
While it sounds like a small change, having multiplication means that you basically add expressions to the language.
Expressions should not be added to a configuration language that is supposed to be simple.
Having expressions requires an expression evaluator which is just a lot of complexity that a configuration language does not need.

When the file is parsed, the programmer expects to get the result of the expression, not some expression object. I think we can all agree on that. This means it falls to the parser to evaluate the expressions. You might say a simple multiplication does not hurt, but it does. It means you have to parse the expression and evaluate it recursively until nothing is left to reduce. People will naturally want to expand the expression language once * is added: first with other infix operators like +, ^ or **, and %, plus (), and then with functions like sin and cos and constants like pi or tau. Before you know it, every TOML parser needs to have a full math expression evaluator inside of it.

I believe expressions turn a configuration language into a DSL, and TOML should be a configuration language and not a DSL.
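As a rough illustration of that growth, the Python sketch below shows what an evaluator looks like once + and () join *. Nothing here comes from a real TOML parser: `tokenize` and `evaluate` are hypothetical names, and the grammar (non-negative decimal integers, +, *, parentheses) is an assumption chosen to keep the example short.

```python
import re

# One token: a run of ASCII digits, an operator, or a parenthesis.
_TOKEN = re.compile(r"\s*([0-9]+|[+*()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Recursive-descent evaluator: '*' binds tighter than '+'."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def atom():
        token = take()
        if token == "(":
            value = expr()  # parentheses recurse into the full grammar
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")

    def term():  # handles '*'
        value = atom()
        while peek() == "*":
            take()
            value *= atom()
        return value

    def expr():  # handles '+', one precedence level above term()
        value = term()
        while peek() == "+":
            take()
            value += term()
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError("trailing input after expression")
    return result


print(evaluate("2 + 3 * 4"))    # 14
print(evaluate("(2 + 3) * 4"))  # 20
```

A multiplication-only value needs none of this: it can be evaluated by splitting on * and multiplying the factors. Each extra operator adds a precedence level, and parentheses add recursion.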

Keep it simple!


4 participants