Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support #[conf(serde(flatten))] #15

Open
cbeck88 opened this issue Oct 14, 2024 · 0 comments
Open

Support #[conf(serde(flatten))] #15

cbeck88 opened this issue Oct 14, 2024 · 0 comments

Comments

@cbeck88
Copy link
Owner

cbeck88 commented Oct 14, 2024

Flattening is important because:

  • You may want to replace a single config value, which is a primitive in your conf struct, with a sub struct in order to evolve your program to meet new needs. This is helpful if a part that was pretty simple before turns into a new component, which now needs extra config options and may need more in the future.
  • If you actually inlined all the fields into the existing struct, then there is a lot of plumbing to pass the additional fields all the way through the program.
  • If you make a struct, but then use the flatten option, then you don't create any new ongoing plumbing work, and you also don't break compatibility with existing working config.
  • If you are using the #[conf(serde)] stuff, then you will need some form of #[conf(serde(flatten))] to exist to actually avoid breaking compatibility.

However, serde(flatten) is a bit fraught -- we don't want to do exactly what serde(derive) did because it has a lot of known problems. In particular, it is not compatible with #[serde(deny_unknown_fields)]. But that is our default behavior, because if you don't have that it's way too easy to misconfigure things. We actually have the reverse -- you opt out of this with #[conf(serde(allow_unknown_fields))].

Instead what I want to do is create a new trait for things that can be conf(serde(flatten)). The idea is to use a pub-sub model when these substructures communicate with their parent:

  • Flattened structures announce in advance what keys they are interested in reading. ("subscribe")
  • The parent structure looks at its own fields, and which of them are flattened, and collects all the subscriptions. (there would be some debug assert mechanism to assert that no subscriptions overlap)
  • The parent structure receives a serde::de::MapAccess from the deserializer, which we think of as an iterator over key-value pairs where the values have heterogenous types.
  • Each time it gets a key-value pair, it "publishes" it to whichever field, if any, expressed interest in it. If no field expressed interest, that's an "unexpected value" error, unless #[conf(serde(allow_unknown_fields))] was used.
  • "Publishing" the key value pair means that we push a (&str, conf::NextValueProvider) to whoever subscribed to it. (We also have to pass the ConfSerdeContext, but let's ignore that detail for the moment.)
  • Any flattenable type has to provide a special state machine called a StructEntryVisitor. This is something that can receive a (&str, conf::NextValueProvider) pair, make progress in initializing itself, and return Result<(), Error>. Additionally it has to provide a finalize(self) function which returns Result<Self::Value, Error>. This is called when there are no more key value pairs. Intuitively, the state of this thing is the stuff that goes on the stack during a call to serde::de::Visitor::visit_map when deserializing it normally with serde, but it "yields" to the caller after processing each key value pair.
  • All our implementations of DeserializeSeed::deserialize on #[conf(serde)] structures would be modified to go via their StructEntryVisitor to avoid code duplication.
  • Intuitively, if a flattened struct itself contains flattened structs, this works out because when it is subscribing to keys, it would naturally subscribe to the keys that its substructures announce that they are interested in subscribing to. And when key-value pairs are published to it, it can match on the keys and decide which primitive fields or substructures to pass them on to.

I believe that this model will sidestep all of the issues in serde_derive caused by "internal buffering" of "content", since key-value pairs will naturally propagate right to their final home, without having to be stored in any temporary representation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant