How to efficiently return full yaml file with updated values? #110
Hi @mleveck! So, if I understand correctly, you were hoping to do something like:
(-> "example.yaml"
    slurp
    yaml/parse-string
    (assoc-in [:clusters 0 :name] "myname2"))
But this fails because the :clusters value is a lazy seq, which is not associative. If you really want to use assoc-in, you could convert the lazy seqs to vectors first:
(require '[clojure.walk :as walk])
(defn unlazify [x]
  (walk/postwalk #(if (= clojure.lang.LazySeq (type %))
                    (vec %)
                    %)
                 x))
(-> "example.yaml"
    slurp
    yaml/parse-string
    unlazify
    (assoc-in [:clusters 0 :name] "myname2"))
;; => {:apiVersion "v1",
;;     :clusters [{:cluster nil, :server "http://myserver", :name "myname2"}]}
But this might be naive, not sure. Perhaps others will chime in. Another good place to ask this kind of question is on Clojurians Slack in the #beginners channel.
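For readers without the YAML file at hand, the behaviour discussed here can be reproduced on plain Clojure data. This is only a minimal sketch: `parsed` is a hand-built stand-in for the parser's output (an assumption, not real clj-yaml output), and `assoc-in-error` is a helper introduced purely for illustration. It tests for lazy seqs with `instance?`, a small stylistic variation on the snippet above.

```clojure
(require '[clojure.walk :as walk])

;; Stand-in for parsed YAML (assumption): `map` returns a lazy seq,
;; mimicking the :clusters value described in this thread.
(def parsed
  {:apiVersion "v1"
   :clusters   (map identity [{:cluster nil
                               :server  "http://myserver"
                               :name    "myname"}])})

(defn assoc-in-error
  "Illustrative helper: returns the class of the exception thrown by
  assoc-in, or nil when assoc-in succeeds."
  [m ks v]
  (try
    (assoc-in m ks v)
    nil
    (catch Exception e
      (class e))))

(defn unlazify
  "Walks x bottom-up and replaces every lazy seq with a vector."
  [x]
  (walk/postwalk #(if (instance? clojure.lang.LazySeq %) (vec %) %) x))

;; A lazy seq is not associative, so assoc-in cannot index into it.
(def failure (assoc-in-error parsed [:clusters 0 :name] "myname2"))

;; After conversion, the index 0 step of the path works on the vector.
(def updated (assoc-in (unlazify parsed) [:clusters 0 :name] "myname2"))
```

The walk visits the whole tree, so it costs more than converting a single known sequence, but it needs no knowledge of where the sequences sit.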
Thanks @lread. Very helpful. Yes, that is exactly what I was wanting to do, and pointing out clojure.walk is very helpful. After I posted this question, I implemented something like it (and likely less efficient) by hand to get past the issue. I just expected that someone might say "Oh, we don't use assoc-in to do this. We do it another way with this library". If there isn't a better solution, it seems like having the initial parse optionally decode as a vec could be an ok feature. Would the project be open to a PR that adds an option to decode as a vec rather than a seq (with default behavior remaining as is)?
If the project is open to a PR like that ^ I think there would need to be a discussion of what the behavior would be if a user set it to true and
Glad it helped! Thank you for your kind offer, but I don't think we are interested in such a PR at this time.
This has come up so often that I think it's an appropriate addition to this library.
I propose just an option Inconsistency with The relevant bit of code would be here? Another idea would be to let the user pass a function that gets wrapped around the Most YAML you typically parse for CI etc. is not recursive. Having the simple option to support the 99% use case makes sense IMO.
Huh, that's the benefit of having more than one maintainer! As for naming, I'm a bit on the naive side on how recursive YAML can fail. Also I can recap our options, as I see them, if that's helpful.
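To make the proposal concrete, here is a rough sketch of how a decode step could switch between lazy seqs and vectors. Everything in it is hypothetical: `decode` and the `:vectors` option are names invented for illustration, and this is not clj-yaml's actual decoding code or an existing option.

```clojure
(defn decode
  "Hypothetical decoder: converts plain Java collections to Clojure data.
  With {:vectors true} (an invented option name) lists become vectors;
  otherwise they become lazy seqs."
  [node {:keys [vectors] :as opts}]
  (cond
    (instance? java.util.Map node)
    (into {} (for [[k v] node]
               [(keyword k) (decode v opts)]))

    (instance? java.util.List node)
    (if vectors
      (mapv #(decode % opts) node)   ; eager, associative by index
      (map #(decode % opts) node))   ; lazy, not associative

    :else node))

;; Stand-in for what a YAML loader might hand back (assumption).
(def raw
  (java.util.HashMap.
   {"clusters" (java.util.ArrayList.
                [(java.util.HashMap. {"name" "myname"})])}))

(def as-seqs    (decode raw {}))
(def as-vectors (decode raw {:vectors true}))
```

One trade-off worth noting: `mapv` realizes every element up front, so an eager option like this gives up laziness for input it is applied to.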
I've seen the postwalk solution come up often enough (e.g. in the babashka channel, or in an issue here). Some people just expect a vector. You can find an old issue here: #29
Thanks @borkdude, the reverted #18 is very informative.

naming

On naming: yes, agreed. I think "array" might be a SnakeYAML implementation detail for a YAML sequence. How about

understanding recursive YAML

I don't yet understand valid use cases for circular YAML, ex:
recursive:
  name: "A node"
  child: &ref_node
    name: "Child node"
    child: *ref_node
I'm going to pester Ingy to learn more and will follow up.
Or...
Or
(it'd be good to check names of existing options and go for some consistency, I haven't checked yet)
Yeah seems to fit well. Existing boolean options, for example, do not use question marks.
circular structures in clj-yaml

I've had an initial peek at existing behaviour, all without any vectorization.

allow-recursive-keys - what does it do?

I did not realize the limited scenario
If we steal a sample from our tests:
(def recursive-yaml "&A
- *A: *A")
We see parsing fails as expected if we do not specify :allow-recursive-keys:
(def r1 (yaml/parse-string recursive-yaml))
;; => Execution error (YAMLException) at org.yaml.snakeyaml.constructor.SafeConstructor/processDuplicateKeys (SafeConstructor.java:109).
;; Recursive key for mapping is detected but it is not configured to be allowed.
And it passes parsing as expected with :allow-recursive-keys true:
(def r2 (yaml/parse-string recursive-yaml :allow-recursive-keys true))
;; => #'user/r2
But if we try to evaluate the result, we get a stack overflow error:
r2
;; ....{({({({Error printing return value (StackOverflowError) at clojure.core/deref (core.clj:2337).
;; null

other circular YAML

We can attempt to parse other arbitrary recursive YAML outside the scope of the
This YAML fails immediately with a stack overflow:
(def circular-yaml "
recursive:
  name: A node
  child: &ref_node
    name: Child node
    child: *ref_node")
(def c1 (yaml/parse-string circular-yaml))
;; => Execution error (StackOverflowError) at clj-yaml.core/eval9075$fn$iter$fn (core.clj:230).
;; null
We can grab the first element from this next YAML, but it blows the stack on further attempts:
(def partial-circular-yaml "
- one
- two
- three
- four
- &x { x: *x }
- billy")
(def p1 (yaml/parse-string partial-circular-yaml))
(-> p1 first)
;; => "one"
(-> p1 second)
;; => Execution error (StackOverflowError) at clj-yaml.core/eval9075$fn$iter$fn (core.clj:230).
;; null

What about SnakeYAML?

Can SnakeYAML load the above examples without issue? I'll go to Java to explore:
package org.example;
import org.yaml.snakeyaml.LoaderOptions;
import org.yaml.snakeyaml.Yaml;

public class Main {
    public static void yamlTest(String yamlIn, LoaderOptions opts) {
        System.out.println("\nyaml:\n" + yamlIn);
        try {
            Yaml yaml = new Yaml(opts);
            Object loaded = yaml.load(yamlIn);
            System.out.println("dump:\n" + yaml.dump(loaded));
        } catch (Throwable e) {
            System.out.println("ex: " + e.getMessage());
        }
    }

    public static void yamlTest(String yamlIn) {
        yamlTest(yamlIn, new LoaderOptions());
    }

    public static void main(String[] args) {
        // this first one should fail without allow recursive keys
        yamlTest("""
                &A
                - *A: *A""");
        // the rest should pass
        LoaderOptions options = new LoaderOptions();
        options.setAllowRecursiveKeys(true);
        yamlTest("""
                &A
                - *A: *A""",
                options);
        yamlTest("""
                recursive:
                  name: A node
                  child: &ref_node
                    name: Child node
                    child: *ref_node
                """);
        yamlTest("""
                - one
                - two
                - three
                - four
                - &x { x: *x }
                - billy
                """);
    }
}
Outputs:
So, yes, it seems to.

Use case for circular YAML

My usage of YAML is limited to reading config files.
I'm guessing that most users of clj-yaml will not bump into circular YAML, but I do not know.

Initial thoughts

If my experiments above are sound, clj-yaml doesn't handle recursive YAML. Since clj-yaml doesn't currently handle recursive YAML well, it seems that adding
Watcha think?
I think for the circular case it's best to go back to the test that was added to support this.
Yes, agreed.
The
// https://en.wikipedia.org/wiki/Billion_laughs_attack
private boolean allowRecursiveKeys = false;
Here's the relevant SnakeYAML commit.
Notice that the clj-yaml test does not evaluate the result.
Do you mean "realize"? What happens if you only take the first n elements, does that work? Just out of curiosity, not that I really need this feature myself. Perhaps the option got added to be able to parse recursive YAML, while not using the recursive field and be able to read the rest of the YAML. Who knows.
But anyway, adding the option still seems good to me. Or are you suggesting making a breaking change and swapping defaults?
In the unit test YAML, no:
user=> (require '[clj-yaml.core :as yaml])
nil
user=> (def recursive-yaml "&A
- *A: *A")
#'user/recursive-yaml
user=> (first (yaml/parse-string recursive-yaml :allow-recursive-keys true))
#ordered/map ([(#ordered/map ........([(Error printing return value (StackOverflowError) at java.util.LinkedHashMap/entrySet (LinkedHashMap.java:674).
null
But see
Yeah, me neither. Do you know of anybody who has wanted to parse circular YAML with clj-yaml?
I guess that I know there is a bunch of text in my exploration... but I'll restate that SnakeYAML will still load circular YAML when I don't know why the SnakeYAML
I did not propose this but thought about it too. But we might accidentally break existing code in ways we did not anticipate if we change the default. So, I was thinking of updating the user guide to always use But really, still on the fence on this.
I'm leaning to this as well.
I think I am also in a situation where I would need vectors instead of seqs. I also don't think I need ordered collections, since ordering doesn't matter for my use, but that's not a big issue IMO.
The ordering is important for round-tripping. Agree on vectors being a more useful format. You can use the clojure.walk solution above until this is implemented.
If there is a better forum for this question please let me know. Also I'm relatively new to Clojure, so apologies in advance.
Minimal example yaml file (the actual files I'm dealing with are larger):
when parsed gives back:
I'd like to get a copy of that map with the cluster name changed from "myname" to "myname2". But...
The type of clusters is a lazy seq
This seems to prevent using assoc-in or update-in, as a lazy seq isn't an associative data structure. I can obviously navigate into the data structure by threading it through a series of keywords and calls like first. But then I end up returning just the innermost map and not the full file.
I feel like what I want to do can't be uncommon, so I must be missing something. Can you give me some guidance?
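When only one known sequence stands in the way, a targeted conversion may be enough to return the full map with a single value changed. The sketch below works on hand-built data: `parsed` is a stand-in for the parser's output (an assumption, since the example file is not shown here), and both approaches use only clojure.core.

```clojure
;; Stand-in for the parsed file (assumption): :clusters is a lazy seq.
(def parsed
  {:apiVersion "v1"
   :clusters   (map identity [{:cluster nil
                               :server  "http://myserver"
                               :name    "myname"}])})

;; Option 1: convert just the one sequence to a vector, then use assoc-in.
(def updated-1
  (-> parsed
      (update :clusters vec)
      (assoc-in [:clusters 0 :name] "myname2")))

;; Option 2: keep the seq and rewrite only the element at index 0.
(def updated-2
  (update parsed :clusters
          (fn [clusters]
            (map-indexed (fn [i cluster]
                           (if (zero? i)
                             (assoc cluster :name "myname2")
                             cluster))
                         clusters))))
```

Both return the whole map rather than only the innermost one. Option 1 changes the type of `:clusters` to a vector, while option 2 leaves it a seq.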