Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paths can have FileSystem that have different separators than the OS-specific one #111

Open
onetom opened this issue Sep 19, 2023 · 7 comments

Comments

@onetom
Copy link

onetom commented Sep 19, 2023

I see a couple of places in the code, where OS support is hardwired for babashka.fs/file-separator.

It could be made more flexible, if we would use a filesystem specific separator,
since it's possible to use an S3 filesystem (via https://github.com/awslabs/aws-java-nio-spi-for-s3 for example) under Windows, which uses slash as separator, while local file paths would still use backslash.

java.nio.file.FileSystem - just like java.io.FileSystem - does have a getSeparator method, which we could use to be file-system agnostic.
The java.nio.file.Path#getFileSystem method allows obtaining the separator on a per-Path basis.

It's also possible to test this more dynamic implementation, using the JimFS library (https://github.com/google/jimfs), which can simulate both unix and win filesystems on any OS. Here is a nice tutorial on this topic:
https://www.baeldung.com/jimfs-file-system-mocking

As an aside, babashka.fs/path-separator is a separate question.
That probably makes sense to keep it as-is, since the OS won't change under a JVM process, while its running.

@borkdude
Copy link
Contributor

I see a couple of places in the code, where OS support is hardwired for babashka.fs/file-separator.

Can you be more specific, or just provide a PR to fix it?

@borkdude
Copy link
Contributor

borkdude commented Oct 10, 2023

I've read the issue again and now understand it better, but I'd like to have very specific examples / test cases of what currently fails with the "hard-coded" approach, i.e. a repro. We can deprecate babashka.fs/file-separator in favor of a new function and use that instead to make the test(s) pass.

@borkdude borkdude changed the title Windows vs UNIX Paths can have FileSystem that have different separators than the OS-specific one Oct 10, 2023
@onetom
Copy link
Author

onetom commented Oct 10, 2023

i can prepare a repo for you in the next few hours, to demonstrate my specific use-cases with both S3 filesystem provider and Jimfs in-memory provider.

i gave a lot of thoughts how would I address this problem, but I'm not sure.

we made a tiny wrapper in our project to explore different approaches, but none of them were very satisfactory, that's why I haven't followed up on the issue yet.

@borkdude
Copy link
Contributor

ok sorry for sounding impatient. the jimfs thing sounds like it could work in the unit tests, s3 is a bit more complicated.
just a zip filesystem could also work since the file separator is always "/" there, even on Windows

@onetom
Copy link
Author

onetom commented Oct 10, 2023

Made a repro repo:
https://github.com/onetom/babashka-fs-issue-111

Here is one symptom of not taking into account of the file system of a Path:

  (-> s3-path (fs/path "test.txt"))
  ; Execution error (UnsupportedOperationException)
  ;   at software.amazon.nio.spi.s3.S3Path/toFile (S3Path.java:729).
  ; S3 Objects cannot be represented in the local (default) file system

For further context see the repl.s3 namespace.

I will commit some Jimfs examples too tomorrow.

@onetom
Copy link
Author

onetom commented Oct 11, 2023

Pushed some Jimfs examples too.

Regarding the implementation, here are some observations:

  1. The path & as-path functions are calling io/file, involves the legacy java.io.File concept, which doesn't support pluggable file systems, if i understand it correctly, so that could be the reason for some of the jimfs:// & s3:// errors I've experienced before.
  2. java.nio.file.Paths class is deprecated in favor of Path/of.
  3. appending to an existing Path can be done by Path#resolve & Path#resolveSibling, so we can avoid involving io/file there too.

In light of these, here are some proposed changes.

Use Path/of:

(defn- as-path
  ^Path [path]
  (cond
    (instance? Path path) path
    (instance? URI path) (Path/of path)
    (instance? File path) (.toPath path)
    :else (Path/of path (make-array String 0))))

this hasn't broken any existing tests.

Use Path#resolve:

(defn path
  "Coerces f into a Path. Multiple-arg versions treat the first argument as
  parent and subsequent args as children relative to the parent."
  (^Path [f]
   (as-path f))
  (^Path [parent child]
   (.resolve (as-path parent) (as-path child)))
  (^Path [parent child & more]
   (reduce path (path parent child) more)))

this broke 5 tests, so I'll look into those later.

@borkdude
Copy link
Contributor

Sounds good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants