Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite-clj does not preserve CRLF whitespaces #187

Open
ikappaki opened this issue Sep 18, 2022 · 15 comments
Open

rewrite-clj does not preserve CRLF whitespaces #187

ikappaki opened this issue Sep 18, 2022 · 15 comments
Labels

Comments

@ikappaki
Copy link

ikappaki commented Sep 18, 2022

Version
1.1.45

Platform
Operating System: any
Clojure version: any
JDK vendor and version: any

Symptom
crewrite-clj does not preserve CRLF whitespace when converted from zipper back to string,

Reproduction
On the REPL:

(require '[rewrite-clj.zip :as z])
;; => nil
(-> (z/of-string "(ns nl-test\r\n  (:require [rewrite-clj.zip :as z]))")
      z/root-string)
;; => "(ns nl-test\n  (:require [rewrite-clj.zip :as z]))"

The \r\n line endings have been converted to \n.

Actual behavior
The zipper converts CRLF line endings to LF.

Expected behavior
As per documentation whitespaces should be preserved:

Rewrite-clj is a library that can read, update and write Clojure, ClojureScript and EDN source code while preserving whitespace and comments.

Diagnosis
The CRLF to LF conversion was an intentional workaround as per f78856e:

Handle Windows line endings

While we await https://clojure.atlassian.net/browse/TRDR-65, I've
introduced a work-around.

See rewrite-clj.reader.cljc comments for details.

Fixes https://github.com/clj-commons/rewrite-clj/issues/93

Action
I'm not familiar with the codebase, but I might try to have a look at it at some point.

@lread
Copy link
Collaborator

lread commented Sep 18, 2022

Thanks for raising such a clear and easy-to-digest issue @ikappaki, very much appreciated!

This is currently as designed, but I think rewrite-clj docs should describe this behaviour clearly if this is not already the case.

We use clojure tools.reader and I think it wants to normalize \r\n to \n but does not always handle this well. I raised an issue about this.

@lread lread added the docs label Sep 18, 2022
@ikappaki
Copy link
Author

ikappaki commented Sep 18, 2022

Thanks for raising such a clear and easy-to-digest issue @ikappaki, very much appreciated!

This is currently as designed, but I think rewrite-clj docs should describe this behaviour clearly if this is not already the case.

We use clojure tools.reader and I think it wants to normalize \r\n to \n but does not always handle this well. I raised an issue about this.

Hi @lread, and thank you for the prompt reply! May I please argue that there shouldn't be any exceptions in the design doc that suggest that all newlines seq should be treated as LF. I think the goal that wtitespaces should be preserved is quite powerful, since it guarantees compatibility across all architecture, which should be part of the design.

This adversely affects window's users. Its native line ending format is CRLF. If a zipper pipeline is constructed on such a file, it is likely to convert the CRLF to LF even if there are no other changes, and the file stored on disk with incompatible line endings. Every rewrite-clj consumer has to implement workarounds for just that case to maintain compatibility on windows. Instead it is much wiser IMHO to maintain whitespaces in this library.

To exaggerate a bit, imagine the trouble it would have caused on *nix if rewrite-clj was converting any newlines to CRLF instead.

IMHO the tools reader issue is an incidental complexity that should be either prioritised upstream (the issue looks stale?) or a workaround should be devised in rewrite-clj instead.

@borkdude
Copy link
Collaborator

The JIRA issue doesn't yet have a patch. Maybe it would help if someone wrote that patch and brought the issue to the attention of the core team either on Slack, ask.clojure, etc.

@borkdude
Copy link
Collaborator

I haven't looked inside rewrite-clj where this could be patched temporarily, but some looking into this might also help the process of contemplating if a workaround in rewrite-clj is feasible and desired.

@lread
Copy link
Collaborator

lread commented Sep 18, 2022

My understanding: I think the clojure tools.reader does not intend to preserve \r\n line endings. I think it intends to normalize them to \n. This is not Windows-friendly, but a reality.

Before my tools.reader work-around in rewrite-clj, files with mixed \r\n and \n were causing issues for rewrite-clj and raised as an issue by the author of zprint.

At this point not normalizing to \n might be considered a breaking change for some rewrite-clj users and would likely have to be optional.

For now I'm willing to document the behaviour (if I haven't already done so).

@lread
Copy link
Collaborator

lread commented Sep 18, 2022

So @ikappaki, I don't disagree with your sentiment at all here, but Clojure is not entirely Windows-friendly. We, unfortunately, inherit that Windows unfriendly-ness.

@lread
Copy link
Collaborator

lread commented Sep 18, 2022

Related: #93

@ikappaki
Copy link
Author

ikappaki commented Sep 18, 2022

So @ikappaki, I don't disagree with your sentiment at all here, but Clojure is not entirely Windows-friendly. We, unfortunately, inherit that Windows unfriendly-ness.

Sure @lread we are both I think on the same page here and we are just having a friendly discussion from our personal perspectives :)

I have been using Clojure primarily on MS-Windows for a long time now, and I have not noticed anything at the language level that I would consider brought me to a disadvantage compared to working on other operating systems. The tooling perhaps is an exception where the user has to go via PowerShell and calling conventions and arguments passing do not match between unix and windows, but I think this area is not relevant to this case.

I believe there is a large user base who are primarily developing on windows, that should not be discounted as second class citizens, since this hinders Clojure's wider adaptation.

Thanks :)

@lread
Copy link
Collaborator

lread commented Sep 18, 2022

I'm not in disagreement, and very much appreciate this issue and discussion.

My guess: Clojure Windows support is not first-class because there are fewer Clojure Windows users, and there are fewer Clojure Windows users because Windows support is not first-class.

I'll document this rewrite-clj behaviour but leave this issue open for a bit to potentially capture thoughts/ideas from others.

@borkdude
Copy link
Collaborator

Just for my understanding: rewrite-clj has a temporary workaround (normalizing \r\n to \n) for an issue in tools.reader, but tools.reader maintainers haven't responded in the issue, but word has it that they will never fix this?

Why does the workaround in rewrite-clj normalize and does not preserve the Windows newline?

@lread
Copy link
Collaborator

lread commented Sep 18, 2022

The current intent of clojure tools.reader is to normalize newlines to \n.
Rewrite-clj worked around a bug in that intent.

I was willing to work around a bug, but was not interested in changing the intended behaviour.
I felt that changing the intended behaviour would be riskier.

I do not know if the Clojure core team will address the issue I raised, or if they will change the newline normalization behaviour. There was interest on Slack when I raised the issue, but have not heard back since.

lread added a commit to lread/rewrite-clj that referenced this issue Sep 18, 2022
Thanks @ikappaki for raising clj-commons#187

I'll leave the issue open for a bit to gather more feedback but at least
we've documented the behaviour for now.
@lread lread added the hammock requires some thought label Sep 18, 2022
@lread
Copy link
Collaborator

lread commented Sep 18, 2022

Added a question to Ask Clojure.

lread added a commit that referenced this issue Sep 18, 2022
Thanks @ikappaki for raising #187

I'll leave the issue open for a bit to gather more feedback but at least
we've documented the behaviour for now.

I've included a link to a question I raised on Ask Clojure.
@borkdude
Copy link
Collaborator

borkdude commented Feb 9, 2023

Note:

user=> (read-string "\"foo\r\nbar\"")
"foo\r\nbar"
user=> (require '[clojure.tools.reader :as rdr])
nil
user=> (rdr/read-string "\"foo\r\nbar\"")
"foo\r\nbar"

I.e. the clojure reader(s) do not mess with newlines within literal strings, but rewrite-clj currently does.

But I don't see why rewrite-clj would not preserve newlines as they are written: it's a rewriting tool with preservation for whitespace after all.

@lread
Copy link
Collaborator

lread commented Feb 9, 2023

Yeah, it took me a while, but I finally agree.

The fact that the Clojure readers normalize newlines is an internal detail of the readers.

One detail that we might want to remember is that:

(read-string "\"foo
bar\"")

Will always (appropriately) return "foo\nbar" regardless of CR or CRLF between foo and bar.

@lread lread added bug and removed hammock requires some thought docs labels Feb 9, 2023
@lread
Copy link
Collaborator

lread commented Feb 14, 2023

I'm starting to think about this one a bit again.

New Newlines

When inserting a new newline, we have to decide the variant of that newline (CR or CRLF).

Ideally we'd just use whatever variant the source is already using but:

  1. the source might have mixed newline variants (here we could just use the first variant seen?)
  2. there might be no newlines in the source
  3. the user might be working on a subset of the source without newlines
  4. the user might be creating source from scratch

It would be nice if the newline variant selection could be automatic for new newlines, but I'm not sure if it can be.

Mixed Newlines

If a source has mixed newlines we'd preserve that oddity.

I remember mixed newlines causing the author of zprint some grief but that might have been due do issues with rewrite-clj and not anything else.
I'd keep this in mind.

@lread lread added this to rewrite-clj Jul 3, 2024
@lread lread moved this to Low Priority in rewrite-clj Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Low Priority
Development

No branches or pull requests

3 participants