Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Location is not always populated in parse exceptions #294

Open
lread opened this issue Jun 30, 2024 · 3 comments
Open

Location is not always populated in parse exceptions #294

lread opened this issue Jun 30, 2024 · 3 comments

Comments

@lread
Copy link
Collaborator

lread commented Jun 30, 2024

Version
1.1.47

Platform
All

Symptom
The :row and :col map entries are not always populated in parsing exceptions data.

Exhibit 1
Case where :row and :col are absent for exception data:

(require '[rewrite-clj.parser :as p])

(try
  (p/parse-string-all "goodsym badsym/ oksym")
  (catch Throwable e
    {:msg (ex-message e)
     :data (ex-data e)}))
;; => {:msg "Invalid symbol: badsym/.",
;;     :data {:type :reader-exception, :ex-kind :reader-error}}

Exhibit 2
Case where position is available in exception data, but notice :line instead of :row.

(try
  (p/parse-string-all ":goodkw :badkw: :okkw")
  (catch Throwable e
    {:msg (ex-message e)
     :data (ex-data e)}))
;; => {:msg "[line 1, col 16] Invalid keyword: badkw:.",
;;     :data
;;     {:type :reader-exception, :ex-kind :reader-error, :file nil, :line 1, :col 16}}

Expected behavior
Here's a case that works as expected:

(try                   
  (p/parse-string-all "#:{:a 1}")
  (catch Throwable e
    {:msg (ex-message e)
     :data (ex-data e)}))
;; => {:msg "namespaced map expects a namespace [at line 1, column 3]",
;;     :data {:msg "namespaced map expects a namespace", :row 1, :col 3}}

Diagnosis
For default token handling, we delegate to clojure.tools.reader's read-string in these cases, row/col info is not available.
In the case of the keyword, we internalized and used clojure.tools.reader read-keyword and it throws using clojure.tools.reader throw mechanisms.

Action
I'll take a peek.

@lread lread added this to rewrite-clj Jul 3, 2024
@lread lread moved this to In Progress in rewrite-clj Jul 3, 2024
@lread
Copy link
Collaborator Author

lread commented Jul 4, 2024

It's probably worth studying, absent of rewrite-clj, what positional data tools.reader v1.4.2 throws.

(require '[clojure.tools.reader.edn :as edn]
         '[clojure.tools.reader.reader-types :as rt])


(defn read-all [s]
  (let [rdr (-> s
                rt/string-push-back-reader
                rt/indexing-push-back-reader)]
    (loop [acc []]
      (let [n (edn/read {:eof nil} rdr)]
        (if n
          (recur (conj acc n))
          acc)))))

1 col after the problematic token:

          ;123456789
(read-all "boo bah/ bang")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 9] Invalid symbol: bah/.

But when at eof, col at the end of the problematic token:

          ;1234
(read-all "bah/")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 4] Invalid symbol: bah/.

Unclosed vector at eof, col at end of last item:

          ;1234
(read-all "[boo")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 4] Unexpected EOF while reading item 1 of vector, starting at line 1 and column 1.

...but not consistently, here we are 1 col after eof:

          ;1234567
(read-all "[boo  ")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 7] Unexpected EOF while reading item 1 of vector, starting at line 1 and column 1.

1 char after invalid keyword:

          ;12345678901
(read-all ":boo :bah: :bang")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 11] Invalid keyword: :bah:.

1 char after eof for unterminated string at eof:

          ;1 234
(read-all "\"ab")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 4] Unexpected EOF reading string starting ""ab.

At end col of invalid number:

          ;123
(read-all "42Z")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 3] Invalid number: 42Z.

          ;1234
(read-all "42Z2")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 4] Invalid number: 42Z2.

Namespaced map errors might try to be a little more targeted, but still seems off by 1 col:

          ;1234567
(read-all "#:foo[fah]")
;; => Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
;;    [line 1, col 7] Namespaced map with namespace foo does not specify a map.

Note that the above does not imply what types of exceptions rewrite-clj does/will throw, it is just a glance at what tools.reader does today.

So tools.reader points you at the end of the problematic token/code and sometimes 1 col after the problematic token/code and sometimes 1 col after eof. Which is probably "good enough" for it.

Since rewrite-clj is carefully adding positions at parse time, I think I'll explore the possibility of doing better.
I'm thinking it might be better to point at the start of the problematic token/code. But I'll also explore including both start and end positions. I'll keep in mind that this would technically be a breaking change for folks who rely on locations in exception data.

One thing to note is that I am talking about parsing only here.
Tools.reader is also employed when sexpr is called on a string node.
I don't have plans to provide positional data on throws for sexpr.

@borkdude
Copy link
Collaborator

borkdude commented Jul 4, 2024

Note that work like this has also been done in the rewrite-clj + tools reader fork in clj-kondo

@lread
Copy link
Collaborator Author

lread commented Jul 4, 2024

Oh handy to know, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress
Development

No branches or pull requests

2 participants