A way to let every peg rule to produce its own object... #32

Closed
khchen opened this issue Apr 11, 2021 · 4 comments

Comments


khchen commented Apr 11, 2021

Here is some code to demonstrate my problem.

import npeg, strutils

const
  testData = "10:10;12:00;22:40" # last item has no ';'

proc test1() =
  var list: seq[string]
  let peg = peg "start":
    start <- +statement
    statement <- >time * ';':
      list.add $1
    time <- Digit[1..2] * ':' * Digit[1..2]

  discard peg.match(testData)
  echo list

Output: @["10:10", "12:00"]

The output is what we want. However, we would like objects instead of strings, so...

type
  Time = object
    hour: int
    min: int

proc test2() =
  var list: seq[Time]
  let peg = peg "start":
    start <- +statement
    statement <- time * ';'
    time <- >Digit[1..2] * ':' * >Digit[1..2]:
      let (hour, min) = (parseInt($1), parseInt($2))
      if hour in 0..23 and min in 0..59:
        list.add Time(hour: hour, min: min)

  discard peg.match(testData)
  echo list

Output: @[(hour: 10, min: 10), (hour: 12, min: 0), (hour: 22, min: 40)]
We get a seq[Time] as output. However, code block captures are always executed, even when the parser state is rolled back afterwards. The result is wrong: 22:40 is added even though it is not followed by ';', so the statement rule never matched it.
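The effect can be reproduced without npeg. The following is only a hand-written sketch of the same situation, not npeg's implementation: `statement`, `eager` and `committed` are hypothetical names invented for illustration. A side effect that runs as soon as the inner time pattern matches survives the later failure on the missing ';', while a side effect that runs only after the whole statement matched does not.

```nim
import std/strutils

const
  testData = "10:10;12:00;22:40" # last item has no ';'

var eager: seq[string]      # filled as soon as the time part matches
var committed: seq[string]  # filled only once the whole statement matches

# Hypothetical hand-rolled matcher for `time * ';'` (not npeg code).
# Returns the position after the statement, or -1 if it does not match.
proc statement(s: string, pos: int): int =
  result = -1
  var i = pos
  # Loosely scan the time part: digits and ':' only.
  while i < s.len and (s[i].isDigit or s[i] == ':'):
    inc i
  if i == pos:
    return
  let time = s[pos ..< i]
  eager.add time            # runs before we know whether ';' follows
  if i < s.len and s[i] == ';':
    committed.add time      # runs only when the statement really matched
    result = i + 1

var pos = 0
while pos >= 0 and pos < testData.len:
  pos = statement(testData, pos)

echo eager
echo committed
```

Here `eager` ends up with all three times and `committed` with only the first two, which mirrors the difference between test2 above and test3 below.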

proc test3() =
  var list: seq[Time]
  let peg = peg "start":
    start <- +statement
    statement <- time * ';':
      for i in countup(1, capture.len-1, step=2):
        let (hour, min) = (capture[i].s.parseInt, capture[i+1].s.parseInt)
        if hour in 0..23 and min in 0..59:
          list.add Time(hour: hour, min: min)

    time <- >Digit[1..2] * ':' * >Digit[1..2]

  discard peg.match(testData)
  echo list 

Output: @[(hour: 10, min: 10), (hour: 12, min: 0)]

Finally, we get what we want. However, I think this code is bad because the Time object is produced outside of the time rule.
If the statement rule were statement <- (time | date | something) * ';', the code would be really ugly.

The way I can resolve the problem for now is:

import marshal

proc test4() =
  let peg = peg "start":
    start <- +statement:
      var list: seq[Time]
      for i in 1..<capture.len:
        list.add to[Time](capture[i].s)
      push($$list)

    statement <- time * ';'

    time <- >Digit[1..2] * ':' * >Digit[1..2]:
      let (hour, min) = (parseInt($1), parseInt($2))
      if hour in 0..23 and min in 0..59:
        push($$Time(hour: hour, min: min))

  var list = to[seq[Time]](peg.match(testData).captures[0])
  echo list

Output: @[(hour: 10, min: 10), (hour: 12, min: 0)]

OK, it works fine and the code is clear: each rule produces its own object. But it will be very slow due to serialization/deserialization, and marshal does not work at compile time (https://github.com/treeform/jsony does).

In the end, is there a better/smarter way to do this?
Sorry for my bad English.

Owner

zevv commented Apr 11, 2021

I'm afraid I have no better or smarter way to do this. This is a limitation of the way code block captures currently work. There are some ideas floating around to change this behavior, but no concrete solutions yet. You can take a peek at the other open issues #14 and #24, as they basically revolve around the same issue as yours.

Owner

zevv commented Apr 11, 2021

What about:

import npeg, strutils

type
  Time = object
    hour: int
    min: int

const
  testData = "10:10;12:00;22:40" # last item has no ';'

proc test2() =
  var list: seq[Time]
  let peg = peg "start":
    start <- +statement
    statement <- time * ';':
      let (hour, min) = (parseInt($1), parseInt($2))
      if hour in 0..23 and min in 0..59:
        list.add Time(hour: hour, min: min)

    time <- >Digit[1..2] * ':' * >Digit[1..2]

  discard peg.match(testData)
  echo list

test2()

Owner

zevv commented Apr 11, 2021

Oh, right, I now see this is your test3() case. I should read properly first and only then write.

Author

khchen commented May 1, 2021

I use your library to rewrite my autolayout parser. The problem was resolved by a simple serialization/deserialization mechanism. Thank you very much.
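The thread does not show what that mechanism looked like. As one possible shape, and purely as an assumption, a Time can be packed into a short delimited string instead of going through marshal. `encode` and `decode` below are hypothetical helpers, not part of npeg or of the author's parser. They use only strutils. Whether this is fast enough for a given parser, or usable at compile time, is not something this sketch verifies.

```nim
import std/strutils

type
  Time = object
    hour: int
    min: int

# Hypothetical helper: pack a Time into "hour,min" so it can travel
# through a string capture (for example via push) without marshal.
proc encode(t: Time): string =
  $t.hour & "," & $t.min

# Hypothetical helper: the inverse of encode. Raises on malformed input.
proc decode(s: string): Time =
  let parts = s.split(',')
  doAssert parts.len == 2, "malformed Time capture: " & s
  Time(hour: parseInt(parts[0]), min: parseInt(parts[1]))

let packed = encode(Time(hour: 12, min: 0))
echo packed          # 12,0
echo decode(packed)  # (hour: 12, min: 0)
```

In test4 this would mean calling encode where `$$` is used in the time rule and decode where `to[Time]` is used in the start rule. A rule alternative such as date would need its own tag or delimiter so the receiving rule can tell the encodings apart.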

@zevv zevv closed this as completed Jun 20, 2021