More powerful querying mechanism #190
Replies: 6 comments 4 replies
-
Reminds me of C#'s LINQ. I love the text representation that klog have, but writing SQL parser on top seems like a lot of work. Not sure if makes sense, but the simplest lazy idea would be to export the data into sqlite (seems the format is well-enough defined to get into simple table) and then allow doing this SQL query on top of that. Not super efficient, but also data size is trivial. |
Beta Was this translation helpful? Give feedback.
-
I’m currently experimenting with a different approach, which also seems promising. The idea would be to add flags such as So in order to query for “all records from March or April 2022”, you could do:
Or, “all records from March 2022 that contain either the tag
The way it works is that you specify a pair of filter expressions (operands), and then combine them via a logical operator such as What I like about this is that it doesn’t break out of the CLI realm, and it would be completely backwards-compatible with the current filter flag mechanism (provided that Once I have something working, I’ll push a branch for trying it out. |
Beta Was this translation helpful? Give feedback.
-
This looks relevant for sqlite-based queries https://github.com/samonzeweb/godb |
Beta Was this translation helpful? Give feedback.
-
I think being able to run SQL queries on klog files would be simply brilliant. And your explanation of what you are trying to achieve kind of screams for this mechanism. You seem to be against the idea of having people writing SQL queries, but I think that's exactly what one wants when one needs a highly custom report. I think the mechanism should simply spawn an sqlite console or read a query from STDIN. You could also hide that logic for some already prepared queries, but please don't hide the access to the thing entirely. |
Beta Was this translation helpful? Give feedback.
-
A quick update: a few months ago, I started an experimental branch to tinker around with the idea of flag-based querying (the RPN idea), but unfortunately that turned out to be more complex than anticipated, so I had to abandon it. At this point, I don’t think it makes sense for klog to incorporate an “external” querying mechanism such as SQL, as it’s out of scope for the project. A potential starting point for anyone interested in creating something: the |
Beta Was this translation helpful? Give feedback.
-
FYI: As a SQL refresher + It seems to have generated me pretty much the same report as a -- Quick start:
-- klog json default.klg > default.klg.json
-- usql "duckdb:///tmp/duck.db" <default.klg.sql
LOAD
json;
CREATE
OR REPLACE TABLE records AS
SELECT row_number() OVER (PARTITION BY (date)) as duplicate, *
FROM (SELECT 'default.klg' AS filename, unnest(records, max_depth:=2) FROM read_json_auto('default.klg.json'))
ORDER BY date;
-- fill-in all empty days with neutral zero-entries so the days are not missing from report
UPDATE records
SET entries = [{
'type': 'duration'
, 'summary': ''
, 'tags': []
, 'total': ''
, 'total_mins': 0
, 'start': ''
, 'start_mins': 0
, 'end': ''
, 'end_mins': 0
}]
WHERE entries == [];
CREATE
OR REPLACE TABLE entries AS
SELECT *, (list_concat(record_tags, tags)) AS all_tags
FROM (SELECT r.filename AS filename,
r.date AS "date",
r.duplicate AS duplicate,
r.tags AS record_tags,
unnest(entries, max_depth := 2)
FROM records r);
CREATE
OR REPLACE TABLE daily_reports AS
SELECT
month, day, (SUM (total_mins)) AS total, (SUM (total_mins) FILTER (NOT is_oncall AND NOT is_off)):: INTEGER AS standard, bool_or(is_oncall_standby) AS is_oncall_standby, (SUM (total_mins) FILTER (is_oncall AND NOT is_off)):: INTEGER AS oncall_standard, (SUM (total_mins) FILTER (is_oncall AND is_off)):: INTEGER AS oncall_high_rate
FROM (
SELECT strftime(date, '%Y-%m') AS "month", strftime(date, '%Y-%m-%d') AS "day", (total_mins) AS total_mins, (list_contains(all_tags, '#off') AND NOT list_contains(all_tags, '#nooff')) AS is_off, (list_contains(tags, '#oncall')) AS is_oncall, (list_contains(record_tags, '#oncall')) AS is_oncall_standby
FROM entries
)
GROUP BY month, day
ORDER BY day
;
SELECT day, format('{:02d}h{:02d}m', (total // 60):: INT, (total % 60):: INT) as total, format('{:02d}h{:02d}m', (standard // 60):: INT, (standard % 60):: INT) as standard, is_oncall_standby, format('{:02d}h{:02d}m', (oncall_standard // 60):: INT, (oncall_standard % 60):: INT) as oncall_standard, format('{:02d}h{:02d}m', (oncall_high_rate // 60):: INT, (oncall_high_rate % 60):: INT) as oncall_high_rate
FROM daily_reports
WHERE month == '2024-05'
ORDER BY day DESC; |
Beta Was this translation helpful? Give feedback.
-
This originally came up in #187 by @rkj.
The CLI filter flags are currently all “inclusive” and (if multiple ones appear) chained together via
AND
, e.g.klog total --period 2021-12 --tag work myfile.klg
.I had already been thinking how querying could be made more powerful – think: “give me all records from May 2021 and June 2021, that either contain the tag
work
or the tagoffice
, unless the should-total is0h
.”I think, though, that the CLI flags are rather limited in that regard. I’m not sure it’s worth to try extending them that way, since this will inevitably lead to a dead end.
I feel that it might make more sense to provide an additional querying mechanism, that is more powerful.
Like:
(This is just to get the idea across! Implementing a custom SQL dialect is a bit over-the-top.)
If anyone has thoughts or ideas, feel free to share them.
Requirements
AND
operator, it should supportOR
andNOT
AND
) should work in the same simple way as it does right nowBeta Was this translation helpful? Give feedback.
All reactions