Releases: EdJoPaTo/website-stalker

v0.18.0

01 Feb 10:08

html_prettify attribute improvements

Before this release, diffs like these occurred regularly:

-<a class="external link">
+<a class="link external">

-<a style="color: white; display: none">
+<a style="display:none;color:white">

This release sorts the values of class attributes and formats style attributes consistently. This reduces the number of diffs when the host only changes something like the order.
It also fits the concept of 'pretty' HTML that this editor aims for.
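
The idea can be sketched in a few lines of Python. This is only an illustration, not the code html_prettify uses: the function names are hypothetical, and both the sorting of style declarations and the `name: value` output format are assumptions.

```python
def normalize_class(value: str) -> str:
    """Sort class names so their order in the source no longer matters."""
    return " ".join(sorted(value.split()))


def normalize_style(value: str) -> str:
    """Format style declarations consistently and sort them by property name.

    Assumption: declarations are separated by ';'. This naive split would
    break on values that contain ';' themselves (for example data URLs).
    """
    declarations = []
    for part in value.split(";"):
        if ":" not in part:
            continue  # skip empty segments such as a trailing ';'
        name, _, val = part.partition(":")
        declarations.append(f"{name.strip()}: {val.strip()}")
    return "; ".join(sorted(declarations))


# Both spellings from the diffs above end up identical after normalizing.
print(normalize_class("link external"))             # external link
print(normalize_style("display:none;color:white"))  # color: white; display: none
```

With both variants normalized to the same text, a host that only reorders classes or reformats styles no longer produces a diff.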

eeb020f 4a40826

Support URL queries

Some websites generate their content on the server based on the query string of the URL. Different queries for the same domain/path are now possible.

05c4dc8

Minor Changes

housekeeping, dependency updates, …

v0.17.0

25 Nov 07:27

HTML parsing improvements

html_markdownify, html_prettify and html_textify received bugfixes and improvements to parsing.
HTML parts are no longer escaped (2989f56) and html_prettify ensures indentation of text contents (952bde3).
html_markdownify now uses the html2md crate, which implements more features and has fewer strange edge cases (1362a4e).

RSS pubDate

website-stalker now attempts to read the datetime attribute from elements to determine the pubDate of the RSS item.
The goal of the datetime attribute is to provide a machine-readable format. As parsing dates and times from various human-readable formats is hard, this is probably the simplest way of adding a useful pubDate when possible without over-complicating things.
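
A rough Python sketch of this approach, using only the standard library, could look like the following. The names `DatetimeFinder` and `find_pub_date` are hypothetical, and taking the first datetime attribute found and parsing it as ISO 8601 are assumptions made for illustration, not a description of the actual implementation.

```python
from datetime import datetime
from html.parser import HTMLParser
from typing import Optional


class DatetimeFinder(HTMLParser):
    """Remembers the first non-empty `datetime` attribute on any element."""

    def __init__(self) -> None:
        super().__init__()
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.value is not None:
            return  # already found one, ignore the rest
        for name, value in attrs:
            if name == "datetime" and value:
                self.value = value
                break


def find_pub_date(html: str) -> Optional[datetime]:
    """Return the parsed datetime attribute, or None when there is no usable one."""
    finder = DatetimeFinder()
    finder.feed(html)
    if finder.value is None:
        return None
    try:
        # Assumption: the attribute holds an ISO 8601 value.
        return datetime.fromisoformat(finder.value)
    except ValueError:
        return None  # unknown format: leave the pubDate out rather than guess


item = '<article><time datetime="2021-11-25T07:27:00+00:00">25 Nov</time></article>'
print(find_pub_date(item))
```

The human-readable text inside the element ("25 Nov") is never parsed; when no machine-readable value exists, the item simply gets no pubDate.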

5b6d971

Minor Changes

  • feat: improve 5sec between domain logic 722bf5f
  • perf: dont recreate regular expressions 836ad78
  • docs(readme): fix typos and backticks in README (#47) 84d8b45

v0.16.0

11 Nov 10:24

Automatically assume file extensions

Previously you configured the wanted extension via the config file. It is now automatically assumed based on the Content-Type HTTP header and the editors used.

 - url: "https://edjopato.de"
-  extension: md
   editors:
     - html_markdownify
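
How such a decision might work can be sketched in Python. The lookup tables, the function name `guess_extension`, the precedence of editors over the Content-Type header and the `txt` fallback are all assumptions for illustration; the real rules may differ.

```python
from typing import List, Optional

# Hypothetical mappings, chosen only for this sketch.
EDITOR_EXTENSIONS = {
    "html_markdownify": "md",
    "html_textify": "txt",
    "html_prettify": "html",
    "json_prettify": "json",
}
CONTENT_TYPE_EXTENSIONS = {
    "text/html": "html",
    "application/json": "json",
    "text/plain": "txt",
}


def guess_extension(content_type: Optional[str], editors: List[str]) -> str:
    """Guess a file extension from the editors used and the Content-Type header."""
    # The last editor with a known output format decides (assumption).
    for editor in reversed(editors):
        if editor in EDITOR_EXTENSIONS:
            return EDITOR_EXTENSIONS[editor]
    if content_type:
        # Strip parameters such as "; charset=utf-8" before the lookup.
        mime = content_type.split(";")[0].strip().lower()
        if mime in CONTENT_TYPE_EXTENSIONS:
            return CONTENT_TYPE_EXTENSIONS[mime]
    return "txt"  # hypothetical fallback


print(guess_extension("text/html; charset=utf-8", ["html_markdownify"]))  # md
print(guess_extension("text/html; charset=utf-8", []))                    # html
```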

a6ba06f 01afa7f

Notifications

It's now possible to send notifications on changes via pling.
Notification targets (E-Mail, Slack, Telegram, …) are entirely configured via environment variables as they mainly contain secrets. Check the pling documentation for which environment variables can be set.
The sent notification can be changed via the new config key notification_template.
When using GitHub Actions, check out their environment variable documentation and the example repo config, which configures Telegram notifications into this Telegram channel.

1b1977a 14d3837 24b6cd6 8ad53c1

Improvements to website-stalker check

website-stalker check now shows more details, such as the configured notifications. To prevent leaking secrets, it does not show the notification details, only the number of configured notification targets.

It's also possible to print or rewrite the current config as YAML.
This is helpful when migrating older configs or checking whether certain environment variables are read correctly.

3caf39e d9bfff9 3b7c82d

Minor Changes

  • feat(config): allow loading via environment variable 94e4bd4
  • fix: dont prefix sites in message with M or A 6959978

v0.15.0

21 Oct 14:25

Multiple URLs with same options

You can now specify an array of URLs for an entry in the config. This way multiple URLs use the same specified options.
This is especially useful for stalking multiple nearly identical webpages.

To provide an example:

sites:
  - url: "https://edjopato.de/"
    extension: html
  - url: "https://edjopato.de/post/"
    extension: html

This can now also be specified like this:

sites:
  - url:
      - "https://edjopato.de/"
      - "https://edjopato.de/post/"
    extension: html
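
Conceptually, the array form expands back into one entry per URL. A minimal Python sketch of that expansion, working on already parsed config data, could look like this. The function name `expand_sites` is hypothetical and the sketch says nothing about how the config is actually processed internally.

```python
from typing import Any, Dict, List


def expand_sites(sites: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn entries with a URL array into one entry per URL with the same options."""
    expanded = []
    for site in sites:
        urls = site["url"]
        if isinstance(urls, str):
            urls = [urls]  # the single URL form keeps working
        for url in urls:
            # Copy all options and replace only the url key.
            expanded.append({**site, "url": url})
    return expanded


config = [
    {
        "url": ["https://edjopato.de/", "https://edjopato.de/post/"],
        "extension": "html",
    }
]
for site in expand_sites(config):
    print(site)
```

Both spellings of the example above produce the same two entries.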

0921233

Minor Changes

  • fix(rss): error when no items are selected 408f0b0
  • fix: use actual url for editors f0136b1

v0.14.0

06 Oct 11:48
  • feat: add accept_invalid_certs site option c4a25e2
  • feat(git): git message head according to changes 2a233d1
  • fix(git): dont show same change twice in git commit message 71f3da5
  • build(http): enable socks5 proxy support 3ba2ddb
  • build(http): enable deflate body decompression 01b0237

v0.13.0

23 Sep 16:29

Split css_select into css_select and css_remove

This results in simpler configs for removing via css selector:

 editors:
-   - css_select:
-       selector: img
-       remove: true
+   - css_remove: img

This is a breaking change and also simplifies the internal logic.

fafc19d

img in html_markdownify

Images are now added to the markdown output.

Images require absolute URLs when the markdown is rendered as HTML, so html_url_canonicalize is helpful here.
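
What making a URL absolute means can be shown with Python's standard library. This only illustrates the concept and is not how html_url_canonicalize is implemented; the function name and the image paths are made up for the example.

```python
from urllib.parse import urljoin


def make_absolute(page_url: str, src: str) -> str:
    """Resolve an image source relative to the page it was found on."""
    return urljoin(page_url, src)


page = "https://edjopato.de/post/"
print(make_absolute(page, "image.png"))         # https://edjopato.de/post/image.png
print(make_absolute(page, "/assets/logo.svg"))  # https://edjopato.de/assets/logo.svg
```

Without this step a relative path like `image.png` in the markdown output would point nowhere once the file is rendered outside of the original website.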

If you do not want the images (as was the case before this release), add the editor css_remove: img to your config.

ec96f24

Minor Changes

  • fix(git): work in repo without commits yet 7436dff

v0.12.1

01 Sep 13:33
  • fix(rss): find link when the item itself is the link a464868
  • build(container): use Github Action cached base image eddd8c4
  • ci(container): reduce image size c0d2367

v0.12.0

09 Aug 13:00

Editors

Two new editors: json_prettify and html_url_canonicalize. 73814fb e51baf0

IPv6 vs legacy IPv4

The log output now shows which kind of address was used. e034a70

v0.11.0

29 Jul 08:31

Simplify Git Logic

The git part was heavily updated. When running with --commit, the command now aborts when not in a git repo or when the repo is unclean.
When running without --commit (for example in an unclean repo), git add is no longer used, which simplifies testing out the ideal config before committing it.

With these changes, all the git logic is now handled via libgit2. The git binary is no longer a required dependency. ❇️

  • feat(run)!: prevent --commit in a not clean repo 73800f0
  • feat!: prevent --commit when not in a git repo 664837c
  • fix(run): only git add when --commit 25fa0d8
  • fix(git): dont integrate git diff and git status da23989
  • feat(run): dont cleanup or reset b75d9f8
  • refactor(run): simplify git finishup logic 8efda45

Warn on redirected URLs

Some URLs are redirected before the content is returned. This results in additional traffic and round trips. As this happens every time website-stalker runs, it adds up over time. To reduce traffic, the target of the redirect should be specified directly.
There is now a warning which shows which URL leads where and suggests using the target instead.
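
The check behind such a warning is a comparison of the configured URL with the URL the response finally came from. A minimal Python sketch, with a hypothetical function name and made-up warning text, might be:

```python
from typing import Optional


def redirect_warning(configured_url: str, final_url: str) -> Optional[str]:
    """Return a warning when the response came from a different URL than configured.

    `final_url` is whatever the HTTP client reports after following redirects,
    for example `response.geturl()` when using `urllib.request.urlopen`.
    """
    if final_url == configured_url:
        return None
    # The wording is invented for this sketch, not the tool's actual output.
    return (
        f"{configured_url} was redirected to {final_url}. "
        "Consider using the target in the config."
    )


print(redirect_warning("http://edjopato.de/", "https://edjopato.de/"))
```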

  • feat: warn on redirected URLs to reduce traffic 4c9136c

Init command

You can now init a directory with a git repo (git init) and a config (website-stalker example-config > website-stalker.yaml) in one neat command:
website-stalker init

  • feat(init): provide init folder/repo/config command 9842d9a

Case insensitive site filter

The site filter is now case insensitive. Where you previously had to use website-stalker run EdJoPaTo to run on https://EdJoPaTo.de, you can now also use website-stalker run edjopato.
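
In Python terms the new behaviour resembles the following sketch. It assumes the filter is searched as a pattern anywhere in the URL, which is an assumption of this example; the function name `matches_filter` is hypothetical.

```python
import re


def matches_filter(site_filter: str, url: str) -> bool:
    """Check whether the filter occurs in the URL, ignoring upper and lower case."""
    return re.search(site_filter, url, flags=re.IGNORECASE) is not None


print(matches_filter("edjopato", "https://EdJoPaTo.de"))  # True
print(matches_filter("EdJoPaTo", "https://edjopato.de"))  # True
print(matches_filter("example", "https://EdJoPaTo.de"))   # False
```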

  • feat(cli)!: site filter is now case insensitive 85af5f6

Config format is now fixed

Previously you could use other formats for the config, like website-stalker.toml. To simplify the config logic, the config now has to be a YAML file.

  • refactor(config)!: simplify 4d5e390

Minor Changes

  • fix(check): dont panic, just exit code != 0 3689922
  • fix: dont print empty lines 9b1eb2e

v0.10.0

20 Jul 17:40

html_markdownify

A new editor html_markdownify can create markdown from html input. See more details about this new editor in the README. e1798ee

html_textify

Now creates at most one empty line between filled lines. db894e9 32fa6d1
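
The effect on the output can be sketched in Python as follows. The function name is hypothetical, and trimming each line as well as dropping leading and trailing empty lines are assumptions of this sketch rather than documented behaviour.

```python
def collapse_empty_lines(text: str) -> str:
    """Keep at most one empty line between lines that have content."""
    result = []
    previous_empty = True  # starting as True also drops leading empty lines
    for line in text.splitlines():
        line = line.strip()
        if line:
            result.append(line)
            previous_empty = False
        elif not previous_empty:
            result.append("")
            previous_empty = True
    # Drop an empty line left over at the end.
    while result and result[-1] == "":
        result.pop()
    return "\n".join(result)


print(collapse_empty_lines("Title\n\n\n\nText\nMore\n\n"))
```

Here the three empty lines after "Title" shrink to one, while "Text" and "More" stay directly below each other.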

Rename editors to be more like functions

Editor names should now make clearer what the editors do when they are applied. This is a breaking change and you have to adapt your configs to work with this release. 82cefbc

  • html_text → html_textify
  • css_selector → css_select
  • regex_replacer → regex_replace