feat: Earthly cache watcher utility (#269)
* feat: initial commit
* feat: initial file watcher
* refactor: minor
* feat: config
* feat: interval
* feat: handle size exceeding
* feat: trigger events
* refactor: check sizes
* refactor: print
* feat: journal
* feat: conf
* fix: logging
* feat: pyproject
* docs: add proper readme
* fix: growth indexes
* fix: logging
* feat: default options
* chore: format
* chore: lintfix
* ci: fix earthfile
* fix: logging
* fix: app.service
* feat: config argument
* refactor: service file
* fix: service script
* fix: params
* fix: to simple servicei instead of forking
* fix: watch dir location
* feat: layer watching
* chore: type annotation
* chore: lint
* chore: log layer name
* fix: type
* fix: first checks
* chore: lintfix
* fix: is file check
* feat: init log
* feat: log init
* fix: log large layer
* fix: layer name
* fix: parameters and notes
* refactor: minor code
* feat: print number formatter
* refactor: minor number formatter
* feat: safe delete
* feat: handle file accessing
* feat: overall compacting
* fix: default config path in systemd service
* fix: default.conf
* docs: systemd installation
* feat: handle move
* fix: growth index iteration
* feat: trigger once
* chore: warning to error
* fix: empty set
* fix: has triggered layer
* fix: layer discard
* feat: loguru
* fix: markdownlint
* chore: sort import
* chore: lintfix
* chore: rufflint fix
* fix: log level info

---------

Co-authored-by: Oleksandr Prokhorenko <[email protected]>
Showing 8 changed files with 694 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
VERSION 0.8 | ||
|
||
IMPORT github.com/input-output-hk/catalyst-ci/earthly/python:v3.1.7 AS python-ci | ||
|
||
check: | ||
    FROM python-ci+python-base | ||
|
||
    COPY . . | ||
|
||
    DO python-ci+CHECK |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
<!-- cspell: words loguru inotify journalctl --> | ||
|
||
# Earthly Cache Watcher | ||
|
||
Logs an error when cache layers reach their maximum size limit. | ||
|
||
## Functionality | ||
|
||
* Watches file changes in a specified directory. | ||
* Triggers events when either an individual file or | ||
  a watched directory grows beyond configured limits. | ||
* Main triggering criteria: a single file exceeds its size limit, the watched directory | ||
  exceeds its size limit, or the watched directory grows too much within a time window. | ||
|
||
## Configuration Parameters | ||
|
||
The following parameters are configurable: | ||
|
||
* `watch_dir` - The directory to watch recursively. (default: `.`) | ||
* `large_layer_size` - The size threshold used to detect | ||
  an individual large file. (default: `1073741824` bytes) | ||
* `max_cache_size` - The size threshold used to detect when | ||
  `watch_dir` has grown too large. (default: `536870912000` bytes) | ||
* `time_window` - The length of the interval over which growth | ||
  in the size of `watch_dir` is measured. (default: `10` secs) | ||
* `max_time_window_growth_size` - The maximum growth in the size of `watch_dir` | ||
  allowed within one `time_window`. (default: `53687091200` bytes) | ||
* `log_file_accessing_err` - Whether to log errors that occur during file access. (default: `True`) | ||
|
||
Typically, these settings are read from the specified configuration file. | ||
|
||
## System Setup | ||
|
||
If the system has many files to watch, consider raising the maximum number | ||
of inotify watches with `sysctl`: | ||
|
||
```bash | ||
sudo sysctl fs.inotify.max_user_watches=25000000 | ||
echo 'fs.inotify.max_user_watches=25000000' | sudo tee -a /etc/sysctl.conf | ||
``` | ||
|
||
Adjust the value to fit your requirements. | ||
|
||
## Systemd Unit Installation | ||
|
||
Run the following commands to install the program as a systemd unit: | ||
|
||
```bash | ||
systemctl daemon-reload | ||
systemctl enable /path/to/your/watchdog.service | ||
systemctl start watchdog | ||
``` | ||
|
||
To view the status and logs, use these two commands: | ||
|
||
```bash | ||
systemctl status watchdog | ||
``` | ||
|
||
Or | ||
|
||
```bash | ||
journalctl -xeu watchdog.service | ||
``` | ||
|
||
## Logging Example | ||
|
||
Logging example using `loguru`: | ||
|
||
```json | ||
{ | ||
"text": "read config from '/root/catalyst-ci/utilities/earthly-cache-watcher/default.conf'\n", | ||
"record": { | ||
"elapsed": { | ||
"repr": "0:00:00.007240", | ||
"seconds": 0.00724 | ||
}, | ||
"exception": null, | ||
"extra": {}, | ||
"file": { | ||
"name": "main.py", | ||
"path": "/root/catalyst-ci/utilities/earthly-cache-watcher/main.py" | ||
}, | ||
"function": "main", | ||
"level": { | ||
"icon": "ℹ️", | ||
"name": "INFO", | ||
"no": 20 | ||
}, | ||
"line": 298, | ||
"message": "read config from '/root/catalyst-ci/utilities/earthly-cache-watcher/default.conf'", | ||
"module": "main", | ||
"name": "__main__", | ||
"process": { | ||
"id": 59917, | ||
"name": "MainProcess" | ||
}, | ||
"thread": { | ||
"id": 8615431168, | ||
"name": "MainThread" | ||
}, | ||
"time": { | ||
"repr": "2024-07-04 19:22:31.458044+07:00", | ||
"timestamp": 1720095751.458044 | ||
} | ||
} | ||
} | ||
``` | ||
|
||
Note: The log output above is pretty-printed; the actual output is a single-line message. |
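
Because each real log entry is a single JSON line, it can be reduced to a short summary with the standard `json` module. The following is a minimal sketch, assuming the record shape shown in the README's logging example (`record.time.repr`, `record.level.name`, `record.message`); the function name `summarize_log_line` is hypothetical and not part of this commit.

```python
import json


def summarize_log_line(line: str) -> str:
    """Return 'time level message' for one serialized log line.

    Assumes the JSON layout shown in the README's logging example.
    """
    entry = json.loads(line)
    record = entry["record"]
    timestamp = record["time"]["repr"]
    level = record["level"]["name"]
    message = record["message"]
    return f"{timestamp} {level} {message}"
```

A line-oriented consumer could call this once per line of output, for example while reading the output of `journalctl`, provided each line contains only the JSON payload.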
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# cspell: words runc overlayfs | ||
|
||
watch_dir = /var/lib/docker/volumes/earthly-satellite_earthly-tmp/_data/buildkit/runc-overlayfs/snapshots/snapshots | ||
large_layer_size = 1073741824 # 1GB | ||
max_cache_size = 536870912000 # 500GB | ||
time_window = 10 # 10 secs | ||
max_time_window_growth_size = 53687091200 # 50GB | ||
log_file_accessing_err = True |
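
One way to read a file in this `key = value  # comment` format is with the standard `configparser` module. This is only a sketch under stated assumptions: the commit's actual parsing code is not shown here, the section name `watcher` is a placeholder added because `configparser` requires a section header, and `parse_watcher_config` is a hypothetical name. The fallback values mirror the defaults listed in the README.

```python
import configparser

# Fallbacks taken from the defaults documented in the README.
DEFAULTS = {
    "watch_dir": ".",
    "large_layer_size": "1073741824",
    "max_cache_size": "536870912000",
    "time_window": "10",
    "max_time_window_growth_size": "53687091200",
    "log_file_accessing_err": "True",
}


def parse_watcher_config(text: str) -> dict:
    """Parse section-less 'key = value # comment' text into typed settings."""
    parser = configparser.ConfigParser(
        defaults=DEFAULTS,
        inline_comment_prefixes=("#",),  # strip trailing '# 1GB' style notes
        interpolation=None,  # treat '%' in paths literally
    )
    # Prepend a placeholder section header, since the file has none.
    parser.read_string("[watcher]\n" + text)
    section = parser["watcher"]
    return {
        "watch_dir": section.get("watch_dir"),
        "large_layer_size": section.getint("large_layer_size"),
        "max_cache_size": section.getint("max_cache_size"),
        "time_window": section.getint("time_window"),
        "max_time_window_growth_size": section.getint("max_time_window_growth_size"),
        "log_file_accessing_err": section.getboolean("log_file_accessing_err"),
    }
```

Keys missing from the file fall back to the documented defaults, so a partial configuration still yields a complete settings dictionary.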
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
import os | ||
|
||
|
||
def get_subdirectory_name(working_dir_path: str, path: str): | ||
    """ | ||
    Extracts the direct subdirectory name from the given path within | ||
    the specified working directory. | ||
    Parameters: | ||
        working_dir_path (str): The absolute path of the current working directory. | ||
        path (str): The absolute path from which to extract the direct subdir name. | ||
    Returns: | ||
        str | None: The name of the direct subdirectory if the given path is within | ||
        the working directory; otherwise, None. | ||
    Example: | ||
        >>> working_dir = "/home/user/projects" | ||
        >>> given_path = "/home/user/projects/subdir1/file.txt" | ||
        >>> get_subdirectory_name(working_dir, given_path) | ||
        'subdir1' | ||
        >>> given_path_invalid = "/home/user/projects1/subdir1/file.txt" | ||
        >>> get_subdirectory_name(working_dir, given_path_invalid) | ||
        None | ||
    """ | ||
    working_dir_path = os.path.abspath(working_dir_path) | ||
    path = os.path.abspath(path) | ||
|
||
    if ( | ||
        os.path.commonpath([working_dir_path]) | ||
        != os.path.commonpath([working_dir_path, path]) | ||
    ): | ||
        return None | ||
|
||
    relative_path = os.path.relpath(path, working_dir_path) | ||
    parts = relative_path.split(os.sep) | ||
|
||
    if parts: | ||
        return parts[0] | ||
    return None | ||
|
||
def add_or_init(obj: dict[str, int], key: str, value: int): | ||
    obj.setdefault(key, 0) | ||
    obj[key] += value |
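
The two helpers above suggest per-layer accounting: map each file to the direct subdirectory of the watched directory, then accumulate sizes under that name. The sketch below illustrates that idea in a self-contained form. It is an assumption about how the helpers might be combined, not the commit's actual watcher logic; `layer_sizes` and `large_layers` are hypothetical names, and POSIX-style paths are assumed.

```python
import os


def layer_sizes(watch_dir: str, file_sizes: dict[str, int]) -> dict[str, int]:
    """Sum file sizes per direct subdirectory ("layer") of watch_dir."""
    root = os.path.abspath(watch_dir)
    totals: dict[str, int] = {}
    for path, size in file_sizes.items():
        relative = os.path.relpath(os.path.abspath(path), root)
        layer = relative.split(os.sep)[0]
        # Skip paths outside watch_dir ('..') and watch_dir itself ('.').
        if layer in (os.pardir, os.curdir):
            continue
        totals[layer] = totals.get(layer, 0) + size
    return totals


def large_layers(totals: dict[str, int], large_layer_size: int = 1073741824) -> list[str]:
    """Return the names of layers whose total size exceeds the threshold."""
    return sorted(name for name, size in totals.items() if size > large_layer_size)
```

The default threshold here reuses the `large_layer_size` value from `default.conf`; whether the real watcher applies that limit per file or per layer is not established by the files shown.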