Compare commits

117 commits
0.2.2 ... main

Author SHA1 Message Date
Luc Didry
9389e3a005
🏷 — Bump version (0.9.0) 2025-02-18 17:05:55 +01:00
Luc Didry
159a6e2427
🔀 Merge remote-tracking branch 'origin/develop' 2025-02-18 17:05:25 +01:00
Luc Didry
211ac32028
🐛 — Fix worker timeout for old results cleaning in recurring tasks (fix #84)
💥 Old results are now removed by their age, not based on their number.

💥 Warning: `max_results` setting has been replaced by `max_results_age`, which is a duration.
Use `argos server generate-config > /etc/argos/config.yaml-dist` to generate
a new example configuration file.
2025-02-18 17:04:26 +01:00
Luc Didry
32f2518294
🏷 — Bump version (0.8.2) 2025-02-18 14:58:35 +01:00
Luc Didry
38cc06e972
🐛 — Fix recurring tasks with gunicorn 2025-02-18 14:57:49 +01:00
Luc Didry
4b78919937
🏷 — Bump version (0.8.1) 2025-02-18 14:22:41 +01:00
Luc Didry
d8f30ebccd
🐛 — Fix todo enum in jobs table 2025-02-18 14:22:12 +01:00
Luc Didry
09674f73ef
🏷 — Bump version (0.8.0) 2025-02-18 13:50:36 +01:00
Luc Didry
c63093bb2f
🔀 Merge remote-tracking branch 'origin/develop' 2025-02-18 13:48:47 +01:00
Luc Didry
657624ed35
📝 — Add enum doc for developers 2025-02-18 13:47:27 +01:00
Luc Didry
471c1eae91
📜 — Add breaking changes to CHANGELOG 2025-02-18 13:43:05 +01:00
Luc Didry
c3708af32a
🐛 — Better httpx.RequestError handling (fix #83) 2025-02-18 13:36:40 +01:00
Luc Didry
23fea9fffa
🐛 — Automatically reconnect to LDAP if unreachable (fix #81) 2025-02-18 11:28:05 +01:00
Luc Didry
a48c7b74e6
✨ — Reload configuration asynchronously (fix #79) 2025-02-17 17:26:56 +01:00
Luc Didry
8d82f7f9d6
✨ — No need cron tasks for agents watching (fix #76) 2025-02-17 15:35:13 +01:00
Luc Didry
fd0c68cd4c
— Add missing dependency for fastapi-utils 2025-02-17 11:03:01 +01:00
Luc Didry
c98cd9c017
✨ — No need cron tasks for DB cleaning anymore (fix #74 and #75) 2025-02-17 10:46:01 +01:00
Luc Didry
73e7a8f414
📝 — Document how to add data to requests (fix #77) 2025-02-12 16:25:10 +01:00
Luc Didry
db54dd2cdd
✨ — Allow to customize agent User-Agent header (fix #78) 2025-02-12 16:10:09 +01:00
Luc Didry
1b484da27a
🏷 — Bump version (0.7.4) 2025-02-12 15:33:36 +01:00
Luc Didry
07f87a0f7d
🔀 Merge remote-tracking branch 'origin/develop' 2025-02-12 15:33:04 +01:00
Luc Didry
60f3079140
🩹 — Add missing enum removal 2025-02-12 15:32:24 +01:00
Luc Didry
ca709dca62
🔀 Merge remote-tracking branch 'dryusdan/dev/fix-method-enum' into develop 2025-02-12 15:02:19 +01:00
Dryusdan
0f099b9df4 Fix method enum in tasks table 2025-01-29 11:37:09 +01:00
Luc Didry
5abdd8414d
🏷 — Bump version (0.7.3) 2025-01-26 07:59:10 +01:00
Luc Didry
06868cdd74
🔀 Merge remote-tracking branch 'origin/develop' 2025-01-26 07:58:34 +01:00
Luc Didry
2b82f7c8f2
🐛 — Fix bug in retry_before_notification logic when success 2025-01-26 07:54:31 +01:00
Luc Didry
797a60a85c
🏷 — Bump version (0.7.2) 2025-01-24 14:07:50 +01:00
Luc Didry
4c4d3b69b2
🔀 Merge remote-tracking branch 'origin/develop' 2025-01-24 14:07:16 +01:00
Luc Didry
c922894567
🐛 — Fix bug in retry_before_notification logic 2025-01-24 14:06:36 +01:00
Luc Didry
8652539086
🏷 — Bump version (0.7.1) 2025-01-15 16:18:16 +01:00
Luc Didry
4f3dfd994b
🔀 Merge remote-tracking branch 'origin/develop' 2025-01-15 16:16:14 +01:00
Luc Didry
28ec85fed3
📝 — Improve release documentation 2025-01-15 16:15:16 +01:00
Luc Didry
586660c02a
🩹 — Check before adding/removing ip_version_enum 2025-01-15 09:14:55 +01:00
Luc Didry
64f8241e74
🩹 — Avoid warning from MySQL only alembic instructions 2025-01-14 17:06:59 +01:00
Luc Didry
3d209fed22
🏷 — Bump version (0.7.0) 2025-01-14 16:41:09 +01:00
Luc Didry
acd90133bd
🔀 Merge remote-tracking branch 'origin/develop' 2025-01-14 16:39:53 +01:00
Luc Didry
be90aa095a
🐛 — Fix strange and buggy behavior 2025-01-14 16:38:43 +01:00
Luc Didry
06f8310505
🗃 — Use bigint type for results id column in PostgreSQL (fix #73) 2025-01-14 16:38:43 +01:00
Luc Didry
fe89d62e88
🐛🗃 — Fix enum migration on PostgreSQL 2025-01-14 16:38:43 +01:00
Luc Didry
1e7672abca
🚸 — Add a long expiration date on auto-refresh cookies 2025-01-14 16:38:43 +01:00
Luc Didry
2ef999fa63
✨ — Allow to specify form data and headers for checks (fix #70) 2025-01-14 16:38:43 +01:00
Luc Didry
9c8be94c20
🐛 — Fix bug when changing IP version not removing tasks (fix #72) 2025-01-14 16:38:38 +01:00
Luc Didry
311d86d130
✨ — Ability to delay notification after X failures (fix #71) 2024-12-09 14:08:55 +01:00
Luc Didry
e0edb50e12
⚡ — Mutualize check requests (fix #68) 2024-12-04 15:04:06 +01:00
Luc Didry
ea23ea7c1f
✨ — IPv4/IPv6 choice for checks, and choice for a dual-stack check (fix #69) 2024-12-02 15:24:54 +01:00
Luc Didry
a1600cb08e
🏷 — Bump version (0.6.1) 2024-11-28 16:59:58 +01:00
Luc Didry
0da1f4986e
🔀 Merge remote-tracking branch 'origin/develop' 2024-11-28 16:59:10 +01:00
Luc Didry
1853b4fead
💚 — Fix tests in CI 2024-11-28 16:51:28 +01:00
Luc Didry
bb4db3ca84
🐛 - Fix domain status selector’s bug on page refresh 2024-11-28 16:16:53 +01:00
Luc Didry
7d21d8d271
🐛 - Fix database migrations without default values 2024-11-28 16:13:30 +01:00
Luc Didry
868e91b866
🔨 — Update hatch 2024-11-28 15:51:32 +01:00
Luc Didry
ffd24173e5
🏷 — Bump version (0.6.0) 2024-11-28 15:42:39 +01:00
Luc Didry
594fbd6881
🔀 Merge remote-tracking branch 'origin/develop' 2024-11-28 15:41:37 +01:00
Luc Didry
04e33a8d24
🛂 — Allow to use a LDAP server for authentication (fix #64) 2024-11-28 15:37:07 +01:00
Luc Didry
da221b856b
🛂 — Allow partial or total anonymous access to web interface (fix #63) 2024-11-28 11:48:08 +01:00
Luc Didry
841f8638de
📝 — Fix doc headings 2024-11-28 11:39:09 +01:00
Luc Didry
5b999184d0
✨ — Add a setting to set a reschedule delay if check failed (fix #67)
BREAKING CHANGE: `mo` is no longer accepted for declaring a duration in month in the configuration
You need to use `M`, `month` or `months`

Bonus: ✨ - Allow to choose a frequency smaller than a minute
2024-11-27 16:26:56 +01:00
Luc Didry
0563cf185a
✨ — Add "Remember me" checkbox on login (#65) 2024-11-27 11:00:40 +01:00
Luc Didry
91a9b27106
💄 — Filter form on domains list (fix #66) 2024-11-27 09:55:24 +01:00
Luc Didry
4117f9f628
♻ — Refactor some agent code 2024-11-26 16:52:20 +01:00
Luc Didry
8ac2519398
✨ — The HTTP method used by checks is now configurable 2024-11-26 15:59:19 +01:00
Luc Didry
d3766a79c6
✨ — Retry check right after a httpx.ReadError 2024-11-26 14:32:35 +01:00
Luc Didry
759fa05417
📝 — Avoid scrolling on a documented command 2024-11-25 17:02:09 +01:00
Luc Didry
a31c12e037
♿️ — Fix not-OK domains display if javascript is disabled 2024-11-14 09:41:59 +01:00
Luc Didry
04bbe21a66
💄 — Show only not-OK domains by default in domains list, to reduce the load on browser 2024-11-14 08:54:19 +01:00
Luc Didry
fdc219ba5c
🩹 — Fix CHANGELOG typo 2024-11-14 08:40:53 +01:00
Luc Didry
d3b5a754dd
🏷 — Bump version (0.5.0) 2024-09-26 11:57:34 +02:00
Luc Didry
37bd7b0d8d
🔀 Merge branch 'develop' 2024-09-26 11:44:56 +02:00
Luc Didry
0058e05f15
💄 — Better display of results’ error details 2024-09-26 11:43:34 +02:00
Luc Didry
0ed60508e9
🩹 — Severity of ssl-certificate-expiration’s errors is now UNKNOWN (#60) 2024-09-26 11:38:50 +02:00
Luc Didry
db4f045adf
👷 — Remove Unreleased section from CHANGELOG when publishing documentation 2024-09-26 10:33:14 +02:00
Luc Didry
100171356b
✨ — Add new check type: http-to-https (fix #61) 2024-09-26 10:11:24 +02:00
Luc Didry
175f605e35
🔨 — Don’t use the same port for doc server than for dev argos server 2024-09-26 09:21:52 +02:00
Luc Didry
7c822b10c0
🔨 — Add a small web server to browse documentation when developing 2024-09-26 05:09:39 +02:00
Luc Didry
3dd1b3f36f
📝💄 — Add opengraph tags to documentation site (fix #62) 2024-09-26 05:09:34 +02:00
Luc Didry
89f4590fb7
💄 — Correctly show results on small screens 2024-09-24 10:58:58 +02:00
Luc Didry
839429f460
🏷 — Bump version (0.4.1) 2024-09-18 11:53:48 +02:00
Luc Didry
3b83e4f3e3
🔀 Merge branch 'develop' 2024-09-18 11:52:22 +02:00
Luc Didry
c62bf82e0d
🐛 — Fix mail and gotify alerting 2024-09-18 11:50:07 +02:00
Luc Didry
2c5420cc9d
💄 — Use a custom User-Agent header 2024-09-05 16:18:14 +02:00
Luc Didry
269e551502
🏷 — Bump version (0.4.0) 2024-09-04 17:26:20 +02:00
Luc Didry
3a3c5852d0
🔀 Merge branch 'develop' 2024-09-04 17:23:14 +02:00
Luc Didry
6c3c44f5be
✨ — Add Apprise as notification way (fix #50) 2024-09-04 17:21:04 +02:00
Luc Didry
7998333fc1
⬆ — Update httpx min version 2024-09-04 15:03:40 +02:00
Luc Didry
3917eb2498
✨ — Add nagios command to use as a Nagios probe 2024-09-04 14:55:30 +02:00
Luc Didry
8072a485a1
✨ — Add command to test gotify configuration 2024-09-04 14:24:17 +02:00
Luc Didry
255fa77ac3
💄 — Improve email and gotify notifications 2024-09-04 14:19:12 +02:00
Luc Didry
4b78d9ddda
🏷 — Bump version (0.3.1) 2024-09-02 14:53:14 +02:00
Luc Didry
7c485a4ad9
🔀 Merge remote-tracking branch 'origin/develop' 2024-09-02 14:50:50 +02:00
Luc Didry
5f43f252b4
✨ — Add new check types: body-like, headers-like and json-like (Fix #58) 2024-09-02 14:50:33 +02:00
Luc Didry
575fe2ad22
🏷 — Bump version (0.3.0) 2024-09-02 14:36:22 +02:00
Luc Didry
dec6c72238
🔀 Merge remote-tracking branch 'origin/develop' 2024-09-02 14:34:18 +02:00
Luc Didry
261f843b46
💄 — Change order of columns on domain’s checks page 2024-09-02 14:22:07 +02:00
Luc Didry
9b40c5a675
📝 — Document test-mail command 2024-08-29 17:19:06 +02:00
Luc Didry
67162f6ce4
🔀 Merge branch 'fix-57' into develop 2024-08-29 17:15:54 +02:00
Luc Didry
1c6abce9b9
✨ — Add new check types: json-contains, json-has and json-is (fix #57) 2024-08-29 17:11:37 +02:00
Luc Didry
353d12240f
📌 — Fix httpx max version to avoid a test bug
Stacktrace of the test bug:
```
ImportError while loading conftest '/home/luc/tmp/framasoft/argos/tests/conftest.py'.
tests/conftest.py:6: in <module>
    from fastapi.testclient import TestClient
venv/lib/python3.12/site-packages/fastapi/testclient.py:1: in <module>
    from starlette.testclient import TestClient as TestClient  # noqa
venv/lib/python3.12/site-packages/starlette/testclient.py:362: in <module>
    class TestClient(httpx.Client):
venv/lib/python3.12/site-packages/starlette/testclient.py:444: in TestClient
    url: httpx._types.URLTypes,
E   AttributeError: module 'httpx._types' has no attribute 'URLTypes'
```
2024-08-27 14:23:59 +02:00
Luc Didry
d2468eff6e 🔀 Merge branch 'fix-59' into 'develop'
✨ — Allow to run Argos in a subfolder (i.e. not on /). Fix #59

See merge request framasoft/framaspace/argos!68
2024-08-27 12:05:05 +00:00
Luc Didry
95c49c5924
✨ — Allow to run Argos in a subfolder (i.e. not on /). Fix #59 2024-08-27 13:02:01 +02:00
Luc Didry
bc3bc52ed0
🪵 — Update CHANGELOG 2024-08-27 11:50:00 +02:00
Luc Didry
282f5147a5
🔀 Merge branch 'css-changes' into develop 2024-08-27 11:49:24 +02:00
9dc0ffc5ef
Styling: enhance the mobile view
- Add some spacing that was previously removed on all pages
- Include the JavaScript only if not on the login view
- Change the menu to not use buttons, and use rtl so the menu is viewable on small screens.
2024-08-27 11:47:31 +02:00
Luc Didry
eb65470935
🪵 — Update CHANGELOG 2024-08-27 11:17:26 +02:00
Luc Didry
aac7ca4ec5
🚨 — Make ruff and pylint happy 2024-08-27 11:11:31 +02:00
Luc Didry
a25cfea8c0
🔀 Merge remote-tracking branch 'origin/test-email' into develop 2024-08-27 11:08:08 +02:00
c419133eec
Add a command to test email configuration. 2024-08-14 01:20:40 +02:00
Luc Didry
7eede341e4 🔀 Merge branch 'fix-56' into 'develop'
✨ — Add new check types: headers-contain and headers-have (fix #56)

See merge request framasoft/framaspace/argos!65
2024-07-18 16:15:15 +00:00
Luc Didry
7e5502f7a4
✨ — Add new check types: headers-contain and headers-have (fix #56) 2024-07-18 18:01:03 +02:00
Luc Didry
b904f4c35d 🔀 Merge branch 'fix-55' into 'develop'
🩹 — Close menu after rescheduling non-ok checks (fix #55)

See merge request framasoft/framaspace/argos!64
2024-07-18 14:44:08 +00:00
Luc Didry
ef1eb6ed6e
🩹 — Close menu after rescheduling non-ok checks (fix #55) 2024-07-13 16:27:27 +02:00
Luc Didry
77dbc8bb3a 🔀 Merge branch 'add-check/status-in' into 'develop'
✨ — Add new check type: status-in

See merge request framasoft/framaspace/argos!63
2024-07-04 13:50:15 +00:00
Luc Didry
fde061da19
✨ — Add new check type: status-in
Similar to status-is except that the HTTP status can be one of a list, instead of just one fixed value.

Usecase: a Sympa server with CAS authentication set. Without a sympa_session cookie, you get a 302 status,
with it, you have a 200 status.
2024-07-04 14:32:46 +02:00
Luc Didry
9078a1384b 🔀 Merge branch 'add-mypy-test' into 'develop'
✅ — Add mypy test

See merge request framasoft/framaspace/argos!61
2024-07-04 12:09:42 +00:00
Luc Didry
5bd4d9909a
✅ — Add mypy test 2024-07-04 13:33:54 +02:00
Luc Didry
3b49594bef
👷 — Add link to PyPI page in GitLab releases 2024-07-04 09:20:51 +02:00
Luc Didry
9102d5f974
🩹 — Fix release documentation 2024-07-04 09:09:36 +02:00
73 changed files with 3175 additions and 711 deletions
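
The `status-in` commit message in the list above describes a check that passes when the HTTP status is one of several values, rather than one fixed value as with `status-is`. A minimal sketch of that difference follows. It assumes, as the `status-in` check later in this diff suggests, that the expected value arrives as a JSON-encoded list. The function names and the `SYMPA_EXPECTED` constant are illustrative and are not part of Argos.

```python
import json


def status_is(status_code: int, expected: int) -> bool:
    """Pass only when the status code equals the single expected value."""
    return status_code == expected


def status_in(status_code: int, expected: str) -> bool:
    """Pass when the status code is one of a JSON-encoded list of values."""
    return status_code in json.loads(expected)


# Use case from the commit message: a Sympa server behind CAS answers 302
# without a session cookie and 200 with one, so both are acceptable.
SYMPA_EXPECTED = "[200, 302]"  # hypothetical configuration value

for code in (200, 302, 500):
    print(code, status_is(code, 200), status_in(code, SYMPA_EXPECTED))
```

With `status-is` alone, the 302 answer would have to be treated as a failure or configured as a separate check; accepting a list covers both cases in one check.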


@ -1,3 +1,4 @@
---
image: python:3.11
stages:
@ -18,6 +19,9 @@ default:
install:
stage: install
before_script:
- apt-get update
- apt-get install -y build-essential libldap-dev libsasl2-dev
script:
- make venv
- make develop
@ -37,6 +41,12 @@ djlint:
script:
- make djlint
mypy:
<<: *pull_cache
stage: test
script:
- make mypy
pylint:
<<: *pull_cache
stage: test
@ -61,13 +71,16 @@ release_job:
release: # See https://docs.gitlab.com/ee/ci/yaml/#release for available properties
tag_name: '$CI_COMMIT_TAG'
description: './release.md'
assets:
links:
- name: 'PyPI page'
url: 'https://pypi.org/project/argos-monitoring/$CI_COMMIT_TAG/'
pages:
<<: *pull_cache
stage: deploy
script:
- pwd
- ls
- sed -e "/Unreleased/,+1d" -i CHANGELOG.md
- make docs
- echo "https://framasoft.frama.io/framaspace/argos/* https://argos-monitoring.framasoft.org/:splat 301" > public/_redirects
artifacts:


@ -2,6 +2,163 @@
## [Unreleased]
## 0.9.0
Date: 2025-02-18
- 🐛 — Fix worker timeout for old results cleaning in recurring tasks (#84)
💥 Old results are now removed by their age, not based on their number.
💥 Warning: `max_results` setting has been replaced by `max_results_age`, which is a duration.
Use `argos server generate-config > /etc/argos/config.yaml-dist` to generate
a new example configuration file.
## 0.8.2
Date: 2025-02-18
- 🐛 — Fix recurring tasks with gunicorn
## 0.8.1
Date: 2025-02-18
- 🐛 — Fix todo enum in jobs table
## 0.8.0
Date: 2025-02-18
- ✨ — Allow to customize agent User-Agent header (#78)
- 📝 — Document how to add data to requests (#77)
- ✨ — No need cron tasks for DB cleaning anymore (#74 and #75)
- ✨ — No need cron tasks for agents watching (#76)
- ✨ — Reload configuration asynchronously (#79)
- 🐛 — Automatically reconnect to LDAP if unreachable (#81)
- 🐛 — Better httpx.RequestError handling (#83)
💥 Warning: there is new settings to add to your configuration file.
Use `argos server generate-config > /etc/argos/config.yaml-dist` to generate
a new example configuration file.
💥 You don’t need cron tasks anymore!
Remove your old cron tasks as they will now do nothing but generating errors.
NB: You may want to add `--enqueue` to `reload-config` command in your systemd file.
## 0.7.4
Date: 2025-02-12
- 🐛 — Fix method enum in tasks table (thx to Dryusdan)
## 0.7.3
Date: 2025-01-26
- 🐛 — Fix bug in retry_before_notification logic when success
## 0.7.2
Date: 2025-01-24
- 🐛 — Fix bug in retry_before_notification logic
## 0.7.1
Date: 2025-01-15
- 🩹 — Avoid warning from MySQL only alembic instructions
- 🩹 — Check before adding/removing ip_version_enum
- 📝 — Improve release documentation
## 0.7.0
Date: 2025-01-14
- ✨ — IPv4/IPv6 choice for checks, and choice for a dual-stack check (#69)
- ⚡ — Mutualize check requests (#68)
- ✨ — Ability to delay notification after X failures (#71)
- 🐛 — Fix bug when changing IP version not removing tasks (#72)
- ✨ — Allow to specify form data and headers for checks (#70)
- 🚸 — Add a long expiration date on auto-refresh cookies
- 🗃️ — Use bigint type for results id column in PostgreSQL (#73)
## 0.6.1
Date: 2024-11-28
- 🐛 - Fix database migrations without default values
- 🐛 - Fix domain status selector’s bug on page refresh
## 0.6.0
Date: 2024-11-28
- 💄 — Show only not-OK domains by default in domains list, to reduce the load on browser
- ♿️ — Fix not-OK domains display if javascript is disabled
- ✨ — Retry check right after a httpx.ReadError
- ✨ — The HTTP method used by checks is now configurable
- ♻️ — Refactor some agent code
- 💄 — Filter form on domains list (#66)
- ✨ — Add "Remember me" checkbox on login (#65)
- ✨ — Add a setting to set a reschedule delay if check failed (#67)
BREAKING CHANGE: `mo` is no longer accepted for declaring a duration in month in the configuration
You need to use `M`, `month` or `months`
- ✨ - Allow to choose a frequency smaller than a minute
- ✨🛂 — Allow partial or total anonymous access to web interface (#63)
- ✨🛂 — Allow to use a LDAP server for authentication (#64)
## 0.5.0
Date: 2024-09-26
- 💄 — Correctly show results on small screens
- 📝💄 — Add opengraph tags to documentation site (#62)
- 🔨 — Add a small web server to browse documentation when developing
- ✨ — Add new check type: http-to-https (#61)
- 👷 — Remove Unreleased section from CHANGELOG when publishing documentation
- 🩹 — Severity of ssl-certificate-expiration’s errors is now UNKNOWN (#60)
- 💄 — Better display of results’ error details
## 0.4.1
Date: 2024-09-18
- 💄 — Use a custom User-Agent header
- 🐛 — Fix mail and gotify alerting
## 0.4.0
Date: 2024-09-04
- 💄 — Improve email and gotify notifications
- ✨ — Add command to test gotify configuration
- ✨ — Add nagios command to use as a Nagios probe
- ✨ — Add Apprise as notification way (#50)
## 0.3.1
Date: 2024-09-02
- ✨ — Add new check types: body-like, headers-like and json-like (#58)
## 0.3.0
Date: 2024-09-02
- 🩹 — Fix release documentation
- ✅ — Add mypy test
- ✨ — Add new check type: status-in
- 🩹 — Close menu after rescheduling non-ok checks (#55)
- ✨ — Add new check types: headers-contain and headers-have (#56)
- ✨ — Add command to test email configuration (!66)
- 💄 — Enhance the mobile view (!67)
- ✨ — Allow to run Argos in a subfolder (i.e. not on /) (#59)
- ✨ — Add new check types: json-contains, json-has and json-is (#57)
## 0.2.2
Date: 2024-07-04
@ -20,7 +177,7 @@ Date: 2024-06-24
- 💄📯 — Improve notifications and result(s) pages
- 🔊 — Add level of log before the log message
— 🔊 — Add a warning messages in the logs if there is no tasks in database. (fix #41)
- 🔊 — Add a warning message in the logs if there is no tasks in database. (fix #41)
- ✨ — Add command to generate example configuration (fix #38)
- 📝 — Improve documentation
- ✨ — Add command to warn if it’s been long since last viewing an agent (fix #49)
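
The 0.9.0 entry in the changelog above replaces the count-based `max_results` setting with `max_results_age`, a duration. The sketch below only contrasts the two retention policies on an in-memory list. It is not the Argos implementation, which works on the database, and the field name `submitted_at` and both function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def prune_by_count(results, max_results):
    """Count-based policy: keep only the `max_results` most recent results."""
    newest_first = sorted(results, key=lambda r: r["submitted_at"], reverse=True)
    return newest_first[:max_results]


def prune_by_age(results, max_age, now):
    """Age-based policy: keep only results no older than `max_age`."""
    cutoff = now - max_age
    return [r for r in results if r["submitted_at"] >= cutoff]


now = datetime(2025, 2, 18, 17, 0, tzinfo=timezone.utc)
results = [
    {"id": 1, "submitted_at": now - timedelta(days=10)},
    {"id": 2, "submitted_at": now - timedelta(days=2)},
    {"id": 3, "submitted_at": now - timedelta(hours=1)},
]

kept = prune_by_age(results, timedelta(days=7), now)
print([r["id"] for r in kept])  # ids of the results younger than seven days
```

An age cutoff can be expressed as a single comparison against a timestamp, whereas a count-based policy has to rank the results first. That may be part of why the change is listed as the fix for the worker timeout, but the changelog does not say so.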


@ -5,17 +5,19 @@ ORANGE=\033[0;33m
BLUE=\033[0;34m
NC=\033[0m # No Color
.PHONY: test lint djlint pylint ruff
.PHONY: test lint djlint pylint ruff mypy
venv: ## Create the venv
python3 -m venv venv
develop: venv ## Install the dev dependencies
venv/bin/pip install -e ".[dev,docs]"
venv/bin/pip install -e ".[dev,docs,ldap]"
docs: cog ## Build the docs
venv/bin/sphinx-build docs public
if [ ! -e "public/mermaid.min.js" ]; then curl -sL $$(grep mermaid.min.js public/search.html | cut -f 2 -d '"') --output public/mermaid.min.js; fi
sed -e 's@https://unpkg.com/mermaid[^"]*"@mermaid.min.js"@' -i public/search.html public/genindex.html
sed -e 's@https://unpkg.com/mermaid[^"]*"@../mermaid.min.js"@' -i public/developer/models.html public/developer/overview.html
docs-webserver: docs
python3 -m http.server -d public -b 127.0.0.1 8001
cog: ## Run cog, to integrate the CLI options to the docs.
venv/bin/cog -r docs/*.md
test: venv ## Run the tests
@ -25,10 +27,12 @@ ruff: venv
ruff-format: venv
venv/bin/ruff format .
djlint: venv ## Format the templates
venv/bin/djlint --ignore=H030,H031,H006 --profile jinja --lint argos/server/templates/*html
venv/bin/djlint --ignore=H006 --profile jinja --lint argos/server/templates/*html
pylint: venv ## Runs pylint on the code
venv/bin/pylint argos
lint: djlint pylint ruff
mypy: venv
venv/bin/mypy argos tests
lint: djlint pylint mypy ruff
help:
@python3 -c "$$PRINT_HELP_PYSCRIPT" < $(MAKEFILE_LIST)


@ -1 +1 @@
VERSION = "0.2.2"
VERSION = "0.9.0"


@ -6,11 +6,14 @@ import asyncio
import json
import logging
import socket
from hashlib import md5
from time import sleep
from typing import List
import httpx
from tenacity import retry, wait_random # type: ignore
from argos import VERSION
from argos.checks import get_registered_check
from argos.logging import logger
from argos.schemas import AgentResult, SerializableException, Task
@ -31,46 +34,139 @@ def log_failure(retry_state):
)
class ArgosAgent:
class ArgosAgent: # pylint: disable-msg=too-many-instance-attributes
"""The Argos agent is responsible for running the checks and reporting the results."""
def __init__(self, server: str, auth: str, max_tasks: int, wait_time: int):
def __init__( # pylint: disable-msg=too-many-positional-arguments
self, server: str, auth: str, max_tasks: int, wait_time: int, user_agent: str
):
self.server = server
self.max_tasks = max_tasks
self.wait_time = wait_time
self.auth = auth
self._http_client = None
if user_agent == "":
self.ua = user_agent
else:
self.ua = f" - {user_agent}"
self._http_client: httpx.AsyncClient | None = None
self._http_client_v4: httpx.AsyncClient | None = None
self._http_client_v6: httpx.AsyncClient | None = None
self._res_cache: dict[str, httpx.Response] = {}
self.agent_id = socket.gethostname()
@retry(after=log_failure, wait=wait_random(min=1, max=2))
async def run(self):
headers = {
auth_header = {
"Authorization": f"Bearer {self.auth}",
"User-Agent": f"Argos Panoptes agent {VERSION}{self.ua}",
}
self._http_client = httpx.AsyncClient(headers=headers)
self._http_client = httpx.AsyncClient(headers=auth_header)
ua_header = {
"User-Agent": f"Argos Panoptes {VERSION} "
f"(about: https://argos-monitoring.framasoft.org/){self.ua}",
}
self._http_client_v4 = httpx.AsyncClient(
headers=ua_header,
transport=httpx.AsyncHTTPTransport(local_address="0.0.0.0"),
)
self._http_client_v6 = httpx.AsyncClient(
headers=ua_header, transport=httpx.AsyncHTTPTransport(local_address="::")
)
logger.info("Running agent against %s", self.server)
async with self._http_client:
while "forever":
retry_now = await self._get_and_complete_tasks()
if not retry_now:
logger.error("Waiting %i seconds before next retry", self.wait_time)
logger.info("Waiting %i seconds before next retry", self.wait_time)
await asyncio.sleep(self.wait_time)
async def _complete_task(self, task: dict) -> dict:
async def _do_request(self, group: str, details: dict):
logger.debug("_do_request for group %s", group)
headers = {}
if details["request_data"] is not None:
request_data = json.loads(details["request_data"])
if request_data["headers"] is not None:
headers = request_data["headers"]
if details["ip_version"] == "4":
http_client = self._http_client_v4
else:
http_client = self._http_client_v6
try:
task = Task(**task)
if details["request_data"] is None or request_data["data"] is None:
response = await http_client.request( # type: ignore[union-attr]
method=details["method"],
url=details["url"],
headers=headers,
timeout=60,
)
elif request_data["json"]:
response = await http_client.request( # type: ignore[union-attr]
method=details["method"],
url=details["url"],
headers=headers,
json=request_data["data"],
timeout=60,
)
else:
response = await http_client.request( # type: ignore[union-attr]
method=details["method"],
url=details["url"],
headers=headers,
data=request_data["data"],
timeout=60,
)
except httpx.ReadError:
sleep(1)
logger.warning("httpx.ReadError for group %s, re-emit request", group)
if details["request_data"] is None or request_data["data"] is None:
response = await http_client.request( # type: ignore[union-attr]
method=details["method"], url=details["url"], timeout=60
)
elif request_data["json"]:
response = await http_client.request( # type: ignore[union-attr]
method=details["method"],
url=details["url"],
json=request_data["data"],
timeout=60,
)
else:
response = await http_client.request( # type: ignore[union-attr]
method=details["method"],
url=details["url"],
data=request_data["data"],
timeout=60,
)
except httpx.RequestError as err:
logger.warning("httpx.RequestError for group %s", group)
response = err
self._res_cache[group] = response
async def _complete_task(self, _task: dict) -> AgentResult:
try:
task = Task(**_task)
check_class = get_registered_check(task.check)
check = check_class(self._http_client, task)
result = await check.run()
check = check_class(task)
response = self._res_cache[task.task_group]
if isinstance(response, httpx.Response):
result = await check.run(response)
status = result.status
context = result.context
else:
status = "failure"
context = SerializableException.from_exception(response)
except Exception as err: # pylint: disable=broad-except
status = "error"
context = SerializableException.from_exception(err)
msg = f"An exception occured when running {task}. {err.__class__.__name__} : {err}"
msg = f"An exception occured when running {_task}. {err.__class__.__name__} : {err}"
logger.error(msg)
return AgentResult(task_id=task.id, status=status, context=context)
async def _get_and_complete_tasks(self):
@ -81,12 +177,45 @@ class ArgosAgent:
)
if response.status_code == httpx.codes.OK:
# XXX Maybe we want to group the tests by URL ? (to issue one request per URL)
data = response.json()
logger.info("Received %i tasks from the server", len(data))
req_groups = {}
_tasks = []
for _task in data:
task = Task(**_task)
url = task.url
group = task.task_group
if task.check == "http-to-https":
data = task.request_data
if data is None:
data = ""
url = str(httpx.URL(task.url).copy_with(scheme="http"))
group = (
f"{task.method}-{task.ip_version}-{url}-"
f"{md5(data.encode()).hexdigest()}"
)
_task["task_group"] = group
req_groups[group] = {
"url": url,
"ip_version": task.ip_version,
"method": task.method,
"request_data": task.request_data,
}
_tasks.append(_task)
requests = []
for group, details in req_groups.items():
requests.append(self._do_request(group, details))
if requests:
await asyncio.gather(*requests)
tasks = []
for task in data:
for task in _tasks:
tasks.append(self._complete_task(task))
if tasks:
@ -94,7 +223,7 @@ class ArgosAgent:
await self._post_results(results)
return True
logger.error("Got no tasks from the server.")
logger.info("Got no tasks from the server.")
return False
logger.error("Failed to fetch tasks: %s", response.read())
@ -102,12 +231,19 @@ class ArgosAgent:
async def _post_results(self, results: List[AgentResult]):
data = [r.model_dump() for r in results]
if self._http_client is not None:
response = await self._http_client.post(
f"{self.server}/api/results", params={"agent_id": self.agent_id}, json=data
f"{self.server}/api/results",
params={"agent_id": self.agent_id},
json=data,
)
if response.status_code == httpx.codes.CREATED:
logger.error("Successfully posted results %s", json.dumps(response.json()))
logger.info(
"Successfully posted results %s", json.dumps(response.json())
)
else:
logger.error("Failed to post results: %s", response.read())
return response
logger.error("self._http_client is None")
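
The agent changes above issue one HTTP request per task group and then run every check of that group against the cached response. The sketch below shows how such a group key can be built, using the format that appears in the `http-to-https` branch of the diff: method, IP version, URL and an MD5 digest of the request data. The helper name `group_key` and the sample tasks are hypothetical, and it is an assumption that the server builds `task_group` in the same way.

```python
from hashlib import md5


def group_key(method, ip_version, url, request_data):
    """Build a key shared by all tasks that can reuse the same HTTP response."""
    data = request_data if request_data is not None else ""
    digest = md5(data.encode()).hexdigest()
    return f"{method}-{ip_version}-{url}-{digest}"


# Hypothetical tasks: two checks on the same URL over IPv4, one over IPv6.
tasks = [
    {"id": 1, "check": "status-is", "method": "GET", "ip_version": "4",
     "url": "https://example.org/", "request_data": None},
    {"id": 2, "check": "body-contains", "method": "GET", "ip_version": "4",
     "url": "https://example.org/", "request_data": None},
    {"id": 3, "check": "status-is", "method": "GET", "ip_version": "6",
     "url": "https://example.org/", "request_data": None},
]

groups = {}
for task in tasks:
    key = group_key(
        task["method"], task["ip_version"], task["url"], task["request_data"]
    )
    groups.setdefault(key, []).append(task["id"])

# One request per group: tasks 1 and 2 share a response, task 3 needs its own.
print(sorted(groups.values()))
```

Here MD5 only serves to shorten arbitrary request data into a fixed-size key component; it is not used for any security purpose.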


@ -1,9 +1,8 @@
"""Various base classes for checks"""
from dataclasses import dataclass
from typing import Type, Union
from typing import Type
import httpx
from pydantic import BaseModel
from argos.schemas.models import Task
@ -71,7 +70,7 @@ class InvalidResponse(Exception):
class BaseCheck:
config: str
expected_cls: Union[None, Type[BaseExpectedValue]] = None
expected_cls: None | Type[BaseExpectedValue] = None
_registry = [] # type: ignore[var-annotated]
@ -92,8 +91,7 @@ class BaseCheck:
raise CheckNotFound(name)
return check
def __init__(self, http_client: httpx.AsyncClient, task: Task):
self.http_client = http_client
def __init__(self, task: Task):
self.task = task
@property


@ -1,7 +1,12 @@
"""Define the available checks"""
import json
import re
from datetime import datetime
from httpx import Response
from jsonpointer import resolve_pointer, JsonPointerException
from argos.checks.base import (
BaseCheck,
ExpectedIntValue,
@ -17,13 +22,7 @@ class HTTPStatus(BaseCheck):
config = "status-is"
expected_cls = ExpectedIntValue
async def run(self) -> dict:
# XXX Get the method from the task
task = self.task
response = await self.http_client.request(
method="get", url=task.url, timeout=60
)
async def run(self, response: Response) -> dict:
return self.response(
status=response.status_code == self.expected,
expected=self.expected,
@ -31,29 +30,240 @@ class HTTPStatus(BaseCheck):
)
class HTTPStatusIn(BaseCheck):
"""Checks that the HTTP status code is in the list of expected values."""
config = "status-in"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
return self.response(
status=response.status_code in json.loads(self.expected),
expected=self.expected,
retrieved=response.status_code,
)
class HTTPToHTTPS(BaseCheck):
"""Checks that the HTTP to HTTPS redirection status code is the expected one."""
config = "http-to-https"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
expected_dict = json.loads(self.expected)
expected = range(300, 400)
if "range" in expected_dict:
expected = range(expected_dict["range"][0], expected_dict["range"][1])
if "value" in expected_dict:
expected = range(expected_dict["value"], expected_dict["value"] + 1)
if "list" in expected_dict:
expected = expected_dict["list"]
return self.response(
status=response.status_code in expected,
expected=self.expected,
retrieved=response.status_code,
)
class HTTPHeadersContain(BaseCheck):
"""Checks that response headers contains the expected headers
(without checking their values)"""
config = "headers-contain"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
status = True
for header in json.loads(self.expected):
if header not in response.headers:
status = False
break
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(list(dict(response.headers).keys())),
)
class HTTPHeadersHave(BaseCheck):
"""Checks that response headers contains the expected headers and values"""
config = "headers-have"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
status = True
for header, value in json.loads(self.expected).items():
if header not in response.headers:
status = False
break
if response.headers[header] != value:
status = False
break
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(dict(response.headers)),
)
class HTTPHeadersLike(BaseCheck):
"""Checks that response headers contains the expected headers and that the values
matches the provided regexes"""
config = "headers-like"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
status = True
for header, value in json.loads(self.expected).items():
if header not in response.headers:
status = False
break
if not re.search(rf"{value}", response.headers[header]):
status = False
break
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(dict(response.headers)),
)
class HTTPBodyContains(BaseCheck):
"""Checks that the HTTP body contains the expected string."""
config = "body-contains"
expected_cls = ExpectedStringValue
async def run(self) -> dict:
response = await self.http_client.request(
method="get", url=self.task.url, timeout=60
)
async def run(self, response: Response) -> dict:
return self.response(status=self.expected in response.text)
class HTTPBodyLike(BaseCheck):
"""Checks that the HTTP body matches the provided regex."""
config = "body-like"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
if re.search(rf"{self.expected}", response.text):
return self.response(status=True)
return self.response(status=False)
class HTTPJsonContains(BaseCheck):
"""Checks that JSON response contains the expected structure
(without checking the value)"""
config = "json-contains"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
obj = response.json()
status = True
for pointer in json.loads(self.expected):
try:
resolve_pointer(obj, pointer)
except JsonPointerException:
status = False
break
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(obj),
)
class HTTPJsonHas(BaseCheck):
"""Checks that JSON response contains the expected structure and values"""
config = "json-has"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
obj = response.json()
status = True
for pointer, exp_value in json.loads(self.expected).items():
try:
value = resolve_pointer(obj, pointer)
if value != exp_value:
status = False
break
except JsonPointerException:
status = False
break
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(obj),
)
class HTTPJsonLike(BaseCheck):
"""Checks that JSON response contains the expected structure and that the values
match the provided regexes"""
config = "json-like"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
obj = response.json()
status = True
for pointer, exp_value in json.loads(self.expected).items():
try:
value = resolve_pointer(obj, pointer)
if not re.search(rf"{exp_value:}", value):
status = False
break
except JsonPointerException:
status = False
break
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(obj),
)
class HTTPJsonIs(BaseCheck):
"""Checks that JSON response is the exact expected JSON object"""
config = "json-is"
expected_cls = ExpectedStringValue
async def run(self, response: Response) -> dict:
obj = response.json()
status = response.json() == json.loads(self.expected)
return self.response(
status=status,
expected=self.expected,
retrieved=json.dumps(obj),
)
class SSLCertificateExpiration(BaseCheck):
"""Checks that the SSL certificate will not expire soon."""
config = "ssl-certificate-expiration"
expected_cls = ExpectedStringValue
async def run(self):
async def run(self, response: Response) -> dict:
"""Returns the number of days in which the certificate will expire."""
response = await self.http_client.get(self.task.url, timeout=60)
network_stream = response.extensions["network_stream"]
ssl_obj = network_stream.get_extra_info("ssl_object")
cert = ssl_obj.getpeercert()
@ -65,6 +275,8 @@ class SSLCertificateExpiration(BaseCheck):
@classmethod
async def finalize(cls, config, result, **context):
if result.status == Status.ERROR:
return result.status, Severity.UNKNOWN
if result.status != Status.ON_CHECK:
return result.status, Severity.WARNING
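As a rough illustration of the `headers-have` and `headers-like` checks in this file, the sketch below reproduces their matching logic on plain dictionaries. It is not the project's code: the names `headers_have`, `headers_like` and `_lower_keys` are hypothetical, and header names are lower-cased by hand because a plain `dict` lacks the case-insensitive lookup that the `response.headers` object is assumed to provide.

```python
import json
import re


def _lower_keys(headers: dict) -> dict:
    """Lower-case header names so lookups ignore case (assumption of this sketch)."""
    return {name.lower(): value for name, value in headers.items()}


def headers_have(expected: str, headers: dict) -> bool:
    """True if every expected header is present with exactly the expected value."""
    found = _lower_keys(headers)
    for name, value in json.loads(expected).items():
        key = name.lower()
        if key not in found or found[key] != value:
            return False
    return True


def headers_like(expected: str, headers: dict) -> bool:
    """True if every expected header is present and its value matches the regex."""
    found = _lower_keys(headers)
    for name, pattern in json.loads(expected).items():
        key = name.lower()
        if key not in found or not re.search(pattern, found[key]):
            return False
    return True
```

As in the checks above, `expected` is a JSON string mapping header names to values (or to regexes), and the first missing or mismatching header makes the check fail.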

View file

@ -10,8 +10,7 @@ import uvicorn
from alembic import command
from alembic.config import Config
from argos import logging
from argos import VERSION
from argos import VERSION, logging
from argos.agent import ArgosAgent
@ -93,7 +92,12 @@ def version():
default="INFO",
type=click.Choice(logging.LOG_LEVELS, case_sensitive=False),
)
def agent(server_url, auth, max_tasks, wait_time, log_level):
@click.option(
"--user-agent",
default="",
help="A custom string to append to the User-Agent header",
)
def agent(server_url, auth, max_tasks, wait_time, log_level, user_agent): # pylint: disable-msg=too-many-positional-arguments
"""Get and run tasks for the provided server. Will wait for new tasks.
Usage: argos agent https://argos.example.org "auth-token-here"
@ -109,7 +113,7 @@ def agent(server_url, auth, max_tasks, wait_time, log_level):
from argos.logging import logger
logger.setLevel(log_level)
agent_ = ArgosAgent(server_url, auth, max_tasks, wait_time)
agent_ = ArgosAgent(server_url, auth, max_tasks, wait_time, user_agent)
asyncio.run(agent_.run())
@ -136,101 +140,6 @@ def start(host, port, config, reload):
uvicorn.run("argos.server:app", host=host, port=port, reload=reload)
def validate_max_lock_seconds(ctx, param, value):
if value <= 60:
raise click.BadParameter("Should be strictly higher than 60")
return value
def validate_max_results(ctx, param, value):
if value <= 0:
raise click.BadParameter("Should be a positive integer")
return value
@server.command()
@click.option(
"--max-results",
default=100,
help="Number of results per task to keep",
callback=validate_max_results,
)
@click.option(
"--max-lock-seconds",
default=100,
help=(
"The number of seconds after which a lock is "
"considered stale, must be higher than 60 "
"(the checks have a timeout value of 60 seconds)"
),
callback=validate_max_lock_seconds,
)
@click.option(
"--config",
default="argos-config.yaml",
help="Path of the configuration file. "
"If ARGOS_YAML_FILE environment variable is set, its value will be used instead. "
"Default value: argos-config.yaml and /etc/argos/config.yaml as fallback.",
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@coroutine
async def cleandb(max_results, max_lock_seconds, config):
"""Clean the database (to run routinely)
\b
- Removes old results from the database.
- Removes locks from tasks that have been locked for too long.
"""
# It's mandatory to do it before the imports
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
db = await get_db()
removed = await queries.remove_old_results(db, max_results)
updated = await queries.release_old_locks(db, max_lock_seconds)
click.echo(f"{removed} results removed")
click.echo(f"{updated} locks released")
@server.command()
@click.option(
"--time-without-agent",
default=5,
help="Time without seeing an agent after which a warning will be issued, in minutes. "
"Default is 5 minutes.",
callback=validate_max_results,
)
@click.option(
"--config",
default="argos-config.yaml",
help="Path of the configuration file. "
"If ARGOS_YAML_FILE environment variable is set, its value will be used instead.",
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@coroutine
async def watch_agents(time_without_agent, config):
"""Watch agents (to run routinely)
Issues a warning if no agent has been seen by the server for a given time.
"""
# It's mandatory to do it before the imports
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
db = await get_db()
agents = await queries.get_recent_agents_count(db, time_without_agent)
if agents == 0:
click.echo(f"No agent has been seen in the last {time_without_agent} minutes.")
sysexit(1)
@server.command(short_help="Load or reload tasks configuration")
@click.option(
"--config",
@ -241,23 +150,40 @@ async def watch_agents(time_without_agent, config):
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@click.option(
"--enqueue/--no-enqueue",
default=False,
help="Let Argos main recurring tasks handle configurations loading. "
"It may delay the application of the new configuration up to 2 minutes. "
"Default is --no-enqueue",
)
@coroutine
async def reload_config(config):
async def reload_config(config, enqueue):
"""Read tasks configuration and add/delete tasks in database if needed"""
# It's mandatory to do it before the imports
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
from argos.server.main import read_config
from argos.server.settings import read_config
_config = read_config(config)
db = await get_db()
config_changed = await queries.has_config_changed(db, _config)
if not config_changed:
click.echo("Config has not changed")
else:
if enqueue:
msg = await queries.update_from_config_later(db, config_file=config)
click.echo(msg)
else:
changed = await queries.update_from_config(db, _config)
click.echo(f"{changed['added']} tasks added")
click.echo(f"{changed['vanished']} tasks deleted")
click.echo(f"{changed['added']} task(s) added")
click.echo(f"{changed['vanished']} task(s) deleted")
@server.command()
@ -305,9 +231,10 @@ async def add(config, name, password):
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
from passlib.context import CryptContext
from argos.server import queries
db = await get_db()
_user = await queries.get_user(db, name)
if _user is not None:
@ -339,9 +266,10 @@ async def change_password(config, name, password):
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
from passlib.context import CryptContext
from argos.server import queries
db = await get_db()
_user = await queries.get_user(db, name)
if _user is None:
@ -374,9 +302,10 @@ async def verify_password(config, name, password):
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
from passlib.context import CryptContext
from argos.server import queries
db = await get_db()
_user = await queries.get_user(db, name)
if _user is None:
@ -548,5 +477,256 @@ async def generate_config():
print(f.read())
@server.command()
@click.option(
"--config",
default="argos-config.yaml",
help="Path of the configuration file. "
"If ARGOS_YAML_FILE environment variable is set, its value will be used instead.",
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@click.option("--domain", help="Domain for the notification", default="example.org")
@click.option("--severity", help="Severity", default="CRITICAL")
@coroutine
async def test_mail(config, domain, severity):
"""Send a test email"""
os.environ["ARGOS_YAML_FILE"] = config
from datetime import datetime
from argos.logging import set_log_level
from argos.server.alerting import notify_by_mail
from argos.server.models import Result, Task
from argos.server.settings import read_config
conf = read_config(config)
if not conf.general.mail:
click.echo("Mail is not configured, cannot test", err=True)
sysexit(1)
else:
now = datetime.now()
task = Task(
url=f"https://{domain}",
domain=domain,
check="body-contains",
expected="foo",
frequency=1,
ip_version=4,
selected_by="test",
selected_at=now,
)
result = Result(
submitted_at=now,
status="success",
context={"foo": "bar"},
task=task,
agent_id="test",
severity="ok",
)
class _FalseRequest:
def url_for(*args, **kwargs):
return "/url"
set_log_level("debug")
notify_by_mail(
result,
task,
severity=severity,
old_severity="OLD SEVERITY",
config=conf.general.mail,
request=_FalseRequest(),
)
@server.command()
@click.option(
"--config",
default="argos-config.yaml",
help="Path of the configuration file. "
"If ARGOS_YAML_FILE environment variable is set, its value will be used instead.",
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@click.option("--domain", help="Domain for the notification", default="example.org")
@click.option("--severity", help="Severity", default="CRITICAL")
@coroutine
async def test_gotify(config, domain, severity):
"""Send a test gotify notification"""
os.environ["ARGOS_YAML_FILE"] = config
from datetime import datetime
from argos.logging import set_log_level
from argos.server.alerting import notify_with_gotify
from argos.server.models import Result, Task
from argos.server.settings import read_config
conf = read_config(config)
if not conf.general.gotify:
click.echo("Gotify notifications are not configured, cannot test", err=True)
sysexit(1)
else:
now = datetime.now()
task = Task(
url=f"https://{domain}",
domain=domain,
check="body-contains",
expected="foo",
frequency=1,
ip_version=4,
selected_by="test",
selected_at=now,
)
result = Result(
submitted_at=now,
status="success",
context={"foo": "bar"},
task=task,
agent_id="test",
severity="ok",
)
class _FalseRequest:
def url_for(*args, **kwargs):
return "/url"
set_log_level("debug")
notify_with_gotify(
result,
task,
severity=severity,
old_severity="OLD SEVERITY",
config=conf.general.gotify,
request=_FalseRequest(),
)
@server.command()
@click.option(
"--config",
default="argos-config.yaml",
help="Path of the configuration file. "
"If ARGOS_YAML_FILE environment variable is set, its value will be used instead.",
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@click.option("--domain", help="Domain for the notification", default="example.org")
@click.option("--severity", help="Severity", default="CRITICAL")
@click.option(
"--apprise-group", help="Apprise group for the notification", required=True
)
@coroutine
async def test_apprise(config, domain, severity, apprise_group):
"""Send a test apprise notification"""
os.environ["ARGOS_YAML_FILE"] = config
from datetime import datetime
from argos.logging import set_log_level
from argos.server.alerting import notify_with_apprise
from argos.server.models import Result, Task
from argos.server.settings import read_config
conf = read_config(config)
if not conf.general.apprise:
click.echo("Apprise notifications are not configured, cannot test", err=True)
sysexit(1)
else:
now = datetime.now()
task = Task(
url=f"https://{domain}",
domain=domain,
check="body-contains",
expected="foo",
frequency=1,
ip_version=4,
selected_by="test",
selected_at=now,
)
result = Result(
submitted_at=now,
status="success",
context={"foo": "bar"},
task=task,
agent_id="test",
severity="ok",
)
class _FalseRequest:
def url_for(*args, **kwargs):
return "/url"
set_log_level("debug")
notify_with_apprise(
result,
task,
severity=severity,
old_severity="OLD SEVERITY",
group=conf.general.apprise[apprise_group],
request=_FalseRequest(),
)
@server.command(short_help="Nagios compatible severities report")
@click.option(
"--config",
default="argos-config.yaml",
help="Path of the configuration file. "
"If ARGOS_YAML_FILE environment variable is set, its value will be used instead.",
envvar="ARGOS_YAML_FILE",
callback=validate_config_access,
)
@coroutine
async def nagios(config):
"""Output a report of current severities suitable for Nagios
with a Nagios compatible exit code"""
os.environ["ARGOS_YAML_FILE"] = config
# The imports are made here otherwise the agent will need server configuration files.
from argos.server import queries
exit_nb = 0
db = await get_db()
severities = await queries.get_severity_counts(db)
if severities["warning"] != 0:
exit_nb = 1
if severities["critical"] != 0:
exit_nb = 2
if severities["unknown"] != 0:
exit_nb = 2
stats = (
f"ok={severities['ok']}; warning={severities['warning']}; "
f"critical={severities['critical']}; unknown={severities['unknown']};"
)
if exit_nb == 0:
print(f"OK — All sites are ok|{stats}")
elif exit_nb == 1:
print(f"WARNING — {severities['warning']} sites are in warning state|{stats}")
elif severities["critical"] == 0:
print(f"UNKNOWN — {severities['unknown']} sites are in unknown state|{stats}")
elif severities["unknown"] == 0:
print(
f"CRITICAL — {severities['critical']} sites are in critical state|{stats}"
)
else:
print(
f"CRITICAL/UNKNOWN — {severities['critical']} sites are in critical state "
f"and {severities['unknown']} sites are in unknown state|{stats}"
)
sysexit(exit_nb)
if __name__ == "__main__":
cli()
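The exit-code logic of the `nagios` command above can be sketched as a small pure function, which makes the mapping easier to test in isolation. This is only an illustration under stated assumptions: the function name `nagios_exit_code` is hypothetical, and the `severities` dictionary is assumed to carry the same four keys that the command reads from `queries.get_severity_counts`.

```python
def nagios_exit_code(severities: dict) -> int:
    """Map severity counts to the exit code used by the command above.

    0 when nothing is wrong, 1 when at least one site is in warning state,
    2 when at least one site is in critical or unknown state.
    """
    exit_nb = 0
    if severities["warning"] != 0:
        exit_nb = 1
    # Critical and unknown both take precedence over warning.
    if severities["critical"] != 0 or severities["unknown"] != 0:
        exit_nb = 2
    return exit_nb
```

Note that this mirrors the command as written, where the unknown state exits with 2, the same code as critical.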

View file

@ -1,25 +1,98 @@
---
general:
# Except for frequency and recheck_delay settings, changes in general
# section of the configuration will need a restart of argos server.
db:
# The database URL, as defined in SQLAlchemy docs : https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls
# The database URL, as defined in SQLAlchemy docs :
# https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls
# Example for SQLite: "sqlite:////tmp/argos.db"
url: "postgresql://argos:argos@localhost/argos"
# You configure the size of the database pool of connection, and the max overflow (until when new connections are accepted ?)
# See https://docs.sqlalchemy.org/en/20/core/pooling.html#sqlalchemy.pool.QueuePool.params.pool_size for details
# You can configure the size of the database connection pool and
# the max overflow (how many extra connections are accepted beyond the pool size).
# For details, see
# https://docs.sqlalchemy.org/en/20/core/pooling.html#sqlalchemy.pool.QueuePool.params.pool_size
pool_size: 10
max_overflow: 20
# Can be "production", "dev", "test".
# If not present, default value is "production"
env: "production"
# to get a good string for cookie_secret, run:
# To get a good string for cookie_secret, run:
# openssl rand -hex 32
cookie_secret: "foo_bar_baz"
# Session duration
# Use m for minutes, h for hours, d for days
# w for weeks, M for months, y for years
# See https://github.com/timwedde/durations_nlp#scales-reference for details
# If not present, default value is "7d"
session_duration: "7d"
# Session opened with "Remember me" checked
# If not present, the "Remember me" feature is not available
# remember_me_duration: "1M"
# Unauthenticated access
# You can grant unauthenticated access to the dashboard or to all pages
# To do so, choose either "dashboard", or "all"
# If not present, all pages need authentication
# unauthenticated_access: "all"
# LDAP authentication
# Instead of relying on Argos users, use a LDAP server to authenticate users.
# If not present, Argos native user system is used.
# ldap:
# # Server URI
# uri: "ldaps://ldap.example.org"
# # Search base DN
# user_tree: "ou=users,dc=example,dc=org"
# # Search bind DN
# bind_dn: "uid=ldap_user,ou=users,dc=example,dc=org"
# # Search bind password
# bind_pwd: "secr3t"
# # User attribute (uid, mail, sAMAccountName, etc.)
# user_attr: "uid"
# # User filter (to exclude some users, etc.)
# user_filter: "(!(uid=ldap_user))"
# Default delay for checks.
# Can be superseded in domain configuration.
# For ex., to run checks every minute:
frequency: "1m"
# For ex., to run checks every 5 minutes:
frequency: "5m"
# Default re-check delay if a check has failed.
# Can be superseded in domain configuration.
# If not present, failed checks won't be re-checked (they will be
# run again as if they had succeeded)
# For ex., to re-try a check one minute after a failure:
# recheck_delay: "1m"
# Default setting for notifications delay.
# Say you want to be warned right after a failure on a check: set it to 0
# Say you want a second failure on the check before being warned,
# to avoid network hiccups: set it to 1
# Can be superseded in domain configuration
# If not present, default is 0
# retry_before_notification: 0
# Default settings for IPv4/IPv6
# Can be superseded in domain configuration.
# By default, Argos will check both IPv4 and IPv6 addresses of a domain
# (i.e. by default, both `ipv4` and `ipv6` are set to true).
# To disable the IPv4 check of domains:
# ipv4: false
# To disable the IPv6 check of domains:
# ipv6: false
# Argos root path
# If not present, default value is ""
# Set it to /foo if you want to use argos at /foo/ instead of /
# on your web server
# root_path: "/foo"
# Which way do you want to be warned when a check goes to that severity?
# "local" emits a message in the server log
# You'll need to configure mail and gotify below to be able to use them here.
# You'll need to configure mail, gotify or apprise below to be able to use
# them here.
# Use "apprise:john", "apprise:team" (with the quotes!) to use apprise
# notification groups.
alerts:
ok:
- local
@ -29,6 +102,10 @@ general:
- local
unknown:
- local
# This alert is triggered when no Argos agent has been seen in a while
# See recurring_tasks.time_without_agent below
no_agent:
- local
# Mail configuration is quite straight-forward
# mail:
# mailfrom: no-reply@example.org
@ -49,6 +126,17 @@ general:
# tokens:
# - foo
# - bar
# See https://github.com/caronc/apprise#productivity-based-notifications
# for apprise's URL syntax.
# You need to surround the URLs with quotes like in the examples below.
# Use "apprise:john", "apprise:team" (with the quotes!) in "alerts" settings.
# apprise:
# john:
# - "mastodon://access_key@hostname/@user"
# - "matrixs://token@hostname:port/?webhook=matrix"
# team:
# - "mmosts://user@hostname/authkey"
# - "nctalks://user:pass@host/RoomId1/RoomId2/RoomIdN"
service:
secrets:
@ -61,6 +149,22 @@ ssl:
- "1d": critical
- "5d": warning
# Argos will execute some tasks in the background for you
# every 2 minutes and needs some configuration for that
recurring_tasks:
# Maximum age of results
# Use m for minutes, h for hours, d for days
# w for weeks, M for months, y for years
# See https://github.com/timwedde/durations_nlp#scales-reference for details
max_results_age: "1d"
# Max number of seconds a task can be locked
# Minimum value is 61, default is 100
max_lock_seconds: 100
# Max number of minutes without seeing an agent
# before sending an alert
# Minimum value is 1, default is 5
time_without_agent: 5
# It's also possible to define the checks in another file
# with the include syntax:
#
@ -68,17 +172,100 @@ ssl:
#
websites:
- domain: "https://mypads.example.org"
# Wait for a second failure before sending notification
retry_before_notification: 1
paths:
- path: "/mypads/"
# Specify the method of the HTTP request
# Valid values are "GET", "HEAD", "POST", "OPTIONS",
# "CONNECT", "TRACE", "PUT", "PATCH" and "DELETE"
# default is "GET" if omitted
method: "GET"
checks:
# Check that the returned HTTP status is 200
- status-is: 200
# Check that the response contains this string
- body-contains: '<div id= "mypads"></div>'
# Check that the response matches this regex
- body-like: MyPads .* accounts
# Check that the SSL certificate does not expire within ssl.thresholds
- ssl-certificate-expiration: "on-check"
# Check that the response contains these headers
# The comparison is case insensitive
- headers-contain:
- "content-encoding"
- "content-type"
# Check that there is a HTTP to HTTPS redirection with 3xx status code
- http-to-https: true
# Check that there is a HTTP to HTTPS redirection with 301 status code
- http-to-https: 301
# Check that there is a HTTP to HTTPS redirection with a status code
# in the provided range (stop value excluded)
- http-to-https:
start: 301
stop: 308
# Check that there is a HTTP to HTTPS redirection with a status code
# in the provided list
- http-to-https:
- 301
- 302
- 307
- path: "/admin/"
method: "POST"
# Send form data in the request
request_data:
data:
login: "admin"
password: "my-password"
# To send data as JSON (optional, default is false):
is_json: true
# To send additional headers
headers:
Authorization: "Bearer foo-bar-baz"
checks:
- status-is: 401
# Check that the returned HTTP status is one of those
# Similar to status-is, verify that you didn't mistype it!
- status-in:
- 401
- 301
# Check that the response contains these headers and values
# It's VERY important to respect the 4 spaces indentation here!
# The name of the headers is case insensitive
- headers-have:
content-encoding: "gzip"
content-type: "text/html"
# Check that the response headers contain the expected headers and
# that the values match the provided regexes
# You have to double the escape character \
- headers-like:
content-encoding: "gzip|utf"
content-type: "text/(html|css)"
- path: "/my-stats.json"
checks:
# Check that JSON response contains the expected structure
- json-contains:
- /foo/bar/0
- /foo/bar/1
- /timestamp
# Check that JSON response contains the expected structure and values
# It's VERY important to respect the 4 spaces indentation here!
- json-has:
/maintenance: false
/productname: "Nextcloud"
# Check that JSON response contains the expected structure and
# that the values match the provided regexes
# You have to double the escape character \
- json-like:
/productname: ".*cloud"
/versionstring: "29\\..*"
# Check that JSON response is the exact expected JSON object
# The order of the items in the object does not matter.
- json-is: '{"foo": "bar", "baz": 42}'
- domain: "https://munin.example.org"
frequency: "20m"
recheck_delay: "5m"
# Let's say it's an IPv6 only web site
ipv4: false
paths:
- path: "/"
checks:

View file

@ -14,9 +14,10 @@ logger = logging.getLogger(__name__)
# XXX Does not work ?
def set_log_level(log_level):
def set_log_level(log_level: str, quiet: bool = False):
level = getattr(logging, log_level.upper(), None)
if not isinstance(level, int):
raise ValueError(f"Invalid log level: {log_level}")
logger.setLevel(level=level)
if not quiet:
logger.info("Log level set to %s", log_level)

View file

@ -2,8 +2,12 @@
For database models, see argos.server.models.
"""
from typing import Dict, List, Literal, Optional, Tuple
import json
from typing import Any, Dict, List, Literal, Tuple
from durations_nlp import Duration
from pydantic import (
BaseModel,
ConfigDict,
@ -14,15 +18,16 @@ from pydantic import (
PositiveInt,
field_validator,
)
from pydantic.functional_validators import BeforeValidator
from pydantic.functional_validators import AfterValidator, BeforeValidator
from pydantic.networks import UrlConstraints
from pydantic_core import Url
from typing_extensions import Annotated
from argos.schemas.utils import string_to_duration
from argos.schemas.utils import Method
Severity = Literal["warning", "error", "critical", "unknown"]
Environment = Literal["dev", "test", "production"]
Unauthenticated = Literal["dashboard", "all"]
SQLiteDsn = Annotated[
Url,
UrlConstraints(
@ -34,7 +39,7 @@ SQLiteDsn = Annotated[
def parse_threshold(value):
"""Parse duration threshold for SSL certificate validity"""
for duration_str, severity in value.items():
days = string_to_duration(duration_str, "days")
days = Duration(duration_str).to_days()
# Return here because it's one-item dicts.
return (days, severity)
@ -43,6 +48,33 @@ class SSL(BaseModel):
thresholds: List[Annotated[Tuple[int, Severity], BeforeValidator(parse_threshold)]]
class RecurringTasks(BaseModel):
max_results_age: float
max_lock_seconds: int
time_without_agent: int
@field_validator("max_results_age", mode="before")
def parse_max_results_age(cls, value):
"""Convert the configured maximum results age to seconds"""
return Duration(value).to_seconds()
@field_validator("max_lock_seconds", mode="before")
def parse_max_lock_seconds(cls, value):
"""Ensure that max_lock_seconds is higher or equal to agents requests timeout (60)"""
if value > 60:
return value
return 100
@field_validator("time_without_agent", mode="before")
def parse_time_without_agent(cls, value):
"""Ensure that time_without_agent is at least one minute"""
if value >= 1:
return value
return 5
class WebsiteCheck(BaseModel):
key: str
value: str | List[str] | Dict[str, str]
@ -76,13 +108,49 @@ def parse_checks(value):
if name not in available_names:
msg = f"Check should be one of {available_names}. ({name} given)"
raise ValueError(msg)
if name == "http-to-https":
if isinstance(expected, int) and expected in range(300, 400):
expected = json.dumps({"value": expected})
elif isinstance(expected, list):
expected = json.dumps({"list": expected})
elif (
isinstance(expected, dict)
and "start" in expected
and "stop" in expected
):
expected = json.dumps({"range": [expected["start"], expected["stop"]]})
else:
expected = json.dumps({"range": [300, 400]})
else:
if isinstance(expected, int):
expected = str(expected)
if isinstance(expected, list):
expected = json.dumps(expected)
if isinstance(expected, dict):
expected = json.dumps(expected)
return (name, expected)
def parse_request_data(value):
"""Turn form or JSON data into JSON string"""
return json.dumps(
{"data": value.data, "json": value.is_json, "headers": value.headers}
)
class RequestData(BaseModel):
data: Any = None
is_json: bool = False
headers: Dict[str, str] | None = None
class WebsitePath(BaseModel):
path: str
method: Method = "GET"
request_data: Annotated[
RequestData, AfterValidator(parse_request_data)
] | None = None
checks: List[
Annotated[
Tuple[str, str],
@ -93,14 +161,26 @@ class WebsitePath(BaseModel):
class Website(BaseModel):
domain: HttpUrl
frequency: Optional[int] = None
ipv4: bool | None = None
ipv6: bool | None = None
frequency: float | None = None
recheck_delay: float | None = None
retry_before_notification: int | None = None
paths: List[WebsitePath]
@field_validator("frequency", mode="before")
def parse_frequency(cls, value):
"""Convert the configured frequency to minutes"""
if value:
return string_to_duration(value, "minutes")
return Duration(value).to_minutes()
return None
@field_validator("recheck_delay", mode="before")
def parse_recheck_delay(cls, value):
"""Convert the configured recheck delay to minutes"""
if value:
return Duration(value).to_minutes()
return None
@ -126,7 +206,7 @@ class Mail(BaseModel):
port: PositiveInt = 25
ssl: StrictBool = False
starttls: StrictBool = False
auth: Optional[MailAuth] = None
auth: MailAuth | None = None
addresses: List[EmailStr]
@ -137,6 +217,7 @@ class Alert(BaseModel):
warning: List[str]
critical: List[str]
unknown: List[str]
no_agent: List[str]
class GotifyUrl(BaseModel):
@ -150,25 +231,66 @@ class DbSettings(BaseModel):
max_overflow: int = 20
class LdapSettings(BaseModel):
uri: str
user_tree: str
bind_dn: str | None = None
bind_pwd: str | None = None
user_attr: str
user_filter: str | None = None
class General(BaseModel):
"""Frequency for the checks and alerts"""
cookie_secret: str
frequency: int
db: DbSettings
env: Environment = "production"
cookie_secret: str
session_duration: int = 10080 # 7 days
remember_me_duration: int | None = None
unauthenticated_access: Unauthenticated | None = None
ldap: LdapSettings | None = None
frequency: float
recheck_delay: float | None = None
retry_before_notification: int = 0
ipv4: bool = True
ipv6: bool = True
root_path: str = ""
alerts: Alert
mail: Optional[Mail] = None
gotify: Optional[List[GotifyUrl]] = None
mail: Mail | None = None
gotify: List[GotifyUrl] | None = None
apprise: Dict[str, List[str]] | None = None
@field_validator("session_duration", mode="before")
def parse_session_duration(cls, value):
"""Convert the configured session duration to minutes"""
return Duration(value).to_minutes()
@field_validator("remember_me_duration", mode="before")
def parse_remember_me_duration(cls, value):
"""Convert the configured session duration with remember me feature to minutes"""
if value:
return int(Duration(value).to_minutes())
return None
@field_validator("frequency", mode="before")
def parse_frequency(cls, value):
"""Convert the configured frequency to minutes"""
return string_to_duration(value, "minutes")
return Duration(value).to_minutes()
@field_validator("recheck_delay", mode="before")
def parse_recheck_delay(cls, value):
"""Convert the configured recheck delay to minutes"""
if value:
return Duration(value).to_minutes()
return None
class Config(BaseModel):
general: General
service: Service
ssl: SSL
recurring_tasks: RecurringTasks
websites: List[Website]
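To make the `http-to-https` branch of `parse_checks` above easier to follow, here is a minimal standalone sketch of the same normalization. The function name `normalize_http_to_https` is hypothetical and not part of the code base; the sketch only shows how each accepted YAML form (a single status code, a list, a `start`/`stop` mapping, or anything else such as `true`) is turned into a JSON string.

```python
import json


def normalize_http_to_https(expected) -> str:
    """Turn the configured value of an http-to-https check into a JSON string."""
    # A single 3xx status code, e.g. `http-to-https: 301`
    if isinstance(expected, int) and expected in range(300, 400):
        return json.dumps({"value": expected})
    # An explicit list of accepted status codes
    if isinstance(expected, list):
        return json.dumps({"list": expected})
    # A range given as `start` and `stop` (stop value excluded by the check)
    if isinstance(expected, dict) and "start" in expected and "stop" in expected:
        return json.dumps({"range": [expected["start"], expected["stop"]]})
    # Anything else, including `true`, falls back to any 3xx status code
    return json.dumps({"range": [300, 400]})
```

For instance, `normalize_http_to_https(True)` falls through to the last branch, because `True` is an `int` equal to 1 and therefore outside `range(300, 400)`.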

View file

@ -8,17 +8,39 @@ from typing import Literal
from pydantic import BaseModel, ConfigDict
from argos.schemas.utils import IPVersion, Method, Todo
# XXX Refactor using SQLModel to avoid duplication of model data
class Job(BaseModel):
"""Tasks needing to be executed in recurring tasks processing.
It's quite like a job queue."""
id: int
todo: Todo
args: str
current: bool
added_at: datetime
def __str__(self):
return f"Job ({self.id}): {self.todo}"
class Task(BaseModel):
"""A task corresponds to a check to execute"""
id: int
url: str
domain: str
ip_version: IPVersion
check: str
method: Method
request_data: str | None
expected: str
task_group: str
retry_before_notification: int
contiguous_failures: int
selected_at: datetime | None
selected_by: str | None
@ -28,7 +50,8 @@ class Task(BaseModel):
task_id = self.id
url = self.url
check = self.check
return f"Task ({task_id}): {url} - {check}"
ip_version = self.ip_version
return f"Task ({task_id}): {url} (IPv{ip_version}) - {check}"
class SerializableException(BaseModel):

View file

@ -1,42 +1,10 @@
from typing import Literal, Union
from typing import Literal
def string_to_duration(
value: str, target: Literal["days", "hours", "minutes"]
) -> Union[int, float]:
"""Convert a string to a number of hours, days or minutes"""
num = int("".join(filter(str.isdigit, value)))
IPVersion = Literal["4", "6"]
# It's not possible to convert from a smaller unit to a greater one:
# - hours and minutes cannot be converted to days
# - minutes cannot be converted to hours
if (target == "days" and ("h" in value or "m" in value.replace("mo", ""))) or (
target == "hours" and "m" in value.replace("mo", "")
):
msg = (
"Durations cannot be converted from a smaller to a greater unit. "
f"(trying to convert '{value}' to {target})"
)
raise ValueError(msg, value)
Method = Literal[
"GET", "HEAD", "POST", "OPTIONS", "CONNECT", "TRACE", "PUT", "PATCH", "DELETE"
]
# Consider we're converting to minutes, do the eventual multiplication at the end.
if "h" in value:
num = num * 60
elif "d" in value:
num = num * 60 * 24
elif "w" in value:
num = num * 60 * 24 * 7
elif "mo" in value:
num = num * 60 * 24 * 30 # considers 30d in a month
elif "y" in value:
num = num * 60 * 24 * 365 # considers 365d in a year
elif "m" not in value:
raise ValueError("Invalid duration value", value)
if target == "hours":
return num / 60
if target == "days":
return num / 60 / 24
# target == "minutes"
return num
Todo = Literal["RELOAD_CONFIG"]

View file

@ -1,20 +1,165 @@
import ssl
import smtplib
from email.message import EmailMessage
from typing import List
from urllib.parse import urlparse
import apprise
import httpx
from argos.checks.base import Severity
from argos.logging import logger
from argos.schemas.config import Config, Mail, GotifyUrl
# XXX Implement mail alerts https://framagit.org/framasoft/framaspace/argos/-/issues/15
# XXX Implement gotify alerts https://framagit.org/framasoft/framaspace/argos/-/issues/16
from argos.server.models import Task
def handle_alert(config: Config, result, task, severity, old_severity, request):
def need_alert(
last_severity: str, last_severity_update, severity: str, status: str, task: Task
) -> bool:
## Create alert… or not!
send_notif = False
# Severity has changed, and no retry before notification
if last_severity != severity and task.retry_before_notification == 0:
send_notif = True
# Seems to be a first check: create a notification
elif last_severity != severity and last_severity_update is None:
send_notif = True
# As we created a notification, avoid resending it on a
# future failure
if status != "success":
task.contiguous_failures = task.retry_before_notification
# We need retry before notification, so the severity may not have changed
# since last check
elif task.retry_before_notification != 0:
# If we got a success, and we already have created a notification:
# create notification of success immediately
if (
status == "success"
and task.contiguous_failures >= task.retry_before_notification + 1
):
send_notif = True
task.contiguous_failures = 0
# The status is not a success
elif status != "success":
# This is a new failure
task.contiguous_failures += 1
# Severity has changed, but not to success, that's odd:
# create a notification
if (
last_severity not in ("ok", severity)
and last_severity_update is not None
):
send_notif = True
# As we created a notification, avoid resending it on a
# future failure
task.contiguous_failures = task.retry_before_notification
# Severity has not changed, but there have been enough failures
# to create a notification
elif task.contiguous_failures == task.retry_before_notification + 1:
send_notif = True
return send_notif
def get_icon_from_severity(severity: str) -> str:
icon = ""
if severity == Severity.OK:
icon = ""
elif severity == Severity.WARNING:
icon = "⚠️"
elif severity == Severity.UNKNOWN:
icon = ""
return icon
def send_mail(mail: EmailMessage, config: Mail):
"""Send message by mail"""
if config.ssl:
logger.debug("Mail notification: SSL")
context = ssl.create_default_context()
smtp = smtplib.SMTP_SSL(host=config.host, port=config.port, context=context)
else:
smtp = smtplib.SMTP(
host=config.host, # type: ignore
port=config.port,
)
if config.starttls:
logger.debug("Mail notification: STARTTLS")
context = ssl.create_default_context()
smtp.starttls(context=context)
if config.auth is not None:
logger.debug("Mail notification: authentication")
smtp.login(config.auth.login, config.auth.password)
for address in config.addresses:
logger.debug("Sending mail to %s", address)
logger.debug(mail.get_body())
smtp.send_message(mail, to_addrs=address)
def send_gotify_msg(config, payload):
"""Send message with gotify"""
headers = {"accept": "application/json", "content-type": "application/json"}
for url in config:
logger.debug("Sending gotify message(s) to %s", url.url)
for token in url.tokens:
try:
res = httpx.post(
f"{url.url}message",
params={"token": token},
headers=headers,
json=payload,
)
res.raise_for_status()
except httpx.RequestError as err:
logger.error(
"An error occurred while sending a message to %s with token %s",
err.request.url,
token,
)
def no_agent_alert(config: Config):
"""Alert when no agent has been seen recently"""
msg = "You should check what's going on with your Argos agents."
twa = config.recurring_tasks.time_without_agent
if twa > 1:
subject = f"No agent has been seen within the last {twa} minutes"
else:
subject = "No agent has been seen within the last minute"
if "local" in config.general.alerts.no_agent:
logger.error(subject)
if config.general.mail is not None and "mail" in config.general.alerts.no_agent:
mail = EmailMessage()
mail["Subject"] = f"[Argos] {subject}"
mail["From"] = config.general.mail.mailfrom
mail.set_content(msg)
send_mail(mail, config.general.mail)
if config.general.gotify is not None and "gotify" in config.general.alerts.no_agent:
priority = 9
payload = {"title": subject, "message": msg, "priority": priority}
send_gotify_msg(config.general.gotify, payload)
if config.general.apprise is not None:
for notif_way in config.general.alerts.no_agent:
if notif_way.startswith("apprise:"):
group = notif_way[8:]
apobj = apprise.Apprise()
for channel in config.general.apprise[group]:
apobj.add(channel)
apobj.notify(title=subject, body=msg)
def handle_alert(config: Config, result, task, severity, old_severity, request): # pylint: disable-msg=too-many-positional-arguments
"""Dispatch alert through configured alert channels"""
if "local" in getattr(config.general.alerts, severity):
@ -39,14 +184,54 @@ def handle_alert(config: Config, result, task, severity, old_severity, request):
result, task, severity, old_severity, config.general.gotify, request
)
if config.general.apprise is not None:
for notif_way in getattr(config.general.alerts, severity):
if notif_way.startswith("apprise:"):
group = notif_way[8:]
notify_with_apprise(
result,
task,
severity,
old_severity,
config.general.apprise[group],
request,
)
def notify_by_mail(
def notify_with_apprise( # pylint: disable-msg=too-many-positional-arguments
result, task, severity: str, old_severity: str, group: List[str], request
) -> None:
logger.debug("Will send apprise notification")
apobj = apprise.Apprise()
for channel in group:
apobj.add(channel)
icon = get_icon_from_severity(severity)
title = f"[Argos] {icon} {urlparse(task.url).netloc} (IPv{task.ip_version}): status {severity}"
msg = f"""\
URL: {task.url} (IPv{task.ip_version})
Check: {task.check}
Status: {severity}
Time: {result.submitted_at}
Previous status: {old_severity}
See result on {request.url_for('get_result_view', result_id=result.id)}
See results of task on {request.url_for('get_task_results_view', task_id=task.id)}#{result.id}
"""
apobj.notify(title=title, body=msg)
def notify_by_mail( # pylint: disable-msg=too-many-positional-arguments
result, task, severity: str, old_severity: str, config: Mail, request
) -> None:
logger.debug("Will send mail notification")
icon = get_icon_from_severity(severity)
msg = f"""\
URL: {task.url}
URL: {task.url} (IPv{task.ip_version})
Check: {task.check}
Status: {severity}
Time: {result.submitted_at}
@ -57,79 +242,52 @@ See result on {request.url_for('get_result_view', result_id=result.id)}
See results of task on {request.url_for('get_task_results_view', task_id=task.id)}#{result.id}
"""
mail = f"""\
Subject: [Argos] {urlparse(task.url).netloc}: status {severity}
{msg}"""
if config.ssl:
logger.debug("Mail notification: SSL")
context = ssl.create_default_context()
smtp = smtplib.SMTP_SSL(host=config.host, port=config.port, context=context)
else:
smtp = smtplib.SMTP(
host=config.host, # type: ignore
port=config.port,
)
if config.starttls:
logger.debug("Mail notification: STARTTLS")
context = ssl.create_default_context()
smtp.starttls(context=context)
if config.auth is not None:
logger.debug("Mail notification: authentification")
smtp.login(config.auth.login, config.auth.password)
for address in config.addresses:
logger.debug("Sending mail to %s", address)
logger.debug(msg)
smtp.sendmail(config.mailfrom, address, mail)
mail = EmailMessage()
mail[
"Subject"
] = f"[Argos] {icon} {urlparse(task.url).netloc} (IPv{task.ip_version}): status {severity}"
mail["From"] = config.mailfrom
mail.set_content(msg)
send_mail(mail, config)
def notify_with_gotify(
def notify_with_gotify( # pylint: disable-msg=too-many-positional-arguments
result, task, severity: str, old_severity: str, config: List[GotifyUrl], request
) -> None:
logger.debug("Will send gotify notification")
headers = {"accept": "application/json", "content-type": "application/json"}
icon = get_icon_from_severity(severity)
priority = 9
icon = ""
if severity == Severity.OK:
priority = 1
icon = ""
elif severity == Severity.WARNING:
priority = 5
icon = "⚠️"
elif severity == Severity.UNKNOWN:
priority = 5
subject = f"{icon} {urlparse(task.url).netloc}: status {severity}"
subject = (
f"{icon} {urlparse(task.url).netloc} (IPv{task.ip_version}): status {severity}"
)
msg = f"""\
URL: {task.url}
Check: {task.check}
Status: {severity}
Time: {result.submitted_at}
Previous status: {old_severity}
See result on {request.url_for('get_result_view', result_id=result.id)}
See results of task on {request.url_for('get_task_results_view', task_id=task.id)}#{result.id}
URL:    <{task.url}> (IPv{task.ip_version})\\
Check:  {task.check}\\
Status: {severity}\\
Time:   {result.submitted_at}\\
Previous status: {old_severity}\\
\\
See result on <{request.url_for('get_result_view', result_id=result.id)}>\\
\\
See results of task on <{request.url_for('get_task_results_view', task_id=task.id)}#{result.id}>
"""
extras = {
"client::display": {"contentType": "text/markdown"},
"client::notification": {
"click": {
"url": f"{request.url_for('get_result_view', result_id=result.id)}"
}
},
}
payload = {"title": subject, "message": msg, "priority": priority}
payload = {"title": subject, "message": msg, "priority": priority, "extras": extras}
for url in config:
logger.debug("Sending gotify message(s) to %s", url)
for token in url.tokens:
try:
res = httpx.post(
f"{url.url}message",
params={"token": token},
headers=headers,
json=payload,
)
res.raise_for_status()
except httpx.RequestError as err:
logger.error(
"An error occurred while sending a message to %s with token %s",
err.request.url,
token,
)
send_gotify_msg(config, payload)
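
The retry handling in `need_alert` is easier to follow with a worked run. The sketch below is a condensed restatement of the branches shown in the diff, not the project's code: `FakeTask` and `run_scenario` are hypothetical helpers, the `"failure"` status string and the timestamp value are assumptions (the diff only compares against `"success"` and `None`), and the scenario assumes the caller passes the severity of the previous check as `last_severity`.

```python
from dataclasses import dataclass


@dataclass
class FakeTask:
    """Hypothetical stand-in for the Task model: just the two counters used here."""

    retry_before_notification: int
    contiguous_failures: int = 0


def need_alert(last_severity, last_severity_update, severity, status, task):
    """Condensed restatement of the decision branches shown in the diff."""
    # Severity changed and no retry is configured: notify at once
    if last_severity != severity and task.retry_before_notification == 0:
        return True
    # Severity changed on what looks like a first check: notify at once
    if last_severity != severity and last_severity_update is None:
        if status != "success":
            task.contiguous_failures = task.retry_before_notification
        return True
    if task.retry_before_notification != 0:
        # Recovery after a failure streak long enough to have been notified
        if (
            status == "success"
            and task.contiguous_failures >= task.retry_before_notification + 1
        ):
            task.contiguous_failures = 0
            return True
        if status != "success":
            task.contiguous_failures += 1
            # Severity moved between two non-ok values
            if last_severity not in ("ok", severity) and last_severity_update is not None:
                task.contiguous_failures = task.retry_before_notification
                return True
            # Exactly one failure more than the allowed retries
            if task.contiguous_failures == task.retry_before_notification + 1:
                return True
    return False


def run_scenario():
    """Four failures in a row, then a recovery, with two retries allowed."""
    task = FakeTask(retry_before_notification=2)
    checks = [
        # (last_severity, severity, status)
        ("ok", "critical", "failure"),
        ("critical", "critical", "failure"),
        ("critical", "critical", "failure"),
        ("critical", "critical", "failure"),
        ("critical", "ok", "success"),
    ]
    decisions = [
        need_alert(last, "2025-02-18", severity, status, task)
        for last, severity, status in checks
    ]
    return decisions, task.contiguous_failures


if __name__ == "__main__":
    print(run_scenario())
```

Under these assumptions, a notification goes out on the third consecutive failure (`retry_before_notification + 1`), the fourth failure stays silent, and the recovery is notified immediately and resets the counter.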

View file

@ -1,40 +1,59 @@
import os
import sys
from contextlib import asynccontextmanager
from pathlib import Path
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from fastapi_login import LoginManager
from pydantic import ValidationError
from fastapi_utils.tasks import repeat_every
from psutil import Process
from sqlalchemy import create_engine, event
from sqlalchemy.orm import sessionmaker
from argos.logging import logger
from argos.logging import logger, set_log_level
from argos.server import models, routes, queries
from argos.server.alerting import no_agent_alert
from argos.server.exceptions import NotAuthenticatedException, auth_exception_handler
from argos.server.settings import read_yaml_config
from argos.server.settings import read_config
def get_application() -> FastAPI:
"""Spawn Argos FastAPI server"""
appli = FastAPI(lifespan=lifespan)
config_file = os.environ["ARGOS_YAML_FILE"]
config = read_config(config_file)
root_path = config.general.root_path
if root_path != "":
logger.info("Root path for Argos: %s", root_path)
if root_path.endswith("/"):
root_path = root_path[:-1]
logger.info("Fixed root path for Argos: %s", root_path)
appli = FastAPI(lifespan=lifespan, root_path=root_path)
# Config is the argos config object (built from yaml)
appli.state.config = config
appli.add_exception_handler(NotAuthenticatedException, auth_exception_handler)
appli.state.manager = create_manager(config.general.cookie_secret)
if config.general.ldap is not None:
import ldap
appli.state.ldap = ldap.initialize(config.general.ldap.uri)
@appli.state.manager.user_loader()
async def query_user(user: str) -> None | models.User:
async def query_user(user: str) -> None | str | models.User:
"""
Get a user from the db
Get a user from the db or LDAP
:param user: name of the user
:return: None or the user object
"""
if appli.state.config.general.ldap is not None:
from argos.server.routes.dependencies import find_ldap_user
return await find_ldap_user(appli.state.config, appli.state.ldap, user)
return await queries.get_user(appli.state.db, user)
appli.include_router(routes.api, prefix="/api")
@ -51,17 +70,6 @@ async def connect_to_db(appli):
return appli.state.db
def read_config(yaml_file):
try:
config = read_yaml_config(yaml_file)
return config
except ValidationError as err:
logger.error("Errors where found while reading configuration:")
for error in err.errors():
logger.error("%s is %s", error["loc"], error["type"])
sys.exit(1)
def setup_database(appli):
config = appli.state.config
db_url = str(config.general.db.url)
@ -92,7 +100,7 @@ def setup_database(appli):
models.Base.metadata.create_all(bind=engine)
def create_manager(cookie_secret):
def create_manager(cookie_secret: str) -> LoginManager:
if cookie_secret == "foo_bar_baz":
logger.warning(
"You should change the cookie_secret secret in your configuration file."
@ -106,8 +114,47 @@ def create_manager(cookie_secret):
)
@repeat_every(seconds=120, logger=logger)
async def recurring_tasks() -> None:
"""Recurring DB cleanup and watch-agents tasks"""
# If we are using gunicorn
if not hasattr(app.state, "SessionLocal"):
parent_process = Process(os.getppid())
children = parent_process.children(recursive=True)
# Start the task only once, not for every worker
if children[0].pid == os.getpid():
# and we need to setup database engine
setup_database(app)
else:
return None
set_log_level("info", quiet=True)
logger.info("Start background recurring tasks")
with app.state.SessionLocal() as db:
config = app.state.config.recurring_tasks
agents = await queries.get_recent_agents_count(db, config.time_without_agent)
if agents == 0:
no_agent_alert(app.state.config)
logger.info("Agent presence checked")
removed = await queries.remove_old_results(db, config.max_results_age)
logger.info("%i result(s) removed", removed)
updated = await queries.release_old_locks(db, config.max_lock_seconds)
logger.info("%i lock(s) released", updated)
processed_jobs = await queries.process_jobs(db)
logger.info("%i job(s) processed", processed_jobs)
logger.info("Background recurring tasks ended")
return None
@asynccontextmanager
async def lifespan(appli):
async def lifespan(appli: FastAPI):
"""Server start and stop actions
Setup database connection then close it at shutdown.
@ -122,6 +169,7 @@ async def lifespan(appli):
"There are no tasks in the database. "
'Please launch the command "argos server reload-config"'
)
await recurring_tasks()
yield

View file

@ -0,0 +1,37 @@
"""Add recheck delay
Revision ID: 127d74c770bb
Revises: dcf73fa19fce
Create Date: 2024-11-27 16:04:58.138768
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "127d74c770bb"
down_revision: Union[str, None] = "dcf73fa19fce"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.add_column(sa.Column("recheck_delay", sa.Float(), nullable=True))
batch_op.add_column(
sa.Column(
"already_retried",
sa.Boolean(),
nullable=False,
server_default=sa.sql.false(),
)
)
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.drop_column("already_retried")
batch_op.drop_column("recheck_delay")

View file

@ -0,0 +1,28 @@
"""Add request data to tasks
Revision ID: 31255a412d63
Revises: 80a29f64f91c
Create Date: 2024-12-09 16:40:20.926138
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "31255a412d63"
down_revision: Union[str, None] = "80a29f64f91c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.add_column(sa.Column("request_data", sa.String(), nullable=True))
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.drop_column("request_data")

View file

@ -0,0 +1,36 @@
"""Add job queue
Revision ID: 5f6cb30db996
Revises: bd4b4962696a
Create Date: 2025-02-17 16:56:36.673511
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "5f6cb30db996"
down_revision: Union[str, None] = "bd4b4962696a"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.create_table(
"jobs",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column("todo", sa.Enum("RELOAD_CONFIG", name="todo_enum"), nullable=False),
sa.Column("args", sa.String(), nullable=False),
sa.Column(
"current", sa.Boolean(), server_default=sa.sql.false(), nullable=False
),
sa.Column("added_at", sa.DateTime(), nullable=False),
sa.PrimaryKeyConstraint("id"),
)
def downgrade() -> None:
op.drop_table("jobs")
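
To make the shape of the `jobs` table concrete, here is a minimal sketch that uses the standard-library `sqlite3` module instead of Alembic and SQLAlchemy. The `CHECK` constraint is an assumed stand-in for the `todo_enum` type, and `make_jobs_db` and `enqueue_reload` are hypothetical helpers; the "skip if an identical pending job exists" rule mirrors the filter used by `update_from_config_later` elsewhere in this diff.

```python
import sqlite3
from datetime import datetime


def make_jobs_db():
    """Create an in-memory table shaped like the `jobs` table of the migration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE jobs (
            id INTEGER NOT NULL PRIMARY KEY,
            todo TEXT NOT NULL CHECK (todo IN ('RELOAD_CONFIG')),
            args TEXT NOT NULL,
            "current" BOOLEAN NOT NULL DEFAULT 0,
            added_at TEXT NOT NULL
        )
        """
    )
    return conn


def enqueue_reload(conn, config_file):
    """Queue a config reload unless an identical, not yet started job exists."""
    pending = conn.execute(
        "SELECT COUNT(*) FROM jobs "
        "WHERE todo = 'RELOAD_CONFIG' AND args = ? AND \"current\" = 0",
        (config_file,),
    ).fetchone()[0]
    if pending:
        return False
    conn.execute(
        "INSERT INTO jobs (todo, args, added_at) VALUES ('RELOAD_CONFIG', ?, ?)",
        (config_file, datetime.now().isoformat()),
    )
    conn.commit()
    return True


if __name__ == "__main__":
    db = make_jobs_db()
    print(enqueue_reload(db, "/etc/argos/config.yaml"))
    print(enqueue_reload(db, "/etc/argos/config.yaml"))
```

Keeping `current` as a server-side default means a freshly inserted row is always pending, so the queueing code never has to set the flag explicitly.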

View file

@ -0,0 +1,34 @@
"""Add IP version to checks
Revision ID: 64f73a79b7d8
Revises: a1e98cf72a5c
Create Date: 2024-12-02 14:12:40.558033
"""
from typing import Sequence, Union
from alembic import op
from sqlalchemy.dialects.postgresql import ENUM
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "64f73a79b7d8"
down_revision: Union[str, None] = "a1e98cf72a5c"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
enum = ENUM("4", "6", name="ip_version_enum", create_type=False)
enum.create(op.get_bind(), checkfirst=True)
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.add_column(
sa.Column("ip_version", enum, server_default="4", nullable=False)
)
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.drop_column("ip_version")
ENUM(name="ip_version_enum").drop(op.get_bind(), checkfirst=True)

View file

@ -0,0 +1,41 @@
"""Add retries before notification feature
Revision ID: 80a29f64f91c
Revises: 8b58ced14d6e
Create Date: 2024-12-04 17:03:35.104368
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "80a29f64f91c"
down_revision: Union[str, None] = "8b58ced14d6e"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"retry_before_notification",
sa.Integer(),
server_default="0",
nullable=False,
)
)
batch_op.add_column(
sa.Column(
"contiguous_failures", sa.Integer(), server_default="0", nullable=False
)
)
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.drop_column("contiguous_failures")
batch_op.drop_column("retry_before_notification")

View file

@ -0,0 +1,35 @@
"""Add task index
Revision ID: 8b58ced14d6e
Revises: 64f73a79b7d8
Create Date: 2024-12-03 16:41:44.842213
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "8b58ced14d6e"
down_revision: Union[str, None] = "64f73a79b7d8"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.add_column(sa.Column("task_group", sa.String(), nullable=True))
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.execute(
"UPDATE tasks SET task_group = method || '-' || ip_version || '-' || url"
)
batch_op.alter_column("task_group", nullable=False)
batch_op.create_index("similar_tasks", ["task_group"], unique=False)
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.drop_index("similar_tasks")
batch_op.drop_column("task_group")

View file

@ -0,0 +1,38 @@
"""Make frequency a float
Revision ID: a1e98cf72a5c
Revises: 127d74c770bb
Create Date: 2024-11-27 16:10:13.000705
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "a1e98cf72a5c"
down_revision: Union[str, None] = "127d74c770bb"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.alter_column(
"frequency",
existing_type=sa.INTEGER(),
type_=sa.Float(),
existing_nullable=False,
)
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.alter_column(
"frequency",
existing_type=sa.Float(),
type_=sa.INTEGER(),
existing_nullable=False,
)

View file

@ -0,0 +1,42 @@
"""Use bigint for results id field
Revision ID: bd4b4962696a
Revises: 31255a412d63
Create Date: 2025-01-06 11:44:37.552965
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "bd4b4962696a"
down_revision: Union[str, None] = "31255a412d63"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
bind = op.get_bind()
if bind.engine.name != "sqlite":
with op.batch_alter_table("results", schema=None) as batch_op:
batch_op.alter_column(
"id",
existing_type=sa.INTEGER(),
type_=sa.BigInteger(),
existing_nullable=False,
)
def downgrade() -> None:
bind = op.get_bind()
if bind.engine.name != "sqlite":
with op.batch_alter_table("results", schema=None) as batch_op:
batch_op.alter_column(
"id",
existing_type=sa.BigInteger(),
type_=sa.INTEGER(),
existing_nullable=False,
)

View file

@ -0,0 +1,51 @@
"""Specify check method
Revision ID: dcf73fa19fce
Revises: c780864dc407
Create Date: 2024-11-26 14:40:27.510587
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = "dcf73fa19fce"
down_revision: Union[str, None] = "c780864dc407"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
enum = sa.Enum(
"GET",
"HEAD",
"POST",
"OPTIONS",
"CONNECT",
"TRACE",
"PUT",
"PATCH",
"DELETE",
name="method",
create_type=False,
)
enum.create(op.get_bind(), checkfirst=True)
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.add_column(
sa.Column(
"method",
enum,
nullable=False,
server_default="GET",
)
)
def downgrade() -> None:
with op.batch_alter_table("tasks", schema=None) as batch_op:
batch_op.drop_column("method")
sa.Enum(name="method").drop(op.get_bind(), checkfirst=True)

View file

@ -1,6 +1,7 @@
"""Database models"""
from datetime import datetime, timedelta
from hashlib import md5
from typing import List, Literal
from sqlalchemy import (
@ -9,15 +10,42 @@ from sqlalchemy import (
ForeignKey,
)
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
from sqlalchemy.schema import Index
from argos.checks import BaseCheck, get_registered_check
from argos.schemas import WebsiteCheck
from argos.schemas.utils import IPVersion, Method, Todo
def compute_task_group(context) -> str:
data = context.current_parameters["request_data"]
if data is None:
data = ""
return (
f"{context.current_parameters['method']}-"
f"{context.current_parameters['ip_version']}-"
f"{context.current_parameters['url']}-"
f"{md5(data.encode()).hexdigest()}"
)
class Base(DeclarativeBase):
type_annotation_map = {List[WebsiteCheck]: JSON, dict: JSON}
class Job(Base):
"""
Job queue emulation
"""
__tablename__ = "jobs"
id: Mapped[int] = mapped_column(primary_key=True)
todo: Mapped[Todo] = mapped_column(Enum("RELOAD_CONFIG", name="todo_enum"))
args: Mapped[str] = mapped_column()
current: Mapped[bool] = mapped_column(insert_default=False)
added_at: Mapped[datetime] = mapped_column()
class Task(Base):
"""
There is one task per check.
@ -32,15 +60,39 @@ class Task(Base):
# Info needed to run the task
url: Mapped[str] = mapped_column()
domain: Mapped[str] = mapped_column()
ip_version: Mapped[IPVersion] = mapped_column(
Enum("4", "6", name="ip_version_enum"),
)
check: Mapped[str] = mapped_column()
expected: Mapped[str] = mapped_column()
frequency: Mapped[int] = mapped_column()
frequency: Mapped[float] = mapped_column()
recheck_delay: Mapped[float] = mapped_column(nullable=True)
already_retried: Mapped[bool] = mapped_column(insert_default=False)
retry_before_notification: Mapped[int] = mapped_column(insert_default=0)
contiguous_failures: Mapped[int] = mapped_column(insert_default=0)
method: Mapped[Method] = mapped_column(
Enum(
"GET",
"HEAD",
"POST",
"OPTIONS",
"CONNECT",
"TRACE",
"PUT",
"PATCH",
"DELETE",
name="method",
),
insert_default="GET",
)
request_data: Mapped[str] = mapped_column(nullable=True)
# Orchestration-related
selected_by: Mapped[str] = mapped_column(nullable=True)
selected_at: Mapped[datetime] = mapped_column(nullable=True)
completed_at: Mapped[datetime] = mapped_column(nullable=True)
next_run: Mapped[datetime] = mapped_column(nullable=True)
task_group: Mapped[str] = mapped_column(insert_default=compute_task_group)
severity: Mapped[Literal["ok", "warning", "critical", "unknown"]] = mapped_column(
Enum("ok", "warning", "critical", "unknown", name="severity"),
@ -54,8 +106,8 @@ class Task(Base):
passive_deletes=True,
)
def __str__(self):
return f"DB Task {self.url} - {self.check} - {self.expected}"
def __str__(self) -> str:
return f"DB Task {self.url} (IPv{self.ip_version}) - {self.check} - {self.expected}"
def get_check(self) -> BaseCheck:
"""Returns a check instance for this specific task"""
@ -70,7 +122,16 @@ class Task(Base):
now = datetime.now()
self.completed_at = now
if (
self.recheck_delay is not None
and severity != "ok"
and not self.already_retried
):
self.next_run = now + timedelta(minutes=self.recheck_delay)
self.already_retried = True
else:
self.next_run = now + timedelta(minutes=self.frequency)
self.already_retried = False
@property
def last_result(self):
@ -87,6 +148,9 @@ class Task(Base):
return self.last_result.status
Index("similar_tasks", Task.task_group)
class Result(Base):
"""There are multiple results per task.
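
The scheduling branch added to the `Task` model above can be read as a pure function. The following sketch restates it outside the ORM; `compute_next_run` is a hypothetical helper, and the values are expressed in minutes, as in the `timedelta(minutes=...)` calls of the diff.

```python
from datetime import datetime, timedelta


def compute_next_run(now, severity, frequency, recheck_delay, already_retried):
    """Return (next_run, already_retried) following the rule shown in the diff."""
    if recheck_delay is not None and severity != "ok" and not already_retried:
        # First failure since the last regular run: recheck sooner, once
        return now + timedelta(minutes=recheck_delay), True
    # Success, no recheck delay configured, or recheck already used:
    # fall back to the regular frequency and re-arm the quick recheck
    return now + timedelta(minutes=frequency), False


if __name__ == "__main__":
    start = datetime(2025, 2, 18, 12, 0)
    print(compute_next_run(start, "critical", 10, 1, False))
    print(compute_next_run(start, "critical", 10, 1, True))
```

With a frequency of 10 minutes and a recheck delay of 1 minute, a first failure is rechecked after 1 minute; if that recheck fails too, the task goes back to its 10-minute schedule.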

View file

@ -4,25 +4,27 @@ from hashlib import sha256
from typing import List
from urllib.parse import urljoin
from sqlalchemy import asc, desc, func
from sqlalchemy import asc, func, Select
from sqlalchemy.orm import Session
from argos import schemas
from argos.logging import logger
from argos.server.models import Result, Task, ConfigCache, User
from argos.server.models import ConfigCache, Job, Result, Task, User
from argos.server.settings import read_config
async def list_tasks(db: Session, agent_id: str, limit: int = 100):
"""List tasks and mark them as selected"""
tasks = (
db.query(Task)
subquery = (
db.query(func.distinct(Task.task_group))
.filter(
Task.selected_by == None, # noqa: E711
((Task.next_run <= datetime.now()) | (Task.next_run == None)), # noqa: E711
)
.limit(limit)
.all()
.subquery()
)
tasks = db.query(Task).filter(Task.task_group.in_(Select(subquery))).all()
now = datetime.now()
for task in tasks:
@ -51,7 +53,7 @@ async def list_users(db: Session):
return db.query(User).order_by(asc(User.username))
async def get_task(db: Session, task_id: int) -> Task:
async def get_task(db: Session, task_id: int) -> None | Task:
return db.get(Task, task_id)
@ -71,9 +73,9 @@ async def count_tasks(db: Session, selected: None | bool = None):
query = db.query(Task)
if selected is not None:
if selected:
query = query.filter(Task.selected_by is not None)
query = query.filter(Task.selected_by.is_not(None))
else:
query = query.filter(Task.selected_by is None)
query = query.filter(Task.selected_by.is_(None))
return query.count()
@ -82,13 +84,22 @@ async def count_results(db: Session):
return db.query(Result).count()
async def has_config_changed(db: Session, config: schemas.Config) -> bool:
async def has_config_changed(db: Session, config: schemas.Config) -> bool: # pylint: disable-msg=too-many-statements
"""Check if websites config has changed by using a hashsum and a config cache"""
websites_hash = sha256(str(config.websites).encode()).hexdigest()
conf_caches = db.query(ConfigCache).all()
same_config = True
keys = [
"websites_hash",
"general_frequency",
"general_recheck_delay",
"general_retry_before_notification",
"general_ipv4",
"general_ipv6",
]
if conf_caches:
for conf in conf_caches:
keys.remove(conf.name)
match conf.name:
case "websites_hash":
if conf.val != websites_hash:
@ -98,11 +109,74 @@ async def has_config_changed(db: Session, config: schemas.Config) -> bool:
case "general_frequency":
if conf.val != str(config.general.frequency):
same_config = False
conf.val = config.general.frequency
conf.val = str(config.general.frequency)
conf.updated_at = datetime.now()
case "general_recheck_delay":
if conf.val != str(config.general.recheck_delay):
same_config = False
conf.val = str(config.general.recheck_delay)
conf.updated_at = datetime.now()
case "general_retry_before_notification":
if conf.val != str(config.general.retry_before_notification):
same_config = False
conf.val = str(config.general.retry_before_notification)
conf.updated_at = datetime.now()
case "general_ipv4":
if conf.val != str(config.general.ipv4):
same_config = False
conf.val = str(config.general.ipv4)
conf.updated_at = datetime.now()
case "general_ipv6":
if conf.val != str(config.general.ipv6):
same_config = False
conf.val = str(config.general.ipv6)
conf.updated_at = datetime.now()
for i in keys:
match i:
case "websites_hash":
c = ConfigCache(
name="websites_hash",
val=websites_hash,
updated_at=datetime.now(),
)
case "general_frequency":
c = ConfigCache(
name="general_frequency",
val=str(config.general.frequency),
updated_at=datetime.now(),
)
case "general_recheck_delay":
c = ConfigCache(
name="general_recheck_delay",
val=str(config.general.recheck_delay),
updated_at=datetime.now(),
)
case "general_retry_before_notification":
c = ConfigCache(
name="general_retry_before_notification",
val=str(config.general.retry_before_notification),
updated_at=datetime.now(),
)
case "general_ipv4":
c = ConfigCache(
name="general_ipv4",
val=str(config.general.ipv4),
updated_at=datetime.now(),
)
case "general_ipv6":
c = ConfigCache(
name="general_ipv6",
val=str(config.general.ipv6),
updated_at=datetime.now(),
)
db.add(c)
db.commit()
if keys:
return True
if same_config:
return False
@ -115,29 +189,103 @@ async def has_config_changed(db: Session, config: schemas.Config) -> bool:
val=str(config.general.frequency),
updated_at=datetime.now(),
)
gen_recheck = ConfigCache(
name="general_recheck_delay",
val=str(config.general.recheck_delay),
updated_at=datetime.now(),
)
gen_retry_before_notif = ConfigCache(
name="general_retry_before_notification",
val=str(config.general.retry_before_notification),
updated_at=datetime.now(),
)
gen_ipv4 = ConfigCache(
name="general_ipv4",
val=str(config.general.ipv4),
updated_at=datetime.now(),
)
gen_ipv6 = ConfigCache(
name="general_ipv6",
val=str(config.general.ipv6),
updated_at=datetime.now(),
)
db.add(web_hash)
db.add(gen_freq)
db.add(gen_recheck)
db.add(gen_retry_before_notif)
db.add(gen_ipv4)
db.add(gen_ipv6)
db.commit()
return True
async def update_from_config(db: Session, config: schemas.Config):
"""Update tasks from config file"""
config_changed = await has_config_changed(db, config)
if not config_changed:
return {"added": 0, "vanished": 0}
async def update_from_config_later(db: Session, config_file):
"""Ask Argos to reload configuration in a recurring task"""
jobs = (
db.query(Job)
.filter(
Job.todo == "RELOAD_CONFIG",
Job.args == config_file,
Job.current == False,
)
.all()
)
if jobs:
return "There is already a config reloading job in the job queue, for the same file"
job = Job(todo="RELOAD_CONFIG", args=config_file, added_at=datetime.now())
db.add(job)
db.commit()
return "Config reloading has been added in the job queue"
async def process_jobs(db: Session) -> int:
"""Process job queue"""
jobs = db.query(Job).filter(Job.current == False).all()
if jobs:
for job in jobs:
job.current = True
db.commit()
if job.todo == "RELOAD_CONFIG":
logger.info("Processing job %i: %s %s", job.id, job.todo, job.args)
_config = read_config(job.args)
changed = await update_from_config(db, _config)
logger.info("%i task(s) added", changed["added"])
logger.info("%i task(s) deleted", changed["vanished"])
db.delete(job)
db.commit()
return len(jobs)
return 0
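
Note on the job queue above: `update_from_config_later` only enqueues a `RELOAD_CONFIG` job when no pending (not `current`) job exists for the same file, and `process_jobs` marks a job as current before handling it. A minimal in-memory sketch of that deduplication rule follows. The `Job` dataclass and the `enqueue_reload` helper are hypothetical stand-ins for the SQLAlchemy model and query, not the project's code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Job:
    """Hypothetical stand-in for the jobs table row."""

    todo: str
    args: str
    current: bool = False
    added_at: datetime = field(default_factory=datetime.now)


def enqueue_reload(queue: List[Job], config_file: str) -> bool:
    """Add a RELOAD_CONFIG job unless a pending one exists for the same file.

    Returns True when a job was added, False when it was skipped.
    """
    for job in queue:
        if job.todo == "RELOAD_CONFIG" and job.args == config_file and not job.current:
            return False
    queue.append(Job(todo="RELOAD_CONFIG", args=config_file))
    return True
```

A job that is already being processed (`current` is true) does not block a new request, so a reload asked for during processing is still queued.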
async def update_from_config(db: Session, config: schemas.Config): # pylint: disable-msg=too-many-branches
"""Update tasks from config file"""
max_task_id = (
db.query(func.max(Task.id).label("max_id")).all() # pylint: disable-msg=not-callable
)[0].max_id
tasks = []
unique_properties = []
seen_tasks: List[int] = []
for website in config.websites:
for website in config.websites: # pylint: disable-msg=too-many-nested-blocks
domain = str(website.domain)
frequency = website.frequency or config.general.frequency
recheck_delay = website.recheck_delay or config.general.recheck_delay
retry_before_notification = (
website.retry_before_notification
if website.retry_before_notification is not None
else config.general.retry_before_notification
)
ipv4 = website.ipv4 if website.ipv4 is not None else config.general.ipv4
ipv6 = website.ipv6 if website.ipv6 is not None else config.general.ipv6
if ipv4 is False and ipv6 is False:
logger.warning("IPv4 AND IPv6 are disabled on website %s!", domain)
continue
for ip_version in ["4", "6"]:
for p in website.paths:
url = urljoin(domain, str(p.path))
for check_key, expected in p.checks:
@ -146,36 +294,74 @@ async def update_from_config(db: Session, config: schemas.Config):
db.query(Task)
.filter(
Task.url == url,
Task.method == p.method,
Task.request_data == p.request_data,
Task.check == check_key,
Task.expected == expected,
Task.ip_version == ip_version,
)
.all()
)
if (ip_version == "4" and ipv4 is False) or (
ip_version == "6" and ipv6 is False
):
continue
if existing_tasks:
existing_task = existing_tasks[0]
seen_tasks.append(existing_task.id)
if frequency != existing_task.frequency:
existing_task.frequency = frequency
if recheck_delay != existing_task.recheck_delay:
existing_task.recheck_delay = recheck_delay # type: ignore[assignment]
if (
retry_before_notification
!= existing_task.retry_before_notification
):
existing_task.retry_before_notification = (
retry_before_notification
)
logger.debug(
"Skipping db task creation for url=%s, "
"check_key=%s, expected=%s, frequency=%s.",
"method=%s, check_key=%s, expected=%s, "
"frequency=%s, recheck_delay=%s, "
"retry_before_notification=%s, ip_version=%s.",
url,
p.method,
check_key,
expected,
frequency,
recheck_delay,
retry_before_notification,
ip_version,
)
else:
properties = (url, check_key, expected)
properties = (
url,
p.method,
check_key,
expected,
ip_version,
p.request_data,
)
if properties not in unique_properties:
unique_properties.append(properties)
task = Task(
domain=domain,
url=url,
ip_version=ip_version,
method=p.method,
request_data=p.request_data,
check=check_key,
expected=expected,
frequency=frequency,
recheck_delay=recheck_delay,
retry_before_notification=retry_before_notification,
already_retried=False,
)
logger.debug("Adding a new task in the db: %s", task)
tasks.append(task)
@ -192,7 +378,8 @@ async def update_from_config(db: Session, config: schemas.Config):
)
db.commit()
logger.info(
"%i tasks has been removed since not in config file anymore", vanished_tasks
"%i task(s) have been removed since not in config file anymore",
vanished_tasks,
)
return {"added": len(tasks), "vanished": vanished_tasks}
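
Note on the per-website overrides in `update_from_config`: `ipv4`, `ipv6` and `retry_before_notification` fall back to the general setting only when the website value is `None`, so an explicit `False` (or `0`) is kept. The sketch below illustrates that rule and the resulting list of IP versions. The helper names `resolve` and `enabled_ip_versions` are assumptions made for illustration and do not exist in the diff.

```python
from typing import List, Optional


def resolve(website_value: Optional[bool], general_value: bool) -> bool:
    """Return the per-website value unless it is unset (None)."""
    return website_value if website_value is not None else general_value


def enabled_ip_versions(ipv4: bool, ipv6: bool) -> List[str]:
    """List the IP versions for which tasks would be created."""
    versions = []
    if ipv4:
        versions.append("4")
    if ipv6:
        versions.append("6")
    return versions
```

Using `website_value or general_value` instead would silently replace an explicit `False` with the general setting, which is presumably why the diff uses the `is not None` test for these fields.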
@ -208,7 +395,7 @@ async def get_severity_counts(db: Session) -> dict:
# Execute the query and fetch the results
task_counts_by_severity = query.all()
counts_dict = dict(task_counts_by_severity)
counts_dict = dict(task_counts_by_severity) # type: ignore[var-annotated,arg-type]
for key in ("ok", "warning", "critical", "unknown"):
counts_dict.setdefault(key, 0)
return counts_dict
@ -222,26 +409,11 @@ async def reschedule_all(db: Session):
db.commit()
async def remove_old_results(db: Session, max_results: int):
tasks = db.query(Task).all()
deleted = 0
for task in tasks:
# Get the id of the oldest result to keep
subquery = (
db.query(Result.id)
.filter(Result.task_id == task.id)
.order_by(desc(Result.id))
.limit(max_results)
.subquery()
)
min_id = db.query(func.min(subquery.c.id)).scalar() # pylint: disable-msg=not-callable
# Delete all the results older than min_id
if min_id:
deleted += (
db.query(Result)
.where(Result.id < min_id, Result.task_id == task.id)
.delete()
async def remove_old_results(db: Session, max_results_age: float):
"""Remove old results, based on age"""
max_acceptable_time = datetime.now() - timedelta(seconds=max_results_age)
deleted = (
db.query(Result).filter(Result.submitted_at < max_acceptable_time).delete()
)
db.commit()
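
Note on the new `remove_old_results`: results are now dropped when `submitted_at` is older than `now - max_results_age` seconds, rather than by keeping a fixed number per task. A database-free sketch of the same cutoff computation is shown here. The `cutoff` and `split_by_age` helpers are hypothetical and operate on plain timestamps instead of `Result` rows.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def cutoff(max_results_age: float, now: datetime) -> datetime:
    """Oldest acceptable submission time for a result."""
    return now - timedelta(seconds=max_results_age)


def split_by_age(
    submitted: List[datetime], max_results_age: float, now: datetime
) -> Tuple[List[datetime], List[datetime]]:
    """Split timestamps into (kept, removed) using the same strict comparison
    as the query (submitted_at < cutoff means removed)."""
    limit = cutoff(max_results_age, now)
    kept = [ts for ts in submitted if ts >= limit]
    removed = [ts for ts in submitted if ts < limit]
    return kept, removed
```

Because the delete is a single query over all results, it avoids the per-task loop of the previous implementation.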


@ -1,5 +1,5 @@
"""Web interface for machines"""
from typing import List, Union
from typing import List
from fastapi import APIRouter, BackgroundTasks, Depends, Request
from sqlalchemy.orm import Session
@ -7,7 +7,7 @@ from sqlalchemy.orm import Session
from argos.logging import logger
from argos.schemas import AgentResult, Config, Task
from argos.server import queries
from argos.server.alerting import handle_alert
from argos.server.alerting import handle_alert, need_alert
from argos.server.routes.dependencies import get_config, get_db, verify_token
route = APIRouter()
@ -18,22 +18,25 @@ async def read_tasks(
request: Request,
db: Session = Depends(get_db),
limit: int = 10,
agent_id: Union[None, str] = None,
agent_id: None | str = None,
):
"""Return a list of tasks to execute"""
agent_id = agent_id or request.client.host
host = ""
if request.client is not None:
host = request.client.host
agent_id = agent_id or host
tasks = await queries.list_tasks(db, agent_id=agent_id, limit=limit)
return tasks
@route.post("/results", status_code=201, dependencies=[Depends(verify_token)])
async def create_results(
async def create_results( # pylint: disable-msg=too-many-positional-arguments
request: Request,
results: List[AgentResult],
background_tasks: BackgroundTasks,
db: Session = Depends(get_db),
config: Config = Depends(get_config),
agent_id: Union[None, str] = None,
agent_id: None | str = None,
):
"""Get the results from the agents and store them locally.
@ -42,7 +45,10 @@ async def create_results(
- If it's an error, determine its severity ;
- Trigger the reporting calls
"""
agent_id = agent_id or request.client.host
host = ""
if request.client is not None:
host = request.client.host
agent_id = agent_id or host
db_results = []
for agent_result in results:
# XXX Maybe offload this to a queue.
@ -52,16 +58,26 @@ async def create_results(
logger.error("Unable to find task %i", agent_result.task_id)
else:
last_severity = task.severity
last_severity_update = task.last_severity_update
result = await queries.create_result(db, agent_result, agent_id)
check = task.get_check()
status, severity = await check.finalize(config, result, **result.context)
result.set_status(status, severity)
task.set_times_severity_and_deselect(severity, result.submitted_at)
# Don't create an alert if the severity has not changed
if last_severity != severity:
send_notif = need_alert(
last_severity, last_severity_update, severity, status, task
)
if send_notif:
background_tasks.add_task(
handle_alert, config, result, task, severity, last_severity, request
handle_alert,
config,
result,
task,
severity,
last_severity,
request,
)
db_results.append(result)


@ -1,5 +1,8 @@
from fastapi import Depends, HTTPException, Request
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from fastapi_login import LoginManager
from argos.logging import logger
auth_scheme = HTTPBearer()
@ -16,7 +19,10 @@ def get_config(request: Request):
return request.app.state.config
async def get_manager(request: Request):
async def get_manager(request: Request) -> LoginManager:
if request.app.state.config.general.unauthenticated_access is not None:
return await request.app.state.manager.optional(request)
return await request.app.state.manager(request)
@ -27,3 +33,35 @@ async def verify_token(
if token.credentials not in request.app.state.config.service.secrets:
raise HTTPException(status_code=401, detail="Unauthorized")
return token
async def find_ldap_user(config, ldapobj, user: str) -> str | None:
"""Do a LDAP search for user and return its dn"""
import ldap
import ldap.filter as ldap_filter
from ldapurl import LDAP_SCOPE_SUBTREE
try:
ldapobj.simple_bind_s(config.general.ldap.bind_dn, config.general.ldap.bind_pwd)
except ldap.LDAPError as err: # pylint: disable-msg=no-member
logger.error("LDAP error: %s", err)
return None
result = ldapobj.search_s(
config.general.ldap.user_tree,
LDAP_SCOPE_SUBTREE,
filterstr=ldap_filter.filter_format(
f"(&(%s=%s){config.general.ldap.user_filter})",
[
config.general.ldap.user_attr,
user,
],
),
attrlist=[config.general.ldap.user_attr],
)
# If there is a result, there should, logically, be only one entry
if len(result) > 0:
return result[0][0]
return None
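
Note on `find_ldap_user`: the search filter is built with `ldap.filter.filter_format`, which escapes the substituted values so that a crafted username cannot alter the filter. The sketch below reproduces that idea with the standard library only. It is an illustration, not python-ldap's code: `escape_filter_value` and `build_user_filter` are hypothetical names, and the escape table covers only the characters that are special in LDAP filter strings.

```python
# Characters that are special in LDAP filter strings, mapped to their
# backslash-hex escapes.
_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape a value before inserting it into an LDAP filter."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def build_user_filter(user_attr: str, user: str, extra_filter: str = "") -> str:
    """Build "(&(attr=user)extra)" with attr and user escaped.

    extra_filter is trusted configuration and is inserted as-is.
    """
    attr = escape_filter_value(user_attr)
    name = escape_filter_value(user)
    return f"(&({attr}={name}){extra_filter})"
```

With this escaping, a username such as `*)(uid=*` is matched literally instead of widening the search.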


@ -14,8 +14,10 @@ from passlib.context import CryptContext
from sqlalchemy import func
from sqlalchemy.orm import Session
from argos.checks.base import Status
from argos.schemas import Config
from argos.server import queries
from argos.server.exceptions import NotAuthenticatedException
from argos.server.models import Result, Task, User
from argos.server.routes.dependencies import get_config, get_db, get_manager
@ -27,7 +29,17 @@ SEVERITY_LEVELS = {"ok": 1, "warning": 2, "critical": 3, "unknown": 4}
@route.get("/login")
async def login_view(request: Request, msg: str | None = None):
async def login_view(
request: Request,
msg: str | None = None,
config: Config = Depends(get_config),
):
if config.general.unauthenticated_access == "all":
return RedirectResponse(
request.url_for("get_severity_counts_view"),
status_code=status.HTTP_303_SEE_OTHER,
)
token = request.cookies.get("access-token")
if token is not None and token != "":
manager = request.app.state.manager
@ -43,7 +55,14 @@ async def login_view(request: Request, msg: str | None = None):
else:
msg = None
return templates.TemplateResponse("login.html", {"request": request, "msg": msg})
return templates.TemplateResponse(
"login.html",
{
"request": request,
"msg": msg,
"remember": config.general.remember_me_duration,
},
)
@route.post("/login")
@ -51,13 +70,44 @@ async def post_login(
request: Request,
db: Session = Depends(get_db),
data: OAuth2PasswordRequestForm = Depends(),
rememberme: Annotated[str | None, Form()] = None,
config: Config = Depends(get_config),
):
if config.general.unauthenticated_access == "all":
return RedirectResponse(
request.url_for("get_severity_counts_view"),
status_code=status.HTTP_303_SEE_OTHER,
)
username = data.username
user = await queries.get_user(db, username)
invalid_credentials = templates.TemplateResponse(
"login.html",
{"request": request, "msg": "Sorry, invalid username or bad password."},
)
if config.general.ldap is not None:
from ldap import INVALID_CREDENTIALS # pylint: disable-msg=no-name-in-module
from argos.server.routes.dependencies import find_ldap_user
invalid_credentials = templates.TemplateResponse(
"login.html",
{
"request": request,
"msg": "Sorry, invalid username or bad password. "
"Or the LDAP server is unreachable (see logs to verify).",
},
)
ldap_dn = await find_ldap_user(config, request.app.state.ldap, username)
if ldap_dn is None:
return invalid_credentials
try:
request.app.state.ldap.simple_bind_s(ldap_dn, data.password)
except INVALID_CREDENTIALS:
return invalid_credentials
else:
user = await queries.get_user(db, username)
if user is None:
return invalid_credentials
@ -69,19 +119,37 @@ async def post_login(
db.commit()
manager = request.app.state.manager
token = manager.create_access_token(
data={"sub": username}, expires=timedelta(days=7)
)
session_duration = config.general.session_duration
if config.general.remember_me_duration is not None and rememberme == "on":
session_duration = config.general.remember_me_duration
delta = timedelta(minutes=session_duration)
token = manager.create_access_token(data={"sub": username}, expires=delta)
response = RedirectResponse(
request.url_for("get_severity_counts_view"),
status_code=status.HTTP_303_SEE_OTHER,
)
manager.set_cookie(response, token)
response.set_cookie(
key=manager.cookie_name,
value=token,
httponly=True,
samesite="strict",
expires=int(delta.total_seconds()),
)
return response
@route.get("/logout")
async def logout_view(request: Request, user: User | None = Depends(get_manager)):
async def logout_view(
request: Request,
config: Config = Depends(get_config),
user: User | None = Depends(get_manager),
):
if config.general.unauthenticated_access == "all":
return RedirectResponse(
request.url_for("get_severity_counts_view"),
status_code=status.HTTP_303_SEE_OTHER,
)
response = RedirectResponse(
request.url_for("login_view").include_query_params(msg="logout"),
status_code=status.HTTP_303_SEE_OTHER,
@ -111,6 +179,7 @@ async def get_severity_counts_view(
"agents": agents,
"auto_refresh_enabled": auto_refresh_enabled,
"auto_refresh_seconds": auto_refresh_seconds,
"user": user,
},
)
@ -119,13 +188,18 @@ async def get_severity_counts_view(
async def get_domains_view(
request: Request,
user: User | None = Depends(get_manager),
config: Config = Depends(get_config),
db: Session = Depends(get_db),
):
"""Show all tasks and their current state"""
if config.general.unauthenticated_access == "dashboard":
if user is None:
raise NotAuthenticatedException
tasks = db.query(Task).all()
domains_severities = defaultdict(list)
domains_last_checks = defaultdict(list)
domains_last_checks = defaultdict(list) # type: ignore[var-annotated]
for task in tasks:
domain = urlparse(task.url).netloc
@ -162,6 +236,7 @@ async def get_domains_view(
"last_checks": domains_last_checks,
"total_task_count": len(tasks),
"agents": agents,
"user": user,
},
)
@ -171,12 +246,23 @@ async def get_domain_tasks_view(
request: Request,
domain: str,
user: User | None = Depends(get_manager),
config: Config = Depends(get_config),
db: Session = Depends(get_db),
):
"""Show all tasks attached to a domain"""
if config.general.unauthenticated_access == "dashboard":
if user is None:
raise NotAuthenticatedException
tasks = db.query(Task).filter(Task.domain.contains(f"//{domain}")).all()
return templates.TemplateResponse(
"domain.html", {"request": request, "domain": domain, "tasks": tasks}
"domain.html",
{
"request": request,
"domain": domain,
"tasks": tasks,
"user": user,
},
)
@ -185,12 +271,23 @@ async def get_result_view(
request: Request,
result_id: int,
user: User | None = Depends(get_manager),
config: Config = Depends(get_config),
db: Session = Depends(get_db),
):
"""Show the details of a result"""
if config.general.unauthenticated_access == "dashboard":
if user is None:
raise NotAuthenticatedException
result = db.query(Result).get(result_id)
return templates.TemplateResponse(
"result.html", {"request": request, "result": result}
"result.html",
{
"request": request,
"result": result,
"error": Status.ERROR,
"user": user,
},
)
@ -203,6 +300,10 @@ async def get_task_results_view(
config: Config = Depends(get_config),
):
"""Show history of a task's results"""
if config.general.unauthenticated_access == "dashboard":
if user is None:
raise NotAuthenticatedException
results = (
db.query(Result)
.filter(Result.task_id == task_id)
@ -210,6 +311,8 @@ async def get_task_results_view(
.all()
)
task = db.query(Task).get(task_id)
description = ""
if task is not None:
description = task.get_check().get_description(config)
return templates.TemplateResponse(
"results.html",
@ -218,6 +321,8 @@ async def get_task_results_view(
"results": results,
"task": task,
"description": description,
"error": Status.ERROR,
"user": user,
},
)
@ -226,9 +331,14 @@ async def get_task_results_view(
async def get_agents_view(
request: Request,
user: User | None = Depends(get_manager),
config: Config = Depends(get_config),
db: Session = Depends(get_db),
):
"""Show argos agents and the last time the server saw them"""
if config.general.unauthenticated_access == "dashboard":
if user is None:
raise NotAuthenticatedException
last_seen = (
db.query(Result.agent_id, func.max(Result.submitted_at).label("submitted_at"))
.group_by(Result.agent_id)
@ -236,7 +346,12 @@ async def get_agents_view(
)
return templates.TemplateResponse(
"agents.html", {"request": request, "last_seen": last_seen}
"agents.html",
{
"request": request,
"last_seen": last_seen,
"user": user,
},
)
@ -251,8 +366,21 @@ async def set_refresh_cookies_view(
request.url_for("get_severity_counts_view"),
status_code=status.HTTP_303_SEE_OTHER,
)
response.set_cookie(key="auto_refresh_enabled", value=auto_refresh_enabled)
# Cookie age in Chrome can't be more than 400 days
# https://developer.chrome.com/blog/cookie-max-age-expires
delta = int(timedelta(days=400).total_seconds())
response.set_cookie(
key="auto_refresh_seconds", value=max(5, int(auto_refresh_seconds))
key="auto_refresh_enabled",
value=str(auto_refresh_enabled),
httponly=True,
samesite="strict",
expires=delta,
)
response.set_cookie(
key="auto_refresh_seconds",
value=str(max(5, int(auto_refresh_seconds))),
httponly=True,
samesite="strict",
expires=delta,
)
return response
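
Note on cookie lifetimes in this file: `post_login` picks `session_duration` or, when the "remember me" box is ticked and the feature is configured, `remember_me_duration` (both in minutes), then passes the lifetime in seconds to `set_cookie`. The refresh cookies use a fixed 400 days. A small sketch of both computations follows. The `session_delta` helper is a hypothetical name, and the example durations are arbitrary values rather than Argos defaults.

```python
from datetime import timedelta
from typing import Optional


def session_delta(
    session_duration: int,
    remember_me_duration: Optional[int],
    rememberme: Optional[str],
) -> timedelta:
    """Choose the session lifetime (durations are in minutes).

    The form sends rememberme="on" when the checkbox is ticked.
    """
    duration = session_duration
    if remember_me_duration is not None and rememberme == "on":
        duration = remember_me_duration
    return timedelta(minutes=duration)


# Lifetime of the auto-refresh cookies, in seconds (400 days).
REFRESH_COOKIE_SECONDS = int(timedelta(days=400).total_seconds())
```

The same `timedelta` is used for the token expiry and the cookie expiry, which keeps the two in step.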


@ -1,18 +1,32 @@
"""Pydantic schemas for server"""
import sys
from pathlib import Path
import yaml
from yamlinclude import YamlIncludeConstructor
from pydantic import ValidationError
from argos.logging import logger
from argos.schemas.config import Config
def read_yaml_config(filename):
def read_config(yaml_file):
try:
config = read_yaml_config(yaml_file)
return config
except ValidationError as err:
logger.error("Errors were found while reading configuration:")
for error in err.errors():
logger.error("%s is %s", error["loc"], error["type"])
sys.exit(1)
def read_yaml_config(filename: str) -> Config:
parsed = _load_yaml(filename)
return Config(**parsed)
def _load_yaml(filename):
def _load_yaml(filename: str):
base_dir = Path(filename).resolve().parent
YamlIncludeConstructor.add_to_loader_class(
loader_class=yaml.FullLoader, base_dir=str(base_dir)


@ -1,13 +1,26 @@
@import url("pico.min.css");
.display-small {
display: none;
text-align: center;
}
@media (max-width: 767px) {
.display-large {
display: none !important;
}
.display-small {
display: block;
}
.display-small article {
display: inline-block;
width: 24%;
}
}
code {
white-space: pre-wrap;
}
body > header,
body > main {
padding: 0 !important;
}
#title {
margin-bottom: 0;
}
@ -53,3 +66,7 @@ label[for="select-status"] {
#refresh-delay {
max-width: 120px;
}
/* Remove chevron on menu */
#nav-menu summary::after {
background-image: none !important;
}


@ -3,6 +3,10 @@
<head>
<meta charset="utf-8">
<title>Argos</title>
<meta name="description"
content="Argos monitoring">
<meta name="keywords"
content="argos, monitoring">
<link rel="shortcut icon"
href="{{ url_for('static', path='/logo.png') }}">
<meta name="viewport"
@ -12,14 +16,14 @@
{% if auto_refresh_enabled %}
<meta http-equiv="refresh"
content="{{ auto_refresh_seconds }}">
{% endif %}
{%- endif %}
<link rel="stylesheet"
href="{{ url_for('static', path='/styles.css') }}">
</head>
<body>
<header class="container">
<nav>
<a href="{{ url_for('get_severity_counts_view') }}">
<a href="{{ url_for("get_severity_counts_view") }}">
<ul>
<li>
<img src="{{ url_for('static', path='/logo-64.png') }}"
@ -34,9 +38,10 @@
</a>
{% if request.url.remove_query_params('msg') != url_for('login_view') %}
<ul>
<details class="dropdown">
<li>
<details id="nav-menu" class="dropdown">
<summary autofocus>Menu</summary>
<ul>
<ul dir="rtl">
<li>
<a href="{{ url_for('get_severity_counts_view') }}"
class="outline {{ 'contrast' if request.url == url_for('get_severity_counts_view') }}"
@ -58,6 +63,8 @@
Agents
</a>
</li>
{% set unauthenticated_access = request.app.state.config.general.unauthenticated_access %}
{% if (user is defined and user is not none) or unauthenticated_access == "all" %}
<li>
<a href="#"
id="reschedule-all"
@ -67,15 +74,27 @@
Reschedule non-ok checks
</a>
</li>
{% endif %}
{% if user is defined and user is not none %}
<li>
<a href="{{ url_for('logout_view') }}"
class="outline {{ 'contrast' if request.url == url_for('get_agents_view') }}"
class="outline }}"
role="button">
Logout
</a>
</li>
{% elif unauthenticated_access != "all" %}
<li>
<a href="{{ url_for('login_view') }}"
class="outline }}"
role="button">
Login
</a>
</li>
{% endif %}
</ul>
</details>
</li>
</ul>
{% endif %}
</nav>
@ -97,10 +116,11 @@
(<a href="https://framagit.org/framasoft/framaspace/argos">sources</a>)
<br>
API documentation:
<a href="{{ url_for('get_severity_counts_view') }}docs">Swagger</a>
<a href="{{ url_for("get_severity_counts_view") }}docs">Swagger</a>
or
<a href="{{ url_for('get_severity_counts_view') }}redoc">Redoc</a>
<a href="{{ url_for("get_severity_counts_view") }}redoc">Redoc</a>
</footer>
{% if request.url.remove_query_params('msg') != url_for('login_view') %}
<script>
async function rescheduleAll() {
const response = await fetch('{{ url_for("reschedule_all") }}', {method: 'POST'});
@ -115,7 +135,9 @@
document.getElementById('reschedule-all').addEventListener('click', event => {
event.preventDefault();
rescheduleAll();
document.getElementById('nav-menu').open = false;
});
</script>
{% endif %}
</body>
</html>


@ -1,24 +1,23 @@
{% extends "base.html" %}
{% block title %}<h2>{{ domain }}</h2>{% endblock title %}
{% block content %}
<div id="domains" class="frame">
<table id="domains-list" role="grid">
<div id="domains" class="overflow-auto">
<table id="domains-list" role="grid" class="striped">
<thead>
<tr>
<th>URL</th>
<th>Check</th>
<th>Expected</th>
<th>Current status</th>
<th></th>
<th scope="col">URL</th>
<th scope="col">Check</th>
<th scope="col">Current status</th>
<th scope="col">Expected</th>
<th scope="col"></th>
</tr>
</thead>
<tbody id="domains-body">
{% for task in tasks %}
<tr>
<td>{{ task.url }}</td>
<tr scope="row">
<td>{{ task.url }} (IPv{{ task.ip_version }})</td>
<td>{{ task.check }}</td>
<td>{{ task.expected }}</td>
<td class="status highlight">
{% if task.status %}
<a data-tooltip="Completed at {{ task.completed_at }}"
@ -37,6 +36,7 @@
Waiting to be checked
{% endif %}
</td>
<td>{{ task.expected }}</td>
<td><a href="{{ url_for('get_task_results_view', task_id=task.id) }}">view all</a></td>
</tr>
{% endfor %}


@ -12,15 +12,25 @@
</a>
</li>
</ul>
<ul>
{# djlint:off H021 #}
<ul id="js-only" style="display: none; ">{# djlint:on #}
<li>
<input id="domain-search"
type="search"
spellcheck="false"
placeholder="Filter domains list"
aria-label="Filter domains list"
/>
</li>
<li>
<label for="select-status">Show domains with status:</label>
<select id="select-status">
<option value="all">All</option>
<option value="not-ok" selected>Not OK</option>
<option value="ok">✅ OK</option>
<option value="warning">⚠️ Warning</option>
<option value="critical">❌ Critical</option>
<option value="unknown">❔ Unknown</option>
<option value="all">All</option>
</select>
</li>
</ul>
@ -36,7 +46,8 @@
<tbody id="domains-body">
{% for (domain, status) in domains %}
<tr data-status={{ status }}>
<tr data-status="{{ status }}"
data-domain="{{ domain }}">
<td>
<a href="{{ url_for('get_domain_tasks_view', domain=domain) }}">
{{ domain }}
@ -60,20 +71,47 @@
</table>
</div>
<script>
document.getElementById('select-status').addEventListener('change', (e) => {
if (e.currentTarget.value === 'all') {
function filterDomains() {
let status = document.getElementById('select-status');
let filter = document.getElementById('domain-search').value;
console.log(filter)
if (status.value === 'all') {
document.querySelectorAll('[data-status]').forEach((item) => {
if (filter && item.dataset.domain.indexOf(filter) == -1) {
item.style.display = 'none';
} else {
item.style.display = null;
}
})
} else if (status.value === 'not-ok') {
document.querySelectorAll('[data-status]').forEach((item) => {
if (item.dataset.status !== 'ok') {
if (filter && item.dataset.domain.indexOf(filter) == -1) {
item.style.display = 'none';
} else {
item.style.display = null;
}
} else {
item.style.display = 'none';
}
})
} else {
document.querySelectorAll('[data-status]').forEach((item) => {
if (item.dataset.status === e.currentTarget.value) {
if (item.dataset.status === status.value) {
if (filter && item.dataset.domain.indexOf(filter) == -1) {
item.style.display = 'none';
} else {
item.style.display = null;
}
} else {
item.style.display = 'none';
}
})
}
});
}
document.getElementById('select-status').addEventListener('change', filterDomains);
document.getElementById('domain-search').addEventListener('input', filterDomains);
filterDomains()
document.getElementById('js-only').style.display = null;
</script>
{% endblock content %}


@ -1,11 +1,13 @@
{% extends "base.html" %}
{% block title %}<h2>Dashboard</h2>{% endblock title %}
{% block title %}
<h2>Dashboard</h2>
{% endblock title %}
{% block content %}
<div id="domains" class="frame">
<nav>
<ul>
<li>
<a href="{{ url_for('get_agents_view') }}">
<a href="{{ url_for("get_agents_view") }}">
{{ agents | length }} agent{{ 's' if agents | length > 1 }}
</a>
</li>
@ -21,46 +23,77 @@
</li>
<li>
<label class="inline-label">
Every <input id="refresh-delay"
Every
<input id="refresh-delay"
class="initial-width"
name="auto_refresh_seconds"
type="number"
form="refresh-form"
min="5"
value="{{ auto_refresh_seconds }}"> seconds
value="{{ auto_refresh_seconds }}">
seconds
</label>
</li>
<li>
<form id="refresh-form"
method="post"
action="{{ url_for('set_refresh_cookies_view') }}">
action="{{ url_for("set_refresh_cookies_view") }}">
<input type="Submit">
</form>
</li>
</ul>
</nav>
<div class="container">
<div class="grid grid-index">
<div class="display-small">
<article title="Unknown">
<br>
{{ counts_dict['unknown'] }}
</article>
<article title="OK">
<br>
{{ counts_dict['ok'] }}
</article>
<article title="Warning">
⚠️
<br>
{{ counts_dict['warning'] }}
</article>
<article title="Critical">
<br>
{{ counts_dict['critical'] }}
</article>
</div>
<div class="grid grid-index display-large">
<article>
<header title="Unknown"></header>
<header title="Unknown">
</header>
{{ counts_dict['unknown'] }}
</article>
<article>
<header title="OK"></header>
<header title="OK">
</header>
{{ counts_dict['ok'] }}
</article>
<article>
<header title="Warning">⚠️</header>
<header title="Warning">
⚠️
</header>
{{ counts_dict['warning'] }}
</article>
<article>
<header title="Critical"></header>
<header title="Critical">
</header>
{{ counts_dict['critical'] }}
</article>
</div>
<p class="text-center">
<a href="{{ url_for('get_domains_view') }}"
<a href="{{ url_for("get_domains_view") }}"
class="outline"
role="button">
Domains


@ -16,6 +16,14 @@
name="password"
type="password"
form="login">
{% if remember is not none %}
<label>
<input type="checkbox"
name="rememberme"
form="login">
Remember me
</label>
{% endif %}
<form id="login"
method="post"
action="{{ url_for('post_login') }}">


@ -3,7 +3,11 @@
{% block content %}
<dl>
<dt>Task</dt>
<dd>{{ result.task }}</dd>
<dd>
<a href="{{ url_for('get_task_results_view', task_id=result.task.id) }}">
{{ result.task }}
</a>
</dd>
<dt>Submitted at</dt>
<dd>{{ result.submitted_at }}</dd>
<dt>Status</dt>
@ -11,6 +15,26 @@
<dt>Severity</dt>
<dd>{{ result.severity }}</dd>
<dt>Context</dt>
<dd>{{ result.context }}</dd>
<dd>
{% if result.status != error %}
{{ result.context }}
{% else %}
<dl>
{% if result.context['error_message'] %}
<dt>Error message</dt>
<dd>{{ result.context['error_message'] }}</dd>
{% endif %}
<dt>Error type</dt>
<dd>{{ result.context['error_type'] }}</dd>
<dt>Error details</dt>
<dd>
<details>
<summary>{{ result.context['error_details'] | truncate(120, False, '…') }} (click to expand)</summary>
<pre><code>{{ result.context['error_details'] | replace('\n', '<br>') | safe }}</code></pre>
</details>
</dd>
</dl>
{% endif %}
</dd>
</dl>
{% endblock content %}


@ -14,10 +14,34 @@
<tbody>
{% for result in results %}
<tr id="{{ result.id }}">
<td>{{ result.submitted_at }}</td>
<td>
<a href="{{ url_for('get_result_view', result_id=result.id) }}" title="See details of result {{ result.id }}">
{{ result.submitted_at }}
</a>
</td>
<td>{{ result.status }}</td>
<td>{{ result.severity }}</td>
<td>{{ result.context }}</td>
<td>
{% if result.status != error %}
{{ result.context }}
{% else %}
<dl>
{% if result.context["error_message"] %}
<dt>Error message</dt>
<dd>{{ result.context["error_message"] }}</dd>
{% endif %}
<dt>Error type</dt>
<dd>{{ result.context["error_type"] }}</dd>
<dt>Error details</dt>
<dd>
<details>
<summary>{{ result.context["error_details"] | truncate(120, False, "…") }} (click to expand)</summary>
<pre><code>{{ result.context["error_details"] | replace("\n", "<br>") | safe }}</code></pre>
</details>
</dd>
</dl>
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>


@ -0,0 +1,4 @@
location /foo/ {
include proxy_params;
proxy_pass http://127.0.0.1:8000/;
}

docs/_static/fix-nav.css (new file)

@ -0,0 +1,4 @@
.sy-head-brand img + strong {
display: inline;
margin-left: 1em;
}

docs/_static/logo.png (new symbolic link)

@ -0,0 +1 @@
../../argos/server/static/logo.png


@ -1,3 +1,6 @@
---
description: Argos exposes a website and an API. This is how to use the API.
---
# The HTTP API
Argos exposes a website and an API. The website is available at "/" and the API at "/api".


@ -1,2 +1,5 @@
---
description: Last changes in Argos.
---
```{include} ../CHANGELOG.md
```


@ -1,6 +1,9 @@
---
description: Here are the checks that Argos proposes, with a description of what they do and how to configure them.
---
# Checks
At its core, argos runs checks and return the results to the service. Here are the implemented checks, with a description of what they do and how to configure them.
At its core, Argos runs checks and returns the results to the service. Here are the implemented checks, with a description of what they do and how to configure them.
## Simple checks
@ -8,8 +11,18 @@ These checks are the most basic ones. They simply check that the response from t
| Check | Description | Configuration |
| --- | --- | --- |
| `status-is` | Check that the returned status code matches what you expect. | `status-is: "200"` |
| `body-contains` | Check that the returned body contains a given string. | `body-contains: "Hello world"` |
| `status-is` | Check that the returned status code matches what you expect. | <pre><code>status-is: \"200\"</code></pre> |
| `status-in` | Check that the returned status code is in the list of codes you expect. | <pre><code>status-in:<br> - 200<br> - 302</code></pre> |
| `body-contains` | Check that the returned body contains a given string. | <pre><code>body-contains: "Hello world"</code></pre> |
| `body-like` | Check that the returned body matches a given regex. | <pre><code>body-like: "Hel+o w.*"</code></pre> |
| `headers-contain` | Check that the response contains the expected headers. | <pre><code>headers-contain:<br> - "content-encoding"<br> - "content-type"</code></pre> |
| `headers-have` | Check that the response contains the expected headers with the expected value. | <pre><code>headers-have:<br> content-encoding: "gzip"<br> content-type: "text/html"</code></pre> |
| `headers-like` | Check that the response headers contain the expected headers and that their values match the provided regexes. | <pre><code>headers-like:<br> content-encoding: "gzip\|utf"<br> content-type: "text/(html\|css)"</code></pre> |
| `json-contains` | Check that JSON response contains the expected structure. | <pre><code>json-contains:<br> - /foo/bar/0<br> - /timestamp</code></pre> |
| `json-has` | Check that JSON response contains the expected structure and values. | <pre><code>json-has:<br> /maintenance: false<br> /productname: "Nextcloud"</code></pre> |
| `json-like` | Check that JSON response contains the expected structure and that the values match the provided regexes. | <pre><code>json-like:<br> /productname: ".\*cloud"<br> /versionstring: "29\\\\..\*"</code></pre> |
| `json-is` | Check that JSON response is the exact expected JSON object. | <pre><code>json-is: '{"foo": "bar", "baz": 42}'</code></pre> |
| `http-to-https` | Check that the HTTP version of the domain redirects to HTTPS. Multiple choices of configuration. | <pre><code>http-to-https: true<br>http-to-https: 301<br>http-to-https:<br> start: 301<br> stop: 308<br>http-to-https:<br> - 301<br> - 302<br> - 307</code></pre> |
```{code-block} yaml
---
@ -21,6 +34,94 @@ caption: argos-config.yaml
        checks:
          - status-is: 200
          - body-contains: "Hello world"
          - body-like: "Hel+o w.*"
          - headers-contain:
              - "content-encoding"
              - "content-type"
          # Check that there is an HTTP to HTTPS redirection with 3xx status code
          - http-to-https: true
          # Check that there is an HTTP to HTTPS redirection with 301 status code
          - http-to-https: 301
          # Check that there is an HTTP to HTTPS redirection with a status code
          # in the provided range (stop value excluded)
          - http-to-https:
              start: 301
              stop: 308
          # Check that there is an HTTP to HTTPS redirection with a status code
          # in the provided list
          - http-to-https:
              - 301
              - 302
              - 307
      - path: "/foobar"
        checks:
          - status-in:
              - 200
              - 302
          # It's VERY important to respect the 4 spaces indentation here!
          - headers-have:
              content-encoding: "gzip"
              content-type: "text/html"
          # It's VERY important to respect the 4 spaces indentation here!
          # You have to double the escape character \
          - headers-like:
              content-encoding: "gzip|utf"
              content-type: "text/(html|css)"
          - json-contains:
              - /foo/bar/0
              - /timestamp
          # It's VERY important to respect the 4 spaces indentation here!
          - json-has:
              /maintenance: false
              /productname: "Nextcloud"
          # It's VERY important to respect the 4 spaces indentation here!
          # You have to double the escape character \
          - json-like:
              /productname: ".*cloud"
              /versionstring: "29\\..*"
          - json-is: '{"foo": "bar", "baz": 42}'
```
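The four accepted shapes of `http-to-https` can be read as a single predicate on the status code of the redirection. The sketch below is only an illustration of that reading, not Argos's actual implementation: the function name `redirect_status_ok` is hypothetical, and it assumes that `true` means "any 3xx code" and that the `stop` value is excluded, as the comments in the example above say.

```python
def redirect_status_ok(expected, status: int) -> bool:
    """Tell whether `status` satisfies an `http-to-https` setting.

    Hypothetical helper written for this documentation only.
    """
    if expected is True:
        # `http-to-https: true` accepts any 3xx status code
        return 300 <= status < 400
    if isinstance(expected, int):
        # `http-to-https: 301` accepts exactly that code
        return status == expected
    if isinstance(expected, dict):
        # `start`/`stop` describe a range, stop value excluded
        return status in range(expected["start"], expected["stop"])
    # Otherwise, a list of accepted codes
    return status in expected
```

With `start: 301` and `stop: 308`, a 307 redirection is accepted while a 308 is not, because the stop value is excluded.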
## Add data to requests
If you want to specify query parameters, just put them in the path:
```{code-block} yaml
websites:
  - domain: "https://contact.example.org"
    paths:
      - path: "/index.php?action=show_messages"
        method: "GET"
```
If you want, for example, to test a form and send some data to it:
```{code-block} yaml
websites:
  - domain: "https://contact.example.org"
    paths:
      - path: "/"
        method: "POST"
        request_data:
          # These are the data sent to the server: title and msg
          data:
            title: "Hello my friend"
            msg: "How are you today?"
          # To send data as JSON (optional, default is false):
          is_json: true
```
If you need to send some headers in the request:
```{code-block} yaml
websites:
  - domain: "https://contact.example.org"
    paths:
      - path: "/api/mail"
        method: "PUT"
        request_data:
          headers:
            Authorization: "Bearer foo-bar-baz"
```
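To make the effect of `request_data` more concrete, here is a rough sketch of how one `paths` entry could translate into an HTTP request. It is not how Argos builds its requests: `build_request` is a hypothetical helper based on the standard library, and sending the data form-encoded when `is_json` is false is an assumption of this sketch.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(domain: str, path: dict) -> Request:
    """Build (without sending it) the request described by one `paths` entry."""
    request_data = path.get("request_data") or {}
    headers = dict(request_data.get("headers") or {})
    data = request_data.get("data")
    body = None
    if data is not None:
        if request_data.get("is_json", False):
            # is_json: true -> the data is serialized as a JSON document
            body = json.dumps(data).encode()
            headers.setdefault("Content-Type", "application/json")
        else:
            # Assumption: without is_json, the data is sent form-encoded
            body = urlencode(data).encode()
            headers.setdefault("Content-Type", "application/x-www-form-urlencoded")
    return Request(
        domain + path["path"],
        data=body,
        headers=headers,
        method=path.get("method", "GET"),
    )
```

Query parameters need no special handling in this sketch: as in the first example, they are simply part of `path`.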
## SSL certificate expiration


@ -1,3 +1,6 @@
---
description: How to use Argos from the command line.
---
# Command-line interface
<!-- [[[cog
@ -57,7 +60,9 @@ Options:
--max-tasks INTEGER Number of concurrent tasks this agent can run
--wait-time INTEGER Waiting time between two polls on the server
(seconds)
--log-level [DEBUG|INFO|WARNING|ERROR|CRITICAL]
--log-level [debug|info|warning|error|critical]
--user-agent TEXT A custom string to append to the User-Agent
header
--help Show this message and exit.
```
@ -79,14 +84,16 @@ Options:
--help Show this message and exit.
Commands:
cleandb Clean the database (to run routinely)
generate-config Output a self-documented example config file.
generate-token Generate a token for agents
migrate Run database migrations
nagios Nagios compatible severities report
reload-config Load or reload tasks configuration
start Starts the server (use only for testing or development!)
test-apprise Send a test apprise notification
test-gotify Send a test gotify notification
test-mail Send a test email
user User management
watch-agents Watch agents (to run routinely)
```
<!--[[[end]]]
@ -143,65 +150,6 @@ Options:
-->
### Server cleandb
<!--
.. [[[cog
help(["server", "cleandb", "--help"])
.. ]]] -->
```man
Usage: argos server cleandb [OPTIONS]
Clean the database (to run routinely)
- Removes old results from the database.
- Removes locks from tasks that have been locked for too long.
Options:
--max-results INTEGER Number of results per task to keep
--max-lock-seconds INTEGER The number of seconds after which a lock is
considered stale, must be higher than 60 (the
checks have a timeout value of 60 seconds)
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE
environment variable is set, its value will be
used instead. Default value: argos-config.yaml and
/etc/argos/config.yaml as fallback.
--help Show this message and exit.
```
<!--[[[end]]]
-->
### Server watch-agents
<!--
.. [[[cog
help(["server", "cleandb", "--help"])
.. ]]] -->
```man
Usage: argos server cleandb [OPTIONS]
Clean the database (to run routinely)
- Removes old results from the database.
- Removes locks from tasks that have been locked for too long.
Options:
--max-results INTEGER Number of results per task to keep
--max-lock-seconds INTEGER The number of seconds after which a lock is
considered stale, must be higher than 60 (the
checks have a timeout value of 60 seconds)
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE
environment variable is set, its value will be
used instead. Default value: argos-config.yaml and
/etc/argos/config.yaml as fallback.
--help Show this message and exit.
```
<!--[[[end]]]
-->
### Server reload-config
<!--
@ -215,9 +163,14 @@ Usage: argos server reload-config [OPTIONS]
Read tasks configuration and add/delete tasks in database if needed
Options:
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE environment
variable is set, its value will be used instead. Default value:
argos-config.yaml and /etc/argos/config.yaml as fallback.
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE
environment variable is set, its value will be used
instead. Default value: argos-config.yaml and
/etc/argos/config.yaml as fallback.
--enqueue / --no-enqueue Let Argos main recurring tasks handle
configurations loading. It may delay the
application of the new configuration up to 2
minutes. Default is --no-enqueue
--help Show this message and exit.
```
@ -269,9 +222,15 @@ Options:
### Server user management
To access Argos web interface, you need to create at least one user.
You can choose to protect Argos web interface with a user system, in which case you'll need to create at least one user.
You can manage users only through CLI.
See [`unauthenticated_access` in the configuration file](configuration.md) to allow partial or total unauthenticated access to Argos.
See [`ldap` in the configuration file](configuration.md) to authenticate users against an LDAP server instead of Argos database.
You can manage Argos users only through CLI.
NB: you can't manage the LDAP users with Argos.
<!--
.. [[[cog
@ -465,3 +424,104 @@ Options:
<!--[[[end]]]
-->
### Use as a nagios probe
You can directly use Argos to get an output and an exit code usable with Nagios.
<!--
.. [[[cog
help(["server", "nagios", "--help"])
.. ]]] -->
```man
Usage: argos server nagios [OPTIONS]
Output a report of current severities suitable for Nagios with a Nagios
compatible exit code
Options:
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE environment
variable is set, its value will be used instead.
--help Show this message and exit.
```
<!--[[[end]]]
-->
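For readers unfamiliar with Nagios probes, the exit code is what carries the state: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The sketch below shows one way a list of severities could be reduced to such an exit code. It is not the code of `argos server nagios`: the function name and the precedence order between severities are assumptions made for the illustration.

```python
# Standard Nagios plugin exit codes.
EXIT_CODES = {"ok": 0, "warning": 1, "critical": 2, "unknown": 3}

# Assumed precedence when several severities are present (worst first).
PRECEDENCE = ("critical", "warning", "unknown", "ok")


def nagios_exit_code(severities) -> int:
    """Return the exit code of the worst severity found in `severities`."""
    found = set(severities)
    for severity in PRECEDENCE:
        if severity in found:
            return EXIT_CODES[severity]
    # Nothing to report: treated as OK in this sketch.
    return EXIT_CODES["ok"]
```

A probe script would then end with something like `sys.exit(nagios_exit_code(current_severities))`, where `current_severities` is whatever list of severities the script collected.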
### Test the email settings
You can verify that your mail settings are ok by sending a test email.
<!--
.. [[[cog
help(["server", "test-mail", "--help"])
.. ]]] -->
```man
Usage: argos server test-mail [OPTIONS]
Send a test email
Options:
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE
environment variable is set, its value will be used instead.
--domain TEXT Domain for the notification
--severity TEXT Severity
--help Show this message and exit.
```
<!--[[[end]]]
-->
### Test the Gotify settings
You can verify that your Gotify settings are ok by sending a test notification.
<!--
.. [[[cog
help(["server", "test-gotify", "--help"])
.. ]]] -->
```man
Usage: argos server test-gotify [OPTIONS]
Send a test gotify notification
Options:
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE
environment variable is set, its value will be used instead.
--domain TEXT Domain for the notification
--severity TEXT Severity
--help Show this message and exit.
```
<!--[[[end]]]
-->
### Test the Apprise settings
You can verify that your Apprise settings are ok by sending a test notification.
<!--
.. [[[cog
help(["server", "test-apprise", "--help"])
.. ]]] -->
```man
Usage: argos server test-apprise [OPTIONS]
Send a test apprise notification
Options:
--config TEXT Path of the configuration file. If ARGOS_YAML_FILE
environment variable is set, its value will be used
instead.
--domain TEXT Domain for the notification
--severity TEXT Severity
--apprise-group TEXT Apprise group for the notification [required]
--help Show this message and exit.
```
<!--[[[end]]]
-->


@ -6,9 +6,11 @@
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
# pylint: disable-msg=invalid-name,redefined-builtin
from os import environ
import argos
project = "Argos"
project = "Argos monitoring"
copyright = "2023, Alexis Métaireau, Framasoft"
author = "Alexis Métaireau, Framasoft"
release = argos.VERSION
@ -33,6 +35,15 @@ html_sidebars = {
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
smartquotes = False
if "CI_JOB_ID" in environ:
html_baseurl = "https://argos-monitoring.framasoft.org"
html_theme = "shibuya"
html_static_path = ["_static"]
html_css_files = ["fonts.css"]
html_css_files = ["fonts.css", "fix-nav.css"]
html_logo = "_static/logo.png"
html_theme_options = {
"og_image_url": "https://argos-monitoring.framasoft.org/_static/logo.png"
}


@ -1,7 +1,12 @@
---
description: How to configure Argos.
---
# Configuration
Argos uses a simple YAML configuration file to define the server's configuration, the websites to monitor and the checks to run on these websites.
See [here](checks.md) for more information about the checks you can use.
Here is a simple self-documented configuration file, which you can get with [`argos server generate-config`](cli.md#server-generate-config):
```{literalinclude} ../conf/config-example.yaml


@ -1,3 +1,6 @@
---
description: How to configure Nginx to use with Argos.
---
# Using Nginx as reverse proxy
Here is an example for Nginx configuration:
@ -6,3 +9,11 @@ Here is a example for Nginx configuration:
caption: /etc/nginx/sites-available/argos.example.org
---
```
If you want to use Argos under a subdirectory of your web server, you'll need to set the `root_path` setting in Argos's [configuration](../configuration.md) and set Nginx like this:
```{literalinclude} ../../conf/nginx-subdirectory.conf
---
caption: Nginx's location for Argos in a subdirectory
---
```


@ -1,3 +1,6 @@
---
description: Here are the systemd files that can be used to deploy the server and the agents.
---
# Using systemd
Here are the systemd files that can be used to deploy the server and the agents.


@ -1,3 +1,6 @@
---
description: Many thanks to their developers!
---
# Main dependencies used by Argos
## Python packages
@ -11,7 +14,9 @@
- [Alembic](https://alembic.sqlalchemy.org) is used for DB migrations;
- [Tenacity](https://github.com/jd/tenacity) a small utility to retry a function in case an error occurred;
- [Uvicorn](https://www.uvicorn.org/) is the tool used to run our server;
- [Gunicorn](https://gunicorn.org/) is the recommended WSGI HTTP server for production.
- [Gunicorn](https://gunicorn.org/) is the recommended WSGI HTTP server for production;
- [Apprise](https://github.com/caronc/apprise/wiki) allows Argos to send notifications through a lot of channels;
- [FastAPI Utilities](https://fastapiutils.github.io/fastapi-utils/) is in charge of recurring tasks.
## CSS framework


@ -1,3 +1,6 @@
---
description: All you need to know to develop on Argos.
---
# Installing for development
To install everything you need to develop on Argos, run:


@ -1,3 +1,6 @@
---
description: Argos is licensed under the terms of the GNU AFFERO GPLv3.
---
# License
Argos is licensed under the terms of the GNU AFFERO GPLv3.


@ -1,3 +1,6 @@
---
description: How to use Alembic to add a database migration to Argos.
---
# Adding a database migration
We are using [Alembic](https://alembic.sqlalchemy.org) to handle the database
@ -7,7 +10,12 @@ First, do your changes in the code, change the model, add new tables, etc. Once
you're done, you can create a new migration.
```bash
venv/bin/alembic -c argos/server/migrations/alembic.ini revision --autogenerate -m "migration reason"
venv/bin/alembic -c argos/server/migrations/alembic.ini revision \
--autogenerate -m "migration reason"
```
Edit the created file to remove comments and adapt it to make sure the migration is complete (Alembic is not powerful enough to cover all the corner cases).
In case you want to add an `Enum` type and use it in an existing table, please have a look at [`argos/server/migrations/versions/dcf73fa19fce_specify_check_method.py`](https://framagit.org/framasoft/framaspace/argos/-/blob/main/argos/server/migrations/versions/dcf73fa19fce_specify_check_method.py).
If you want to add an `Enum` type in a new table, you can follow the example of [`argos/server/migrations/versions/7d480e6f1112_initial_migrations.py`](https://framagit.org/framasoft/framaspace/argos/-/blob/main/argos/server/migrations/versions/7d480e6f1112_initial_migrations.py).


@ -1,3 +1,6 @@
---
description: What's in the database?
---
# The data model
```{mermaid}
@ -25,6 +28,19 @@ class Result{
- severity
- context
}
class ConfigCache {
- name
- val
- updated_at
}
class User {
- username
- password
- disabled
- created_at
- updated_at
- last_login_at
}
Result "*" o-- "1" Task : has many
```


@ -1,3 +1,6 @@
---
description: Don't worry, creating a new check is quite easy.
---
# Implementing a new check
## Creating a new check class
@ -38,3 +41,7 @@ If that's your case, you can implement the `finalize` method, and return some ex
# You can use the extra_arg here to determine the severity
return Status.SUCCESS, Severity.OK
```
## Document the new check
Please, document the use of the new check in `docs/checks.md` and `argos/config-example.yaml`.


@ -1,3 +1,6 @@
---
description: Adding a new notification way is quite simple.
---
# Add a notification way
Adding a new notification way is quite simple.


@ -1,3 +1,6 @@
---
description: An agent and a server, that's all.
---
# Technical overview
Argos uses an agent and server architecture. The server is responsible for storing the configuration and the results of the checks. The agent is responsible for running the checks and sending the results to the server.


@ -1,3 +1,6 @@
---
description: Once in a while, we release this package. Here is how.
---
# Releasing guide
Once in a while, we release this package. Here is how.
@ -23,18 +26,23 @@ git checkout main
# Ensure the tests run correctly
make test
# Check static typing
make mypy
# Bump the version, according to semantic versioning
hatch version minor # or `hatch version major`, or `hatch version fix`
# Modify the changelog
editor CHANGELOG.md
# Update the changelog
sed -e "s/## .Unreleased./&\n\n## $(hatch version)\n\nDate: $(date +%F)/" \
-i CHANGELOG.md
# Commit the change
git add argos/__init__.py CHANGELOG.md
git commit -m "🏷 — Bump version ($(hatch version))"
# Create a tag on the git repository and push it
git tag "$(hatch version)" && git push
git tag "$(hatch version)" -m "$(hatch version)" &&
git push --follow-tags
# Build the project
hatch build --clean
@ -45,9 +53,6 @@ hatch publish
Additionally, ensure it works well in a new environment.
Then go to <https://framagit.org/framasoft/framaspace/argos/-/releases> to create a new release for the new tag.
Use CHANGELOG.md content for that.
## Bumping the version number
We follow semantic versioning conventions, and the version is specified in the `argos/__init__.py` file.
@ -86,7 +91,7 @@ If you're still experimenting, you can use the [Test PyPI](https://test.pypi.org
```bash
# Publishing on test PyPI
hatch build -r test
hatch publish -r test
# Installing from test PyPI
pip install --index-url https://test.pypi.org/simple/ argos-monitoring


@ -1,3 +1,6 @@
---
description: Depending on your setup, you might need different tools to develop on argos.
---
# Requirements
Depending on your setup, you might need different tools to develop on argos. We try to list them here.


@ -1,3 +1,6 @@
---
description: Launch tests! Make linting tools happy!
---
# Tests and linting
## Tests
@ -19,3 +22,8 @@ You can launch all of them with:
```bash
make lint
```
To let `ruff` format the code, run:
```bash
make ruff-format
```


@ -1,3 +1,6 @@
---
description: Soooo many questions…
---
# FAQ
## How is it different than Nagios?


@ -1,3 +1,6 @@
---
description: A monitoring and status board for websites. Test how your websites respond to external checks, get notified when something goes wrong.
---
# Argos monitoring
A monitoring and status board for websites.


@ -1,3 +1,6 @@
---
description: Install Argos, with all the explanations you want.
---
# Installation
NB: if you want a quick-installation guide, we [got you covered](tl-dr.md).
@ -7,6 +10,14 @@ NB: if you want a quick-installation guide, we [got you covered](tl-dr.md).
- Python 3.11+
- PostgreSQL 13+ (for production)
### Optional dependencies
If you want to use LDAP authentication, you will need to install some packages (here for a Debian-based system):
```bash
apt-get install build-essential python3-dev libldap-dev libsasl2-dev
```
## Recommendation
Create a dedicated user for argos:
@ -42,6 +53,18 @@ For production, we recommend the use of [Gunicorn](https://gunicorn.org/), which
pip install "argos-monitoring[gunicorn]"
```
If you want to use LDAP authentication, you'll need to install Argos this way:
```bash
pip install "argos-monitoring[ldap]"
```
And for an installation with Gunicorn and LDAP authentication:
```bash
pip install "argos-monitoring[gunicorn,ldap]"
```
## Install from sources
Once you got the source locally, create a virtualenv and install the dependencies:
@ -168,18 +191,6 @@ The only requirement is that the agent can reach the server through HTTP or HTTP
argos agent http://localhost:8000 "auth-token"
```
## Cleaning the database
You have to run cleaning task periodically. `argos server cleandb --help` will give you more information on how to do that.
Here is a crontab example, which will clean the db each hour:
```bash
# Run the cleaning tasks every hour (at minute 7)
# Keeps 10 results per task, and remove tasks locks older than 1 hour
7 * * * * argos server cleandb --max-results 10 --max-lock-seconds 3600
```
## Watch the agents
In order to be sure that agents are up and communicate with the server, you can periodically run the `argos server watch-agents` command.


@ -1,3 +1,6 @@
---
description: Here are a few steps for you to install PostgreSQL on your system.
---
# Install and configure PostgreSQL
Here are a few steps for you to install PostgreSQL on your system:


@ -1,3 +1,6 @@
---
description: You want to install Argos fast? Ok, here we go.
---
# TL;DR: fast installation instructions
You want to install Argos fast? Ok, here we go.
@ -87,13 +90,13 @@ User=argos
WorkingDirectory=/opt/argos/
EnvironmentFile=/etc/default/argos-server
ExecStartPre=/opt/argos/venv/bin/argos server migrate
ExecStartPre=/opt/argos/venv/bin/argos server reload-config
ExecStartPre=/opt/argos/venv/bin/argos server reload-config --enqueue
ExecStart=/opt/argos/venv/bin/gunicorn "argos.server.main:get_application()" \\
--workers \$ARGOS_SERVER_WORKERS \\
--worker-class uvicorn.workers.UvicornWorker \\
--bind \$ARGOS_SERVER_SOCKET \\
--forwarded-allow-ips \$ARGOS_SERVER_FORWARDED_ALLOW_IPS
ExecReload=/opt/argos/venv/bin/argos server reload-config
ExecReload=/opt/argos/venv/bin/argos server reload-config --enqueue
SyslogIdentifier=argos-server
[Install]
@ -150,8 +153,7 @@ If all works well, you have to put some cron tasks in `argos` crontab:
```bash
cat <<EOF | crontab -u argos -
*/10 * * * * /opt/argos/venv/bin/argos server cleandb --max-lock-seconds 120 --max-results 1200
*/10 * * * * /opt/argos/venv/bin/argos server watch-agents --time-without-agent 10
*/10 * * * * /opt/argos/venv/bin/argos server watch-agents --time-without-agent 10
EOF
```


@ -22,13 +22,18 @@ classifiers = [
dependencies = [
"alembic>=1.13.0,<1.14",
"apprise>=1.9.0,<2",
"bcrypt>=4.1.3,<5",
"click>=8.1,<9",
"durations-nlp>=1.0.1,<2",
"fastapi>=0.103,<0.104",
"fastapi-login>=1.10.0,<2",
"httpx>=0.25,<1",
"fastapi-utils>=0.8.0,<0.9",
"httpx>=0.27.2,<0.28.0",
"Jinja2>=3.0,<4",
"jsonpointer>=3.0,<4",
"passlib>=1.7.4,<2",
"psutil>=5.9.8,<6",
"psycopg2-binary>=2.9,<3",
"pydantic[email]>=2.4,<3",
"pydantic-settings>=2.0,<3",
@ -38,6 +43,7 @@ dependencies = [
"sqlalchemy[asyncio]>=2.0,<3",
"sqlalchemy-utils>=0.41,<1",
"tenacity>=8.2,<9",
"typing_inspect>=0.9.0,<1",
"uvicorn>=0.23,<1",
]
@ -45,16 +51,18 @@ dependencies = [
dev = [
"black==23.3.0",
"djlint>=1.34.0",
"hatch==1.13.0",
"ipdb>=0.13,<0.14",
"ipython>=8.16,<9",
"isort==5.11.5",
"pylint>=3.0.2",
"mypy>=1.10.0,<2",
"pylint>=3.2.5",
"pytest-asyncio>=0.21,<1",
"pytest>=6.2.5",
"respx>=0.20,<1",
"ruff==0.1.5,<1",
"sphinx-autobuild",
"hatch==1.9.4",
"types-PyYAML",
]
docs = [
"cogapp",
@ -67,6 +75,9 @@ docs = [
gunicorn = [
"gunicorn>=21.2,<22",
]
ldap = [
"python-ldap>=3.4.4,<4",
]
[project.urls]
homepage = "https://argos-monitoring.framasoft.org/"
@ -103,3 +114,6 @@ filterwarnings = [
"ignore:'crypt' is deprecated and slated for removal in Python 3.13:DeprecationWarning",
"ignore:The 'app' shortcut is now deprecated:DeprecationWarning",
]
[tool.mypy]
ignore_missing_imports = "True"


@ -1,9 +1,21 @@
---
general:
# Except for frequency and recheck_delay settings, changes in general
# section of the configuration will need a restart of argos server.
db:
# The database URL, as defined in SQLAlchemy docs : https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls
# The database URL, as defined in SQLAlchemy docs:
# https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls
url: "sqlite:////tmp/test-argos.db"
# Can be "production", "dev", "test".
# If not present, default value is "production"
env: test
# To get a good string for cookie_secret, run:
# openssl rand -hex 32
cookie_secret: "foo-bar-baz"
# Default delay for checks.
# Can be superseded in domain configuration.
# For ex., to run checks every minute:
frequency: "1m"
alerts:
ok:
@ -14,12 +26,37 @@ general:
- local
unknown:
- local
no_agent:
- local
service:
secrets:
# Secrets can be generated using `argos server generate-token`.
# You need at least one. Write them as a list, like:
# - secret_token
- "O4kt8Max9/k0EmHaEJ0CGGYbBNFmK8kOZNIoUk3Kjwc"
- "x1T1VZR51pxrv5pQUyzooMG4pMUvHNMhA5y/3cUsYVs="
ssl:
thresholds:
- "1d": critical
"5d": warning
- "5d": warning
# Argos will execute some tasks in the background for you
# every 2 minutes and needs some configuration for that
recurring_tasks:
# Maximum age of results
# Use m for minutes, h for hours, d for days
# w for weeks, M for months, y for years
# See https://github.com/timwedde/durations_nlp#scales-reference for details
max_results_age: "1d"
# Max number of seconds a task can be locked
# Minimum value is 61, default is 100
max_lock_seconds: 100
# Max number of seconds without seeing an agent
# before sending an alert
# Minimum value is 61, default is 300
time_without_agent: 300
# It's also possible to define the checks in another file
# with the include syntax:
#
websites: !include websites.yaml


@ -10,7 +10,7 @@ os.environ["ARGOS_YAML_FILE"] = "tests/config.yaml"
@pytest.fixture
def db() -> Session:
def db() -> Session: # type: ignore[misc]
from argos.server import models
app = _create_app()
@ -20,7 +20,7 @@ def db() -> Session:
@pytest.fixture
def app() -> FastAPI:
def app() -> FastAPI: # type: ignore[misc]
from argos.server import models
app = _create_app()


@ -21,7 +21,7 @@ def test_tasks_retrieval_and_results(authorized_client, app):
assert response.status_code == 200
tasks = response.json()
assert len(tasks) == 2
assert len(tasks) == 4
results = []
for task in tasks:
@ -33,7 +33,7 @@ def test_tasks_retrieval_and_results(authorized_client, app):
response = client.post("/api/results", json=data)
assert response.status_code == 201
assert app.state.db.query(models.Result).count() == 2
assert app.state.db.query(models.Result).count() == 4
# The list of tasks should be empty now
response = client.get("/api/tasks")
@ -60,6 +60,8 @@ def ssl_task(db):
task = models.Task(
url="https://exemple.com/",
domain="https://exemple.com/",
ip_version="6",
method="GET",
check="ssl-certificate-expiration",
expected="on-check",
frequency=1,


@ -35,7 +35,13 @@ def ssl_task(now):
id=1,
url="https://example.org",
domain="https://example.org",
ip_version="6",
method="GET",
request_data=None,
task_group="GET-6-https://example.org",
check="ssl-certificate-expiration",
retry_before_notification=0,
contiguous_failures=0,
expected="on-check",
selected_at=now,
selected_by="pytest",
@ -51,6 +57,9 @@ async def test_ssl_check_accepts_statuts(
return_value=httpx.Response(http_status, extensions=httpx_extensions_ssl),
)
async with httpx.AsyncClient() as client:
check = SSLCertificateExpiration(client, ssl_task)
check_response = await check.run()
check = SSLCertificateExpiration(ssl_task)
response = await client.request(
method=ssl_task.method, url=ssl_task.url, timeout=60
)
check_response = await check.run(response)
assert check_response.status == "on-check"


@ -10,9 +10,9 @@ from argos.server.models import Result, Task, User
@pytest.mark.asyncio
async def test_remove_old_results(db, ten_tasks): # pylint: disable-msg=redefined-outer-name
for _task in ten_tasks:
for _ in range(5):
for iterator in range(5):
result = Result(
submitted_at=datetime.now(),
submitted_at=datetime.now() - timedelta(seconds=iterator * 2),
status="success",
context={"foo": "bar"},
task=_task,
@ -24,12 +24,12 @@ async def test_remove_old_results(db, ten_tasks): # pylint: disable-msg=redefi
# So we have 5 results per tasks
assert db.query(Result).count() == 50
# Keep only 2
deleted = await queries.remove_old_results(db, 2)
assert deleted == 30
assert db.query(Result).count() == 20
# Keep only those newer than 6 seconds ago
deleted = await queries.remove_old_results(db, 6)
assert deleted == 20
assert db.query(Result).count() == 30
for _task in ten_tasks:
assert db.query(Result).filter(Result.task == _task).count() == 2
assert db.query(Result).filter(Result.task == _task).count() == 3
@pytest.mark.asyncio
@ -70,7 +70,7 @@ async def test_update_from_config_with_duplicate_tasks(db, empty_config): # py
await queries.update_from_config(db, empty_config)
# Only one path has been saved in the database
assert db.query(Task).count() == 1
assert db.query(Task).count() == 2
# Calling again with the same data works, and will not result in more tasks being
# created.
@ -87,6 +87,7 @@ async def test_update_from_config_db_can_remove_duplicates_and_old_tasks(
same_task = Task(
url=task.url,
domain=task.domain,
ip_version="6",
check=task.check,
expected=task.expected,
frequency=task.frequency,
@ -108,7 +109,7 @@ async def test_update_from_config_db_can_remove_duplicates_and_old_tasks(
empty_config.websites = [website]
await queries.update_from_config(db, empty_config)
assert db.query(Task).count() == 2
assert db.query(Task).count() == 4
website = schemas.config.Website(
domain=task.domain,
@ -122,7 +123,7 @@ async def test_update_from_config_db_can_remove_duplicates_and_old_tasks(
empty_config.websites = [website]
await queries.update_from_config(db, empty_config)
assert db.query(Task).count() == 1
assert db.query(Task).count() == 2
@pytest.mark.asyncio
@ -136,7 +137,7 @@ async def test_update_from_config_db_updates_existing_tasks(db, empty_config, ta
empty_config.websites = [website]
await queries.update_from_config(db, empty_config)
assert db.query(Task).count() == 1
assert db.query(Task).count() == 2
@pytest.mark.asyncio
@ -212,6 +213,7 @@ def task(db):
_task = Task(
url="https://www.example.com",
domain="https://www.example.com",
ip_version="6",
check="body-contains",
expected="foo",
frequency=1,
@ -233,6 +235,7 @@ def empty_config():
warning=["", ""],
critical=["", ""],
unknown=["", ""],
no_agent=["", ""],
),
),
service=schemas.config.Service(
@ -241,6 +244,11 @@ def empty_config():
]
),
ssl=schemas.config.SSL(thresholds=[]),
recurring_tasks=schemas.config.RecurringTasks(
max_results_age="6s",
max_lock_seconds=120,
time_without_agent=300,
),
websites=[],
)
@ -271,6 +279,7 @@ def ten_locked_tasks(db):
_task = Task(
url="https://www.example.com",
domain="example.com",
ip_version="6",
check="body-contains",
expected="foo",
frequency=1,
@ -291,6 +300,7 @@ def ten_tasks(db):
_task = Task(
url="https://www.example.com",
domain="example.com",
ip_version="6",
check="body-contains",
expected="foo",
frequency=1,
@ -311,6 +321,7 @@ def ten_warning_tasks(db):
_task = Task(
url="https://www.example.com",
domain="example.com",
ip_version="6",
check="body-contains",
expected="foo",
frequency=1,
@ -331,6 +342,7 @@ def ten_critical_tasks(db):
_task = Task(
url="https://www.example.com",
domain="example.com",
ip_version="6",
check="body-contains",
expected="foo",
frequency=1,
@ -351,6 +363,7 @@ def ten_ok_tasks(db):
_task = Task(
url="https://www.example.com",
domain="example.com",
ip_version="6",
check="body-contains",
expected="foo",
frequency=1,


@ -1,51 +0,0 @@
import pytest
from argos.schemas.utils import string_to_duration
def test_string_to_duration_days():
assert string_to_duration("1d", target="days") == 1
assert string_to_duration("1w", target="days") == 7
assert string_to_duration("3w", target="days") == 21
assert string_to_duration("3mo", target="days") == 90
assert string_to_duration("1y", target="days") == 365
with pytest.raises(ValueError):
string_to_duration("3h", target="days")
with pytest.raises(ValueError):
string_to_duration("1", target="days")
def test_string_to_duration_hours():
assert string_to_duration("1h", target="hours") == 1
assert string_to_duration("1d", target="hours") == 24
assert string_to_duration("1w", target="hours") == 7 * 24
assert string_to_duration("3w", target="hours") == 21 * 24
assert string_to_duration("3mo", target="hours") == 3 * 30 * 24
with pytest.raises(ValueError):
string_to_duration("1", target="hours")
def test_string_to_duration_minutes():
assert string_to_duration("1m", target="minutes") == 1
assert string_to_duration("1h", target="minutes") == 60
assert string_to_duration("1d", target="minutes") == 60 * 24
assert string_to_duration("3mo", target="minutes") == 60 * 24 * 30 * 3
with pytest.raises(ValueError):
string_to_duration("1", target="minutes")
def test_conversion_to_greater_units_throws():
# hours and minutes cannot be converted to days
with pytest.raises(ValueError):
string_to_duration("1h", target="days")
with pytest.raises(ValueError):
string_to_duration("1m", target="days")
# minutes cannot be converted to hours
with pytest.raises(ValueError):
string_to_duration("1m", target="hours")


@ -1,3 +1,4 @@
---
- domain: "https://mypads.framapad.org"
paths:
- path: "/mypads/"