Mirror of https://framagit.org/framasoft/framaspace/argos.git (synced 2025-04-28 09:52:38 +02:00)
Monitoring tool for Framaspace.
[Online documentation](https://argos-monitoring.framasoft.org/)
- Removed the `Definition` class and added the `Task` class. It contains all information needed to run the jobs on the workers.
- Added the `Result` class. It stores the results returned by workers.
- In queries.py, updated the `update_from_config` function. Now it checks for the existence of tasks with the same URL, check, and expected result before adding new ones.
Argos
🚧 This is still a work in progress. It does not work yet, so don't try to install it! 🚧
Argos is an HTTP monitoring service. It's meant to be simple to configure and simple to use.
Features:
- Uses `.yaml` files for configuration;
- Reads the configuration file and converts it to tasks;
- Stores tasks in a database;
- Checks can be distributed on the network thanks to a job queue;
- Multiple paths per website can be tested;
- Handles multiple alerting backends (email, SMS, Gotify);
- Exposes an HTTP API that can be consumed by other systems;
- Exposes a simple read-only website.
Implemented checks:
- Returned status code matches what you expect;
- Returned body matches what you expect;
- SSL certificate expires in more than X days.
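The three checks above can be sketched as small pure functions. This is only an illustration: the function names are hypothetical, "body matches" is assumed here to mean "contains the expected text", and the certificate expiry date is assumed to be already extracted as a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def check_status(actual_status: int, expected_status: int) -> bool:
    """Pass when the returned status code equals the expected one."""
    return actual_status == expected_status


def check_body(body: str, expected: str) -> bool:
    """Pass when the expected text appears in the returned body (assumed semantics)."""
    return expected in body


def check_ssl_expiry(
    not_after: datetime, min_days: int, now: Optional[datetime] = None
) -> bool:
    """Pass when the certificate expires in more than `min_days` days."""
    now = now or datetime.now(timezone.utc)
    return not_after - now > timedelta(days=min_days)
```

Keeping the checks free of network code would make them easy to test in isolation; the worker would be responsible for fetching the response and the certificate.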
Development notes
On service start:
- Read the job definitions file and populate the database.
- From the job definitions, create a list of tasks to execute.
- From time to time (?), clean the database.
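A minimal sketch of the first two startup steps, under loud assumptions: the configuration layout (`websites`, `domain`, `paths`, `checks`) and the check names are hypothetical, a plain dict stands in for the parsed `.yaml` file, and a Python set stands in for the database. The helper `add_missing_tasks` is an illustrative name, not the project's function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # A task is identified by its URL, its check and its expected result.
    url: str
    check: str
    expected: str


def tasks_from_config(config: dict) -> list[Task]:
    """Flatten the (hypothetical) config layout into one Task per check."""
    tasks = []
    for website in config.get("websites", []):
        domain = website["domain"].rstrip("/")
        for path in website.get("paths", []):
            url = domain + path["path"]
            for check in path.get("checks", []):
                for name, expected in check.items():
                    tasks.append(Task(url, name, str(expected)))
    return tasks


def add_missing_tasks(store: set, config: dict) -> list[Task]:
    """Add only the tasks not already stored; return the newly added ones."""
    added = []
    for task in tasks_from_config(config):
        if task not in store:
            store.add(task)
            added.append(task)
    return added
```

Because a task is keyed on URL, check and expected result, reading the same configuration twice adds nothing the second time.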
On configuration changes:
- Find and tombstone the JobDefinitions that are no longer needed.
- Cascade delete the child tasks that are planned. Tombstone them as well.
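One way to picture the tombstoning is a soft delete: instead of removing rows, a `deleted_at` timestamp is set on the definition and on its planned tasks. The sketch below works on in-memory objects; the field names (`key`, `deleted_at`) and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PlannedTask:
    id: int
    deleted_at: Optional[datetime] = None  # tombstone marker


@dataclass
class JobDefinition:
    key: str
    tasks: list = field(default_factory=list)
    deleted_at: Optional[datetime] = None  # tombstone marker


def tombstone_stale(definitions, active_keys, now=None):
    """Tombstone definitions missing from the new config, and their tasks."""
    now = now or datetime.now(timezone.utc)
    stale = []
    for definition in definitions:
        # Skip definitions still in use or already tombstoned.
        if definition.key in active_keys or definition.deleted_at is not None:
            continue
        definition.deleted_at = now
        for task in definition.tasks:
            if task.deleted_at is None:
                task.deleted_at = now
        stale.append(definition)
    return stale
```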
On worker demand:
- Find the tasks for which:
  - (last_check is not defined
  - OR last_check + max_timedelta < datetime.now(), i.e. the task is due again)
  - AND selected_by is not defined.
- Mark these tasks as selected by the current worker, on the current date.
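The selection rule can be sketched on in-memory objects as follows. It assumes that a task becomes due once `max_timedelta` has elapsed since its `last_check`, and that the "never checked or due" condition is grouped before the `selected_by` test; the `limit` parameter and the function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Task:
    url: str
    last_check: Optional[datetime] = None
    selected_by: Optional[str] = None
    selected_at: Optional[datetime] = None


def select_tasks(tasks, worker_id, max_timedelta, now=None, limit=10):
    """Lock up to `limit` unselected tasks that are due, for one worker."""
    now = now or datetime.now()
    selected = []
    for task in tasks:
        if task.selected_by is not None:
            continue  # already locked by a worker
        # Due when never checked, or when the last check is too old (assumption).
        due = task.last_check is None or task.last_check + max_timedelta <= now
        if due:
            task.selected_by = worker_id
            task.selected_at = now
            selected.append(task)
            if len(selected) >= limit:
                break
    return selected
```

In a real database this would need to happen in a single transaction, so that two workers asking at the same time cannot lock the same task.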
From time to time:
- Check for stalled tasks, i.e. tasks for which (datetime.now() - selected_at) > MAX_WORKER_TIME, and remove their lock.
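The stalled-task cleanup could look like the sketch below, again on in-memory objects. "Removing the lock" is assumed to mean clearing both `selected_by` and `selected_at`; the class and function names are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class LockedTask:
    url: str
    selected_by: Optional[str] = None
    selected_at: Optional[datetime] = None


def release_stalled(tasks, max_worker_time, now=None):
    """Unlock tasks whose worker has held them longer than max_worker_time."""
    now = now or datetime.now()
    released = []
    for task in tasks:
        if task.selected_at is not None and now - task.selected_at > max_worker_time:
            # Clear the lock so another worker can pick the task up.
            task.selected_by = None
            task.selected_at = None
            released.append(task)
    return released
```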
On the worker side:
- "Hey, I'm XX, give me some work."
- "OK, this is done, here are the results for Task: response."
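The exchange above can be pictured as a simple loop: ask for work, run the checks, report each result. The sketch uses an in-memory stand-in for the server instead of real HTTP calls; `InMemoryServer`, `get_tasks`, `post_result` and `run_worker` are all hypothetical names, not the project's API.

```python
class InMemoryServer:
    """Stand-in for the HTTP API, used only to illustrate the exchange."""

    def __init__(self, urls):
        self.pending = list(urls)
        self.results = {}

    def get_tasks(self, worker_id, limit=2):
        # "Hey, I'm XX, give me some work."
        batch, self.pending = self.pending[:limit], self.pending[limit:]
        return batch

    def post_result(self, worker_id, url, response):
        # "OK, this is done, here are the results for Task: response."
        self.results[url] = (worker_id, response)


def run_worker(worker_id, server, perform_check):
    """Fetch batches until the server has no more work; return the task count."""
    done = 0
    while True:
        batch = server.get_tasks(worker_id)
        if not batch:
            return done
        for url in batch:
            server.post_result(worker_id, url, perform_check(url))
            done += 1
```

A real worker would presumably sleep and poll again rather than exit when no work is available; that choice is left open here.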