mirror of
https://framagit.org/framasoft/framaspace/argos.git
synced 2025-04-28 18:02:41 +02:00
# Argos

Argos is an HTTP monitoring service. It lets you define a list of websites to monitor and a list of checks to run against them. It then runs these checks periodically and alerts you if something goes wrong.

Todo:

- [ ] Fix the retry failure: `Retrying: attempt 1413 ended with: <Future at 0x104f39390 state=finished raised RuntimeError> Cannot reopen a client instance, once it has been closed.`
- [ ] Cleandb should keep a maximum number of results per task
- [ ] Do not return an empty list on `/` when there are no results from agents
- [ ] Last seen agents
- [ ] Give a quick overview of the monitoring status
- [ ] Rename error to unexpected error
- [ ] Use background tasks for alerting
- [ ] Delete outdated tasks from config
- [ ] Implement alerting tasks
- [ ] Handle multiple alerting backends (email, SMS, Gotify)
- [ ] A configuration flag automatically adds a job that checks for the 301 redirect from the HTTP version to HTTPS
- [ ] Add an "unknown" severity for check errors
- [ ] Add a way to specify the severity of the alerts in the config
- [ ] Add a command to generate a new authentication token

Implemented checks:

- [x] Returned status code matches what you expect;
- [x] Returned body matches what you expect;
- [x] SSL certificate expires in more than X days.
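As a rough illustration, the three implemented checks can be written as pure functions. This is a minimal sketch, not the code Argos actually ships: the function names and signatures are assumptions made for this example, and "body matches" is interpreted here as a simple substring test.

```python
from datetime import datetime, timedelta, timezone


def check_status(actual: int, expected: int) -> bool:
    """Pass when the returned status code is the expected one."""
    return actual == expected


def check_body(body: str, expected_substring: str) -> bool:
    """Pass when the expected text appears in the returned body (assumed semantics)."""
    return expected_substring in body


def check_ssl_expiry(not_after: datetime, min_days: int, now: datetime) -> bool:
    """Pass when the certificate expires in more than `min_days` days."""
    return not_after - now > timedelta(days=min_days)


# Example values only; a real check would read these from an HTTP response
# and from the server certificate.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
status_ok = check_status(200, 200)
body_ok = check_body("<h1>Hello</h1>", "Hello")
ssl_ok = check_ssl_expiry(now + timedelta(days=30), 10, now)
```

Keeping the checks free of network calls makes them easy to test; fetching the response and the certificate would happen in the caller.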
## Development notes

### On service start

1. Read the job definitions file and populate the database.
2. From the job definitions, create a list of tasks to execute.
3. Periodically (interval still to be defined), clean the database.
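Turning job definitions into tasks could look like the sketch below. The shape of a job definition (a `url` plus a mapping of check names to expected values), the check names `status-is` and `body-contains`, and the `Task` class are all hypothetical; the database step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    url: str
    check: str
    expected: str


def tasks_from_jobs(job_definitions: list[dict]) -> list[Task]:
    """Expand each job definition into one task per check."""
    tasks = []
    for job in job_definitions:
        for check, expected in job["checks"].items():
            tasks.append(Task(url=job["url"], check=check, expected=str(expected)))
    return tasks


# Hypothetical job definitions, as they might look once parsed from the file.
jobs = [
    {"url": "https://example.org", "checks": {"status-is": 200, "body-contains": "Hello"}},
]
tasks = tasks_from_jobs(jobs)
```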
### On configuration changes

- Find and tombstone the `JobDefinitions` that are no longer in use.
- Cascade delete the child tasks that are planned. Tombstone them as well.
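A minimal in-memory sketch of the tombstoning step follows. The `JobDefinition` and `Task` classes, their fields, and the `apply_config` function are assumptions for illustration; the sketch only marks records as tombstoned and does not model the actual database deletion.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    planned: bool = True
    tombstoned: bool = False


@dataclass
class JobDefinition:
    url: str
    tasks: list[Task] = field(default_factory=list)
    tombstoned: bool = False


def apply_config(stored: list[JobDefinition], configured_urls: set[str]) -> None:
    """Tombstone jobs missing from the new config, along with their planned tasks."""
    for job in stored:
        if job.url in configured_urls:
            continue  # still configured, leave untouched
        job.tombstoned = True
        for task in job.tasks:
            if task.planned:
                task.tombstoned = True
```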
### On worker demand

- Find the tasks for which:
  - `last_check` is not defined, OR `last_check + max_timedelta < datetime.now()` (the task is due again),
  - AND `selected_by` is not defined.
- Mark these tasks as selected by the current worker, at the current date.
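The selection logic can be sketched as follows. This is an in-memory illustration under stated assumptions: the `Task` class and the `select_tasks` function are hypothetical, and a task is treated as due when `last_check + max_timedelta` lies in the past. A real implementation would run this as a single database query so that two workers cannot pick the same task.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Task:
    name: str
    last_check: Optional[datetime] = None
    selected_by: Optional[str] = None
    selected_at: Optional[datetime] = None


def select_tasks(tasks, worker_id, max_timedelta, now):
    """Lock and return the tasks that are due and not held by any worker."""
    selected = []
    for task in tasks:
        never_checked = task.last_check is None
        overdue = not never_checked and task.last_check + max_timedelta < now
        if (never_checked or overdue) and task.selected_by is None:
            # Mark the task as selected by this worker, at the current date.
            task.selected_by = worker_id
            task.selected_at = now
            selected.append(task)
    return selected
```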
### From time to time (cleanup)

- Check for stalled tasks, i.e. tasks for which `datetime.now() - selected_at > MAX_WORKER_TIME`, and remove their lock.
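Releasing stalled locks might look like this sketch. The `Task` class and `release_stalled` function are hypothetical, and the value given to `MAX_WORKER_TIME` is an assumption chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MAX_WORKER_TIME = timedelta(minutes=10)  # assumed value, for illustration only


@dataclass
class Task:
    name: str
    selected_by: Optional[str] = None
    selected_at: Optional[datetime] = None


def release_stalled(tasks, now):
    """Remove the lock on tasks held for longer than MAX_WORKER_TIME."""
    released = []
    for task in tasks:
        if task.selected_at is not None and now - task.selected_at > MAX_WORKER_TIME:
            task.selected_by = None
            task.selected_at = None
            released.append(task)
    return released
```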
### On the worker side

1. "Hey, I'm XX, give me some work."
2. (The service answers.) "OK, this is done, here are the results for `Task<id>`: response."
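The exchange between a worker and the service can be sketched with an in-memory stand-in. The `Service` class, its `give_work` and `receive_result` methods, and `run_worker` are hypothetical names; in practice the two sides would talk over HTTP.

```python
class Service:
    """In-memory stand-in for the Argos service (hypothetical interface)."""

    def __init__(self, task_ids):
        self.pending = list(task_ids)
        self.results = {}

    def give_work(self, worker_id):
        # Hand every pending task to the requesting worker.
        tasks, self.pending = self.pending, []
        return tasks

    def receive_result(self, worker_id, task_id, response):
        # Store who ran the task and what it returned.
        self.results[task_id] = (worker_id, response)


def run_worker(service, worker_id, execute):
    """Ask for work, run each task, and report its result back."""
    for task_id in service.give_work(worker_id):
        service.receive_result(worker_id, task_id, execute(task_id))


service = Service([1, 2])
run_worker(service, "XX", lambda task_id: f"checked task {task_id}")
```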