mirror of
https://framagit.org/framasoft/framaspace/argos.git
synced 2025-04-28 18:02:41 +02:00
124 lines
No EOL
3.7 KiB
Markdown
# Argos

🚧 This is mainly a work in progress for now. It's not working, don't try to install it! 🚧

Argos is an HTTP monitoring service. It's meant to be simple to configure and simple to use.

Features:

- [x] Uses `.yaml` files for configuration;
- [x] Reads the configuration file and converts it to tasks;
- [x] Stores tasks in a database;
- [x] Multiple paths per website can be tested;
- [x] Handles job failures on the clients;
- [x] Exposes an HTTP API that can be consumed by other systems;
- [x] Checks can be distributed on the network thanks to a job queue;
- [x] Change the naming and use service/agent;
- [x] Packaging (and `argos agent` / `argos service` commands);
- [x] Endpoints are protected by an authentication token;
- [ ] Find a way to define when the task should be checked again (via config? stored on the tasks themselves?);
- [ ] Local task for database cleanup (to run periodically);
- [ ] Handles multiple alerting backends (email, SMS, Gotify);
- [ ] Exposes a simple read-only website;
- [ ] Add a way to specify the severity of the alerts in the config;
- [ ] No need to return the expected and got values in case it worked in check-status and body-contains.

Implemented checks:

- [x] Returned status code matches what you expect;
- [x] Returned body matches what you expect;
- [x] SSL certificate expires in more than X days.

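
To make the "expires in more than X days" check concrete, here is a minimal sketch using only the Python standard library. It is not the Argos implementation: the function names `time_until_expiry` and `expires_in_more_than` are hypothetical, and the input is assumed to be a `notAfter` string in the format returned by `ssl.SSLSocket.getpeercert()`.

```python
import ssl
from datetime import datetime, timedelta, timezone


def time_until_expiry(not_after: str, now: datetime) -> timedelta:
    """Return how long remains before the certificate's notAfter date.

    `now` must be timezone-aware (UTC), since the expiry date is built as UTC.
    """
    expiry = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return expiry - now


def expires_in_more_than(not_after: str, days: int, now: datetime) -> bool:
    """True if the certificate is still valid for strictly more than `days` days."""
    return time_until_expiry(not_after, now) > timedelta(days=days)


if __name__ == "__main__":
    now = datetime(2030, 1, 1, tzinfo=timezone.utc)
    print(expires_in_more_than("Jan 11 00:00:00 2030 GMT", 5, now))
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test than a function that reads the clock itself.
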
## How to run?

We're using [pipenv](https://pipenv.pypa.io/) to manage the virtual environment and the dependencies.
You can install it with [pipx](https://pypa.github.io/pipx/):

```bash
pipx install pipenv
```

And then, check out this repository and sync its pipenv:

```bash
pipenv sync
```

Once all the dependencies are in place, here is how to run the server:

```bash
pipenv run argos server
```

The server will read a `config.yaml` file at startup, and will populate the tasks specified in it. See the configuration section below for more information on how to configure the checks you want to run.

And here is how to run the agent:

```bash
pipenv run argos agent --server http://localhost:8000 --auth "<auth-token>"
```

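
The `<auth-token>` placeholder is presumably one of the `secrets` declared in the service configuration, where a comment suggests generating them with `openssl rand -base64 32`. As a rough equivalent, the sketch below produces the same kind of value from Python's standard library. The helper name `generate_secret` is hypothetical and not part of Argos.

```python
import base64
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return `num_bytes` of random data, base64-encoded.

    With the default of 32 bytes this mirrors `openssl rand -base64 32`.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    print(generate_secret())
```
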
## Configuration

Here is a simple configuration file:

```yaml
general:
  frequency: 4h # Run checks every 4 hours.
  alerts:
    error:
      - local
    warning:
      - local
    alert:
      - local
service:
  port: 8888
  # Can be generated using `openssl rand -base64 32`.
  secrets:
    - "O4kt8Max9/k0EmHaEJ0CGGYbBNFmK8kOZNIoUk3Kjwc"
    - "x1T1VZR51pxrv5pQUyzooMG4pMUvHNMhA5y/3cUsYVs="

ssl:
  thresholds:
    critical: "1d"
    warning: "10d"

websites:
  - domain: "https://blog.notmyidea.org"
    paths:
      - path: "/"
        checks:
          - status-is: 200
          - body-contains: "Alexis"
          - ssl-certificate-expiration: "on-check"
      - path: "/foo"
        checks:
          - status-is: 400
```

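
To illustrate how such a file could be converted to tasks, here is a minimal sketch that flattens the `websites` → `paths` → `checks` nesting into one task per check. It is an assumption about the general idea, not Argos's actual code: `config_to_tasks`, `parse_duration`, the task dictionary layout and the supported duration suffixes are all hypothetical. The configuration is written as an already-parsed dictionary so the example does not depend on a YAML library.

```python
import re
from datetime import timedelta

# Parsed form of (part of) the sample configuration above.
CONFIG = {
    "general": {"frequency": "4h"},
    "websites": [
        {
            "domain": "https://blog.notmyidea.org",
            "paths": [
                {
                    "path": "/",
                    "checks": [
                        {"status-is": 200},
                        {"body-contains": "Alexis"},
                        {"ssl-certificate-expiration": "on-check"},
                    ],
                },
                {"path": "/foo", "checks": [{"status-is": 400}]},
            ],
        }
    ],
}

# Assumed set of duration suffixes; the real project may support others.
UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value: str) -> timedelta:
    """Turn strings such as "4h" or "10d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{UNITS[unit]: int(amount)})


def config_to_tasks(config: dict) -> list[dict]:
    """Flatten websites -> paths -> checks into one task per check."""
    tasks = []
    for website in config["websites"]:
        for path in website["paths"]:
            for check in path["checks"]:
                # Each check is a single-key mapping: {name: expected}.
                ((name, expected),) = check.items()
                tasks.append(
                    {
                        "url": website["domain"] + path["path"],
                        "check": name,
                        "expected": expected,
                    }
                )
    return tasks


if __name__ == "__main__":
    for task in config_to_tasks(CONFIG):
        print(task)
    print(parse_duration(CONFIG["general"]["frequency"]))
```
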
## Development notes

### On service start

1. Read the job definitions file and populate the database.
2. From the job definition, create a list of tasks to execute.
3. From time to time (?) clean the db.

### On configuration changes

- Find and tombstone the JobDefinitions that are not useful anymore.
- Cascade delete the child tasks that are planned. Tombstone them as well.

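
The tombstoning step could look roughly like the sketch below. It assumes that "tombstone" means a soft delete recorded as a timestamp, and that each job definition can be matched against the new configuration through some stable key. The `key` and `tombstoned_at` fields and the `tombstone_removed` function are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Task:
    id: int
    tombstoned_at: Optional[datetime] = None


@dataclass
class JobDefinition:
    key: str  # hypothetical identifier, e.g. URL + check name
    tasks: list = field(default_factory=list)
    tombstoned_at: Optional[datetime] = None


def tombstone_removed(definitions, current_keys, now=None):
    """Tombstone definitions missing from the new config, and their child tasks."""
    now = now or datetime.now(timezone.utc)
    removed = []
    for definition in definitions:
        # Skip definitions still in the config or already tombstoned.
        if definition.key in current_keys or definition.tombstoned_at:
            continue
        definition.tombstoned_at = now
        for task in definition.tasks:
            task.tombstoned_at = now
        removed.append(definition)
    return removed
```

Keeping a timestamp instead of deleting rows means past results stay available until the periodic cleanup decides to drop them.
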
### On worker demand

- Find the tasks for which:
  - `last_check` is not defined
  - OR `last_check + max_timedelta < datetime.now()` (the task is due again)
  - AND `selected_by` is not defined.
- Mark these tasks as selected by the current worker, on the current date.

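
The selection rule can be sketched in plain Python as follows, assuming a task counts as due once `last_check + max_timedelta` lies in the past, and reading the condition as "(never checked OR due) AND not selected". The `Task` dataclass and `select_tasks` function are hypothetical stand-ins for whatever the database layer provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Task:
    id: int
    last_check: Optional[datetime] = None
    selected_by: Optional[str] = None
    selected_at: Optional[datetime] = None


def select_tasks(tasks, worker, now, max_timedelta):
    """Lock and return the tasks that are due and not held by another worker."""
    selected = []
    for task in tasks:
        never_checked = task.last_check is None
        due = not never_checked and task.last_check + max_timedelta < now
        if (never_checked or due) and task.selected_by is None:
            # Mark the task as selected by this worker, at the current date.
            task.selected_by = worker
            task.selected_at = now
            selected.append(task)
    return selected


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    tasks = [Task(1), Task(2, last_check=now - timedelta(hours=5))]
    print(select_tasks(tasks, "worker-1", now, timedelta(hours=4)))
```

In a real database the find-and-mark steps would need to happen atomically, so that two workers asking at the same time cannot pick the same task.
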
### From time to time (cleanup)

- Check for stalled tasks, i.e. those where `(datetime.now() - selected_at) > MAX_WORKER_TIME`, and remove their lock.

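
A minimal sketch of that cleanup pass, under the assumption that the lock is simply the pair `selected_by` / `selected_at` on each task. The value given to `MAX_WORKER_TIME` here is arbitrary, and `release_stalled` is a hypothetical name.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MAX_WORKER_TIME = timedelta(minutes=10)  # assumed value, not taken from Argos


@dataclass
class Task:
    id: int
    selected_by: Optional[str] = None
    selected_at: Optional[datetime] = None


def release_stalled(tasks, now, max_worker_time=MAX_WORKER_TIME):
    """Remove the lock from tasks held for longer than `max_worker_time`."""
    released = []
    for task in tasks:
        if task.selected_at is not None and now - task.selected_at > max_worker_time:
            task.selected_by = None
            task.selected_at = None
            released.append(task)
    return released
```

Once released, a task becomes eligible again the next time a worker asks for work.
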
### On the worker side

1. Hey, I'm XX, give me some work.
2. `<Service answers>` OK, this is done, here are the results for `Task<id>`: response.