README.md

# Engine API Server

<!-- TOC -->

1. [Engine API Server](#engine-api-server)
   1. [Overview](#overview)
   2. [Deployment Modes](#deployment-modes)
      1. [Single Deployment](#single-deployment)
      2. [Redundant Deployment](#redundant-deployment)
         1. [Managing Active Engine State](#managing-active-engine-state)
         2. [Playlog Synchronization for High Availability deployment scenarios](#playlog-synchronization-for-high-availability-deployment-scenarios)
            1. [Active Sync](#active-sync)
            2. [Passive Sync](#passive-sync)
   3. [Getting started](#getting-started)
      1. [Requirements](#requirements)
      2. [Installation](#installation)
         1. [Setting up the database](#setting-up-the-database)
      3. [Configuration](#configuration)
         1. [Engine 1 Node](#engine-1-node)
         2. [Engine 2 Node](#engine-2-node)
         3. [Synchronization Node](#synchronization-node)
   4. [Running the Server](#running-the-server)
      1. [Development](#development)
      2. [Production](#production)
      3. [Running with Systemd](#running-with-systemd)
      4. [Running with Supervisor](#running-with-supervisor)
      5. [Running with Docker](#running-with-docker)
   5. [Development](#development-1)
      1. [Using the API](#using-the-api)
      2. [Extending the API](#extending-the-api)
      3. [Creating a local image](#creating-a-local-image)
      4. [Publish new image](#publish-new-image)
   6. [Logging](#logging)
   7. [Read more](#read-more)

<!-- /TOC -->

## Overview

The Project serves the Engine API and handles state management of multiple [Engine](https://gitlab.servus.at/aura/engine) instances.

The Engine API stores and provides following information:

- **Playlogs**: A history of all audio titles being played by the Engine. This is used for example to generate detailed reports for regulartory purposes.
- **Track Service**: Same as playlogs, but stripped-down information. Used for implementing a track service feature or displaying the currently playing track information on the radio's website.
- **Active Source**: In redundant deployment scenarios the API stores and shares information on which engine instance is currently active. This could be extended to other audio sources.
- **Health Information**: In case of some critical issue affecting the functionality of AURA Engine, the history of health status records of the respective engine is stored.
- **Studio Clock**: Information on the current and next show to be used in a _Studio Clock_ application.

You can find details on the available API endpoints here: https://app.swaggerhub.com/apis/AURA-Engine/engine-api/1.0.0

## Deployment Modes

AURA Engine allows single and redundant deployments for high availability scenarios.

Engine can be deployed and run manually, using [Systemd](#running-with-systemd), [Supervisor](#running-with-supervisor) or [Docker](#running-with-docker).

### Single Deployment

This is the most simple case. In that scenario the Engine API is deployed on the same host as the [Engine](https://gitlab.servus.at/aura/engine) itself.

> In your live deployment you might not want to expose the API directly on the web. For security reasons it's highly recommended to guard it using something like NGINX,
> acting as a reverse proxy.

<img src="docs/engine-api_single.png" width="550" />

### Redundant Deployment

In this scenario there are two Engine instances involved. Here you will need to deploy one Engine API on the host of each Engine instance. Additionally you'll have to set up
a third, so-called _Synchronization Node_ of the Engine API. This sync instance of Engine API is in charge of synchronizing playlogs and managing the active engine state.

<img src="docs/engine-api_redundancy.png" />

#### Managing Active Engine State

In order to avoid duplicate playlog storage, the _Synchronization Node_ requires to know what the currently active Engine is. This can be achieved by some external _Status Monitor_
component which tracks the heartbeat of both engines. In case the Status Monitor identifies one Engine as dysfunctional, it sends a REST request to the _Sync Node_, informing it
about the second, functional Engine instance being activated.

The history of active Engine instances is stored in the database of the _Sync Node_. It is not only used for playlog syncing, but is also handy as an audit log.

> At the moment AURA doesn't provide its own _Status Monitor_ solution. You'll need to integrate your self knitted component which tracks the heartbeat of the engines and posts
> the active engine state to the _Sync Node_.

> In your live deployment you might not want to expose the API directly on the web. For security reasons it's highly recommended to guard it using something like NGINX,
> acting as a reverse proxy to shield your API.

#### Playlog Synchronization for High Availability deployment scenarios

Usually when some new audio source starts playing, AURA Engine logs it to its local Engine API instance via some REST call. Now, the _Local API server_ stores this information in its
local database. Next, it also performs a POST request to the _Synchronization API Server_. This _Sync Node_ checks if this request is coming from the currently active engine instance.
If yes, it stores this information in its playlog database. This keeps the playlogs of individual (currently active) Engine instances in sync with the _Engine API synchronization node_.
The _Engine API synchronization node_ always only stores the valid (i.e. actually played) playlog records.

##### Active Sync

This top-down synchronization process of posting any incoming playlogs at the _Engine Node_ also to the _Synchronization Node_ can be called **Active Sync**. This **Active Sync**
doesn't work in every scenario, as there might be the case, that the _Synchronization Node_ is not available e.g. due to network outage, maintenance etc. In this situation the playlog
obviously can not be synced. That means the local playlog at the _Engine Node_ is marked as "not synced".

##### Passive Sync

Such marked entries are focus of the secondary synchronization approach, the so called **Passive Sync**: Whenever the _Synchronization Node_ is up- and running again, some automated job
on this node is continuously checking for records on remote nodes marked as "unsynced". If there are such records found, this indicates that there has been an outage of the _Sync Node_.
Hence those "unsynced" records are pending to be synced. Now an automated job on the _Sync Node_ reads those records as batches from that Engine Node and stores them in its local database.
It also keeps track when the last sync has happened, avoiding to query unnecceary records on any remote nodes.

In order to avoid that this **Passive Sync** job might be causing high traffic on an engine instance, these batches are read with some configured delay time (see `sync_interval` and
`sync_step_sleep` in the _Sync Node_ configuration; all values are in seconds) and a configurable batch size (`sync_batch_size`; count of max unsynced playlogs which are read at once).

## Getting started

### Requirements

The most simple way to get started is by running Engine API using [Docker](https://www.docker.com/). See below for detailed descriptions on how to do that.

If you are not planning to go with Docker or just want to setup a local development environment, then you'll need:

- [Python 3.9+](https://www.python.org/)
- [`pip`](https://pip.pypa.io/en/stable/)
- [`git`](https://git-scm.com/)
- [PostgreSQL 13+](https://www.postgresql.org/)

For Production use you also need following:

- [Gunicorn](https://gunicorn.org/), or any other compatible WSGI server

### Installation

Create a virtual environment for your Python dependencies:

```bash
python3 -m venv python
```

To activate that environment, run

```bash
source python/bin/activate
```

Install the required dependencies

```bash
pip3 install -r requirements.txt
```

#### Setting up the database

We officially only support PostgreSQL. Setup your database using following scripts:

```bash
# Additional Python packages for PostgreSQL
pip3 install -r contrib/postgresql-requirements.txt
# Create database and user (change password in script)
sudo -u postgres psql -f contrib/postgresql-create-database.sql
```

You might want to change the password for the database user created by the relevant script.

### Configuration

Create a config file from the sample configuration file:

```bash
# Development
engine-api$ cp config/sample/sample-development.engine-api.ini config/engine-api.ini
# Production
engine-api$ cp config/sample/sample-production.engine-api.ini config/engine-api.ini
# Docker
engine-api$ cp config/sample/sample-docker.engine-api.ini config/docker/engine-api.ini
```

Now edit the configuration file. If you trust all the defaults you'll only need to change the database password.

For some deployment like production you may want to change the default port too.
In this case also set the correct IP and port in `gunicorn.conf.py` file.

> You might also need to 'open' the chosen port in your `iptables` (Default is 8008)

```shell
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 8008 -j ACCEPT
```

Then configure the type of federation. Depending on how you want to run your
Engine API node and where it is deployed, you'll needed to uncomment one of these federation sections:

> To avoid any malfunction it is important that any other node-type configuration is commented out.

#### Engine 1 Node

Use this section if you are running [AURA Engine](https://gitlab.servus.at/aura/engine) standalone or if this is the first API node in a redundant deployment.

Replace `api.sync.local` with the actual host name or IP of your sync node.

```ini
# NODE 1
host_id=1
sync_host="http://api.sync.local:8008"
```

#### Engine 2 Node

In case this is the second API node in a redundant deployment.

Replace `api.sync.local` with the actual host name or IP of your sync node.

```ini
# NODE 2
host_id=2
sync_host="http://api.sync.local:8008"
```

#### Synchronization Node

This is the synchronization instance in a redundant setup. This instance combines all valid information coming from Engine API 1 and 2.

Replace `engine1.local` and `engine2.local` with the actual details of your main nodes.

```ini
# NODE SYNC
host_id=0
main_host_1="http://engine1.local:8008"
main_host_2="http://engine2.local:8008"

# The Engine which is seen as "active" as long no other information is received from the status monitor
default_source=1
# How often the Engine 1 and 2 nodes should be checked for unsynced records (in seconds)
sync_interval=3600
# How many unsynced records should be retrieved at once (in seconds)
sync_batch_size=100
# How long to wait until the next batch is requested (in seconds)
sync_step_sleep=2
```

## Running the Server

### Development

To run the API in an local development server execute:

```bash
./run.sh dev
```

This command implicitly activates the virtual environment before starting the API.

For convenience running a plain `./run.sh` also starts the development server.

In development mode Engine uses the default [Flask](https://palletsprojects.com/p/flask/) web server.

**Please be careful not to use this type of server in your production environment!**

When you'll need to run all three nodes to do testing during development you can run:

```bash
./run.sh api-test-0 # Sync Node
./run.sh api-test-1 # Node 1
./run.sh api-test-2 # Node 2
```

Here the run script uses the configurations located in `./test/config`.

### Production

For production Engine API defaults to using the WSGI HTTP Server [`Gunicorn`](https://gunicorn.org/).

You might also want to pair Gunicorn with some proxy server, such as Nginx.

> Although there are many HTTP proxies available, we strongly advise that you use Nginx. If you choose another proxy
> server you need to make sure that it buffers slow clients when you use default Gunicorn workers. Without this buffering
> Gunicorn will be easily susceptible to denial-of-service attacks. You can use Hey to check if your proxy is behaving properly.
> — [**Gunicorn Docs**](http://docs.gunicorn.org/en/latest/deploy.html).

To run Gunicorn, you first need to create the Gunicorn configuration
by copying the sample `./config/sample/gunicorn/sample-production.gunicorn.conf.py`
to your `config` directory.

Then run this from the root directory:

```bash
./run.sh prod
```

If this is succeeding, you can now proceed to configure Engine API to run as a system daemon using [Systemd](#running-with-systemd) or
[Supervisor](#running-with-supervisor).

### Running with Systemd

The Systemd unit file configuration expects to be running under the user `engineuser`. To create such user type:

```bash
    sudo adduser engineuser
    sudo adduser engineuser sudo
```

Copy the systemd unit file in `./config/sample/systemd` to `/etc/systemd/system`. This configuration file is expecting you to have
Engine API installed under `/opt/aura/engine-api` and `engineuser` owning the files.

Let's start the service as root

```bash
systemctl start aura-engine-api
```

And check if it has started successfully

```bash
systemctl status aura-engine-api
```

If you experience issues and need more information, check the syslog while starting the service

```bash
tail -f /var/log/syslog
```

You can stop or restart the service with one of these

```bash
systemctl stop aura-engine-api
systemctl restart aura-engine-api
```

Note, any requirements from the [Installation](#installation) step need to be available for that user.

### Running with Supervisor

Alternatively to Systemd you can start Engine API using [Supervisor](http://supervisord.org/). In `./config/sample/supervisor/aura-engine-api.conf` you
can find an example Supervisor configuration file. Follow the initial steps of the Systemd setup.

### Running with Docker

Having the configuration files `engine-api.ini` and `gunicorn.conf.py` located in `./config/docker`,
you can run the API server in a Docker container this way

```bash
exec sudo docker run \
    --network="host" \
    --name engine-api \
    --rm -d \
    -u $UID:$GID \
    -v "$BASE_D":/srv \
    -v "$BASE_D/config/docker":/srv/config \
    --tmpfs /var/log/aura/ autoradio/engine-api
```

The project also contains a convenience script to get started with a one-liner

```bash
# Start up a container
./run.sh docker:api
```

## Development

This project is based on a [Flask](https://flask.palletsprojects.com/) server using an [_API First_](https://swagger.io/resources/articles/adopting-an-api-first-approach/) approach.
The API is specified using [Open API 3](https://swagger.io/specification/). The resulting API implementation is utilizing the popular [Connexion](https://github.com/zalando/connexion) library on top of Flask.

### Using the API

You can find details on the available API endpoints here: https://app.swaggerhub.com/apis/AURA-Engine/engine-api/1.0.0

Adding some entry to the playlog:

```bash
curl -d '{ "track_start": "2020-06-27 19:14:00", "track_artist": "Mazzie Star", "track_title": "Fade Into You", "log_source": 1 }' -H "Content-Type: application/json" -X POST http://localhost:8008/api/v1/playlog
```

This newly added entry can be queried using your browser in one of the following ways:

```bash
# Get the latest entry
http://localhost:8008/api/v1/trackservice/current
# Get a set of the most recent entries
http://localhost:8008/api/v1/trackservice/
# Filter some specific page (the three most recent entries)
http://localhost:8008/api/v1/trackservice?page=0&limit=3
```

All other API endpoints are listed in the interactive documentation.

```bash
http://localhost:8008/api/v1/ui/
```

Your OpenAPI definition lives here:

```bash
http://localhost:8008/api/v1/openapi.json
```

### Extending the API

The workflow for extending the API follows the **API First** approach. This means you have to edit the API at https://app.swaggerhub.com/apis/AURA-Engine/engine-api/,
using the SwaggerHub web editor. Then download the `python-flask` server stubs, and replace & merge the existing generated sources in `./src/aura_engine_api/rest`.

All model files can usually be overwritten. Only controller and test classes need to undergo a merge action.

In the future it might be favorable to use a local Codegen to generate the API artifacts.

### Creating a local image

If you are a developer and want to create a local image, run

```bash
# Build the image
./run.sh docker:build
```

### Publish new image

If you are developer and want to publish a new image to DockerHub, run

```bash
# Releasing the image to DockerHub
./run.sh docker:push
```

## Logging

The Engine API logs can be found under `./logs`.

## Read more

- [Engine Overview](https://gitlab.servus.at/aura/engine)
- [docs.aura.radio](https://docs.aura.radio)
- [aura.radio](https://aura.radio)