From 37f7922a679b36fbc278e84e503e2bdb59573e0e Mon Sep 17 00:00:00 2001
From: David Trattnig <david.trattnig@o94.at>
Date: Thu, 1 Oct 2020 12:12:25 +0200
Subject: [PATCH] Extended doc as per feedback in #10.

---
 README.md | 75 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 43 insertions(+), 32 deletions(-)

diff --git a/README.md b/README.md
index 4d4a120..659d1b7 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,8 @@
         - [Redundant Deployment](#redundant-deployment)
             - [Managing Active Engine State](#managing-active-engine-state)
             - [High Availability Playlog Synchronization](#high-availability-playlog-synchronization)
+                - [Active Sync](#active-sync)
+                - [Passive Sync](#passive-sync)
     - [Getting started](#getting-started)
         - [Requirements](#requirements)
         - [Installation](#installation)
@@ -38,7 +40,7 @@ The Project serves the Engine API and handles state management of multiple [Engi
 The Engine API stores and provides following information:
 
 - **Playlogs**: A history of all audio titles played by the Engine. This is used, for example, to generate detailed reports for regulatory purposes.
-- **Track Service**: Same as track service, but stripped-down information. Used for implementing a track service feature on the radio's website.
+- **Track Service**: Same as playlogs, but with stripped-down information. Used for implementing a track service feature or displaying the currently playing track on the radio's website.
 - **Active Source**: In redundant deployment scenarios the API stores and shares information on which engine instance is currently active. This could be extended to other audio sources.
 - **Health Information**: In case of some critical issue affecting the functionality of AURA Engine, the history of health status records of the respective engine is stored.
 - **Studio Clock**: Information on the current and next show to be used in a *Studio Clock* application.
@@ -63,45 +65,58 @@ acting as a reverse proxy.
 ### Redundant Deployment
 
 In this scenario there are two Engine instances involved. Here you will need to deploy one Engine API on the host of each Engine instance. Additionally you'll have to set up
-a third, so-called "Syncronization Node" of the Engine API. This sync instance of Engine API is in charge of synchronizing playlogs and managing the active engine state.
+a third, so-called *Synchronization Node* of the Engine API. This sync instance of the Engine API is in charge of synchronizing playlogs and managing the active engine state.
 
 <img src="docs/engine-api_redundancy.png" />
 
 #### Managing Active Engine State
 
-In order to avoid duplicate playlog storage, the Synchronization Node requires to know what the currently active Engine is. This can be achieved by some external *Status Monitor*
-component which tracks the heartbeat of both engines. In case the Status Monitor identifies one Engine as dysfunctional, it sends a REST request to the Sync Node, informing it
+In order to avoid duplicate playlog storage, the *Synchronization Node* needs to know which Engine is currently active. This can be achieved by an external *Status Monitor*
+component which tracks the heartbeat of both engines. In case the Status Monitor identifies one Engine as dysfunctional, it sends a REST request to the *Sync Node*, informing it
 about the second, functional Engine instance being activated.
 
-The history of active Engine instances is stored in the database of the Sync Node. It is not only used for playlog syncing, but is also handy as an audit log.
+The history of active Engine instances is stored in the database of the *Sync Node*. It is not only used for playlog syncing, but is also handy as an audit log.
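This state-plus-history idea can be illustrated with a toy sketch (function names and record structure are illustrative assumptions, not the Engine API's actual schema):

```python
from datetime import datetime, timezone
from typing import List, Optional

def activate_engine(engine_id: str, history: List[dict]) -> None:
    """Record a newly activated engine; the full list doubles as an audit log."""
    history.append({"engine": engine_id, "since": datetime.now(timezone.utc)})

def current_active(history: List[dict]) -> Optional[str]:
    """The most recent entry denotes the currently active engine."""
    return history[-1]["engine"] if history else None
```

A failover posted by the Status Monitor would simply append a new entry, so older activation records remain queryable for auditing.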
 
 > At the moment AURA doesn't provide its own *Status Monitor* solution. You'll need to integrate your self-built component which tracks the heartbeat of the engines and posts
-the active engine state to the Sync Node.
+the active engine state to the *Sync Node*.
+
+> In your live deployment you might not want to expose the API directly to the web. For security reasons it's highly recommended to shield it behind something like NGINX,
+acting as a reverse proxy.
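A minimal sketch of such a reverse proxy setup might look like the following (the server name, certificate paths, and the upstream port 8008 are deployment-specific assumptions):

```nginx
server {
    listen 443 ssl;
    server_name api.example.org;                        # placeholder domain

    ssl_certificate     /etc/ssl/certs/engine-api.pem;  # placeholder paths
    ssl_certificate_key /etc/ssl/private/engine-api.key;

    location /api/ {
        # Forward API traffic to the locally bound Engine API
        proxy_pass http://127.0.0.1:8008;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```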
 
 #### High Availability Playlog Synchronization
 
-Usually when some new audio source starts playing, AURA Engine logs it to its local Engine API instance via some REST call. Now, the local API server stores this information in its
-local database. Next, it also performs a request to the Synchronization API Serve. The Sync Server checks if this request is coming from the currently active engine instance.
-If yes, it stores this information in the playlog database.
+Usually when a new audio source starts playing, AURA Engine logs it to its local Engine API instance via a REST call. The *Local API server* stores this information in its
+local database. Next, it also performs a POST request to the *Synchronization API Server*. This *Sync Node* checks whether the request comes from the currently active engine instance.
+If yes, it stores the information in its playlog database. This keeps the playlogs of the individual (currently active) Engine instances in sync with the *Engine API Synchronization Node*.
+The *Engine API Synchronization Node* only ever stores valid (i.e. actually played) playlog records.
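The decision logic on the *Sync Node* can be sketched roughly as follows (a hypothetical simplification with illustrative names, not the actual Engine API implementation):

```python
def accept_playlog(playlog: dict, active_engine: str, playlog_db: list) -> bool:
    """Store a playlog record only if it was reported by the active engine."""
    if playlog.get("engine") == active_engine:
        playlog_db.append(playlog)
        return True
    # Records posted by the inactive engine are ignored to avoid duplicates
    return False
```

Both engines may report the same track, but only the record of the currently active instance ends up in the synchronized playlog database.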
 
-During the synchronization process between some Engine Node and the Synchronization Node, there might be the case, that the latter is not available e.g. due to network outage,
-maintenance etc. In this situation the playlog obviously can not be synced. That means the local playlog is marked as "not synced". Whenever the Sync Node is up- and running again,
-some automated job on the Sync Node is continuously checking for "unsynced" records on remote nodes. If there are such records pending to be synced, this job reads them as batches
-from that Engine Node. To avoid this sync causing high traffic on any engine instance, these batches are read with some configured delay time (see `sync_interval`, `sync_batch_size`,
-and `sync_step_sleep` in the Sync Node configuration; all values are in seconds).
+##### Active Sync
 
-> In your live deployment you might not want to expose the API directly on the web. For security reasons it's highly recommended to guard it using something like NGINX,
-acting as a reverse proxy.
+This top-down process of posting any playlog incoming at the *Engine Node* on to the *Synchronization Node* is called **Active Sync**. **Active Sync**
+doesn't work in every scenario, as the *Synchronization Node* might be unavailable, e.g. due to a network outage or maintenance. In this situation the playlog
+obviously cannot be synced, so the local playlog at the *Engine Node* is marked as "not synced".
+
+##### Passive Sync
+
+Such marked entries are the focus of the secondary synchronization approach, the so-called **Passive Sync**: whenever the *Synchronization Node* is up and running again, an automated job
+on this node continuously checks remote nodes for records marked as "unsynced". Any such records indicate a past outage of the *Sync Node* and are pending synchronization.
+The job on the *Sync Node* then reads those records in batches from the Engine Node and stores them in its local database.
+It also keeps track of when the last sync happened, avoiding unnecessary queries against remote nodes.
+
+To avoid this **Passive Sync** job causing high traffic on an engine instance, the batches are read with a configurable delay (see `sync_interval` and
+`sync_step_sleep` in the *Sync Node* configuration; all values are in seconds) and a configurable batch size (`sync_batch_size`; the maximum number of unsynced playlogs read at once).
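The batched pull described above could be sketched like this (helper names are hypothetical; the real job additionally persists the last sync timestamp):

```python
import time

def passive_sync(fetch_unsynced, store_batch, sync_batch_size=100, sync_step_sleep=1.0):
    """Pull unsynced playlogs from a remote Engine Node in batches.

    fetch_unsynced(limit) -> list of up to `limit` unsynced records
    store_batch(records)  -> persists the records in the Sync Node's database
    """
    total = 0
    while True:
        batch = fetch_unsynced(sync_batch_size)
        if not batch:
            break  # nothing left to sync
        store_batch(batch)
        total += len(batch)
        time.sleep(sync_step_sleep)  # throttle to keep load on the Engine Node low
    return total
```

The sleep between batches is what keeps the catch-up run from hammering the Engine Node after a long outage.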
 
 ## Getting started
 
 ### Requirements
 
+The simplest way to get started is to run the Engine API using [Docker](https://www.docker.com/). See below for detailed instructions on how to do that.
+
 If you are not planning to go with Docker or just want to setup a local development environment, then you'll need:
 
-- Python 3.7+
-- MariaDB
-- Virtualenv
+- [Python 3.7+](https://www.python.org/)
+- [MariaDB](https://mariadb.org/)
+- [Virtualenv](https://virtualenv.pypa.io/en/latest/)
 
 ### Installation
 
@@ -253,23 +268,16 @@ The Systemd unit file configuration expects to be running under the user `engine
 Copy the systemd unit file in `./config/sample/systemd` to `/etc/systemd/system`. This configuration file is expecting you to have
 Engine API installed under `/opt/aura/engine-api` and `engineuser` owning the files.
 
-Next login to `engineuser` and give it permissions to the unit file
-
-```bash
-  su engineuser
-  sudo chmod 644 /etc/systemd/system/aura-engine-api.service
-```
-
-Let's start the service
+Let's start the service as root
 
 ```bash
-sudo systemctl start aura-engine-api
+systemctl start aura-engine-api
 ```
 
 And check if it has started successfully
 
 ```bash
-sudo systemctl status aura-engine-api
+systemctl status aura-engine-api
 ```
 
 If you experience issues and need more information, check the syslog while starting the service
@@ -281,8 +289,8 @@ tail -f /var/log/syslog
 You can stop or restart the service with one of these
 
 ```bash
-sudo systemctl stop aura-engine-api
-sudo systemctl restart aura-engine-api
+systemctl stop aura-engine-api
+systemctl restart aura-engine-api
 ```
 
 Note that any requirements from the [Installation](#installation) step need to be available for that user.
@@ -317,6 +325,9 @@ The project also contains a convenience script to get started with a one-liner
 
 ## Development
 
+This project is based on a [Flask](https://flask.palletsprojects.com/) server using an [*API First*](https://swagger.io/resources/articles/adopting-an-api-first-approach/) approach.
+The API is specified using [Open API 3](https://swagger.io/specification/). The resulting API implementation utilizes the popular [Connexion](https://github.com/zalando/connexion) library on top of Flask.
+
 ### Using the API
 
 You can find details on the available API endpoints here: https://app.swaggerhub.com/apis/AURA-Engine/engine-api/1.0.0
@@ -353,7 +364,7 @@ http://localhost:8008/api/v1/openapi.json
 ### Extending the API
 
 The workflow for extending the API follows the **API First** approach. This means you have to edit the API at https://app.swaggerhub.com/apis/AURA-Engine/engine-api/,
-then download the `python-flask` server stubs, and replace & merge the existing generated sources in `./src/rest`.
+using the SwaggerHub web editor. Then download the `python-flask` server stubs, and replace & merge the existing generated sources in `./src/rest`.
 
 All model files can usually be overwritten. Only controller and test classes need to undergo a merge action.
 
-- 
GitLab