Overview

Hoss Summary

The Hoss is currently deployable as a collection of containers running via Docker Compose. The architecture is designed to allow independent deployment and scaling of components as needed, but future work is still required to support more complex deployments.

Depending on the configuration, the following services are deployed, each in its own container:

db: A PostgreSQL database
core: A service providing the primary REST API
auth: If enabled, a service providing a REST API for managing groups, PATs, token exchanges, and user accounts
dex: If auth is enabled, Dex (a federated OpenID Connect provider) is run to integrate with external auth providers
ldap: If enabled, a local LDAP server is run to manage user accounts internally
ldap-admin: If enabled, a small service provides an admin UI for the internal LDAP server
minio: If enabled, MinIO is run in gateway mode to provide access to local storage via the S3 API
etcd-0: If minio is enabled, etcd is run to support required MinIO features
rabbitmq: If minio is enabled, RabbitMQ is run to provide message queues that handle MinIO bucket events
sync: A service to manage synchronizing data and indexing metadata for search
opensearch: OpenSearch is run to provide metadata search
ui: A service to serve the Hoss web UI
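
The dependencies between services in the list above can be sketched in a few lines of Python. This is only an illustration: the HossFeatures class and its flag names are hypothetical and are not the actual Hoss configuration keys. The sketch also assumes that db, core, sync, opensearch, and ui always run (they are listed without a condition) and that ldap and ldap-admin are toggled independently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class HossFeatures:
    """Hypothetical feature flags; not the actual Hoss configuration keys."""
    auth: bool = True
    ldap: bool = False
    ldap_admin: bool = False
    minio: bool = False


def deployed_services(features: HossFeatures) -> list[str]:
    """Return the container names from the service list for the given flags."""
    services = ["db", "core"]
    if features.auth:
        # Dex runs whenever the auth service is enabled.
        services += ["auth", "dex"]
    if features.ldap:
        services.append("ldap")
    if features.ldap_admin:
        services.append("ldap-admin")
    if features.minio:
        # etcd and RabbitMQ are only needed to support MinIO.
        services += ["minio", "etcd-0", "rabbitmq"]
    # Listed without a condition, so assumed to always run.
    services += ["sync", "opensearch", "ui"]
    return services


if __name__ == "__main__":
    print(deployed_services(HossFeatures(auth=True, minio=True)))
```

Reading the list this way makes it easier to estimate how many containers a given configuration will start before you commit to it.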

Choosing The Right Configuration

The most complicated part of installing a Hoss server is likely deciding on a configuration. Because the system is so flexible, there are many choices to make, not only about how individual servers are deployed but also about how multiple servers can be linked together to enable hybrid cloud workflows.

Server Options

Configuring an individual server involves several options and parameters. More details are available throughout the rest of the documentation, but at a high level you must consider:

The external hostname of the server
- This cannot easily be changed after a server has been deployed.
TLS configuration
- The Hoss can use Let’s Encrypt internally, run behind an additional proxy/load balancer that terminates TLS for you, or run unencrypted
Auth Configuration
- A Hoss server can run its own auth service or use an existing one (i.e. in a multi-server configuration)
- An internal LDAP provider can be used to work “out of the box”
- External LDAP or other authentication providers can be integrated
- If using the internal LDAP provider, you may want to enable Google’s reCaptcha service on the login page
Object Store Configuration
- Currently, you can choose between AWS S3, an externally hosted MinIO server, or an internally hosted MinIO server
Backup location
Custom UI colors and logos

Multi-Server Configurations

Often, multiple Hoss servers are run on different infrastructures (e.g. one server on-premise and one server in AWS) and linked together via syncing, auth, or both. These architectural decisions can enable useful hybrid cloud workflows, for example:

Easy sharing of data generated and managed on-premise with external collaborators
Off-site data collection and transfer back on-premise
Portable analytics to leverage both on-premise and cloud compute resources
Data “delivery” to external users

The first consideration in a multi-server configuration is how auth will be configured. In addition to deciding which authentication provider to use, you must also choose between:

(Recommended) One server runs an auth service. Additional servers use the “centralized” auth service.
- Fewer steps to configure, and easier to use and manage
- Auth service must be accessible by all other servers (e.g. runs in the cloud, not on-premise behind a firewall)
- Users can use a single PAT with any linked server and have only one set of credentials to remember
Each server runs its own auth service
- More complex to configure and manage
- Depending on the auth provider configured for each server, users may need different credentials for each server
- You must be careful to make sure usernames match between servers or there could be issues when synchronizing data, groups, and permissions.
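
If each server runs its own auth service, a quick comparison of the username lists can catch mismatches before syncing is configured. The sketch below is a minimal illustration: username_mismatches is a hypothetical helper, not part of the Hoss, and it assumes you have already exported the usernames from each server by whatever means your auth provider supports.

```python
from __future__ import annotations


def username_mismatches(server_a: set[str], server_b: set[str]) -> dict[str, list[str]]:
    """Report usernames that exist on only one of two servers.

    Both arguments are plain sets of usernames; how they are obtained
    is outside the scope of this sketch.
    """
    return {
        "only_on_a": sorted(server_a - server_b),
        "only_on_b": sorted(server_b - server_a),
    }


if __name__ == "__main__":
    # Example data only; these usernames are made up.
    on_prem = {"alice", "bob"}
    cloud = {"alice", "carol"}
    print(username_mismatches(on_prem, cloud))
```

An empty list on both sides means every username exists on both servers. The check is exact, so differences in case or spelling are reported as mismatches.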

The second consideration is how syncing will be configured. Typically you should run a single sync service that can reach all servers. For example, if you have one server on-premise and one server in the cloud, you’ll likely want to run the sync service on-premise. This sync service is then responsible for moving data between object stores as needed.

Installation Process

Installing a server requires several manual steps and some configuration. The system is quite flexible and can support various use cases and deployment architectures. Once you have decided how to configure each individual server and whether to link multiple servers, the high-level process is:

Prepare required infrastructure
1. Create any required cloud resources (e.g. S3 buckets, SQS queues, IAM roles & policies, an EC2 instance)
2. Create any required on-premise resources (e.g. NFS shares, a VM or server)
Prepare the server
1. Install Docker and Docker Compose
2. Install the additional dependencies make and git
3. Configure host user accounts and storage mounts as needed
Install, configure, and start the Hoss server software
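
Before starting the install, it can help to confirm that the tools from the "Prepare the server" step are on the PATH. The following is a minimal sketch using only the Python standard library. The missing_tools helper is hypothetical and is not shipped with the Hoss. It also does not check for Docker Compose itself, since Compose may be installed either as a standalone binary or as a Docker plugin.

```python
from __future__ import annotations

import shutil
from typing import Callable, Optional

# Command-line tools named in the "Prepare the server" step.
REQUIRED_TOOLS = ("docker", "make", "git")


def missing_tools(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the required tools that cannot be found on the PATH.

    The lookup function is injectable so the check can be exercised
    without depending on what is installed on the current machine.
    """
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

Running the script on the target host gives a quick yes/no answer, but it only checks that the executables exist, not that their versions are suitable.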

The following installation documents outline the installation process for common configurations. Details on all the available configuration parameters and scenarios are captured in the “Configuration” section.