Overview
Etlworks can be deployed as a symmetrical, horizontally scalable cluster.
Deployment steps
1. Application EC2s
1.1. Create two to N EC2 instances from the m5 family running Amazon Linux 2 or Ubuntu 20.04.
1.2. The security group should allow port 8080 access from the HaProxy EC2 instance (Step 5) and from your incoming traffic load balancer.
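The rule from Step 1.2 could also be scripted with the AWS CLI. The following is a minimal sketch rather than part of the official procedure: the security group IDs are hypothetical placeholders, it assumes the HaProxy EC2 has its own security group, and the run helper only prints the command while DRY_RUN is set, so nothing changes in AWS until that variable is removed.

```shell
#!/usr/bin/env bash
# Hypothetical security group IDs; replace with your own.
APP_SG="sg-0123456789abcdef0"      # Application EC2s security group
HAPROXY_SG="sg-0fedcba9876543210"  # HaProxy EC2 security group (assumed to exist)

# Print the command when DRY_RUN is non-empty, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Allow TCP traffic on a port into a target group from a source group.
allow_from_group() {
  local target_sg="$1" port="$2" source_sg="$3"
  run aws ec2 authorize-security-group-ingress \
    --group-id "$target_sg" \
    --protocol tcp \
    --port "$port" \
    --source-group "$source_sg"
}

# Dry run: show the rule that would open port 8080 to the HaProxy instance.
DRY_RUN=1 allow_from_group "$APP_SG" 8080 "$HAPROXY_SG"
```

The same helper can be reused for the other rules in this article (ports 2049, 5432, 6379, and 80) by changing the target group, port, and source group.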
2. EFS shared storage
2.1. Create an EFS file system that will be shared between the Application EC2s.
2.2. The security group should allow TCP NFS port 2049 access from the Application EC2s security group (Step 1.2).
2.3. Mount EFS on all Application EC2s (Step 1) at /opt/app-data. Replace {efs_url} with the actual EFS URL:
sudo yum install -y amazon-efs-utils
sudo mount -t efs -o tls {efs_url}:/ /opt/app-data
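A mount made with the command above does not survive a reboot. One common way to make it persistent is an /etc/fstab entry, sketched below under the assumption that amazon-efs-utils is installed. The efs_fstab_line helper and the fs-12345678 ID are illustrative only and are not part of the Etlworks tooling.

```shell
#!/usr/bin/env bash
# Build an /etc/fstab entry for an EFS file system mounted via amazon-efs-utils.
# $1 = EFS file system ID or DNS name (the {efs_url} value), $2 = mount point.
efs_fstab_line() {
  printf '%s:/ %s efs _netdev,tls 0 0\n' "$1" "$2"
}

# Preview the entry (fs-12345678 is a hypothetical placeholder).
efs_fstab_line "fs-12345678" "/opt/app-data"

# To apply on a node (commented out so the sketch has no side effects):
#   sudo mkdir -p /opt/app-data
#   efs_fstab_line "{efs_url}" "/opt/app-data" | sudo tee -a /etc/fstab
#   sudo mount -a
#   findmnt /opt/app-data   # confirm the file system is mounted
```

The mount point directory must exist before mounting, which is why the commented steps start with mkdir.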
3. RDS PostgreSQL
3.1. Create an RDS PostgreSQL instance (db.m6i.large should be sufficient for most installations).
3.2. Enable password authentication (note the password for the application configuration later).
3.3. Set "Initial database name" to integrator.
3.4. Set the network to allow port 5432 access from the Application EC2s security group (Step 1.2).
4. ElastiCache Redis
4.1. Create an ElastiCache Redis instance (cache.t4g.small should be sufficient for most installations).
4.2. Set the network to allow port 6379 access from the Application EC2s security group (Step 1.2).
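Before installing Etlworks, it can be useful to confirm from an Application EC2 that the PostgreSQL and Redis ports are actually reachable. The sketch below is one way to do that with bash alone. The port_open and report helpers are hypothetical names, and the hosts in the commented usage lines are the placeholders used later in this article.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra packages are needed.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report reachability of a named service.
report() {
  local name="$1" host="$2" port="$3"
  if port_open "$host" "$port"; then
    echo "$name $host:$port reachable"
  else
    echo "$name $host:$port NOT reachable"
  fi
}

# Substitute the hosts created in Steps 3 and 4, then uncomment:
# report "PostgreSQL" "<rds_postgres_host>" 5432
# report "Redis" "<elasti_cache_redis_host>" 6379
```

A "NOT reachable" result usually points at the security group or network settings from Steps 3.4 and 4.2 rather than at the service itself.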
5. HaProxy EC2
5.1. Create a t2.small EC2 instance with Amazon Linux 2 OS.
5.2. The security group should allow port 80 access from the Application EC2s security group (Step 1.2).
5.3. Install HaProxy:
sudo yum install haproxy -y
5.4. Configure HaProxy by replacing the contents of /etc/haproxy/haproxy.cfg with:
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 10000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 10
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 5m
    timeout server 0
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 10000

frontend http-in
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    option httpchk GET /etl/rest/v1/health
    http-check expect status 200
    server tomcat1 {node-1-ip}:8080 check
    server tomcatN {node-N-ip}:8080 check
Replace {node-1-ip} through {node-N-ip} with the actual Application EC2 node IPs (Step 1). Add extra server lines if there are more than 2 nodes.
The recommended default for timeout server is 0 (infinite). It allows streaming flows (such as CDC) to run continuously. Setting it to a value other than 0 (for example 43200m, which is 30 days) can cause the HTTP connection established between the scheduler and the ETL process to time out. When the connection times out, the ETL process is killed by the server.
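With more than a couple of nodes, the server lines can be generated instead of typed by hand. The sketch below is illustrative: server_lines is a hypothetical helper, the IPs are made up, and the tomcat1, tomcat2, ... naming simply continues the pattern in the configuration above. The commented commands at the end show one way to validate and apply the file, assuming a systemd-based Amazon Linux 2 host.

```shell
#!/usr/bin/env bash
# Print one HAProxy "server" line per Application EC2 IP, numbered from 1.
server_lines() {
  local i=1 ip
  for ip in "$@"; do
    printf '    server tomcat%d %s:8080 check\n' "$i" "$ip"
    i=$((i + 1))
  done
}

# Example with three hypothetical private IPs.
server_lines 10.0.1.11 10.0.1.12 10.0.1.13

# After editing the file on the HaProxy EC2, check and apply it:
#   sudo haproxy -c -f /etc/haproxy/haproxy.cfg   # syntax check only
#   sudo systemctl enable haproxy
#   sudo systemctl restart haproxy
```

Running the syntax check before restarting avoids taking the proxy down because of a typo in the configuration.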
6. Etlworks Installation
Create the /opt/etlworks directory and copy the installer archive into it.
Extract the archive:
sudo tar -zxf etlworks-installer.tar.gz
Run the installer:
sudo ./etlworks-cli.sh install -s -u <user> --external-db \
--conf-postgres-url jdbc:postgresql://<rds_postgres_host>:5432/integrator \
--conf-postgres-username <rds_postgres_username> \
--conf-postgres-password <rds_postgres_password> \
--conf-redis-host <elasti_cache_redis_host> \
--conf-redis-port 6379 \
--conf-app-data /opt/app-data \
--conf-jwt-secret <jwt_16_character_alphanumeric_secret>
Replace the placeholders as follows:
- <user> with ec2-user if Amazon Linux 2, or ubuntu if Ubuntu 20.04.
- <rds_postgres_host> with the RDS PostgreSQL URL (Step 3).
- <rds_postgres_username> with the RDS PostgreSQL username (Step 3); the default is postgres if not changed during initial setup.
- <rds_postgres_password> with the RDS PostgreSQL password (Step 3).
- <elasti_cache_redis_host> with the ElastiCache Redis URL (Step 4).
  - If a password was set for ElastiCache Redis, also add --conf-redis-password <elasti_cache_redis_password> to the install command parameters, where <elasti_cache_redis_password> is the actual ElastiCache Redis password.
  - If encryption in transit is enabled for ElastiCache Redis, also add --conf-redis-ssl true to the install command parameters.
- <jwt_16_character_alphanumeric_secret> with a random 16-character alphanumeric string. NOTE: this value has to be the same on all nodes.
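One way to produce a suitable JWT secret is to filter random bytes down to alphanumeric characters. This is a minimal sketch and not an Etlworks-provided tool; generate_jwt_secret is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Generate a random 16-character alphanumeric string from /dev/urandom.
# 512 random bytes yield far more than 16 alphanumeric characters in practice.
generate_jwt_secret() {
  head -c 512 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-16
}

JWT_SECRET="$(generate_jwt_secret)"
echo "$JWT_SECRET"
```

Generate the value once and pass the same string to --conf-jwt-secret on every node, since a different secret per node would violate the requirement above.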
Use the sudo etlworks-cli.sh update-conf command to update the configuration after install.
7. Incoming traffic load balancer
7.1. Configure your incoming traffic load balancer to proxy requests to port 8080 of the Application EC2s.
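Before pointing the load balancer at the nodes, each node can be probed on the same health endpoint that the HaProxy configuration in Step 5.4 polls. The sketch below assumes curl is installed; health_url and node_healthy are hypothetical helper names and the IPs in the commented loop are placeholders.

```shell
#!/usr/bin/env bash
# Build the health check URL for a node; the path is the one HaProxy polls.
health_url() {
  printf 'http://%s:8080/etl/rest/v1/health\n' "$1"
}

# Return 0 if the node answers the health check with HTTP 200 within 5 seconds.
node_healthy() {
  local code
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$(health_url "$1")")"
  [ "$code" = "200" ]
}

# Hypothetical node IPs; replace with the Application EC2 addresses, then uncomment:
# for ip in 10.0.1.11 10.0.1.12; do
#   if node_healthy "$ip"; then echo "$ip healthy"; else echo "$ip unhealthy"; fi
# done
```

The same URL is a reasonable candidate for the load balancer's own health check, if it supports HTTP checks.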
8. Global application settings
After the entire installation has been completed, log in to Etlworks with the default super admin credentials admin:admin1 and navigate to Settings.
8.1. Under General, set Home URL to the load balancer URL, including the protocol part.
8.2. Under Email, set the email configuration that the system will use to send notifications. This is required for adding new users, resetting passwords, and sending email notifications.
8.3. Under Network, set ETL Engine Proxy URL to the HaProxy URL (Step 5).