How to deploy Apache Superset with Dokku?
Why?
- Because it is much easier than doing everything manually
- Configuration changes are deployed via git (git push dokku main)
- Reverse proxy (nginx) is configured by Dokku (no lines of nginx config files are touched, not even looked at)
- SSL is configured by a Dokku plugin (I don’t even know how to use certbot)
- Ideal to spin up a Superset instance quickly to test things out
Why not?
Well, if the plan is to serve thousands of dashboards and charts for thousands of users, maybe look into the Kubernetes installation.
Before we start, ACME is used as an example of a company name. Feel free to replace with your own.
On your virtual machine
1. Install dokku
Go to Dokku’s installation page and follow the steps. There’s no need to have Docker pre-installed; the Dokku installer takes care of that.
Make sure to follow all steps until the end including the ones about adding your SSH key and setting the global domain. A domain is required to run Superset with SSL.
The SSH key must belong to the machine from which you intend to deploy Superset, so that git push can authenticate.
2. Create the app in Dokku
dokku apps:create acme-superset
3. Install postgres/redis
Install postgres plugin and redis plugin.
Then create the services and link them to the app.
sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
sudo dokku plugin:install https://github.com/dokku/dokku-redis.git redis
dokku postgres:create acme-superset
dokku postgres:link acme-superset acme-superset
dokku redis:create acme-superset
dokku redis:link acme-superset acme-superset
This is optional, but I recommend doing it.
By default Superset uses SQLite, which will not allow us to create multiple Datasets pointing to different tables and using the same name. This is solved by using Postgres as the metastore (metastore is how the Superset documentation refers to the database where Superset’s objects, such as dashboards and charts, are stored).
Redis is also optional but it’s nice to have in case you want to configure data caching.
Postgres and Redis connection strings will be present as environment variables in the acme-superset application:
dokku config:show acme-superset
After running the link command you may encounter a message like App image (dokku/acme-superset:latest) not found. Just ignore it.
4. Configure the default port
dokku proxy:ports-add acme-superset http:80:8088
By default Superset listens on port 8088, but to configure SSL properly we need to proxy port 80 to it.
On your local machine
1. Create a new local repository
mkdir acme-superset
cd acme-superset
git init
Create a new folder and initialize git. We’ll use this folder as the Dokku application to deploy.
2. Create a config.py
curl https://raw.githubusercontent.com/apache/superset/2.1.0/superset/config.py -o config.py
The config.py in your repo will replace the image’s default config.py.
Start from the default config.py at github.com/apache/superset/blob/VERSION/superset/config.py, where VERSION is the Superset version you deploy (2.1.0 in this example).
To use the database connection string from the DATABASE_URL env var, update the config.py file as follows:
SQLALCHEMY_DATABASE_URI = os.environ.get("DATABASE_URL")
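One caveat, stated here as an assumption rather than something the steps above confirm: the Dokku postgres plugin typically exports DATABASE_URL with the postgres:// scheme, while the SQLAlchemy 1.4 series that Superset 2.1.0 is built on only recognizes postgresql://. If the app fails at startup with a dialect-loading error, a minimal sketch like the following in config.py may help. The helper name normalize_database_url is hypothetical.

```python
import os


def normalize_database_url(url: str) -> str:
    """Rewrite a leading postgres:// scheme to postgresql://; return other URLs unchanged."""
    legacy_prefix = "postgres://"
    if url.startswith(legacy_prefix):
        return "postgresql://" + url[len(legacy_prefix):]
    return url


# Only override the setting when Dokku actually provides the variable,
# so the default value defined earlier in config.py still applies otherwise.
_database_url = os.environ.get("DATABASE_URL")
if _database_url:
    SQLALCHEMY_DATABASE_URI = normalize_database_url(_database_url)
```

URLs that already use postgresql:// (or another dialect entirely) pass through untouched, so the helper should be safe to keep even if the plugin's scheme changes.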
To use the Redis connection string from the REDIS_URL env var, update the config.py file as follows:
# Cache for datasource metadata and query results
DATA_CACHE_CONFIG: CacheConfig = {
"CACHE_TYPE": "RedisCache",
"CACHE_DEFAULT_TIMEOUT": int(timedelta(hours=1).total_seconds()),
"CACHE_KEY_PREFIX": "superset_data_cache_",
"CACHE_REDIS_URL": os.environ.get("REDIS_URL")
}
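If REDIS_URL is not set (for example, because the optional Redis service was skipped), the block above would pass None as the Redis URL. As a hedged alternative, this sketch falls back to Flask-Caching's NullCache in that case. The helper name build_data_cache_config is hypothetical, and a plain dict is used instead of the CacheConfig annotation to keep the snippet self-contained.

```python
import os
from datetime import timedelta
from typing import Any, Dict, Optional


def build_data_cache_config(redis_url: Optional[str]) -> Dict[str, Any]:
    """Return a Redis-backed cache config, or a no-op cache when no URL is available."""
    if not redis_url:
        # NullCache disables caching instead of failing on a missing Redis URL.
        return {"CACHE_TYPE": "NullCache"}
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_DEFAULT_TIMEOUT": int(timedelta(hours=1).total_seconds()),
        "CACHE_KEY_PREFIX": "superset_data_cache_",
        "CACHE_REDIS_URL": redis_url,
    }


# Cache for datasource metadata and query results
DATA_CACHE_CONFIG = build_data_cache_config(os.environ.get("REDIS_URL"))
```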
Configure anything else as needed.
3. Create a Dockerfile
touch Dockerfile
# ./Dockerfile
FROM apache/superset:2.1.0
USER root
# Set my secret key
ENV SUPERSET_SECRET_KEY=SUPER_SECRET_KEY_PLEASE_REPLACE_ME
# Use my config
COPY config.py superset/config.py
# Add database drivers
RUN pip install psycopg2
RUN pip install sqlalchemy-bigquery
# Adds vim to be able to enter the container and read files with vim
RUN apt-get update && apt-get -y install vim
USER superset
The first line selects the Superset version. This example uses 2.1.0, the latest version as of this writing.
Add any database driver you need. This example adds the driver for BigQuery. More drivers are listed in the Superset documentation.
The base image can be found on Docker Hub.
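The SUPERSET_SECRET_KEY value in the Dockerfile above is a placeholder and should be replaced. As one possible way to produce a replacement, this sketch generates a random base64 string with Python's standard library. The function name generate_secret_key and the 42-byte length are illustrative choices, not requirements stated in this guide.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return num_bytes of cryptographically strong randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Print a fresh key to paste in place of SUPER_SECRET_KEY_PLEASE_REPLACE_ME.
    print(generate_secret_key())
```

Keep the generated value stable across deploys; changing it later can make previously encrypted metastore values unreadable.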
4. Add Dokku’s remote repository
git remote add dokku dokku@dokku.acme.com:acme-superset
Where dokku.acme.com is the global domain configured for Dokku and acme-superset is the name of the application in Dokku.
5. Deploy
Commit all files and push to Dokku’s remote.
git add .
git commit -m 'Superset initial configuration'
git push dokku main
The deploy will start and you should see output in the terminal for each of the Dockerfile steps defined earlier.
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 8 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 23.48 KiB | 5.87 MiB/s, done.
Total 4 (delta 0), reused 0 (delta 0), pack-reused 0
-----> Cleaning up...
-----> Building jm-superset from Dockerfile
remote: build context to Docker daemon 66.56kB
Step 1/14 : FROM apache/superset:2.1.0
...
Finally, configure SSL on the Dokku machine via the letsencrypt plugin.
# This is performed in the virtual machine, where Dokku is running
sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
# Set global email for letsencrypt
dokku letsencrypt:set --global email [email protected]
dokku letsencrypt:enable acme-superset
That’s it, open acme-superset.dokku.acme.com in your browser.
Superset is running, and you can now change its configuration, install new drivers, or switch versions by editing the repository and pushing again.
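To check the deployment without a browser, a small script can poll Superset's /health endpoint, which answers with the text OK when the web server is up. This is only a sketch: the function name superset_is_healthy and the opener parameter (there to make the function testable without a network) are hypothetical.

```python
import urllib.request
from typing import Callable


def superset_is_healthy(base_url: str, opener: Callable = urllib.request.urlopen) -> bool:
    """Return True if GET <base_url>/health answers 200 with a body of OK."""
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=10) as response:
            return response.status == 200 and response.read().strip() == b"OK"
    except OSError:
        # URLError and HTTPError are OSError subclasses: DNS failures,
        # refused connections, and non-2xx answers all count as unhealthy.
        return False
```

For the example domain used here, the call would be superset_is_healthy("https://acme-superset.dokku.acme.com").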
Final considerations
The first time you deploy Superset, you may need to migrate the database, create an admin user, and run superset init.
Enter the container by running:
dokku enter acme-superset
Then run:
# Migrate DB (run first so the metastore tables exist)
superset db upgrade
# Create an admin (pick a stronger password than this example)
superset fab create-admin \
--username admin \
--firstname Superset \
--lastname Admin \
--email [email protected] \
--password admin
# Init and setup roles
superset init
Refer to Superset documentation for more information about its configuration.
Happy dashboarding.
I have a Superset instance running smoothly in a DigitalOcean Droplet.