Creating a Postgres Docker container and connecting dbt
- data
- 2024-05-09
- 5 minutes to read
Introduction
I am currently researching dbt, a data transformation tool that allows you to transform data in your warehouse more effectively. One of the really interesting things about it is that dbt has a set of adapters that allow you to interchange between different database engines.
It is also really easy to get started with local development. I initially tried to get the SQL Server adapter up and running because I am able to run SQL Server locally on my Mac. However, the SQL Server adapter is a community adapter and it’s not so quick to get started, so for now I’m going to work with Postgres.
I’ve chosen to use the Postgres adapter because I can spin up a Postgres database locally in a Docker container in a few minutes. This means I can do dedicated development offline without having to sign up for any cloud services. Here’s how I set things up.
Pre-requisites
Before you can complete the steps below, you need to install and configure Docker.
Create a container instance
The command I’m going to run to create a Postgres container is:
# make sure that Docker is running when you run this command
docker run -d --name dbt-quickstart -p 5432:5432 -e POSTGRES_USER=justin -e POSTGRES_PASSWORD=justin postgres:15-alpine
The parameters are as follows:
- --name dbt-quickstart is the name of the container.
- -p 5432:5432 maps port 5432 on the host to port 5432 in the container.
- -e POSTGRES_USER=justin is the user name.
- -e POSTGRES_PASSWORD=justin is the password (look away, security fans).
- postgres:15-alpine is the image that will be used to create the container.
If you haven’t had a Postgres container up and running before, then Docker will download the image, then create the container.
Unable to find image 'postgres:15-alpine' locally
15-alpine: Pulling from library/postgres
bca4290a9639: Pull complete
73755f5e6a3e: Pull complete
3a60b4a71875: Pull complete
6089f613fb2: Pull complete
3972f5d4acdc: Pull complete
edbb275221bf: Pull complete
fe375eefc236: Pull complete
f8ac95c0dcf9: Pull complete
f89251e2b9f9: Pull complete
Digest: sha256:8a8d55343d6fc456cb183453e3094ff140b984157b36b48f817dd581654f2aec
Status: Downloaded newer image for postgres:15-alpine
8b122e2d8b752c8ef7a32094a897c826193415151bcf0fa9371edafa46e3196a
You can check that the container is running by running docker ps or by reviewing the Docker dashboard.
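The container takes a moment to start accepting connections after it is created. If you want to script against it, something like this stdlib-only sketch (a hypothetical helper, not part of Docker or dbt) can poll the port before you run anything:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # attempt a TCP connection; success means the server is listening
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not up yet, back off briefly and retry
    return False

# e.g. wait_for_port("localhost", 5432) before running dbt commands
```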
Install and configure dbt-postgres
Next I need to install the dbt adapter for postgres:
# installs the dbt Postgres adapter
pip install dbt-postgres
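A quick way to confirm the adapter landed in the right Python interpreter is to check that its module can be found. This is a stdlib-only sketch; dbt.adapters.postgres is the module that the dbt-postgres package installs:

```python
import importlib.util

def adapter_installed(name: str) -> bool:
    """Return True if the named module can be found by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # a missing parent package (e.g. dbt itself) raises rather than returning None
        return False

print("dbt-postgres adapter present:", adapter_installed("dbt.adapters.postgres"))
```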
Run dbt
I’m using the jaffle-shop sandbox project to review dbt. All I need to do from here is set up my profile in the profiles.yml file. I want this project to be built within my new Postgres database, so the profiles.yml file will look like this:
quickstart-postgres:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: justin
      password: justin
      port: 5432
      dbname: jaffle
      schema: dbo
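The same values can be reused when connecting from other tools, since they map onto a standard Postgres connection URL. A minimal sketch, assuming the credentials above:

```python
# the connection details from profiles.yml, as a plain dict
profile = {
    "host": "localhost",
    "user": "justin",
    "password": "justin",
    "port": 5432,
    "dbname": "jaffle",
}

# assemble a libpq-style URL: postgresql://user:password@host:port/dbname
url = "postgresql://{user}:{password}@{host}:{port}/{dbname}".format(**profile)
print(url)  # postgresql://justin:justin@localhost:5432/jaffle
```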
By updating the profile in the dbt_project.yml file to quickstart-postgres, I can now run dbt commands to build and test the project against my Postgres database.
Accessing the database in Azure Data Studio
I can also connect to the database using Azure Data Studio. I can see the tables that have been created by dbt as a result of running the dbt seed command, so I know that everything is working.
And so from there I can run dbt run to build the project:
dbt run

# output
Running with dbt=1.7.14
Registered adapter: postgres=1.7.14
Found 13 models, 6 seeds, 6 sources, 0 exposures, 19 metrics, 651 macros, 0 groups, 6 semantic models
Concurrency: 1 threads (target='dev')

1 of 13 START sql table model dbo.metricflow_time_spine ........................ [RUN]
1 of 13 OK created sql table model dbo.metricflow_time_spine ................... [SELECT 3651 in 0.14s]
2 of 13 START sql view model dbo.stg_customers ................................. [RUN]
2 of 13 OK created sql view model dbo.stg_customers ............................ [CREATE VIEW in 0.06s]
3 of 13 START sql view model dbo.stg_locations ................................. [RUN]
3 of 13 OK created sql view model dbo.stg_locations ............................ [CREATE VIEW in 0.05s]
4 of 13 START sql view model dbo.stg_order_items ............................... [RUN]
4 of 13 OK created sql view model dbo.stg_order_items .......................... [CREATE VIEW in 0.04s]
5 of 13 START sql view model dbo.stg_orders .................................... [RUN]
5 of 13 OK created sql view model dbo.stg_orders ............................... [CREATE VIEW in 0.05s]
6 of 13 START sql view model dbo.stg_products .................................. [RUN]
6 of 13 OK created sql view model dbo.stg_products ............................. [CREATE VIEW in 0.05s]
7 of 13 START sql view model dbo.stg_supplies .................................. [RUN]
7 of 13 OK created sql view model dbo.stg_supplies ............................. [CREATE VIEW in 0.05s]
8 of 13 START sql table model dbo.locations .................................... [RUN]
8 of 13 OK created sql table model dbo.locations ............................... [SELECT 6 in 0.04s]
9 of 13 START sql table model dbo.products ..................................... [RUN]
9 of 13 OK created sql table model dbo.products ................................ [SELECT 10 in 0.04s]
10 of 13 START sql table model dbo.order_items ................................. [RUN]
10 of 13 OK created sql table model dbo.order_items ............................ [SELECT 90900 in 0.15s]
11 of 13 START sql table model dbo.supplies .................................... [RUN]
11 of 13 OK created sql table model dbo.supplies ............................... [SELECT 65 in 0.03s]
12 of 13 START sql table model dbo.orders ...................................... [RUN]
12 of 13 OK created sql table model dbo.orders ................................. [SELECT 61948 in 0.33s]
13 of 13 START sql table model dbo.customers ................................... [RUN]
13 of 13 OK created sql table model dbo.customers .............................. [SELECT 935 in 0.09s]
Finished running 7 table models, 6 view models in 0 hours 0 minutes and 1.31 seconds (1.31s).
Completed successfully
This builds the project in the Postgres database.
Conclusion
If you want to explore dbt and you don’t want to go through the hassle of setting up a cloud database resource, then this is a pretty quick way to get started. I do like the idea of being able to develop and test the entire solution offline. There is also the option to connect to DuckDB, but I haven’t tried that yet.