Overview
In this abbreviated exercise, you will be focused on connecting a subset of our checkin project’s REST API to a true database-backed data layer. We will maintain a strong separation of concerns between the FastAPI HTTP routing, request, and response handling and the SQLAlchemy’s data persistence concerns. The shared vocabulary of “data transfer objects” enabling these layers to engage with one another will be models defined in Pydantic. These architectural decisions were discussed in Class 21 on Backend Layers.
Unlike the SQLAlchemy tutorial, in this exercise you will run a full PostgreSQL database with long-lived persistence. We will set this database up together as part of the exercise. The database will run as a separate Docker container from your service, just as it would in production, but it will all be managed by the DevContainer setup we establish.
This exercise is designed to be a solo exercise and everyone needs to create their own repository and submission. However, you are welcomed to collaborate with your final project teammates and work together through each of these steps as long as everyone is completing the work on their own machines.
Getting Started
To begin, accept the following project on GitHub Classroom: https://classroom.github.com/a/tVXzSuOh
Next, from outside of VSCode on your host machine, in a terminal, clone your project. We have seen issues with students attempting to clone DevContainer projects from within VSCode. These problems can be avoided if you begin by cloning your repository on your host machine and then open this folder in VSCode. It will subsequently ask you if you want to open in a DevContainer. You can go ahead and do so to build the container image.
Establishing Environment Variables
In order to establish our database, we need to provide some configuration settings. It is considered common and best practice to pass configuration settings to our servers via environment variables. Environment variables are passed into processes from their “environment”; typically the shell, but we will see another handy way in .env
files. You have encountered environment variables in your shell already: PATH
is one such. We will learn more about environment variables and why they are important in cloud-deployed 12-factor applications soon.
Adding a .env
file to backend
Environment variables can be clunky to manage and share via the terminal alone. Often you would like environment variables to be scoped to the projects you are working on rather than global to your user account or system. A common modern convention is defining environment variables in a dotfile named .env
.
The .env
file is a common place where secrets and system-specific configuration values are stored. Thus, it is important to be sure .env
is already listed as an entry in your project’s .gitignore
file. You do not want to commit .env
files to the history of your repository.
Now that you have confirmed the .gitignore
entry, from within your backend
directory, create a file named .env
. This is where we’ll establish some important environment variables for the backend database configuration. Copy paste the following contents:
MODE=development
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_HOST=db
POSTGRES_PORT=5432
POSTGRES_DATABASE=comp423
It is worth noting your shell program does not inherently pay attention or read .env
files, but soon we will see how Docker and our application can make use of them.
Save your .env
file and continue on.
Adding the Postgres Database Service to Docker Compose
Your DevContainer’s .devcontainer/docker-compose.yml
file specifies only a single container in your project: your development container for backend work. Let’s add an additional container to our project to run Postgres.
In doing so, we’ll be introducing a Docker Compose “service”. This is the yet another overloaded usage of the term service
you have seen in the course. In this context, a Docker “Service” in a single compose project refers to a container we can access by its name from other containers in the project.
The service container you’re about to setup will be named db
. Its image will be based on a modern release of PostgreSQL. It will be given environment variables via the .env
file you just established. These environment variables will specify the default username and password to access the database. These are kept simple (postgres/postgres) because we are working in a development environment on your machine only. In production, which we are not concerned with in this exercise, you’ll choose a randomized password.
Finally, this service needs to be backed by a persistent file system such that when you restart your project all of the data remains saved to disk. We will use a feature of Docker called a “Storage Volume” and attach storage to the container at the directory PostgreSQL expects to save its data.
Define the db
Service
Open .devcontainer/docker-compose.yml
. Under services, you will see the httpd
entry that defines our developer server container. Next, you’ll add a sibling to this service named db
. Add the following entry:
db:
image: postgres:15.2
restart: unless-stopped
env_file:
- ../backend/.env
ports:
- '5432:5432'
volumes:
- db-data:/var/lib/postgresql/data
In terms of indentation levels, db
should be at the same as httpd
, and image
should be at the same level as build
above in httpd
. Indentation is important in YAML files, so be careful yours is correct here to avoid problems later.
Notice under volumes
we’re listing a volume named db-data
that will be mounted into our db
container’s /var/lib/postgresql/data
directory; the place where PostgreSQL saves its data. Next, you’ll establish this volume:
Define the db-data
Storage Volume
Add the following to your .devcontainer/docker-compose.yml
file at the top-level of indentation; volumes
should be at the same level as services
.
volumes:
db-data:
We are not overriding any of the default storage volume settings for db-data
, so this looks a bit silly, but do notice we are explicitly declaring db-data
to be the name of a volume in our project. The db
service defined above’s reference of db-data
is what connects it to this entry. Once we rebuild and restart the devcontainer, you’ll be able to see this volume from within the Docker dashboard.
Rebuild Container
After reconfiguring our docker-compose.yml
file (or Dockerfile
, though no changes should be necessary in it), you will need to rebuild your container for the updates to land in your DevContainer. Open the Command Palette and run “Dev Containers: Rebuild Container”. Your project should reopen without issue, but if you have problems select reopen on host machine and return to the steps above. You may also scan the error output to see if you can find the error that led to the restart not working. The three places it could be would be the .env
file or the other steps above.
Open your Docker Dashboard and view your Containers tab. You should see the ex04
stack and if you expand it you’ll see db
and httpd
. Both should be running.
Now go to the Volumes tab of the dashboard and notice the storage volume that was created for your database data to be persistent.
Create Database in PostgreSQL
Now we need to create a Database for our app. The PostgreSQL server is capable of hosting many databases, each with their own tables and other resources, independent of one another.
Back in your VS Code Dev Container, open up a terminal and run the following commands. After the first command, type in postgres
at the password prompt.
psql --host=db --user=postgres
$
postgres=# CREATE DATABASE comp423;
postgres-# \q
When prompted for a password, enter postgres
.
Here we’re using the de facto command-line client for connecting to a PostgreSQL server: psql
. Notice the db
name of the host corresponds to what we named this service in the docker-compose file. Technically, this is a DNS host, just like www.unc.edu
. One of the features of docker-compose
is that it has a system for resolving these hostnames to the correct containers once they are started. Also notice the user
and password
are what you specified as environment variables in your .env
file. When the PostgreSQL service began for the first time, it used the environment variables to assign the username/password for the system by convention.
The second line is a SQL statement to create a database named comp423
within your db
container’s PostgreSQL server. The third line \q
exits the psql
client REPL.
Database Code Review
There are two source code files to review ahead of the next step:
backend/database.py
backend/env.py
backend/entities.py
Notice in database.py
, we are reading the environment variables you established in .env
in order to load the configuration necessary to create a connection string for SQLAlchemy. You’ll additionally notice the function getenv
is being imported from backend/env.py
. Take a quick gander at this file. You will see the getenv
function is merely passed through from the standar library’s os
(operating system) package. However, you will also notice this module imports and calls a function named load_dotenv
from the dotenv
third party library (installed from requirements.txt
). This 3rd party library is what is responsible for our .env
file’s contents to be loaded as environment variables. Without doing this, the contents of .env
would other wise be ignored. Back in database.py
, you’ll see the engine
being established with SQL echo’ing turned on, and a function definition named db_session
. The db_session
function is peculiar in its usage of yield
, making it a generator function. We will not go into depth on this detail, but if you would like to learn more please read about dependencies with yield. This function will produce Session
objects for our backend services layer to depend upon.
Next open entities.py
and notice this is the same simple UserEntity
discussed in class. It has some helper methods defined for going back and forth between pydantic
models (found in models.py
).
Scripting Developer Database Schema Reset and Sample Data
As a quick recap, you now have a running PostgreSQL container and a database named comp423
in it. However, you do not yet have a users
table within this database, nor any data in the table. Let’s write a quick script that will be easy to run to reset our developer database tables.
Create a directory named script
and add a file to it named reset_db.py
. Go ahead and add the contents below and complete the TODO. For the mock UserEntity
object you are constructing and adding to the database, perhaps choose your own pid
, first_name
, and last_name
to initialize the object with.
import database
from entities import Base, UserEntity
# Reset Tables
Base.metadata.drop_all(database.engine)
Base.metadata.create_all(database.engine)
# Enter Mock Data
from sqlalchemy.orm import Session
= Session(database.engine)
session
# TODO: Add a UserEntity to the database session and commit it.
After you have completed the TODO
, which you can refer to your notes on SQLAlchemy for the steps involved, save your file and try running the following command to run your script.
cd backend
$ python3 -m script.reset_db $
You should be able to wade through the SQL echoed out by SQLAlchemy and see the drop table, create table, and insert statements being issued. This script can be re-run, since it attempts to whipe the database clean (drop all) before building it back up. However, do note that these kinds of actions truly do whipe your database’s data. This is not an issue for this exercise, but in future scenarios or production management, you want to be very careful with commands and scripts like this!
Run the Backend FastAPI Server and open /docs
We will not have a front-end in this exercise as the focus is connecting the HTTP layer with the business logic / database layer. We will only have the OpenAPI /docs
facilities provided by FastAPI.
Go ahead and start your development server. From within a pwd
of backend
, you can start the uvicorn
server as usual: uvicorn main:app --reload
. Then navigate your browser to localhost:8000/docs
. You should see the four routes defined in main.py
. Only one of them will actually work: /api/users
.
If you are not seeing the correct backend, go back to VSCode and check the “PORTS” tab in the same pane as your terminal. You should a green dot next to port 8000
on your container and may see that your Local Address was mapped to a different port, in which case, you should navigate to that address with /docs
appended.
Try executing the route /api/users
. If all is well up to this point in the exercise, you should see the response include the mock data you established in reset_db.py
. If you do not, then you should not proceed any further as some step above failed. If you are seeing an error, it likely has to do with database creation or table creation. If you are seeing no error but an empty array response body, then your UserEntity
in reset_db.py
was not successfully inserted and committed to the database. Resolve these issues first.
Your Task
Your task in this exercise is to complete the three remaining route implementations in main.py
, for routes new_user
, get_user
, and delete_user
. These routes should persist and retrieve their data from the PostgreSQL database. The concerns of database interactions should be defined in user_service.py
. Here, between the two files, you will see a complete end-to-end implementation for the route you just tested: get_users
depends on UserService#all
.
As discussed in class, the parameter and return types of your service should be User
models. All database concerns, and UserEntity
usage, should be encapsulated in your UserService
class. One simple rule of thumb for this is looking at the imports of main.py
versus user_service.py
. Notice main.py
does not import any functionality from sqlalchemy
– it is implemented in terms of user_service
.
Code Read: Dependency Injection
Notice a new pattern in the parameters of the routes, which you have seen use FastAPI’s conventional dependency injection features. A new feature, which is demonstrated in each of the four route functions, is the ability to provide custom dependency injectors. Each of the functions has a parameter of user_service: UserService = Depends()
. The Depends
default is FastAPI’s convention and system for specifying custom dependencies. This particular dependency is indicating FastAPI needs to construct a new instance of UserService
before each route evaluation (which happens per each request from a web client).
Next, open user_service.py
and notice the definition of UserService#__init__
. Specifically, notice that it has a parameter with signature session: Session = Depends(db_session)
. Recall, db_session
is a special function defined in database.py
. In essence, this dependency is telling FastAPI that when it injects a UserService
it needs to first call db_session
to grab a fresh Session
and pass it in as an argument to the constructor of UserService
.
Thus, thanks to the FastAPI framework, when you request any of these API end-points, FastAPI is constructing this object graph and passing it into the function being routed to as a parameter. This has three distinct benefits. First, it alleviates you from needing to construct these objects manually in code for each route. Second, when it comes time to write tests, we will have a simple way of passing in our own arguments with test stubs and mocks rather than rely on dependency injection. Third, thanks to FastAPI’s conventions, we can define our dependencies to automatically clean up for themselves once the route has responded to the client. In our example, the session with the database is closed automatically in the database.py/db_session
function. Dependency injection is a common, modern pattern seen widely throughout many frameworks (you saw this in Angular, too!)
Technical Requirements
There are a few technical requirements for the routes you implement and the backing service layer:
- All
UserService
methods should retain and adhere to their method signatures’ parameter and return types. new_user
route should result in a 422 response if there is an error experienced in the creation of the user (such as attempting to register a duplicate PID, which the database will refuse and raise an error for you to handle inmain.py
).get_user
should result in a 404 response if the user cannot be found in the database.delete_user
should result in a 404 response if there is an issue deleting a user by their PID.
As you complete each route and its backing service method, you should be able to interactively test persistence in your database via the OpenAPI docs interface. Further, you can fully close your uvicorn
server, start it back up, and restart your Dev Container and host machine, and your data from earlier sessions will be retained thanks to finally having a truly persistent layer in the PostgreSQL database.
Hand-in
Hand-in instructions to follow soon.