Test drive PostgreSQL with wait event probes on Docker
This is a short blog introducing a docker build file that I created to quickly build a container for test driving PostgreSQL with wait event probes; see databaseperformance.hashnode.dev/timed-wait.. about that.
To create a container, you should have docker setup, which can be validated using the docker info
command:
➜ docker info
Client: Docker Engine - Community
Version: 24.0.6
Context: colima
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.2
Path: /Users/fritshoogland/.docker/cli-plugins/docker-buildx
Server:
...
If docker is not setup correctly you see:
➜ docker info
Client: Docker Engine - Community
Version: 24.0.6
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.2
Path: /Users/fritshoogland/.docker/cli-plugins/docker-buildx
Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info
If docker is setup, git clone the postges-bpftrace repository:
git clone https://github.com/FritsHoogland/postgres-bpftrace.git
And cd to postgres-bpftrace/docker
. Now run the script build
, which is a simple wrapper script for the docker build command:
➜ ./build
This will create the container for you.
All that is needed after the build is to run the container:
➜ ./run
postgres@10fc6773997c:~$
Now start the database:
postgres@10fc6773997c:~$ ./start_database
waiting for server to start.... done
server started
postgres@10fc6773997c:~$
And run psql
:
postgres@10fc6773997c:~$ psql
psql (17devel)
Type "help" for help.
postgres=#
To use bpftrace, use another shell and execute into the container using the exec
script, which also can be found in the postgres-pbftrace/docker
directory:
➜ ./exec
postgres@10fc6773997c:~$
Now you can get histograms from wait events per pid for example:
postgres@10fc6773997c:~$ sudo ./postgres-bpftrace/wait-histograms-pid.bt
Attaching 183 probes...
Wait event times are in microseconds(us).
Perform an action in the psql
session, such as:
postgres=# select count(*) from generate_series(1,1000000);
count
---------
1000000
(1 row)
postgres=#
And interrupt the bpftrace session:
^C
@wait_event_hist[3745, io, DATA_FILE_READ]:
[4, 8) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8, 16) 12 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16, 32) 2 |@@@@@@@@ |
@wait_event_hist[3745, io, BUFFILE_TRUNCATE]:
[4, 8) 1058 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[8, 16) 647 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[16, 32) 3 | |
[32, 64) 1 | |
@wait_event_hist[3745, io, BUFFILE_READ]:
[2, 4) 1679 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4, 8) 29 | |
[8, 16) 2 | |
postgres@10fc6773997c:~$
What you can see is there are 3 distinct waits, all from the io
class, and all from pid 3745
. The times are in microseconds, and the latency times are incredibly low. That is because these calls all do perform a system call for IO, but the requests are fully serviced from the operating system file cache.
(word of warning: BUFFILE_TRUNCATE
should be BUFFILE_WRITE
, but these are switched from the file that lists the events (?))