This page lays out the steps to upgrade Postgres using
pg_dump
and pg_restore.
This is a generic document, which you can use to upgrade from any version of Postgres to any other version.
We also cover Azure Postgres briefly, along with some ways to ensure your services don't update the database during the upgrade.
This upgrade procedure requires downtime.
1 | Task | Command | Comments |
2 | Create a new Postgres instance | Choose Postgres version 10 or above (preferably 11) with the same configuration as the existing instance on Azure. You are free to have independent instances or a single instance for all services, as per your need. Whitelist all the IPs as per the existing Postgres. Test the connection by connecting to the instance from Jenkins. If on a VM, create a new VM (preferably Ubuntu 18 or above) and run the Postgres provision role. Note: This document does not cover how to upgrade Postgres on the same / existing VM. Feel free to contribute those steps to this document. | A new instance of the same configuration should be created. We have used Postgres version 11 |
3 | Update the inventory | Search and update all the variables in common.yml, secrets.yml and kubernetes.yml in KP, DP and Core. Replace the Postgres instance address, Postgres user and Postgres password with the new details | It will be simpler if you use the same username and password for all DBs on that instance. If you want to use a different username and password for each DB on an instance, you need to ensure you create those users and roles beforehand, either by running provision jobs against your Postgres instance or manually. This document does not cover how to restore user accounts; for that, refer to the Azure docs: https://docs.microsoft.com/en-us/azure/postgresql/how-to-upgrade-using-dump-and-restore |
4 | Start of downtime - stop traffic | Stop traffic and put Jenkins into maintenance mode so nobody can deploy jobs. Cut off connections to Postgres from all services except Jenkins (in Azure, remove all connecting subnets except Jenkins and disable connections from Azure services as well). On a VM you have multiple ways, e.g. disabling outside connections by editing the Postgres access configuration. List of services that are going to be affected: Analytics, APIManager, Druid, ENC, fusionauth, gql, Hawkeye Superset, Inbound, InternalKong, Keycloak, Learner, LMS, odk, Orchestrator, Outbound, Report, Superset, Transformer, UCI | The respective services should be stopped / Postgres reachability should be removed. Traffic can be stopped based on the situation, e.g. by removing the nginx daemonset or changing the service port to something other than 80 / 443 |
5 | Manually list all the DBs first | | All DBs should be listed |
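The command for this step can be sketched as below, assuming a hypothetical current-instance address `old-pg.example.com` and the `postgres` admin user (replace both with your actual values; the password can be supplied via `PGPASSWORD` or `~/.pgpass`):

```shell
# Assumptions: old-pg.example.com is the current instance, postgres is the
# admin user. Replace both with your actual connection details.
PGHOST=old-pg.example.com
PGUSER=postgres
# \l lists all databases on the instance.
psql -h "$PGHOST" -U "$PGUSER" -d postgres -c '\l'
```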
6 | Run a command to list all the DBs and store them in a file. Run a command to list all user accounts and store them in a file | | Verify that all DBs are present in the file and match our DBs |
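A minimal sketch of both commands, again assuming the hypothetical host `old-pg.example.com`; the output file names `dbs.txt` follows the listing shown later in this document, while `users.txt` is an assumed name:

```shell
PGHOST=old-pg.example.com   # assumption: your current instance
PGUSER=postgres
# Non-template database names, one per line, into dbs.txt.
psql -h "$PGHOST" -U "$PGUSER" -d postgres -At \
  -c "SELECT datname FROM pg_database WHERE datistemplate = false;" > dbs.txt
# Role names (excluding built-in pg_* roles) into users.txt.
psql -h "$PGHOST" -U "$PGUSER" -d postgres -At \
  -c "SELECT rolname FROM pg_roles WHERE rolname NOT LIKE 'pg_%';" > users.txt
```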
7 | Take a backup of the DBs from the current instance | | Replace the placeholders with your values |
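One way to take the backups, assuming the `dbs.txt` file from the previous step and a hypothetical `backups/` output directory; the directory dump format (`-Fd`) is used so that `pg_restore` can later restore in parallel:

```shell
PGHOST=old-pg.example.com   # assumption: your current instance
PGUSER=postgres
mkdir -p backups
# Dump each database listed in dbs.txt in directory format (-Fd);
# -j 4 dumps four tables at a time.
while read -r db; do
  pg_dump -h "$PGHOST" -U "$PGUSER" -Fd -j 4 -f "backups/$db" "$db"
done < dbs.txt
```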
8 | Get the count of all tables from all DBs on the current instance | | All tables and counts will be displayed and also written into the files |
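A sketch that produces the `<db>-tables.txt` and `<db>-table-count.txt` files referred to later in this document, assuming `dbs.txt` from the earlier step (table names containing hyphens or capitals would need extra quoting):

```shell
PGHOST=old-pg.example.com   # assumption: your current instance
PGUSER=postgres
while read -r db; do
  # List user tables for this DB into <db>-tables.txt.
  psql -h "$PGHOST" -U "$PGUSER" -d "$db" -At \
    -c "SELECT schemaname || '.' || tablename FROM pg_tables
        WHERE schemaname NOT IN ('pg_catalog','information_schema');" \
    > "${db}-tables.txt"
  # Exact row count per table, displayed and written to <db>-table-count.txt.
  while read -r tbl; do
    count=$(psql -h "$PGHOST" -U "$PGUSER" -d "$db" -At -c "SELECT count(*) FROM $tbl;")
    echo "$tbl $count"
  done < "${db}-tables.txt" | tee "${db}-table-count.txt"
done < dbs.txt
```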
9 | Rearrange the DBs across two instances, or as per your requirement | In this case, we will be creating the following databases across two instances. Instance 1 - Keycloak, Public Kong, Private Kong, Quartz, Enc Keys. Feel free to add any other DBs you have / want to; this is not an exhaustive list. Create two directories, one per instance (pg11 and pg11-dp). Move all the required backup folders and files into the corresponding directory, and do the same for the other DBs and the pg11-dp folder (second instance). If you are going to use only one instance, you can move all the backup data and other files under a single directory | Instance 1: pg11/dbs.txt lists api_manager_loadtest_kong14, keycloak7, quartz, api_manager_internal, loadtest-keys; pg11 contains each DB's backup folder plus its <db>-tables.txt and <db>-table-count.txt files, and dbs.txt. Instance 2: pg11-dp/dbs.txt lists analytics, druid-raw, graphite, superset; pg11-dp is laid out the same way |
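A sample sketch of the rearrangement, using the pg11 / pg11-dp directory names and DB names from the listing above (the `backups/` source directory is an assumption); repeat the `mv` for each DB going to that instance:

```shell
# One directory per target instance (names taken from this document).
mkdir -p pg11 pg11-dp
# Example move for one DB; repeat for each DB and for the pg11-dp instance.
mv backups/keycloak7 keycloak7-tables.txt keycloak7-table-count.txt pg11/ 2>/dev/null
# Record which DBs live on which instance.
printf '%s\n' api_manager_loadtest_kong14 keycloak7 quartz api_manager_internal loadtest-keys > pg11/dbs.txt
printf '%s\n' analytics druid-raw graphite superset > pg11-dp/dbs.txt
```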
10 | Install Postgres 11 tools | | Install the client tools that match the target server version, so pg_dump / pg_restore are compatible |
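On Ubuntu, one way to get the version-11 client tools is from the PGDG apt repository; a sketch, assuming a Debian/Ubuntu host with sudo access:

```shell
# Add the PGDG repository and install the Postgres 11 client tools
# (assumes Ubuntu with sudo; adjust for your distribution).
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get install -y postgresql-client-11
pg_restore --version   # verify the installed client version
```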
11 | Create empty databases in the new instance | | Empty databases should be created on the new Postgres instance. Add double quotes if you have a hyphen in the DB name |
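A sketch, assuming a hypothetical new-instance address `new-pg.example.com` and the `pg11/dbs.txt` file from the rearrangement step:

```shell
PGHOST=new-pg.example.com   # assumption: the new instance address
PGUSER=postgres
while read -r db; do
  # Double quotes protect names containing hyphens (e.g. "druid-raw").
  psql -h "$PGHOST" -U "$PGUSER" -d postgres -c "CREATE DATABASE \"$db\";"
done < pg11/dbs.txt
```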
12 | Restore the DBs to the new instance | | Replace the placeholders with your values. Some errors during restore are expected and can be ignored |
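A sketch of the restore, assuming the directory-format dumps under `pg11/` from the backup step and the hypothetical new host; `--no-owner` and `--no-privileges` avoid errors about roles that do not exist on the new instance:

```shell
PGHOST=new-pg.example.com   # assumption: the new instance address
PGUSER=postgres
while read -r db; do
  # -j 4 restores four jobs in parallel from the directory-format dump.
  pg_restore -h "$PGHOST" -U "$PGUSER" -d "$db" \
    --no-owner --no-privileges -j 4 "pg11/$db"
done < pg11/dbs.txt
```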
13 | Analyze the DBs on the new instance | | Verbose logs will be displayed |
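One way to refresh planner statistics after the restore, assuming the same hypothetical new host and the `pg11/dbs.txt` file:

```shell
PGHOST=new-pg.example.com   # assumption: the new instance address
PGUSER=postgres
while read -r db; do
  # --analyze-only updates statistics without vacuuming; --verbose shows progress.
  vacuumdb -h "$PGHOST" -U "$PGUSER" --analyze-only --verbose -d "$db"
done < pg11/dbs.txt
```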
14 | Get the count of all tables from all DBs on the new instance | | All tables and counts will be displayed and also written into the files |
15 | Compare the row counts of both instances | | No output means the files are identical. If there are differences, compare the files in detail to find the mismatching tables |
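A sketch of the comparison, assuming the old-instance count files live under `pg11/` and the new-instance ones under a hypothetical `pg11-new/` directory with matching file names:

```shell
OLD_DIR=pg11       # counts captured from the old instance
NEW_DIR=pg11-new   # assumption: counts captured from the new instance
# diff prints nothing when a pair of files is identical.
for f in "$OLD_DIR"/*-table-count.txt; do
  diff "$f" "$NEW_DIR/$(basename "$f")" && echo "$(basename "$f"): OK"
done
```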
16 | Remove all connections to the old Postgres | Remove all subnets and enable "Deny public network access" on the Azure portal. If on a VM, stop the VM / Postgres service | To ensure no service can connect to the old DBs |
17 | Clear unnecessary data | Clear offline sessions from the Keycloak DB | If we have millions of rows in these tables, Keycloak will not start |
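A sketch of clearing the offline-session tables, using the `keycloak7` DB name from the listing earlier in this document and the hypothetical new host; note that truncating these tables forces users to log in again:

```shell
PGHOST=new-pg.example.com   # assumption: the new instance address
PGUSER=postgres
# Keycloak stores offline sessions in these two tables; truncating them
# lets Keycloak start quickly when they hold millions of rows.
psql -h "$PGHOST" -U "$PGUSER" -d keycloak7 \
  -c "TRUNCATE offline_client_session, offline_user_session;"
```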
18 | End of downtime - redeploy services | Redeploy the following services: Analytics, APIManager, Provision/DataPipeline/Druid (choose all in service except java), ENC, fusionauth, gql, HawkeyeSuperset, Inbound, Deploy/DataPipeline/InternalKong, Keycloak, Learner, LMS, odk, Orchestrator, Outbound, Report, Superset, Transformer, UCI | The configmaps and configuration files will be updated with the new data. Verify this by checking them manually for all these services |