Migrate to Managed Postgres using PeerDB
This guide provides step-by-step instructions on how to migrate your PostgreSQL database to ClickHouse Managed Postgres using PeerDB.
Private preview in ClickHouse CloudPrerequisites
- Access to your source PostgreSQL database.
- A ClickHouse Managed Postgres instance where you want to migrate your data.
- PeerDB installed on a machine. You can follow the installation instructions on the PeerDB GitHub repository. You just need to clone the repository and run
docker-compose up. For this guide, we will be using PeerDB UI, which will be accessible athttp://localhost:3000once PeerDB is running.
Considerations before migration
Before starting your migration, keep the following in mind:
- Database objects: PeerDB will create tables automatically in the target database based on the source schema. However, certain database objects like indexes, constraints, and triggers will not be migrated automatically. You'll need to recreate these objects manually in the target database after the migration.
- DDL changes: If you enable continuous replication, PeerDB will keep the target database in sync with the source for DML operations (INSERT, UPDATE, DELETE) and will propagate ADD COLUMN operations. However, other DDL changes (like DROP COLUMN, ALTER COLUMN) are not propagated automatically. More on schema changes support here
- Network connectivity: Ensure that both the source and target databases are reachable from the machine where PeerDB is running. You may need to configure firewall rules or security group settings to allow connectivity.
Create peers
First, we need to create peers for both the source and target databases. A peer represents a connection to a database. In PeerDB UI, navigate to the "Peers" section by clicking on "Peers" in the sidebar. To create a new peer, click on the + New peer button.
Source peer creation
Create a peer for your source PostgreSQL database by filling in the connection details such as host, port, database name, username, and password. Once you have filled in the details, click on the Create peer button to save the peer.
Target peer creation
Similarly, create a peer for your ClickHouse Managed Postgres instance by providing the necessary connection details. You can get the connection details for your instance from the ClickHouse Cloud console. After filling in the details, click on the Create peer button to save the target peer.
Now, you should see both the source and target peers listed in the "Peers" section.
Create a mirror
Next, we need to create a mirror to define the data migration process between the source and target peers. In PeerDB UI, navigate to the "Mirrors" section by clicking on "Mirrors" in the sidebar. To create a new mirror, click on the + New mirror button.
- Give your mirror a name that describes the migration.
- Select the source and target peers you created earlier from the dropdown menus.
- You may choose to enable continuous replication if you want to keep the target database in sync with the source after the initial migration. Otherwise, under Advanced settings, you can enable Initial copy only to perform a one-time migration.
- Select the tables you want to migrate. You can choose specific tables or select all tables from the source database.
- Once you have configured the mirror settings, click on the
Create mirrorbutton.
You should see your newly created mirror in the "Mirrors" section.
Wait for the initial load
After creating the mirror, PeerDB will start the initial data load from the source to the target database. You can click on the mirror and click on the Initial load tab to monitor the progress of the initial data migration.
Once the initial load is complete, you should see a status indicating that the migration is finished.
Monitoring initial load and replication
If you click on the source peer, you can see a list of running commands which PeerDB is running. For instance:
- Initially we run a COUNT query to estimate the number of rows in each table.
- Then we run a partitioning query using NTILE to break down large tables into smaller chunks for efficient data transfer.
- We then do FETCH commands to pull data from the source database and then PeerDB syncs them to the target database.
Post-migration tasks
After the migration is complete:
- Recreate database objects: Remember to manually recreate indexes, constraints, and triggers in the target database, as these are not migrated automatically.
- Test your application: Make sure to test your application against the ClickHouse Managed Postgres instance to ensure everything is working as expected.
- Clean up resources: Once you are satisfied with the migration and have switched your application to use ClickHouse Managed Postgres, you can delete the mirror and peers in PeerDB to clean up resources.
If you enabled continuous replication, PeerDB will create a replication slot on the source PostgreSQL database. Make sure to drop the replication slot manually from the source database after you are done with the migration to avoid unnecessary resource usage.
References
- ClickHouse Managed Postgres Documentation
- PeerDB guide for CDC creation
- Postgres ClickPipe FAQ (holds true for PeerDB as well)
Next steps
Congratulations! You have successfully migrated your PostgreSQL database to ClickHouse Managed Postgres using pg_dump and pg_restore. You are now all set to explore Managed Postgres features and its integration with ClickHouse. Here's a 10 minute quickstart to get you going: