CATEGORY : Waiting for 9.2
TITLE : Cascading streaming replication
HEADER :
PostgreSQL's built-in replication system is simple and solid. But so far, it has lacked a key feature: cascading replication. With PostgreSQL 9.2, you'll be able to design sophisticated replication clusters.
TEXT :
On 19th of July, Simon Riggs committed this patch:
Cascading replication feature for streaming log-based replication. Standby servers can now have WALSender processes, which can work with either WALReceiver or archive_commands to pass data. Fully updated docs, including new conceptual terms of sending server, upstream and downstream servers. WALSenders terminated when promote to master.
Fujii Masao, review, rework and doc rewrite by Simon Riggs
Streaming replication is relatively new, added in 9.0. Since the beginning it shared a limitation with normal WAL-files based replication, in that there is only one source of data: that is master. While it makes sense, it is also pretty cool to be able to make a slave a source of replication for some other systems; for example, not to keep the master occupied with such tasks.
Now, with the patch, we can set up a replication schema like this:
So, let’s test it.
To make it work I will need a master database and 3 slaves, made off the master. Seems simple enough.
=$ mkdir master =$ initdb -D master ... =$ vim master/postgresql.conf
In postgresql.conf, I change:
port = 4001 wal_level = hot_standby checkpoint_segments = 20 archive_mode = on archive_command = '/bin/true' max_wal_senders = 3 wal_keep_segments = 100 logging_collector = on log_checkpoints = on log_connections = on log_line_prefix = '%m %r %u %d %p: '
I also set pg_hba.conf to something that matches my test environment:
# TYPE DATABASE USER ADDRESS METHOD local replication all trust local all all trust host all all 127.0.0.1/32 trust
With the master prepared that way, I can start it:
=$ pg_ctl -D master start
server starting
=$ psql -p 4001 -d postgres -c "select version()"
version
------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2devel on x86_64-unknown-linux-gnu, compiled by gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 64-bit
(1 row)
All looks ok.
One note – you might not understand why I used /bin/true as archive_command. The reason is very simple – archive_command has to be set to something, otherwise archive_mode cannot be enabled, and this will cause problems with backups; but on the other hand I will not need to use the WAL archive, since I have a pretty large value for wal_keep_segments.
Now, we’ll set up the slaves. Starting with the first one of course:
=$ psql -p 4001 -d postgres -c "select pg_start_backup('whatever')"
pg_start_backup
-----------------
0/2000020
(1 row)
=$ rsync -a master/ slave/
=$ psql -p 4001 -d postgres -c "select pg_stop_backup()"
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
pg_stop_backup
----------------
0/20000D8
(1 row)
Of course we need some tidying of slave:
=$ rm -f slave/pg_xlog/???????????????????????? slave/pg_xlog/archive_status/* slave/pg_log/* slave/postmaster.pid
=$ vim slave/postgresql.conf
port = 4002 hot_standby = on
And I also create recovery.conf in slave/, with this content:
restore_command = '/bin/false' standby_mode = 'on' primary_conninfo = 'port=4001 user=depesz' trigger_file = '/tmp/slave.finish.recovery'
With this in place I can start slave:
=$ pg_ctl -D slave start server starting =$ head -n 1 slave/postmaster.pid | xargs -IPG ps uwf -p PG --ppid PG USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND depesz 13082 1.5 0.0 66484 7492 pts/3 S 12:51 0:00 /home/pgdba/work/bin/postgres -D slave depesz 13083 0.0 0.0 26136 716 ? Ss 12:51 0:00 \_ postgres: logger process depesz 13084 0.0 0.0 66556 1428 ? Ss 12:51 0:00 \_ postgres: startup process recovering 000000010000000000000006 depesz 13087 2.7 0.0 81504 3064 ? Ss 12:51 0:00 \_ postgres: wal receiver process streaming 0/6000078 depesz 13091 0.0 0.0 66484 1012 ? Ss 12:51 0:00 \_ postgres: writer process depesz 13092 0.0 0.0 26132 896 ? Ss 12:51 0:00 \_ postgres: stats collector process =$ head -n 1 master/postmaster.pid | xargs -IPG ps uwf -p PG --ppid PG USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND depesz 12981 0.2 0.0 66456 7520 pts/3 S 12:50 0:00 /home/pgdba/work/bin/postgres -D master depesz 12982 0.0 0.0 26140 724 ? Ss 12:50 0:00 \_ postgres: logger process depesz 12984 0.0 0.0 66456 1016 ? Ss 12:50 0:00 \_ postgres: writer process depesz 12985 0.0 0.0 66456 1012 ? Ss 12:50 0:00 \_ postgres: wal writer process depesz 12986 0.0 0.0 67296 2096 ? Ss 12:50 0:00 \_ postgres: autovacuum launcher process depesz 12987 0.0 0.0 26136 732 ? Ss 12:50 0:00 \_ postgres: archiver process depesz 12988 0.0 0.0 26136 1040 ? Ss 12:50 0:00 \_ postgres: stats collector process depesz 13088 0.3 0.0 67428 2480 ? Ss 12:51 0:00 \_ postgres: wal sender process depesz [local] streaming 0/6000078
One note – I used “user=depesz” in primary_conninfo, because I run the tests on the depesz system account, so initdb made a superuser named depesz, not postgres.
We now have replication between master and slave set up, and so we can test it:
=$ psql -p 4001 -d postgres -c “create table i (x int4)”; psql -p 4002 -d postgres -c '\d i'
CREATE TABLE
Table "public.i"
Column | Type | Modifiers
--------+---------+-----------
x | integer |
All looks OK. Now, we can add slave2 and slave3. Since I’m lazy, I will just stop slave, copy slave to slave2 and slave3 and then modify them:
=$ pg_ctl -D slave stop waiting for server to shut down.... done server stopped =$ rsync -a slave/ slave2/ =$ rsync -a slave/ slave3/ =$ pg_ctl -D slave start server starting
slave2 and slave3 will be basically the same as slave, but with different ports, and connecting to 4002 (slave) instead of 4001 (master) for their WAL. So, let’s do the changes:
=$ perl -pi -e 's/port = 4002/port = 4003/' slave2/postgresql.co
=$ perl -pi -e 's/port = 4002/port = 4004/' slave3/postgresql.conf
=$ perl -pi -e 's/port=4001/port=4002/' slave{2,3}/recovery.conf
=$ perl -pi -e 's/slave.finish.recovery/slave2.finish.recovery/' slave2/recovery.conf
=$ perl -pi -e 's/slave.finish.recovery/slave3.finish.recovery/' slave3/recovery.conf
Now, let’s start them and see the processes:
=$ for a in slave2 slave3; do pg_ctl -D $a/ start; done server starting server starting =$ head -n 1 -q */*.pid | xargs -IPG echo "-p PG --ppid PG" | xargs ps uwf USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND depesz 14031 0.0 0.0 66488 7496 pts/3 S 13:03 0:00 /home/pgdba/work/bin/postgres -D slave3 depesz 14032 0.0 0.0 26140 720 ? Ss 13:03 0:00 \_ postgres: logger process depesz 14033 0.0 0.0 66556 1400 ? Ss 13:03 0:00 \_ postgres: startup process recovering 000000010000000000000006 depesz 14063 0.0 0.0 79456 2148 ? Ss 13:03 0:00 \_ postgres: wal receiver process streaming 0/6012ED0 depesz 14069 0.0 0.0 66488 1532 ? Ss 13:03 0:00 \_ postgres: writer process depesz 14070 0.0 0.0 26136 900 ? Ss 13:03 0:00 \_ postgres: stats collector process depesz 14026 0.0 0.0 66492 7496 pts/3 S 13:03 0:00 /home/pgdba/work/bin/postgres -D slave2 depesz 14042 0.0 0.0 26144 720 ? Ss 13:03 0:00 \_ postgres: logger process depesz 14043 0.0 0.0 66560 1400 ? Ss 13:03 0:00 \_ postgres: startup process recovering 000000010000000000000006 depesz 14067 0.0 0.0 79460 2148 ? Ss 13:03 0:00 \_ postgres: wal receiver process streaming 0/6012ED0 depesz 14071 0.0 0.0 66492 1532 ? Ss 13:03 0:00 \_ postgres: writer process depesz 14072 0.0 0.0 26140 900 ? Ss 13:03 0:00 \_ postgres: stats collector process depesz 14021 0.0 0.0 66488 7528 pts/3 S 13:03 0:00 /home/pgdba/work/bin/postgres -D slave depesz 14037 0.0 0.0 26140 724 ? Ss 13:03 0:00 \_ postgres: logger process depesz 14038 0.0 0.0 66560 1572 ? Ss 13:03 0:00 \_ postgres: startup process recovering 000000010000000000000006 depesz 14048 0.0 0.0 66488 1536 ? Ss 13:03 0:00 \_ postgres: writer process depesz 14050 0.0 0.0 26136 904 ? Ss 13:03 0:00 \_ postgres: stats collector process depesz 14052 0.0 0.0 79460 2136 ? Ss 13:03 0:00 \_ postgres: wal receiver process streaming 0/6012ED0 depesz 14064 0.0 0.0 67332 2476 ? Ss 13:03 0:00 \_ postgres: wal sender process depesz [local] streaming 0/6012ED0 depesz 14068 0.0 0.0 67452 2476 ? Ss 13:03 0:00 \_ postgres: wal sender process depesz [local] streaming 0/6012ED0 depesz 12981 0.0 0.0 66456 7524 pts/3 S 12:50 0:00 /home/pgdba/work/bin/postgres -D master depesz 12982 0.0 0.0 26140 724 ? Ss 12:50 0:00 \_ postgres: logger process depesz 12984 0.0 0.0 66456 1780 ? Ss 12:50 0:00 \_ postgres: writer process depesz 12985 0.0 0.0 66456 1012 ? Ss 12:50 0:00 \_ postgres: wal writer process depesz 12986 0.0 0.0 67296 2156 ? Ss 12:50 0:00 \_ postgres: autovacuum launcher process depesz 12987 0.0 0.0 26136 732 ? Ss 12:50 0:00 \_ postgres: archiver process depesz 12988 0.0 0.0 26136 1040 ? Ss 12:50 0:00 \_ postgres: stats collector process depesz 14053 0.0 0.0 67444 2520 ? Ss 13:03 0:00 \_ postgres: wal sender process depesz [local] streaming 0/6012ED0
Please note that master Postgres has only one sender process (pid 14053), slave Postgres has a receiver (14052) and two senders (14064 and 14068), and slave2 and slave3 have only single receiver (14067 and 14063).
Now we should test if it all works well, so:
=$ psql -d postgres -p 4001 -c \
'insert into i(x) values (123)'
for port in 4002 4003 4004
do
echo "port=$port"
psql -p $port -d postgres -c \
"select * from i"
done
INSERT 0 1
port=4002
x
---
(0 rows)
port=4003
x --- (0 rows) port=4004 x --- (0 rows)
The tables are empty. They should have some data, but it might be simply because of replication lag. So let’s retry the check, without the insert this time:
=$ for port in 4002 4003 4004 do echo "port=$port" psql -p $port -d postgres -c "select * from i" done port=4002 x ----- 123 (1 row)
port=4003
x ----- 123 (1 row)
port=4004
x ----- 123 (1 row)
And all works fine now. Great! The only missing feature is the ability to make slaves-off-slave still work when a slave gets promoted to standalone, but unfortunately, it’s not here:
=$ touch /tmp/slave.finish.recovery; sleep 5; head -n 1 -q */*.pid | xargs -IPG echo "-p PG --ppid PG" | xargs ps uwf USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND depesz 14896 0.1 0.0 66488 7524 pts/3 S 13:18 0:00 /home/pgdba/work/bin/postgres -D slave3 depesz 14897 0.0 0.0 26140 720 ? Ss 13:18 0:00 \_ postgres: logger process depesz 14898 0.0 0.0 66556 1696 ? Ss 13:18 0:00 \_ postgres: startup process waiting for 000000010000000000000006 depesz 14901 0.0 0.0 66488 1276 ? Ss 13:18 0:00 \_ postgres: writer process depesz 14902 0.0 0.0 26136 900 ? Ss 13:18 0:00 \_ postgres: stats collector process depesz 14883 0.1 0.0 66492 7528 pts/3 S 13:18 0:00 /home/pgdba/work/bin/postgres -D slave2 depesz 14885 0.0 0.0 26144 724 ? Ss 13:18 0:00 \_ postgres: logger process depesz 14886 0.0 0.0 66560 1700 ? Ss 13:18 0:00 \_ postgres: startup process waiting for 000000010000000000000006 depesz 14890 0.0 0.0 66492 1280 ? Ss 13:18 0:00 \_ postgres: writer process depesz 14891 0.0 0.0 26140 904 ? Ss 13:18 0:00 \_ postgres: stats collector process depesz 14021 0.0 0.0 66488 7528 pts/3 S 13:03 0:00 /home/pgdba/work/bin/postgres -D slave depesz 14037 0.0 0.0 26140 724 ? Ss 13:03 0:00 \_ postgres: logger process depesz 14048 0.0 0.0 66488 1780 ? Ss 13:03 0:00 \_ postgres: writer process depesz 14050 0.0 0.0 26136 1032 ? Ss 13:03 0:00 \_ postgres: stats collector process depesz 15018 0.0 0.0 66488 1016 ? Ss 13:20 0:00 \_ postgres: wal writer process depesz 15019 0.0 0.0 67320 2100 ? Ss 13:20 0:00 \_ postgres: autovacuum launcher process depesz 15020 0.0 0.0 26136 912 ? Ss 13:20 0:00 \_ postgres: archiver process last was 00000002.history depesz 12981 0.0 0.0 66456 7524 pts/3 S 12:50 0:00 /home/pgdba/work/bin/postgres -D master depesz 12982 0.0 0.0 26140 724 ? Ss 12:50 0:00 \_ postgres: logger process depesz 12984 0.0 0.0 66456 1780 ? Ss 12:50 0:00 \_ postgres: writer process depesz 12985 0.0 0.0 66456 1012 ? Ss 12:50 0:00 \_ postgres: wal writer process depesz 12986 0.0 0.0 67296 2164 ? Ss 12:50 0:00 \_ postgres: autovacuum launcher process depesz 12987 0.0 0.0 26136 732 ? Ss 12:50 0:00 \_ postgres: archiver process depesz 12988 0.0 0.0 26136 1040 ? Ss 12:50 0:00 \_ postgres: stats collector process
As you can see the sender in slave got killed, and thus slave2 and slave3 are still slaves, but without a source of WAL. Logs of slave2 and slave3 PostgreSQL show a clear reason why it doesn’t work:
2011-07-26 13:26:41.483 CEST 16318: FATAL: timeline 2 of the primary does not match recovery target timeline 1
Clearly slave is using timeline 2, while slave2 and slave3 are still on timeline 1. Theoretically it should be simple to fix, since slave has pg_xlog/00000002.history file, but the functionality to switch timelines in recovery is simply not there yet.
Anyway, the ability to have slaves that are receiving WAL from other slaves is pretty cool, and definitely a welcome addition.
BOX 1 : About the article
BOX 1 TEXT :
The originial article is available at http://pgmag.org/0134
BOX 2 TITLE : About the Author
BOX 2 TEXT :
Hubert “Depesz” Lubaczewski is a Database Architect at OmniTI. His blog (http://depesz.com/) is dedicated to the new features of the forthcoming version of PostgreSQL.
BOX 2 PICTURE :
d5ecebb85919589a7cbf7f90eb34eacf.jpeg
BOX 3 TITLE : How to test this at home ?
BOX 3 TEXT :
If you are not a code contributor, testing PostgreSQL Alphas and Betas is one of the best things you can do for the project. By participating in organized testing, you help get the release out faster, with more features and less bugs.
If you are able to contribute to PostgreSQL by doing alpha testing, please read the Alpha/Beta Testing Page in the wiki :
http://wiki.postgresql.org/wiki/HowToBetaTest