I ran into a silly error tonight while making some configuration changes for PostgreSQL. It took me a while to figure out, so I wanted to share it in the hope of saving others some time. I was trying to boot the slave server in a PostgreSQL SkyTools walmgr configuration and could not get it operational. The boot command reported that it was stopping recovery mode, but I was unable to log in and the recovery process still appeared to be running. Eventually I used the verbose switch (-v) to see if I could learn more about why it was not booting. The command and its output are below.
Command:
walmgr.py conf/wal-slave.ini boot -v
Output:
2009-04-22 01:24:22,949 3899 INFO Stopping recovery mode
2009-04-22 01:24:22,949 3899 DEBUG Using pg_auth file from master.
2009-04-22 01:24:22,950 3899 DEBUG Execute cmd: 'cp' '/var/lib/pgsql/walshipping/logs.complete/pg_auth' '/var/lib/pgsql/data/global/pg_auth'
2009-04-22 01:24:22,955 3899 DEBUG Only single loop requested, exiting
While the slave was in this state, any attempt to log in to the PostgreSQL server failed with the error below, which means the slave server was still in recovery mode.
Command:
psql -U postgres
Error:
psql: FATAL: the database system is starting up
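If you want to check for this state from a script rather than by eye, a minimal sketch could look like the following. It assumes psql is on the PATH and that the postgres user can connect locally; the function names classify_psql_error and probe_slave are hypothetical helpers, not part of SkyTools or PostgreSQL.

```shell
# Classify the text psql printed when a connection attempt failed.
# Prints "starting" if the message matches the error shown above,
# and "other" for any other failure.
classify_psql_error() {
    case "$1" in
        *"the database system is starting up"*) echo "starting" ;;
        *) echo "other" ;;
    esac
}

# Run a trivial query and report the slave's state.
# Assumption: psql is on PATH and the postgres user can connect locally.
probe_slave() {
    # Capture stderr only; the query result itself is discarded.
    if err=$(psql -U postgres -At -c 'SELECT 1' 2>&1 >/dev/null); then
        echo "ready"
    else
        classify_psql_error "$err"
    fi
}
```

Calling probe_slave on the slave prints "starting" while the server is still refusing logins with the error above, and "ready" once a query succeeds.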
After some tinkering I remembered that I had earlier put the slave server into a waiting state by issuing the pause command shown below.
Command:
walmgr.py conf/wal-slave.ini pause
So to get the slave server to boot, I first needed to issue the continue command so that recovery would resume.
Command:
walmgr.py conf/wal-slave.ini continue
Once the server was actively recovering again, I was able to boot it by issuing the command below.
Command:
walmgr.py conf/wal-slave.ini boot
The slave server was now out of recovery mode, which means it was no longer applying log-shipped files from the master PostgreSQL server. I could, however, log in and operate it as a standalone server without any errors.
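To avoid repeating my mistake, the continue-then-boot sequence could be wrapped in a small script. This is only a sketch under a few assumptions: walmgr.py and psql are on the PATH, walmgr.py exits non-zero when a command fails (I have not verified this), and the 30 attempts with a 2 second delay are arbitrary values. The retry and boot_slave functions are hypothetical helpers of my own naming.

```shell
# Retry a command until it succeeds or the attempts run out.
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Note: plain sh has no local variables, so these names are global.
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$retry_i" -lt "$retry_attempts" ]; then
            sleep "$retry_delay"
        fi
        retry_i=$((retry_i + 1))
    done
    return 1
}

# Resume a possibly paused slave, boot it, then wait until logins work.
# usage: boot_slave [CONFIG_FILE]
boot_slave() {
    conf=${1:-conf/wal-slave.ini}
    # Issuing continue first covers the case where pause was left on.
    walmgr.py "$conf" continue || return 1
    walmgr.py "$conf" boot || return 1
    # Assumed polling budget: 30 attempts, 2 seconds apart.
    retry 30 2 psql -U postgres -At -c 'SELECT 1'
}
```

Running boot_slave conf/wal-slave.ini returns success once psql can run a query, and a non-zero status if the server never accepts logins within the polling budget.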