Error communicating with PostgreSQL. Will try again later


Describe the bug
Consider:
5 etcd nodes, 3 Patroni/PostgreSQL nodes
Node A: leader
Nodes B & C: replicas
Start Patroni on A; PostgreSQL comes up fine as master.
Start Patroni on B; it bootstraps from the leader, then PostgreSQL runs fine as a sync replica.
Start Patroni on C; it bootstraps from the leader, then PostgreSQL runs fine as a sync replica, but:

  • Patroni is stuck in «starting» mode,
  • Patroni can’t seem to connect to PostgreSQL on localhost:5432,
  • PostgreSQL itself is working fine underneath, including replication.
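For context on the probe that is failing: per the Patroni configuration docs, Patroni uses the first address in `postgresql.listen` for its own local connection, so the «localhost:5432 - no response» lines in the syslog come from that local probe, not from replication. A minimal sketch of extracting that probe host; the `listen` line is inlined here as sample data rather than read from the real config file:

```shell
# Patroni establishes its local connection to the FIRST address in
# postgresql.listen, so that host is what the "no response" probe targets.
# Sample listen line taken from the config below:
config_line='  listen: "localhost,PGSQL-03-M:5432"'
listen=$(printf '%s\n' "$config_line" | sed -n 's/^ *listen: *"\{0,1\}\([^"]*\)"\{0,1\}.*/\1/p')
probe_host=${listen%%,*}
echo "$probe_host"   # -> localhost
```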


Environment

  • Patroni version: 2.1.1
  • PostgreSQL version: 13.4
  • DCS (and its version): etcd 3.5

Patroni configuration file

--- 
bootstrap: 
  dcs: 
    check_timeline: true
    loop_wait: 10
    master_start_timeout: 0
    maximum_lag_on_failover: 1048576
    postgresql: 
      parameters: 
        archive_command: "cp %p /var/lib/pgsql/13/archive/%f"
        archive_mode: "ON"
        archive_timeout: 1800s
        hot_standby: "ON"
        max_connections: 500
        wal_level: hot_standby
        wal_log_hints: "ON"
      use_pg_rewind: true
      use_slots: true
    retry_timeout: 10
    ttl: 30
  initdb: 
    - 
      encoding: UTF8
    - 
      locale: en-CA.UTF-8
  users: 
    admin: 
      options: 
        - createrole
        - createdb
      password: admin
etcd3: 
  host: "PGSQL-03-M:2379"
log: 
  dir: /etc/patroni/
  file_num: 1
  file_size: 5242880
name: PGSQL-03-M
postgresql: 
  authentication: 
    replication: 
      password: redacted
      username: replicator
    rewind: 
      password: redacted
      username: rewind
    superuser: 
      password: redacted
      username: postgres
  bin_dir: /usr/pgsql-13/bin
  config_dir: /var/lib/pgsql/13/data
  connect_address: "PGSQL-03-M:5432"
  data_dir: /var/lib/pgsql/13/data
  krbsrvname: POSTGRES
  listen: "localhost,PGSQL-03-M:5432"
  parameters: 
    krb_caseins_users: true
    krb_server_keyfile: /var/lib/pgsql/13/postgres.keytab
    log_connections: "ON"
    log_directory: /var/lib/pgsql/13/log
    log_disconnections: "ON"
    log_min_messages: DEBUG1
    shared_buffers: 2048MB
    synchronous_commit: "ON"
    synchronous_standby_names: "*"
    unix_socket_directories: "/var/run/postgresql, /tmp"
    work_mem: 8MB
  pgpass: /var/lib/pgsql/13/.pgpass
  remove_data_directory_on_diverged_timelines: "true"
  remove_data_directory_on_rewind_failure: "true"
restapi: 
  authentication: 
    password: redacted
    username: patroni_api
  connect_address: "PGSQL-03-M:8008"
  listen: "PGSQL-03-M:8008"
scope: casgrain
tags: 
  clonefrom: false
  nofailover: false
  noloadbalance: false
  nosync: false
watchdog: 
  device: /dev/watchdog
  mode: automatic
  safety_margin: -1

patronictl show-config

check_timeline: true
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    archive_command: cp %p /backups/archive/%f
    archive_mode: 'on'
    archive_timeout: 1800s
    hot_standby: 'on'
    max_replication_slots: 10
    max_wal_senders: 10
    wal_keep_segments: 16
    wal_level: hot_standby
    wal_log_hints: 'on'
  recovery_conf:
    restore_command: cp /backups/archive/%f %p
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30

patronictl list

+------------+------------+---------+----------+-----+-----------+
| Member     | Host       | Role    | State    |  TL | Lag in MB |
+ Cluster: casgrain (7015583270741613629) -----+-----+-----------+
| PGSQL-01-M | pgsql-01-m | Leader  | running  | 116 |           |
| PGSQL-02-M | pgsql-02-m | Replica | running  | 116 |         0 |
| PGSQL-03-M | pgsql-03-m | Replica | starting |     |   unknown |
+------------+------------+---------+----------+-----+-----------+

syslog

Oct 29 15:57:43 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:57:56 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:58:03 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:58:10 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:58:28 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:58:35 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:58:42 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:58:54 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:59:06 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:59:18 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:59:30 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 15:59:42 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 16:00:05 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 16:00:17 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 16:00:29 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 16:00:52 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 16:01:04 PGSQL-03-M patroni[18223]: localhost:5432 - no response
Oct 29 16:01:16 PGSQL-03-M patroni[18223]: localhost:5432 - no response

pg_stat_replication on the leader

  usename   | application_name |  client_addr  | client_hostname | sync_priority | sync_state
------------+------------------+---------------+-----------------+---------------+------------
 replicator | PGSQL-02-M       | 192.168.23.12 |                 |             1 | sync
 replicator | PGSQL-03-M       | 192.168.23.13 |                 |             1 | potential

Patroni logs

2021-10-29 16:00:58,618 WARNING: Loop time exceeded, rescheduling immediately.
2021-10-29 16:01:04,486 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:04,486 INFO: Still starting up as a standby.
2021-10-29 16:01:04,486 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:04,487 INFO: establishing a new patroni connection to the postgres cluster
2021-10-29 16:01:10,311 WARNING: Retry got exception: 'connection problems'
2021-10-29 16:01:10,356 INFO: Error communicating with PostgreSQL. Will try again later
2021-10-29 16:01:10,357 WARNING: Loop time exceeded, rescheduling immediately.
2021-10-29 16:01:16,259 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:16,259 INFO: Still starting up as a standby.
2021-10-29 16:01:16,259 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:16,260 INFO: establishing a new patroni connection to the postgres cluster
2021-10-29 16:01:22,260 WARNING: Retry got exception: 'connection problems'
2021-10-29 16:01:22,305 INFO: Error communicating with PostgreSQL. Will try again later
2021-10-29 16:01:22,306 WARNING: Loop time exceeded, rescheduling immediately.
2021-10-29 16:01:28,381 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:28,382 INFO: Still starting up as a standby.
2021-10-29 16:01:28,382 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:28,382 INFO: establishing a new patroni connection to the postgres cluster
2021-10-29 16:01:34,265 WARNING: Retry got exception: 'connection problems'
2021-10-29 16:01:34,311 INFO: Error communicating with PostgreSQL. Will try again later
2021-10-29 16:01:34,312 WARNING: Loop time exceeded, rescheduling immediately.
2021-10-29 16:01:40,248 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:40,248 INFO: Still starting up as a standby.
2021-10-29 16:01:40,249 INFO: Lock owner: PGSQL-01-M; I am PGSQL-03-M
2021-10-29 16:01:40,249 INFO: establishing a new patroni connection to the postgres cluster
2021-10-29 16:01:46,093 WARNING: Retry got exception: 'connection problems'
2021-10-29 16:01:46,138 INFO: Error communicating with PostgreSQL. Will try again later
2021-10-29 16:01:46,139 WARNING: Loop time exceeded, rescheduling immediately.

PostgreSQL logs

2021-10-29 16:00:14.344 EDT [18547] LOG:  connection received: host=127.0.0.1 port=59060
2021-10-29 16:00:17.290 EDT [18548] LOG:  connection received: host=::1 port=52372
2021-10-29 16:00:20.158 EDT [18549] LOG:  connection received: host=127.0.0.1 port=59072
2021-10-29 16:00:23.189 EDT [18552] LOG:  connection received: host=::1 port=52392
2021-10-29 16:00:26.106 EDT [18553] LOG:  connection received: host=127.0.0.1 port=59092
2021-10-29 16:00:29.036 EDT [18554] LOG:  connection received: host=::1 port=52406
2021-10-29 16:00:43.620 EDT [18559] LOG:  connection received: host=::1 port=52446
2021-10-29 16:00:43.620 EDT [18560] LOG:  connection received: host=127.0.0.1 port=59138
2021-10-29 16:00:46.769 EDT [18562] LOG:  connection received: host=::1 port=52458
2021-10-29 16:00:49.687 EDT [18563] LOG:  connection received: host=127.0.0.1 port=59150
2021-10-29 16:00:52.650 EDT [18566] LOG:  connection received: host=::1 port=52474
2021-10-29 16:00:55.608 EDT [18567] LOG:  connection received: host=127.0.0.1 port=59174
2021-10-29 16:00:58.626 EDT [18569] LOG:  connection received: host=::1 port=52496
2021-10-29 16:01:01.589 EDT [18571] LOG:  connection received: host=127.0.0.1 port=59188
2021-10-29 16:01:04.488 EDT [18583] LOG:  connection received: host=::1 port=52508
2021-10-29 16:01:07.355 EDT [18584] LOG:  connection received: host=127.0.0.1 port=59208
2021-10-29 16:01:10.366 EDT [18586] LOG:  connection received: host=::1 port=52528
2021-10-29 16:01:13.327 EDT [18589] LOG:  connection received: host=127.0.0.1 port=59224
2021-10-29 16:01:16.261 EDT [18590] LOG:  connection received: host=::1 port=52544
2021-10-29 16:01:19.155 EDT [18593] LOG:  connection received: host=127.0.0.1 port=59244
2021-10-29 16:01:22.315 EDT [18596] LOG:  connection received: host=::1 port=52564
2021-10-29 16:01:25.353 EDT [18597] LOG:  connection received: host=127.0.0.1 port=59256
2021-10-29 16:01:28.384 EDT [18598] LOG:  connection received: host=::1 port=52578
2021-10-29 16:01:31.356 EDT [18601] LOG:  connection received: host=127.0.0.1 port=59282
2021-10-29 16:01:34.321 EDT [18603] LOG:  connection received: host=::1 port=52602
2021-10-29 16:01:37.270 EDT [18604] LOG:  connection received: host=127.0.0.1 port=59294
2021-10-29 16:01:40.250 EDT [18606] LOG:  connection received: host=::1 port=52614
2021-10-29 16:01:43.222 EDT [18610] LOG:  connection received: host=127.0.0.1 port=59314
2021-10-29 16:01:46.147 EDT [18619] LOG:  connection received: host=::1 port=52634

Have you tried to use GitHub issue search?
Similar to #1268

Additional context
Bootstrapping C from either A or B as leader gives the same result.
Only node C is affected; this cannot be reproduced on A or B.
On the working node B, every "connection received" log entry is followed by a corresponding "connection authorized" entry; that entry is missing from C's PostgreSQL logs.
Firewall ports are open (confirmed not blocked in the firewall logs).
Running psql -c "select pg_is_in_recovery()" on C works fine.
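The missing "connection authorized" entries can be spotted mechanically: pair each backend PID's "connection received" line with an "authorized" line and report PIDs that never authenticate. A rough awk sketch; the log lines are inlined as sample data, so point it at node C's real PostgreSQL log instead:

```shell
# Report backend PIDs that logged "connection received" but never
# "connection authorized" -- the pattern observed on node C.
missing=$(awk '
  /connection received/   { match($0, /\[[0-9]+\]/); recv[substr($0, RSTART+1, RLENGTH-2)] = 1 }
  /connection authorized/ { match($0, /\[[0-9]+\]/); delete recv[substr($0, RSTART+1, RLENGTH-2)] }
  END { for (pid in recv) print pid }
' <<'EOF'
2021-10-29 16:00:14.344 EDT [18547] LOG:  connection received: host=127.0.0.1 port=59060
2021-10-29 16:00:14.350 EDT [18547] LOG:  connection authorized: user=postgres database=postgres
2021-10-29 16:00:17.290 EDT [18548] LOG:  connection received: host=::1 port=52372
EOF
)
echo "$missing"   # -> 18548 (the PID that was never authorized)
```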

I am trying to set up Patroni (2.0.1) for the first time with PG12.

Even though the authentication users specified in the config exist in PG (with the correct passwords), PG keeps rejecting the connections.

This is my config —

scope: postgres
name: postgresql0
restapi:
    listen: postgresql0_ip:8008
    connect_address: postgresql0_ip:8008
zookeeper:
    hosts: [...]
bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true
    initdb:
    - encoding: UTF8
    - data-checksums
    pg_hba:
    - host all all 0.0.0.0/0 md5
    users:
        admin:
            password: admin
            options:
                - createrole
                - createdb
postgresql:
    listen: postgresql0_ip:5432
    connect_address: postgresql0_ip:5432
    data_dir: /data/patroni
    pgpass: /tmp/pgpass
    authentication:
        replication:
            username: replicator
            password: password
        superuser:
            username: supahuser
            password: thesupass
    parameters:
        unix_socket_directories: '.'
        logging_collector: "on"
        log_directory: "/var/log/postgresql"
        log_filename: "postgresql-12-main.log"
    bin_dir: /usr/lib/postgresql/12/bin
tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

These are Patroni’s logs —

2020-10-07 19:25:16,240 INFO: establishing a new patroni connection to the postgres cluster
2020-10-07 19:25:16,374 INFO: establishing a new patroni connection to the postgres cluster
2020-10-07 19:25:16,378 WARNING: Retry got exception: 'connection problems'
postgresql0_ip:5432 - accepting connections
2020-10-07 19:25:16,399 INFO: establishing a new patroni connection to the postgres cluster
2020-10-07 19:25:16,404 ERROR: Exception when changing replication slots
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/patroni/ha.py", line 1422, in _run_cycle
    return self.process_unhealthy_cluster()
  File "/usr/local/lib/python3.8/dist-packages/patroni/ha.py", line 939, in process_unhealthy_cluster
    if self.is_healthiest_node():
  File "/usr/local/lib/python3.8/dist-packages/patroni/ha.py", line 770, in is_healthiest_node
    if self.state_handler.is_leader():
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/__init__.py", line 338, in is_leader
    return bool(self._cluster_info_state_get('timeline'))
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/__init__.py", line 318, in _cluster_info_state_get
    raise PostgresConnectionException(self._cluster_info_state['error'])
patroni.exceptions.PostgresConnectionException: "'Too many retry attempts'"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/__init__.py", line 255, in _query
    cursor = self._connection.cursor()
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/connection.py", line 31, in cursor
    self._cursor_holder = self.get().cursor()
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/connection.py", line 23, in get
    self._connection = psycopg2.connect(**self._conn_kwargs)
  File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 127, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL:  password authentication failed for user "supahuser"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/slots.py", line 45, in sync_replication_slots
    self.load_replication_slots()
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/slots.py", line 27, in load_replication_slots
    cursor = self._query('SELECT slot_name, slot_type, plugin, database FROM pg_catalog.pg_replication_slots')
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/slots.py", line 22, in _query
    return self._postgresql.query(sql, *params, retry=False)
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/__init__.py", line 274, in query
    return self._query(sql, *args)
  File "/usr/local/lib/python3.8/dist-packages/patroni/postgresql/__init__.py", line 270, in _query
    raise PostgresConnectionException('connection problems')
patroni.exceptions.PostgresConnectionException: 'connection problems'
2020-10-07 19:25:16,405 INFO: Error communicating with PostgreSQL. Will try again later

These are Postgres’ logs —

2020-10-07 19:25:37.057 UTC [2209766] DETAIL:  Role "supahuser" does not exist.
    Connection matched pg_hba.conf line 98: "host all all 0.0.0.0/0 md5"
2020-10-07 19:25:37.061 UTC [2209767] FATAL:  password authentication failed for user "supahuser"
2020-10-07 19:25:37.061 UTC [2209767] DETAIL:  Role "supahuser" does not exist.
    Connection matched pg_hba.conf line 98: "host all all 0.0.0.0/0 md5"

This is proof the users exist with the correct passwords —

postgres=# \du
 Role name  |                         Attributes                         | Member of
------------+------------------------------------------------------------+-----------
 postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 replicator | Replication                                                | {}
 supahuser  | Superuser, Create role, Create DB, Replication             | {}

postgres=# alter user supahuser with encrypted password 'thesupass';
ALTER ROLE
postgres=# alter user replicator with encrypted password 'password';
ALTER ROLE

Anything you guys think I have done incorrectly or overlooked?
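One thing worth checking, since the server log says the role does not exist while \du shows it: psql and Patroni may be reaching two different PostgreSQL instances. On the instance Patroni actually connects to, the stored md5 hash can be verified offline: for md5 auth, PostgreSQL stores "md5" followed by md5(password concatenated with username) in pg_authid.rolpassword. A small sketch using the username/password from the config above:

```shell
# PostgreSQL stores md5 passwords as "md5" + md5(password || username).
# Compute the expected value locally and compare it with:
#   SELECT rolpassword FROM pg_authid WHERE rolname = 'supahuser';
user=supahuser
pass=thesupass
expected="md5$(printf '%s%s' "$pass" "$user" | md5sum | awk '{print $1}')"
echo "$expected"
```

If the value printed here differs from rolpassword on the server that logs the failures, that server never got the ALTER USER shown above, which would also explain the "Role does not exist" DETAIL.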


Please ensure you do the following when reporting a bug:

  • Provide a concise description of what the bug is.

A failed replica in a 3-node Crunchy cluster is not being replaced or repaired

  • Provide information about your environment.

3 PG replicas; k3s v1.21.3 x86_64, 4 compute nodes

  • Provide clear steps to reproduce the bug.

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: crunchy-db1
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0
  postgresVersion: 13
  instances:
    - name: crunchy-db1
      replicas: 3
      affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: crunchy-db1
                    postgres-operator.crunchydata.com/instance-set: crunchy-db1
      dataVolumeClaimSpec:
        accessModes:
        - "ReadWriteOnce"
        resources:
          requests:
            storage: 5Gi
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-0
      repoHost:
        dedicated: {}
      configuration:
      - secret:
          name: pgo-s3-creds
      global:
        repo1-path: /pgbackrest/crunchy-db1/crunchy-db1/repo1
        repo2-path: /pgbackrest/crunchy-db1/crunchy-db1/repo2
        repo1-retention-full: "30"
        repo1-retention-full-type: time
        repo2-retention-full: "7"
        repo2-retention-full-type: time
      repos:
      - name: repo1
        schedules:
          full: "15 1 * * *"
          incremental: "20 * * * *"
        s3:
          bucket: "pgbackrest"
          endpoint: "s3.us-central-1.wasabisys.com"
          region: "us-central-1"
      - name: repo2
        schedules:
          full: "30 1 * * *"
          incremental: "35 * * * *"
        s3:
          bucket: "crunchy-pgbackrest"
          endpoint: "sync1.local"
          region: "us-central-1"

  • Attach applicable logs. Please do not attach screenshots showing logs unless you are unable to copy and paste the log data.

$ kubectl describe pod -n crunchy-db1 crunchy-db1-crunchy-db1-jltp-0
Name:         crunchy-db1-crunchy-db1-jltp-0
Namespace:    crunchy-db1
Priority:     0
Node:         zbx2/192.168.88.66
Start Time:   Fri, 06 Aug 2021 16:11:13 -0500
Labels:       controller-revision-hash=crunchy-db1-crunchy-db1-jltp-84dbc7576c
              postgres-operator.crunchydata.com/cluster=crunchy-db1
              postgres-operator.crunchydata.com/instance=crunchy-db1-crunchy-db1-jltp
              postgres-operator.crunchydata.com/instance-set=crunchy-db1
              postgres-operator.crunchydata.com/patroni=crunchy-db1-ha
              statefulset.kubernetes.io/pod-name=crunchy-db1-crunchy-db1-jltp-0
Annotations:  kubectl.kubernetes.io/restartedAt: 2021-08-06T16:07:20-05:00
              status:
                {"conn_url":"postgres://crunchy-db1-crunchy-db1-jltp-0.crunchy-db1-pods:5432/postgres","api_url":"https://crunchy-db1-crunchy-db1-jltp-0.c...
Status:       Running
IP:           10.42.3.165
IPs:
  IP:           10.42.3.165
Controlled By:  StatefulSet/crunchy-db1-crunchy-db1-jltp
Init Containers:
  postgres-startup:
    Container ID:  containerd://3dd947e4bfa8bdcfbd47fcd304fad0a4f28ea41eb1eb7d6b5a65ed5fe2d0b485
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha@sha256:3306cb1c16b5f4afd9a1ada13ad45f61b368d90267d9513302108ec80d0db1bf
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -ceu
      --
      declare -r expected_major_version="$1" pgwal_directory="$2"
      results() { printf '::postgres-operator: %s::%s\n' "$@"; }
      safelink() (
        local desired="$1" name="$2" current
        current=$(realpath "${name}")
        if [ "${current}" = "${desired}" ]; then return; fi
        set -x; mv --no-target-directory "${current}" "${desired}"
        ln --no-dereference --force --symbolic "${desired}" "${name}"
      )
      echo Initializing ...
      results 'uid' "$(id -u)" 'gid' "$(id -G)"
      results 'postgres path' "$(command -v postgres)"
      results 'postgres version' "${postgres_version:=$(postgres --version)}"
      [[ "${postgres_version}" == *") ${expected_major_version}."* ]]
      results 'config directory' "${PGDATA:?}"
      postgres_data_directory=$([ -d "${PGDATA}" ] && postgres -C data_directory || echo "${PGDATA}")
      results 'data directory' "${postgres_data_directory}"
      [ "${postgres_data_directory}" = "${PGDATA}" ]
      bootstrap_dir="${postgres_data_directory}_bootstrap"
      [ -d "${bootstrap_dir}" ] && results 'bootstrap directory' "${bootstrap_dir}"
      [ -d "${bootstrap_dir}" ] && postgres_data_directory="${bootstrap_dir}"
      install --directory --mode=0700 "${postgres_data_directory}"
      [ -f "${postgres_data_directory}/PG_VERSION" ] || exit 0
      results 'data version' "${postgres_data_version:=$(< "${postgres_data_directory}/PG_VERSION")}"
      [ "${postgres_data_version}" = "${expected_major_version}" ]
      safelink "${pgwal_directory}" "${postgres_data_directory}/pg_wal"
      results 'wal directory' "$(realpath "${postgres_data_directory}/pg_wal")"
      startup
      13
      /pgdata/pg13_wal
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 06 Aug 2021 16:11:32 -0500
      Finished:     Fri, 06 Aug 2021 16:11:32 -0500
    Ready:          True
    Restart Count:  0
    Environment:
      PGDATA:  /pgdata/pg13
      PGHOST:  /tmp/postgres
      PGPORT:  5432
    Mounts:
      /pgdata from postgres-data (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6wgwf (ro)
  database-client-cert-init:
    Container ID:  containerd://b886b3c0053b8a165375967b00bc18c3dd5c8dbf8570f0a571344017b8cc34b0
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha@sha256:3306cb1c16b5f4afd9a1ada13ad45f61b368d90267d9513302108ec80d0db1bf
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c
      mkdir -p /tmp/replication && install -m 0600 /pgconf/tls/replication/{tls.crt,tls.key,ca.crt} /tmp/replication
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 06 Aug 2021 16:11:34 -0500
      Finished:     Fri, 06 Aug 2021 16:11:34 -0500
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /pgconf/tls from cert-volume (ro)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6wgwf (ro)
  nss-wrapper-init:
    Container ID:  containerd://884112d919cccb5806017da54462d8fa0ecb49c539d38e8b06724b733df3b730
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha@sha256:3306cb1c16b5f4afd9a1ada13ad45f61b368d90267d9513302108ec80d0db1bf
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c
      NSS_WRAPPER_SUBDIR=postgres CRUNCHY_NSS_USERNAME=postgres CRUNCHY_NSS_USER_DESC="postgres" /opt/crunchy/bin/nss_wrapper.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 06 Aug 2021 16:11:36 -0500
      Finished:     Fri, 06 Aug 2021 16:11:36 -0500
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6wgwf (ro)
Containers:
  database:
    Container ID:  containerd://14f4a3ff665f7bc70d78f27d18d168f72d065bc7066fae502edf4bfc6ff34cb8
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha@sha256:3306cb1c16b5f4afd9a1ada13ad45f61b368d90267d9513302108ec80d0db1bf
    Port:          5432/TCP
    Host Port:     0/TCP
    Command:
      patroni
      /etc/patroni
    State:          Running
      Started:      Fri, 06 Aug 2021 16:11:37 -0500
    Ready:          False
    Restart Count:  0
    Liveness:       http-get https://:8008/liveness delay=3s timeout=5s period=10s #success=1 #failure=3
    Readiness:      http-get https://:8008/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PGDATA:                              /pgdata/pg13
      PGHOST:                              /tmp/postgres
      PGPORT:                              5432
      PATRONI_NAME:                        crunchy-db1-crunchy-db1-jltp-0 (v1:metadata.name)
      PATRONI_KUBERNETES_POD_IP:            (v1:status.podIP)
      PATRONI_KUBERNETES_PORTS:            - name: postgres
                                             port: 5432
                                             protocol: TCP

      PATRONI_POSTGRESQL_CONNECT_ADDRESS:  $(PATRONI_NAME).crunchy-db1-pods:5432
      PATRONI_POSTGRESQL_LISTEN:           *:5432
      PATRONI_POSTGRESQL_CONFIG_DIR:       /pgdata/pg13
      PATRONI_POSTGRESQL_DATA_DIR:         /pgdata/pg13
      PATRONI_RESTAPI_CONNECT_ADDRESS:     $(PATRONI_NAME).crunchy-db1-pods:8008
      PATRONI_RESTAPI_LISTEN:              *:8008
      PATRONICTL_CONFIG_FILE:              /etc/patroni
      LD_PRELOAD:                          /usr/lib64/libnss_wrapper.so
      NSS_WRAPPER_PASSWD:                  /tmp/nss_wrapper/postgres/passwd
      NSS_WRAPPER_GROUP:                   /tmp/nss_wrapper/postgres/group
    Mounts:
      /etc/patroni from patroni-config (ro)
      /etc/pgbackrest/conf.d from pgbackrest-config (rw)
      /etc/ssh from ssh (ro)
      /pgconf/tls from cert-volume (ro)
      /pgdata from postgres-data (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6wgwf (ro)
  replication-cert-copy:
    Container ID:  containerd://e6cba0e848461330ab259a707736109ca52ec23771a75a556274f7a660c8fae0
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha@sha256:3306cb1c16b5f4afd9a1ada13ad45f61b368d90267d9513302108ec80d0db1bf
    Port:          <none>
    Host Port:     <none>
    Command:
      bash
      -c

      declare -r mountDir=/pgconf/tls/replication
      declare -r tmpDir=/tmp/replication
      while sleep 5s; do
        mkdir -p /tmp/replication
        DIFF=$(diff ${mountDir} ${tmpDir})
        if [ "$DIFF" != "" ]
        then
          date
          echo Copying replication certificates and key and setting permissions
          install -m 0600 ${mountDir}/{tls.crt,tls.key,ca.crt} ${tmpDir}
          patronictl reload crunchy-db1-ha --force
        fi
      done

    State:          Running
      Started:      Fri, 06 Aug 2021 16:11:37 -0500
    Ready:          True
    Restart Count:  0
    Environment:
      PGDATA:                              /pgdata/pg13
      PGHOST:                              /tmp/postgres
      PGPORT:                              5432
      PATRONI_NAME:                        crunchy-db1-crunchy-db1-jltp-0 (v1:metadata.name)
      PATRONI_KUBERNETES_POD_IP:            (v1:status.podIP)
      PATRONI_KUBERNETES_PORTS:            - name: postgres
                                             port: 5432
                                             protocol: TCP

      PATRONI_POSTGRESQL_CONNECT_ADDRESS:  $(PATRONI_NAME).crunchy-db1-pods:5432
      PATRONI_POSTGRESQL_LISTEN:           *:5432
      PATRONI_POSTGRESQL_CONFIG_DIR:       /pgdata/pg13
      PATRONI_POSTGRESQL_DATA_DIR:         /pgdata/pg13
      PATRONI_RESTAPI_CONNECT_ADDRESS:     $(PATRONI_NAME).crunchy-db1-pods:8008
      PATRONI_RESTAPI_LISTEN:              *:8008
      PATRONICTL_CONFIG_FILE:              /etc/patroni
    Mounts:
      /etc/patroni from patroni-config (ro)
      /pgconf/tls from cert-volume (ro)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6wgwf (ro)
  pgbackrest:
    Container ID:  containerd://01c7f58c579310e12ea66ab96080d2cf5e8ff9545259e365a43e800ca5c84aa9
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-0
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest@sha256:4c102e1d033b06ec0ba0bc7d442e92193126d379b2c949c42385a45840cff944
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/sbin/sshd
      -D
      -e
    State:          Running
      Started:      Fri, 06 Aug 2021 16:11:38 -0500
    Ready:          True
    Restart Count:  0
    Liveness:       tcp-socket :2022 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      LD_PRELOAD:          /usr/lib64/libnss_wrapper.so
      NSS_WRAPPER_PASSWD:  /tmp/nss_wrapper/postgres/passwd
      NSS_WRAPPER_GROUP:   /tmp/nss_wrapper/postgres/group
    Mounts:
      /etc/pgbackrest/conf.d from pgbackrest-config (rw)
      /etc/ssh from ssh (ro)
      /pgdata from postgres-data (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6wgwf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  postgres-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  crunchy-db1-crunchy-db1-jltp-pgdata
    ReadOnly:   false
  patroni-config:
    Type:                Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:       crunchy-db1-config
    ConfigMapOptional:   <nil>
    ConfigMapName:       crunchy-db1-crunchy-db1-jltp-config
    ConfigMapOptional:   <nil>
    SecretName:          crunchy-db1-crunchy-db1-jltp-certs
    SecretOptionalName:  <nil>
  ssh:
    Type:                Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:       crunchy-db1-ssh-config
    ConfigMapOptional:   <nil>
    SecretName:          crunchy-db1-ssh
    SecretOptionalName:  <nil>
  pgbackrest-config:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          pgo-s3-creds
    SecretOptionalName:  <nil>
    ConfigMapName:       crunchy-db1-pgbackrest-config
    ConfigMapOptional:   <nil>
  cert-volume:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          crunchy-db1-cluster-cert
    SecretOptionalName:  <nil>
    SecretName:          crunchy-db1-replication-cert
    SecretOptionalName:  <nil>
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  16Mi
  kube-api-access-6wgwf:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               24m                   default-scheduler        Successfully assigned crunchy-db1/crunchy-db1-crunchy-db1-jltp-0 to zbx2
  Normal   SuccessfulAttachVolume  24m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-a037fdc4-b23e-4d29-adb9-fcf1c416dde6"
  Normal   Pulled                  24m                   kubelet                  Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0" already present on machine
  Normal   Created                 24m                   kubelet                  Created container postgres-startup
  Normal   Started                 24m                   kubelet                  Started container postgres-startup
  Normal   Created                 24m                   kubelet                  Created container database-client-cert-init
  Normal   Pulled                  24m                   kubelet                  Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0" already present on machine
  Normal   Started                 24m                   kubelet                  Started container database-client-cert-init
  Normal   Pulled                  24m                   kubelet                  Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0" already present on machine
  Normal   Created                 24m                   kubelet                  Created container nss-wrapper-init
  Normal   Created                 24m                   kubelet                  Created container database
  Normal   Pulled                  24m                   kubelet                  Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0" already present on machine
  Normal   Started                 24m                   kubelet                  Started container nss-wrapper-init
  Normal   Started                 24m                   kubelet                  Started container database
  Normal   Pulled                  24m                   kubelet                  Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-13.3-0" already present on machine
  Normal   Created                 24m                   kubelet                  Created container replication-cert-copy
  Normal   Started                 24m                   kubelet                  Started container replication-cert-copy
  Normal   Pulled                  24m                   kubelet                  Container image "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.33-0" already present on machine
  Normal   Created                 24m                   kubelet                  Created container pgbackrest
  Normal   Started                 24m                   kubelet                  Started container pgbackrest
  Warning  Unhealthy               4m9s (x120 over 23m)  kubelet                  Readiness probe failed: HTTP probe failed with statuscode: 503

$ kubectl logs -n crunchy-db1 crunchy-db1-crunchy-db1-jltp-0 --since=1m -c database
2021-08-06 21:35:40,274 WARNING: Retry got exception: 'connection problems'
2021-08-06 21:35:40,275 INFO: Error communicating with PostgreSQL. Will try again later
/tmp/postgres:5432 - rejecting connections
2021-08-06 21:35:49,840 INFO: Lock owner: crunchy-db1-crunchy-db1-sd9v-0; I am crunchy-db1-crunchy-db1-jltp-0
2021-08-06 21:35:49,841 INFO: Still starting up as a standby.
2021-08-06 21:35:49,843 INFO: Lock owner: crunchy-db1-crunchy-db1-sd9v-0; I am crunchy-db1-crunchy-db1-jltp-0
2021-08-06 21:35:49,844 INFO: does not have lock
2021-08-06 21:35:49,844 INFO: establishing a new patroni connection to the postgres cluster
2021-08-06 21:35:50,343 INFO: establishing a new patroni connection to the postgres cluster
2021-08-06 21:35:50,349 WARNING: Retry got exception: 'connection problems'
2021-08-06 21:35:50,351 INFO: Error communicating with PostgreSQL. Will try again later
/tmp/postgres:5432 - rejecting connections
[... the same ~10-second cycle repeats unchanged for the rest of the log ...]
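The repeating `/tmp/postgres:5432 - rejecting connections` line is Patroni's pg_isready-style health check: the postmaster is running but still replaying WAL during standby startup, so it refuses client connections, which is why the REST API keeps answering 503 to the readiness probe. As a minimal sketch (the wrapper function is illustrative; the status mapping follows pg_isready's documented exit codes):

```shell
# Map pg_isready exit statuses to the messages seen in the log above.
# 0 = accepting connections (server ready)
# 1 = rejecting connections (server up but still starting/recovering)
# 2 = no response (server unreachable)
# 3 = no attempt made (invalid parameters)
status_name() {
  case "$1" in
    0) echo "accepting connections" ;;
    1) echo "rejecting connections" ;;
    2) echo "no response" ;;
    *) echo "no attempt made" ;;
  esac
}

# A standby replaying WAL during startup reports status 1, which is
# exactly the state Patroni keeps logging here:
status_name 1
```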

Our application server is on EC2 running on Amazon Linux 1. Postgres dropped support for Amazon Linux and so we depend on Amazon providing the postgres client.

The client is 9.6 and our Amazon hosted RDS postgres server is 11.4. Because of our automation, it would take a good amount of time to upgrade from Amazon Linux 1 to Amazon Linux 2. Even then Amazon Linux 2 only has a postgres 10 client.

It’s a small app that uses the Rails ORM Active Record and only does simple queries and inserts.

Is it a no-go to use 9.6 as the client? I’m wondering what the risks are.

asked Sep 24, 2019 at 2:20 by Todd

I don’t believe that PostgreSQL dropped support for any Linux distribution. Perhaps you mean that there are no binary installation packages provided.

Using a 9.6 client is no problem, since 9.6 is a supported release. You will not be able to use new features like scram-sha-256 authentication, but I guess you can live without that.

Building PostgreSQL from source would be another option.

answered Sep 24, 2019 at 2:42 by Laurenz Albe

Well, the complete answer is a bit more complicated. The general rule is that any client version will work with any server version, as long as both use the same major version of the libpq communication library (and wire protocol).

So:

  1. psql command line client 9.6 (libpq version 5.8) should work properly with server 11.4 (even Postgres 12 uses libpq 5.12).

  2. Other tools based on libpq.so.5 client library should also work properly.

  3. However pg_dump tool will refuse to work with newer major server version (9.6 will work with any 9.x, but not with 10.x and later). This behavior is intentional, to prevent creating incomplete or invalid database backups.

  4. For any other tool that is based not on the libpq.so library but, e.g., on the native JDBC driver, you need to check the exact version of the communication protocol it implements.
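Point 3 above boils down to a simple major-version comparison: pg_dump aborts whenever the server's major version is newer than its own. A rough sketch of that rule (the helper functions are illustrative, not pg_dump's actual code; note that pre-10 releases use two-part majors like 9.6, while 10 and later use one-part majors):

```shell
# Illustrative sketch of pg_dump's version rule: dumping is allowed only
# when the server's major version is not newer than the client's.
major() {
  case "$1" in
    9.*) printf '%s\n' "${1%.*}" ;;    # 9.6.24 -> 9.6 (two-part major)
    *)   printf '%s\n' "${1%%.*}" ;;   # 11.4   -> 11  (one-part major)
  esac
}

pg_dump_allowed() {
  client=$(major "$1")
  server=$(major "$2")
  # Version-sort the two majors; allowed if the client sorts last (>=).
  [ "$(printf '%s\n%s\n' "$server" "$client" | sort -V | tail -n1)" = "$client" ]
}

pg_dump_allowed 9.6.24 9.5.3 && echo "9.6 client vs 9.5 server: works"
pg_dump_allowed 9.6.24 11.4  || echo "9.6 client vs 11.4 server: refused"
```

So the 9.6 `psql` in the question is fine against the 11.4 RDS server, but its `pg_dump` would have to be replaced with an 11.x build for backups.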

answered Sep 24, 2019 at 9:16 by Tomasz Klim
