PG database container issues

Azamat · August 30, 2024, 3:25pm

Recently, we have been having database connection issues with preview apps. Once preview app is created (we do that via qovery-cli in github actions), it starts running as normal, but then all of a sudden, our app container starts failing to connect to database. Looking at postgres container logs, I see these:

2024-08-30 15:07:44.651 GMT [1] LOG:  terminating any other active server processes
2024-08-30 15:07:44.763 GMT [35333] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.765 GMT [35334] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.767 GMT [1] LOG:  all server processes terminated; reinitializing
2024-08-30 15:07:44.864 GMT [35335] LOG:  database system was interrupted; last known up at 2024-08-30 14:24:39 GMT
2024-08-30 15:07:44.867 GMT [35339] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.868 GMT [35338] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.874 GMT [35340] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.876 GMT [35341] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.954 GMT [35342] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.955 GMT [35343] FATAL:  the database system is in recovery mode
2024-08-30 15:07:44.963 GMT [35345] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.051 GMT [35344] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.054 GMT [35346] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.055 GMT [35347] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.062 GMT [35349] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.064 GMT [35348] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.069 GMT [35350] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.155 GMT [35352] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.155 GMT [35351] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.158 GMT [35335] LOG:  database system was not properly shut down; automatic recovery in progress
2024-08-30 15:07:45.162 GMT [35335] LOG:  redo starts at 0/26A1C48
2024-08-30 15:07:45.163 GMT [35353] FATAL:  the database system is in recovery mode
2024-08-30 15:07:45.167 GMT [35354] FATAL:  the database system is not yet accepting connections
2024-08-30 15:07:45.167 GMT [35354] DETAIL:  Consistent recovery state has not been yet reached.
2024-08-30 15:07:45.171 GMT [35355] FATAL:  the database system is not yet accepting connections
2024-08-30 15:07:45.171 GMT [35355] DETAIL:  Consistent recovery state has not been yet reached.
2024-08-30 15:07:45.250 GMT [35335] LOG:  invalid record length at 0/26E1420: expected at least 24, got 0
2024-08-30 15:07:45.250 GMT [35335] LOG:  redo done at 0/26E13F8 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.08 s
2024-08-30 15:07:45.254 GMT [35356] FATAL:  the database system is not yet accepting connections
2024-08-30 15:07:45.254 GMT [35356] DETAIL:  Consistent recovery state has not been yet reached.
2024-08-30 15:07:45.254 GMT [35357] FATAL:  the database system is not yet accepting connections
2024-08-30 15:07:45.254 GMT [35357] DETAIL:  Consistent recovery state has not been yet reached.
2024-08-30 15:07:45.261 GMT [35358] FATAL:  the database system is not yet accepting connections
2024-08-30 15:07:45.261 GMT [35358] DETAIL:  Consistent recovery state has not been yet reached.
2024-08-30 15:07:45.261 GMT [35336] LOG:  checkpoint starting: end-of-recovery immediate wait
2024-08-30 15:07:45.266 GMT [35359] FATAL:  the database system is not yet accepting connections
2024-08-30 15:07:45.266 GMT [35359] DETAIL:  Consistent recovery state has not been yet reached.
2024-08-30 15:07:45.291 GMT [35336] LOG:  checkpoint complete: wrote 171 buffers (1.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.021 s, sync=0.004 s, total=0.033 s; sync files=94, longest=0.002 s, average=0.001 s; distance=253 kB, estimate=253 kB; lsn=0/26E1420, redo lsn=0/26E1420
2024-08-30 15:07:45.295 GMT [1] LOG:  database system is ready to accept connections
2024-08-30 15:08:12.203 GMT [35420] LOG:  disconnection: session time: 0:00:00.274 user=postgres database=core_staging host=ip-10-210-30-247.eu-central-1.compute.internal port=7028
2024-08-30 15:08:32.023 GMT [35459] LOG:  disconnection: session time: 0:00:00.137 user=postgres database=manage_staging host=ip-10-210-30-247.eu-central-1.compute.internal port=33372
2024-08-30 15:12:45.673 GMT [35365] LOG:  disconnection: session time: 0:05:00.168 user=postgres database=manage_staging host=ip-10-210-39-89.eu-central-1.compute.internal port=43527
2024-08-30 15:12:45.673 GMT [35364] LOG:  disconnection: session time: 0:05:00.290 user=postgres database=core_staging host=ip-10-210-61-14.eu-central-1.compute.internal port=41984
2024-08-30 15:12:45.676 GMT [35363] LOG:  disconnection: session time: 0:05:00.311 user=postgres database=core_staging host=ip-10-210-39-89.eu-central-1.compute.internal port=5269

What could this mean?
URL preview env: https://console.qovery.com/organization/4f429c8a-030d-4565-bf20-4ebde8a5fa52/project/7509adfe-45ed-4028-8648-793ea6c1f2b8/environment/42f94209-872a-4578-80d4-75c259e170cc/services/general

rophilogene · September 1, 2024, 7:16am

Hi @Azamat , I tried to take a look at your service with the web console URL but it seems that it no longer exists. Did you solve the issue?

Azamat · September 2, 2024, 8:25am

Hi Romaric, that env is deleted now, but issue happens often, as well with our staging app: https://console.qovery.com/organization/4f429c8a-030d-4565-bf20-4ebde8a5fa52/project/7509adfe-45ed-4028-8648-793ea6c1f2b8/environment/57e95f15-9b85-4017-934e-21cd6624fa90/logs/f7c1b99c-4cf5-4445-8c09-8c6396ab0de6/live-logs

Azamat · September 4, 2024, 7:32am

@rophilogene any updates?

ce_gagnaire · September 4, 2024, 11:40am

Hello @Azamat.

Nice to see you are using preview environments.

We checked your environment this morning, and it looks like you may have some resource problems.

On your staging namespace, some of your applications were OOM killed, and the memory % of your PostgreSQL was high.

We also found this log: database system was not properly shut down; automatic recovery in progress

One possibility is your database does not have enough memory to run smoothly.
Your memory setting is 100mo whereas our default is 512mo.

When your database exceeds its maximum memory, the system kills and respawns it.
The logs you linked may be because of this reboot.

Can you try increasing the memory allocated to your DB and check if you still have this problem?

Regards,
Charles-Edouard

Azamat · September 9, 2024, 3:46pm

Hello! Thanks for the reply. We increased the memory in settings and tried for couple of days, so far seems to be working well. Thanks!

ce_gagnaire · September 11, 2024, 6:00am

This topic was automatically closed after 38 hours. New replies are no longer allowed.

Topic		Replies	Views
Container databases are not reliable Questions and Answers	6	147	April 13, 2024
Error Container Postgres Questions and Answers qovery	3	347	March 25, 2024
Container PostgreSQL database does not start Database	2	31	November 13, 2024
Database Connection Refused Questions and Answers qovery , aws	5	645	May 21, 2023
My database is not deploying any more on my prod env Questions and Answers	2	108	July 15, 2024

PG database container issues

Related topics