Trace PostgreSQL queries using logging collector

A great way to see what queries are being executed and how long they take is by enabling the logging collector on your PostgreSQL database server. The logging collector is a background process which captures log messages sent to stderr and redirects them into log files.  Make sure the following settings are set in postgresql.conf:

logging_collector = on
log_min_duration_statement = 0
log_directory = '/var/log/postgresql'
log_min_messages = WARNING
log_rotation_age = 1d
log_rotation_size = 1GB
log_line_prefix = 'start_prefix{"log_time":"%m","application_name":"%a","user_name":"%u","database_name":"%d","remote_host":"%r","process_id":"%p","command_tag":"%i","sql_state":"%e","session_id":"%c","session_line_number":"%l","virtual_transaction_id":"%v","session_id":"%c"}end_prefix log_line: '

To log only actions that took more than for example 5 ms, set the value to 5. If set to 0, all actions will be logged, to disable, set to -1.

I changed the notation of the log_line_prefix to JSON format, so that this part can be easily parsed by Logstash.

Restart the PostgreSQL to make these settings active. My test setup:

– A Drupal installation on a LAPP stack (Linux, Apache, PostgreSQL, php) with Filebeat installed to ship the logs
– An ELK stack (Elasticsearch, Logstash, Kibana) to parse the logs and make the results visible

In directory /var/log/postgres you will find *.log files with log lines that look like the following:

start_prefix{"log_time":"2017-06-09 14:09:29.141 CEST","application_name":"[unknown]","user_name":"drupal_user","database_name":"drupaldb","remote_host":"::1(49408)","process_id":"8412","command_tag":"SELECT","sql_state":"00000","session_id":"593a8ff9.20dc","session_line_number":"8","virtual_transaction_id":"2/0","session_id":"593a8ff9.20dc"}end_prefix log_line: LOG: duration: 0.501 ms statement: SELECT cid, data, created, expire, serialized, tags, checksum FROM cache_data WHERE cid IN ( 'route:/search/node:keys=foobar' ) ORDER BY cid

The actual log_line should also be parsed, to get the duration value out. The complete Logstash filter can be found on this Github page. The end result in Elasticsearch will look like this:

Screenshot from 2017-06-09 14:31:57

Now make Drupal run some queries, search for ‘helloworld’:

Screenshot from 2017-06-09 14:48:56

A search for ‘helloworld’ in Elasticsearch will show that three queries where executed, two SELECT queries and one INSERT query:

Screenshot from 2017-06-09 14:52:26

Screenshot from 2017-06-09 15:01:49

With pg_stat_activity it is possible to get (almost) the same results, but there are a few drawbacks with this approach:
– Logs can not be shown real-time, scheduled queries are needed to get the information out of the pg_stat_activity table.
– The duration of the action is not available. To see query times, you need to load the pg_stat_statements module, which requires additional shared memory.

The logging collector collects all logs, not only query logs, so you will have all logs in one place. If you haven’t seen enough log line, set log_min_messages to DEBUG5, which will show you all logs possible from your database server.