PostgreSQL 10 running in Docker

Before the end of this year PostgreSQL version 10 will be released. In this post I will explain how you can already have a quick look at this new version by running it in a Docker container.

On my Github page you can find a docker-compose file with five defined containers: an ELK stack, pgAdmin and a PostgreSQL instance. The ELK stack will be used for collecting logs from the PostgreSQL instance. I am using a host machine running Ubuntu 16.04 with Docker version 17.05.0 and Docker Compose version 1.9.0. In the Kibana dashboard, you will be able to see an overview of all the SQL statements done by a pg_restore command, and look for any errors.

First step is to download the Github repository:

git clone -b v1.0 https://github.com/rudibroekhuizen/docker-postgresplus

Move to the docker-postgresplus folder and run the following command:

 

docker-compose up

The containers will start. Start an interactive session to the Postgres container:

docker exec -it dockerpostgresplus_postgres_1 sh

Download Sakila example database:

wget https://s3.amazonaws.com/assets.datacamp.com/course/sql/dvdrental.zip; unzip dvdrental.zip

Create the Sakila database and import data:

su postgres
psql
psql> CREATE DATABASE sakila;
/q
pg_restore -U postgres -d sakila dvdrental.tar

Open the Kibana webinterface and import the dashboard.json file:

http://localhost:5601

Now it is possible to see what and how many statements where issued by pg_restore and how long they took:

postgresql_1

Open the pgAdmin webinterface to browse through all tables in the Sakila database:

http://localhost:5050

Hostname, username and password are all set to “postgres”.

pgadmin_1

Advertisements

Trace PostgreSQL queries using logging collector

A great way to see what queries are being executed and how long they take is by enabling the logging collector on your PostgreSQL database server. The logging collector is a background process which captures log messages sent to stderr and redirects them into log files.  Make sure the following settings are set in postgresql.conf:

logging_collector = on
log_min_duration_statement = 0
log_directory = '/var/log/postgresql'
log_min_messages = WARNING
log_rotation_age = 1d
log_rotation_size = 1GB
log_line_prefix = 'start_prefix{"log_time":"%m","application_name":"%a","user_name":"%u","database_name":"%d","remote_host":"%r","process_id":"%p","command_tag":"%i","sql_state":"%e","session_id":"%c","session_line_number":"%l","virtual_transaction_id":"%v","session_id":"%c"}end_prefix log_line: '

To log only actions that took more than for example 5 ms, set the value to 5. If set to 0, all actions will be logged, to disable, set to -1.

I changed the notation of the log_line_prefix to JSON format, so that this part can be easily parsed by Logstash.

Restart the PostgreSQL to make these settings active. My test setup:

– A Drupal installation on a LAPP stack (Linux, Apache, PostgreSQL, php) with Filebeat installed to ship the logs
– An ELK stack (Elasticsearch, Logstash, Kibana) to parse the logs and make the results visible

In directory /var/log/postgres you will find *.log files with log lines that look like the following:

start_prefix{"log_time":"2017-06-09 14:09:29.141 CEST","application_name":"[unknown]","user_name":"drupal_user","database_name":"drupaldb","remote_host":"::1(49408)","process_id":"8412","command_tag":"SELECT","sql_state":"00000","session_id":"593a8ff9.20dc","session_line_number":"8","virtual_transaction_id":"2/0","session_id":"593a8ff9.20dc"}end_prefix log_line: LOG: duration: 0.501 ms statement: SELECT cid, data, created, expire, serialized, tags, checksum FROM cache_data WHERE cid IN ( 'route:/search/node:keys=foobar' ) ORDER BY cid

The actual log_line should also be parsed, to get the duration value out. The complete Logstash filter can be found on this Github page. The end result in Elasticsearch will look like this:

Screenshot from 2017-06-09 14:31:57

Now make Drupal run some queries, search for ‘helloworld’:

Screenshot from 2017-06-09 14:48:56

A search for ‘helloworld’ in Elasticsearch will show that three queries where executed, two SELECT queries and one INSERT query:

Screenshot from 2017-06-09 14:52:26

Screenshot from 2017-06-09 15:01:49

With pg_stat_activity it is possible to get (almost) the same results, but there are a few drawbacks with this approach:
– Logs can not be shown real-time, scheduled queries are needed to get the information out of the pg_stat_activity table.
– The duration of the action is not available. To see query times, you need to load the pg_stat_statements module, which requires additional shared memory.

The logging collector collects all logs, not only query logs, so you will have all logs in one place. If you haven’t seen enough log line, set log_min_messages to DEBUG5, which will show you all logs possible from your database server.

Connect to PostgreSQL database from pgAdmin4 running in Docker container using ssh tunnel

The latest version of pgAdmin, version 4, is not available in the Ubuntu repo’s. To install you need to install Python, set up an virtual environment, etc. It’s a lot easier to run pgAdmin from a Docker container. In this post I will explain how to set this up and create a secure ssh tunnel to your PostgreSQL database from the container.

Install Docker with this simple script:

sudo curl -sSL https://get.docker.com/ | sh

Start the Docker service:

sudo service docker start

Download and start the pgAdmin container:

docker run -p 5050:5050 --name pgadmin nphung/pgadmin

Now open your browser and navigate to http://127.0.0.1:5050. You should see something like this:

pgadmin

The safest way to connect to your database is to use ssh. You don’t need to open up any additional ports this way. Issue the following command on your local computer:

ssh -fNg -L 172.16.1.2:5432:localhost:5432 35.156.108.86

The fist IP adres and port (172.16.1.2:5432) is the IP of your local computer. In your case it will be different, use the command ip addr to look it up. The port 5432 is the port you will be connecting to.

The second IP and port is the address of your PostgreSQL database server.

Now go to your browser and right-click on Servers, Create”, Server”.

pgadmin_server

Enter a name in the General tab. Enter IP address, User name and Password and press Save:

pgadmin_create

Now right-click on the entry you just made and click Connect Server. You should now be connected to your database server.

pgadmin_conn

Configure pfSense using YAML data

Configuring a pfSense firewall using the web-interface works, but it takes a lot of time. With the developer shell you can configure your device using commands. There are already some posts on this topic, this one and this one. Using the simple ruby script on my Github page it is possible to create developer shell commands using a YAML file as input. In VirtualBox it is simple to set up a testing environment. Do a clean install, and enable ssh:

enable_ssh

Temporarily allow all traffic on the WAN interface using pfSense Developer Shell:

allowwan

Example YAML file:

#master.yaml:

system:
  hostname: 'master'
  domain: 'localdomain'
  dnsserver:
    '0': '8.8.8.8'
    '1': '8.8.4.4'

Run the Ruby script:

ruby creator.rb master.yaml > master.conf

This will create a .conf file containing developer shell commands:

#master.conf:

$config['system']['hostname'] = 'master';
$config['system']['domain'] = 'localdomain';
$config['system']['dnsserver']['0'] = '8.8.8.8';
$config['system']['dnsserver']['1'] = '8.8.4.4';
write_config();
exec
exit

Apply all settings from your Linux machine using ssh:

ssh root@192.168.1.122 '/usr/local/sbin/pfSsh.php' < master.conf

Reboot, log on the webinterface and skip the welcome wizard. All settings are applied:
web

Analyse tshark capture in Kibana

Tshark is the terminal version of the packet capture application Wireshark. Using Tshark in combination with an ELK stack (Elasticsearch, Logstash, Kibana) it is possible to display your capture results in graphs. In this post I will explain how to capture network traffic, send it to Elasticsearch using Logstash and display graphs in Kibana. As a client I used Windows, the ELK server runs on Ubuntu.

The following command will capture network traffic for 1 minute. Once it is finished, it will run again, and so on. This will prevent memory issues with tshark and a very large .csv file. It will create a .csv file. Install Wireshark and go to C:\Program Files\Wireshark. Now run tshark:

for /L %G in (*) do tshark -a duration:60 -i 1 -t ad -lT fields -E separator=, -E quote=d -e _ws.col.Time -e _ws.col.Source -e _ws.col.Destination -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e _ws.col.Protocol -e ip.len -e _ws.col.Info > C:\Windows\Temp\tshark.csv

The resulting .csv file will contain lines like that look like this:

“2016-02-12 20:04:12.137523”, “123.123.45.234”, “192.168.1.1”, “123.123.45.234”, “192.168.1.1”, “443”, “63103”, “TLSv1.2”, “987”, “Application Data”

On the Windows client Logstash or Filebeat needs to be installed to transport the .csv file to Elasticsearch. Filebeat is designed for this, you can install it using a Puppet module. On the ELK server Logstash will pick up the beat and apply a filter. Use the csv filter to assign the correct field names to the values in the .csv file.

filter {
  csv {
    source => “message”
    columns => [ “col.Time”, “col.Source”, “col.Destination”, “ip.src”, “ip.dst”, “tcp.srcport”, “tcp.dstport”, “col.Protocol”, “ip.len”, “col.Info” ]
  }
  mutate {
    convert => [ “ip.len”, “integer” ]
  }
  date {
    match => [ “col.Time”, “YYYY-MM-dd HH:mm:ss.SSSSSS” ]
  }
}

Send the results to Elasticsearch, and Kibana will show the data. Now you can start analyzing your network data:

2016-02-12 22_19_09-tshark - Dashboard - Kibana 4

2016_02_12_22_36_21_tshark_Dashboard_Kibana_4

2016_02_12_22_25_56_tshark_Dashboard_Kibana_4

2016-02-12 22_39_22-tshark - Dashboard - Kibana 4.png

Enjoy analyzing!

Windows metrics: part 2

There are two options to get Logstash running on your Windows machine. The first option is a manual installation. You need to download some files, install Java and copy your Logstash config file to the right folder. Another option is to let Puppet install it for you, which is faster and can be repeated easily. On github.com you can find a Puppet module to install Logstash on Windows. I have created a role module which applies the Puppet module and also installs jq and the script mentioned in part 1 of this series.

When using the Get-EventLog System command it is is possible to collect events from Windows eventviewer. In the following screenshot you can see some result in Kibana 4.

eventvwr

Windows metrics: part 1

So you want to know what is happening on your Windows machine? You can with Logstash!

Using Logstash you can collect and process all sorts of event logs. There are various input filters available, for both Linux and Windows. Since PowerShell version 4 it is possible to display PowerShell output in JSON format. This option makes it very easy to import PowerShell output into Logstash.

PowerShell can retrieve any fact about your Windows system, for example memory usage, disk space usage, cpu load, but also events from event viewer, account information from Active Directory, Radius logons from NPS, etc. etc. The possibilities are endless!

Lets take available memory as an example, run in PowerShell:

convertto-json @(Get-WmiObject Win32_OperatingSystem | select FreePhysicalMemory) | Out-File "C:/Windows/Temp/json_array.json" -encoding utf8

Open the file C:/Windows/Temp/json_array.json and you will see something like this:

[
    {
        "FreePhysicalMemory":  254436
    }
]

To use this as an input for Logstash, we will need to do some formatting on this output. Every new line in a textfile is a new event for Logstash. There is a great tool called jq, which can do just what we need. Again, run this in PowerShell:

cmd /c "jq -c .[] < C:/Windows/Temp/json_array.json" | Out-File -Append -NoClobber "C:/Windows/Temp/json_objects.json" -encoding utf8

Open the file C:/Windows/Temp/json_objects.json and you will see something like this:

{"FreePhysicalMemory":254436}

Everytime we run the two commands, an extra line will be added, a new event on every line. This is the sort of input we need for Logstash! In part 2 you can learn how to create the Logstash filters.