Greetings.
Last night koji alerted due to slowness. It was not backups or anything, but rather the database hitting the limit I raised in commit c678f73b:
-autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
+autovacuum_freeze_max_age = 300000000 # maximum XID age before forced vacuum
What this means, basically: postgres records in each table row the xid (transaction id) of the transaction that wrote it, so it can tell which transactions can 'see' which rows. However, the xid is a 32-bit value, meaning there can only be about 2.1 billion transactions before it 'wraps around'. Before that happens, all the 'old' xids need to be gone or postgres will get confused. It removes the old xids by marking old rows as 'frozen' (so any other transaction can see them).
So, this value tells the autovacuum daemon when to start processing the table, freezing old xids, so that by the time the wraparound happens everything will be set.
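For reference, this is visible from the database side: each table's distance from the limit can be checked with a query like this (a sketch, assuming psql access to the koji db):

```sql
-- age(relfrozenxid) is how many xids old the table's oldest unfrozen row is;
-- autovacuum forces a freeze vacuum once it exceeds autovacuum_freeze_max_age.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;
```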
Unfortunately, it's doing this on the buildroot_listing table, which is: public | buildroot_listing | table | koji | 219 GB |
So, the i/o load is heavy and koji is slow to respond to real requests.
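(That size figure is from \d+-style output; the same number can be had directly with the standard size functions, e.g.:)

```sql
-- Total on-disk size of the table, including indexes and toast data
SELECT pg_size_pretty(pg_total_relation_size('public.buildroot_listing'));
```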
There are (at least) three things we could do:
1. Bump autovacuum_freeze_max_age up to 600 million. The 100 million bump I did in January gave us about 1.5 months, so if we go to 600 million, we might last until June, when we will be migrating to the new datacenter. 600 million is still a long way from 2.1 billion, so it should be fine. At that point I hope to move db-koji01 to a RHEL 8 instance and a much newer postgresql. We could also run the vacuum during that downtime and let it finish.
2. Just let it finish now. Things will be slow, and I don't know for how long. Users will complain and it will take longer for people to get things done, but in the end we should be in better shape, and there's basically no action we need to take (other than handling complaints).
3. Schedule an outage and take the db offline and run the vacuum. This might be quicker than letting the autovac finish, I am not sure.
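For reference, option 1 would just be the same kind of one-line postgresql.conf change as before (a sketch, value as proposed above):

```
autovacuum_freeze_max_age = 600000000 # maximum XID age before forced vacuum
```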
Thoughts? Please +1 the freeze break for the option you think is best, or feel free to ask for more info or suggest other options.
kevin
On Sun, 2020-03-01 at 11:26 -0800, Kevin Fenzi wrote:
Greetings.
Last night koji alerted due to slowness. It was not backups or anything, but rather the database hitting the limit I raised in commit c678f73b:
-autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
+autovacuum_freeze_max_age = 300000000 # maximum XID age before forced vacuum
What this means, basically: postgres records in each table row the xid (transaction id) of the transaction that wrote it, so it can tell which transactions can 'see' which rows. However, the xid is a 32-bit value, meaning there can only be about 2.1 billion transactions before it 'wraps around'. Before that happens, all the 'old' xids need to be gone or postgres will get confused. It removes the old xids by marking old rows as 'frozen' (so any other transaction can see them).
So, this value tells the autovacuum daemon when to start processing the table, freezing old xids, so that by the time the wraparound happens everything will be set.
Unfortunately, it's doing this on the buildroot_listing table, which is: public | buildroot_listing | table | koji | 219 GB |
So, the i/o load is heavy and koji is slow to respond to real requests.
There are (at least) three things we could do:
- Bump autovacuum_freeze_max_age up to 600 million. The 100 million bump I did in January gave us about 1.5 months, so if we go to 600 million, we might last until June, when we will be migrating to the new datacenter. 600 million is still a long way from 2.1 billion, so it should be fine. At that point I hope to move db-koji01 to a RHEL 8 instance and a much newer postgresql. We could also run the vacuum during that downtime and let it finish.
- Just let it finish now. Things will be slow, I don't know for how
long. Users will complain and it will take longer for people to get things done, but at the end we should be in better shape and there's basically no action we need to take (other than handling complaints)
- Schedule an outage and take the db offline and run the vacuum. This
might be quicker than letting the autovac finish, I am not sure.
Thoughts? Please +1 the freeze break for the option you think is best, or feel free to ask for more info or suggest other options.
I'm fine with 1 or 2, but think we should definitely *not* do 3 unless a) we know quite precisely how long it will take and b) that is substantially faster than the online autovac.
On Mon, Mar 02, 2020 at 08:17:38AM -0800, Adam Williamson wrote:
I'm fine with 1 or 2, but think we should definitely *not* do 3 unless a) we know quite precisely how long it will take and b) that is substantially faster than the online autovac.
Yeah.
Just as a status update, the auto vac has been running about 1.5 days so far. Things are slow, but there hasn't been a flood of complaints yet.
Load on the db server is around 100 or so, when it's normally about 20
I did disable backups last night because backups + this auto vac definitely make it unusable.
kevin
On Mon, 2 Mar 2020 at 18:53, Kevin Fenzi <kevin@scrye.com> wrote:
On Mon, Mar 02, 2020 at 08:17:38AM -0800, Adam Williamson wrote:
I'm fine with 1 or 2, but think we should definitely *not* do 3 unless a) we know quite precisely how long it will take and b) that is substantially faster than the online autovac.
Yeah.
Just as a status update, the auto vac has been running about 1.5 days so far. Things are slow, but there hasn't been a flood of complaints yet.
Should we try to bite the bullet? Since we are in the freeze for the F32 Beta, there might also be less activity.
Is there any way to see some sort of progress?
Load on the db server is around 100 or so, when it's normally about 20
I did disable backups last night because backups + this auto vac definitely make it unusable.
kevin

_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedorapro...
On Mon, Mar 02, 2020 at 07:52:18PM +0100, Clement Verna wrote:
On Mon, 2 Mar 2020 at 18:53, Kevin Fenzi <kevin@scrye.com> wrote:
On Mon, Mar 02, 2020 at 08:17:38AM -0800, Adam Williamson wrote:
I'm fine with 1 or 2, but think we should definitely *not* do 3 unless a) we know quite precisely how long it will take and b) that is substantially faster than the online autovac.
Yeah.
Just as a status update, the auto vac has been running about 1.5 days so far. Things are slow, but there hasn't been a flood of complaints yet.
Should we try to bite the bullet? Since we are in the freeze for the F32 Beta, there might also be less activity.
Is there any way to see some sort of progress?
Sadly no. There's a progress view you can query... in postgresql 9.6. :( We are on 9.2, and it doesn't have it.
So, all I can do is look and see if it's still running or not. :(
koji | buildroot_listing_rpms | | RowExclusiveLock | t | postgres | autovacuum: VACUUM public.buildroot_listing (to prevent wraparound) | 2020-03-01 05:35:18.412833+00 | 1 day 18:58:54.265952 | 31654
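(For the record, on postgresql 9.6+ we could watch the actual progress; a sketch for after the upgrade:)

```sql
-- pg_stat_progress_vacuum (new in 9.6) reports per-vacuum progress:
-- heap blocks scanned so far vs. total heap blocks in the table.
SELECT p.pid, c.relname, p.phase, p.heap_blks_scanned, p.heap_blks_total
FROM pg_stat_progress_vacuum p
JOIN pg_class c ON c.oid = p.relid;
```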
kevin
On Mon, Mar 02, 2020 at 09:45:10AM -0800, Kevin Fenzi wrote:
On Mon, Mar 02, 2020 at 08:17:38AM -0800, Adam Williamson wrote:
I'm fine with 1 or 2, but think we should definitely *not* do 3 unless a) we know quite precisely how long it will take and b) that is substantially faster than the online autovac.
Yeah.
Just as a status update, the auto vac has been running about 1.5 days so far. Things are slow, but there hasn't been a flood of complaints yet.
In this case I'd say, let's keep it running for a while until we start to see a flood of complaints, at which point we can re-evaluate and see if we want to go with option #1.
I did disable backups last night because backups + this auto vac definitely make it unusable.
Having no backups is always annoying but not having a usable koji is as well, so +1 to that!
Pierre
And our laziness and indecision pays off!
The autovac is finally done and load should be back to normal.
I am going to re-enable backups.
Thanks everyone.
kevin
On Tue, 2020-03-03 at 13:33 -0800, Kevin Fenzi wrote:
And our laziness and indecision pays off!
Woohoo, the Adam Strategy wins again!
On Tue, 3 Mar 2020 at 22:39, Kevin Fenzi <kevin@scrye.com> wrote:
And our laziness and indecision pays off!
The autovac is finally done and load should be back to normal.
Should we lower the autovacuum max age, so that the next time it runs it does not take as long?
I am going to re-enable backups.
Thanks everyone.
kevin
Is there an ETA for the next vacuum process? If we can wait for some months, it might be worth upgrading to a newer RHEL and postgres after the DC migration but before the next autovacuum, instead of lowering the max age now and forcing another autovacuum run on the current old postgres.
On 2020-03-04 09:41, Clement Verna wrote:
On Tue, 3 Mar 2020 at 22:39, Kevin Fenzi <kevin@scrye.com> wrote:
And our laziness and indecision pays off! The autovac is finally done and load should be back to normal.
Should we lower the autovacuum max age, so that the next time it runs it does not take as long?
I am going to re-enable backups. Thanks everyone. kevin
On Wed, 4 Mar 2020 at 04:00, Julen Landa Alustiza <jlanda@fedoraproject.org> wrote:
Is there an ETA for the next vacuum process? If we can wait for some
There is no ETA, as it depends on the number of builds that have been done since the last autovac. A large number means it is further into the future but takes much longer to do; a smaller number means it will happen sooner but finishes faster. The number of builds per unit time is not something that averages very well: you have days where things seem regular, and then a mass of modules and koschei and other items all hit for a couple of days, etc.
months, it might be worth upgrading to a newer RHEL and postgres after the DC migration but before the next autovacuum, instead of lowering the max age now and forcing another autovacuum run on the current old postgres.
On 2020-03-04 09:41, Clement Verna wrote:
On Tue, 3 Mar 2020 at 22:39, Kevin Fenzi <kevin@scrye.com> wrote:
And our laziness and indecision pays off! The autovac is finally done and load should be back to normal.
Should we lower the autovacuum max age, so that the next time it runs it does not take as long?
I am going to re-enable backups. Thanks everyone.
kevin
On Wed, Mar 04, 2020 at 07:45:29AM -0500, Stephen John Smoogen wrote:
On Wed, 4 Mar 2020 at 04:00, Julen Landa Alustiza <jlanda@fedoraproject.org> wrote:
Is there an ETA for the next vacuum process? If we can wait for some
There is no ETA, as it depends on the number of builds that have been done since the last autovac. A large number means it is further into the future but takes much longer to do; a smaller number means it will happen sooner but finishes faster. The number of builds per unit time is not something that averages very well: you have days where things seem regular, and then a mass of modules and koschei and other items all hit for a couple of days, etc.
Well, it took about 13 years to reach 300 million, but it has obviously been speeding up since the early years. The last 100 million took about 1.5 months (since I upped it from the default 200 million to 300 million).
So, I think we have about 4.5 months until the next one, which should be enough time to get to the migration. :) If we hit it early, I'd just bump the value then.
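The arithmetic behind that estimate, as a quick sanity check (assuming xid consumption stays near the recent rate):

```python
# ~100 million xids consumed over the last 1.5 months, and the freeze
# limit is 300 million, so one full cycle at the recent rate is about:
xids_per_month = 100_000_000 / 1.5
months_until_next = 300_000_000 / xids_per_month
print(round(months_until_next, 1))  # 4.5
```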
kevin