I'm confused about something. Can any postfix experts help debug?
On Friday I put a new package-owner-alias cronjob, written by Matt Prahl, in place on bastion. It generates the package owner list from Pagure over dist-git instead of from pkgdb.
https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=eb24...
Kevin saw that the hourly cronjob is spitting out some errors from the postfix command.
However, when I inspect the file, I don't find any duplicates:
$ awk '{ print $1 }' /etc/postfix/package-owner | sort | uniq -d
Anybody know what's wrong with that file/output?
The problem is not in the file/output. It's in the running of the script. If Pagure on Dist-Git takes too long to reply, or is down, haproxy returns an HTML page with "this app is offline".
That is what's causing the failure. The script tries to parse that HTML page as JSON and dies with:
    simplejson.scanner.JSONDecodeError: Expecting value: line 2 column 1 (char 1)
    error creating owner-alias file
Given that the script asks Pagure to page through a lot of projects every time, the chance of at least one of those requests failing gets quite high. As a result, basically every run crashes partway through because a Pagure response took too long or was aborted by haproxy.
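
One way to make the script more tolerant is to retry a page when the body does not parse as JSON, rather than crashing on the first bad response. The following is only a rough sketch, not what the cronjob actually does: load_json_with_retry, fetch_text, attempts, delay and sleep are all made-up names for illustration, and it uses the standard json module rather than simplejson.

```python
import json
import time


def load_json_with_retry(fetch_text, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch_text() until its result parses as JSON, or give up.

    fetch_text is a hypothetical callable that returns one response body
    as a string, e.g. a small wrapper around an HTTP request for one page.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            body = fetch_text()
            # An HTML error page such as "this app is offline" is not
            # valid JSON, so json.loads raises a ValueError subclass here.
            return json.loads(body)
        except (ValueError, OSError) as err:
            # ValueError: body was not JSON. OSError: network-level failure.
            last_error = err
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                sleep(delay * attempt)
    raise RuntimeError(
        "no valid JSON after %d attempts: %s" % (attempts, last_error)
    )


if __name__ == "__main__":
    # Simulated responses: one HTML error page, then a valid JSON page.
    fake_bodies = iter(["<html>this app is offline</html>", '{"projects": []}'])
    print(load_json_with_retry(lambda: next(fake_bodies), delay=0))
```

If every attempt fails, the function raises, so the caller can exit without touching the existing alias file instead of writing a partial one.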
-Ralph