[GTALUG] logrotate problem

Giles Orr gilesorr at gmail.com
Fri Oct 18 16:26:07 EDT 2019


On Thu, 17 Oct 2019 at 19:46, Michael Galea via talk <talk at gtalug.org>
wrote:

> On 2019-10-17 1:59 p.m., Giles Orr via talk wrote:
> >
> > We have a bunch of new(ish) Debian 10 VMs, and logrotate is failing to
> > rotate our non-standard logs.  Unfortunately we deleted all the old
> > Debian 9 VMs before I noticed this problem, so they're not readily
> > available for comparison.  The logrotate config files worked fine on
> > Debian 9 (provisioning is with Ansible, so it's consistent).  The
> > failures aren't detailed enough to help.  Here's the config:
> >
> >      # /etc/logrotate.d/ruby
> >      /opt/rubyapp/log/*.log {
> >              daily
> >              missingok
> >              rotate 28
> >              compress
> >              delaycompress
> >              copytruncate
> >      }
> >
> > The parent configuration is standard Debian 10:
> >
> >      # /etc/logrotate.conf
> >      # (system-supplied comments removed)
> >      weekly
> >      rotate 4
> >      create
> >      include /etc/logrotate.d
> >
> > Unfortunately my paranoia is such that I'm redacting or modifying
> > machine names and folder names ... I apologize for that.  But I don't
> > think the path involved is the problem.
> >
> > Here's one of the errors:
> >
> >      # systemctl status logrotate.service
> >      ● logrotate.service - Rotate log files
> >         Loaded: loaded (/lib/systemd/system/logrotate.service; static;
> > vendor preset: enabled)
> >         Active: failed (Result: exit-code) since Thu 2019-10-17 00:00:17
> > EDT; 12h ago
> >           Docs: man:logrotate(8)
> >                 man:logrotate.conf(5)
> >        Process: 29004 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf
> > (code=exited, status=1/FAILURE)
> >       Main PID: 29004 (code=exited, status=1/FAILURE)
> >
> >      Oct 17 00:00:01 acctserver systemd[1]: Starting Rotate log files...
> >      Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open
> > /opt/rubyapp/log/newrelic_agent.log.1 for compression
> >      Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open
> > /opt/rubyapp/log/puma.stderr.log.1 for compression
> >      Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open
> > /opt/rubyapp/log/puma.stdout.log.1 for compression
> >      Oct 17 00:00:14 acctserver logrotate[8621]: error: unable to open
> > /opt/rubyapp/log/traffic.log.1 for compression
> >      Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Main
> > process exited, code=exited, status=1/FAILURE
> >      Oct 17 00:00:17 acctserver systemd[1]: logrotate.service: Failed
> > with result 'exit-code'.
> >      Oct 17 00:00:17 acctserver systemd[1]: Failed to start Rotate log
> > files.
> >
> > Here's the folder contents:
> >
> >      # cd /opt/rubyapp/log
> >      # ls -l
> >      -rw-rw-r--+ 1 root root        1982 Oct 16 15:08 newrelic_agent.log
> >      -rw-rw-r--+ 1 root root        7194 Oct 16 13:37
> newrelic_agent.log.1
> >      -rw-rw-r--+ 1 root root        2549 Oct 10 17:45
> > newrelic_agent.log.2.gz
> >      -rw-rw-r--+ 1 root root      154290 Oct 17 12:34 puma.stderr.log
> >      -rw-rw-r--+ 1 root root      573253 Oct 16 13:37 puma.stderr.log.1
> >      -rw-rw-r--+ 1 root root      512648 Oct 10 17:45
> puma.stderr.log.2.gz
> >      -rw-rw-r--+ 1 root root         238 Oct 16 15:08 puma.stdout.log
> >      -rw-rw-r--+ 1 root root         722 Oct 16 13:37 puma.stdout.log.1
> >      -rw-rw-r--+ 1 root root         701 Oct 10 17:45
> puma.stdout.log.2.gz
> >      -rw-rw-r--+ 1 root root  4747006453 Oct 17 12:37 traffic.log
> >      -rw-rw-r--+ 1 root root 15668065757 Oct 10 17:55 traffic.log.1
> >      -rw-rw-r--+ 1 root root   850646513 Sep 20 18:12 traffic.log.2.gz
> >
> > I note that in /var/log/ - where logrotate continues to work fine - that
> > files are owned mostly root:adm (what is 'adm', and does it matter in
> > this context?) and the permissions are 640 rather than 664.  There are
> > ACLs attached to the files/folder shown above ... does _that_ matter?
> > Where this gets weirder is that if I run 'logrotate --force
> > /etc/logrotate.d/ruby' it gets rotated fine.  It runs fine if run by
> > hand, it fails if run on a SystemD timer.  Which suggests a difference
> > in permissions, but don't timers run as root:root?
> >
> > Any thoughts appreciated.  As you can see, these are damn big logs, and
> > we have this problem across multiple machines so I'd really like to fix
> > it ...
> >
> > Errors on other servers aren't always consistent with this: a fix for
> > this may or may not help with them, so I may be coming back for more.
> >
> > Thanks all.
> >
> > --
> > Giles
>
> I note that the versions of logrotate in stable (and also in unstable)
> have Bug
> "#831764 [logrotate] logrotate: does not rotate logs if delaycompress is
> used" logged against them. I'm not sure your problem sounds like the one
> posted in the bug report though.
>
> I think I remember from a previous life that logrotate will fail if the
> logging entity (rubyapp?) holds the file open. In that case you can try
> and use the post-rotate directive to "killall -hup .. " the entity,
> which should kick it off the file.
>
> Does /var/lib/logrotate/status have anything to say about the log? It
> should indicate when the last rotate occurred.
>

Yup, Ruby does hold the file open, but the logrotate option 'copytruncate'
was created for exactly that reason and worked on Debian 9: again, this
works when run by hand on Debian 10, which I think suggests that's not the
problem.

I was aware of the /var/lib/logrotate/status file, but didn't look at it
closely until you mentioned it.  And in looking at it closely, it mostly
looks as it should BUT I see a problem, an artifact of a previous attempt
to do log rotation on the parent folder:

        "/opt/rubyapp/*.log" 2019-9-17-0:0:0
        "/opt/rubyapp/log/traffic.log" 2019-10-18-0:0:5
        "/opt/rubyapp/log/puma.stdout.log" 2019-10-18-0:0:5

(lots of other lines deleted).  Notice that first line: we've been
inconsistent across our projects about where the Ruby/Puma logs were
stored, so I tried to rotate the logs in both places.  I've removed that
artifact.  It's a long shot, but at this point I'm giving percentage to
long shots ...

I'm also removing "delaycompress" from one of the hosts, we'll see if that
does anything.  Again, seems like a long shot, but ...

Thanks.

-- 
Giles
https://www.gilesorr.com/
gilesorr at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gtalug.org/pipermail/talk/attachments/20191018/89f588c9/attachment.html>


More information about the talk mailing list