goaccess: Apache (httpd) LogFormat for common or combined

I use goaccess to quickly and unobtrusively collect analytics on a handful of websites I am responsible for. If you need a very lightweight (and easy to operate) web server log analysis tool, I encourage you to check out goaccess.

My Apache (httpd) instance was configured for COMBINED logging (access_log). This worked, but it did not let me obtain “per vhost” traffic stats in my goaccess reports. I figured it would be relatively easy to get this information by switching to either Common Log Format with Virtual Host or NCSA extended/combined log format with Virtual Host.

In hindsight the cause makes perfect sense, but if you encounter errors like the following in goaccess (when parsing VCOMMON or VCOMBINED Apache access logs), read on.

Token '-0600]' doesn't match specifier '%h'

Format Errors - Verify your log/date/time format

Proper LogFormat directives (configuration)

Your Apache LogFormat directives must match the following for goaccess to properly parse vhost information (out of the box):

VCOMMON (Common Log Format with Virtual Host):

LogFormat "%v:%p %h %l %u %t \"%r\" %>s %b" commonvhost
CustomLog "logs/access_log" commonvhost

VCOMBINED (NCSA extended/combined log format with Virtual Host):

LogFormat "%v:%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combinedvhost
CustomLog "logs/access_log" combinedvhost
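
If you want a quick sanity check that your access log really carries the vhost:port prefix before pointing goaccess at it, a small Python sketch like the one below may help. The regular expression, the function name has_vhost_and_port, and the sample lines are my own illustrations (they are not part of goaccess or httpd), and the pattern only covers the leading fields shared by the two formats above.

```python
import re

# Rough shape of a line written by the commonvhost/combinedvhost formats:
# vhost:port client ident user [timestamp] "request" status bytes ...
# This is an illustrative pattern, not an exhaustive log grammar.
VHOST_LINE = re.compile(
    r'^(?P<vhost>[^\s:]+):(?P<port>\d+) '   # %v:%p
    r'(?P<host>\S+) \S+ \S+ '               # %h %l %u
    r'\[(?P<time>[^\]]+)\] '                # %t
    r'"(?P<request>[^"]*)" '                # "%r"
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'   # %>s %b
)


def has_vhost_and_port(line: str) -> bool:
    """Return True if the line starts with vhost:port followed by the usual fields."""
    return VHOST_LINE.match(line) is not None


# Made-up sample lines, for illustration only.
with_port = ('www.example.com:443 203.0.113.7 - - [10/Oct/2023:13:55:36 -0600] '
             '"GET / HTTP/1.1" 200 2326 "-" "curl/8.0"')
without_port = ('www.example.com 203.0.113.7 - - [10/Oct/2023:13:55:36 -0600] '
                '"GET / HTTP/1.1" 200 2326 "-" "curl/8.0"')

print(has_vhost_and_port(with_port))
print(has_vhost_and_port(without_port))
```

Only the first sample matches, because the second one has a space where the pattern requires the colon and port.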

The confusion comes from the official httpd documentation for LogFormat. There, Common Log Format with Virtual Host is defined as follows:

"%v %h %l %u %t \"%r\" %>s %b"

Notice the vhost port (%p) is not included.

By extension, I (erroneously) surmised that NCSA extended/combined log format with Virtual Host would be defined with:

"%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

Logging just the vhost name (%v) without the port prevents goaccess from parsing these logs out of the box, because its predefined VCOMMON and VCOMBINED formats expect the vhost to be followed by a colon and a port. Appending the port (:%p) fixes that.
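
To see why the error names the token '-0600]', it helps to walk through the start of a line by hand. As far as I can tell from the goaccess man page, the predefined VCOMBINED format begins with %v:%^ %h, meaning: read the vhost up to a colon, skip to the next space, then expect a client IP address. The Python sketch below is a deliberately simplified model of that delimiter-driven walk, not goaccess's actual parser; the function name walk_prefix and the sample lines are made up for illustration.

```python
import ipaddress


def walk_prefix(line: str):
    """Mimic the first three specifiers of '%v:%^ %h' (simplified model only)."""
    # %v: everything up to the first ':' is taken as the virtual host.
    vhost, _, rest = line.partition(":")
    # %^: skip everything up to the next space (normally the port).
    skipped, _, rest = rest.partition(" ")
    # %h: the next space-delimited token must be a valid IPv4/IPv6 address.
    host_token, _, rest = rest.partition(" ")
    try:
        ipaddress.ip_address(host_token)
        host_ok = True
    except ValueError:
        host_ok = False
    return vhost, skipped, host_token, host_ok


# Made-up sample lines, for illustration only.
with_port = ('www.example.com:443 203.0.113.7 - - [10/Oct/2023:13:55:36 -0600] '
             '"GET / HTTP/1.1" 200 2326 "-" "curl/8.0"')
without_port = ('www.example.com 203.0.113.7 - - [10/Oct/2023:13:55:36 -0600] '
                '"GET / HTTP/1.1" 200 2326 "-" "curl/8.0"')

for sample in (with_port, without_port):
    print(walk_prefix(sample))
```

For the line without a port, the first colon on the line is the one inside the timestamp, so the "vhost" swallows everything up to the date, the skip step consumes the time of day, and the timezone offset -0600] is what gets offered as the client address.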

Hope this helps someone in a similar situation.