
What would log(s) say - Part 1: Find requests by HTTP status code

As the title suggests, this will be a multi-part post. I hope you find the series useful.
The idea for this post and blog originated when I posted a possible quick solution in an internal forum. Two people reached out saying that I should document it. Since I keep experimenting with this kind of thing, why not share it in the form of a blog?
Trivia: The post title is inspired by a common Hindi saying, "Log Kya Kahenge". Yes, pun intended.
Today's solution: from the access logs, get the list of URLs that resulted in an error and sort them from most frequent to least frequent.
 
awk '{if($9>=400) { print $9 "\t" $7}}' access.log* | cut -d '?' -f1 | sort -k2 | uniq -c | sort -nr

Explanation:
Scan through all the access.log files in the folder and look for requests whose status code is 400 or above. The idea is to catch only errors such as 404, 500, etc.
With awk, we print only the status code ($9) and the path that is broken ($7).
With cut, we drop the query string from the path, as we are only interested in the URI.
With sort and uniq, we group identical lines and count them, and the final sort -nr puts the URLs causing the most errors at the top.
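To see each stage at work, here is a minimal sketch that runs the same pipeline on a tiny made-up log. The four log lines, the temp file and the function name count_errors are all assumptions for illustration, not real AEM data. The filter uses >= 400 so that a plain 400 response would be counted too.

```shell
# Hypothetical sample data: four made-up requests in Common Log Format.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /favicon.ico HTTP/1.1" 404 209
127.0.0.1 - - [10/Oct/2023:13:55:37 +0000] "GET /favicon.ico?v=2 HTTP/1.1" 404 209
127.0.0.1 - - [10/Oct/2023:13:55:38 +0000] "GET /content/home.html HTTP/1.1" 200 5120
127.0.0.1 - - [10/Oct/2023:13:55:39 +0000] "GET /content/broken.html HTTP/1.1" 500 310
EOF

# The one-liner wrapped in a function (the name is illustrative).
count_errors() {
  # Keep only error responses; print status code and request path.
  awk '$9 >= 400 { print $9 "\t" $7 }' "$@" |
    # Drop the query string so /favicon.ico?v=2 counts as /favicon.ico.
    cut -d '?' -f1 |
    # Bring identical lines together, count them, biggest count first.
    sort -k2 |
    uniq -c |
    sort -nr
}

result=$(count_errors "$sample_log")
printf '%s\n' "$result"
rm -f "$sample_log"
```

On this sample, the two favicon requests collapse into a single line with a count of 2, the 500 shows up once, and the 200 request is filtered out. The exact padding in front of the count depends on your uniq implementation.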

Output:
The command generates output like this. The first column is the error count, the second is the status code, and the third is the path.
   7 404 /etc.clientlibs/settings/wcm/designs/default/resources.css
   4 500 /content/experience-fragments/test-fragment/test-fragment.html
   4 404 /editor.htmlblank
   2 404 /favicon.ico
   1 500 /libs/cq/gui/components/authoring/editors/clientlibs/sites/page.js
   1 500 /libs/cq/experience-fragments/content/commons/targetexporter.html
   1 404 /libs/wcm/core/content/pageinfo.json
Assumption:
The script is based on the standard AEM access log format. However, the command can work with any server log, provided the access log captures both the URL and the status code. If they sit in different fields in your log, adjust $7 and $9 accordingly.
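If you are not sure which field numbers apply to your log format, a quick way to check is to print one line with every field numbered. This is only a sketch: the helper name show_fields and the sample line (plain Common Log Format) are assumptions for illustration.

```shell
# Print each whitespace-separated field of the first input line with its index.
# Reads the files given as arguments, or stdin when none are given.
show_fields() {
  awk '{ for (i = 1; i <= NF; i++) printf "%d\t%s\n", i, $i; exit }' "$@"
}

# Hypothetical log line, used only for illustration.
sample_line='127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /favicon.ico HTTP/1.1" 404 209'
fields=$(printf '%s\n' "$sample_line" | show_fields)
printf '%s\n' "$fields"
```

For this sample line the path lands in field 7 and the status code in field 9, which is what the command above expects. Against a real file you could run show_fields access.log and read off the right numbers.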
P.S.: In the enterprise space, it's very common to see log aggregators, and they are all great. However, there are times when I have been restricted from getting the insights I need, and that's where these neat utilities come in handy.
Let me know what you think.
