
Make dispatcher beautiful again

Today I started setting up a local AEM dispatcher for the first time in years. The start wasn't great: after setting up SSL, the vhost and dispatcher.any, I ran into an error because the dispatcher doesn't work with the Apache shipped with macOS. That was the first learning of the day.

Thanks to brew, I was up and running with the latest Apache in minutes.

Once the installation was working, I started on the filter rules and was annoyed by the numbering of the rules. This is what my filter looked like.

 /filter {  
    /0001 { /type "deny" /glob "*" }  
    /0002 { /type "allow" /url "/content*" }  
    /0003 { /type "allow" /extension '(clientlibs|css|gif|ico|js|png|swf|jpe?g|woff2?)' }  
    /0004 { /type "allow" /url "/libs/cq/personalization/*" } #enable personalization  
    /0005 { /type "deny" /selectors '((sys|doc)view|query|[0-9-]+)' /extension '(json|xml)' }  
    /0006 { /type "deny" /path "/content" /selectors '(feed|rss|pages|languages|blueprint|infinity|tidy)' /extension '(json|xml|html)' }  
    /0007 { /type "allow" /url "/libs/granite/csrf/token.json*" }
 }  

I could see several problems with it:
  1. The number has no significance for the rule order. The rules are evaluated top-to-bottom irrespective of the rule number.
  2. It's common that a rule needs to be added in the middle to optimize evaluation performance. If you are a person like me who prefers keeping things in order, this becomes annoying, as every rule below the insertion needs a +1 in its rule number. That makes the git diff messy.
  3. One really needs to read the complete line to understand what a rule does, as the rule number doesn't give any hint.

That made me think. Wouldn't it be nice if we could apply clean code to dispatcher.any and give a meaningful name to each rule?

Guess what. You can. A quick test confirmed that I can name the rules the way I want. Here is the same configuration with clean code applied.

 /filter {
    /deny-all { /type "deny" /glob "*" }
    /allow-content { /type "allow" /url "/content*" }
    /allow-extensions { /type "allow" /extension '(clientlibs|css|gif|ico|js|png|swf|jpe?g|woff2?)' }
    /allow-cq-personalization { /type "allow" /url "/libs/cq/personalization/*" }
    /deny-selectors-on-all-paths { /type "deny" /selectors '((sys|doc)view|query|[0-9-]+)' /extension '(json|xml)' }
    /deny-selectors-on-content { /type "deny" /path "/content" /selectors '(feed|rss|pages|languages|blueprint|infinity|tidy)' /extension '(json|xml|html)' }
    /allow-csrf-token { /type "allow" /url "/libs/granite/csrf/token.json*" }
 }
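
To illustrate the second problem from the list, here is a rough sketch of adding a rule in the middle of the named configuration. The /allow-etc-clientlibs rule and its /etc.clientlibs/* path are assumptions made up for this illustration and are not part of the configuration above. The point is only that the neighbouring rules keep their names, so the diff should be a single added line.

```
 /filter {
    /deny-all { /type "deny" /glob "*" }
    /allow-content { /type "allow" /url "/content*" }
    # hypothetical rule, added only to show the insertion; nothing below it is renamed
    /allow-etc-clientlibs { /type "allow" /url "/etc.clientlibs/*" }
    /allow-extensions { /type "allow" /extension '(clientlibs|css|gif|ico|js|png|swf|jpe?g|woff2?)' }
    /allow-cq-personalization { /type "allow" /url "/libs/cq/personalization/*" }
    /deny-selectors-on-all-paths { /type "deny" /selectors '((sys|doc)view|query|[0-9-]+)' /extension '(json|xml)' }
    /deny-selectors-on-content { /type "deny" /path "/content" /selectors '(feed|rss|pages|languages|blueprint|infinity|tidy)' /extension '(json|xml|html)' }
    /allow-csrf-token { /type "allow" /url "/libs/granite/csrf/token.json*" }
 }
```

With the numbered version, the same insertion would have meant bumping /0003 through /0007 by one to keep the numbers in order.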

I hope this was helpful.
