Faster grepping with awk.

Turns out that in some cases, awk is much, much faster than grep.

Just now, I wanted to know how many unique MAC addresses appeared in our DHCP server’s log file asking for a lease but not getting it.  There are a few ways to skin this cat.  What’s interesting is that some ways are *much* faster than others, and when you’re searching through large log files, speed helps.

This is all on openSUSE 10.3 (x86-64) with kernel 2.6.22.17-0.1-default. Your mileage will vary, of course, but the ratio should be about the same.

wc daemon.log
  90145 1044693 9287866 daemon.log

So, about 90,000 lines and 9.3 MB. Not a huge file. I'm searching for a fixed phrase, no fancy regexp.

With GNU grep 2.5.2:

time (grep 'no free leases' daemon.log > t1)
real    0m12.512s
user    0m12.505s
sys    0m0.004s

I tried various switches to optimize, like -F and -E, and putting a $ at the end of the search string. No help. Looks like the built-in optimizer knows as much as I do in this case.

With GNU Awk 3.1.5g:

time (awk '/no free leases/ {print}' daemon.log > t1)
real	0m0.558s
user	0m0.548s
sys	0m0.012s

That's roughly 22 times faster in this test (12.5 seconds versus 0.56). So, learn the basic awk syntax and start using it instead of always reaching for grep.
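
For the original question (how many unique MAC addresses asked for a lease and didn't get one), awk can do the filtering and the counting in one pass. What follows is only a sketch, not the exact command I ran. It assumes the MAC address shows up on each matching line as its own whitespace-separated field of six colon-separated hex pairs, and the function name count_unleased_macs is made up for illustration. Check a few lines of your own log before trusting the number.

```shell
# Hypothetical helper: count the unique MAC addresses on lines that
# contain "no free leases".
# Assumption: the MAC is a standalone field such as 00:11:22:33:44:55.
count_unleased_macs() {
    awk '
    BEGIN {
        # Build the pattern by hand so it works without interval
        # expressions, which some older awks do not support.
        h = "[0-9A-Fa-f][0-9A-Fa-f]"
        mac = "^" h ":" h ":" h ":" h ":" h ":" h "$"
        n = 0
    }
    /no free leases/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ mac) {
                # Lower-case the key so AA:BB... and aa:bb... count once.
                seen[tolower($i)] = 1
                break
            }
        }
    }
    END {
        for (m in seen) n++
        print n
    }' "$@"
}

# Usage: count_unleased_macs daemon.log
```

The array does the job that a sort | uniq | wc -l pipeline would otherwise do, so the log is read only once. Only the first MAC-shaped field on each line is counted.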
