Friday, September 6, 2013

How to Use Regular Expressions

Regular expressions are one of the most powerful tools available to sysadmins and programmers. If you are good with them, you can solve many day-to-day problems very quickly, e.g. find all the lines with ERROR in a log file, count the occurrences of a particular ID in a log file, find exceptions, etc.

A regular expression is a generic concept that has been implemented in many different languages and tools, including Java. Here are some of the tools and languages that use regular expressions:

  • The vi editor which comes standard with the Unix/Linux operating system.
  • Any decent programmer's editor, e.g. EditPlus, Notepad++.
  • The grep command, found standard on many operating systems including Unix/Linux.
  • The sed command, found on Unix/Linux.
  • The Perl programming language.  
  • The PHP programming language. 

Even if you feel regular expressions are complex and hard to learn, I would suggest familiarizing yourself with at least a basic set of regexes and using them as much as possible. Later you will only want to learn more and more so you can get things done quickly.

Here are some basic regexes and their examples:

1) If you want to find lines containing ERROR in a log file on Linux:
grep ERROR logfile

2) If you want to find lines that start with ERROR:
grep '^ERROR' logfile     (^ anchors the match to the start of the line)

3) If you want to find lines that end with ERROR:
grep 'ERROR$' logfile     ($ anchors the match to the end of the line)

4) If you want to find empty lines in a log file:
grep '^$' logfile
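
To try examples 1 to 4 safely, here is a small sketch that builds a throwaway log file and counts the matches for each pattern. The file contents and variable names are made up for illustration; -c is the standard grep option that prints the number of matching lines, which is also one way to count how often something shows up in a log.

```shell
# Hypothetical sample log, created only for this illustration.
logfile=$(mktemp)
printf '%s\n' \
  'ERROR disk full' \
  'INFO saw ERROR earlier' \
  'job ended with ERROR' \
  '' \
  'INFO all good' > "$logfile"

# -c prints the number of matching lines instead of the lines themselves.
contains=$(grep -c 'ERROR' "$logfile")   # lines containing ERROR anywhere
starts=$(grep -c '^ERROR' "$logfile")    # lines starting with ERROR
ends=$(grep -c 'ERROR$' "$logfile")      # lines ending with ERROR
empty=$(grep -c '^$' "$logfile")         # empty lines

echo "contains=$contains starts=$starts ends=$ends empty=$empty"
# prints: contains=3 starts=1 ends=1 empty=1

rm -f "$logfile"
```

The patterns are wrapped in single quotes so that the shell passes characters such as ^ and $ to grep untouched.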

5) If you want to match ERROR or eRROR (upper- or lower-case first letter):
grep '[Ee]RROR' logfile             ([] matches any one of the listed characters; use grep -i to ignore case for the whole word)

6) If you want to find all lines that contain ERROR or Exception:
egrep 'ERROR|Exception' logfile   (| is used for an OR condition; quote the pattern so the shell does not treat | as a pipe; egrep is the same as grep -E)
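
The character class and OR examples can be checked the same way. This is only a sketch on a made-up log file; it uses grep -E (the equivalent of egrep) and quotes every pattern, because an unquoted | would be taken by the shell as a pipe and unquoted brackets can be expanded as a filename pattern.

```shell
# Hypothetical sample log, created only for this illustration.
logfile=$(mktemp)
printf '%s\n' \
  'ERROR disk full' \
  'eRROR typo in level' \
  'error lower case' \
  'java.lang.Exception: boom' \
  'INFO all good' > "$logfile"

# [Ee] matches exactly one character, either E or e; the rest is case-sensitive.
class=$(grep -c '[Ee]RROR' "$logfile")

# -i ignores case for the whole pattern, so ERROR, eRROR and error all match.
nocase=$(grep -ci 'error' "$logfile")

# -E enables extended regular expressions, where | means OR.
either=$(grep -Ec 'ERROR|Exception' "$logfile")

echo "class=$class nocase=$nocase either=$either"
# prints: class=2 nocase=3 either=2

rm -f "$logfile"
```

Note that [Ee]RROR does not match the all-lowercase line; for a fully case-insensitive search, -i is the simpler choice.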

7) Match the letter E when it appears at least 3 times in a row (3, 4, or more): E{3,}

8) Match the letter E when it appears 3 times in a row, 6 times in a row, or anything in between: E{3,6}

9) Match E when it appears 1 or more times in a row:
E+

(E+ and E{1,} mean exactly the same thing!)
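
The repetition patterns can be tried with grep as well, with one caveat: in grep's default (basic) syntax, + and { are ordinary characters, so the sketch below uses grep -E. The word list is made up for illustration, and -x is added so that the pattern has to match the whole line; without it, E{3,6} would also match inside a longer run of E's.

```shell
# Hypothetical test input: one candidate string per line.
words=$(mktemp)
printf '%s\n' 'E' 'EE' 'EEE' 'EEEEEEE' 'abc' > "$words"

# -E: extended regex syntax, -x: match the whole line, -c: count matching lines.
three_plus=$(grep -Exc 'E{3,}' "$words")     # EEE and EEEEEEE
three_six=$(grep -Exc 'E{3,6}' "$words")     # only EEE (seven E's is too many)
one_plus=$(grep -Exc 'E+' "$words")          # E, EE, EEE, EEEEEEE
one_plus_alt=$(grep -Exc 'E{1,}' "$words")   # same set as E+

echo "three_plus=$three_plus three_six=$three_six one_plus=$one_plus one_plus_alt=$one_plus_alt"
# prints: three_plus=2 three_six=1 one_plus=4 one_plus_alt=4

rm -f "$words"
```

The last two counts are equal, which is the E+ versus E{1,} equivalence in practice.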
