Monday, April 4, 2011

Regex 101 - Introduction

Predefined Character Classes

. Any character (matches line terminators only when the DOTALL flag is set)
\d A digit: [0-9]
\D A non-digit: [^0-9]
\s A whitespace character: [ \t\n\x0B\f\r]
\S A non-whitespace character: [^\s]
\w A word character: [a-zA-Z_0-9]
\W A non-word character: [^\w]
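
The classes above can be tried out with a few one-liners. This is only a minimal sketch: the class name CharClassDemo and its helper methods are made up for illustration, and the backslashes are doubled because the patterns are written as Java string literals.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CharClassDemo {
    // True when the whole input consists of one or more digits (\d).
    static boolean isAllDigits(String input) {
        return Pattern.matches("\\d+", input);
    }

    // Collapses each run of whitespace (\s) into a single space.
    static String collapseWhitespace(String input) {
        return input.replaceAll("\\s+", " ");
    }

    // Replaces every non-word character (\W) with an underscore.
    static String sanitize(String input) {
        return input.replaceAll("\\W", "_");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2011"));             // true
        System.out.println(isAllDigits("20x1"));             // false
        System.out.println(collapseWhitespace("a \t\n b"));  // a b
        System.out.println(sanitize("regex 101!"));          // regex_101_
    }
}
```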

Quantifiers
Greedy  Reluctant  Possessive  Meaning
X?  X??  X?+  X, once or not at all  
X*  X*?  X*+  X, zero or more times  
X+  X+?  X++  X, one or more times  
X{n}  X{n}?  X{n}+  X, exactly n times  
X{n,}  X{n,}?  X{n,}+  X, at least n times  
X{n,m}  X{n,m}?  X{n,m}+  X, at least n but not more than m times
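
The difference between the three columns is easiest to see by running the same pattern in each form. A small sketch, assuming a hypothetical helper firstMatch that returns the first match or null:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; firstMatch is an illustrative helper.
class QuantifierDemo {
    // Returns the first substring of input matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";

        // Greedy: .* takes everything, then backs off until "foo" fits.
        System.out.println(firstMatch(".*foo", input));   // xfooxxxxxxfoo

        // Reluctant: .*? takes as little as possible before "foo".
        System.out.println(firstMatch(".*?foo", input));  // xfoo

        // Possessive: .*+ takes everything and never backs off, so "foo" cannot match.
        System.out.println(firstMatch(".*+foo", input));  // null

        // Bounded repetition: greedy prefers the upper bound, reluctant the lower.
        System.out.println(firstMatch("a{2,3}", "aaaa"));   // aaa
        System.out.println(firstMatch("a{2,3}?", "aaaa"));  // aa
    }
}
```

Possessive quantifiers are mostly useful when backtracking is known to be pointless, since they fail faster than the greedy form.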

Boundary Matchers
^  The beginning of a line  
$  The end of a line  
\b  A word boundary  
\B  A non-word boundary  
\A  The beginning of the input  
\G  The end of the previous match  
\Z  The end of the input but for the final terminator, if any  
\z  The end of the input
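
Boundary matchers consume no characters; they only assert a position. The sketch below counts matches to make that visible. The class BoundaryDemo and the helper countMatches are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; countMatches is an illustrative helper.
class BoundaryDemo {
    // Counts how many times regex is found in input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "cat concat cat";

        // \b requires a word boundary on both sides, so "concat" is skipped.
        System.out.println(countMatches("\\bcat\\b", text)); // 2

        // \B requires the absence of a boundary, so only "concat" qualifies.
        System.out.println(countMatches("\\Bcat", text));    // 1

        // \G anchors each match to the end of the previous one.
        System.out.println(countMatches("\\Gab", "ababxab")); // 2

        // \Z tolerates a final line terminator; \z does not.
        System.out.println(countMatches("end\\Z", "the end\n")); // 1
        System.out.println(countMatches("end\\z", "the end\n")); // 0
    }
}
```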

Constant  Equivalent Embedded Flag Expression
Pattern.CANON_EQ  None  
Pattern.CASE_INSENSITIVE  (?i)  
Pattern.COMMENTS  (?x)  
Pattern.MULTILINE  (?m)  
Pattern.DOTALL  (?s)  
Pattern.LITERAL  None  
Pattern.UNICODE_CASE  (?u)  
Pattern.UNIX_LINES  (?d)
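
A flag can be passed as a constant to Pattern.compile or written inline at the start of the pattern. The following sketch compares both forms for a few of the flags listed above; the class FlagDemo and the helper found are hypothetical names used only here.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; found is an illustrative helper.
class FlagDemo {
    // True when the pattern matches somewhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        // The constant and the embedded expression behave the same way.
        Pattern byConstant = Pattern.compile("regex", Pattern.CASE_INSENSITIVE);
        Pattern byEmbedded = Pattern.compile("(?i)regex");
        System.out.println(found(byConstant, "REGEX 101")); // true
        System.out.println(found(byEmbedded, "REGEX 101")); // true

        // MULTILINE lets ^ match right after a line terminator.
        System.out.println(found(Pattern.compile("^b"), "a\nb"));     // false
        System.out.println(found(Pattern.compile("(?m)^b"), "a\nb")); // true

        // DOTALL lets . match a line terminator.
        System.out.println(found(Pattern.compile("a.b"), "a\nb"));     // false
        System.out.println(found(Pattern.compile("(?s)a.b"), "a\nb")); // true

        // LITERAL has no embedded form; metacharacters lose their meaning.
        Pattern literal = Pattern.compile("a.b", Pattern.LITERAL);
        System.out.println(found(literal, "axb")); // false
        System.out.println(found(literal, "a.b")); // true
    }
}
```

Several constants can be combined with the bitwise OR operator, for example Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, which corresponds to the embedded form (?im).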
