Support Forums

Teaching Regular Expressions (Regex)
Note: Sorry if I am unclear about some things; when I teach, I tend to explain things abstractly.

A regular expression (regex) is a sort of pattern-matching mini-language. Regex can be very useful for anything involving string manipulation, such as filtering out certain characters, matching, or input validation. It can also be used for searches! In this tutorial, you are going to learn the basics of regex and, hopefully, how to work with this sort of "mini-language."

Be aware that regex has its own syntax, which is different from Java's.
Here is an example of regex pattern matching:
Code:
Enter Regex: abc
Enter String: abc
Match
Enter String: abcd
No Match
In this example, the user enters abc as the regex. If the string entered next corresponds exactly to the regex, it matches. If it's not exactly the same (as with abcd, which has one extra character), it does not match.
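
The transcript above can be reproduced in a few lines. This is only a sketch, assuming Python and its standard `re` module (the tutorial itself is not tied to any one language); `check` is a made-up helper name. It uses `re.fullmatch`, which requires the pattern to cover the whole string, and shows `re.search` for comparison.

```python
import re

def check(pattern: str, text: str) -> str:
    """Return "Match" only if the whole of text fits the pattern."""
    # re.fullmatch requires the pattern to cover the entire string,
    # which mirrors the all-or-nothing behaviour in the transcript.
    return "Match" if re.fullmatch(pattern, text) else "No Match"

print(check("abc", "abc"))   # Match
print(check("abc", "abcd"))  # No Match

# For comparison, re.search looks for the pattern anywhere in the string,
# so it does find "abc" inside "abcd".
found_inside = re.search("abc", "abcd") is not None
print(found_inside)          # True
```

Whether you want whole-string matching (validation) or find-anywhere matching (searching) depends on the job, so it is worth checking which one your regex tool does by default.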

Regex lets you use two types of character classes: predefined and custom. As the names suggest, predefined classes are already made for you, and custom classes are ones you build to tell the matcher exactly which characters to find. Here's a table of predefined character classes:
[Image: picture1bj.png]

The period (.) is a wildcard that matches any single character, so it stands in for exactly one character, no more:
Code:
Enter Regex: b.t
Enter String: bat
Match
Enter String: bot
Match
Enter String: boat
No Match
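
As a rough illustration of the wildcard, here is a small sketch, again assuming Python's `re` module. The extra test strings (`b9t`, `b-t`, `bt`) are my own additions. It also shows one Python-specific caveat: by default the dot does not match a newline unless the `re.DOTALL` flag is given.

```python
import re

pattern = r"b.t"  # "b", then exactly one character of any kind, then "t"

results = {
    text: bool(re.fullmatch(pattern, text))
    for text in ["bat", "bot", "b9t", "b-t", "boat", "bt"]
}
print(results)
# "boat" fails (two characters between b and t) and so does "bt" (none).

# Caveat (Python default): the dot skips newlines unless re.DOTALL is passed.
newline_default = bool(re.fullmatch(pattern, "b\nt"))
newline_dotall = bool(re.fullmatch(pattern, "b\nt", flags=re.DOTALL))
print(newline_default, newline_dotall)  # False True
```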

The \d class, as noted in the table, matches any digit (0-9). Here is an example of how to use digits in a phone number format:
Code:
Enter Regex: \d\d\d-\d\d\d-\d\d\d\d
Enter String: 905-867-5309
Match
Enter String: 550-403-04
No Match
Enter String: asd-423-aesd
No Match
In this example, the second string did not match because its last group has only 2 digits instead of the required 4 (did anyone catch the first phone number?). The third didn't match because letters were used where digits are required.
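
The phone number check could be sketched like this, assuming Python's `re` module; `PHONE`, `SPACED`, and `is_phone` are illustrative names only. The second half shows that spaces typed into a pattern are literal characters, so the input must contain them too.

```python
import re

# A raw string (r"...") keeps the backslashes intact for the regex engine.
PHONE = r"\d\d\d-\d\d\d-\d\d\d\d"

def is_phone(text: str) -> bool:
    return re.fullmatch(PHONE, text) is not None

print(is_phone("905-867-5309"))   # True
print(is_phone("550-403-04"))     # False: last group has only 2 digits
print(is_phone("asd-423-aesd"))   # False: letters where digits are required

# Spaces inside a pattern are literal, so this variant only accepts
# input that also has a space on each side of the dashes.
SPACED = r"\d\d\d - \d\d\d - \d\d\d\d"
print(re.fullmatch(SPACED, "905-867-5309") is not None)      # False
print(re.fullmatch(SPACED, "905 - 867 - 5309") is not None)  # True
```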

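Lastly, I will explain the \s class, since you should already have an idea of how the others work. Remember that the \s class matches a single whitespace character, such as a space or a tab. Here's an example:
Code:
Enter Regex: ...\s...
Enter String: abc 123
Match
Enter String: abc    123
Match
In the first string the two groups are separated by a space. The second match assumes the gap is a single tab (it only looks like several spaces here); several actual spaces in a row would not match, because \s stands for exactly one whitespace character.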

If you want the separator to be exactly one space character (no tabs, no extra spaces), type a literal space in the pattern instead:
Code:
Enter Regex: ... ...
Enter String: abc 123
Match
Enter String: abc    123
No Match
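
To make the difference between \s and a typed space concrete, here is a small sketch assuming Python's `re` module. The sample strings and the `matches` helper are illustrative. The tab is written as `\t` so it cannot be confused with a run of spaces.

```python
import re

any_whitespace = r"...\s..."  # \s accepts one space, tab, newline, etc.
literal_space = r"... ..."    # a typed space accepts only a space

samples = {
    "one space": "abc 123",
    "one tab": "abc\t123",
    "four spaces": "abc    123",
}

def matches(pattern: str, text: str) -> bool:
    return re.fullmatch(pattern, text) is not None

for name, text in samples.items():
    print(name, matches(any_whitespace, text), matches(literal_space, text))
# one space True True
# one tab True False
# four spaces False False
```

Note the last row: neither pattern accepts four spaces, because each pattern allows exactly one separator character.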

Now on to custom character classes. They are hardly different from basic matching with predefined classes, but still very effective. To create a custom character class, place the characters you want inside a set of square brackets, like so:
Code:
Enter Regex: b[aeiou]te
Enter String: bite
Match
Enter String: bete
Match
Enter String: byte
No Match
As you can see, y is not inside the brackets, so byte doesn't match the pattern. Note that you can put digits and upper- or lowercase letters in the brackets.
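
One way a custom class might be used is to filter a word list. The sketch below assumes Python's `re` module; the word list and the `mixed` pattern are made up for illustration.

```python
import re

pattern = r"b[aeiou]te"  # the brackets match exactly one listed character

words = ["bite", "bete", "byte", "bate", "biite", "bte"]
matching = [w for w in words if re.fullmatch(pattern, w)]
print(matching)  # ['bite', 'bete', 'bate']

# A class may mix digits with uppercase and lowercase letters.
mixed = r"[aA1]x"
mixed_hits = [s for s in ["ax", "Ax", "1x", "bx"] if re.fullmatch(mixed, s)]
print(mixed_hits)  # ['ax', 'Ax', '1x']
```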

You can also use more than one set of brackets like so:
Code:
Enter Regex: [bB][aeiou][tT]
Enter String: bit
Match
Enter String: beT
Match
Enter String: Bat
Match
And so on.
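
To spell out what "and so on" covers, the sketch below (assuming Python's `re` and `itertools` modules) generates every string the pattern should accept and checks them all. Each class contributes one character, so there are 2 * 5 * 2 = 20 combinations.

```python
import re
from itertools import product

pattern = r"[bB][aeiou][tT]"

# Build every combination of one character from each class.
candidates = ["".join(p) for p in product("bB", "aeiou", "tT")]
all_match = all(re.fullmatch(pattern, c) for c in candidates)
print(len(candidates), all_match)  # 20 True

# Anything outside those combinations is rejected.
print(re.fullmatch(pattern, "byt") is None)    # True: y is not in the class
print(re.fullmatch(pattern, "b i t") is None)  # True: spaces are in no class
```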

Custom character classes can also specify ranges given inside the brackets. For example:
Code:
Enter Regex: [a-z][0-5]
Enter String: a1
Match
Enter String: z5
Match
Enter String: 5a
No Match
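
A quick sketch of ranges, assuming Python's `re` module. The extra inputs `a6` and `A1` are my own, added to show the edges of each range. The last check illustrates that a range is just shorthand for listing every character between its endpoints.

```python
import re

pattern = r"[a-z][0-5]"  # one lowercase letter, then one digit from 0 to 5

outcomes = {
    text: bool(re.fullmatch(pattern, text))
    for text in ["a1", "z5", "5a", "a6", "A1"]
}
print(outcomes)
# Order matters ("5a" fails), 6 is outside 0-5, and A is not in a-z.

# [0-5] behaves the same as writing out [012345].
same_as_listing = all(
    bool(re.fullmatch(r"[0-5]", d)) == bool(re.fullmatch(r"[012345]", d))
    for d in "0123456789"
)
print(same_as_listing)  # True
```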

You can also put more than one range in a single class:
Code:
Enter Regex: [a-zA-Z0-9]
Enter String: a
Match
Enter String: A
Match
Enter String: 0
Match
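
Since this class matches a single character, one possible use for input validation is to test a string one character at a time. This is a sketch assuming Python's `re` module; `only_alnum` is a hypothetical helper. It also shows that, in Python at least, the predefined \w class is not quite the same thing, because it accepts the underscore as well.

```python
import re

ALNUM = r"[a-zA-Z0-9]"  # one character: lowercase, uppercase, or digit

def only_alnum(text: str) -> bool:
    """Check a non-empty string character by character against the class."""
    return len(text) > 0 and all(re.fullmatch(ALNUM, ch) for ch in text)

print(only_alnum("User42"))   # True
print(only_alnum("user_42"))  # False: underscore is not in the class
print(only_alnum(""))         # False: nothing to match

# \w accepts the underscore, the custom class does not.
underscore_w = re.fullmatch(r"\w", "_") is not None
underscore_custom = re.fullmatch(ALNUM, "_") is not None
print(underscore_w, underscore_custom)  # True False
```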

Regex can also include classes that match any character except the ones listed after a caret (^):
Code:
Enter Regex: [^cf]at
Enter String: bat
Match
Enter String: mat
Match
Enter String: cat
No Match
Enter String: fat
No Match
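
Here is a short sketch of negated classes, assuming Python's `re` module; the inputs `7at` and `at` are extra cases I added. Two details are worth seeing: a negated class still has to match one character, and the caret only negates when it is the first thing inside the brackets.

```python
import re

pattern = r"[^cf]at"  # any one character except c or f, followed by "at"

outcomes = {
    text: bool(re.fullmatch(pattern, text))
    for text in ["bat", "mat", "cat", "fat", "7at", "at"]
}
print(outcomes)
# "7at" matches (7 is neither c nor f); "at" fails because the class
# still needs one character in front.

# Anywhere but first position, the caret is an ordinary character.
literal_caret = bool(re.fullmatch(r"[c^f]at", "^at"))
print(literal_caret)  # True
```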

Stay tuned, as I will be posting an advanced tutorial on regular expressions. I hope you learned something new today.


~ Project Evolution
Looks like a good tutorial, I'll give it a read when I get home from my driving lesson.
Sure thing, I'm going to be posting the advanced version soon, which will come with a sample program showing some good uses for regex.