Character Classes

The Class


Strictly speaking, these are just classes rather than character classes. A class addresses one character, and one character only.

CREATED 2016-04-19 13:53:32.0

00-24-CE

UPDATED 2016-04-19 13:53:51.0

Structure


Character classes are enclosed in brackets [...] and define a set of characters, any one of which may occupy a single position.

Example: There are a couple of ways to spell the word gray: with an a and with an e. To capture both spellings of this word, a character class can be used.

gr[ae]y

This matches a g followed by r, followed by either a or e, followed by y, anywhere on a single line.
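As a minimal sketch, the gr[ae]y example can be tried with Python's re module. The sample strings are made up for illustration.

```python
import re

# [ae] matches exactly one character: either a or e.
pattern = re.compile(r"gr[ae]y")

# Hypothetical sample strings, not taken from the notes above.
samples = ["gray", "grey", "groy", "a greyhound", "graey"]

# search() looks for the pattern anywhere in the string.
matches = [s for s in samples if pattern.search(s)]
print(matches)
```

Note that "graey" is not matched: the class fills one position only, so it cannot stand for both a and e at once.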

CREATED 2016-04-19 13:32:28.0

00-24-CD

UPDATED 2016-04-19 13:53:46.0

Metacharacters


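Classes have their own small set of metacharacters, and symbols act differently inside a class than outside it. The caret and the dash take on class-specific meanings, while most symbols that are special outside a class (and would need to be escaped there), such as the dot, are treated as literal characters inside the brackets.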

CREATED 2016-04-19 13:54:52.0

00-24-CF

UPDATED 2016-04-19 13:54:59.0

The Caret (^) - Negates the contents of a class


The caret ^, in the first position, negates the contents of a class.

Example: You want to find a word that is not followed by an s, like Table but not Tables.

Table[^s] will find the word Table anywhere on a single line where it is not followed by an s, that is, where it is not plural. NOTE: This will not find the word Table if it is the last word on a line, because the class must still match a character. In other words, the engine is looking for a T followed by a, followed by b, followed by l, followed by e, followed by any one character that is not an s, and the end of the line is not a character.
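The behaviour described in the NOTE can be checked with a short Python sketch. The sample lines are hypothetical and carry no trailing newline, so a line ending in Table has nothing left for the class to match.

```python
import re

# [^s] must match one character, and that character may not be s.
negated = re.compile(r"Table[^s]")

# Hypothetical sample lines, without trailing newline characters.
lines = ["Table one", "Tables two", "See the Table"]

# Record which lines contain a match.
results = {line: bool(negated.search(line)) for line in lines}
print(results)

# The match includes the character after Table (here, a space).
first = negated.search("Table one").group()
print(repr(first))
```

Because the negated class consumes the character that follows the word, the matched text is six characters long, not five.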

CREATED 2016-04-19 14:30:11.0

00-24-D1

UPDATED 2016-04-19 14:32:51.0

The Dash (-) - Indicates a range


The dash (-) is also a metacharacter. It indicates a range as long as it is NOT in the first position (in most regex flavors the same holds for the last position). [a-z] would address any lowercase letter, but [-az] would find either - or a or z in a single position.

Ranges can be strung together, e.g. [A-Za-z] would address all uppercase and lowercase letters.
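A small Python sketch of the three classes mentioned above, run against a made-up test string:

```python
import re

lower = re.compile(r"[a-z]")         # range: any lowercase letter
dash_first = re.compile(r"[-az]")    # leading dash is literal: -, a, or z
letters = re.compile(r"[A-Za-z]")    # two ranges strung together

# Hypothetical test string containing a dash, a digit, and mixed case.
text = "x-7 Az"

lower_hits = lower.findall(text)
dash_hits = dash_first.findall(text)
letter_hits = letters.findall(text)

print(lower_hits, dash_hits, letter_hits)
```

Each findall call returns the single characters that the class matched, in order of appearance.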

CREATED 2016-04-19 14:30:26.0

00-24-D2

UPDATED 2016-04-19 14:33:00.0

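The Dot (.) - Any Character, but Literal Inside a Class


Outside a class, the dot means any character (in most regex flavors, any character except a line break), so a lone . matches on every line that has at least one character. It does not match blank lines, because they don't have any characters. Inside a class the dot loses this meaning: [.] matches only a literal period. Likewise, [^.] matches any single character that is not a period. It does not match blank lines, because a class always has to match one character.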

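Let's say we were looking for a serial number that had a 4 in the third position. ^..4 would find all lines with a 4 in that position. Written with classes, ^[.][.][4] would instead find only lines that begin with two literal periods followed by a 4.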
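In Python's re module a dot inside brackets is a literal period, so the bare-dot and bracketed spellings of the serial number pattern behave differently. The sketch below compares them on made-up serial numbers.

```python
import re

# Bare dots: any two characters, then a 4 in the third position.
any_two_then_4 = re.compile(r"^..4")

# Bracketed dots: two literal periods, then a 4.
literal_dots = re.compile(r"^[.][.][4]")

# Hypothetical serial numbers, for illustration only.
serials = ["AB4-991", "A4B-991", "..4-000", "124"]

hits_any = [s for s in serials if any_two_then_4.search(s)]
hits_literal = [s for s in serials if literal_dots.search(s)]

print(hits_any)
print(hits_literal)
```

Only the serial that really starts with two periods satisfies the bracketed form, while the bare-dot form accepts any two leading characters.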

CREATED 2016-04-20 14:35:58.0

00-24-D7

UPDATED 2016-04-20 14:46:08.0


©2012 Leistware Data Systems
