regular expressions in AntConc

  • when using regular expressions, AntConc matches the search string anywhere in the text, not only as a complete word: hil → hills, awhile, child, philosophy
  • \b stands for a word boundary: \bgrey\b only finds grey as a complete word (not greyed or greyhound)
  • \b can be combined with other patterns and applied to several strings at once by grouping them in brackets (): \b(cat|dog)\b
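
The difference between substring and whole-word matching can be tried outside AntConc. The sketch below uses Python's re module as a stand-in, which is an assumption: AntConc's own regex engine may differ in details. The word lists and sentences are made up for illustration.

```python
import re

words = ["hills", "awhile", "child", "philosophy", "hat"]

# Without boundaries, "hil" matches anywhere inside a word.
substring_hits = [w for w in words if re.search(r"hil", w)]
print(substring_hits)  # ['hills', 'awhile', 'child', 'philosophy']

text = "The grey cat greyed; greyhounds ran."
# With \b on both sides, only the standalone word matches.
whole_word_hits = re.findall(r"\bgrey\b", text)
print(whole_word_hits)  # ['grey']

# One pair of boundaries applied to several alternatives via a group.
# (?:...) is a non-capturing group, used here so that findall returns
# the whole match; plain (...) groups the same way.
grouped_hits = re.findall(r"\b(?:cat|dog)\b", "cat catalog dog dogma")
print(grouped_hits)  # ['cat', 'dog']
```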

 

  • alternative characters: gr[ae]y → gray or grey
  • range of characters (or numbers): hil[a-e] → awhile, child, childish
  • ^ inside square brackets excludes the character(s) that follow it: child[^i] → child (plus the following space or punctuation mark), child’s, children… NOT childish
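
A minimal sketch of the three bracket patterns above, again assuming Python's re module as a stand-in for AntConc's search box. The sample strings are invented. Note that [^i] still has to match some character, so the hit includes whatever follows child.

```python
import re

# [ae] accepts exactly one of the listed characters.
spellings = re.findall(r"gr[ae]y", "gray grey groy")
print(spellings)  # ['gray', 'grey']

words = ["awhile", "child", "childish", "hills", "philosophy"]
# [a-e] accepts one character in the range a to e after "hil".
range_hits = [w for w in words if re.search(r"hil[a-e]", w)]
print(range_hits)  # ['awhile', 'child', 'childish']

text = "a child, the child's toys, children and childish games"
# [^i] accepts any single character except i.
negated_hits = re.findall(r"child[^i]", text)
print(negated_hits)  # ['child,', "child'", 'childr']
```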

 

  • \d replaces one digit: \d1 → any digit followed by 1 (21, 31, 2011); to find numbers starting with 1, use \b1\d*
  • \w replaces one word character (letter, digit or underscore): s\win → posting, blessing, destinies, swashing
  • \s replaces a whitespace character (spaces, tabs, line breaks)
  • . replaces one character (letter, number, space): gr.y → grey, gray
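
The shorthand classes can be checked with a short sketch, assuming Python's re module behaves like AntConc for these basic patterns. The test strings are hypothetical.

```python
import re

# \d matches one digit, so \d1 means "any digit followed by 1".
digit_hits = re.findall(r"\d1", "21 31 15 7")
print(digit_hits)  # ['21', '31']

words = ["posting", "blessing", "destinies", "swashing", "sin"]
# \w matches one word character between "s" and "in".
word_char_hits = [w for w in words if re.search(r"s\win", w)]
print(word_char_hits)  # ['posting', 'blessing', 'destinies', 'swashing']

# \w also accepts digits and the underscore, but not a hyphen.
non_letter_hits = re.findall(r"s\win", "s1in s_in s-in")
print(non_letter_hits)  # ['s1in', 's_in']

# \s matches a space or a tab (among other whitespace characters).
space_hits = re.findall(r"my\sdog", "my dog my\tdog mydog")
print(space_hits)  # ['my dog', 'my\tdog']

# . matches any single character, including a digit.
dot_hits = re.findall(r"gr.y", "grey gray gr8y gry")
print(dot_hits)  # ['grey', 'gray', 'gr8y']
```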

 

  • alternatives: cat|dog|mouse|fish finds examples of all search terms
  • but: my husband|wife finds ‘my husband’ and ‘wife’ → use my (husband|wife) to find ‘my husband’ and ‘my wife’
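
The effect of the brackets on | can be seen in a small sketch, assuming Python's re module as a stand-in for AntConc. The sentences are invented, and (?:...) is used instead of (...) only so that findall returns the whole match.

```python
import re

# Plain alternation: every listed term is found.
animals = re.findall(r"cat|dog|mouse|fish", "a dog chased a cat and a mouse")
print(animals)  # ['dog', 'cat', 'mouse']

text = "my husband and my wife met his wife"

# Without grouping, | splits the whole pattern: "my husband" OR "wife".
loose = re.findall(r"my husband|wife", text)
print(loose)  # ['my husband', 'wife', 'wife']

# With grouping, "my " is required before either alternative.
grouped = re.findall(r"my (?:husband|wife)", text)
print(grouped)  # ['my husband', 'my wife']
```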

 

  • ? makes preceding token optional: rea?d → matches read or red
  • * matches the preceding token zero or more times: ree*d → red but also reed
  • + matches the preceding token once or more: re+d → same results: red, reed
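
A short sketch of the three quantifiers, assuming Python's re module and a made-up word list. It uses fullmatch so that each word has to fit the pattern as a whole. The last line shows why ree*d and re+d give the same results: the first e in ree*d is fixed, and only the second is repeated.

```python
import re

words = ["rd", "red", "read", "reed", "reeed"]

# ? makes the "a" optional.
optional = [w for w in words if re.fullmatch(r"rea?d", w)]
print(optional)  # ['red', 'read']

# * repeats the second "e" zero or more times.
star = [w for w in words if re.fullmatch(r"ree*d", w)]
print(star)  # ['red', 'reed', 'reeed']

# + repeats the "e" one or more times.
plus = [w for w in words if re.fullmatch(r"re+d", w)]
print(plus)  # ['red', 'reed', 'reeed']

# With a single "e" before *, zero repetitions are allowed too.
star_single = [w for w in words if re.fullmatch(r"re*d", w)]
print(star_single)  # ['rd', 'red', 'reed', 'reeed']
```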

 

  • If you want to use any of these special characters (e.g. . ? * + | ( ) [ ]) as a literal in a regex, you need to escape it with a backslash: \? finds a question mark.
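
Escaping can be illustrated with a brief sketch, assuming Python's re module. The sample sentence is invented. re.escape is a Python helper that adds the backslashes automatically; AntConc has no such function, so there the backslashes are typed by hand.

```python
import re

text = "Is it 3.5 or 345? It costs $5 (approx.)."

# Unescaped, the dot stands for any character.
unescaped = re.findall(r"3.5", text)
print(unescaped)  # ['3.5', '345']

# Escaped, the dot only matches a literal dot.
escaped = re.findall(r"3\.5", text)
print(escaped)  # ['3.5']

# A literal question mark needs a backslash as well.
question_marks = re.findall(r"\?", text)
print(question_marks)  # ['?']

# re.escape puts a backslash before each special character.
literal_hits = re.findall(re.escape("(approx.)"), text)
print(literal_hits)  # ['(approx.)']
```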