I have been using RegEx a lot more recently. Mostly I have been doing Splunk searches, but I have also been writing a standard operating procedure here and there, and that tends to require defining custom fields or attributes in a way that it seems only RegEx can articulate.
As a result, I have had to create MAC and IP Address RegEx searches. I can’t tell you how frustrated I am with Cisco’s own internal inconsistencies in how they display MAC addresses. Most IOS devices use nnnn.nnnn.nnnn, but things like ISE use nn:nn:nn:nn:nn:nn. The DOS / Windows Command Prompt uses nn-nn-nn-nn-nn-nn, just to confuse things all the more.
By comparison, IP Addresses are pretty tame. It’s just 4 numbers, ranging from 0-255 separated by three dots. If you include CIDR notation, then you need a forward slash and a number from 0-32 at the end. There are no changes in notation in either location or character. IP Addresses aren’t sometimes separated by colons or hyphens. Easy Peasy.
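To make those rules concrete, here is a minimal Python sketch of an IPv4 pattern with an optional CIDR suffix. It is an illustration under assumptions, not necessarily the exact expression I use in Splunk: the names `OCTET`, `PREFIX`, `IPV4_RE`, and `is_ipv4` are made up for this example, and rejecting leading zeros (such as `01.2.3.4`) is a choice of this sketch.

```python
import re

# One octet, 0-255. A single "0" is allowed, leading zeros are not (sketch assumption).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
# CIDR prefix length, 0-32.
PREFIX = r"(?:3[0-2]|[12]?[0-9])"

# Four octets separated by three dots, then an optional "/prefix".
IPV4_RE = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}(?:/{PREFIX})?")


def is_ipv4(text: str) -> bool:
    """Return True if the whole string is an IPv4 address, with or without CIDR."""
    return IPV4_RE.fullmatch(text) is not None
```

Because the octet alternation is tried from the widest form (250-255) down to a single digit, a value like 256 cannot sneak through as "25" followed by "6" when the whole string has to match.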
Domain Names, on the other hand… Those are, at first blush, deceptively simple, then more complex as you try to further restrict how literal you want your RegEx to be. A DNS name is a series of labels separated by periods. Each label can be up to 63 characters long, and there can be up to 127 labels in a DNS name. However, all of this is constrained by a total character limit of 253. To further complicate things, DNS allows hyphens, but never as the first or last character of a label.
Well, now that I have complained through my preamble, let us explore some of my solutions to these problems:
I have added ?: to the front of many of the groups here to make them “non-capturing”. This prevents RegEx from numbering each group that is surrounded by parentheses. I then purposefully leave it off of the first group that defines the separator, the colon or the hyphen, so that I can use \1 later in the search to refer back to it. This helps ensure that if the colon was used first, RegEx continues to expect the colon as the separator, not a mix of the colon, hyphen, or period. This RegEx will match on aa:bb:cc:dd:ee:ff, aa-bb-cc-dd-ee-ff, or aabb.ccdd.eeff, regardless of case, but not a mix, like aa-bb:cc.dd-ee:ff.
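For illustration, here is a minimal Python sketch of a pattern that behaves the way the paragraph above describes. Treat it as a reconstruction under assumptions rather than the exact expression from my searches: the names `MAC_RE`, `is_mac`, and `find_macs` are hypothetical, and handling the dotted Cisco form as a separate alternative is one possible design among several.

```python
import re

# Hypothetical reconstruction: one capturing group for the separator,
# everything else non-capturing, and \1 to force the same separator throughout.
MAC_RE = re.compile(
    r"""
    \b(?:
        [0-9a-f]{2}([:-])         # first octet, then capture ":" or "-" as group 1
        (?:[0-9a-f]{2}\1){4}      # four more octets, each followed by that same separator
        [0-9a-f]{2}               # final octet
      |
        [0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}   # dotted form, nnnn.nnnn.nnnn
    )\b
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_mac(text: str) -> bool:
    """Return True if the whole string is a MAC address in one consistent notation."""
    return MAC_RE.fullmatch(text) is not None


def find_macs(line: str) -> list[str]:
    """Return every MAC address found anywhere in a line of text."""
    return [m.group(0) for m in MAC_RE.finditer(line)]
```

Since the pattern is not anchored to a position in the line, `find_macs` can pull addresses out of something like a comma-separated log entry no matter which column they land in.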
The IP address RegEx matches a plain address like 10.5.60.0, as well as 10.5.60.0/24. This RegEx also allows single zeros in the octets and the CIDR notation so you can still match on a default route that looks like 0.0.0.0, or 0.0.0.0/0. Because it allows single zeros in the octets, it is also good for matching the inverse (wildcard) masks typically needed on access-lists, like 0.0.15.255.
The domain name RegEx allows for 1 to 127 labels that can be 2 to 63 characters in length. I figured no one runs into 1 character labels all that often, and requiring at least 2 characters makes it easy to enforce the LDH (Letters, Digits, Hyphens) rule of DNS (https://www.google.com/search?q=LDH+rule) where a label can have hyphens (even repeating ones –eye roll–), but must not start or end on one.
I also made the arbitrary decision to only allow letters in top-level domain names. This should cover the vast majority of TLDs, except for “Internationalized country code top-level domains”.
I know this breaks with convention, but it covers 99.99% of the situations where you need to search for a domain, including the gTLDs. While I was able to maintain the 63 character limit per label, and the 127 label limit per FQDN, I was not able to verify that the RegEx is keeping to the 253 character total limit for an FQDN. If you have a RegEx way of dealing with that, I would love to hear from you!
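One possible way to approach the total length limit is a lookahead at the start of the pattern that checks the overall length before the labels are matched. The Python sketch below combines that idea with the label rules described above. It is an assumption-laden illustration, not the expression from my own searches: `LABEL`, `TLD`, `FQDN_RE`, and `is_fqdn` are names invented for this example, and it does not handle a trailing root dot.

```python
import re

# A label: 2 to 63 characters, letters/digits/hyphens, no hyphen at either end.
LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])"
# Top-level domain: letters only (the arbitrary decision described above).
TLD = r"[a-z]{2,63}"

FQDN_RE = re.compile(
    rf"(?=.{{1,253}}\Z)"        # lookahead: the whole name is at most 253 characters
    rf"(?:{LABEL}\.){{0,126}}"  # up to 126 labels, each followed by a period
    rf"{TLD}",                  # the final label, for 1 to 127 labels in total
    re.IGNORECASE,
)


def is_fqdn(text: str) -> bool:
    """Return True if the whole string looks like a domain name under these rules."""
    return FQDN_RE.fullmatch(text) is not None
```

The lookahead consumes nothing, so it only works as a length check when the pattern is applied to the whole candidate string, as `fullmatch` does here. Pulling names out of the middle of a longer line would need a different boundary than `\Z`.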
Two invaluable RegEx sites that I use are:
https://www.rexegg.com/regex-quickstart.html#chars and
https://www.debuggex.com/
If you have found these RegExs to be useful, please comment and let me know. Additionally, I would really love to hear if you have better ways of dealing with these patterns. Of course, if you have a pattern you would like to share, please do so!