Welcome to the world of regular expressions (RegEx), where the power of pattern matching and text manipulation knows no bounds.
In this comprehensive guide, we present 30 real-world examples that will take you on an exciting journey from a RegEx novice to a true master.
This compilation of practical scenarios will equip you with the skills to tackle complex text processing tasks with ease.
Regular Expressions for Numbers
Only Numbers Validation
You can use the following regular expression to validate that a string contains only numbers (digits) and no other characters.
^[0-9]+$
This regular expression matches a string that consists entirely of one or more digits (0-9), with no other characters. The ^ character denotes the beginning of the string and the $ character denotes the end of the string. The + character after the character class [0-9] means “one or more occurrences”.
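As a quick illustration, the digits-only check can be sketched with Python's re module. Python and the helper name is_only_digits are assumptions of this sketch; the pattern is the one shown above.

```python
import re

# The pattern from the text. re.fullmatch is used so that a trailing newline,
# which a bare "$" tolerates in Python, is rejected as well.
ONLY_DIGITS = re.compile(r"^[0-9]+$")


def is_only_digits(text):
    """Return True when text consists solely of the digits 0-9."""
    return ONLY_DIGITS.fullmatch(text) is not None


print(is_only_digits("20240"))  # True
print(is_only_digits("12a4"))   # False
print(is_only_digits(""))       # False
```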
Min and Max Number Length Validation
To validate that a number falls within a specific length range using regex, you can use the following regular expression pattern.
^(?=[0-9]*$).{minLength,maxLength}$
In this pattern, minLength and maxLength are placeholders for the minimum and maximum lengths of the number, respectively.
Here’s an example regex pattern for validating that a number is between 3 and 8 digits long:
^(?=[0-9]*$).{3,8}$
Explanation:
- ^ matches the beginning of the string
- (?=[0-9]*$) is a positive lookahead assertion that requires the entire string to contain only digits
- .{3,8} matches any character (except a newline) between 3 and 8 times
- $ matches the end of the string
Because the lookahead already restricts the input to digits, the same check can be written more directly as ^[0-9]{3,8}$.
You can modify the minLength and maxLength values in the pattern to suit your specific validation needs.
- The 1st and 2nd inputs are not valid, because the minimum number of digits required is three.
- The 3rd and 4th are valid, because both have a minimum of three digits and a maximum of eight digits.
- The 5th is not valid, because it has more than eight digits.
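The length check can be sketched in Python by filling the two length placeholders into the lookahead pattern. The helper name digits_with_length and the sample inputs are assumptions made for illustration.

```python
import re


def digits_with_length(min_len, max_len):
    """Build the lookahead-based pattern for a digit string of min_len..max_len characters."""
    # Doubled braces produce literal { and } inside the f-string.
    return re.compile(rf"^(?=[0-9]*$).{{{min_len},{max_len}}}$")


three_to_eight = digits_with_length(3, 8)
print(three_to_eight.pattern)  # ^(?=[0-9]*$).{3,8}$

for sample in ["12", "123", "12345678", "123456789", "12a45"]:
    print(sample, three_to_eight.fullmatch(sample) is not None)
```

Only the second and third samples pass: the first is too short, the fourth too long, and the last contains a letter.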
Number Range Validation
To validate that a number falls within a specific range, keep in mind that a regular expression matches characters, not numeric values. A character class such as [1-5] describes a single digit, so there is no generic [minValue-maxValue] construct; the range has to be spelled out digit by digit.
Here’s an example regex pattern for validating that a number is between 100 and 200 (inclusive):
^(?=[0-9]*$)(1[0-9][0-9]|200)$
Explanation:
- ^ matches the beginning of the string
- (?=[0-9]*$) is a positive lookahead assertion that requires the entire string to contain only digits
- (1[0-9][0-9]|200) matches any three-digit number between 100 and 199 (inclusive), or exactly 200
- $ matches the end of the string
You can adapt the pattern to other minimum and maximum values to suit your specific validation needs. Note that this pattern assumes that a number is a whole number (integer). If you need to validate a number that can have a decimal component, you can modify the pattern accordingly.
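A regular expression compares characters rather than numeric values, so a range such as 100 to 200 has to be spelled out digit by digit. The sketch below assumes the spelling 1[0-9][0-9]|200 for that range and compares it with a plain numeric check; both function names are hypothetical.

```python
import re

# 100-199 via 1[0-9][0-9], plus the single value 200.
RANGE_100_200 = re.compile(r"^(1[0-9][0-9]|200)$")


def in_range_regex(text):
    """Range check done purely with the pattern."""
    return RANGE_100_200.fullmatch(text) is not None


def in_range_numeric(text, low=100, high=200):
    """Range check done by converting the digits to an integer first."""
    return text.isascii() and text.isdigit() and low <= int(text) <= high


for sample in ["99", "100", "150", "200", "201", "999"]:
    print(sample, in_range_regex(sample), in_range_numeric(sample))
```

The two checks agree on these samples. They differ on input with leading zeros: "0150" fails the pattern but passes the numeric check. The numeric version is usually easier to adapt when the bounds change.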
Only Positive Numbers
To validate that a number is non-negative (greater than or equal to zero), you can use the following regular expression pattern:
^[0-9]+(\.[0-9]+)?$
Explanation:
- ^ matches the beginning of the string
- [0-9]+ matches one or more digits
- (\.[0-9]+)? is an optional group that matches a decimal point followed by one or more digits
- $ matches the end of the string
Note that a digits-only lookahead such as (?=[0-9]*$) must not be placed in front of this pattern, because it would reject the decimal point and the optional group could never match.
This pattern allows for non-negative numbers that may have a decimal component. If you only want to allow integers (whole numbers), you can remove the optional decimal component like this:
^[0-9]+$
The first pattern is meant for non-negative integers and decimal numbers; the second matches non-negative integers only. Neither accepts a sign, so negative numbers are always rejected.
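A minimal Python sketch of the non-negative check follows. It assumes a variant of the pattern without the digits-only lookahead, since that lookahead would also reject the decimal point; the name is_non_negative is hypothetical.

```python
import re

# Digits, optionally followed by a decimal point and more digits.
NON_NEGATIVE = re.compile(r"^[0-9]+(\.[0-9]+)?$")


def is_non_negative(text):
    """Return True for unsigned integers and decimals such as 7 or 3.14."""
    return NON_NEGATIVE.fullmatch(text) is not None


for sample in ["0", "42", "3.14", "-1", "1.", ".5", "1.2.3"]:
    print(sample, is_non_negative(sample))
```

The first three samples pass; the rest fail because of a sign, a missing digit on one side of the point, or a second decimal point.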
Limit Decimal Places
To limit the number of decimal places in a number using regular expressions, you can use the following pattern:
/^\d+(\.\d{1,2})?$/
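This pattern matches a string that starts with one or more digits (\d+), optionally followed by a group (\.\d{1,2})? that consists of a decimal point and one or two digits. To allow up to n decimal places, change the quantifier to \d{1,n}. The surrounding slashes are regex literal delimiters (as in JavaScript) and are not part of the pattern itself.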
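To show how the limit can be made configurable, here is a small Python sketch that builds the pattern for a given maximum number of decimal places. The helper max_decimals is an assumption of this example, and re.ASCII restricts \d to the digits 0-9.

```python
import re


def max_decimals(n):
    """Build a pattern that allows at most n digits after the decimal point."""
    return re.compile(rf"^\d+(\.\d{{1,{n}}})?$", re.ASCII)


two_places = max_decimals(2)
print(two_places.pattern)  # ^\d+(\.\d{1,2})?$

for sample in ["10", "10.5", "10.55", "10.555", "10."]:
    print(sample, two_places.fullmatch(sample) is not None)
```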
Accept Any Integer Number
/^-?\d+$/
This pattern matches a string that starts with an optional - sign (-?) to allow for negative numbers, followed by one or more digits (\d+) representing the integer number.
Note that this pattern accepts leading zeros (for example, 007) and does not match a leading + sign or numbers in scientific notation. If you need different behavior in those cases, you may need to modify the pattern accordingly.
10 Digit Phone Number With No Space and Special Characters
Here’s a step-by-step explanation of the regular expression pattern for validating a 10-digit phone number:
^[0-9]{10}$
- ^ matches the beginning of the string
- [0-9] matches any digit (0-9)
- {10} specifies that the previous character set (i.e., [0-9]) must appear exactly 10 times
- $ matches the end of the string
So, when you put it all together, the pattern ^[0-9]{10}$ matches any string that contains exactly 10 digits, and no other characters.
Phone Number with Country Code and Special Character
Use the following regular expression pattern to verify a phone number that starts with a plus sign and a country code, with optional spaces between the digits:
^\+(?:[0-9] ?){6,14}[0-9]$
Explanation:
- ^ matches the beginning of the string
- \+ matches a plus sign (the escape character \ is used before the + symbol because it has a special meaning in regular expressions)
- (?:[0-9] ?) matches a digit followed by an optional space (the ?: indicates a non-capturing group)
- {6,14} specifies that the previous group (i.e., [0-9] ?) must appear between 6 and 14 times
- [0-9] matches a digit (required to ensure there are no trailing spaces after the final digit)
- $ matches the end of the string
This pattern will match phone numbers that start with a plus sign followed by 7 to 15 digits in total, where every digit except the last may be followed by a single space. It does not distinguish the country code from the local number, and it does not accept hyphens; to allow them as separators, replace the optional space with [ -]?.
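The behavior of this pattern on a few inputs can be checked with a short Python sketch. The sample numbers are made up for illustration, and is_international_number is a hypothetical helper name.

```python
import re

PHONE = re.compile(r"^\+(?:[0-9] ?){6,14}[0-9]$")


def is_international_number(text):
    """Return True for a plus sign followed by 7 to 15 digits, optionally space-separated."""
    return PHONE.fullmatch(text) is not None


samples = [
    "+14155552671",     # digits only
    "+1 415 555 2671",  # single spaces between groups
    "+1-415-555-2671",  # hyphens are not accepted by this pattern
    "+12345",           # too few digits
    "14155552671",      # missing plus sign
]
for sample in samples:
    print(sample, is_international_number(sample))
```

Only the first two samples pass.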
Accept Only Binary Numbers
You can use the following regex to check whether the provided number is in binary format or not.
/^[01]+$/g
This pattern matches any string that contains only 0’s and/or 1’s and has at least one character. Here’s a breakdown of the pattern:
- ^ : matches the start of the string.
- [01]+ : matches one or more occurrences of the characters 0 or 1.
- $ : matches the end of the string.
The g flag at the end of the pattern enables global matching, which finds all matches in the input string. Because this pattern is anchored with ^ and $, it can match at most once, so the flag is not needed for validation.
Validate HexaDecimal Number using Regular Expression
/^([0-9a-fA-F]+)$/g
This pattern matches any string that contains only 0-9 digits and/or a-f or A-F letters (i.e., the hexadecimal digits) and has at least one character. Here’s a breakdown of the pattern:
- ^ : matches the start of the string.
- ( : opens a group.
- [0-9a-fA-F]+ : matches one or more occurrences of the characters 0-9, a-f, or A-F.
- ) : closes the group.
- $ : matches the end of the string.
The g flag at the end of the pattern enables global matching, which finds all matches in the input string. Because this pattern is anchored with ^ and $, it can match at most once, so the flag is not needed for validation.
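As a sketch of how such a check is typically used, the Python snippet below validates a hexadecimal string and then converts it to an integer. The helper name parse_hex is hypothetical, and the g flag is simply left out because the anchored pattern is applied to the whole string.

```python
import re

HEX = re.compile(r"^[0-9a-fA-F]+$")


def parse_hex(text):
    """Return the integer value of a hexadecimal string, or None if it is not valid."""
    if HEX.fullmatch(text) is None:
        return None
    return int(text, 16)


print(parse_hex("ff"))    # 255
print(parse_hex("1A3f"))  # 6719
print(parse_hex("0x1F"))  # None, because "x" is not a hexadecimal digit
print(parse_hex(""))      # None
```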
Regular Expressions for Letters
Allow Only Letters
The provided string must not contain any digits or special characters. Only letters are allowed.
^[a-zA-Z]+$
Explanation:
- ^ matches the beginning of the string
- [a-zA-Z] matches any letter (upper or lower case)
- + specifies that the previous character set (i.e., [a-zA-Z]) must appear one or more times
- $ matches the end of the string
This pattern will match any string that consists of one or more letters and no other characters. This pattern assumes that the string does not contain spaces or any other special characters.
Allow Only Small Case Letters
To validate that a string contains only lowercase letters (no digits, spaces or special characters), you can use the following regular expression pattern:
^[a-z]+$
Explanation:
- ^ matches the beginning of the string
- [a-z] matches any lowercase letter
- + specifies that the previous character set (i.e., [a-z]) must appear one or more times
- $ matches the end of the string
This pattern will match any string that consists of one or more lowercase letters and no other characters. Strings that contain spaces or any other special characters are rejected.
Allow Only Capital Case Letters
To validate that a string contains only uppercase letters (no digits, spaces or special characters), you can use the following regular expression pattern:
^[A-Z]+$
Explanation:
- ^ matches the beginning of the string
- [A-Z] matches any uppercase letter
- + specifies that the previous character set (i.e., [A-Z]) must appear one or more times
- $ matches the end of the string
This pattern will match any string that consists of one or more uppercase letters and no other characters. Strings that contain spaces or any other special characters are rejected.
Allow Letters With Some Special Characters
To allow a string to contain letters, digits, and some special characters, you can use the following regular expression pattern as an example:
^[a-zA-Z0-9!@#$%&*()_+=-]+$
Explanation:
- ^ matches the beginning of the string
- [a-zA-Z0-9!@#$%&*()_+=-] matches any letter (upper or lower case), digit, or one of the allowed special characters !@#$%&*()_+=-
- + specifies that the previous character set must appear one or more times
- $ matches the end of the string
The hyphen is placed last in the character class on purpose. Written between two other characters, as in )-_, it would define a range and silently allow every character from ) to _, including . / : ; < > and ?.
This pattern will match any string that consists of one or more letters, digits, and the specified special characters. You can modify the list of allowed special characters based on your specific requirements.
For example, this pattern would match the following strings:
- hello123
- AbC!dEf#G
- s-u+p=3-r&v1s@n
- Hello!@# (capital letters are allowed)
- $MONEY (the dollar sign is in the allowed list)
And it would not match the following strings:
- hello 123 (contains a space)
- (empty string) (contains no characters)
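The position of the hyphen inside a character class matters: between two characters it defines a range. The Python sketch below compares a class with the hyphen between ) and _ against one with the hyphen moved to the end; the constant names and samples are assumptions of this example.

```python
import re

# Hyphen between ")" and "_": interpreted as the range from ")" to "_".
HYPHEN_AS_RANGE = re.compile(r"^[a-zA-Z0-9!@#$%&*()-_+=]+$")
# Hyphen at the end of the class: interpreted as a literal hyphen.
HYPHEN_LAST = re.compile(r"^[a-zA-Z0-9!@#$%&*()_+=-]+$")

for sample in ["a<b", "x.y", "s-u+p=3-r"]:
    print(
        sample,
        HYPHEN_AS_RANGE.fullmatch(sample) is not None,
        HYPHEN_LAST.fullmatch(sample) is not None,
    )
# a<b True False
# x.y True False
# s-u+p=3-r True True
```

The first two samples contain characters that were never meant to be allowed, yet the range version accepts them.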
Validate Full Name Using Regular Expression
To validate that a string is a valid full name, you can use the following regular expression pattern as an example:
^[A-Z][a-z]*( [A-Z][a-z]*)*$
Explanation:
- ^ matches the beginning of the string
- [A-Z] matches the first letter, which must be a capital letter
- [a-z]* matches zero or more lowercase letters after the first letter
- ( [A-Z][a-z]*)* matches zero or more additional words in the name, where each word starts with a space followed by a capital letter and zero or more lowercase letters
- $ matches the end of the string
This pattern will match any string that consists of one or more words that are part of a full name, where each word starts with a capital letter and is followed by zero or more lowercase letters. A full name can consist of one or more such words, separated by a single space. Names that contain hyphens, apostrophes, or non-ASCII letters (for example, O'Brien or José) are rejected by this pattern.
Provided Input Must be a valid UserName
To validate that a string is a valid username, you can use the following regular expression pattern:
^[a-zA-Z0-9_]{3,20}$
- ^ matches the beginning of the string
- [a-zA-Z0-9_] matches any letter (upper or lower case), digit, or underscore
- {3,20} specifies that the previous character set (i.e., [a-zA-Z0-9_]) must appear between 3 and 20 times
- $ matches the end of the string
This pattern will match any string that consists of between 3 and 20 characters, using only letters, digits, or underscores. This pattern assumes that the username does not contain spaces or any other special characters.
- The 1st input is valid because it contains only letters and digits.
- The 2nd is valid because it contains only letters.
- The rest of the inputs are not in a valid format, because of spaces or special characters at the beginning or at the end.
Allow only hyphen(-) in UserName
^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$
This regular expression will match any username that starts with one or more alphanumeric characters and is followed by zero or more groups of a hyphen followed by one or more alphanumeric characters.
Here is a breakdown of the regular expression:
- ^ : Matches the beginning of the string.
- [a-zA-Z0-9]+ : Matches one or more alphanumeric characters.
- (-[a-zA-Z0-9]+)* : Matches zero or more groups of a hyphen followed by one or more alphanumeric characters.
- $ : Matches the end of the string.
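The two username rules can be compared side by side in a short Python sketch. The sample usernames are invented for illustration.

```python
import re

# 3 to 20 letters, digits, or underscores.
USERNAME = re.compile(r"^[a-zA-Z0-9_]{3,20}$")
# Alphanumeric groups joined by single hyphens; no leading, trailing, or doubled hyphen.
HYPHENATED = re.compile(r"^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$")

for sample in ["john_doe99", "ab", "john doe", "mary-ann-01", "-mary", "mary-", "mary--ann"]:
    print(
        sample,
        USERNAME.fullmatch(sample) is not None,
        HYPHENATED.fullmatch(sample) is not None,
    )
```

Note that the hyphen pattern has no length limit of its own; if one is needed, it has to be checked separately.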
Regular Expression to validate the Email Address
To validate that a string is a valid email address, you can use the following regular expression pattern as an example:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Explanation:
- ^ matches the beginning of the string
- [a-zA-Z0-9._%+-]+ matches one or more letters, digits, or any of the special characters ._%+-
- @ matches the at symbol, which separates the local part from the domain part of the email address
- [a-zA-Z0-9.-]+ matches one or more letters, digits, periods, or hyphens in the domain name
- \. matches a literal period, which separates the domain name from the top-level domain (TLD)
- [a-zA-Z]{2,} matches two or more letters as the TLD
- $ matches the end of the string
This pattern will match most strings that have the general shape of an email address. It is a format check only: it does not cover every address the email standards allow, it accepts some malformed ones (such as addresses with consecutive periods), and it does not check if the domain exists or if the email address is currently in use.
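The following Python sketch runs the email pattern against a few invented addresses, including two malformed ones that it still accepts, to show that it is a format check rather than a full validation.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

samples = [
    "jane.doe+news@example.co.uk",  # accepted
    "jane@example",                 # rejected: no top-level domain
    "jane@@example.com",            # rejected: two @ signs
    "a..b@example.com",             # accepted, although consecutive dots are malformed
    "jane@example..com",            # accepted, although the domain has an empty label
]
for sample in samples:
    print(sample, EMAIL.fullmatch(sample) is not None)
```

For anything beyond a first sanity check, sending a confirmation message is the usual way to verify that an address really works.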
Domain Name Validation
To validate that a string is a valid domain name, you can use the following regular expression pattern as an example:
^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$
Explanation:
- ^ matches the beginning of the string
- (?:[-A-Za-z0-9]+\.)+ matches one or more labels (the domain and any subdomains), where each label consists of one or more letters, digits, or hyphens, followed by a literal period (note the use of a non-capturing group with (?:...))
- [A-Za-z]{2,6} matches the top-level domain (TLD), which consists of two to six letters
- $ matches the end of the string
This pattern will match most common domain names. It allows for multiple subdomains, such as “www.example.com”. It only accepts ASCII letters, digits, and hyphens, so internationalized domain names (IDN) with non-ASCII characters are not matched, and TLDs longer than six letters are rejected.
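A short Python sketch makes the limits of the domain pattern visible. The sample domains are chosen for illustration only.

```python
import re

DOMAIN = re.compile(r"^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$")

samples = [
    "www.example.com",      # accepted
    "example",              # rejected: no top-level domain
    "example.photography",  # rejected: TLD longer than six letters
    "münchen.de",           # rejected: non-ASCII letter
    "-bad-.com",            # accepted, although labels should not start or end with a hyphen
]
for sample in samples:
    print(sample, DOMAIN.fullmatch(sample) is not None)
```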
Allow Website With Only Https
To allow only websites with the “https” protocol, you can use the following regular expression pattern as an example:
^https://[^\s/$.?#].[^\s]*$
Explanation:
- ^ matches the beginning of the string
- https:// matches the literal characters “https://”
- [^\s/$.?#] matches one character that is not a whitespace character, a slash, a dollar sign, a period, a question mark, or a hash mark, so the host cannot start with any of these
- . matches any single character (the period is not escaped here, so it acts as a wildcard rather than a literal period)
- [^\s]* matches any number of non-whitespace characters, which can include the path, query parameters and anchors (e.g., ?param=value#anchor)
- $ matches the end of the string
This pattern will match any string that starts with “https://” and continues with at least two more characters and no whitespace. It is a loose format check rather than a full URL validation.
Allow Website With Only Http
It is almost identical to the above example where we are validating websites only with HTTPS. The only change here will be to update the beginning string from HTTPS to HTTP.
^http://[^\s/$.?#].[^\s]*$
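Since the two patterns differ only in the scheme, they can be generated from one template. The Python sketch below assumes a hypothetical helper url_pattern that takes the scheme as a parameter.

```python
import re


def url_pattern(scheme):
    """Build the loose URL pattern for a single scheme such as "https"."""
    return re.compile(rf"^{re.escape(scheme)}://[^\s/$.?#].[^\s]*$")


HTTPS_ONLY = url_pattern("https")
HTTP_ONLY = url_pattern("http")

for sample in ["https://example.com/path?q=1#top", "http://example.com", "https://", "https://exa mple.com"]:
    print(
        sample,
        HTTPS_ONLY.fullmatch(sample) is not None,
        HTTP_ONLY.fullmatch(sample) is not None,
    )
```

Each pattern accepts only its own scheme, and both reject an empty host as well as any whitespace.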
Validate SFTP Address
An SFTP address typically consists of the following elements:
- The sftp:// scheme prefix (to indicate that the protocol is SFTP)
- An optional username and password (to authenticate to the server)
- The hostname or IP address of the server
- An optional port number (if the default port 22 is not used)
- An optional path to a file or directory on the server
The following pattern checks these elements in that order:
^sftp://(([a-zA-Z0-9]+):([a-zA-Z0-9]+)@)?([a-zA-Z0-9.-]+)(:([0-9]+))?(/[a-zA-Z0-9-._~:/?#[\]@!$&'()*+,;=]+)?$
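To pull the individual parts out of an SFTP address, the groups can be given names. The Python sketch below is a lightly adapted variant of the pattern, not the original: the named groups, the escaped [ and the hyphen moved to the end of the path class are assumptions of this example, and parse_sftp is a hypothetical helper.

```python
import re

SFTP = re.compile(
    r"^sftp://"
    r"(?:(?P<user>[a-zA-Z0-9]+):(?P<password>[a-zA-Z0-9]+)@)?"  # optional user:password@
    r"(?P<host>[a-zA-Z0-9.-]+)"                                  # hostname or IP address
    r"(?::(?P<port>[0-9]+))?"                                    # optional :port
    r"(?P<path>/[a-zA-Z0-9._~:/?#\[\]@!$&'()*+,;=-]+)?$"         # optional path
)


def parse_sftp(url):
    """Return the named parts of an SFTP address, or None if it does not match."""
    match = SFTP.fullmatch(url)
    return match.groupdict() if match else None


print(parse_sftp("sftp://alice:secret@files.example.com:2222/upload/report.csv"))
print(parse_sftp("sftp://files.example.com"))
print(parse_sftp("ftp://files.example.com"))  # None
```

Parts that are absent from the address come back as None in the resulting dictionary.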
Regex to Allow only valid IP Addresses
^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
This pattern uses the ^ and $ symbols to match the start and end of the string, respectively, and a series of ( ... ) groups to match each segment of the IP address. Each segment can be one of the following:
- 25[0-5] : Matches a value between 250 and 255.
- 2[0-4][0-9] : Matches a value between 200 and 249.
- [01]?[0-9][0-9]? : Matches a value between 0 and 199, including forms with leading zeros such as 01 or 007.
By using this pattern, you can ensure that any IP address entered is properly formatted and meets the valid range for each segment.
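Because the same segment expression is repeated four times, it can be defined once and reused. The Python sketch below builds the full pattern from a single OCTET fragment; the constant names are assumptions of this example.

```python
import re

# One segment of the address: 250-255, 200-249, or 0-199.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4 = re.compile(rf"^{OCTET}\.{OCTET}\.{OCTET}\.{OCTET}$")

for sample in ["192.168.1.1", "255.255.255.255", "256.1.1.1", "1.2.3", "01.02.03.004"]:
    print(sample, IPV4.fullmatch(sample) is not None)
```

The last sample shows that segments with leading zeros are accepted, which may or may not be desirable depending on where the address is used.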
IP Address with Port Number
^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?):\d{1,5}$
This regex pattern is similar to the previous one, but it also includes a :\d{1,5} section to match the port number. The \d{1,5} pattern matches one to five digits, so it accepts any value from 0 to 99999; valid port numbers only go up to 65535, so the numeric range has to be checked separately.
This regex pattern will validate IP addresses in the format XXX.XXX.XXX.XXX:YYYYY, where XXX is a valid segment of an IP address and YYYYY is the port number.
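A digit count alone cannot express the upper bound of 65535, so one option is to capture the port and compare it numerically. The Python sketch below does that; is_valid_endpoint is a hypothetical helper, and the lower bound of 1 is an assumption of this example.

```python
import re

OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
# Groups 1-4 are the address segments, group 5 is the port.
IP_PORT = re.compile(rf"^{OCTET}\.{OCTET}\.{OCTET}\.{OCTET}:(\d{{1,5}})$", re.ASCII)


def is_valid_endpoint(text):
    """Check the format with the pattern, then the port range numerically."""
    match = IP_PORT.fullmatch(text)
    if match is None:
        return False
    return 1 <= int(match.group(5)) <= 65535


for sample in ["10.0.0.1:8080", "10.0.0.1:65535", "10.0.0.1:65536", "10.0.0.1:0", "10.0.0.1"]:
    print(sample, is_valid_endpoint(sample))
```

The pattern on its own would accept 10.0.0.1:65536 and 10.0.0.1:0; the numeric comparison rejects them.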
Valid Html Tags Validation
To validate HTML tags, you can use a regular expression to match the syntax of an HTML tag. Here is an example of a regex pattern that can be used to validate HTML tags:
^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$
This pattern uses the following elements:
- ^ and $ : Match the start and end of the string, respectively.
- <([a-z]+) : Matches the opening angle bracket < followed by the tag name, which can consist of one or more lowercase letters.
- ([^<]+)* : Matches zero or more attributes within the tag, which can consist of any character except <.
- (?:>(.*)<\/\1>|\s+\/>) : Matches either > followed by the element content and a closing tag </tag_name> (the backreference \1 requires the same name as in the opening tag), or a self-closing tag that ends in whitespace and />.
By using this pattern, you can ensure that any HTML tag entered is properly formatted and meets the syntax requirements for an HTML tag.
Note that this regex pattern is a simplified version of the HTML tag syntax and may not be suitable for all use cases. For a more comprehensive solution, you may want to use an HTML parser or a library that specializes in HTML validation.
Extract Metatags in HTML
We can use the following regex to extract the metadata tags from the HTML.
/<meta\s+[^>]*name=["']([^'"]+)["'][^>]*content=["']([^'"]+)["'][^>]*>/i
This pattern matches any meta tag with a name attribute followed by a content attribute, and captures the values of those attributes using capturing groups. Here’s a breakdown of the pattern:
- <meta\s+ : matches the opening of the meta tag and the whitespace between meta and the next attribute.
- [^>]* : matches any number of characters that are not the closing > character, i.e. any attributes that come before name.
- name=["'] : matches the name attribute and the opening single or double quote of its value.
- ([^'"]+) : captures the value of the name attribute using a capturing group, allowing for any character except single or double quotes.
- ["'][^>]* : matches the closing quote of the name value and any characters other than > that come before the content attribute.
- content=["'] : matches the content attribute and the opening single or double quote of its value.
- ([^'"]+) : captures the value of the content attribute using a capturing group, allowing for any character except single or double quotes.
- ["'][^>]*> : matches the closing quote and any remaining characters that are part of the meta tag, up to and including the closing > character.
The i flag at the end of the pattern makes the match case insensitive. Note that the pattern only matches when the name attribute appears before the content attribute.
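Extraction with the two capturing groups can be sketched in Python as follows. The HTML snippet is made up for the example, and re.IGNORECASE plays the role of the i flag.

```python
import re

META = re.compile(
    r"""<meta\s+[^>]*name=["']([^'"]+)["'][^>]*content=["']([^'"]+)["'][^>]*>""",
    re.IGNORECASE,
)

html = """<head>
<meta name="description" content="Regex examples">
<META NAME='author' CONTENT='Jane Doe'>
<meta charset="utf-8">
</head>"""

# findall returns one (name, content) tuple per matching tag.
print(META.findall(html))
# [('description', 'Regex examples'), ('author', 'Jane Doe')]
```

The charset tag is skipped because it has neither a name nor a content attribute.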
Find Comments in HTML
To find HTML comments using regular expressions, you can use the following pattern:
/<!--[\s\S]*?-->/g
This pattern matches anything that starts with <!-- and ends with -->, including any characters in between. Here’s a breakdown of the pattern:
- <!-- : matches the start of an HTML comment.
- [\s\S]*? : matches any number of characters, including line breaks, in a non-greedy way, meaning it will match as few characters as possible to satisfy the pattern.
- --> : matches the end of an HTML comment.
The g flag at the end of the pattern makes the match global, meaning it will find all matches in the input string.
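In Python, finding all comments corresponds to findall, and removing them to sub. The sketch below uses an invented HTML fragment that includes a multi-line comment.

```python
import re

COMMENT = re.compile(r"<!--[\s\S]*?-->")

html = "<p>Hello</p><!-- note --><p>World</p><!--\nmulti\nline\n-->"

# List every comment, including the one that spans several lines.
print(COMMENT.findall(html))
# ['<!-- note -->', '<!--\nmulti\nline\n-->']

# Remove the comments from the markup.
print(COMMENT.sub("", html))
# <p>Hello</p><p>World</p>
```

The non-greedy quantifier is what keeps the first match from running all the way to the last --> in the input.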
Regex to validate the Date
To validate a date using a regular expression, you’ll need to specify the format of the date you’re expecting. Here’s an example of a regex pattern that can be used to validate dates in the format YYYY-MM-DD:
^(20\d{2})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$
This pattern uses the following elements:
- ^ and $ : Match the start and end of the string, respectively.
- (20\d{2}) : Matches a year in the range 2000-2099.
- -(0[1-9]|1[0-2])- : Matches a month in the range 01-12.
- (0[1-9]|[12][0-9]|3[01]) : Matches a day in the range 01-31.
This regex pattern will validate dates in the format YYYY-MM-DD. It checks each field's range on its own, so it still accepts impossible dates such as 2023-02-31. If you need to validate a different date format, you’ll need to modify the pattern accordingly.
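The pattern checks each field independently, so it cannot know how many days a given month has. One way to close that gap, sketched here in Python, is to let the pattern check the format and then hand the captured numbers to datetime.date. The helper name is_real_date is hypothetical.

```python
import re
from datetime import date

DATE = re.compile(r"^(20\d{2})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$", re.ASCII)


def is_real_date(text):
    """Return True when text matches YYYY-MM-DD and names an existing calendar day."""
    match = DATE.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for days like February 31
    except ValueError:
        return False
    return True


for sample in ["2024-02-29", "2023-02-29", "2023-02-31", "2024-13-01", "1999-01-01"]:
    print(sample, is_real_date(sample))
```

Only the first sample passes: 2024 is a leap year, while the others fail on the calendar check, the month range, or the year range.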
Regex to validate the Date and Time
To validate a date and time using a regular expression, you’ll need to specify the format of the date and time you’re expecting. Here’s an example of a regex pattern that can be used to validate dates and times in the format YYYY-MM-DD HH:MM:SS:
^(20\d{2})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])\s([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$
This pattern uses the following elements:
- ^ and $ : Match the start and end of the string, respectively.
- (20\d{2}) : Matches a year in the range 2000-2099.
- -(0[1-9]|1[0-2])- : Matches a month in the range 01-12.
- (0[1-9]|[12][0-9]|3[01]) : Matches a day in the range 01-31.
- \s : Matches a whitespace character (a space in this format).
- ([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9] : Matches a time in the format HH:MM:SS, with hours in the range 00-23, and minutes and seconds in the range 00-59.
This regex pattern will validate dates and times in the format YYYY-MM-DD HH:MM:SS. If you need to validate a different date and time format, you’ll need to modify the pattern accordingly.
Validate UTC DateTime using Regex
To validate that a string has the shape of a UTC DateTime in the format YYYY-MM-DDTHH:mm:ssZ, you can use the following regular expression pattern as an example:
^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$
Explanation:
- ^ matches the beginning of the string
- \d{4} matches four digits for the year
- -\d{2} matches a hyphen followed by two digits for the month
- -\d{2} matches a hyphen followed by two digits for the day
- T matches the literal character “T”, which separates the date from the time
- \d{2} matches two digits for the hour
- :\d{2} matches a colon followed by two digits for the minutes
- :\d{2} matches a colon followed by two digits for the seconds
- Z matches the literal character “Z”, which indicates the UTC time zone
- $ matches the end of the string
This pattern only checks the layout of the digits. It does not check value ranges, so a string such as 2024-13-40T25:61:61Z also matches.
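Since \d{2} accepts any two digits, a string with month 13 or hour 25 still fits the pattern. The Python sketch below first checks the shape with the pattern and then lets datetime.strptime verify the values; parse_utc is a hypothetical helper name.

```python
import re
from datetime import datetime, timezone

UTC_SHAPE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$", re.ASCII)


def parse_utc(text):
    """Return an aware datetime for a valid UTC timestamp, or None otherwise."""
    if UTC_SHAPE.fullmatch(text) is None:
        return None
    try:
        parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None  # right shape, but an impossible date or time
    return parsed.replace(tzinfo=timezone.utc)


print(parse_utc("2024-05-17T08:30:00Z"))
print(parse_utc("2024-13-40T25:61:61Z"))  # None
print(parse_utc("2024-05-17 08:30:00"))   # None
```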
Regex to Find String, Has Special Word
\b(specialword)\b
Replace “specialword” with the actual word you want to search for. This regular expression will match any occurrence of the word “specialword” as a whole word in a string.
Here’s a breakdown of the regular expression:
- \b : Word boundary to match the beginning of the word
- (specialword) : The word you want to search for, enclosed in parentheses to create a capture group
- \b : Another word boundary to match the end of the word
For example, if you want to find a string that has the word “password” in it, you can use the following regular expression:
\b(password)\b
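A small Python sketch of the whole-word search follows. The helper contains_word is hypothetical; re.escape is used so that a search word containing regex metacharacters is treated literally.

```python
import re


def contains_word(text, word):
    """Return True when word occurs in text as a whole word (case-sensitive)."""
    return re.search(rf"\b({re.escape(word)})\b", text) is not None


print(contains_word("Reset your password now", "password"))  # True
print(contains_word("passwords are secret", "password"))     # False
print(contains_word("mypassword", "password"))               # False
```

The last two inputs contain the letters of the word but not the word on its own, so the word boundaries prevent a match.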
Regex to find # tags in a Twitter post
Use the following Regex to find the hashtags in the Twitter post.
#\w+
This regular expression will match any string that starts with the ‘#’ character followed by one or more word characters (letters, digits, or underscores).
Note that this regular expression will match any ‘#’ character followed by word characters, so it may also produce false positives, such as the fragment part of a URL that contains a ‘#’ character. To avoid this, you can add a negative look-behind assertion to exclude URLs.
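One possible look-behind, shown here as an assumption rather than as the definitive fix, is (?<!\S), which only allows a hashtag at the start of the text or directly after whitespace. The Python sketch below compares it with the plain pattern on an invented post.

```python
import re

HASHTAG = re.compile(r"#\w+")
# Assumed variant: the "#" must not be preceded by a non-whitespace character.
STANDALONE_HASHTAG = re.compile(r"(?<!\S)#\w+")

post = "Loving #regex and #Python3! Docs: https://example.com/page#section"

print(HASHTAG.findall(post))
# ['#regex', '#Python3', '#section']
print(STANDALONE_HASHTAG.findall(post))
# ['#regex', '#Python3']
```

The URL fragment is dropped because its ‘#’ directly follows another character.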
Filter Spam Comments using Regex
To filter out spam comments, you can use a regular expression to match common patterns found in spam comments. Here’s an example regular expression that you can use:
\b(?:(?:https?|ftp)://)?(?:www\.)?[a-z0-9]+(?:[._-][a-z0-9]+)*\.[a-z]{2,}\b|[\w._%+-]+@(?:[a-z0-9]+\.)+[a-z]{2,}\b|\b\w{15,}\b
This regular expression matches three types of patterns commonly found in spam comments:
- URLs: The regular expression matches URLs with an optional “http://”, “https://”, or “ftp://” prefix and a top-level domain of at least two letters (e.g., “.com”, “.org”, etc.). Because the prefix is optional, bare domain names such as google.com are matched too.
- Email addresses: The regular expression matches any email address with a domain name that has at least two letters.
- Long words: The regular expression matches any word that is 15 characters or longer.
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // Three alternatives: URL or bare domain, email address, word of 15 or more characters.
        string pattern = @"\b(?:(?:https?|ftp)://)?(?:www\.)?[a-z0-9]+(?:[._-][a-z0-9]+)*\.[a-z]{2,}\b|[\w._%+-]+@(?:[a-z0-9]+\.)+[a-z]{2,}\b|\b\w{15,}\b";
        string input = @"[email protected] Hi, I hope you are doing well. Would you be interested in a guest post offer that will help you boost your website traffic? I’ve got one for you! I’ll provide you with a unique, SEO optimized, google.com keywords oriented, quality content that will interest your readers and would justneedabacklink to my website in return. You’ll just have to choose one topic out of the three I’ll send you in my next email and I will then send over the article on that topic. Shall I send you the topics then? Looking forward to your response. Best, Amelia Lopez";
        RegexOptions options = RegexOptions.Multiline;

        // Print every suspicious fragment together with its position in the comment.
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
Find Substring in String Using Regular Expression
One of the simplest regex tasks is finding a word inside a string. Wrap the word in \b word boundaries so that only whole-word occurrences are matched.
\blove\b
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // \b on both sides restricts the match to the whole word "love".
        string pattern = @"\blove\b";
        string input = @"I love programing.Do you love too.";
        RegexOptions options = RegexOptions.Multiline;

        // Print each occurrence together with its position in the input.
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
Conclusion
By now, you should feel confident in your ability to craft sophisticated patterns, navigate the nuances of RegEx syntax, and creatively apply this knowledge to solve a diverse range of challenges.
Remember that becoming proficient in RegEx takes practice, patience, and a willingness to experiment. As you continue on your journey, you’ll find that the more you use it, the more intuitive and second-nature it becomes.