The notion that regex doesn’t support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:
    ^((?!hede).)*$
The regex above will match any string, or line without a line break, not containing the (sub)string ‘hede’. As mentioned, this is not something regex is “good” at (or should do), but still, it is possible.
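As a quick check, here is a minimal sketch using Python's standard `re` module. The sample lines are made up for illustration and are not part of the original answer.

```python
import re

# The pattern from above: accept a line only if "hede" never appears in it.
NO_HEDE = re.compile(r'^((?!hede).)*$')

# Hypothetical sample lines, chosen only for illustration.
lines = ["hoho", "hihi", "haha", "hede", "ABhedeCD", ""]

# Keep the lines the pattern accepts.
kept = [line for line in lines if NO_HEDE.match(line)]
print(kept)  # ['hoho', 'hihi', 'haha', '']
```

Note that the empty string is accepted too, since the group may repeat zero times.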
Explanation
A string is just a list of `n` characters. Before, and after each character, there’s an empty string. So a list of `n` characters will have `n+1` empty strings. Consider the string `"ABhedeCD"`:
        +--+---+--+---+--+---+--+---+--+---+--+---+--+---+--+---+--+
    S = |e1| A |e2| B |e3| h |e4| e |e5| d |e6| e |e7| C |e8| D |e9|
        +--+---+--+---+--+---+--+---+--+---+--+---+--+---+--+---+--+

    index    0      1      2      3      4      5      6      7
where the `e`'s are the empty strings. The regex `(?!hede).` looks ahead to see if there’s no substring `"hede"` to be seen, and if that is the case (so something else is seen), then the `.` (dot) will match any character except a line break. Look-arounds are also called zero-width assertions because they don’t consume any characters. They only assert/validate something.
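The zero-width behavior can be observed directly. The following is a small sketch with Python's `re` module; the variable names `s` and `step` are just illustrative.

```python
import re

s = "ABhedeCD"
step = re.compile(r'(?!hede).')

# At index 0 nothing forbidden lies ahead, so the dot consumes "A".
m = step.match(s, 0)
print(m.group(), m.span())  # A (0, 1)

# At index 2 the text ahead is "hede", so the lookahead fails
# and nothing is matched at that position.
print(step.match(s, 2))  # None

# The lookahead on its own succeeds without consuming anything:
# the match has an empty span.
print(re.match(r'(?!hede)', s).span())  # (0, 0)
```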
So, in my example, every empty string is first validated to see if there’s no `"hede"` up ahead, before a character is consumed by the `.` (dot). The regex `(?!hede).` will do that only once, so it is wrapped in a group, and repeated zero or more times: `((?!hede).)*`. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: `^((?!hede).)*$`.
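To see why the anchors matter, compare the repeated group with and without them. This is a sketch in Python's `re` module; the input `"ABCD"` is an extra example added here for contrast.

```python
import re

s = "ABhedeCD"

# Without anchors, the repeated group simply stops before "hede"
# and reports the prefix it managed to consume.
print(re.search(r'((?!hede).)*', s).group())  # AB

# With both anchors the whole input must be consumed, so nothing matches.
print(re.search(r'^((?!hede).)*$', s))  # None

# An input without "hede" is consumed from start to end.
print(re.search(r'^((?!hede).)*$', "ABCD").group())  # ABCD
```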
As you can see, the input `"ABhedeCD"` will fail because on `e3`, the regex `(?!hede)` fails (there is `"hede"` up ahead!).
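The same walk over the empty strings can be reproduced in code. This sketch checks the bare lookahead at every gap, assuming the `e1`..`e9` labels map to string indices 0..8 as in the diagram.

```python
import re

s = "ABhedeCD"
lookahead = re.compile(r'(?!hede)')

# A string of n characters has n+1 gaps (string indices 0..n).
gaps = range(len(s) + 1)

# Collect the labels of the gaps where the assertion fails.
failing = [f"e{i + 1}" for i in gaps if lookahead.match(s, i) is None]
print(len(gaps), failing)  # 9 ['e3']
```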
Jan 18 at 15:49 community-wiki Thanks to Bart Kiers