
Ignore my philosophies

Posted: Fri Jul 26, 2013 6:03 am
by TheWanderer
So I am sitting here trying to devise an expression that will reliably match file extensions, and only file extensions, without having to type them all out.
(i.e. ^.*?\.(reg|ex|can|suk) )


When devising expressions that are not simple I always remind myself that regex is greedy.
Regex wants to take everything and it doesn't give two f..ks that you don't want it to take everything.
Doesn't care what you think or how you feel about anything.
It will only laugh (if it could) when you get frustrated because it isn't being logical or rational.
It is a matter of trying to make it do what you want (or trick it) and you often still lose.
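For anyone who wants to see the greedy behaviour rather than just be mocked by it, here is a minimal sketch in Python. The filename is made up for illustration; the only difference between the two patterns is the `?` after `.*`.

```python
import re

name = "archive.tar.gz"  # hypothetical filename, not from the thread

# Greedy: .* takes as much as it can, then backtracks to the LAST dot.
greedy = re.match(r"^.*\.(\w+)", name)

# Lazy: .*? takes as little as it can, so it stops at the FIRST dot.
lazy = re.match(r"^.*?\.(\w+)", name)

greedy_ext = greedy.group(1)
lazy_ext = lazy.group(1)
print(greedy_ext, lazy_ext)
```

Same input, same idea, two different captures. Which one is "right" depends on whether you want the last extension or the first.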


Then it dawned on me, regex is Friend of the Court.

(That's the punch line. Go back about your business. Nothing more to see here.)

Re: Ignore my philosophies

Posted: Fri Jul 26, 2013 8:29 am
by Quade
I'm of the "good enough" school of regexes.

In the case of Usenet, 99% of extensions are followed by a space or a quote, so I use that.

"[.]jpg[\s\"]" if I want to be relatively precise.
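A rough sketch of how that pattern behaves, written in Python. The subject lines are invented for illustration and are not real Usenet data.

```python
import re

# A literal dot, "jpg", then one whitespace character or a double quote.
pattern = re.compile(r'[.]jpg[\s"]')

subjects = [
    'holiday pics "beach01.jpg" yEnc (1/4)',  # extension followed by a quote
    'beach02.jpg (2/4)',                      # extension followed by a space
    'beach03.jpg',                            # extension at the very end of the string
]

hits = [bool(pattern.search(s)) for s in subjects]
print(hits)
```

The last subject is missed because nothing follows the extension. That is the "good enough" trade-off: the pattern covers the common cases and lets the rare one go.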

I know I'm probably missing the point of your post...

Re: Ignore my philosophies

Posted: Mon Jun 13, 2016 11:31 am
by bobkoure
Ages late, but on the off chance someone finds this thread via search

\w word character i.e. [a-zA-Z0-9_]
\W inverse of above (non word character)
\. dot
{N} exactly N repetitions of the preceding token
q(?=u) lookahead, matches a q that is followed by a u
soooo...

(\.\w{3})(?=\W) // dot, then 3 word chars, followed by a non-word char
or, if you don't want the dot in \1
\.(\w{3})(?=\W)
You can also use lookbehind to check for a dot before the three word chars, but lookbehind can have performance issues in some engines
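To make the two variants concrete, here is a small Python sketch. The sample text is made up; the patterns are the ones given above, unchanged.

```python
import re

text = 'get "photo.jpg" and notes.txt, then readme.md'  # hypothetical sample

# Variant 1: the dot is part of the captured group.
with_dot = re.findall(r"(\.\w{3})(?=\W)", text)

# Variant 2: the dot is matched but left out of the group.
without_dot = re.findall(r"\.(\w{3})(?=\W)", text)

# The lookahead needs an actual non-word character, so an extension
# at the very end of the string is not matched.
at_end = re.findall(r"\.(\w{3})(?=\W)", "file.txt")

print(with_dot, without_dot, at_end)
```

Note that `.md` is skipped because `\w{3}` demands exactly three word characters, and `file.txt` is skipped because there is nothing after it for `(?=\W)` to look at.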

Sounds like the OP got bitten by a 'hungry' (greedy) match. Regex has a fairly steep learning curve, but it's super useful

Re: Ignore my philosophies

Posted: Thu Dec 15, 2016 7:39 am
by kalzekdor
bobkoure wrote:
Ages late, but on the off chance someone finds this thread via search

\w word character i.e. [a-zA-Z0-9_]
\W inverse of above (non word character)
\. dot
{N} number of characters
q(?=u) lookahead, matches a q that is followed by a u
soooo...

(\.\w{3})(?=\W) // dot, then 3 word chars, followed by a non-word char
or, if you don't want the dot in \1
\.(\w{3})(?=\W)
You can also use lookbehind to see if there's a dot before the three word chars, but lookbehind has performance issues

Sounds like the OP got bitten by a 'hungry' match. Regex has... a fairly steep learning curve, but it's super useful


Except there are plenty of file extensions that aren't three letters long...
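One possible way around that, sketched in Python: swap the fixed `{3}` for a range and let the lookahead accept end-of-string as well. The 1 to 5 character bound and the sample filenames are assumptions for illustration, not anything from the posts above.

```python
import re

text = "index.html style.css app.js photo.jpeg archive.7z"  # hypothetical names

# Fixed length: only three-character extensions followed by a non-word char.
fixed = re.findall(r"\.(\w{3})(?=\W)", text)

# Assumed bound: 1 to 5 word characters, followed by a non-word char
# or the end of the string.
flexible = re.findall(r"\.(\w{1,5})(?=\W|$)", text)

print(fixed)
print(flexible)
```

The flexible version is also more permissive: it will happily treat the "g" in "e.g." as an extension, so it probably still belongs in the "good enough" school.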