
Filtering out Crazy Posts

Posted: Tue Dec 04, 2012 9:06 pm
by luezuve
Hello All. I'll be the first to admit that I know zip about regular expressions. In some of the groups there are these huge posts which seem to contain random headers. I'd usually just reject the poster, but unfortunately they're using a generic username, and rejecting it would filter out too many other posts. So... I'm hoping there's some kind of regular expression I could employ. I'll give an example of the post:

Topic: 645000a7ab8cbbbe86203a5ff1fb97aa[001/104] - "22d79fa0b128555468ab375611e092eb943e91f8b4c10229b06c0b5dcd0be627.0" yEnc
Poster: Yenc@power-post.org (Yenc-PP-A&A)
Group: alt.binaries.hdtv.x264
Message ID example: 1354564952.50075.1@eu.news.astraweb.com

Thanks for any help.

Re: Filtering out Crazy Posts

Posted: Wed Jul 24, 2013 5:43 am
by TheWanderer
^[a-f0-9]{25,}\[[\d/]+\][\s-]+"[a-f0-9]{35,}[.][a-z0-9]{1,4}" yenc

I wouldn't know where to begin explaining this without direct questions.

Have fun.
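
For anyone who wants to check the pattern against the sample subject from the first post, here is a minimal Python sketch. It is not how the newsreader applies its filters. The `re.IGNORECASE` flag is an assumption (the sample ends in "yEnc" while the pattern says "yenc"), and the `normal` subject and the `is_garbage` helper are made up for illustration.

```python
import re

# Pattern copied from the post above.
# ASSUMPTION: matching is case-insensitive, since the sample subject
# ends in "yEnc" but the pattern is written with "yenc".
GARBAGE_SUBJECT = re.compile(
    r'^[a-f0-9]{25,}\[[\d/]+\][\s-]+"[a-f0-9]{35,}[.][a-z0-9]{1,4}" yenc',
    re.IGNORECASE,
)

# Subject taken from the first post in this thread.
sample = (
    '645000a7ab8cbbbe86203a5ff1fb97aa[001/104] - '
    '"22d79fa0b128555468ab375611e092eb943e91f8b4c10229b06c0b5dcd0be627.0" yEnc'
)

# Hypothetical ordinary subject, invented for comparison.
normal = 'Some.Show.S01E01.720p.x264 [01/50] - "some.show.s01e01.part01.rar" yEnc'


def is_garbage(subject: str) -> bool:
    """Return True if the subject looks like a hex-hash garbage post."""
    return GARBAGE_SUBJECT.search(subject) is not None


if __name__ == "__main__":
    for subject in (sample, normal):
        print(is_garbage(subject), subject[:50])
```

Reading the pattern left to right: a run of 25 or more hex characters at the start, a bracketed part counter such as [001/104], spaces and dashes, then a quoted name made of 35 or more hex characters with a short extension, followed by yenc.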


This expression is offered free of charge and comes with no expressed or implied warranty

Re: Filtering out Crazy Posts

Posted: Wed Nov 05, 2014 12:53 pm
by TBlack
Are there any new thoughts on filtering out this type of garbage? I've tried the supplied string and it doesn't work for the newer garbage strings.

Is there an app where I can enter the string I want to filter and it builds the correct RE string?

Thanks

Re: Filtering out Crazy Posts

Posted: Wed Nov 05, 2014 1:49 pm
by Quade
If you Ctrl-R some of these files, you might be able to see what's in them.

"[A-Z0-9]{30}"

Should match these.

Re: Filtering out Crazy Posts

Posted: Wed Nov 05, 2014 2:23 pm
by TBlack
Yes, thanks. This does catch quite a few. Some others have one or more hyphens in them and it misses those.

Thanks

Re: Filtering out Crazy Posts

Posted: Wed Nov 05, 2014 6:02 pm
by Quade
You can add hyphens to that too.

"[A-Z0-9\-]{30}"

Maybe this one. I haven't tested that it works.
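
A quick way to try both patterns outside the newsreader is a few lines of Python. This is only a sketch: the two subjects below are invented, and Python's `re` is case-sensitive by default, which may differ from how the newsreader matches.

```python
import re

# Both patterns copied from the posts above; the quotes are part of the pattern.
PLAIN = re.compile(r'"[A-Z0-9]{30}"')
WITH_HYPHEN = re.compile(r'"[A-Z0-9\-]{30}"')

# Hypothetical subjects, made up for this test (30 characters between the quotes).
no_hyphen = '[01/20] - "' + 'A1B2C3D4E5' * 3 + '" yEnc'
hyphenated = '[01/20] - "' + 'A1B2-C3D4-' * 3 + '" yEnc'

if __name__ == "__main__":
    for name, subject in (("no_hyphen", no_hyphen), ("hyphenated", hyphenated)):
        print(
            name,
            "plain:", bool(PLAIN.search(subject)),
            "with hyphen:", bool(WITH_HYPHEN.search(subject)),
        )
```

Note that {30} asks for exactly 30 characters between the quotes. If the garbage names vary in length, {30,} (30 or more) would be the usual way to loosen it.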

Re: Filtering out Crazy Posts

Posted: Wed Nov 05, 2014 6:49 pm
by TBlack
Thanks... let me try it.

Re: Filtering out Crazy Posts

Posted: Thu Nov 06, 2014 12:31 pm
by TBlack
"[A-Z0-9\-]{30}"

Quade, this works very well for filtering out a lot of the header garbage. Thanks for the tip!