Filtering out Crazy Posts

Postby luezuve » Tue Dec 04, 2012 9:06 pm

Hello all. I'll be the first to admit that I know zip about regular expressions. Some of the groups contain huge posts whose headers look like random strings. I'd usually just reject the poster, but unfortunately they're using a generic username, so rejecting it would filter out too many legitimate posts. So I'm hoping there's some kind of regular expression I could use. I'll give an example of one such post:

Topic: 645000a7ab8cbbbe86203a5ff1fb97aa[001/104] - "22d79fa0b128555468ab375611e092eb943e91f8b4c10229b06c0b5dcd0be627.0" yEnc
Poster: Yenc@power-post.org (Yenc-PP-A&A)
Group: alt.binaries.hdtv.x264
Message ID example: 1354564952.50075.1@eu.news.astraweb.com

Thanks for any help.
luezuve
Occasional Contributor
Posts: 11
Joined: Wed Jul 09, 2003 12:15 am

Registered Newsbin User since: 05/19/03

Re: Filtering out Crazy Posts

Postby TheWanderer » Wed Jul 24, 2013 5:43 am

^[a-f0-9]{25,}\[[\d/]+\][\s-]+"[a-f0-9]{35,}[.][a-z0-9]{1,4}" yenc

I wouldn't know where to begin explaining this without direct questions.

Have fun.
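
For anyone who wants a starting point, here is a rough Python sketch that splits the expression above into commented pieces and runs it against the sample subject from the first post. The names `SPAM_SUBJECT`, `sample` and `ordinary` are made up for this illustration, the `ordinary` subject is invented, and the case-insensitive flag is an assumption (the pattern says `yenc` while the sample says `yEnc`). NewsBin's own matching may behave differently from Python's `re` module.

```python
import re

# TheWanderer's expression, split into commented pieces with re.VERBOSE.
# re.IGNORECASE is an assumption so that "yenc" also matches "yEnc".
SPAM_SUBJECT = re.compile(
    r'''
    ^[a-f0-9]{25,}       # subject starts with 25 or more hex characters
    \[[\d/]+\]           # part counter in brackets, e.g. [001/104]
    [\s-]+               # whitespace and hyphens between counter and file name
    "[a-f0-9]{35,}       # opening quote, then 35 or more hex characters
    [.][a-z0-9]{1,4}"    # a literal dot, a 1-4 character extension, closing quote
    [ ]yenc              # one space, then the yEnc marker
    ''',
    re.VERBOSE | re.IGNORECASE,
)

# Subject copied from the first post in this thread.
sample = (
    '645000a7ab8cbbbe86203a5ff1fb97aa[001/104] - '
    '"22d79fa0b128555468ab375611e092eb943e91f8b4c10229b06c0b5dcd0be627.0" yEnc'
)

# Invented subject that should NOT be caught by the filter.
ordinary = 'Some.Show.S01E01.720p [01/20] - "some.show.s01e01.mkv" yEnc'

print(bool(SPAM_SUBJECT.search(sample)))    # True
print(bool(SPAM_SUBJECT.search(ordinary)))  # False
```

The leading `^` is what keeps ordinary posts safe here: a subject only matches if it starts with the long hex run.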


This expression is offered free of charge and comes with no expressed or implied warranty
TheWanderer
n00b
Posts: 8
Joined: Sat Jul 20, 2013 12:53 am

Registered Newsbin User since: 04/21/13

Re: Filtering out Crazy Posts

Postby TBlack » Wed Nov 05, 2014 12:53 pm

Are there any new thoughts on filtering out this type of garbage? I've tried the supplied string and it doesn't work for the newer garbage strings.

Is there an app where I can enter the string I want to filter and it builds the correct RE string?

Thanks
Tom
TBlack
Seasoned User
Posts: 340
Joined: Sat Mar 23, 2002 12:30 pm
Location: Indiana

Registered Newsbin User since: 04/05/03

Re: Filtering out Crazy Posts

Postby Quade » Wed Nov 05, 2014 1:49 pm

If you Ctrl-R some of these files, you might be able to see what's in them.

"[A-Z0-9]{30}"

Should match these.
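
A quick way to see what this pattern does is to try it outside NewsBin. The Python sketch below takes the pattern literally, with the double quotes as part of the match, which is an assumption about how it was meant. The subject lines are made up for illustration (the thread does not show the newer garbage posts), and `QUOTED_30` and `is_garbage` are hypothetical names. Python's `re` is case-sensitive by default, so lowercase names are not caught here; NewsBin may treat case differently.

```python
import re

# Quade's pattern taken literally: a double quote, exactly 30 characters
# from A-Z or 0-9, then a closing double quote.
QUOTED_30 = re.compile(r'"[A-Z0-9]{30}"')

# Made-up subjects, built to known lengths; not taken from a real group.
subjects = [
    '[01/12] - "' + "A1B2C" * 6 + '" yEnc',          # 30 letters and digits
    '[01/12] - "' + "A1B2-" * 6 + '" yEnc',          # 30 characters, with hyphens
    '[01/12] - "holiday.photos.part01.rar" yEnc',    # ordinary file name
]

def is_garbage(subject: str) -> bool:
    """Return True if the subject contains a quoted 30-character random-looking name."""
    return QUOTED_30.search(subject) is not None

for s in subjects:
    print(is_garbage(s), s)
```

Only the first subject is flagged. The second is skipped because a hyphen is not in the character class, and the third because it contains lowercase letters and dots.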
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: Filtering out Crazy Posts

Postby TBlack » Wed Nov 05, 2014 2:23 pm

Yes, thanks. This does catch quite a few. Some others have one or more hyphens in them and it misses those.

Thanks
Tom

Re: Filtering out Crazy Posts

Postby Quade » Wed Nov 05, 2014 6:02 pm

You can add hyphens to that too.

"[A-Z0-9\-]{30}"

Maybe this one. I haven't tested that it works.
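
For what it's worth, the hyphen version can be checked the same way in Python. This is only a sketch: the file names are invented and built to known lengths, `QUOTED_30_HYPHEN` is a hypothetical name, and NewsBin's regex engine may not behave exactly like Python's `re`.

```python
import re

# The hyphen is escaped so it is read as a literal character, not as a range.
QUOTED_30_HYPHEN = re.compile(r'"[A-Z0-9\-]{30}"')

# Made-up quoted names; not real posts.
with_hyphens = '"' + "A1B2-" * 6 + '"'         # 30 characters, hyphens included
too_long = '"' + "A1B2-" * 6 + "X" + '"'       # 31 characters between the quotes

print(bool(QUOTED_30_HYPHEN.search(with_hyphens)))  # True
print(bool(QUOTED_30_HYPHEN.search(too_long)))      # False
```

Note that `{30}` between the quotes means exactly 30 characters, so a 31-character name slips through. If longer names also need catching, a quantifier such as `{30,}` is the usual way to say "30 or more", though nobody in this thread has tested that in NewsBin.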

Re: Filtering out Crazy Posts

Postby TBlack » Wed Nov 05, 2014 6:49 pm

Thanks, let me try it.
Tom

Re: Filtering out Crazy Posts

Postby TBlack » Thu Nov 06, 2014 12:31 pm

"[A-Z0-9\-]{30}"

Quade, this works very well for filtering out a lot of the header garbage. Thanks for the tip!
Tom

