
Extracting filenames from the subject

Posted: Fri Dec 27, 2002 5:57 pm
by GrindKore
Hello, hopefully some RegExp expert here can help me...

NOTE: This has nothing to do with NewsBin Pro functionality, but rather with a personal mod I made to let me further extend certain features.

I wrote a small VBScript that parses *.spool files and populates an Access database. The script runs every night and updates the database with the latest record sets like clockwork.

What I need is a hint as to how the "Show Filename" filter works.
I have constructed different RegExp patterns to isolate the filename string from the post subject, but cannot come up with a consistent way of getting the filename out of the wide variety of subject formats. I understand that it's not going to be 100% perfect every time due to the lack of a standard, but hopefully I can get it to work like NBP4.

RE: Extracting filenames from the subject

Posted: Fri Dec 27, 2002 6:42 pm
by dexter
Off the top of my head, there are two basic ways. Both start from the right and work backwards to make a best guess at the filename. If there is quoted text, it just uses all the quoted text as the filename. If there is no quoted text, it grabs all the characters back to the first whitespace character. For example:

([a-zA-Z0-9,._-]+\([0-9]+/[0-9]+\)$) for non-quoted and
(".+"\([0-9]+/[0-9]+\)$) for quoted

These are close to the real ones in NewsBin, but there is a little more logic involved. Maybe when Quade gets back from vacation, he'll give more details. Let me know if you need an English version of how these expressions work.
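
For anyone who wants to try the two-pattern idea outside of NewsBin, here is a minimal sketch in Python. It is not NewsBin's actual logic, only an illustration of "quoted first, then fall back to the last non-whitespace run". The function name extract_filename is made up for this example, and allowing optional whitespace before the "(n/m)" part counter is an assumption that the patterns above do not make. The hyphen is placed last in the character class so that it is read as a literal hyphen rather than as a range.

```python
import re

# Trailing part counter such as "(1/10)", anchored at the end of the subject.
# Assumption: optional whitespace is allowed before the counter.
PART = r"\s*\(\d+/\d+\)\s*$"

# Quoted form: the text between the double quotes just before the counter.
QUOTED = re.compile(r'"([^"]+)"' + PART)

# Unquoted form: the run of filename characters just before the counter.
# The hyphen comes last in the class so it is a literal, not a range.
UNQUOTED = re.compile(r"([A-Za-z0-9,._-]+)" + PART)


def extract_filename(subject):
    """Return a best-guess filename from a subject line, or None."""
    # Try the quoted pattern first, then fall back to the unquoted one.
    for pattern in (QUOTED, UNQUOTED):
        match = pattern.search(subject)
        if match:
            return match.group(1)
    return None


print(extract_filename('Some post - "my file.rar" (1/10)'))
print(extract_filename('Some post - file.part01.rar (03/25)'))
```

The first call prints my file.rar and the second prints file.part01.rar. A subject with no trailing part counter returns None, so the caller can decide what to do with posts that do not fit either form.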

RE: Extracting filenames from the subject

Posted: Sat Dec 28, 2002 4:43 pm
by GrindKore
Thanks dexter, your help is appreciated.

RE: Extracting filenames from the subject

Posted: Mon Dec 30, 2002 2:46 pm
by Quade
One problem I have with extracting filenames is that embedded spaces will always screw you up unless the post is yEnc encoded. yEnc encoded posts use quoting around the filenames.
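
The embedded-space problem is easy to reproduce. The sketch below (Python, with made-up subject lines and a hypothetical helper called guess) runs an unquoted-style pattern against a filename containing spaces, then a quoted-style pattern against the same name in quotes. The patterns are assumptions modelled on the ones earlier in the thread, not the real NewsBin expressions.

```python
import re

# Assumed patterns: filename followed by a "(n/m)" part counter at the end.
UNQUOTED = re.compile(r"([A-Za-z0-9,._-]+)\s*\(\d+/\d+\)\s*$")
QUOTED = re.compile(r'"([^"]+)"\s*\(\d+/\d+\)\s*$')


def guess(pattern, subject):
    """Return the captured filename, or None if the pattern does not match."""
    match = pattern.search(subject)
    return match.group(1) if match else None


# Hypothetical subjects: the same filename, without and with quotes.
unquoted_subject = 'holiday pics - my summer trip.jpg (1/3)'
quoted_subject = 'holiday pics - "my summer trip.jpg" (1/3)'

print(guess(UNQUOTED, unquoted_subject))  # trip.jpg (everything before the last space is lost)
print(guess(QUOTED, quoted_subject))      # my summer trip.jpg
```

Without quotes there is nothing in the subject to say where the filename starts, so working backwards stops at the last space. With quotes the whole name comes through.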