
a quick thought on extending NBP search

Posted: Thu Apr 27, 2006 9:17 am
by bobkoure
There are some folks who'd like NBP searching to be more like Google et al - and there are some of us who think it's fine as-is.

What happens if you take the current very simple syntax and add some kind of enclosing characters (parens, braces, square brackets, whatever) to enclose regular expressions?
For instance, -(RRR) would mean "exclude any posts that the regex RRR matches" (using whatever set of logical characters makes the most sense, and maybe even insisting that there be no whitespace between the logical character and the opening enclosing character). Oh, and no nesting. It should be simple to parse, would satisfy the "like Google" request, and by default search would continue to work as it does now.
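To make the idea concrete, here is a rough sketch in Python of how little parsing this would take. Everything in it is an assumption for illustration, not how NBP actually works: the names TOKEN, parse_query and matches are made up, parentheses are picked as the enclosing characters, and any term without a "-" in front is treated as required. Because there is no nesting, the regex itself cannot contain a ")".

```python
import re

# Hypothetical token pattern: an optional +/- prefix with no whitespace after it,
# followed by either a parenthesised regex (no nesting) or a plain word.
TOKEN = re.compile(r'([+-]?)(?:\(([^)]*)\)|(\S+))')

def parse_query(query):
    """Split a query into (prefix, compiled_pattern) pairs."""
    terms = []
    for m in TOKEN.finditer(query):
        prefix, regex_body, word = m.group(1), m.group(2), m.group(3)
        if regex_body is not None:
            # Enclosed text is used as a regex exactly as typed.
            pattern = re.compile(regex_body, re.IGNORECASE)
        else:
            # Plain words are escaped so they only match literally.
            pattern = re.compile(re.escape(word), re.IGNORECASE)
        terms.append((prefix, pattern))
    return terms

def matches(subject, terms):
    """Assumed semantics: '-' terms must not match, all other terms must match."""
    for prefix, pattern in terms:
        found = pattern.search(subject) is not None
        if prefix == '-' and found:
            return False
        if prefix != '-' and not found:
            return False
    return True
```

With this sketch, a query such as linux -(beta|rc\d+) keeps subjects that contain "linux" and drops any that also contain "beta" or "rc" followed by digits, while a query made only of plain words behaves as a plain-word search.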

Thoughts? Could be a very stupid idea - I haven't taken the time to really think it through.

Posted: Thu Apr 27, 2006 11:02 am
by Quade
It's not that it's a bad idea, it's just that I'm not sure how to implement it. Maybe after 5.2 comes out I'll get a chance to think about it. There are major changes to search coming though, more fundamental than just the syntax of the search query. It won't make 5.2, but it's coming pretty quickly afterwards.

Posted: Thu Apr 27, 2006 1:45 pm
by Smite
Not just search, but the find box as well.

Any delimiter would be safe (I suggest {} since they're so rare) as long as you searched for the matching close delimiter and ignored any internal ones.

Something like:
Code:
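for(int pos = 0; pos < searchstring.length; pos++)
{
  if(searchstring[pos] == ' ') continue;  // skip spaces between terms
  string term;
  switch(searchstring[pos])
  {
    case '+':  // term that must match
      term = getSearchTermStartingAt(pos+1);
      AndList.Add(term);
      pos++;  // step over the '+'
      break;
    case '-':  // term that must not match
      term = getSearchTermStartingAt(pos+1);
      NotList.Add(term);
      pos++;  // step over the '-'
      break;
    default:   // plain term
      term = getSearchTermStartingAt(pos);
      OrList.Add(term);
      break;
  }
  // Advance past the term. This assumes term.length is the raw length
  // consumed from searchstring (including any stripped quotes or braces).
  pos += term.length;
}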

foreach(term in OrList)
{
  AddToList(linesThatMatch(term));
}
etc

.
.
.

public string getSearchTermStartingAt(int pos)
{
  If the term starts with { and ends with the matching }, strip them and return the text between them as-is (it is already a RegEx).
  If the first char is ", find the next ", and return the text between them, escaped for RegEx.
  Otherwise, return the text between pos and the next space, escaped for RegEx.
}
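
For anyone who wants to try it, the pseudocode above could be fleshed out roughly as follows in Python. This is only a sketch under a few assumptions that are not settled in this thread: "matching close delimiter" is taken to mean counting brace depth (so a quantifier like {1,2} inside the regex is fine), backslash-escaped characters are skipped while counting, and a line is kept when it matches no NOT term, every AND term, and at least one OR term if there are any. The names get_search_term, parse and line_matches are made up for the example.

```python
import re

def get_search_term(query, pos):
    """Return (regex_text, end_pos) for the term starting at query[pos]."""
    if query[pos] == '{':
        # Raw regex: scan to the matching '}', ignoring nested and escaped braces.
        depth, i = 0, pos
        while i < len(query):
            ch = query[i]
            if ch == '\\':
                i += 1  # skip the escaped character
            elif ch == '{':
                depth += 1
            elif ch == '}':
                depth -= 1
                if depth == 0:
                    return query[pos + 1:i], i + 1
            i += 1
        raise ValueError("unclosed '{' in search string")
    if query[pos] == '"':
        # Quoted phrase: literal text up to the next quote.
        end = query.find('"', pos + 1)
        if end == -1:
            raise ValueError('unclosed quote in search string')
        return re.escape(query[pos + 1:end]), end + 1
    # Plain word: literal text up to the next space.
    end = query.find(' ', pos)
    if end == -1:
        end = len(query)
    return re.escape(query[pos:end]), end

def parse(query):
    """Sort the terms of a query into AND, NOT and OR lists."""
    and_list, not_list, or_list = [], [], []
    pos = 0
    while pos < len(query):
        ch = query[pos]
        if ch == ' ':
            pos += 1
            continue
        # A prefix only counts when the term follows it with no whitespace.
        if ch in '+-' and pos + 1 < len(query) and query[pos + 1] != ' ':
            term, pos = get_search_term(query, pos + 1)
            (and_list if ch == '+' else not_list).append(term)
        else:
            term, pos = get_search_term(query, pos)
            or_list.append(term)
    return and_list, not_list, or_list

def line_matches(line, and_list, not_list, or_list):
    """Apply the three lists to one line (case-insensitive, assumed semantics)."""
    def hit(term):
        return re.search(term, line, re.IGNORECASE) is not None
    if any(hit(t) for t in not_list):
        return False
    if not all(hit(t) for t in and_list):
        return False
    return not or_list or any(hit(t) for t in or_list)
```

Returning the end position together with the term sidesteps the problem of advancing by the length of a string that has already been stripped or escaped.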