[cgi-wiki-dev] Search indexer errors
Kate L Pugh
Fri, 28 Nov 2003 18:24:41 +0000
Earle Martin <email@example.com> wrote to openguides-dev:
> [Thu Nov 27 23:15:03 2003] index.cgi:
> Search::InvertedIndex::remove_index_from_group() - Corrupted database.
> Unable to find 'ged_000000000000_c_000000000523' record
> [Thu Nov 27 23:15:03 2003] index.cgi: at
> /home/earle/openguides.org/lib/CGI/Wiki/Search/SII.pm line 204
> This is for the London site. Should I be worried?
It's deja-vu all over again:
Search::InvertedIndex does throw these little errors from time to
time: about three times a year, which makes them very difficult
to track down.
So what can we do about it? CGI::Wiki only has two search backends so
far. One is the Search::InvertedIndex one, and the other is based on
DBIx::FullTextSearch, which only works on MySQL. There really is a
dearth of decent indexing/searching modules on CPAN. Does anyone know
of any that I've overlooked?
Simon Cozens is working on Plucene, a port of Lucene to Perl:
I've not properly looked into Lucene though, since it's written in
Java and I only found out about Plucene last night. Anyone here used it?
Another option would be rewriting Search::InvertedIndex to be simpler
and hence hopefully less likely to go wrong. I did a fair bit of work
towards this about a year or so ago, and sent a multitude of patches
to the author, but he is very busy and apologised profusely for not
having time to integrate them. I am not hopeful that they will ever
go into the distribution, but they might be useful for a simple
rewrite that could be released as Search::SimpleIndexer or
Search::InvertedIndex::Simple or something. I don't want to tread on
the guy's toes, and I don't even think he would see it that way,
especially as CGI::Wiki simply doesn't need much of the complexity
that's in the current distro.
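
To give a feel for how little a stripped-down indexer might need, here
is a rough sketch of an in-memory inverted index. It is in Python purely
for illustration (a real module would be Perl and would need persistent
storage), and every name in it (SimpleIndex, index_node, delete_node,
search) is hypothetical, not the Search::InvertedIndex or CGI::Wiki
API. The one design point it tries to show is that deleting an entry
that is not in the index is treated as a no-op rather than as a
corrupted database.

```python
import re
from collections import defaultdict


class SimpleIndex:
    """Hypothetical in-memory inverted index: word -> set of node names."""

    def __init__(self):
        # word -> names of the nodes whose content contains that word
        self._postings = defaultdict(set)
        # node name -> words currently indexed for it (makes deletion easy)
        self._words_of = {}

    @staticmethod
    def _tokenise(text):
        # Lowercase and split on anything that is not a letter or digit.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def index_node(self, name, content):
        # Re-indexing replaces whatever was stored for this node before.
        self.delete_node(name)
        words = self._tokenise(content)
        self._words_of[name] = words
        for word in words:
            self._postings[word].add(name)

    def delete_node(self, name):
        # A node that was never indexed is simply skipped, not an error.
        for word in self._words_of.pop(name, set()):
            nodes = self._postings.get(word)
            if nodes is None:
                continue
            nodes.discard(name)
            if not nodes:
                del self._postings[word]

    def search(self, query):
        # AND search: return the nodes that contain every word in the query.
        words = self._tokenise(query)
        if not words:
            return set()
        result = None
        for word in words:
            nodes = self._postings.get(word, set())
            result = set(nodes) if result is None else result & nodes
        return result
```

Keeping a second map from node name to its words is what makes
deletion independent of the state of the postings lists, at the cost
of storing each word set twice.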
The drawback of that approach (i.e. me writing a new search module) is
that I am already overwhelmed with programming tasks. I am deep in
the guts of both CGI::Wiki and OpenGuides this week, and I don't want
to start a major new project until the current work on those two
projects is released. I *may* be able to break the back of the
current stuff this weekend.
Is anyone else interested in a Search::InvertedIndex rewrite? You
could write code, docs, tests or all three.