On 06/02/12 08:26, Dominic Hargreaves wrote:
Hrm, although this does raise the point that W::T does rely on some quite complex metadata queries; performance on a collection of JSON/YAML will suck for that, so that's probably not as good an idea after all. I can't help saying 'sqlite' at this point...
For the legacy HTML thing, I suppose I /could/ put everything into a db, then take it out and put it back every time we have a new batch of tweaks to run. Using Lucy would essentially be a compromise on that: the 'content' stays lying around as flat files, but gets indexed on certain metadata. I'm sure one could construct something similar using a conventional db backend, putting only the metadata in it along with filenames in place of the content it currently stores.
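As a rough sketch of that metadata-only idea (not anything W::T or Lucy actually provides), the following uses Python's built-in sqlite3 module. The table name `metadata`, the helper names `build_index` and `files_with`, and the flat key/value layout are all assumptions made up for illustration; the content itself never enters the db, only filenames do.

```python
import sqlite3


def build_index(conn, docs):
    """Store metadata only. docs is an iterable of (filename, {key: value})."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        "filename TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL)"
    )
    # Index on (key, value) so metadata lookups do not scan every row.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS metadata_kv ON metadata (key, value)"
    )
    for filename, meta in docs:
        for key, value in meta.items():
            conn.execute(
                "INSERT INTO metadata (filename, key, value) VALUES (?, ?, ?)",
                (filename, key, value),
            )
    conn.commit()


def files_with(conn, key, value):
    """Return the sorted filenames whose metadata has key == value."""
    rows = conn.execute(
        "SELECT DISTINCT filename FROM metadata "
        "WHERE key = ? AND value = ? ORDER BY filename",
        (key, value),
    )
    return [row[0] for row in rows]
```

A query then hands back filenames, and the batch tweaks run against those flat files directly; the db only has to be rebuilt when the metadata changes.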
Another way to come at it, I guess, would be to use W::T in the usual way, but in addition generate an on-disk copy of the current revision of each document, almost like a cache.
Then if you ran a batch textual manipulation on those documents you could have a script to check for any document that's different from its most up-to-date version in the W::T database and update that version accordingly. That wouldn't be too bad.
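The checking script could look something like the sketch below. It is in Python and deliberately does not touch the real W::T API: `get_current` and `write_revision` are hypothetical callables standing in for whatever reads the latest revision and commits a new one, and the one-file-per-document `.txt` layout of the cache directory is likewise an assumption.

```python
from pathlib import Path


def sync_changed(cache_dir, get_current, write_revision):
    """Write edited cache files back as new revisions.

    get_current(name) returns the latest stored text for a document;
    write_revision(name, text) stores a new revision. Both are
    hypothetical hooks, not real Wiki::Toolkit calls.
    Returns the names of the documents that were updated.
    """
    updated = []
    for path in sorted(Path(cache_dir).glob("*.txt")):
        name = path.stem
        on_disk = path.read_text(encoding="utf-8")
        # Only documents whose cached copy differs get a new revision.
        if on_disk != get_current(name):
            write_revision(name, on_disk)
            updated.append(name)
    return updated
```

With a plain dict standing in for the database, this can be tried out as `sync_changed("cache", store.get, store.__setitem__)`. Documents that the batch run left untouched produce no new revision, so the history stays clean.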
On the other hand, all our documents are already under version control anyway... I suppose you could have a hybrid db / git backend where metadata, backlinks etc are kept in a .gitignored db but the revisions themselves are kept in git. Again, pretty convoluted!
--Ryan