[cgi-wiki-dev] Formatter API change proposal

Kjetil Kjernsmo cgi-wiki-dev@earth.li
Mon, 24 Jan 2005 17:38:58 +0100

On mandag 24 januar 2005, 17:09, Kake L Pugh wrote:
> On Sun 23 Jan 2005, Kjetil Kjernsmo <kjetil@kjernsmo.net> wrote:
> > In other cases, you can't reasonably expect to get just a fragment,
> > you'll get the whole document, even if you're just interested in a
> > fragment. I'd like to create a Formatter around HTML::Tidy RSN, and
> > while libtidy can format fragments, HTML::Tidy can't yet, AFAIK.
> I'm not sure that the limitations of a helper module are a good thing
> to base an API on.  You seem to be saying that because HTML::Tidy
> can't be told not to add <head>, <body>, etc to things, then a
> Formatter using it as a helper module must always return a full HTML
> document regardless of whether this is appropriate.

Ah, not really. The key is just that you need to know what the Formatter 
will do. Currently, what the format method returns is undefined, and 
that's not good enough. So it is not the shortcomings of a helper module 
as such; it is different use cases combined with the limitations one has 
to expect from different helper modules.

That's why I propose to have a document and a fragment method. In the 
case of HTML::Tidy, format will naturally return a document, and so 
will document, whereas fragment will return undef. In the case of 
Textile, format will naturally return a fragment, and so will fragment. 
document will return 

> Or perhaps I'm misunderstanding, since you mention the "fragment"
> method.  But if you're going to provide a method to return a
> fragment, the limitations of whatever module you get to do the heavy
> lifting aren't relevant, surely.

Well, they could be: if it can't return a fragment, the behaviour has to 
be defined...
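
To make the proposed contract concrete, here is a minimal sketch. It is
written in Python purely for illustration (the real Formatters are Perl
modules), the class names are hypothetical, and neither class calls the
real HTML::Tidy or Textile. None stands in for Perl's undef. What a
fragment-only formatter should return from document is not settled here,
so the sketch leaves that method out.

```python
class TidyLikeFormatter:
    """Hypothetical stand-in for a Formatter whose helper can only
    produce whole documents."""

    def document(self, raw):
        # Wrap the input in a bare-bones document skeleton.
        return ("<html><head><title></title></head><body>"
                + raw + "</body></html>")

    def format(self, raw):
        # The natural output of this formatter is a whole document.
        return self.document(raw)

    def fragment(self, raw):
        # Cannot produce a fragment, so say so explicitly
        # (this would be undef in Perl).
        return None


class TextileLikeFormatter:
    """Hypothetical stand-in for a Formatter that naturally produces
    fragments."""

    def fragment(self, raw):
        # Placeholder for real markup conversion.
        return "<p>" + raw + "</p>"

    def format(self, raw):
        # The natural output of this formatter is a fragment.
        return self.fragment(raw)

    # document() is deliberately not defined in this sketch.
```

The point is only that a caller can ask for fragment and get either a
fragment or a well-defined "can't do that" answer, rather than guessing
what format happened to return.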

> I've just read the HTML::Tidy docs at
>   http://cpan.uwinnipeg.ca/htdocs/HTML-Tidy/HTML/Tidy.html
> and as far as I can tell what it does is check the syntax of an HTML
> document.  I can't see how to make it spit out HTML.  What am I
> missing?

Ah, I think that's an old version...? Anyway, the HTML::Tidy clean 
method can clean up the string.

> > For an AxKit Provider, you would usually want to return a document,
> > since, in principle it can be served directly to the user.
> I don't actually know how AxKit works, but it seems to me that a
> Formatter can only ever create a really basic HTML document.  


> How is 
> it to know things like keywords, description, stylesheet, RDF
> auto-discovery whatsit, etc, to put in the <head>? 

It can't... But often, that's the best you can do with the information 
you have available. 

> What can it be 
> reasonably expected to do other than stick in a really basic
> <html><head><title>Whatever the title was set as in the
> input</title></head><body> at the start of the output and
> </body></html> at the end?  Is that ever really going to be useful?

If it is the best you've got, it is better than nothing... :-) 

Basically, what I do in AxKit is that I record a lot of metadata and 
classification information independently of the text itself. So, that is 
not really the issue. The issue is just that I need to know whether I 
get just the fragment (something that can be inserted in the body 
element) or the entire document. 

In my case, I will always need just the fragment, since all the 
metadata, titles, authors and stuff come from other sources. In the 
cases where I can't get just the fragment, I want to know; in those 
cases, I'll use XSLT to extract the contents of the body element. I 
just need to know what I get. As for the AxKit provider, I found that 
it is probably easier to always have it return a document; people can 
use that standalone if they wish, and I'll always use XSLT to get my 
fragment. But it is really quite irrelevant... :-)
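
As a rough illustration of that fallback, the sketch below pulls the
contents of the body element out of a whole document. It is not the XSLT
approach described above: it swaps in Python's xml.etree.ElementTree,
and it assumes the document is well-formed XML without namespaces. The
function name body_contents is hypothetical.

```python
import xml.etree.ElementTree as ET


def body_contents(document):
    """Return the markup inside <body> as a string.

    Assumes `document` is well-formed XML with a non-namespaced
    <body> element directly under the root.
    """
    root = ET.fromstring(document)
    body = root.find("body")
    if body is None:
        raise ValueError("document has no body element")
    # Text directly inside <body>, before its first child element.
    parts = [body.text or ""]
    for child in body:
        # tostring() serialises the child and any text that follows it.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)
```

With something like this in place, a consumer that only ever wants the
fragment can accept a whole document and strip it down itself.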

> This is not meant to pour scorn on your idea, but to explain how I
> see it at the moment and hopefully provoke comments from others.

Oh, no problem! Thanks for the opportunity to clarify the ideas!


Kjetil Kjernsmo
Astrophysicist/IT Consultant/Skeptic/Ski-orienteer/Orienteer/Mountaineer
kjetil@kjernsmo.net  webmaster@skepsis.no  editor@learn-orienteering.org
Homepage: http://www.kjetil.kjernsmo.net/        OpenPGP KeyID: 6A6A0BBC