Hi all!
The Formatter API was designed on #openguides and here a few months ago. I just released Apache::AxKit::Provider::File::Formatter to make it easy to use any Formatter formatted file within AxKit, but I bumped into a problem that I would like your input on:
All the current Formatters simply return HTML fragments from the format method. That is, the stuff they return is something that you would stick in the body element of an HTML document. That's what you'd want in most cases.
In other cases, you can't reasonably expect to get just a fragment, you'll get the whole document, even if you're just interested in a fragment. I'd like to create a Formatter around HTML::Tidy RSN, and while libtidy can format fragments, HTML::Tidy can't yet, AFAIK. For an AxKit Provider, you would usually want to return a document, since, in principle it can be served directly to the user.
So, I think we need to do something that distinguishes fragments from entire documents.
In the original API, I note the following things:
1) the format method did not define its meaning w.r.t. fragment vs. whole document. 2) To retrieve things like title, one needs to first call the format method to do the hard work, then the title method.
I suppose it is a Good Thing for those two points to stay the same.
Since we already have to call additional methods to get things like title, I think the most reasonable way to distinguish between fragment and document is to make two additional methods with those names.
format would have to be called first, and is free to return what is most reasonable, for Formatter::HTML::Textile, that would be a fragment, like it is now, and for a HTML::Tidy based Formatter::XHTML::HTML it would be a document.
If the user needs to be sure to get a fragment or document, s/he would have to call fragment or document on the object _after_ having called format.
How does this sound?
Then, we need to define what happens if the object cannot return either a fragment or document. I guess the choices are 1) croak, 2) warn the user that the method is not implemented and return undef, or 3) just return undef.
If it just returns undef, this should be noted in the documentation of each Formatter. One should be able to meaningfully implement something that uses different Formatters without having the code die in a special case. OTOH, one would perhaps want to know about it, so a warning may be in order. But then, the warning may just be an annoyance too, and for those who are surprised by a Formatter returning undef in some cases, well, they should have RTFMed... ;-)
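To make the three choices concrete, here is a minimal sketch. The package name Sketch::XHTML::HTML and the on_missing option are made up for illustration only; nothing like them exists in the proposal. It assumes a formatter that can only produce whole documents.

```perl
use strict;
use warnings;

# Hypothetical formatter that cannot produce a fragment.
package Sketch::XHTML::HTML {
    use Carp qw(croak carp);

    sub new {
        my ($class, %args) = @_;
        # 'on_missing' is an illustrative switch, not part of any spec.
        return bless { on_missing => $args{on_missing} // 'undef' }, $class;
    }

    sub fragment {
        my ($self) = @_;
        my $policy = $self->{on_missing};
        croak "fragment is not implemented" if $policy eq 'croak';    # choice 1
        carp  "fragment is not implemented" if $policy eq 'warn';     # choice 2
        return;    # choices 2 and 3: undef in scalar context
    }
}

# Caller side: with choices 2 and 3 the return value has to be checked.
my $formatter = Sketch::XHTML::HTML->new( on_missing => 'undef' );
my $fragment  = $formatter->fragment;
print defined $fragment ? $fragment : "no fragment available\n";
```

With choice 1 the caller needs an eval block around the call instead of a defined check, which is the "code dies in a special case" situation described above.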
I'd love to hear your input on these things.
Cheers,
Kjetil
Hi again!
After a bit of discussion on IRC, it became clear that it is not that clear what the Formatter API _is_. Understandably, since the only real documentation has been in my Formatter::HTML::Preformatted, and that's not really elaborate either.
So, I sat down and wrote a first draft POD of the specification, as it currently is. I'll write up a specification like I envision it with fragment and document, plus some other stuff, and send that too in a moment. After lunch... :-)
I'll paste it here, and attach the POD in case someone prefers to read that or even modify it. Also, I don't remember everyone who participated in the discussion we had, please kick me if you feel left out... :-)
NAME
    Formatter - The Formatter API specification

VERSION
    0.1

SYNOPSIS
    Formatters are Perl modules conforming to the following specification. Formatters are intended to assist the conversion between different markup syntaxes.

INTRODUCTION
    The basic idea of Formatters is to have a simple and standard way to convert from one format to another. This is a common problem across many applications, and so, a simple API for all applications to use is desirable.

    Formatters generally operate on strings. For example, you have a plain text string, possibly with a bit of syntax, and you want to convert it to HTML. You will simply use the appropriate Formatter module, and call the "format" method on it, with the text string as parameter. The HTML will be returned.

    In many cases, the Formatter will be a thin wrapper around a different module which does the hard work.

DESCRIPTION
  Module naming convention
    A Formatter module should be named with the format it is converted to first, then the format it is converted from. For example, the module Formatter::HTML::Textile will convert from the Textile syntax to HTML.

  Methods
    "new"
        The constructor, nothing special.

    "format($string)"
        The main formatter. Takes a string with the text that one wants converted and returns the converted text.

        Must call the constructor itself if it is invoked on the class name rather than on an object.

    "links($string)"
        Should return all links found in the input plain text string as a list.

    "title"
        Should return the title of the document or "undef" if none can be found.

  Inheritance from other modules
    A Formatter module may inherit methods from other modules, to aid setting syntax-specific parameters.

AUTHOR
    Kjetil Kjernsmo, kjetilk@cpan.org

ACKNOWLEDGEMENTS
    The Formatter API was originally conceived on the openguides channel on irc.perl.org. In particular, Tom Insam was an important architect of the API.

COPYRIGHT AND LICENSE
    Copyright (C) 2004 by Kjetil Kjernsmo

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
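For readers who prefer code to prose, a module following the 0.1 draft could look roughly like the sketch below. The package name Sketch::HTML::Plain is hypothetical, and the way it finds links (anything that looks like an http or https URL) and the title (first non-empty line) are assumptions made for the example; the draft does not prescribe either.

```perl
use strict;
use warnings;

# Hypothetical module, named To::From as in the draft: to HTML, from plain text.
package Sketch::HTML::Plain {
    sub new {
        my ($class) = @_;
        return bless {}, $class;
    }

    sub format {
        my ($self, $string) = @_;
        $self = $self->new unless ref $self;    # invoked on the class name
        $self->{input} = $string;
        # Escape the characters that are special in HTML.
        my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );
        (my $escaped = $string) =~ s/([&<>])/$entity{$1}/g;
        return "<pre>$escaped</pre>";
    }

    sub links {
        my ($self, $string) = @_;
        # Assumption: a link is anything that looks like an http(s) URL.
        my @links = $string =~ m{(https?://\S+)}g;
        return @links;
    }

    sub title {
        my ($self) = @_;
        return undef unless defined $self->{input};
        # Assumption: the first non-empty line of the input is the title.
        my ($title) = $self->{input} =~ /^\s*(\S.*?)\s*$/m;
        return $title;
    }
}

my $formatter = Sketch::HTML::Plain->new;
my $text      = "Notes & links\nSee http://example.org/ for <more>.";
my $html      = $formatter->format($text);
my @links     = $formatter->links($text);
my $title     = $formatter->title;
print "$html\n";
```

Note that calling format on the class name returns only the string, so the implicitly created object is gone afterwards and title cannot be asked of it.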
Cheers,
Kjetil
On Monday 24 January 2005, 14:36, Kjetil Kjernsmo wrote:
I'll write up a specification like I envision it with fragment and document, plus some other stuff, and send that too in a moment.
Uhm, right. Here we go, this is my proposal for a draft (i.e. v0.9) of the Formatter API:
NAME
    Formatter - The Formatter API specification

VERSION
    0.9

SYNOPSIS
    Formatters are Perl modules conforming to the following specification. Formatters are intended to assist the conversion between different markup syntaxes.

INTRODUCTION
    The basic idea of Formatters is to have a simple and standard way to convert from one format to another. This is a common problem across many applications, and so, a simple API for all applications to use is desirable.

    Formatters generally operate on strings. For example, you have a plain text string, possibly with a bit of syntax, and you want to convert it to HTML. You will simply use the appropriate Formatter module, and call the "format" method on it, with the text string as parameter. The HTML will be returned.

    In many cases, the Formatter will be a thin wrapper around a different module which does the hard work.

DESCRIPTION
  Module naming convention
    A Formatter module should be named with the format it is converted to first, then the format it is converted from. For example, the module Formatter::HTML::Textile will convert from the Textile syntax to HTML.

  Methods
    "new"
        The constructor must be implemented, and return a reference "bless"ed into this class. There are no other requirements on the constructor, but implementors will find it useful to have a field for the string that is passed to "format".

    "format($string)"
        This method shall initialize the formatter. As argument it must take a string with the text that one wants converted. The "format" method must allow the user to call "format" on the class name, and call the constructor implicitly in that case.

        This method must return the converted string. It is however free to return a document fragment or a full document based on what is most appropriate for the module. See the "document" and "fragment" methods.

    "document"
        The "document" method may be called on the object after it has been initialized with the "format" method. It must return a full document if it can be produced or "undef" if not. For example, if it converts to HTML, it must return a full, valid HTML document.

    "fragment"
        The "fragment" method may be called on the object after it has been initialized with the "format" method. It shall only return a minimal fragment of the converted text; as little markup as possible shall be added to the fragment. In the example of HTML, it must not return a full document, and should only produce markup that can be inserted in an HTML "body" element. If it is unable to produce just the minimal markup, it must return "undef".

    "links"
        This method may be called on the object after it has been initialized with the "format" method. Should return all links found in the input plain text string as a list, or an empty list if none can be found.

    "title"
        This method may be called on the object after it has been initialized with the "format" method. Should return the title of the document or "undef" if none can be found.

  Inheritance from other modules
    A Formatter module may inherit methods from other modules. It may inherit all the methods mentioned above if they exist in a suitable parent class, and also other methods, to aid setting syntax-specific parameters.

    Formatter module implementors are encouraged to contact the API author(s) to discuss methods that should be included in the API.

AUTHOR
    Kjetil Kjernsmo, kjetilk@cpan.org

ACKNOWLEDGEMENTS
    The Formatter API was originally conceived on the openguides channel on irc.perl.org. In particular, Tom Insam was an important architect of the API.

EXAMPLES
    The module Formatter::HTML::Preformatted contains a minimal Formatter by the author of the specification.

COPYRIGHT AND LICENSE
    Copyright (C) 2005 by Kjetil Kjernsmo

    This specification can be redistributed and/or modified under the same terms as Perl itself. The author asks that only modules conformant with the specification use the Formatter:: namespace.
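The call order in the 0.9 draft (format first, everything else afterwards) might be easier to judge from a small sketch. Sketch::HTML::Paragraphs is a hypothetical module that treats blank-line separated blocks as paragraphs and the first block as the title; it assumes the input needs no HTML escaping, and none of these choices come from the draft itself.

```perl
use strict;
use warnings;

package Sketch::HTML::Paragraphs {
    sub new { my ($class) = @_; return bless {}, $class }

    # Initializes the object and returns the converted string (a fragment here).
    sub format {
        my ($self, $string) = @_;
        $self = $self->new unless ref $self;    # called on the class name
        my @paragraphs = grep { /\S/ } split /\n{2,}/, $string;
        $self->{title}    = $paragraphs[0];
        $self->{links}    = [ $string =~ m{(https?://\S+)}g ];
        $self->{fragment} = join "\n", map( "<p>$_</p>", @paragraphs );
        return $self->{fragment};
    }

    # The remaining methods only have something to return after format.
    sub fragment { return $_[0]{fragment} }
    sub title    { return $_[0]{title} }
    sub links    { return @{ $_[0]{links} || [] } }

    sub document {
        my ($self) = @_;
        return undef unless defined $self->{fragment};
        my $title = $self->title // '';
        return "<html><head><title>$title</title></head>"
             . "<body>$self->{fragment}</body></html>";
    }
}

my $formatter = Sketch::HTML::Paragraphs->new;
my $converted = $formatter->format("Title line\n\nBody text");
print $formatter->document, "\n";
```

Before format has been called, fragment, title and document all return undef and links returns an empty list in this sketch, which is one possible reading of "may be called on the object after it has been initialized".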
So, how does that sound?
Cheers,
Kjetil
On Sun 23 Jan 2005, Kjetil Kjernsmo kjetil@kjernsmo.net wrote:
In other cases, you can't reasonably expect to get just a fragment, you'll get the whole document, even if you're just interested in a fragment. I'd like to create a Formatter around HTML::Tidy RSN, and while libtidy can format fragments, HTML::Tidy can't yet, AFAIK.
I'm not sure that the limitations of a helper module are a good thing to base an API on. You seem to be saying that because HTML::Tidy can't be told not to add <head>, <body>, etc to things, then a Formatter using it as a helper module must always return a full HTML document regardless of whether this is appropriate.
Or perhaps I'm misunderstanding, since you mention the "fragment" method. But if you're going to provide a method to return a fragment, the limitations of whatever module you get to do the heavy lifting aren't relevant, surely.
I've just read the HTML::Tidy docs at http://cpan.uwinnipeg.ca/htdocs/HTML-Tidy/HTML/Tidy.html and as far as I can tell what it does is check the syntax of an HTML document. I can't see how to make it spit out HTML. What am I missing?
For an AxKit Provider, you would usually want to return a document, since, in principle it can be served directly to the user.
I don't actually know how AxKit works, but it seems to me that a Formatter can only ever create a really basic HTML document. How is it to know things like keywords, description, stylesheet, RDF auto-discovery whatsit, etc, to put in the <head>? What can it be reasonably expected to do other than stick in a really basic <html><head><title>Whatever the title was set as in the input</title></head><body> at the start of the output and </body></html> at the end? Is that ever really going to be useful?
This is not meant to pour scorn on your idea, but to explain how I see it at the moment and hopefully provoke comments from others.
Kake
On Monday 24 January 2005, 17:09, Kake L Pugh wrote:
On Sun 23 Jan 2005, Kjetil Kjernsmo kjetil@kjernsmo.net wrote:
In other cases, you can't reasonably expect to get just a fragment, you'll get the whole document, even if you're just interested in a fragment. I'd like to create a Formatter around HTML::Tidy RSN, and while libtidy can format fragments, HTML::Tidy can't yet, AFAIK.
I'm not sure that the limitations of a helper module are a good thing to base an API on. You seem to be saying that because HTML::Tidy can't be told not to add <head>, <body>, etc to things, then a Formatter using it as a helper module must always return a full HTML document regardless of whether this is appropriate.
Ah, not really. The key is just that you need to know what the Formatter will do. Currently, the format method is undefined, and that's not good enough. So, it is not the shortcomings of a helper module as such, it is different use-cases combined with the limitations one has to expect from different helper modules.
That's why I propose to have a document and a fragment method. In the case of HTML::Tidy, format will naturally return a document, and so will document, whereas fragment will return undef. In the case of Textile, format will naturally return a fragment, and so will fragment. document will return <html><head><title>$self->title</title></head><body>$self->fragment</body></html>
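As a rough sketch of the Textile-like case, the wrapping could live in a shared parent class so that any fragment-producing Formatter inherits document. Both package names below are hypothetical, and the asterisk-to-em conversion is only a stand-in for a real syntax such as Textile.

```perl
use strict;
use warnings;

# Hypothetical parent class: derives document() from title() and fragment().
package Sketch::DocumentFromFragment {
    sub document {
        my ($self) = @_;
        my $fragment = $self->fragment;
        return undef unless defined $fragment;
        my $title = $self->title // '';
        # Method calls are not interpolated inside double quotes, so the
        # values are fetched first and then concatenated.
        return '<html><head><title>' . $title . '</title></head><body>'
             . $fragment . '</body></html>';
    }
}

# Stand-in for a Textile-like formatter that naturally yields a fragment.
package Sketch::HTML::Emphasis {
    our @ISA = ('Sketch::DocumentFromFragment');

    sub new { my ($class) = @_; return bless {}, $class }

    sub format {
        my ($self, $string) = @_;
        $self = $self->new unless ref $self;
        (my $html  = $string) =~ s{\*(.+?)\*}{<em>$1</em>}g;
        (my $plain = $string) =~ tr/*//d;    # assumption: title is the bare text
        @{$self}{qw(title fragment)} = ( $plain, "<p>$html</p>" );
        return $self->{fragment};
    }

    sub fragment { return $_[0]{fragment} }
    sub title    { return $_[0]{title} }
}

my $formatter = Sketch::HTML::Emphasis->new;
$formatter->format('the *foo* of bar');
print $formatter->document, "\n";
```

A Tidy-like Formatter would do the opposite: keep the whole document it gets from its helper, return it from both format and document, and return undef from fragment.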
Or perhaps I'm misunderstanding, since you mention the "fragment" method. But if you're going to provide a method to return a fragment, the limitations of whatever module you get to do the heavy lifting aren't relevant, surely.
Well, it could be, if it can't return a fragment, the behaviour has to be defined...
I've just read the HTML::Tidy docs at http://cpan.uwinnipeg.ca/htdocs/HTML-Tidy/HTML/Tidy.html and as far as I can tell what it does is check the syntax of an HTML document. I can't see how to make it spit out HTML. What am I missing?
Ah, I think that's an old version....? Anyway, the HTML::Tidy clean method can clean up the string.
For an AxKit Provider, you would usually want to return a document, since, in principle it can be served directly to the user.
I don't actually know how AxKit works, but it seems to me that a Formatter can only ever create a really basic HTML document.
Yup.
How is it to know things like keywords, description, stylesheet, RDF auto-discovery whatsit, etc, to put in the <head>?
It can't... But often, that's the best you can do with the information you have available.
What can it be reasonably expected to do other than stick in a really basic
<html><head><title>Whatever the title was set as in the input</title></head><body> at the start of the output and </body></html> at the end? Is that ever really going to be useful?
If it is the best you've got, it is better than nothing... :-)
Basically, what I do in AxKit is that I record a lot of metadata and classification information independently of the text itself. So, that is not really the issue. The issue is just that I need to know if I get just the fragment, something that can be inserted in the body element, or the entire document.
In my case, I will always need just the fragment, since all the metadata, titles, authors and stuff come from other sources. In the cases where I can't get just the fragment, I want to know; in those cases, I'll use XSLT to extract the contents of the body element. I just need to know what I get. As for the AxKit provider, I found that it is probably easier to always have it return a document; people can use that standalone if they wish, and I'll always use XSLT to get my fragment. But it is really quite irrelevant... :-)
This is not meant to pour scorn on your idea, but to explain how I see it at the moment and hopefully provoke comments from others.
Oh, no problem! Thanks for the opportunity to clarify the ideas!
Cheers,
Kjetil
Or perhaps I'm misunderstanding, since you mention the "fragment" method. But if you're going to provide a method to return a fragment, the limitations of whatever module you get to do the heavy lifting aren't relevant, surely.
Well, it could be, if it can't return a fragment, the behaviour has to be defined...
I would imagine it can't be too difficult to have the formatter turn an HTML document into a fragment? (Am I missing something?)
Justin
On Mon 24 Jan 2005, Kjetil Kjernsmo kjetil@kjernsmo.net wrote:
Currently, the format method is undefined, and that's not good enough. [...]
In the case of HTML::Tidy, format will naturally return a document, [...] In the case of Textile, format will naturally return a fragment, [...]
This feels to me as though it's breaking the main strength of the whole idea of using Formatter instead of just using the underlying modules - a standard API. Surely it would be best to have ->format act in the same way whichever Formatter module you were using? If you can't rely on it to always return a document, or always return a fragment, then you lose the possibility of blithely swapping from one Formatter to another without making changes in the code.
How about you standardise that ->format will always return a fragment, and ->format_as_document will always return a document?
Kake
On Monday 24 January 2005, 18:40, justin+earth@dicatek.com wrote:
Or perhaps I'm misunderstanding, since you mention the "fragment" method. But if you're going to provide a method to return a fragment, the limitations of whatever module you get to do the heavy lifting aren't relevant, surely.
Well, it could be, if it can't return a fragment, the behaviour has to be defined...
I would imagine it can't be too difficult to have the formatter turn an HTML document into a fragment? (Am I missing something?)
Hmmm, come to think of it, the current spec is too tied to HTML. The idea is to allow any format. So, yeah, you could strip some elements, and get a fragment.
But the idea with a fragment is that you shouldn't add more than you can defend. It is a minimum, leaving the rest to the calling application. Then, document to fragment is not so clearly defined.
It would be nice to get a fragment anyway, that's true. But the question is if it is a good idea for the Formatter to try or if it is better to tell the app that "this is not clearly defined", and the app may call the document instead, and do it its own way. I think I would prefer the latter.
The alternative is to define on a per-format basis what is meant by a "fragment". While it is doable, I do prefer the idea of a minimal fragment. Or?
Cheers,
Kjetil
On Monday 24 January 2005, 19:26, Kake L Pugh wrote:
This feels to me as though it's breaking the main strength of the whole idea of using Formatter instead of just using the underlying modules - a standard API. Surely it would be best to have ->format act in the same way whichever Formatter module you were using? If you can't rely on it to always return a document, or always return a fragment, then you lose the possibility of blithely swapping from one Formatter to another without making changes in the code.
Well, it is the whole reason why I started on this line of thought: I need to rely on a function to do the right thing, so it is swappable... :-)
You wouldn't lose anything, you can rely on ->document returning a document and ->fragment returning a fragment, that's the idea of the current proposal anyway. ->format($string) initialises the formatter and returns whatever, in the case where you don't care what you get.
I think the real controversy here is actually the problem Justin brought up: That you can't always rely on a fragment being returned. I presume that it is very rare that you can't rely on a document being returned, but such situations might exist.
In the current proposal, I have turned to this problem and said that it must return undef in these rare situations. However, I'm open to the idea that the formatter must make a "best effort" to return a document or fragment, also in the case where it is not clearly defined what the document or fragment may be.
When pushed, I might even agree it is the most sensible thing to do, so that the calling application wouldn't need to care about error situations.
The problem with this approach is that the quality of the returned matter is lower than you might expect, and you wouldn't discover it. However, I guess it is not so much of a concern, since we'll mostly work on syntaxes that aren't very rich to begin with. Most of the matter is relatively crude, and the gain from not having to deal with the error situations might be greater than the loss of predictability as to what constitutes a fragment.
How about you standardise that ->format will always return a fragment, and ->format_as_document will always return a document?
Well, that's a naming problem mostly... For one thing, the format method is different, it takes the string to be converted as argument. To now define that it should only return fragments would change its semantics from the previous version, admittedly with no effect, since all existing Formatters return fragments.
That's why I felt defining two new methods would make more sense, fragment and document will give you exactly what you need in terms of swappability. The difficult point is what you should do in the case where it is not completely clear what a fragment or document _is_. Here, perhaps undef is not the Right Thing[tm] to return.
Another difficulty with ->format_as_document is that one would have to clearly define if it should initialise like format does, or if one should call ->format first and then ->format_as_document. And if you should call either before calling things like ->links and ->title... I think I prefer to have a single initialising method. In fact, I thought about changing the constructor and dropping new(), but I decided against that, to allow using a parent's methods like ->charset, which is something we may want to define too...
Great discussion!
Cheers,
Kjetil
Given that jerakeen is taking a break from all this (correct me if I'm wrong), I'm pretty busy, and I don't know of anyone else intending to code for Wiki::Toolkit in the near future, perhaps we're engaging in premature specification?
In other words, I'm proposing either sticking to/clarifying the CGI::Wiki formatter API, or just implementing something without having it be definitive...
Justin
P.S.
I think the real controversy here is actually the problem Justin brought up: That you can't always rely on a fragment being returned. I presume
Er, was that someone else? Either that or I have no idea what I'm talking about from one minute to the next :)
On Monday 24 January 2005, 21:10, justin+earth@dicatek.com wrote:
Given that jerakeen is taking a break from all this (correct me if I'm wrong), I'm pretty busy, and I don't know of anyone else intending to code for Wiki::Toolkit in the near future, perhaps we're engaging in premature specification?
Oh, I'm coding on AxKit and TABOO, so I'm on a totally different project really. The reason why I'm taking it here is that jerakeen was interested in it, and that it is something that is interesting across many projects and applications.
Also, I really need this sorted out, and I'm under a lot of pressure, so I have to move along to other topics tomorrow at noon, at which point I need the Formatter:: API to be sorted out... :-( I really hate to rush stuff like this, but Real World demands are taking over... I'm not usually this stressed out, but I am now...
So, I have to move along, but I hate to go anywhere without you, since that would mean I engage in a sort of insular behaviour that is not going to attract people to write Formatter:: modules... :-)
In other words, I'm proposing either sticking to/clarifying the CGI::Wiki formatter API, or just implementing something without having it be definitive...
I think jerakeen's idea was to move over to this Formatter API. We discussed this when I came along, I looked at the CGI::Wiki formatter API, found it very interesting but also too tied to CGI::Wiki for my purpose, and jerakeen and I started on this work to make something that could be used across both projects.
I think the real controversy here is actually the problem Justin brought up: That you can't always rely on a fragment being returned. I presume
Er, was that someone else? Either that or I have no idea what I'm talking about from one minute to the next :)
:-D I guess it was just that you put me on that track by saying that it would be easy to get a HTML fragment from a full HTML document... :-)
Cheers,
Kjetil
OK, continuing the tradition of following up on myself, I've thought some more about this and...
I wrote:
When pushed, I might even agree it is the most sensible thing to do, so that the calling application wouldn't need to care about error situations.
I figured it was a good idea to require that it must return a fragment or document. I have rewritten the spec to reflect that. Also, I have moved the format-specific things and some discussion into a sub-section of its own.
------ Spec excerpt ------
This method must return the converted string. It is however free to return a document fragment or a full document based on what is most appropriate for the module. A user who needs to be sure to retrieve either must call the C<document> or C<fragment> method afterwards.
=item C<document>
The C<document> method may be called on the object after it has been initialized with the C<format> method. It must return a full document. In the case where an underlying helper module has no concept of full document, the method must nevertheless make a best effort to return something that can be regarded as a standalone document.
=item C<fragment>
The C<fragment> method may be called on the object after it has been initialized with the C<format> method. It shall only return a minimal fragment of the converted text; as little markup as possible shall be added to the fragment. In the case where only a full document is available from an underlying helper module, it should make a best effort to strip down to a minimal fragment.
[snip]
=head2 Meaning of fragment vs. document
It is to be anticipated that some formats have no concept of a full document, and others no concept of a fragment. To save the user the trouble of dealing with an error situation, the Formatter must make a best effort to return both. What is meant by a fragment and a full document varies from format to format, and must be dealt with on a per format basis.
In the case where it really doesn't make sense to return either a fragment or document, the Formatter may produce a warning, but must nevertheless return a best effort fragment or document.
For HTML, a full document is understood to be a complete, valid HTML document. The largest possible HTML fragment consists of the child elements of the C<body> element, excluding C<body> itself.
For XML, any well-formed XML document can be a full document, and any well-balanced XML region can be a fragment. An XML fragment should not contain a Prolog or Document Type Declaration.
----- end excerpt -----
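For the HTML case, a best-effort strip from document to fragment could be sketched as follows. The function name body_fragment is made up for this example, and the regular expression is only a rough stand-in; a real Formatter would more likely use a proper HTML parser.

```perl
use strict;
use warnings;

# Best-effort extraction of the children of <body> from an HTML string.
sub body_fragment {
    my ($html) = @_;
    if ( $html =~ m{<body\b[^>]*>(.*?)</body\s*>}is ) {
        return $1;
    }
    # No body element found: assume the input already is a fragment.
    return $html;
}

my $document = '<html><head><title>T</title></head>'
             . '<body class="x"><p>Hello</p></body></html>';
print body_fragment($document), "\n";
```

The fallback branch is where the "best effort" wording matters: the caller always gets something back, at the price of not knowing whether any stripping actually happened.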
As you can see, I haven't changed the names of the methods, or redefined ->format. I feel that the only thing to win by doing it is to save a single line, and while it may not be very Lazy to have that extra line, I feel there is much to be gained in clarity and consistency by having those two extra methods. I still would like to hear your input, I would really like it if there is a common API for this kind of formatting for both our projects, and I'm very open to arguments, it is just that I have to finish this and move on by noon....
Sorry to be pushing this so hard, as I said previously, I'm not usually this stressed out and hard-hitting, it is Real World demands that do this to me... :-)
Best,
Kjetil
On Mon 24 Jan 2005, Kjetil Kjernsmo kjetil@kjernsmo.net wrote:
->format($string) initialises the formatter and returns whatever, in the case where you don't care what you get.
I think the real controversy here is actually the problem Justin brought up: That you can't always rely on a fragment being returned. I presume that it is very rare that you can't rely on a document being returned, but such situations might exist.
Ahh, I get it now. It's not clear from the name that ->format does initialisation and leaves things behind in the formatter for ->document and ->fragment to act on.
It also seems possible that the ability to return a fragment rather than a document (or vice versa) could depend on the input as well as the class of the formatter.
And it feels wrong to me that information about the last input is stored in the formatter. If you want to format something else before you're done with the last thing, either the work is lost or you need to create a new formatter, and that feels very wrong.
Finally, it seems that there really should be some _programmatic_ way to find out whether we can get a fragment (or a document) back from our input.
With those points in mind, how about we change ->format to ->parse, and return an object:
    my $parsed = $formatter->parse( $string );
    if ( $parsed->can( "as_document" ) ) {
        print $parsed->as_document;
    } else {
        print "<html><head><title>My Doc</title></head><body>",
              $parsed->as_fragment,
              "</body></html>";
    }
This allows expansion of the API like so:
    if ( $parsed->can( "interwiki_links" ) ) {
        @links = $parsed->interwiki_links;
    }
$parsed would be of class something like Formatter::Parsed::UseMod.
The way this fits in with Wiki::Toolkit is that when a node is created, its content is parsed by a Formatter, and the $parsed object is stored in the node object.
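A minimal sketch of that split, with made-up package names (Sketch::Formatter::Plain and Sketch::Parsed::Plain are not real modules): the formatter holds no per-input state, and each call to parse hands back an independent object.

```perl
use strict;
use warnings;

# Hypothetical result class: holds everything derived from one input.
package Sketch::Parsed::Plain {
    sub new {
        my ($class, %args) = @_;
        my $self = {%args};
        return bless $self, $class;
    }
    sub as_fragment { return $_[0]{fragment} }
    # No as_document here, so ->can("as_document") is false for this class.
}

# Hypothetical formatter: stateless, so one instance can be reused.
package Sketch::Formatter::Plain {
    sub new { my ($class) = @_; return bless {}, $class }

    sub parse {
        my ($self, $string) = @_;
        return Sketch::Parsed::Plain->new( fragment => "<p>$string</p>" );
    }
}

my $formatter = Sketch::Formatter::Plain->new;
my $first     = $formatter->parse('one');
my $second    = $formatter->parse('two');    # $first is still intact
print $first->as_fragment, "\n";
print "can make a document\n" if $second->can('as_document');
```

Storing $first in a node object, as suggested for Wiki::Toolkit, then needs nothing from the formatter afterwards.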
Kake
On Tuesday 25 January 2005, 10:35, Kake L Pugh wrote:
Ahh, I get it now. It's not clear from the name that ->format does initialisation and leaves things behind in the formatter for ->document and ->fragment to act on.
Yup, it is not quite clear.
It also seems possible that the ability to return a fragment rather than a document (or vice versa) could depend on the input as well as the class of the formatter.
Hmmm, it could, there are many unknowns when you try to create something that is supposed to work over a wide array of things... But I suspect that usually you can smack something around any string to get a whole document...
And it feels wrong to me that information about the last input is stored in the formatter. If you want to format something else before you're done with the last thing, either the work is lost or you need to create a new formatter, and that feels very wrong.
Hmmm, I didn't understand that...
If you format $foo->format("the *foo* of bar") you can always call $foo->title anywhere, so there is nothing lost there. If OTOH you want to $baz->format("barring _foo_"), yes you should create a new object, I see that as a good thing. Nothing is lost since all the hard work must be done separately for each string anyway. I think I learnt something about this in OO class a while ago... :-)
Finally, it seems that there really should be some _programmatic_ way to find out whether we can get a fragment (or a document) back from our input.
Hm, well, that was the idea with requiring that it should return undef if it couldn't return what we wanted, you could check the return value, like...:
With those points in mind, how about we change ->format to ->parse, and return an object:
    my $parsed = $formatter->parse( $string );
    if ( $parsed->can( "as_document" ) ) {
        print $parsed->as_document;
    } else {
        print "<html><head><title>My Doc</title></head><body>",
              $parsed->as_fragment,
              "</body></html>";
    }

    my $formatter = Formatter::Foo::Bar->format( $string );
    if ( $formatter->document ) {
        print $formatter->document;
    } else {
        print "<html><head><title>My Doc</title></head><body>",
              $formatter->fragment,
              "</body></html>";
    }
While some way to find the capabilities of a formatter is nice, I can't really see any advantage to it in this case.
But then, when we started discussing this, I realized that it would be better just to demand that when you call $formatter->document; you're done, you can trust that you'll get something. It may not be the best, but it is something you can use. To take the worries off of the user.
This allows expansion of the API like so:
    if ( $parsed->can( "interwiki_links" ) ) {
        @links = $parsed->interwiki_links;
    }
Yup. Well, I agree that something to check capabilities of the object is useful, but only in the cases where it is much more expensive to just do it than find out if we have a capability. I suspect that in most cases here, it'll be pretty cheap to find the links, so testing on $formatter->interwiki_links would be sufficient. Formatters that know they do expensive operations can cache their result transparently.
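Transparent caching of an expensive result could look something like this sketch. The package name Sketch::HTML::Cached, the scans counter and the URL pattern are all illustrative assumptions; the counter exists only to show that the scan runs once.

```perl
use strict;
use warnings;

package Sketch::HTML::Cached {
    sub new {
        my ($class, $string) = @_;
        return bless { input => $string, scans => 0 }, $class;
    }

    sub links {
        my ($self) = @_;
        # Compute on first use only; later calls reuse the stored list.
        $self->{links} //= do {
            $self->{scans}++;    # counts how often the "expensive" scan runs
            [ $self->{input} =~ m{(https?://\S+)}g ];
        };
        return @{ $self->{links} };
    }
}

my $formatter = Sketch::HTML::Cached->new(
    'see http://example.org/ and https://example.com/x'
);
my @links = $formatter->links;
@links = $formatter->links;    # second call is served from the cache
print scalar(@links), " links, scanned $formatter->{scans} time(s)\n";
```

The caller never sees the cache, so a cheap Formatter can skip it entirely without changing the interface.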
$parsed would be of class something like Formatter::Parsed::UseMod.
That would be a pretty radical departure from the architecture we are aiming at, which is a simple Formatter::To::From between formats. And I'm sort of running out of time... :-(
The way this fits in with Wiki::Toolkit is that when a node is created, its content is parsed by a Formatter, and the $parsed object is stored in the node object.
OK!
I've been discussing with jerakeen offlist, and what we have come to is to drop the new method, make format the constructor, which returns the object, not the string. Then, the document and fragment methods return the strings. OK?
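In code, the revised shape might look like this sketch. Sketch::HTML::Pre is a hypothetical stand-in, not the real Formatter::HTML::Preformatted, and the empty title element is just a placeholder.

```perl
use strict;
use warnings;

package Sketch::HTML::Pre {
    # No new(): format is the constructor and returns the object.
    sub format {
        my ($class, $string) = @_;
        return bless { input => $string }, $class;
    }

    sub fragment {
        my ($self) = @_;
        my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );
        (my $escaped = $self->{input}) =~ s/([&<>])/$entity{$1}/g;
        return "<pre>$escaped</pre>";
    }

    sub document {
        my ($self) = @_;
        return '<html><head><title></title></head><body>'
             . $self->fragment . '</body></html>';
    }
}

my $text = Sketch::HTML::Pre->format('1 < 2');
print $text->fragment, "\n";
print $text->document, "\n";
```

Each input gets its own object, and the caller states explicitly whether a fragment or a document is wanted.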
Cheers,
Kjetil
On Tuesday 25 January 2005, 10:35, Kake L Pugh wrote:
And it feels wrong to me that information about the last input is stored in the formatter. If you want to format something else before you're done with the last thing, either the work is lost or you need to create a new formatter, and that feels very wrong.
On Tue 25 Jan 2005, Kjetil Kjernsmo kjetil@kjernsmo.net wrote:
If you format $foo->format("the *foo* of bar") you can always call $foo->title anywhere, so there is nothing lost there. If OTOH you want to $baz->format("barring _foo_"), yes you should create a new object, I see that as a good thing.
Your Formatter object isn't really a formatter then, though, it's more of what I was getting at with the Parsed object idea. I think it is confusing to have a Formatter be something other than a thing that formats, and it's counter-intuitive to have Formatters not be re-usable.
Kake
On Tuesday 25 January 2005, 15:12, Kake L Pugh wrote:
Your Formatter object isn't really a formatter then, though, it's more of what I was getting at with the Parsed object idea. I think it is confusing to have a Formatter be something other than a thing that formats, and it's counter-intuitive to have Formatters not be re-usable.
I just have to admit I don't understand this at all. All I do is follow the ideas of Object Oriented design as I understand them. I really don't understand what you mean by not reusable. Sure it is reusable. I think one of us has really misunderstood the fundamentals of OOP, it surely could be me, I don't have a lot of training in computer science, although I did go to the University where OO was invented.
Best,
Kjetil
Hi again!
We had a little chat on IRC, and got an understanding there. But we agreed I might just get it out in the wild, and take it from there. So, consider it sort of official, I'll try to get it on CPAN in some form... :-)
Cheers,
Kjetil