Indexing documents

Dear Lazyweb,

I am prone to downloading random documents (usually PDFs) from the interweb. Often these have helpful filenames like 30430.pdf. Renaming them helps, but I seem to collect lots of them.

Is there a simple document management system I can import them into that will let me at least categorise things, and then search for titles etc (bonus points if I can actually search the text in the documents, but I can cope if I need to manually enter the info on import)? PDF is the most useful format for me, but I'm sure I'd end up using Word or whatever support too.

apt-cache search dms turns up MyDMS and Owl, though both look a little heavy - I only need it for local personal use.

Thoughts on calendars

I keep having what I've started to think of as the "calendar conversation" with various people. I start by outlining what I think are some fairly reasonable requirements. I'd like to have multiple calendars viewable by different sets of people. So Simon would be able to view my work calendar, while Katherine could see my personal calendar (and possibly small bits of my work calendar where I was going to be away from home). I also want to be able to view my calendars from a variety of devices: my desktop, my phone and my laptop, for example. A web interface might be nice too, though less important to me.

This doesn't seem like a lot to ask. My general hope is that whoever I'm boring about this will then say "Oh, you need to look at foo and maybe bar". After all, that's the case with most software related things I need - it's already been done, and there's a good solution available.

What actually happens is the person I'm talking to nods in agreement and says they'd like the same sort of things but don't know of anything that manages it.

I think the problem comes down to syncing. Sharing my calendars with other people is effectively allowing them to sync their calendar(s) with mine, with some sort of filtering so we each see only what the other wants to see.
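To make the filtering idea concrete, here is a minimal sketch in Python. The field names (`calendar`, `away`) and the per-viewer rule table are invented for illustration; they don't come from any real calendar tool.

```python
def visible_events(events, viewer, rules):
    """Return only the events that the rules allow this viewer to see."""
    # Hypothetical policy: a viewer with no rule sees nothing at all.
    allowed = rules.get(viewer, lambda event: False)
    return [event for event in events if allowed(event)]


# Made-up sample data; "away" marks work events that keep me away from home.
events = [
    {"summary": "Team meeting", "calendar": "work", "away": False},
    {"summary": "Conference trip", "calendar": "work", "away": True},
    {"summary": "Dinner", "calendar": "personal", "away": False},
]

# One rule per viewer: Simon sees work, Katherine sees personal plus "away" bits.
rules = {
    "Simon": lambda e: e["calendar"] == "work",
    "Katherine": lambda e: e["calendar"] == "personal" or e["away"],
}

simon_view = [e["summary"] for e in visible_events(events, "Simon", rules)]
katherine_view = [e["summary"] for e in visible_events(events, "Katherine", rules)]
print(simon_view)
print(katherine_view)
```

The filter runs on my side before anything is handed over, so the other person's client never needs to know about events it shouldn't see.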

Likewise, multiple devices is all about syncing. Changes made via one method need to be visible on all of them (ideally as soon as possible).

Decent syncing also brings another advantage; client independence. If I can reliably sync my information to a variety of devices/clients then I can try a different client with little risk. We have this already for mail - anything that can talk POP3, IMAP and/or mbox is easy to try out and if you don't like it then you can just change back to what you had before or try something else. Calendaring doesn't seem to have that.

I don't think that the interface a client exposes needs to be complex. At a basic level what you need is a way to find out what the client thinks has been added, removed or changed. You also need to present this information in a standard format. Unfortunately iCal (RFC2445) appears to be what we've got here. I think it's a bit overly inclusive, which seems to lead to incomplete implementations. Is there a complete iCal parser library out there?
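As a rough sketch of how small that interface could be, the Python below compares two snapshots of a calendar and reports what was added, removed or changed. It assumes every event carries a stable unique ID and that the client can hand over a full snapshot; the `Changes` type, the function name and the sample events are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changes:
    added: set
    removed: set
    changed: set


def diff_snapshots(previous, current):
    """Compare two {uid: event} snapshots and report what differs."""
    prev_uids = set(previous)
    curr_uids = set(current)
    # Present in both snapshots but with different contents.
    changed = {uid for uid in prev_uids & curr_uids if previous[uid] != current[uid]}
    return Changes(
        added=curr_uids - prev_uids,
        removed=prev_uids - curr_uids,
        changed=changed,
    )


# Made-up data: the state at the last sync and the state now.
last_sync = {
    "a1": {"summary": "Dentist", "start": "2007-05-01T09:00"},
    "b2": {"summary": "AGM", "start": "2007-05-03T14:00"},
}
now = {
    "b2": {"summary": "AGM", "start": "2007-05-03T15:00"},
    "c3": {"summary": "Holiday", "start": "2007-06-10T00:00"},
}

changes = diff_snapshots(last_sync, now)
print(changes)
```

A client that tracks its own changes could report the same three sets directly; the snapshot comparison is just the fallback for clients that can't.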

The best hope for syncing that I'm aware of is OpenSync. And I haven't been that impressed; while I managed to get it to move the contacts from my SonyEricsson v600i to my Nokia E70 some of the details got mangled on the way. And lack of things like recurring event support in some of the plugins (eg the Google Calendar syncing adaptor) or warnings about data loss in the Sunbird plugin aren't encouraging.

Speaking of Sunbird, I'm not impressed. Last time I looked (and it doesn't appear to have moved very far since then) it didn't make any attempt to support syncing. I'm sorry, WebDAV isn't a syncing system. It's a way of storing files remotely. Sure, I found the interface ok, but I need to make my calendaring data accessible for it to be useful.

The Windows approach to all of this seems to be ActiveSync and the use of Outlook. I don't believe that it really handles more than the desktop client and a remote device though? And group calendar sharing requires an Exchange server? Not exactly the flexible solution I'm looking for.

I think a key point is that all of this needs to be reliable. I didn't spend time debugging my OpenSync contact syncing issues partly because I didn't notice straight away that there was a problem (it was with additional details like email addresses rather than names/numbers IIRC), but also because it was a lot of work to delve into the code and find out just what on earth was going on. A while back I started looking at writing some calendar syncing code (called Pony) and hit a problem with complete inconsistency in handling time zones between clients that put me right off. If it's not reliable it's not useful. If it's not useful people won't test it and fix it.
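One way a time zone mismatch can bite (an illustrative guess, not necessarily the exact problem I hit): one client stores an event as bare local time while another stores it in UTC, so the same meeting looks like two different events until both are normalised. A minimal Python sketch, with a hypothetical helper and a fixed offset standing in for a real time zone database:

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_naive, utc_offset_hours):
    """Attach a fixed UTC offset to a naive local time and convert to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=tz).astimezone(timezone.utc)


# Hypothetical phone: stores local wall-clock time, one hour ahead of UTC.
phone_start = datetime(2007, 5, 3, 15, 0)
# Hypothetical desktop client: stores the same event in UTC.
desktop_start = datetime(2007, 5, 3, 14, 0, tzinfo=timezone.utc)

# The wall-clock fields differ, which is where a careless sync goes wrong.
same_wall_clock = phone_start == desktop_start.replace(tzinfo=None)
# Once both are expressed in UTC they turn out to be the same instant.
same_instant = to_utc(phone_start, 1) == desktop_start
print(same_wall_clock, same_instant)
```

A fixed offset ignores daylight saving changes, so a real tool would need proper zone rules, but the principle of comparing instants rather than wall-clock fields is the same.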

I think a starting point is a reliable iCal implementation that can handle everything in the spec. From there you can write plugins that will take whatever bastardised variant your device/client supports and translate to and from it. The syncing tool will need to keep a master copy of all the data, because there's no guarantee anything else will be able to support all the fields. A local complete copy should enable appropriate frobbing of data to/from any device/client without losing the information if a new device/client that can handle the data is added.
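The master copy idea can be sketched in a few lines of Python. This assumes each plugin can declare which fields its device understands; the function names, the field set and the sample event are all hypothetical.

```python
def to_device(master_event, supported_fields):
    """Reduce a full event to the subset of fields a device understands."""
    return {k: v for k, v in master_event.items() if k in supported_fields}


def merge_from_device(master_event, device_event, supported_fields):
    """Apply a device's edits to the master copy without losing other fields."""
    merged = dict(master_event)
    for field in supported_fields:
        if field in device_event:
            merged[field] = device_event[field]
        else:
            # The device supports this field and no longer has it: a deletion.
            merged.pop(field, None)
    return merged


# Made-up master record with fields the phone cannot store.
master = {
    "uid": "b2",
    "summary": "AGM",
    "start": "2007-05-03T14:00",
    "rrule": "FREQ=YEARLY",
    "location": "Oxford",
}
phone_fields = {"uid", "summary", "start"}  # hypothetical phone capabilities

on_phone = to_device(master, phone_fields)
on_phone["start"] = "2007-05-03T15:00"  # edited on the phone
master = merge_from_device(master, on_phone, phone_fields)
print(master)
```

The phone never sees the recurrence rule or the location, yet both survive the round trip, so a more capable client added later still gets the full record.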

Of course almost all the above applies to contacts as well; I have work contacts, personal contacts, boring house contacts (plumber, garage etc). vCard is the appropriate interchange format for that (why did vCalendar become iCalendar, but vCard not change?), but contacts feature similar issues to calendars I believe.

Does this seem reasonable? Please tell me someone has already done it. Point me at Debian packages that will let me sync my phone, my Google Calendar and a local client and make me very happy. I have visions of a suite of tools all chatting over DBus: a little client that looks for my phone to appear over Bluetooth and then syncs when it does; a little client that monitors Google Calendar for changes; Sunbird reporting when something changes (and accepting notification of changes); some other client ([Dates](http://www.pimlico-project.org/dates.html)?) doing the same; and another sending the data over WebDAV or SFTP so that my laptop can pick up the info when I'm on the move.

I don't think I'm the only person looking for this sort of stuff. Where is it?

What's the fuss about ATI/AMD?

What did you say?

I am currently partially deaf, thanks to a sexy build up of wax in my ears. I have drops (peanut oil, apparently) which are supposed to help, but first they make it worse. So my apologies to anyone at the [Nominet](http://www.nominet.org.uk/) AGM today who thought I was blanking them, and indeed to anyone who sees me over the next week or so and has to deal with me constantly asking them to repeat themselves. And my sympathy to anyone who has a permanent hearing problem.

In search of a HDTV PVR

As mentioned in the past I got a new TV this year. It's an LG 32LCD2DB with a native panel resolution of 1366x768. As well as the usual SCART/component inputs it has VGA and HDMI. Currently my homebrew PVR outputs to the TV via a full featured DVB card with onboard MPEG2 decoder, but this limits me to 720x576 output and a basic OSD. My current box is a PIII-800 with a basic onboard SiS graphics chipset, so it's not got the grunt to drive the TV directly itself. I'm thus on the lookout for an upgrade.

Let's start with my constraints. I have a Silverstone LC-02 case. I'd previously planned to upgrade to something bigger, but the discovery that LinITX can provide a variety of riser cards to suit PCI-E/PCI as needed means I don't have to. The LC-02 will take a MicroATX board and provide me with 2 slots, one of which is taken up by the PCI DVB card. This box lives in my living room, so it needs to be quiet - the current incarnation is a bit too loud but I believe that's mainly due to using a standard heatsink+fan assembly, so I'm hoping that something specifically designed to be quiet will be better. An lirc dongle and wireless keyboard/mouse are already sorted.

I'd like to expand the use of the PVR to be a more general PC as well; up until now it's just been a PVR, but being able to check email/IRC would be nice, as would the ability to play games. Up until now I've run VDR, which takes full advantage of the MPEG2 decoder, but I'd like to give MythTV a decent try. All of this points to wanting more than just a HD hardware MPEG2/MPEG4 decoder.

First, let's start with the graphics chipset options. The contenders seem to be ATI, Intel, nVidia and Via. From my reading ATI lack any form of XvMC support in both their Free and binary blob drivers, so we can discount them. nVidia have support, but only in their binary drivers. The Nouveau project have support on their TODO list, but I can't see any sign of it yet. I'd really rather avoid non-free drivers, so that leaves us with just Intel and Via.

I really like the look of the Intel G965/X3000. They've opened up their driver development and it seems to be quite active from reading the Xorg mailing lists. However they don't have any XvMC support; Keith Packard on the xorg list in February says "For media purposes, the current drivers aren't taking full advantage of the hardware yet", though later on does say "XvMC is on several of our lists; I don't know when someone will pick it up and implement it". Unfortunately I get the feeling it's not a high priority; I've seen/heard several comments about the fact that any processor that'll get hooked up to this chipset will be more than fast enough to cope without the assistance. That may be true, but I don't want my CPU to be spending all its time trying to decode MPEG2 or MPEG4 at 1920x1080i while the GPU sits mostly unused. I'd much rather the GPU was used to the full giving me spare CPU cycles, or perhaps the ability to clock it a bit slower for common use cases, reducing power and heat. I'm keeping Intel on my list because I hope they do get round to proper support sooner rather than later. They've also talked about the fact the GPU is programmable which would make it perfectly possible to add MPEG4 acceleration to the chipset; that would be pretty sweet.

It's a bit odd to end up with Via on the list. I mainly think of them as associated with MiniITX boards, but they do also produce various motherboards for Intel/AMD CPUs with integrated graphics. They managed to shoot themselves in the foot for a period of time by limiting the maximum accelerated resolution to 1024x1024, but recently they've released the CX700M/P4M900 chipsets which appear to do 2048x2048. The OpenChrome project have these up and running to a basic level from what I can tell, and work on XvMC support appears to be progressing with various patches and reports flying about on the list.

So currently it looks like Via has the best option (albeit through the Free software community, rather than a concerted effort on their part), while Intel has something that could be very appealing if the driver was more complete.

The other factor is the CPU. Three contenders this time: AMD, Intel and Via. My current PIII-800 is rated at 20.8W TDP. Modern processors are in general a lot hotter, but I need to keep in mind that I'm looking for a quiet machine, and that the case doesn't have a huge amount of room for big cooling systems. AMD's dual core chips all start at 89W or more, though some of their single core lower end CPUs are more like 67W. Of course the Intel graphics won't work with an AMD CPU, and I'm not 100% sure Via has a suitable Athlon chipset at present either (the 2 I found were for Intel/Via). Intel's P4 offerings aren't much better for power consumption, with many of the highest end chips being over 100W. Core 2 Duo looks promising; 65W, so more than the current CPU, but a lot more grunt. Via come out tops here; 20W for their 2GHz top end processor. However this is only single core and compares poorly to the equivalent clock speed of Core 2 Duo.

A MiniITX board with a 2GHz C7 and a CX700M would probably make a fine PVR, but I'm not convinced about there being enough grunt for anything else. The Via chipsets appear to have more active in-progress support for media acceleration, which suggests a P4M900 + Core 2 Duo. That's contrasted with what looks to be the more powerful X3000, which actually has a manufacturer who seems to want to get full Free support out there.

I'm probably a couple of months off actually making a decision, partly because I want to watch how it all plays out in the hope that the Intel chipset will get some more support, or that the Via chipset will be proven to be reliable in operation. And assuming I do make a video chipset/CPU decision I then get the joys of trying to find a motherboard with DVI + S/PDIF. I know the G965 can do DVI out with an ADD2 card in the PCI-E slot, but I don't know if the P4M900 supports the same thing. I can't see an Intel MicroATX 965 board with S/PDIF out - there's an HD audio header, but this doesn't seem to be a direct mapping. Can I get a converter?
