PDF Module

11/04/2005 14:42:55

Installing a module is easy: Create a modules subdirectory in your data directory, and put the Perl file in there. It will be loaded automatically.

Download module:

Introduction

This is an OddMuse module that takes advantage of the approach behind the Static Hybrid Module and MultiMarkdown to allow any wiki page to be downloaded as a pdf, without regenerating it for each request.

It is currently still in testing, but seems to work.

First, to request a pdf version of a page, use a URL that looks like this:

(see the added /pdf/ bit?)

How a pdf is served by a website

If the pdf has already been created, it will be returned without any script or processing occurring. This works in the same manner as the Static Hybrid Module - the pdf is already in the /pdf directory, so it gets served by Apache.

Otherwise, we start a multistep process to generate a pdf:

  1. Create a temporary directory to work in. This directory also serves as a lockfile to prevent more than one visitor from generating the same pdf at the same time.

  2. Output a copy of the XHTML from Oddmuse (without headers, side bars, footers, etc). Some metadata is added by the module (this can be customized).

  3. Run a custom script to convert XHTML to pdf. (I use XSLT to convert to LaTeX, and then pdflatex to create a pdf. See [[My Markdown Workflow]] for more information. You can, of course, use other tools to do this.)

  4. Copy the pdf file to the proper directory.

  5. Clean up all the temp files, and remove the working directory (which was also the lockfile, remember?)

  6. Send a 302 redirect so that the browser reloads the page - it should now receive the actual pdf file, since it exists.
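The six steps above can be sketched roughly as follows. This is only an illustration in Python, not the module's actual code (the module itself is Perl): the function name `generate_pdf`, its parameters, the `converter` command and the `/pdf/<page>.pdf` URL shape are all assumptions made for the example.

```python
import os
import shutil
import subprocess
from pathlib import Path


def generate_pdf(page_id, xhtml, temp_root, pdf_dir, converter):
    """Build <pdf_dir>/<page_id>.pdf; return a redirect URL, or None if busy.

    `converter` is a hypothetical command (a list of strings) that is run as
    converter + [input_file, output_file].
    """
    os.makedirs(temp_root, exist_ok=True)
    work_dir = Path(temp_root) / page_id
    try:
        # Step 1: mkdir either creates the directory or fails, so the
        # working directory can double as the lockfile.
        os.mkdir(work_dir)
    except FileExistsError:
        return None  # another request is already generating this pdf

    try:
        # Step 2: write out the bare XHTML for the page.
        source = work_dir / "page.html"
        source.write_text(xhtml, encoding="utf-8")

        # Step 3: run the external XHTML-to-pdf conversion script.
        result = work_dir / "page.pdf"
        subprocess.run(converter + [str(source), str(result)], check=True)

        # Step 4: copy the pdf to the directory the webserver serves from.
        os.makedirs(pdf_dir, exist_ok=True)
        shutil.copyfile(result, Path(pdf_dir) / f"{page_id}.pdf")
    finally:
        # Step 5: remove the temp files, which also releases the lock.
        shutil.rmtree(work_dir, ignore_errors=True)

    # Step 6: the caller would answer with a 302 pointing at this URL.
    return f"/pdf/{page_id}.pdf"
```

If the conversion command fails, the exception is passed on to the caller, but the working directory is still removed, so a later request can try again.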

In much the same way as the Static Hybrid Module, whenever a page is edited, the old pdf is deleted. The next time someone requests the pdf, a new one will be generated. This way, the newest version is always served, without the delay of generating a pdf whenever you edit a page. It does mean that the first person to request a pdf for a particular page after it has changed will see a slight delay.
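The delete-on-edit idea is small enough to show in a few lines. Again, this is a hedged Python sketch rather than the module's code: `invalidate_pdf` and the `<page>.pdf` file naming are assumptions for the example.

```python
from pathlib import Path


def invalidate_pdf(page_id, pdf_dir):
    """Delete the stored pdf for page_id; return True if one was removed."""
    cached = Path(pdf_dir) / f"{page_id}.pdf"
    try:
        cached.unlink()
        return True
    except FileNotFoundError:
        return False  # no pdf has been generated yet, nothing to delete
```

A save hook would call something like this after every edit, so that the next request for the pdf falls through to the generation steps.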

Required Software

  • OddMuse
  • webserver software that allows you access to .htaccess files and the redirect bit (see [[Colophon]] for more info). You have to be able to use the same URL to access the pdf if it exists, OR the Oddmuse cgi script if it does not.
  • some tool to convert XHTML to pdf. I use:
    • Markdown (MultiMarkdown is definitely preferred, but I think the regular version will still work.)
    • some XSLT files to transform XHTML into TeX (I recommend mine as starting points - [[Markdown and XML]])
    • pdflatex
    • several latex packages (in addition to what I consider a basic install):
      • pagenote
      • plainfootnote
      • xmpincl

How to Install and Configure This Module

  • Install module in your $ModuleDir
  • Choose a directory for your temporary files
  • Choose a directory for the pdfs
  • Configure your webserver to look to Oddmuse for pdfs that are not in that directory. See [[Colophon]] for how I do this for my Oddmuse wiki (it’s why you almost never see the wiki.cgi bit in my URLs). It’s also the same way that the Static Hybrid Module works.
  • Create a script that converts XHTML to pdf. (My system is detailed in the module, but it might be a bit too complicated for most users. All you need is a tool that you are happy with; which one is up to you. Feel free to suggest your favorite on the comments page.)
  • Set the appropriate variables in the module, or your config file, for the temporary directory, the pdf directory, and the conversion tool
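The webserver rule from the list above ("serve the pdf if it is there, otherwise hand the same URL to Oddmuse") boils down to one file-existence test. The Python below only models that decision so the logic is easy to see; in practice the webserver configuration does this, and the function name and the `/cgi-bin/wiki.cgi` location are assumptions for the example.

```python
from pathlib import Path


def route_pdf_request(url_path, pdf_dir, cgi_script="/cgi-bin/wiki.cgi"):
    """Return ("static", file) if the pdf exists, else ("cgi", script)."""
    prefix = "/pdf/"
    if not url_path.startswith(prefix):
        raise ValueError("not a pdf URL: " + url_path)
    name = url_path[len(prefix):]
    # Refuse anything that could point outside the pdf directory.
    if "/" in name or name in ("", ".", ".."):
        raise ValueError("bad pdf name: " + name)
    candidate = Path(pdf_dir) / name
    if candidate.is_file():
        return ("static", str(candidate))  # the webserver serves the file
    return ("cgi", cgi_script)  # Oddmuse generates the pdf, then redirects
```

Because both outcomes share one URL, visitors and links never need to know whether the pdf already exists.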

Tips

If you are using my conversion setup, I recommend using <h1> for the top-level outline items in your documents. I had previously been using <h2> and <h3> without any real reason, and will be slowly migrating this website to the new standard. This makes the pdf output more useful. Feel free to offer feedback about this.

Nifty Features

If you use my system, the pdf that is generated will have a table of contents (or whatever it’s called) that allows you to see the overall structure of the pdf and choose the section you are interested in. On Macs, Preview lets you do this, but Safari does not.

Known Limitations

It is not easy to get this module to work. You need to be at least somewhat familiar with Apache and with some sort of utility to convert XHTML to pdf.

That said, this module can be set up to give you almost any kind of output you want. You can customize things at almost any step in the process, and I encourage you to explore. You could also do something similar using XSL-FO, where you use XSLT to produce formatting objects that an XSL-FO processor then renders as a pdf. That might be easier, but I like the way LaTeX pdf files look, so I am not going to pursue that.

It would also be trivial to replicate the module for other types of output besides pdf.

I will not offer one-on-one help with setting up the conversion tools - it’s up to you to pick a tool that you are familiar with and can use. My system using XSLT and LaTeX is somewhat complicated, but quite powerful. Others might prefer something simpler. It’s up to you.

Things To Do

Title should be based on the “Non-Normal” form of $id (remove underscores)

Notes

Please, please, please don’t go around requesting the pdf version of every page on this site. Feel free to explore, but be reasonable about it.

This concept could be extended to provide other document types, especially those that can be created by XSLT transforms of the XHTML source. I am open to ideas.

If you’re one of the minority of OddMuse users who wants to use this module and takes the time to set it up, let me know. I’d be interested in hearing your thoughts about how it works, and how you configured your version.

Version History

  • 1.5 - Fix incompatibility with Firefox (and possibly other browsers); download links should now end in .pdf

  • 1.4 - added a feature to provide “Download this page as PDF” link in footer

  • 1.3 - moved XSLT stuff to an external script; the module itself is now less complicated, but there is more burden on the user to configure the conversion process.

  • 1.2 - Initial public release
