Update on MultiMarkdown 4.0

04/22/2013 07:31:28

I’ve been hard at work nights/weekends to rewrite MultiMarkdown, and am making fantastic progress!

The new version now passes all of the Markdown/MultiMarkdown/Compatibility tests for HTML that it is expected to pass. There is one Markdown test that peg-markdown fails, and therefore MMD fails as well; this is really an edge case, and arguably John MacFarlane’s approach is the better one here. During the testing, I found one or two small things that I actually changed. This means that 4.0 won’t be character-for-character identical to 3.7, but I believe all the changes are for the better and should have minimal, if any, impact on most users.

Performance is actually better than MMD (aka peg-multimarkdown) 3.7 right now, but I still need to add the “deadman’s switch” to keep things from running amok forever on bad documents (this will slow things down some). I was worried that my new code would lose some efficiency over John’s peg-markdown work, but I managed to simplify some things that improved overall performance. It’s still slower than peg-markdown, but this is expected as it has more features.
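To make the “deadman’s switch” idea concrete, here is a minimal sketch of one way such a guard could work: record when parsing starts and bail out once a time allowance is used up. This is an illustration only, not the MultiMarkdown implementation; the names `parse_budget`, `budget_start`, `budget_expired`, and `parse_with_budget` are hypothetical, and the loop is a stand-in for real parsing work.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical names throughout; not taken from the MultiMarkdown source. */
typedef struct {
    clock_t start;        /* CPU time when parsing began */
    double  max_seconds;  /* allowed CPU time before giving up */
} parse_budget;

parse_budget budget_start(double max_seconds) {
    parse_budget b = { clock(), max_seconds };
    return b;
}

/* True once the parse has used up its allowance. */
bool budget_expired(const parse_budget *b) {
    double elapsed = (double)(clock() - b->start) / CLOCKS_PER_SEC;
    return elapsed >= b->max_seconds;
}

/* Stand-in for a parse loop: attempts `steps` units of work and
   returns how many were completed before the budget ran out. */
long parse_with_budget(long steps, const parse_budget *b) {
    long done = 0;
    while (done < steps) {
        if (budget_expired(b)) {
            break;        /* the switch trips: stop instead of running forever */
        }
        done++;
    }
    return done;
}
```

Checking the clock on every step is the simplest version, and it is also where the extra cost comes from; a real parser would probably check less often.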

In my tests so far, there are still no memory leaks. This should be a tremendous help for MultiMarkdown Composer.

I’ve added a bit more safety checking to help prevent crashes with “unusual” documents that don’t fit expectations.

More importantly, the code should be reentrant/thread-safe. This means that the commands can be called multiple times in multiple threads without “colliding” with each other. These collisions can lead to crashes that are difficult to track down as it is unlikely that two threads will collide in exactly the same way on multiple runs. I have not tested this aspect of the code yet, but that will be coming up in the next phase.
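For anyone unfamiliar with the term, the difference between non-reentrant and reentrant code can be sketched roughly as follows. This is a generic C illustration, not code from MultiMarkdown; `g_marker_count`, `parse_context`, `count_markers_global`, and `count_markers_ctx` are hypothetical names, and counting `^` characters just stands in for real parser state.

```c
#include <stddef.h>

/* Non-reentrant style: state lives in a global shared by every caller. */
size_t g_marker_count = 0;

size_t count_markers_global(const char *text) {
    g_marker_count = 0;   /* a concurrent call in another thread would clobber this */
    for (; *text; text++) {
        if (*text == '^') {
            g_marker_count++;
        }
    }
    return g_marker_count;
}

/* Reentrant style: each call keeps its state in a caller-supplied context. */
typedef struct {
    size_t marker_count;
} parse_context;

size_t count_markers_ctx(parse_context *ctx, const char *text) {
    ctx->marker_count = 0;
    for (; *text; text++) {
        if (*text == '^') {
            ctx->marker_count++;
        }
    }
    return ctx->marker_count;
}
```

With the second form, two threads can each pass their own `parse_context` and never touch each other's data, which is the property that avoids the “collisions” described above.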

As mentioned, HTML functionality is basically complete. Adding in LaTeX, OPML, and FODT support will now be relatively easy since the framework is complete.

I still have a few command line options/commands to add back in, but these should be fairly straightforward.

Once I get these features added in, I’ll upload the source to GitHub and get some more eyeballs looking for bugs. 4.0 will basically be a feature-match release for 3.7. Once all the kinks are worked out, I’ll start considering new features and updates.

For those who are interested:

  • The code is still C, and should be compatible across hardware/OS combinations
  • It uses a PEG definition that is almost the same as that used by peg-multimarkdown — the C code that handles the parsing, however, is quite different.
  • It uses greg instead of peg-leg to parse the grammar file. Later versions of peg-leg can supposedly make reentrant parsers, but greg opens up a few other features I hope to take advantage of.
  • I rewrote the Makefile. This means that there may need to be some changes for other hardware/OS types. Let me know what I need to do if you find a problem.
  • The structure of the code/header files should be a little bit more sane — peg-markdown seemed strangely configured, and I didn’t improve it when I forked to peg-multimarkdown. This should make it easier to include in other projects.

Limitations:

  • peg-markdown, and therefore MultiMarkdown, use a true parser to analyze text documents for conversion to HTML, instead of regular expressions. This allows for improved accuracy and generally excellent performance. However, in certain cases the parser can get bogged down by unusual document structures (e.g. nested HTML). Most, but not all, of these have been fixed over the years (mostly by John MacFarlane), but there are still some specific documents that just don’t play well with MMD and peg-markdown. If you stumble across one of these, send me the document and I’ll take a look.
