From 8699f81114645905cc80359d9179f3fb08104a37 Mon Sep 17 00:00:00 2001
From: Michel Fortin
Date: Tue, 14 Aug 2007 16:29:47 -0400
Subject: [PATCH] PHP Markdown 1.0.1d

---
 PHP Markdown Readme.text | 196 +++++++++++++++++++++-------------
 markdown.php             | 222 ++++++++++++++-------------------------
 2 files changed, 202 insertions(+), 216 deletions(-)

diff --git a/PHP Markdown Readme.text b/PHP Markdown Readme.text
index 9bc20ca..9c2e49c 100644
--- a/PHP Markdown Readme.text
+++ b/PHP Markdown Readme.text
@@ -1,7 +1,7 @@
 PHP Markdown
 ============
-Version 1.0.2b7 - Sat 16 Sep 2006
+Version 1.0.1d - Fri 1 Dec 2006
 by Michel Fortin
@@ -57,54 +57,53 @@ version.
 same line than Markdown. Your entries will now be formatted by
 PHP Markdown.
+3. To post Markdown content, you'll first have to disable the
+ "visual" editor in the User section of WordPress.
+
 You can configure PHP Markdown to not apply to the comments on your
 WordPress weblog. See the "Configuration" section below.
-Note: It is not possible at this time to apply a different set of
+It is not possible at this time to apply a different set of
 filters to different entries. All your entries will be formated by
-PHP Markdown. This is currently a limitation of WordPress. If your old
-entries are written in HTML (as opposed to another formatting syntax),
-your site should not suffer much from installing PHP Markdown.
+PHP Markdown. This is a limitation of WordPress. If your old entries
+are written in HTML (as opposed to another formatting syntax, like
+Textile), they'll probably stay fine after installing Markdown.
 ### bBlog ###
-PHP Markdown also works with the latest version of [bBlog][bb].
+PHP Markdown also works with [bBlog][bb].
 [bb]: http://www.bblog.com/
-1. To use PHP Markdown with bBlog, rename "markdown.php" to
- "modifier.markdown.php" and place the file in the "bBlog_plugins"
- folder. This folder is located inside the "bblog" directory of
- your site, like this:
+To use PHP Markdown with bBlog, rename "markdown.php" to
+"modifier.markdown.php" and place the file in the "bBlog_plugins"
+folder. This folder is located inside the "bblog" directory of
+your site, like this:
 (site home)/bblog/bBlog_plugins/modifier.markdown.php
-2. Select "Markdown" as the "Entry Modifier" when you post a new
- entry. This setting will only apply to the entry you are editing.
+Select "Markdown" as the "Entry Modifier" when you post a new
+entry. This setting will only apply to the entry you are editing.
 
-### Replacing Textile ###
+### Replacing Textile in TextPattern ###
 
-Many web programs written in PHP use [Textile][tx] to format your text.
-To use PHP Markdown with these programs without having to change the
-code, you can use PHP Markdown in "Textile Compatibility Mode."
+[TextPattern][tp] use [Textile][tx] to format your text. You can
+replace Textile by Markdown in TextPattern without having to change
+any code by using the *Texitle Compatibility Mode*. This may work
+with other software that expect Textile too.
 [tx]: http://www.textism.com/tools/textile/
+ [tp]: http://www.textpattern.com/
 
-1. Rename the "markdown.php" file to "classTextile.php".
+1. Rename the "markdown.php" file to "classTextile.php". This will
+ make PHP Markdown behave as if it was the actual Textile parser.
 
-2. Locate the "classTextile.php" file hidden somewhere inside the
- installation of your program (see table below). Replace it with
- the PHP Markdown file you just renamed.
+2. Replace the "classTextile.php" file TextPattern installed in your
+ web directory. It can be found in the "lib" directory:
 
-As an helper, here you can learn where is the "classTextile.php" file
-in some web programs:
-
- Program      Location
- ----------------------------------------------------------------
- TextPattern  (site home)/textpattern/lib/classTextile.php
- Pivot        (site home)/pivot/includes/textile/classtextile.php
+ (site home)/textpattern/lib/
 Contrary to Textile, Markdown does not convert quotes to curly ones
 and does not convert multiple hyphens (`--` and `---`) into en- and
@@ -158,17 +157,17 @@ Markdown can be configured to produce HTML-style tags; e.g.:
-To do this, you must edit the "$md_empty_element_suffix" variable
-below the "Global default settings" header at the start of the
-"markdown.php" file.
+To do this, you must edit the "MARKDOWN_EMPTY_ELEMENT_SUFFIX"
+definition below the "Global default settings" header at the start of
+the "markdown.php" file.
 ### WordPress-Specific Settings ###
 By default, the Markdown plugin applies to both posts and comments on
 your WordPress weblog. To deactivate one or the other, edit the
-`$md_wp_posts` or `$md_wp_comments` variable under the "WordPress
-settings" header at the start of the "markdown.php" file.
+`MARKDOWN_WP_POSTS` or `MARKDOWN_WP_COMMENTS` definitions under the
+"WordPress settings" header at the start of the "markdown.php" file.
 Bugs
@@ -184,6 +183,99 @@ expected; (3) the output PHP Markdown actually produced.
 Version History
 ---------------
+1.0.1d (1 Dec 2006)
+
+* Fixed a bug where inline images always had an empty title attribute. The
+ title attribute is now present only when explicitly defined.
+
+* Link references definitions can now have an empty title, previously if the
+ title was defined but left empty the link definition was ignored. This can
+ be useful if you want an empty title attribute in images to hide the
+ tooltip in Internet Explorer.
+
+* Made `detab` aware of UTF-8 characters. UTF-8 multi-byte sequences are now
+ correctly mapped to one character instead of the number of bytes.
+
+* Fixed a small bug with WordPress where WordPress' default filter `wpautop`
+ was not properly deactivated on comment text, resulting in hard line breaks
+ where Markdown do not prescribes them.
+
+* Added a `TextileRestrited` method to the textile compatibility mode. There
+ is no restriction however, as Markdown does not have a restricted mode at
+ this point. This should make PHP Markdown work again in the latest
+ versions of TextPattern.
+
+* Converted PHP Markdown to a object-oriented design.
+
+* Changed span and block gamut methods so that they loop over a
+ customizable list of methods. This makes subclassing the parser a more
+ interesting option for creating syntax extensions.
+
+* Also added a "document" gamut loop which can be used to hook document-level
+ methods (like for striping link definitions).
+
+* Changed all methods which were inserting HTML code so that they now return
+ a hashed representation of the code. New methods `hashSpan` and `hashBlock`
+ are used to hash respectivly span- and block-level generated content. This
+ has a couple of significant effects:
+
+ 1. It prevents invalid nesting of Markdown-generated elements which
+ could occur occuring with constructs like `*something [link*][1]`.
+ 2. It prevents problems occuring with deeply nested lists on which
+ paragraphs were ill-formed.
+ 3. It removes the need to call `hashHTMLBlocks` twice during the the
+ block gamut.
+
+ Hashes are turned back to HTML prior output.
+
+* Made the block-level HTML parser smarter using a specially-crafted regular
+ expression capable of handling nested tags.
+
+* Solved backtick issues in tag attributes by rewriting the HTML tokenizer to
+ be aware of code spans. All these lines should work correctly now:
+
+ bar
+ bar
+ ``
+
+* Changed the parsing of HTML comments to match simply from `<!--` to `-->`
+ instead using of the more complicated SGML-style rule with paired `--`.
+ This is how most browsers parse comments and how XML defines them too.
+
+* `` has been added to the list of block-level elements and is now
+ treated as an HTML block instead of being wrapped within paragraph tags.
+
+* Now only trim trailing newlines from code blocks, instead of trimming
+ all trailing whitespace characters.
+
+* Fixed bug where this:
+
+ [text](http://m.com "title" )
+
+ wasn't working as expected, because the parser wasn't allowing for spaces
+ before the closing paren.
+
+* Filthy hack to support markdown='1' in div tags.
+
+* _DoAutoLinks() now supports the 'dict://' URL scheme.
+
+* PHP- and ASP-style processor instructions are now protected as
+ raw HTML blocks.
+
+
+ <% ... %>
+
+* Fix for escaped backticks still triggering code spans:
+
+ There are two raw backticks here: \` and here: \`, not a code span
+
+
+1.0.1c (9 Dec 2005)
+
+* Fixed a problem occurring with PHP 5.1.1 due to a small
+ change to strings variable replacement behaviour in
+ this version.
+
 1.0.1b (6 Jun 2005)
@@ -248,46 +340,6 @@ Version History
 filter so that it runs after Markdown.
-1.0.2b1 - 5 Mar 2005
-
-* Fix for backticks within HTML tag:
-
- like this
-
-* Fix for escaped backticks still triggering code spans:
-
- There are two raw backticks here: \` and here: \`, not a code span
-
-* Improved integration with WordPress. With WordPress 1.5, the
- balenceTags filter now runs after Markdown, so it won't
- interfere anymore. You can still disable balanceTags from the admin
- interface (in Options > Writing) if you want to.
-
-* PHP Markdown now correctly filter text for excerpts in WordPress.
- There is still one glitch: autolinks and tags in code samples are
- stripped by WordPress when trimming it. A fix for this is possible
- with WordPress 1.5, but would require duplicating WordPress entry
- trimming code within Markdown, which I can't do because of a license
- issue. (Nor do I think it is a good solution to fix this.)
-
-* Improved Textile compatibility mode. Markdown will now honor the
- no-image and the lite parameters. In lite mode, no header, blockquote,
- list, or code block will be made, and inline HTML is limited
- to the following tags:
-
- This is acheived by backslash-escaping block markers before sending
- text through the Markdown filter.
-
- The improved Textile comatibility means that the Markdown syntax will now
- be processed for comments in TextPattern (only for span elements due to
- TextPattern using the lite mode for comments). Sadly, due to TextPattern
- tag stripping, sample code in code span and auto-links will be stripped
- before the Markdown filter can see them. So I guess I should say it
- half-work for comments TextPattern.
-
-
 1.0.1 (16 Dec 2004):
 * Changed the syntax rules for code blocks and spans. Previously,
diff --git a/markdown.php b/markdown.php
index b5ae177..a53d0e1 100644
--- a/markdown.php
+++ b/markdown.php
@@ -7,12 +7,12 @@
 #
 #
 #
 # Original Markdown
-# Copyright (c) 2004-2005 John Gruber
+# Copyright (c) 2004-2006 John Gruber
 #
 #
-define( 'MARKDOWN_VERSION', "1.0.2b7" ); # Sat 16 Sep 2006
+define( 'MARKDOWN_VERSION', "1.0.1d" ); # Fri 1 Dec 2006
 #
@@ -62,7 +62,7 @@ function Markdown($text) {
 Plugin Name: Markdown
 Plugin URI: http://www.michelf.com/projects/php-markdown/
 Description: Markdown syntax allows you to write using an easy-to-read, easy-to-write plain text format. Based on the original Perl version by John Gruber. More...
-Version: 1.0.2b7
+Version: 1.0.1d
 Author: Michel Fortin
 Author URI: http://www.michelf.com/
 */
@@ -96,7 +96,7 @@ if (isset($wp_version)) {
 # - Scramble important tags before passing them to the kses filter.
 # - Run Markdown on excerpt then remove paragraph tags.
 if (MARKDOWN_WP_COMMENTS) {
- remove_filter('comment_text', 'wpautop');
+ remove_filter('comment_text', 'wpautop', 30);
 remove_filter('comment_text', 'make_clickable');
 add_filter('pre_comment_content', 'Markdown', 6);
 add_filter('pre_comment_content', 'mdwp_hide_tags', 8);
@@ -145,7 +145,7 @@ function identify_modifier_markdown() {
 'nicename' => 'Markdown',
 'description' => 'A text-to-HTML conversion tool for web writers',
 'authors' => 'Michel Fortin and John Gruber',
- 'licence' => 'GPL',
+ 'licence' => 'BSD-like',
 'version' => MARKDOWN_VERSION,
 'help' => 'Markdown syntax allows you to write using an easy-to-read, easy-to-write plain text format. Based on the original Perl version by John Gruber. More...'
 );
@@ -173,6 +173,10 @@ if (strcasecmp(substr(__FILE__, -16), "classTextile.php") == 0) {
 if (function_exists('SmartyPants')) $text = SmartyPants($text);
 return $text;
 }
+ # Fake restricted version: restrictions are not supported for now.
+ function TextileRestricted($text, $lite='', $noimage='') {
+ return $this->TextileThis($text, $lite);
+ }
 # Workaround to ensure compatibility with TextPattern 4.0.3.
 function blockLite($text) { return $text; }
 }
@@ -302,7 +306,7 @@ class Markdown_Parser {
 (?:
 (?<=\s) # lookbehind for whitespace
 ["(]
- (.+?) # title = $3
+ (.*?) # title = $3
 [")]
 [ \t]*
 )? # title is optional
@@ -692,14 +696,14 @@ class Markdown_Parser {
 # These must come last in case you've also got [link test][1]
 # or [link test](/foo)
 #
- $text = preg_replace_callback('{
- ( # wrap whole match in $1
- \[
- ([^\[\]]+) # link text = $2; can\'t contain [ or ]
- \]
- )
- }xs',
- array(&$this, '_doAnchors_reference_callback'), $text);
+// $text = preg_replace_callback('{
+// ( # wrap whole match in $1
+// \[
+// ([^\[\]]+) # link text = $2; can\'t contain [ or ]
+// \]
+// )
+// }xs',
+// array(&$this, '_doAnchors_reference_callback'), $text);
 return $text;
 }
@@ -841,15 +845,12 @@ class Markdown_Parser {
 $whole_match = $matches[1];
 $alt_text = $matches[2];
 $url = $matches[3];
- $title = '';
- if (isset($matches[6])) {
- $title = $matches[6];
- }
+ $title =& $matches[6];
 $alt_text = str_replace('"', '&quot;', $alt_text);
- $title = str_replace('"', '&quot;', $title);
 $result = "\"$alt_text\"";empty_element_suffix;
@@ -1148,22 +1149,23 @@ class Markdown_Parser {
 # must go first:
 $text = preg_replace_callback('{
 ( # $1: Marker
- (?:
@@ -1207,9 +1209,10 @@ class Markdown_Parser {
 		$bq = $this->runBlockGamut($bq); # recurse
 		$bq = preg_replace('/^/m', " ", $bq);
-		# These leading spaces screw with <pre> content, so we need to fix that:
+		# These leading spaces cause problem with <pre> content,
+		# so we need to fix that:
 		$bq = preg_replace_callback('{(\s*<pre>.+?</pre>)}sx',
- array(&$this, '_DoBlockQuotes_callback2'), $bq);
+ array(&$this, '_DoBlockQuotes_callback2'), $bq);
 		return $this->hashBlock("<blockquote>\n$bq\n</blockquote>")."\n\n";
 }
@@ -1245,52 +1248,46 @@ class Markdown_Parser {
 #
 # Unhashify HTML blocks
 #
-// foreach ($grafs as $key => $value) {
-// if (isset( $this->html_blocks[$value] )) {
-// $grafs[$key] = $this->html_blocks[$value];
-// }
-// }
-
 foreach ($grafs as $key => $graf) {
 # Modify elements of @grafs in-place...
 if (isset($this->html_blocks[$graf])) {
 $block = $this->html_blocks[$graf];
 $graf = $block;
- if (preg_match('{
- \A
- ( # $1 = <div> tag
- <div [^>]*
- \b
- markdown\s*=\s* ([\'"]) # $2 = attr quote char
- 1
- \2
- [^>]*
- >
- )
- ( # $3 = contents
- .*
- )
- (</div>) # $4 = closing tag
- \z
- }xs', $block, $matches))
- {
- list(, $div_open, , $div_content, $div_close) = $matches;
-
- # We can't call Markdown(), because that resets the hash;
- # that initialization code should be pulled into its own sub, though.
- $div_content = $this->hashHTMLBlocks($div_content);
-
- # Run document gamut methods on the content.
- foreach ($this->document_gamut as $method => $priority) {
- $div_content = $this->$method($div_content);
- }
-
- $div_open = preg_replace(
- '{\smarkdown\s*=\s*([\'"]).+?\1}', '', $div_open);
-
- $graf = $div_open . "\n" . $div_content . "\n" . $div_close;
- }
+// if (preg_match('{
+// \A
+// ( # $1 = <div> tag
+// <div [^>]*
+// \b
+// markdown\s*=\s* ([\'"]) # $2 = attr quote char
+// 1
+// \2
+// [^>]*
+// >
+// )
+// ( # $3 = contents
+// .*
+// )
+// (</div>) # $4 = closing tag
+// \z
+// }xs', $block, $matches))
+// {
+// list(, $div_open, , $div_content, $div_close) = $matches;
+//
+// # We can't call Markdown(), because that resets the hash;
+// # that initialization code should be pulled into its own sub, though.
+// $div_content = $this->hashHTMLBlocks($div_content);
+//
+// # Run document gamut methods on the content.
+// foreach ($this->document_gamut as $method => $priority) {
+// $div_content = $this->$method($div_content);
+// }
+//
+// $div_open = preg_replace(
+// '{\smarkdown\s*=\s*([\'"]).+?\1}', '', $div_open);
+//
+// $graf = $div_open . "\n" . $div_content . "\n" . $div_close;
+// }
 $grafs[$key] = $graf;
 }
 }
@@ -1403,21 +1400,23 @@ class Markdown_Parser {
 function tokenizeHTML($str) {
 #
- # Parameter: String containing HTML markup.
+ # Parameter: String containing HTML + Markdown markup.
 # Returns: An array of the tokens comprising the input
- # string. Each token is either a tag (possibly with nested,
- # tags contained therein, such as , or a
- # run of text between tags. Each element of the array is a
+ # string. Each token is either a tag or a run of text
+ # between tags. Each element of the array is a
 # two-element array; the first is either 'tag' or 'text';
 # the second is the actual value.
- # Note: Takes code spans into account and does not generate tag
- # tokens inside code spans.
+ # Note: Markdown code spans are taken into account: no tag token is
+ # generated within a code span.
 #
 $tokens = array();
 while ($str != "") {
 #
- #
+ # Each loop iteration seach for either the next tag or the next
+ # openning code span marker. If a code span marker is found, the
+ # code span is extracted in entierty and will result in an extra
+ # text token.
 #
 $parts = preg_split('{
 (
@@ -1496,7 +1495,8 @@ class Markdown_Parser {
 unset($blocks[0]); # Do not add first block twice.
 foreach ($blocks as $block) {
 # Calculate amount of space, insert spaces, insert block.
- $amount = $this->tab_width - strlen($line) % $this->tab_width;
+ $amount = $this->tab_width -
+ mb_strlen($line, 'UTF-8') % $this->tab_width;
 $line .= str_repeat(" ", $amount) . $block;
 }
 $text .= "$line\n";
@@ -1558,73 +1558,7 @@ Version History
 See the readme file for detailed release notes for this version.
-1.0.2b7 (16 Sep 2006)
-
-* Changed span and block gamut methods so that they loop over a
- customizable list of methods. This makes subclassing the parser a more
- interesting option for creating syntax extensions.
-
-* Also added a "document" gamut loop which can be used to hook document-level
- methods (like for striping link definitions).
-
-* Changed all methods which were inserting HTML code so that they now return
- a hashed representation of the code. New methods `hashSpan` and `hashBlock`
- are used to hash respectivly span- and block-level generated content. This
- has a couple of significant effects:
-
- 1. It prevents invalid nesting of Markdown-generated elements which
- could occur occuring with constructs like `*something [link*][1]`.
- 2. It prevents problems occuring with deeply nested lists on which
- paragraphs were ill-formed.
- 3. It removes the need to call `hashHTMLBlocks` twice during the the
- block gamut.
-
- Hashes are turned back to HTML prior output.
-
-* Made the block-level HTML parser smarter using a specially-crafted regular
- expression capable of handling nested tags.
-
-* Solved backtick issues in tag attributes by rewriting the HTML tokenizer to
- be aware of code spans. All these lines should work correctly now:
-
- bar
- bar
- ``
-
-* `` has been added to the list of block-level elements and is now
- treated as an HTML block instead of being wrapped within paragraph tags.
-
-* Now only trim trailing newlines from code blocks, instead of trimming
- all trailing whitespace characters.
-
-* Fixed bug where this:
-
- [text](http://m.com "title" )
-
- wasn't working as expected, because the parser wasn't allowing for spaces
- before the closing paren.
-
-* Filthy hack to support markdown='1' in div tags.
-
-* _DoAutoLinks() now supports the 'dict://' URL scheme.
-
-* PHP- and ASP-style processor instructions are now protected as
- raw HTML blocks.
-
-
- <% ... %>
-
-* Experimental support for [this] as a synonym for [this][].
-
-* Fix for escaped backticks still triggering code spans:
-
- There are two raw backticks here: \` and here: \`, not a code span
-
-
-1.0.1oo (19 May 2006)
-
-* Converted PHP Markdown to a object-oriented design.
-
+1.0.1d (1 Dec 2006)
 
 1.0.1c (9 Dec 2005)
@@ -1654,7 +1588,7 @@ Copyright (c) 2004-2006 Michel Fortin
 All rights reserved.
-Copyright (c) 2003-2004 John Gruber
+Copyright (c) 2003-2006 John Gruber
 All rights reserved.
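
The `detab` hunk above replaces `strlen` with `mb_strlen($line, 'UTF-8')`, so that a multi-byte UTF-8 sequence counts as one column when tabs are expanded. A minimal sketch of that idea follows, assuming the mbstring extension is available. `detab_utf8` is a hypothetical standalone function written for illustration; it is not the parser's own method and only mirrors the arithmetic visible in the hunk.

```php
<?php
// Illustrative sketch only: detab_utf8() is a hypothetical helper,
// not the method shipped in markdown.php. Requires ext-mbstring.
function detab_utf8(string $text, int $tab_width = 4): string
{
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        $blocks = explode("\t", $line);
        // Text before the first tab is kept unchanged.
        $out = array_shift($blocks);
        foreach ($blocks as $block) {
            // Count characters rather than bytes, so a multi-byte
            // UTF-8 sequence advances the column by one.
            $amount = $tab_width - mb_strlen($out, 'UTF-8') % $tab_width;
            $out .= str_repeat(' ', $amount) . $block;
        }
        $lines[$i] = $out;
    }
    return implode("\n", $lines);
}
```

With a byte-based `strlen`, a line such as "é" followed by a tab would be padded as if it were two columns wide; counting characters pads it to the same tab stop as a one-letter ASCII line.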