Unbekannter Autor ;)

Programming & Webdesign Resources

No-MVC Zend Framework: Filter Zend_Form input with HtmlPurifier

As my favorite Zend Framework guru Padraic Brady pointed out on his blog, most forms are practically an invitation for hackers and other subversive folks to (ab)use them, and PHP's “addslashes” or “strip_tags” just don't get the job of protecting your site done. It is one thing to assume everybody is nice and submits exactly what you want them to, but expect otherwise, and be prepared… just in case. That's where Zend_Filter comes in, and even those filters are not to be trusted entirely, as tests have shown.

To prevent cross-site scripting, better known as XSS, Padraic recommends HtmlPurifier as the preferred method of filtering user input on web forms, so I'd suggest we comply ;) Get the newest version of HtmlPurifier, and unzip the contents into a subdirectory “HTMLPurifier” of your “library/JD/” folder (or whatever your personal folder is called). Make sure you get the upper case characters right; *nix systems are not as forgiving as Windows ;) In the following code, I used upper case HTMLP* for original files, and HtmlP* for those I wrote, so don't get mixed up here.

Next, create “HtmlPurifier.php” in a subfolder “Filter” (according to “addPrefixPath” in your form controller settings). I chose not to make “Filter” a subfolder of “Forms” (unlike “Decorator” and “Validator”), because I might use the filter to clean output from databases or other sources, instead of only input from forms. Put the following code in this file:

class JD_Filter_HtmlPurifier implements Zend_Filter_Interface
{
 /**
 * The HTMLPurifier instance
 *
 * @var HTMLPurifier
 */
 protected $_htmlPurifier;

 /**
 * Constructor
 *
 * @param mixed $config
 * @return void
 */
 public function __construct($config = null)
 {
   require_once 'JD/HTMLPurifier/HTMLPurifier.auto.php';
   $this->_htmlPurifier = new HTMLPurifier($config);
 }

 /**
 * Defined by Zend_Filter_Interface
 *
 * Returns the string $value, purified by HTMLPurifier
 *
 * @param string $value
 * @return string
 */
 public function filter($value)
 {
   return $this->_htmlPurifier->purify($value);
 }
}

The $config will hold the tags allowed for this specific instance (please refer to HtmlPurifier's docs for details). Those values will come from the general config file, so edit your “/configs/application.ini”, and add the following lines to your “production” section:

; Editor and Purifier
allowedHTML.Restrictive = "sup"
allowedHTML.Minimal = "p,em,strong"
allowedHTML.Standard = "p,em,strong,ul,ol,li"
allowedHTML.Extended = "p,em,strong,ul,ol,li,sub,sup,a[href]"

If you ever implement a WYSIWYG editor like CKEditor, you can re-use this part of your “application.ini” to set the toolbars allowed for the editor. As for the allowed tags, please adjust as you see fit. “Restrictive” only allows a tag I wouldn't expect in any user input, so I will use that one to filter names and mail addresses.
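To make the mapping from these INI keys to the filter's $config a bit more tangible, here is a minimal sketch in plain PHP. It uses parse_ini_string() instead of Zend_Config_Ini so it runs without the framework, which means the keys stay flat (“allowedHTML.Minimal”) rather than becoming nested objects. The helper jd_purifier_options() is a hypothetical name of my own, not part of ZF or HtmlPurifier.

```php
<?php
// Same keys as in application.ini, wrapped in a [production] section for this demo.
$ini = <<<'INI'
[production]
allowedHTML.Restrictive = "sup"
allowedHTML.Minimal = "p,em,strong"
allowedHTML.Standard = "p,em,strong,ul,ol,li"
allowedHTML.Extended = "p,em,strong,ul,ol,li,sub,sup,a[href]"
INI;

// Second argument true: keep the section names as the first array level.
$settings = parse_ini_string($ini, true);
$production = $settings['production'];

/**
 * Hypothetical helper: builds the option array for one whitelist level.
 *
 * @param array  $section flat key/value pairs of one INI section
 * @param string $level   "Restrictive", "Minimal", "Standard" or "Extended"
 * @return array array('HTML.Allowed' => '<tag list>')
 */
function jd_purifier_options(array $section, $level)
{
    $key = 'allowedHTML.' . $level;
    if (!isset($section[$key])) {
        throw new InvalidArgumentException('Unknown whitelist level: ' . $level);
    }
    return array('HTML.Allowed' => $section[$key]);
}

$options = jd_purifier_options($production, 'Minimal');
$extended = jd_purifier_options($production, 'Extended');
print_r($options);
```

The resulting array has the same shape as the one the filter's constructor hands to HtmlPurifier, so switching a form element from “Minimal” to “Standard” only means picking a different key.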

If you're getting impatient about how to use this, add the following line to a text element of your Zend_Form:

$form_element["comment_text"]->addFilter('HtmlPurifier', array(array('HTML.Allowed' => $this->registry->config->allowedHTML->Minimal)));

For now, this only filters the comment text element (which is a textarea, BTW). The smart thing to do would be to check whether the input contains XSS, and return an error in the form of “the finger” :D. Got to figure out how to do it sometime.
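One crude way to approach that check, sketched here under a few assumptions: run the input through the purifier and see whether anything had to be changed. The function jd_input_was_altered() is a hypothetical name, and the strip_tags() closure is only a stand-in so the snippet runs without any library; it is not a safe replacement for HtmlPurifier.

```php
<?php
/**
 * Hypothetical helper: reports whether a purifier had to change the input.
 *
 * @param callable $purify anything callable that takes and returns a string
 * @param string   $input  raw user input
 * @return bool true if the purified text differs from the input
 */
function jd_input_was_altered($purify, $input)
{
    $clean = call_user_func($purify, $input);
    // Ignore leading/trailing whitespace so it does not count as a change.
    return trim($clean) !== trim($input);
}

// Stand-in for this demo ONLY: strip_tags() is not sufficient protection.
$standIn = function ($html) {
    return strip_tags($html, '<p><em><strong>');
};

$suspicious = jd_input_was_altered($standIn, '<p>Hi</p><script>alert(1)</script>');
$harmless   = jd_input_was_altered($standIn, '<p>Hi <em>there</em></p>');
var_dump($suspicious, $harmless);
```

In the real form, the callable would be something like array(new JD_Filter_HtmlPurifier($config), 'filter'). Keep in mind that HtmlPurifier also tidies up harmless markup (unclosed tags, for instance), so a difference only means “something was changed”, not “this was an attack”.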

Anyway, you can download “Zend_Forms.zip”, containing the directory structure and files from this series, including the “bootstrap.php” in the “public” folder. Please note you need ZF in your “library/Zend/” and HtmlPurifier in your “JD/HTMLPurifier” folders!

The next article will show how to put all this stuff to use: a comment form. I'm already figuring out a smart way to do it without reloading the page, but still using the filters, so it looks like it's going to need some AJAX.
