Allow editors to use HTML attributes by default
James Williams
Drupal provides an excellent sanitisation system to filter the HTML content that editors might create. Think of it like a series of traffic cops that filter different vehicles into different lanes. Some content is allowed through to its destination, some has to be transformed along the way, and some is simply blocked from displaying. Administrators can use the 'Limit allowed HTML tags and correct faulty HTML' option to configure which HTML elements and attributes editors can use. This helps protect a site against nefarious HTML - whether it might be malicious or just ugly. Under Drupal 9, editors can't use attributes on their HTML unless an admin explicitly allows them to. But on Drupal 7 sites, it was the other way around: attributes were allowed by default, until an admin restricted them. So Drupal 9 might have tightened up this security measure, but that can be a problem when you're migrating a site from Drupal 7 to 9.
We at ComputerMinds have spent a huge amount of 2022 on Drupal 9 upgrade projects. Our content migrations are always tailored to what our clients need going forward, while also taking into account how they have used Drupal to write their content in the past. In many cases, editors were in the habit of liberally spreading HTML attributes around their content across all sorts of different elements. Sometimes this was to achieve specific designs, sometimes it wasn't really intended, and sometimes it was to embed specific forms of external content. (YouTube iframes, I'm looking at you!) Here are three examples of HTML elements (tags), each with attributes that Drupal 9 would commonly strip out, breaking their intended functionality and/or appearance:
<a href="/contact" class="btn" target="_blank" data-track-as="button--contact">Contact us</a>
<iframe loading="lazy" width="100%" src="https://www.youtube.com/embed/..." frameborder="0" allowfullscreen></iframe>
<div class="col span_3_of_12">content here...</div>
When editors expect to be able to do this, and we trust them sufficiently, I think it's fair to allow them to continue using HTML attributes on their shiny new Drupal 9 sites in the same way that they did on their legacy Drupal 7 sites. We trusted our editors sufficiently before, so why not now? I suggest keeping Drupal 9's default HTML filtering behaviour for new projects, to set expectations that customising HTML attributes isn't recommended. But when editors are used to doing so - and their content relies on that ability - a solution is needed. We still want to restrict which HTML elements editors can use because, at the very least, we don't want them adding <style> and <script> tags. But we don't mind them setting various attributes on elements. Auditing their content to see which attributes have been used on which elements isn't really viable, as that could involve parsing thousands (maybe even millions?) of fields of HTML.
The solution I chose was to build a plugin for Drupal 9 that simply copies its ordinary 'Limit allowed HTML tags and correct faulty HTML' filter, to allow most attributes by default, instead of blocking them by default. It's as simple as adding this single file to a custom module:
<?php

namespace Drupal\MYMODULE\Plugin\Filter;

use Drupal\filter\Plugin\Filter\FilterHtml;

/**
 * Provides a filter to limit allowed HTML tags.
 *
 * The attributes in the annotation show examples of allowing all attributes
 * by only having the attribute name, or allowing a fixed list of values, or
 * allowing a value with a wildcard prefix.
 *
 * @Filter(
 *   id = "MYMODULE_filter_html_allowing_most_attributes",
 *   title = @Translation("Limit allowed HTML tags, allowing most attributes, and correct faulty HTML"),
 *   description = @Translation("This replicates the legacy behaviour from Drupal 7 that allowed use of attributes that weren't internally explicitly limited."),
 *   type = Drupal\filter\Plugin\FilterInterface::TYPE_HTML_RESTRICTOR,
 *   settings = {
 *     "allowed_html" = "<a> <em> <strong> <cite> <blockquote> <code> <ul> <ol> <li> <dl> <dt> <dd> <h2> <h3> <h4> <h5> <h6>",
 *     "filter_html_help" = TRUE,
 *     "filter_html_nofollow" = FALSE
 *   },
 *   weight = -10
 * )
 */
class FilterHtmlAllowingMostAttributes extends FilterHtml {

  /**
   * {@inheritDoc}
   */
  protected function findAllowedValue(array $allowed, $name) {
    if (isset($allowed['exact'][$name])) {
      return $allowed['exact'][$name];
    }
    // Handle prefix (wildcard) matches.
    foreach ($allowed['prefix'] as $prefix => $value) {
      if (strpos($name, $prefix) === 0) {
        return $value;
      }
    }
    // When an attribute has not been specifically allowed or disallowed, allow
    // it.
    return TRUE;
  }

}
This should be saved as FilterHtmlAllowingMostAttributes.php in a src/Plugin/Filter directory within a custom module (the directory and file names are case-sensitive, matching the namespace and class name), and 'MYMODULE' in the code should be replaced with your module machine name. Then simply configure your text formats that use the 'Limit allowed HTML tags and correct faulty HTML' filter to use this 'Limit allowed HTML tags, allowing most attributes, and correct faulty HTML' one instead.
To explain this code: the class is just an extension of Drupal core's 'Limit allowed HTML tags and correct faulty HTML' filter class (FilterHtml). It overrides just the part that blocks attributes by default, to instead allow them by default. It still filters elements, and it still restricts an attribute's values wherever the allowed HTML setting lists specific values for that attribute. Even with this, Drupal is still smart enough to block style and on* attributes on all elements by default. So your editors can rest easy that their content from Drupal 7 will still come out the same in Drupal 9, without allowing the most obvious security or design problems through.
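If you want to see the effect of that one changed return value without installing anything, here is a minimal standalone sketch of the same lookup in plain PHP. It is not Drupal's code: the find_allowed_value() function, its $default parameter and the sample $allowed array are all hypothetical, made up purely for illustration. Passing FALSE stands in for the 'block by default' behaviour described above, and passing TRUE stands in for the override.

```php
<?php

/**
 * Standalone copy of the lookup logic, for experimenting outside Drupal.
 *
 * Hypothetical helper: $default is what gets returned for an attribute that
 * matches neither an exact name nor a wildcard prefix.
 */
function find_allowed_value(array $allowed, string $name, bool $default) {
  // Exact attribute names win first.
  if (isset($allowed['exact'][$name])) {
    return $allowed['exact'][$name];
  }
  // Then wildcard prefixes, such as "data-" for "data-*".
  foreach ($allowed['prefix'] as $prefix => $value) {
    if (strpos($name, $prefix) === 0) {
      return $value;
    }
  }
  // Nothing matched: fall back to the chosen default.
  return $default;
}

// Made-up sample rules: TRUE means "any value", an array means "only these
// values".
$allowed = [
  'exact' => [
    'href' => TRUE,
    'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
  ],
  'prefix' => [
    'data-' => TRUE,
  ],
];

foreach (['href', 'dir', 'data-track-as', 'class', 'target'] as $name) {
  $blocking = find_allowed_value($allowed, $name, FALSE);
  $permissive = find_allowed_value($allowed, $name, TRUE);
  printf(
    "%s: block-by-default=%s allow-by-default=%s\n",
    $name,
    json_encode($blocking),
    json_encode($permissive)
  );
}
```

Attributes that match a rule come back the same either way, including the restricted value list for dir. Only the unmatched ones (class and target in this sample) flip from blocked to allowed.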
Normally I like to contribute code back to the community as modules on drupal.org, but as this one is just a single file, and means overriding security defaults, I'm just posting it here as a more cautious option. Still, feel free to use it, as long as you understand what you're doing! Hopefully, I've helped you do exactly that - and now your editors can enjoy their content on their sparkly new Drupal 9 site.
If you'd like help with bespoke needs for your Drupal migration project, get in touch with us.