One of the Drupal 7 sites I maintain is www.powerpoetry.org, a platform for publishing poems and commenting on them.
Recently, we heard from an irate poet. When readers browsed to one of her poems, the page would partially load, then automatically redirect to a movie streaming site that had nothing to do with her poem or with the site in general. Her poem page had somehow been hijacked, and readers could never read her work.
It turned out that all site users, even Anonymous ones, had permissions for the Full HTML Text Format. This allowed comments to be submitted that contained malicious code to redirect the page.
So far, I have found two different techniques that were used to implement redirection.
(1) Use of the <meta> tag with an attribute of http-equiv="refresh". For example, a content attribute of the form content="2; url=..." names the redirect target, where the value of 2 means a delay of two seconds before the refresh.
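(2) Use of the onmouseover event attribute, which runs JavaScript as soon as the pointer moves over the element that carries it.
As documented, this attribute and closely related ones are valid for almost all HTML tags, which makes them a potentially big problem wherever markup is allowed.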
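To make the two techniques concrete, here is a rough Python sketch that scans a comment body for a meta refresh tag and for inline event-handler attributes such as onmouseover. It is only an illustration, not part of Drupal: the names RedirectScanner and scan_comment are made up for this example, and treating every attribute that starts with "on" as an event handler is a simplifying assumption.

```python
from html.parser import HTMLParser


class RedirectScanner(HTMLParser):
    """Collects markup in a comment body that can redirect or run script."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr_map = {name: (value or "") for name, value in attrs}

        # Technique 1: <meta http-equiv="refresh" content="2; url=...">
        if tag == "meta" and attr_map.get("http-equiv", "").lower() == "refresh":
            self.findings.append(("meta-refresh", attr_map.get("content", "")))

        # Technique 2: inline event handlers (onmouseover, onclick, onload, ...).
        # Assumption: any attribute whose name starts with "on" is suspicious.
        for name in attr_map:
            if name.startswith("on"):
                self.findings.append(("event-attribute", name))


def scan_comment(body):
    """Return a list of (kind, detail) tuples for risky markup in body."""
    scanner = RedirectScanner()
    scanner.feed(body)
    scanner.close()
    return scanner.findings


if __name__ == "__main__":
    sample = '<p onmouseover="go()">Great poem</p>'  # hypothetical comment text
    print(scan_comment(sample))
```

A parser-based check like this is less likely than a plain substring search to flag a comment that merely mentions the word onmouseover in its text.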
If you have good knowledge of HTML, these techniques may be obvious, but for me as a back-end developer they were new.
Because I have direct access to the Drupal database, I was able to use a SQL query such as the following to find malicious comments.
select entity_id, comment_body_value
from field_data_comment_body
where comment_body_value like '%onmouseover%';
- - - - -
To help plug these security holes, we decided to allow only Administrators to have access to all Text Formats. Depending on the needs of your site, you might want to disable the Full HTML format entirely.
For all other users, the Filtered HTML Text Format was permitted, configured using Limit Allowed HTML Tags so that only the following tags are processed.
<p> <br> <em> <strong> <cite> <blockquote> <ul> <ol> <li>
Although not part of this security problem, we purposely disallowed <a> tags, deciding that they are not essential for comments on our site and almost always only appear in spam comments. As part of eliminating links, we also turned off Convert URLs Into Links.
Note that the user can still type in whatever they want, including any kind of HTML. The original text is stored unchanged as the content of the comment.
However, what the Filtered HTML format does is effectively strip out disallowed tags as part of the rendering process.
- - - - -
Even with Filtered HTML, I was concerned that the onmouseover attribute could still be used inside one of the tags that is allowed. What about something like a <p> tag that carries an onmouseover attribute?
But some testing and some stepping through core code revealed that onmouseover as well as other risky attributes are stripped out of those tags for output even though the tags themselves are processed.
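The behaviour described above (the stored text stays untouched, while disallowed tags and risky attributes disappear at render time) can be sketched in a few lines of Python. This is not Drupal's filter code, just a minimal illustration under stated assumptions: AllowListFilter and render_filtered are hypothetical names, and the sketch drops every attribute on allowed tags, which is stricter than what core does.

```python
from html import escape
from html.parser import HTMLParser

# The tag list configured for Filtered HTML in this post.
ALLOWED_TAGS = {"p", "br", "em", "strong", "cite", "blockquote", "ul", "ol", "li"}


class AllowListFilter(HTMLParser):
    """Rebuilds markup, keeping only allowed tags and none of their attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes (onmouseover, style, ...) are discarded entirely here.
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        # <br> is a void element, so it never gets a closing tag.
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it cannot be reinterpreted as markup.
        self.parts.append(escape(data, quote=False))


def render_filtered(stored_body):
    """Return the filtered output; the stored text itself is not modified."""
    markup_filter = AllowListFilter()
    markup_filter.feed(stored_body)
    markup_filter.close()
    return "".join(markup_filter.parts)


if __name__ == "__main__":
    stored = '<p onmouseover="go()">Nice poem</p>'  # hypothetical stored comment
    print(render_filtered(stored))
```

Filtering on output rather than on input means a later change to the allowed tag list applies to old comments too, as long as they are rendered with that format.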
The holes were plugged for these two exploits.
- - - - -
The final thing left to do was to go back and recheck the malicious comments, which I had left in the database while carrying out the above steps. I thought that applying Filtered HTML would neutralize them. I was wrong.
In the database table field_data_comment_body there is a column comment_body_format that stores the format in effect when the comment was created. The format apparently is used to render the comment even if it is no longer allowed.
As a result, those comments were still redirecting!
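Since the format is stored per comment, it can help to list every comment still tied to the old format. The Python sketch below runs that kind of query against an in-memory SQLite stand-in for field_data_comment_body. The table and column names come from this post, but the sample rows, the helper name comments_with_format, and the machine names full_html and filtered_html are assumptions for illustration; check the values actually stored on your own site.

```python
import sqlite3

# In-memory stand-in for the real Drupal table (only the columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table field_data_comment_body ("
    " entity_id integer, comment_body_value text, comment_body_format text)"
)
conn.executemany(
    "insert into field_data_comment_body values (?, ?, ?)",
    [
        # Hypothetical sample rows.
        (1, "<p>Lovely imagery.</p>", "filtered_html"),
        (2, '<p onmouseover="go()">Great poem</p>', "full_html"),
        (3, "<p>An old but harmless comment.</p>", "full_html"),
    ],
)


def comments_with_format(connection, format_name):
    """Return entity ids of comments stored with the given text format."""
    rows = connection.execute(
        "select entity_id from field_data_comment_body"
        " where comment_body_format = ? order by entity_id",
        (format_name,),
    )
    return [row[0] for row in rows]


# Comments that would still be rendered with the permissive format.
legacy_ids = comments_with_format(conn, "full_html")
print(legacy_ids)
```

Not every comment found this way is malicious, so the list is a starting point for review rather than a delete list.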
So I used SQL queries to find their entity ids, then deleted them by browsing to, for example, one of the following paths, where 30259 is the entity id of a comment.
comment/30259/edit
comment/30259/delete
The latter path goes directly to the delete confirmation dialog. In some cases I had to use it because loading the edit form for the comment triggered the redirection, so I couldn't even edit it.
- - - - -
Michael Richardson at Pantheon support did an awesome job of doing the initial troubleshooting when I was at a loss as to where to start. It was he who uncovered the use of onmouseover on our site. Thanks, Michael!
What is the Meta Refresh Tag?
HTML onmouseover Event Attribute
Drupal Text Formats and Filters Tutorial
Text Filters and Input Formats