People post a good bit of HTML in the comments of articles on this site. They are trying to demonstrate something, ask for troubleshooting help, show alternate techniques, etc. This is excellent. I want to encourage this as much as possible. Unfortunately people are often confused on how to do it correctly and get frustrated when it comes out wrong.
I have to post instructions in the comment area to teach people the best way for this site:
- You can use basic HTML
- When posting code, please turn all < characters into &lt;
- If the code is multi-line, use <pre><code></code></pre>
Ideally I’d like to get rid of all of those instructions completely, and have everything “just work”. Here are two changes that would get pretty close to ideal for comments on this site:
1. Any HTML that isn’t one of the “allowed tags” gets escaped.
WordPress has this default set of allowed tags:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>
If someone uses these tags in a comment, they will apply to that text and render appropriately. If they use any other tag, it should escape, not be stripped or inappropriately render. So:
I’d also say that if it’s a single line of code as in this example (does not contain line breaks) it should wrap the newly escaped code in <code></code> tags.
Also, if the code is already escaped like &lt;span&gt; then leave it alone.
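Here's a rough sketch of that first rule, in Python rather than WordPress's PHP just to keep it self-contained. The names `ALLOWED_TAGS`, `TAG_RE`, and `escape_disallowed_tags` are made up for illustration, and the regex is a simplification (it doesn't handle a `>` inside an attribute value). It only does the escaping part; the auto-wrapping in `<code>` tags is left out.

```python
import re

# Assumption: the allow-list mirrors the WordPress default set quoted above.
ALLOWED_TAGS = {
    "a", "abbr", "acronym", "b", "blockquote", "cite", "code",
    "del", "em", "i", "q", "strike", "strong",
}

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*>")


def escape_disallowed_tags(comment: str) -> str:
    """Escape tags that are not on the allow-list; keep everything else as-is."""

    def replace(match: re.Match) -> str:
        if match.group(1).lower() in ALLOWED_TAGS:
            # Allowed tag: let it render normally.
            return match.group(0)
        # Any other tag: escape it instead of stripping it.
        return match.group(0).replace("<", "&lt;").replace(">", "&gt;")

    return TAG_RE.sub(replace, comment)


print(escape_disallowed_tags("<b>bold</b> and <span>x</span>"))
# prints: <b>bold</b> and &lt;span&gt;x&lt;/span&gt;
```

Because the pattern only matches a literal `<`, code that is already escaped (like `&lt;span&gt;`) passes through untouched.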
2. Make sure multi-line code is wrapped in <pre> tags
If someone puts out multiple lines of code (either they put it in <code> tags or it’s not allowed tags so it auto-escapes and auto-code-wraps it), that multi-line code should be wrapped in <pre> tags.
This should also strip whitespace from the beginning and end of the code block, so no extra spaces get rendered. This:
<pre><code> <ul> <li>
Should turn into:
This would be whether the commenter uses the tags themselves or they are auto-generated based on the above rules.
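A minimal sketch of the second rule, again in Python and again with made-up names (`CODE_BLOCK_RE`, `wrap_multiline_code`). It assumes the code has already been escaped and sits inside `<code>` tags, with or without a `<pre>` around it.

```python
import re

# Either an existing <pre><code>…</code></pre> block (group 1)
# or a bare <code>…</code> span (group 2). DOTALL lets "." span newlines.
CODE_BLOCK_RE = re.compile(
    r"<pre>\s*<code>(.*?)</code>\s*</pre>|<code>(.*?)</code>",
    re.DOTALL,
)


def wrap_multiline_code(comment: str) -> str:
    """Trim code blocks and make sure multi-line ones are wrapped in <pre>."""

    def replace(match: re.Match) -> str:
        in_pre = match.group(1) is not None
        body = match.group(1) if in_pre else match.group(2)
        # Strip whitespace from the beginning and end of the block.
        body = body.strip()
        if in_pre or "\n" in body:
            return f"<pre><code>{body}</code></pre>"
        return f"<code>{body}</code>"

    return CODE_BLOCK_RE.sub(replace, comment)


print(wrap_multiline_code("<code>\n&lt;ul&gt;\n  &lt;li&gt;\n</code>"))
# prints:
# <pre><code>&lt;ul&gt;
#   &lt;li&gt;</code></pre>
```

One caveat: a plain `strip()` also removes indentation from the first line of the block, so a real version might only trim blank lines.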
I’m specifically thinking of WordPress here because that’s what I always use, but I can imagine this being useful in any commenting environment that allows some-but-not-all HTML tags and where it’s reasonable that the discussion may involve nerds discussing HTML. Hey, maybe this should be a WordPress plugin, eh? WINK WINK.
Thought about this?
You don’t even need the
Clever thinkin =) But I’d rather handle this from the backend than the frontend ideally. For example, this won’t work if the tag gets stripped out completely as a non-allowed tag.
I like the idea, but you should escape it when you write it to the database! I remember the day when YouTube got “hacked” by the HTML if statement ;)
Honestly, more than anything I would just prefer to have the ability to edit my comment and correct any mistakes that I may have made. That said, I’m a huge fan of markdown, and love being able to use characters like backticks to call out a piece of code (usually a variable, class, or html tag) inline as well as for multi-line code blocks.
So… Chris… when can we expect to see this implemented on this site?? (This new styling is great, by the way.)
I’ve also been thinking about.. when you paste, can you see that there is multiline paste content with extra whitespace and automatically treat that as a code block?
This is why I think every web development blog/site should just use Markdown. Backticks create an inline code span (<code>), four-space indents create a code block (a <pre><code>), and everything in between two lines with three backticks becomes a code block as well (in case you don’t want to indent every line in the code block with four spaces).
Yeah I like markdown too but I don’t want to rely on it. I think that makes the problem worse. Now there is yet another thing that someone needs to “learn” to comment properly. I want it to just work without thinking too hard about it. But extra-ideally, it would also support Markdown.
I think there should be an option to switch a comment form into Markdown mode. That way you’re not forcing it on anyone while letting the big boys play with code.
Shouldn’t there be some basic generators on the net to do this? Just throw in the HTML you want to display as code and get it back? Not the plugin stuff you can use to highlight, but just some generator.
Anyways, thanks for the insight on this. Sometimes it can be really frustrating to show code :-).
Classic example just happened on another post just minutes ago: http://cl.ly/9QJ2
And again… http://cl.ly/9QD4
This is a legit problem round these parts. Allowing editing is probably a good first step.
The email address isn’t very well hidden.
You’re looking at the ‘preprocess_comment’ filter there. Perhaps you could first compare all instances of characters surrounded by < and > against the $allowedtags array, replacing those that don’t match with &lt; and &gt;.
The second stage would probably be parsing the resultant text for two or more instances of &lt; and &gt; separated by one or more newline characters.
I’d have a stab at writing something here, but it’s late enough for me to be hard of thinking and I’m not sure how WP would mangle it. Love your comment preview, btw.
Here’s what I’ve got for the first problem.
One problem though, allowed tags inside escaped tags are not escaped. The RegEx should be fixed, and I don’t know how! :)
Also note that I’m using the get_comment_text filter, which is non-destructive and doesn’t change the data in the database.
Hey it just destroyed my multi-line code text!
I have been guilty of this, and have written some comments on this blog that have had to be edited (apologies for that).
When I started writing articles myself though, I decided to make an html entity encoder which can be used to copy/paste or write code into (there are many already, but obviously this is my favourite way to do it).
Also, this comment preview is a great idea!
I prefer to embed (or link to) GitHub gists. Maybe you could parse out those links (and other popular ones) and convert them to embedded blocks. You see this on Twitter’s website.
Ooh, I like your new comment preview. I had tried at one point to find a suitable comment preview script/plugin for my site, but couldn’t find one. I tested out the one that James Padolsey uses, but I found it had a few problems. Can’t even remember what problems. Something related to multi-line code blocks I believe.
Are you using this plugin?
If so, I may try that out, thanks!
I found the following ID in the markup: div-comment-lcp. The last part (lcp) might be a shorthand for Live Comment Preview, which is the plugin you mentioned.
Part of the solution I’m using on my forum is real-time previews, as I see you have here. That more than anything else helps put my mind at ease when making posts/comments, and removes a lot of the need for an Edit function.
I’m also using Markdown, or rather a client-side version called Showdown (the original page for which seems to have disappeared). The upshot of this is that forum posts get stored on the server exactly as the user entered them, and presentational transformations to HTML are done client-side.
Obviously this won’t work with JS disabled, but since my forum relies so fundamentally on JS to work anyway, I didn’t consider this a big deal. Should I make a non-JS version it could just present the Markdown-encoded posts as-is, since Markdown is pretty unobtrusive.
If it’s not already in place I’d be using some server-side code to sanitize the data before saving it to the server.
I encountered this dilemma myself not too long ago. I decided to try my hand at writing my own website, from scratch, partly as a learning tool and partly as it would allow me to be very specific about what people can and can’t do.
In the end, I found the easiest solution was to escape all HTML, and allow the use of BBCode for simple stuff like text formatting, inserting code blocks, etc.
For me, the logic involved in working out what HTML should be escaped and what should be kept was too difficult, especially when it comes to inserting code sections and stuff in the middle of a comment.
Oh, and of course it doesn’t require a user (or myself) to do any escaping manually, as there is a clear distinction between the stuff the site must escape (HTML) and the stuff it should keep and subsequently convert to HTML (BBCode).
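For what it’s worth, the approach I’m describing can be sketched in a few lines of Python. This is not my actual site code; the tag set (`[b]`, `[i]`, `[code]`) and the names `SIMPLE_TAGS` and `render_comment` are assumptions for illustration.

```python
import html
import re

# Hypothetical, deliberately tiny BBCode set: BBCode name -> HTML tag.
SIMPLE_TAGS = {"b": "strong", "i": "em"}

FLAGS = re.DOTALL | re.IGNORECASE


def render_comment(raw: str) -> str:
    """Escape every bit of HTML first, then turn BBCode into HTML."""
    # Step 1: nothing the user typed survives as live HTML.
    text = html.escape(raw, quote=False)

    # Step 2: [code]…[/code] becomes a trimmed <pre><code> block.
    text = re.sub(
        r"\[code\](.*?)\[/code\]",
        lambda m: "<pre><code>" + m.group(1).strip() + "</code></pre>",
        text,
        flags=FLAGS,
    )

    # Step 3: simple formatting tags map one-to-one onto HTML tags.
    for bb, tag in SIMPLE_TAGS.items():
        text = re.sub(
            rf"\[{bb}\](.*?)\[/{bb}\]", rf"<{tag}>\1</{tag}>", text, flags=FLAGS
        )
    return text


print(render_comment("[b]Use[/b] <div> here"))
# prints: <strong>Use</strong> &lt;div&gt; here
```

A limitation of this sketch: BBCode inside a `[code]` block still gets converted in step 3, so a fuller version would need to protect code blocks first.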
Nice article about the commenting. I think the content will help me to comment more in the future. Thanx!
Excellent resource… This was a help to me.
The latest version of WordPress is doing this already (without the need to convert the less-than sign), but how difficult would it be to give your commentators a simple editor?
Excellent write up as usual…
Thanks for sharing this information..
I really like the way that stackoverflow handles how you input code.
I like the technique used in WordPress.org’s forums where back ticks (`) are used to surround code, be they inline or multiline.
A filter & regular expression can be used to change < to &lt;, etc. before allowed tags are checked.
It still requires instructions but they’re simplified to cover both use cases.
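Roughly what I have in mind, sketched in Python rather than as a real WordPress filter. The names `BACKTICK_RE` and `convert_backticks` are invented for this example, and I’m assuming single backticks for both the inline and the multi-line case.

```python
import html
import re

# Anything between a pair of backticks, newlines included.
BACKTICK_RE = re.compile(r"`([^`]+)`")


def convert_backticks(comment: str) -> str:
    """Escape backtick-delimited code and wrap it before the allowed-tags check."""

    def replace(match: re.Match) -> str:
        # Turn <, > and & into entities so the tag filter never sees them as HTML.
        body = html.escape(match.group(1), quote=False)
        if "\n" in body:
            # Multi-line code: drop the surrounding blank lines and use <pre>.
            return "<pre><code>" + body.strip("\n") + "</code></pre>"
        return "<code>" + body + "</code>"

    return BACKTICK_RE.sub(replace, comment)


print(convert_backticks("Use `<span>` here"))
# prints: Use <code>&lt;span&gt;</code> here
```

Text outside the backticks is left alone, so the usual allowed-tags check can still run on it afterwards.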
Knowing what method to use is the problem I have all over the web. Without an ugly kitchen-sink-type toolbar, I never know where I can use Markdown or what HTML can actually be used, etc…
Chris, it might be cool to add a tooltip-type thing to the “You can use basic HTML” note explaining what basic HTML is.
Oh, and Ajax edit comments is a cool plugin…
Definitely +1 for being able to edit my comments.
Lol, I’m so happy Chris made this post. I embarrassed myself the other day with a very half-assed comment when I tried to demonstrate something via HTML and my code came out all botched up. Not even an alien would have been able to figure out my message. Needless to say, I was pretty pissed off about that lol.
Allow me to edit my comment, kthxbye. ;)