In HTML (4, 5, whatever) quoting attribute values is optional, kinda. Both of these are totally fine:
<b class=boom>
<b class="boom">
But there are rules and limitations. Hopefully obviously, this is a problem:
<a title=Hi, mom! href=#>
That space in “Hi, mom!” is the problem. The browser will think the value of the title is “Hi,” and think “mom!” is an attribute unto itself. Any whitespace-like character will cause this problem. Mathias Bynens created a tool called the Mother-effin’ unquoted attribute value validator, a simple tool for evaluating a possible attribute value to see if it’s valid or not.
The problem characters are: spaces, tabs, line feeds, form feeds, carriage returns, ", ', `, =, <, or >.
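If you want to check a value yourself, the rule above is easy to turn into code. Here is a minimal sketch in Python; the name is_valid_unquoted is made up for illustration and this is not how Mathias’ validator is implemented. It also treats an empty value as invalid, since an unquoted value has to contain at least one character.

```python
# Characters the list above says cannot appear in an unquoted attribute value:
# space, tab, line feed, form feed, carriage return, ", ', `, =, < and >.
FORBIDDEN = set(" \t\n\f\r\"'`=<>")


def is_valid_unquoted(value: str) -> bool:
    """Hypothetical helper: True if `value` may be written without quotes."""
    # An empty value (as in rel=) cannot be written unquoted either.
    return value != "" and not any(ch in FORBIDDEN for ch in value)


for candidate in ["boom", "Hi, mom!", "a&b", "`", "{@#():,*!![[]]}"]:
    print(repr(candidate), is_valid_unquoted(candidate))
```

Note that this only checks validity against the character list. As the tests below show, “valid” and “works in every browser” are not quite the same thing.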
The Actual Problems
So those are the rules, but what happens if you actually break the rules? Using some of those characters in unquoted attributes can cause serious problems. Others do not. And some characters not referenced in those rules cause problems.
Let’s just look at a bunch of them.
<div rel></div>
<div rel=></div>
Not valid, but not a big deal in any browser. Just sees the rel attribute as present but empty.
<div rel==></div>
Not valid, but works in every browser. Rel attribute seen as “=”.
<div rel=&></div>
Rel value seen as “&” in all modern browsers including IE 9 and 10. IE 6, 7, and 8 see an empty value, except when accessed through CSS (e.g. content: attr(rel);), in which case the value is seen as “&”.
<div rel=&a></div>
Rel value seen as “&a” in all modern browsers including IE 9 and 10. IE 6, 7, and 8 see the value as “a”.
<div rel=a&b></div>
Rel value seen as “a&b” in all modern browsers including IE 9 and 10. IE 6, 7, and 8 see the value as “a”.
<div rel=a&amp;b></div>
In this example the ampersand is encoded. Modern browsers including IE 9 and 10 see the value as “a&b” (same as leaving it unencoded). IE 6, 7, and 8 also behave the same as the last test, truncating the value to only “a”.
Seems like IE 6, 7, and 8 don’t freak out about ampersands in attribute values, but they truncate the value at the first ampersand they see (unless the value starts with one, in which case they keep everything after it up to the next one).
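For comparison, here is a small Python sketch that runs the same four values through html.unescape, the standard library’s character reference decoder. It is not a browser and says nothing about the old IE truncation; it is only an approximation of the decoding step, and decode_attribute is a made-up name. For these inputs it agrees with the modern-browser results above: the encoded and unencoded ampersands end up as the same value.

```python
import html


def decode_attribute(raw: str) -> str:
    """Hypothetical helper: decode character references in a raw attribute value."""
    return html.unescape(raw)


# The four ampersand cases tested above.
for raw in ["&", "&a", "a&b", "a&amp;b"]:
    print(repr(raw), "->", repr(decode_attribute(raw)))
```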
<div rel=`></div>
All versions of IE (even 10) have problems here. They treat the backtick character as a quote, so by only using one of them, it’s like the entire rest of the document until the next quote is part of the value of the attribute. All other browsers have no problem and see the value as “`”.
<div rel=```></div>
This is similarly problematic: because there is an odd number of backticks, it leaves a set of quotes open in IE. All non-IE browsers see the value as “```”.
<div rel=``></div>
This is slightly less problematic because what IE sees as quotes are closed. Non-IE browsers see the value as “``”.
<div rel=<></div>
Modern browsers, including IE 9 and 10, see the rel attribute value as “<”. IE 6 and 7 see the value as blank, but no big disasters ensue. IE 8 sees the value as blank through JavaScript but sees the value as “<” through CSS (e.g. content: attr(rel);).
<div rel=>></div>
Here we’re attempting to set the value to “>”, but what’s actually happening is that we’re creating a div with an actual “>” inside it. Another way to see that is:
<div rel=>
>
</div>
The rel attribute, in every browser, will be blank.
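Why blank? An unquoted value simply runs until the next whitespace character or “>”, and here the very first character after the equals sign is a “>”. The Python sketch below is a deliberately simplified model of that one step, not a real HTML parser; read_unquoted_value is a hypothetical name.

```python
WHITESPACE = " \t\n\f\r"


def read_unquoted_value(markup: str, start: int) -> tuple:
    """Read an unquoted value beginning at index `start`.

    Returns (value, index where reading stopped). Simplified model only.
    """
    end = start
    # The value ends at the first whitespace character or ">".
    while end < len(markup) and markup[end] not in WHITESPACE and markup[end] != ">":
        end += 1
    return markup[start:end], end


markup = "<div rel=>></div>"
value, stop = read_unquoted_value(markup, markup.index("=") + 1)
# The ">" at `stop` closes the tag; whatever follows it is the div's content.
text = markup[stop + 1 : markup.index("</div>")]
print(repr(value), repr(text))
```

The same function explains the “Hi, mom!” example at the top: reading stops at the space, so only “Hi,” ends up in the title.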
<div rel={@#():,*!![[]]}></div>
This looks weird, but none of these characters are problematic and no browser has any problem with that.
<div rel='></div>
<div rel="></div>
This is problematic in any browser (a single quote, or any odd number of them). It’s similar to the backtick situation, only all browsers understand quotes as quotes. This is world-crashing-down problematic, as the entire rest of the document until the next quote is seen as the value of that attribute.
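To see how much gets swallowed, here is a rough Python sketch of reading a quoted value: once a quote opens, everything up to the next matching quote belongs to the value, markup included. The function name and the sample document are made up for illustration, and real parsers do more than this.

```python
def read_quoted_value(markup: str, start: int):
    """`start` is the index of the opening quote character.

    Returns the value up to the next matching quote, or None if the
    quote is never closed. Simplified model, not a real HTML parser.
    """
    quote = markup[start]
    end = markup.find(quote, start + 1)
    if end == -1:
        return None  # no closing quote anywhere in the rest of the document
    return markup[start + 1 : end]


# Hypothetical document: the stray quote after rel= is only "closed"
# by the opening quote of the next element's class attribute.
doc = '<div rel="></div>\n<p class="intro">Hello</p>'
print(repr(read_quoted_value(doc, doc.index('"'))))
```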
In this article “modern browsers” generally refers to Safari 5.0, Firefox 4, Chrome 12, and Opera 11. IE versions are generally listed by number. “All browsers” refers to what I listed for modern browsers plus IE 6-10. Some of what I used for testing is at this Pen, which can be easily altered as needed for more tests. No JS libraries were used.
Also see Mathias’ in-depth article on the subject, which covers using them in CSS as well.
The moral of the story: if any of this is confusing to you, just quote all your attributes.
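If your markup is generated by code, “always quote” can be automated. A minimal Python sketch, assuming you build attribute strings by hand: quoted_attribute is a hypothetical helper, html.escape comes from the standard library, and the attribute name is assumed to be trusted and is not checked.

```python
import html


def quoted_attribute(name: str, value: str) -> str:
    """Hypothetical helper: serialize one attribute, always double-quoted."""
    # html.escape with quote=True escapes &, <, > and both kinds of quotes,
    # so the value cannot close the surrounding double quotes early.
    return f'{name}="{html.escape(value, quote=True)}"'


print(quoted_attribute("title", "Hi, mom!"))
print(quoted_attribute("rel", "a&b"))
```

With the value quoted and escaped, none of the characters from the tests above can break out of the attribute.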
I find it best to just quote all attributes all of the time XML stylee. XHTML had a lot of things right and it’s a shame that more didn’t get carried over to HTML5, such as having to close all tags, be it self closing or open and close tags.
Agree 100%. I just keep coding that way anyway in HTML5, i.e. I close all tags and always use quotes for attributes.
BTW, I always use “double quotes” for HTML and ‘single quotes’ for JavaScript.
i totally agree and as much as i enjoy being kind of loose the cleanliness of xhtml tag closing and such makes me feel really efficient and secure
It’s laziness more than style. Discipline yourself and quote your attributes. It will make you :)
Absolutely agree with Jamie, Carlos, Thomas and Andy!
Why would someone who has done XHTML work ever choose to leave attributes unquoted or tags unclosed?
In fact, I just don’t understand the importance of this post. These problems are always taken care of while creating an HTML page.
Anyway, this is just my opinion. Those who leave things unquoted might be helped by this post!
Absolutely. It’s a shame that the requirement to always quote attributes isn’t in the HTML5 spec. The argument that it is a matter of coding style seems a little weak, as surely that is what a specification is there to set in the first place.
It feels like a step back to the sort of code that was generated 10 years ago and still haunts the web.
That’s what I thought too. However, the OWASP XSS Prevention Cheat Sheet still lists most of these characters as dangerous inside unquoted attribute values:
Frankly, I don’t buy it. I wish they offered some more information (e.g. which browsers are they talking about?) or had something like a demo though, as I’d love to be proven wrong.
It’s just good practice to use the double quotes for any attribute. Single quotes are a bit slower for some languages and if you need a quote inside a quote, you could either escape or encode it.
Eh, you can’t generalize about quoting. Sometimes single and double quotes are the same, sometimes different, and when they differ it’s a toss-up which is which. In Bash/shell scripts it affects several things depending on where it’s used, like word-splitting and whether escapes are interpreted. In Haskell and the ML family it’s the difference between character and string literals. In Python it doesn’t really matter, except that like most other langs you can alternate between single and double nested quotes instead of escaping.
Anyway, it would be hard to make a good argument about how quoting should work based on how other languages do it. Certainly not about any speed difference. Technically single quotes are (negligibly) faster in languages which don’t do further processing on their contents.
So, why bother? Isn’t it easier and safer just to quote all your attributes?
Yeah I just always use the quotes and end tags, it makes it easier to manage and keep track of the code, read the code and guarantees no mystery problems. I guess breaking it and seeing exactly how it breaks is knowledge I didn’t have before, but I doubt it’s going to come up any time soon.
Answer: Just quote your damn attributes. It’s easier, it won’t break things, it won’t cause problems for parsers and it just looks better.
While interesting, I’m not sure this is relevant to web development these days… everyone quotes their attributes anyways, right?
Absolutely agree with you Andrew!
The reason spaces matter in HTML is because it’s the delimiter for minimized boolean attributes.
http://www.whatwg.org/specs/web-apps/current-work/#boolean-attributes
XHTML didn’t really have these. For example if you want to define a video element with controls enabled using XML syntax, you’re required to use an empty string value.
vs
Otherwise I completely agree with other posters – personally I just do it the XML way.
quote, quote, quote I say … all the time
what I learnt in XHTML is just cool to use in HTML5
quote everything, close tags properly … 99% of problems solved :)
Agreed Jamie!
Just like you pointed out most browsers will try to parse whatever the developer writes, whether their code is correct or not. But that’s not an excuse for developers to start coding like that.
Remember the coding chaos with HTML 4.01 a few years back? We don’t want it to happen again, do we?
Slightly off-topic: loose syntax is one of the biggest cons of HTML5, (although many people refer to having the option to adopt your own coding style as something good) let’s hope it won’t turn out to be a disaster. There won’t be a new “XHTML” to the rescue.
+1 to Spyros .. couldn’t have said it better – coding chaos = tag soup
browsers will always forgive, but the easier you make it for them the more your code will work with newer libraries and microformats = = what Chris said
“if any of this is confusing to you, just quote all your attributes.”
> if any of this is confusing to you, just quote all your attributes.
Projects are rarely a one-man show so I think it should be rephrased as “Always quote all your attributes because it may be confusing (to you or others)”.
Better yet, rephrase it as “Always quote all your attributes.”
The fewer “rules” you need to remember, the better.
Just quote any attribute that you cannot control, e.g. CMS entered values.
Well I’m currently in the process of switching from XHTML Strict to HTML5. I fully intend to carry on using the same ‘strict’ criteria in all my coding, simply because I believe it to be best practice. Plus, it’s become habit :).
Cheers
I
Hmm, interesting. But as someone pointed out before, who actually does not use quotes?
What about the difference for values enclosed in parentheses? For example: background:url(myimage.png) no-repeat 0 0;
And: background:url("myimage.png") no-repeat 0 0;
As far as I can see it makes no difference. Do you agree?
I prefer using quotes for one simple reason: coloured code (doesn’t work without quotes in InType, in which I code).
How many quotes do you have in your final document? Maybe it’s worth not using them. I’m starting to drop them from IDs and class names, but keeping them on values of two or more words.
And come on, who uses these awkward symbols to name something in their tags??
If somebody does, there are a lot of things to worry about before thinking about quoting or not quoting.
Cheers!
I agree with Scott and Frank “always quote all the attributes” to be safe