Better Line Breaks for Long URLs

Reuben Lillie

CSS-Tricks has covered how to break text that overflows its container before, but not as much as you might think. Back in 2012, Chris penned “Handling Long Words and URLs (Forcing Breaks, Hyphenation, Ellipsis, etc)” and it is still one of only a few posts on the topic, including his 2018 follow-up Where Lines Break is Complicated. Here’s all the Related CSS and HTML.

Chris’s tried-and-true technique works well when you want to leverage automated word breaks and hyphenation rules that are baked into the browser:

.dont-break-out {
  /* These are technically the same, but use both */
  overflow-wrap: break-word;
  word-wrap: break-word;

  word-break: break-word;

  /* Adds a hyphen where the word breaks, if supported (No Blink) */
  hyphens: auto;
}

But what if you can’t? What if your style guide requires you to break URLs in certain places? These classic sledgehammers are too imprecise for that level of control. We need a different way to tell the browser exactly where to make a break.

Why we need to care about line breaks in URLs

One reason is design. A URL that overflows its container is just plain gross to look at.

Then there’s copywriting standards. The Chicago Manual of Style, for example, specifies when to break URLs in print. Then again, Chicago gives us a pass for electronic documents… sorta:

It is generally unnecessary to specify breaks for URLs in electronic publications formats with reflowable text, and authors should avoid forcing them to break in their manuscripts.

Chicago 17th ed., 14.18

But what if, like Rachel Andrew (2015) encourages us, you’re designing for print, not just screens? Suddenly, “generally unnecessary” becomes “absolutely imperative.” Whether you’re publishing a book, or you want to create a PDF version of a research paper you wrote in HTML, or you’re designing an online CV, or you have a reference list at the end of your blog post, or you simply care how URLs look in your project—you’d want a way to manage line breaks with a greater degree of control.

OK, so we’ve established why considering line breaks in URLs is a thing, and that there are use cases where they’re actually super important. But that leads us to another key question…

Where are line breaks supposed to go, then?

We want URLs to be readable. We also don’t want them to be ugly, at least no uglier than necessary. Continuing with Chicago’s advice, we should break long URLs based on punctuation, to help signal to the reader that the URL continues on the next line. That would include any of the following places:

  • After a colon or a double slash (//)
  • Before a single slash (/), a tilde (~), a period, a comma, a hyphen, an underline (aka an underscore, _), a question mark, a number sign, or a percent symbol
  • Before or after an equals sign or an ampersand (&)

At the same time, we don’t want to inject new punctuation, like when we might reach for hyphens: auto; rules in CSS to break up long words. Soft or “shy” hyphens are great for breaking words, but bad news for URLs. It’s not as big a deal on screens, since soft hyphens don’t interfere with copy-and-paste, for example. But a user could still mistake a soft hyphen for part of the URL, since hyphens often appear in URLs, after all. So we definitely don’t want hyphens in print that aren’t actually part of the URL. Reading long URLs is already hard enough without breaking words inside them.

We still can break particularly long words and strings within URLs. Just not with hyphens. For the most part, Chicago leaves word breaks inside URLs to discretion. Our primary goal is to break URLs before and after the appropriate punctuation marks.

How do you control line breaks?

Fortunately, there’s an (under-appreciated) HTML element for this express purpose: the <wbr> element, which represents a line break opportunity. It’s a way to tell the browser, “Please break the line here if you need to, not just any old place.”

We can take a gnarly URL, like the one Chris first shared in his 2012 post:

http://www.amazon.com/s/ref=sr_nr_i_o?rh=k%3Ashark+vacuum%2Ci%3Agarden&keywords=shark+vacuum&ie=UTF8&qid=1327784979

And sprinkle in some <wbr> tags, “Chicago style”:

http:<wbr>//<wbr>www<wbr>.amazon<wbr>.com<wbr>/s<wbr>/ref<wbr>=<wbr>sr<wbr>_nr<wbr>_i<wbr>_o<wbr>?rh<wbr>=<wbr>k<wbr>%3Ashark+vacuum<wbr>%2Ci<wbr>%3Agarden<wbr>&<wbr>keywords<wbr>=<wbr>shark+vacuum<wbr>&<wbr>ie<wbr>=<wbr>UTF8<wbr>&<wbr>qid<wbr>=<wbr>1327784979

Even if you’re the most masochistic typesetter ever born, you’d probably mark up a URL like that exactly zero times before you’d start wondering if there’s a way to automate those line break opportunities.

Yes, yes there is. Cue JavaScript and some aptly placed regular expressions:

/**
 * Insert line break opportunities into a URL
 */
function formatUrl(url) {
  // Split the URL into an array to distinguish double slashes from single slashes
  var doubleSlash = url.split('//')

  // Format the strings on either side of double slashes separately
  var formatted = doubleSlash.map(str =>
    // Insert a word break opportunity after a colon
    str.replace(/(?<after>:)/giu, '$1<wbr>')
      // Before a single slash, tilde, period, comma, hyphen, underline, question mark, number sign, or percent symbol
      .replace(/(?<before>[/~.,\-_?#%])/giu, '<wbr>$1')
      // Before and after an equals sign or ampersand
      .replace(/(?<beforeAndAfter>[=&])/giu, '<wbr>$1<wbr>')
    // Reconnect the strings with word break opportunities after double slashes
    ).join('//<wbr>')

  return formatted
}
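
One thing to watch: the string this function returns is meant to be inserted as HTML, and URLs often contain ampersands that a browser may try to read as character references. Here is a minimal sketch of a variant that escapes the URL text while adding the same break opportunities. The names escapeHtml and formatUrlAsHtml are hypothetical, and the single-pass split is just one way to do it, not part of the function above.

```javascript
// Hypothetical helper: escape characters that are special in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Assumed variant of formatUrl that returns markup safe to assign to innerHTML.
function formatUrlAsHtml(url) {
  // Split on the break characters and keep them in the result (capturing group).
  // The double slash is listed first so it is not read as two single slashes.
  const tokens = url.split(/(\/\/|[:/~.,\-_?#%=&])/);

  return tokens
    .map((token, index) => {
      // Even indexes hold the text between break characters
      if (index % 2 === 0) return escapeHtml(token);
      // Break after a colon or a double slash
      if (token === ':' || token === '//') return token + '<wbr>';
      // Break before and after an equals sign or ampersand
      if (token === '=' || token === '&') return '<wbr>' + escapeHtml(token) + '<wbr>';
      // Break before the remaining punctuation marks
      return '<wbr>' + token;
    })
    .join('');
}
```

Assuming a link element whose text is the raw URL, usage could look like link.innerHTML = formatUrlAsHtml(link.textContent). For example, formatUrlAsHtml('http://a.com/x?y=1&z=2') returns http:<wbr>//<wbr>a<wbr>.com<wbr>/x<wbr>?y<wbr>=<wbr>1<wbr>&amp;<wbr>z<wbr>=<wbr>2.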

Try it out

Go ahead and open the following demo in a new window, then try resizing the browser to see how the long URLs break.

This does exactly what we want:

  • The URLs break at appropriate spots.
  • There is no additional punctuation that could be confused as part of the URL.
  • The <wbr> tags are auto-generated to relieve us from inserting them manually in the markup.

This JavaScript solution works even better if you’re leveraging a static site generator. That way, you don’t have to run a script on the client just to format URLs. I’ve got a working example on my personal site built with Eleventy.
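
As a rough idea of what that could look like, here is a sketch that registers the formatter as an Eleventy filter so the <wbr> tags are generated at build time. The filter name and the configureEleventy function name are assumptions for illustration, and this is not necessarily how my own site is set up. The formatter is restated in compact form only so the sketch stands alone.

```javascript
// Compact restatement of formatUrl so this sketch is self-contained.
function formatUrl(url) {
  return url
    .split('//')
    .map(str =>
      str
        // After a colon
        .replace(/:/g, ':<wbr>')
        // Before a slash, tilde, period, comma, hyphen, underscore,
        // question mark, number sign, or percent symbol
        .replace(/[/~.,\-_?#%]/g, '<wbr>$&')
        // Before and after an equals sign or ampersand
        .replace(/[=&]/g, '<wbr>$&<wbr>')
    )
    .join('//<wbr>');
}

// Hypothetical Eleventy configuration function.
// "formatUrl" as the filter name is an assumption; any name works.
function configureEleventy(eleventyConfig) {
  eleventyConfig.addFilter('formatUrl', formatUrl);
}

// In a real .eleventy.js file, this function would be exported:
// module.exports = configureEleventy;
```

In a Nunjucks template, the filter would then be applied with something like {{ url | formatUrl | safe }}, where safe keeps the template engine from escaping the inserted tags.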

If you really want to break long words inside URLs too, then I’d recommend inserting those few <wbr> tags by hand. The Chicago Manual of Style has a whole section on word division (7.36–47, login required).

Browser support

The <wbr> element has been seen in the wild since 2001. It was finally standardized with HTML5, so it works in nearly every browser at this point. Strangely enough, <wbr> worked in Internet Explorer (IE) 6 and 7, but was dropped from IE 8 onward. Support has always existed in Edge, so it’s just a matter of dealing with IE or other legacy browsers. Some popular HTML-to-PDF programs, like Prince, also need a boost to handle <wbr>.

One more possible solution

There’s one more trick to optimize line break opportunities. We can use a pseudo-element to insert a zero width space, which is how the <wbr> element is meant to behave in UTF-8 encoded pages anyhow. That’ll at least push support back to IE 9, and perhaps more importantly, work with Prince.

/** 
 * IE 8–11 and Prince don’t recognize the `wbr` element,
 * but a pseudo-element can achieve the same effect with IE 9+ and Prince.
 */
wbr:before {
  /* Unicode zero width space */
  content: "\200B";
  white-space: normal;
}

Striving for print-quality HTML, CSS, and JavaScript is hardly new, but it is undergoing a bit of a renaissance. Even if you don’t design for print or follow Chicago style, it’s still a worthwhile goal to write your HTML and CSS with URLs and line breaks in mind.

References