Sir Tim Berners-Lee is fascinated with information. It has been his life’s work. For over four decades, he has sought to understand how it is mapped and stored and transmitted. How it passes from person to person. How the seeds of information become the roots of dramatic change. It is so fundamental to the work that he has done that when he wrote the proposal for what would eventually become the World Wide Web, he called it “Information Management, a Proposal.”
Information is the web’s core function. A series of bytes streams across the world and at the end of it is knowledge. The mechanism for this transfer, what we know as the web, was created by the intersection of two things. The first is the Internet, the technology that makes it all possible. The second is hypertext, the concept that grounds its use. They were brought together by Sir Tim Berners-Lee. And when he was done he did something truly spectacular. He gave it away to everyone to use for free.
When Berners-Lee submitted “Information Management, a Proposal” to his superiors, they returned it with a comment on the top that read simply:
Vague, but exciting…
The web wasn’t a sure thing. Without the hindsight of today it looked far too simple to be effective. In other words, it was a hard sell. Berners-Lee was proficient at many things, but he was never a great salesman. He loved his idea for the web. But he had to convince everybody else to love it too.
Sir Tim Berners-Lee has a mind that races. He has been known — based on interviews and public appearances — to jump from one idea to the next. He is almost always several steps ahead of what he is saying, which is often quite profound. Until recently, he only gave a rare interview here and there, and masked his greatest achievements with humility and a wry British wit.
What is immediately apparent is that Sir Tim Berners-Lee is curious. Curious about everything. It has led him to explore some truly revolutionary ideas before they became truly revolutionary. But it also means that his focus is typically split. It makes it hard for him to hold on to things in his memory. “I’m certainly terrible at names and faces,” he once said in an interview. His original fascination with the elements for the web came from a very personal need to organize his own thoughts and connect them together, disparate and unconnected as they are. It is not at all unusual that when he reached for a metaphor for that organization, he came up with a web.
As a young boy, his curiosity was encouraged. His parents, Conway Berners-Lee and Mary Lee Woods, were mathematicians. They worked on the Ferranti Mark I, the world’s first commercially available computer, in the 1950s. They fondly speak of Berners-Lee as a child, taking things apart, experimenting with amateur engineering projects. There was nothing that he didn’t seek to understand further. Electronics — and computers specifically — were particularly enchanting.
Berners-Lee sometimes tells the story of a conversation he had with his father as a young boy about the limitations of computers making associations between information that was not intrinsically linked. “The idea stayed with me that computers could be much more powerful,” Berners-Lee recalls, “if they could be programmed to link otherwise unconnected information. In an extreme view, the world can be seen as only connections.” He didn’t know it yet, but Berners-Lee had stumbled upon the idea of hypertext at a very early age. It would be several years before he would come back to it.
History is filled with attempts to organize knowledge. An oft-cited example is the Library of Alexandria, a fabled library in Egypt founded under the Ptolemies, the successors of Alexander the Great, that was thought to have held tens of thousands of meticulously organized texts.
At the turn of the 20th century, Paul Otlet tried something similar in Belgium. His project was called the Répertoire Bibliographique Universel (Universal Bibliography). Otlet and a team of researchers created a library of over 15 million index cards, each with a discrete and small piece of information in topics ranging from science to geography. Otlet devised a sophisticated numbering system that allowed him to link one index card to another. He fielded requests from researchers around the world via mail or telegram, and Otlet’s researchers could follow a trail of linked index cards to find an answer. Once properly linked, information becomes infinitely more useful.
A sudden surge of scientific research in the wake of World War II prompted Vannevar Bush to propose another idea. In his groundbreaking essay in The Atlantic in 1945 entitled “As We May Think,” Bush imagined a mechanical library called a Memex. Like Otlet’s Universal Bibliography, the Memex stored bits of information. But instead of index cards, everything was stored on compact microfilm. Through the process of what he called “associative indexing,” users of the Memex could follow trails of related information through an intricate web of links.
The list of attempts goes on. But it was Ted Nelson who finally gave the concept a name in 1965, two decades after Bush’s article in The Atlantic. He called it hypertext.
Hypertext is, essentially, linked text. Nelson observed that in the real world, we often give meaning to the connections between concepts; it helps us grasp their importance and remember them for later. The proximity of a Post-It to your computer, the orientation of ingredients in your refrigerator, the order of books on your bookshelf. Invisible though they may seem, each of these signifiers hold meaning, whether consciously or subconsciously, and they are only fully realized when taking a step back. Hypertext was a way to bring those same kinds of meaningful connections to the digital world.
Nelson’s primary contribution to hypertext is a number of influential theories and a decades-long project still in progress known as Xanadu. Much like the web, Xanadu uses the power of a network to create a global system of links and pages. However, Xanadu puts a far greater emphasis on the ability to trace text to its original author for monetization and attribution purposes. This distinction, known as transclusion, has been a near impossible technological problem to solve.
Nelson’s interest in hypertext stems from the same issue with memory and recall as Berners-Lee. He refers to it as his hummingbird mind. Nelson finds it hard to hold on to associations he creates in the real world. Hypertext offers a way for him to map associations digitally, so that he can call on them later. Berners-Lee and Nelson met for the first time a couple of years after the web was invented. They exchanged ideas and philosophies, and Berners-Lee was able to thank Nelson for his influential thinking. At the end of the meeting, Berners-Lee asked if he could take a picture. Nelson, in turn, asked for a short video recording. Each was commemorating the moment they knew they would eventually forget. And each turned to technology for a solution.
By the mid-80s, on the wave of innovation in personal computing, there were several hypertext applications out in the wild. The hypertext community, a dedicated group of software engineers that believed in the promise of hypertext, created programs for researchers, academics, and even off-the-shelf personal computers. Every research lab worth its salt had a hypertext project. Together they built entirely new paradigms into their software, processes and concepts that feel wonderfully familiar today but were completely outside the realm of possibilities just a few years earlier.
At Brown University, the very place where Ted Nelson was studying when he coined the term hypertext, Norman Meyrowitz, Nancy Garrett, and Karen Catlin were the first to breathe life into the hyperlink, which was introduced in their program Intermedia. At Symbolics, Janet Walker was toying with the idea of saving links for later, a kind of speed dial for the digital world, something she was calling a bookmark. At the University of Maryland, Ben Shneiderman sought to compile and link the world’s largest source of information with his Interactive Encyclopedia System.
Dame Wendy Hall, at the University of Southampton, sought to extend the life of the link further in her own program, Microcosm. Each link made by the user was stored in a linkbase, a database apart from the main text specifically designed to store metadata about connections. In Microcosm, links could never die, never rot away. If their connection was severed they could point elsewhere since links weren’t directly tied to text. You could even write a bit of text alongside links, expanding a bit on why the link was important, or add to a document separate layers of links, one, for instance, a tailored set of carefully curated references for experts on a given topic, the other a more laid back set of links for the casual audience.
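To make the idea of a linkbase a little more concrete, here is a minimal sketch in Python. It is not Microcosm’s actual design; the class names, fields, and layer labels are all assumptions made up for illustration. It only shows the general shape of the idea: links live in their own store, carry a note and a layer, and can be pointed somewhere new without touching the documents themselves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    # All field names here are hypothetical, chosen for illustration only.
    source_doc: str          # document the link starts from
    anchor_text: str         # phrase the link is attached to
    target_doc: str          # document the link points to
    note: str = ""           # optional annotation explaining why the link matters
    layer: str = "casual"    # which set of links this belongs to, e.g. "expert"


@dataclass
class Linkbase:
    """A store of links kept apart from the documents they connect."""
    links: List[Link] = field(default_factory=list)

    def add(self, link: Link) -> None:
        self.links.append(link)

    def links_for(self, doc: str, layer: str) -> List[Link]:
        # Return only the links for one document in one layer.
        return [l for l in self.links if l.source_doc == doc and l.layer == layer]

    def retarget(self, old_target: str, new_target: str) -> int:
        # Repoint every link aimed at a moved or deleted document.
        # The documents themselves are never edited.
        changed = 0
        for link in self.links:
            if link.target_doc == old_target:
                link.target_doc = new_target
                changed += 1
        return changed


linkbase = Linkbase()
linkbase.add(Link("memex.txt", "associative indexing", "bush-1945.txt",
                  note="Primary source", layer="expert"))
linkbase.add(Link("memex.txt", "Memex", "memex-overview.txt", layer="casual"))

expert_links = linkbase.links_for("memex.txt", "expert")
moved = linkbase.retarget("bush-1945.txt", "archive/bush-1945.txt")
```

Because the link records are separate from the text, a severed connection is repaired by updating one record rather than hunting through every document that mentions it.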
There were mailing lists and conferences and an entire community that was small, friendly, fiercely competitive and locked in an arms race to find the next big thing. It was impossible not to get swept up in the fervor. Hypertext enabled a new way to store actual, tangible knowledge; with every innovation the digital world became more intricate and expansive and all-encompassing.
Then came the heavy hitters. Under a shroud of mystery, researchers and programmers at the legendary Xerox PARC were building NoteCards. Apple caught wind of the idea and found it so compelling that they shipped their own hypertext application called Hypercard, bundled right into the Mac operating system. If you were a late Apple II user, you likely have fond memories of Hypercard, an interface that allowed you to create a card, and quickly link it to another. Cards could be anything, a recipe maybe, or the prototype of a latest project. And, one by one, you could link those cards up, visually and with no friction, until you had a digital reflection of your ideas.
Towards the end of the 80s, it was clear that hypertext had a bright future. In just a few short years, the software had advanced in leaps and bounds.
After a brief stint studying physics at The Queen’s College, Oxford, Sir Tim Berners-Lee returned to his first love: computers. He eventually found a short-term, six-month contract at the particle physics lab Conseil Européen pour la Recherche Nucléaire (European Council for Nuclear Research), or simply, CERN.
CERN is responsible for a long line of particle physics breakthroughs. Most recently, they built the Large Hadron Collider, which led to the confirmation of the Higgs Boson particle, a.k.a. the “God particle.”
CERN doesn’t operate like most research labs. Its internal staff makes up only a small percentage of the people that use the lab. Any research team from around the world can come and use the CERN facilities, provided that they are able to prove their research fits within the stated goals of the institution. A majority of CERN occupants are from these research teams. CERN is a dynamic, sprawling campus of researchers, ferrying from location to location on bicycles or mine-carts, working on the secrets of the universe. Each team is expected to bring their own equipment and expertise. That includes computers.
Berners-Lee was hired to assist with software on an earlier version of the particle accelerator called the Proton Synchrotron. When he arrived, he was blown away by the amount of pure, unfiltered information that flowed through CERN. It was nearly impossible to keep track of it all and equally impossible to find what you were looking for. Berners-Lee wanted to capture that information and organize it.
His mind flashed back to that conversation with his father all those years ago. What if it were possible to create a computer program that allowed you to make random associations between bits of information? What if you could, in other words, link one thing to another? He began working on a software project on the side for himself. Years later, that would be the same way he built the web. He called this project ENQUIRE, named for a Victorian handbook he had read as a child.
Using a simple prompt, ENQUIRE users could create a block of info, something like Otlet’s index cards all those years ago. And just like the Universal Bibliography, ENQUIRE allowed you to link one block to another. Tools were bundled in to make it easier to zoom back and see the connections between the links. For Berners-Lee this filled a simple need: it replaced the part of his memory that made it impossible for him to remember names and faces with a digital tool.
Compared to the software being actively developed at the University of Southampton or at Xerox or Apple, ENQUIRE was unsophisticated. It lacked a visual interface, and its format was rudimentary. A program like Hypercard supported rich-media and advanced two-way connections. But ENQUIRE was only Berners-Lee’s first experiment with hypertext. He would drop the project when his contract was up at CERN.
Berners-Lee would go and work for himself for several years before returning to CERN. By the time he came back, there would be something much more interesting for him to experiment with. Just around the corner was the Internet.
Packet switching is the single most important invention in the history of the Internet. It is how messages are transmitted over a globally decentralized network. It was discovered almost simultaneously in the 1960s by two different computer scientists, Donald Davies and Paul Baran. Both were interested in the way in which it made networks resilient.
Traditional telecommunications at the time were managed by what is known as circuit switching. With circuit switching, a direct connection is open between the sender and receiver, and the message is sent in its entirety between the two. That connection needs to be persistent and each channel can only carry a single message at a time. That line stays open for the duration of a message and everything is run through a centralized switch.
If you’re searching for an example of circuit switching, you don’t have to look far. That’s how telephones work (or used to, at least). If you’ve ever seen an old film (or even a TV show like Mad Men) where an operator pulls a plug out of a wall and plugs it back in to connect a telephone call, that’s circuit switching (though that was all eventually automated). Circuit switching works because everything is sent over the wire all at once and through a centralized switch. That’s what the operators are connecting.
Packet switching works differently. Messages are divided into smaller bits, or packets, and sent over the wire a little at a time. They can be sent in any order because each packet has just enough information to know where in the order it belongs. Packets are sent through until the message is complete, and then re-assembled on the other side. There are a few advantages to a packet-switched network. Multiple messages can be sent at the same time over the same connection, split up into little packets. And crucially, the network doesn’t need centralization. Each node in the network can pass around packets to any other node without a central routing system. This made it ideal in a situation that requires extreme adaptability, like in the fallout of an atomic war, Paul Baran’s original reason for devising the concept.
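The splitting and re-assembly described above can be sketched in a few lines of Python. This is a toy model, not how any real network protocol lays out its packets: the `Packet` fields, the chunk size, and the function names are all assumptions for illustration. It shows two messages sharing one "wire," arriving out of order, and still being rebuilt correctly because each packet carries its own position.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Packet:
    # Hypothetical header fields; real protocols carry far more than this.
    message_id: int   # which message this packet belongs to
    sequence: int     # position of this chunk within the message
    total: int        # how many packets make up the whole message
    payload: str      # the chunk of text itself


def split_into_packets(message: str, message_id: int, size: int = 8) -> List[Packet]:
    """Divide a message into fixed-size chunks, each labeled with its position."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [Packet(message_id, seq, len(chunks), chunk)
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: List[Packet]) -> Dict[int, str]:
    """Group packets by message, sort by sequence number, and join the payloads."""
    by_message: Dict[int, List[Packet]] = {}
    for packet in packets:
        by_message.setdefault(packet.message_id, []).append(packet)

    messages: Dict[int, str] = {}
    for message_id, parts in by_message.items():
        if len(parts) != parts[0].total:
            continue  # some packets have not arrived yet; skip this message
        ordered = sorted(parts, key=lambda p: p.sequence)
        messages[message_id] = "".join(p.payload for p in ordered)
    return messages


first = "Vague, but exciting"
second = "Information Management, a Proposal"

# Two messages share the same connection, interleaved and out of order.
wire = split_into_packets(first, 1) + split_into_packets(second, 2)
random.Random(0).shuffle(wire)

received = reassemble(wire)
```

Nothing in `reassemble` depends on the order in which packets arrive, which is the property that lets each packet take whatever route through the network happens to be available.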
When Davies began shopping around his idea for packet switching to the telecommunications industry, he was shown the door. “I went along to Siemens once and talked to them, and they actually used the words, they accused me of technical — they were really saying that I was being impertinent by suggesting anything like packet switching. I can’t remember the exact words, but it amounted to that, that I was challenging the whole of their authority.” Traditional telephone companies were not at all interested in packet switching. But ARPA was.
ARPA, later known as DARPA, was a research agency embedded in the United States Department of Defense. It was created in the throes of the Cold War, a reaction to the launch of the Sputnik satellite by the Soviet Union, but without a core focus. (It was created at the same time as NASA, so launching things into space was already taken.) To adapt to their situation, ARPA recruited research teams from colleges around the country. They acted as a coordinator and mediator between several active university research projects with a military focus.
ARPA’s organization had one surprising and crucial side effect. It was composed mostly of professors and graduate students who were working at its partner universities. The general attitude was that as long as you could prove some sort of modest relation to a military application, you could pitch your project for funding. As a result, ARPA was filled with lots of ambitious and free-thinking individuals working inside of a buttoned-up government agency, with little oversight, coming up with the craziest and most world-changing ideas they could. “We expected that a professional crew would show up eventually to take over the problems we were dealing with,” recalls Bob Kahn, an ARPA programmer critical to the invention of the Internet. The “professionals” never showed up.
One of those professors was Leonard Kleinrock at UCLA. He was involved in the first stages of ARPANET, the network that would eventually become the Internet. His job was to help implement the most controversial part of the project, the still theoretical concept known as packet switching, which enabled a decentralized and efficient design for the ARPANET network. It is likely that the Internet would not have taken shape without it. Once packet switching was implemented, everything came together quickly. By the early 1980s, it was simply called the Internet. By the end of the 1980s, the Internet went commercial and global, including a node at CERN.
The first applications of the Internet are still in use today. FTP, used for transferring files over the network, was one of the first things built. Email is another one. It had been around for a couple of decades on a closed system already. When the Internet began to spread, email became networked and infinitely more useful.
Other projects were aimed at making the Internet more accessible. They had names like Archie, Gopher, and WAIS, and have largely been forgotten. They were united by a common goal of bringing some order to the chaos of a decentralized system. WAIS and Archie did so by indexing the documents put on the Internet to make them searchable and findable by users. Gopher did so with a structured, hierarchical system.
Kleinrock was there when the first message was ever sent over the Internet. He was supervising that part of the project, and even then, he knew what a revolutionary moment it was. However, he is quick to note that not everybody shared that feeling in the beginning. He recalls the sentiment held by the titans of the telecommunications industry like the Bell Telephone Company. “They said, ‘Little boy, go away,’ so we went away.” Most felt that the project would go nowhere, nothing more than a technological fad.
In other words, no one was paying much attention to what was going on and no one saw the Internet as much of a threat. So when that group of professors and graduate students tried to convince their higher-ups to let the whole thing be free — to let anyone implement the protocols of the Internet without a need for licenses or license fees — they didn’t get much pushback. The Internet slipped into public use and only the true technocratic dreamers of the late 20th century could have predicted what would happen next.
Berners-Lee returned to CERN in a fellowship position in 1984. It was four years after he had left. A lot had changed. CERN had developed its own network, known as CERNET, but by 1989 the lab had hooked up to the new, internationally standard Internet. “In 1989, I thought,” he recalls, “look, it would be so much easier if everybody asking me questions all the time could just read my database, and it would be so much nicer if I could find out what these guys are doing by just jumping into a similar database of information for them.” Put another way, he wanted to share his own homepage, and get a link to everyone else’s.
What he needed was a way for researchers to share these “databases” without having to think much about how it all works. His way in with management was operating systems. CERN’s research teams all bring their own equipment, including computers, and there’s no way to guarantee they’re all running the same OS. Interoperability between operating systems is a difficult problem by design; generally speaking, the goal of an OS is to lock you in. Among its many other uses, a globally networked hypertext system like the web was a wonderful way for researchers to share notes between computers using different operating systems.
However, Berners-Lee had a bit of trouble explaining his idea. He’s never exactly been concise. By 1989, when he wrote “Information Management, a Proposal,” Berners-Lee already had worldwide ambitions. The document is thousands of words, filled with diagrams and charts. It jumps energetically from one idea to the next without fully explaining what’s just been said. Much of what would eventually become the web was included in the document, but it was just too big of an idea. It was met with a lukewarm response — that “Vague, but exciting” comment scrawled across the top.
A year later, in May of 1990, at the encouragement of his boss Mike Sendall (the author of that comment), Berners-Lee circulated the proposal again. This time it was enough to buy him a bit of time internally to work on it. He got lucky. Sendall understood his ambition and aptitude. He wouldn’t always get that kind of chance. The web needed to be marketed internally as an invaluable tool. CERN needed to need it. Taking complex ideas and boiling them down to their most salient, marketable points, however, was not Berners-Lee’s strength. For that, he was going to need a partner. He found one in Robert Cailliau.
Cailliau was a CERN veteran. By 1989, he’d worked there as a programmer for over 15 years. He’d embedded himself in the company culture, proving a useful resource helping teams organize their informational toolset and knowledge-sharing systems. He had helped several teams at CERN do exactly the kind of thing Berners-Lee was proposing, though at a smaller scale.
Temperamentally, Cailliau was about as different from Berners-Lee as you could get. He was hyper-organized and fastidious. He knew how to sell things internally, and he had made plenty of political inroads at CERN. What he shared with Berners-Lee was an almost insatiable curiosity. During his time as a nurse in the Belgian military, he got fidgety. “When there was slack at work, rather than sit in the infirmary twiddling my thumbs, I went and got myself some time on the computer there.” He ended up as a programmer in the military, working on war games and computerized models. He couldn’t help but look for the next big thing.
In the late 80s, Cailliau had a strong interest in hypertext. He was taking a look at Apple’s Hypercard as a potential internal documentation system at CERN when he caught wind of Berners-Lee’s proposal. He immediately recognized its potential.
Working alongside Berners-Lee, Cailliau pieced together a new proposal. Something more concise, more understandable, and more marketable. While Berners-Lee began putting together the technologies that would ultimately become the web, Cailliau began trying to sell the idea to interested parties inside of CERN.
The web, in all of its modern uses and ubiquity can be difficult to define as just one thing — we have the web on our refrigerators now. In the beginning, however, the web was made up of only a few essential features.
There was the web server, a computer wired to the Internet that can transmit documents and media (webpages) to other computers. Webpages are served via HTTP, a protocol designed by Berners-Lee in the earliest iterations of the web. HTTP is a layer on top of the Internet, and was designed to make things as simple, and resilient, as possible. HTTP is so simple that a server forgets a request as soon as it has answered it. It has no memory of the webpages it’s served in the past. The only thing HTTP is concerned with is the request currently being made. That makes it magnificently easy to use.
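One way to picture that forgetfulness is to write the server’s job as a plain function, as in this Python sketch. It is a simplified model, not Berners-Lee’s original implementation: the `handle_request` function, the `DOCUMENTS` table, and its paths are all made up for illustration, and the request format is a pared-down HTTP/1.0-style request line. Everything needed to build the response is in the request text itself, so no state survives between calls.

```python
from typing import Dict

# Hypothetical documents a server might hold; the paths are illustrative only.
DOCUMENTS: Dict[str, str] = {
    "/": "<h1>Welcome</h1>",
    "/proposal": "<h1>Information Management, a Proposal</h1>",
}


def handle_request(raw_request: str, documents: Dict[str, str]) -> str:
    """Build a response from the request text alone; nothing is remembered."""
    request_line = raw_request.split("\r\n", 1)[0]
    parts = request_line.split()

    # Expect something shaped like: GET /path HTTP/1.0
    if len(parts) != 3 or parts[0] != "GET":
        return "HTTP/1.0 400 Bad Request\r\n\r\n"

    body = documents.get(parts[1])
    if body is None:
        return "HTTP/1.0 404 Not Found\r\n\r\n"

    return (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
        f"{body}"
    )


request = "GET /proposal HTTP/1.0\r\n\r\n"
first_response = handle_request(request, DOCUMENTS)
second_response = handle_request(request, DOCUMENTS)   # identical: no memory of the first
missing_response = handle_request("GET /nope HTTP/1.0\r\n\r\n", DOCUMENTS)
```

Asking for the same page twice produces the same response both times, because the function has no record that the first request ever happened.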
These webpages are sent to browsers, the software that you’re using to read this article. Browsers can read documents handed to them by a server because they understand HTML, another early invention of Sir Tim Berners-Lee. HTML is a markup language; it allows programmers to give meaning to their documents so that they can be understood. The “HT” in HTML stands for Hypertext. Like HTTP, HTML (all of the building blocks programmers can use to structure a document) wasn’t all that complex, especially when compared to other hypertext applications at the time. HTML comes from a long line of other, similar markup languages, but Berners-Lee expanded it to include the link, in the form of an anchor tag. The <a> tag is the most important piece of HTML because it serves the web’s greatest function: to link together information.
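As a small illustration of how little machinery the anchor tag requires, the following Python sketch pulls every link out of a snippet of HTML using the standard library’s `html.parser`. The sample markup and the `LinkCollector` class are assumptions written for this example, not anything from the original browser code.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None   # href of the anchor currently open, if any
        self._text: List[str] = []         # text seen inside that anchor so far

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Sample markup invented for this example.
page = (
    '<p>See the <a href="http://info.cern.ch/hypertext/WWW/TheProject.html">'
    'World Wide Web project</a> and <a href="notes.html">my notes</a>.</p>'
)

collector = LinkCollector()
collector.feed(page)
collector.close()
links = collector.links
```

Each link is nothing more than an address and a phrase of text, with no separate database or metadata behind it.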
The hyperlink was made possible by the Universal Resource Identifier (URI), later renamed the Uniform Resource Identifier after the IETF found the word “universal” a bit too substantial. But for Berners-Lee, that was exactly the point. “Its universality is essential: the fact that a hypertext link can point to anything, be it personal, local or global, be it draft or highly polished,” he wrote in his personal history of the web. Of all the original technologies that made up the web, Berners-Lee and several others have noted that the URL, the most familiar form of URI, was the most important.
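To see what a URL actually packs into one string, here is a short Python sketch using the standard library’s `urllib.parse`. The address is the one used by the first website at CERN; the relative link `Summary.html` is a hypothetical example chosen to show how a browser resolves a link against the page it appears on.

```python
from urllib.parse import urljoin, urlsplit

url = "http://info.cern.ch/hypertext/WWW/TheProject.html"

parts = urlsplit(url)
scheme = parts.scheme   # which protocol to speak
host = parts.netloc     # which machine on the Internet to ask
path = parts.path       # which document on that machine

# A relative link (hypothetical here) is resolved against the current page.
resolved = urljoin(url, "Summary.html")
```

A single string names the protocol, the machine, and the document, which is why a link can point at anything anywhere without a central registry.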
By Christmas of 1990, Sir Tim Berners-Lee had all of that built. A full prototype of the web was ready to go.
Cailliau, meanwhile, had had a bit of success trying to sell the idea to his bosses. He had hoped that his revised proposal would give him a team and some time. Instead he got six months and a single staff member, intern Nicola Pellow. Pellow was new to CERN, on placement for her mathematics degree. But her work on the Line Mode Browser, which enabled people from around the world using any operating system to browse the web, proved a crucial element in the web’s early success. Berners-Lee’s work, combined with the Line Mode Browser, became the web’s first set of tools. It was ready to show to the world.
When the team at CERN submitted a paper on the World Wide Web to the San Antonio Hypertext Conference in 1991, it was soundly rejected. They went anyway, and set up a table with a computer to demo it to conference attendees. One attendee remarked:
They have chutzpah calling that the World Wide Web!
The highlight of the web is that it was not at all sophisticated. Its use of hypertext was elementary, allowing for only simplistic text-based links. And without two-way links, pretty much a given in hypertext applications, links could go dead at any minute. There was no linkbase, or sophisticated metadata assigned to links. There was just the anchor tag. The protocols that ran on top of the Internet were similarly basic. HTTP only allowed for a handful of actions, and alternatives like Gopher or WAIS offered far more options for advanced connections through the Internet network.
It was hard to explain, difficult to demo, and had overly lofty ambition. It was created by a man who didn’t have much interest in marketing his ideas. Even the name was somewhat absurd. “WWW” is one of only a handful of acronyms that actually takes longer to say than the full “World Wide Web.”
We know how this story ends. The web won. It’s used by billions of people and runs through everything we do. It is among the most remarkable technological achievements of the 20th century.
It had a few advantages, of course. It was instantly global and widely accessible thanks to the Internet. And the URL — and its uniqueness — is one of the more clever concepts to come from networked computing.
But if we want to truly understand why the web succeeded we have to come back to information. One of Berners-Lee’s deepest held beliefs is that information is incredibly powerful, and that it deserves to be free. He believed that the web could deliver on that promise. For it to do that, the web would need to spread.
Berners-Lee looked to his predecessors for inspiration: the Internet. The Internet succeeded, in part, because its creators gave it away to everyone. After considering several licensing options, he lobbied CERN to release the web unlicensed to the general public. CERN, an organization far more interested in particle physics breakthroughs than hypertext, agreed. In 1993, the web officially entered the public domain.
And that was the turning point. They didn’t know it then, but that was the moment the web succeeded. When Berners-Lee was able to make globally available information truly free.
In an interview some years ago, Berners-Lee recalled how it was that the web came to be.
I had the idea for it. I defined how it would work. But it was actually created by people.
That may sound like humility from one of the world’s great thinkers — and it is that a little — but it is also the truth. The web was Berners-Lee’s gift to the world. He gave it to us, and we made it what it was. He and his team fought hard at CERN to make that happen.
Berners-Lee knew that with the resources available to him he would never be able to spread the web sufficiently outside of the hallways of CERN. Instead, he packaged up all the code that was needed to build a browser into a library called libwww and posted it to a Usenet group. That was enough for some people to get interested in browsers. But before browsers would be useful, you needed something to browse.