Building a Form in PHP Using DOMDocument

Avatar of Jonathan Land
Jonathan Land on

Templating makes the web go round. The synthesis of data and structure into content. It’s our coolest superpower as developers — grab some data, then make it work for us, in whatever presentation we need. An array of objects can become a table, a list of cards, a chart, or whatever we think is most useful to the user. Whether the data is our own blog posts in Markdown files, or on-the-minute global exchange rates, the markup and resulting UX are up to us as front-end developers.

PHP is an amazing language for templating, providing many ways to merge data with markup. Let’s get into an example of using data to build out an HTML form in this post.

Want to get your hands dirty right away? Jump to the implementation.

In PHP, we can inline variables into string literals that use double quotes, so if we have a variable $name = 'world', we can write echo "Hello, {$name}", and it prints the expected Hello, world. For more complex templating, we can always concatenate strings, like: echo "Hello, " . $name . ".".

For the old-schoolers, there’s printf("Hello, %s", $name). For multiline strings, you can use Heredoc (the one that starts like <<<MYTEXT). And, last but certainly not least, we can sprinkle PHP variables inside HTML, like <p>Hello, <?= $name ?></p>.

All of these options are great, but things can get messy when a lot of inline logic is required. If we need to build compound HTML strings, say a form or navigation, the complexity is potentially infinite, since HTML elements can nest inside each other.

What we’re trying to avoid

Before we go ahead and do the thing we want to do, it’s worth taking a minute to consider what we don’t want to do. Consider the following abridged passage from the scripture of WordPress Core, class-walker-nav-menu.php, verses 170-270:

<?php // class-walker-nav-menu.php
// ...
$output .= $indent . '<li' . $id . $class_names . '>';
// ...
$item_output  = $args->before;
$item_output .= '<a' . $attributes . '>';
$item_output .= $args->link_before . $title . $args->link_after;
$item_output .= '</a>';
$item_output .= $args->after;
// ...
$output .= apply_filters( 'walker_nav_menu_start_el', $item_output, $item, $depth, $args );
// ...
$output .= "</li>{$n}";

In order to build out a navigation <ul> in this function, we use a variable, $output, which is a very long string to which we keep adding stuff. This type of code has a very specific and limited order of operations. If we wanted to add an attribute to the <a>, we must have access to $attributes before this runs. And if we wanted to optionally nest a <span> or an <img> inside the <a>, we’d need to author a whole new block of code that would replace the middle of line 7 with about 4-10 new lines, depending on what exactly we want to add. Now imagine you need to optionally add the <span> , and then optionally add the <img>, either inside the <span> or after it. That alone is three if statements, making the code even less legible.

It’s very easy to end up with string spaghetti when concatenating like this, which is as fun to say as it is painful to maintain.

The essence of the problem is that when we try to reason about HTML elements, we’re not thinking about strings. It just so happens that strings are what the browser consumes and PHP outputs. But our mental model is more like the DOM — elements are arranged into a tree, and each node has many potential attributes, properties, and children.

Wouldn’t it be great if there were a structured, expressive way to build our tree?

Enter…

The DOMDocument class

PHP 5 added the DOM module to it’s roster of Not So Strictly Typed™ types. Its main entry point is the DOMDocument class, which is intentionally similar to the Web API’s JavaScript DOM. If you’ve ever used document.createElement or, for those of us of a certain age, jQuery’s $('<p>Hi there!</p>') syntax, this will probably feel quite familiar.

We start out by initializing a new DOMDocument:

$dom = new DOMDocument();

Now we can add a DOMElement to it:

$p = $dom->createElement('p');

The string 'p' represents the type of element we want, so other valid strings would be 'div', 'img' , etc.

Once we have an element, we can set its attributes:

$p->setAttribute('class', 'headline');

We can add children to it:

$span = $dom->createElement('span', 'This is a headline'); // The 2nd argument populates the element's textContent
$p->appendChild($span);

And finally, get the complete HTML string in one go:

$dom->appendChild($p);
$htmlString = $dom->saveHTML();
echo $htmlString;

Notice how this style of coding keeps our code organized according to our mental model — a document has elements; elements can have any number of attributes; and elements nest inside one another without needing to know anything about each other. The whole “HTML is just a string” part comes in at the end, once our structure is in place.

The “document” here is a bit different from the actual DOM, in that it doesn’t need to represent an entire document, just a block of HTML. In fact, if you need to create two similar elements, you could save a HTML string using saveHTML(), modify the DOM “document” some more, and then save a new HTML string by calling saveHTML() again.

Getting data and setting the structure

Say we need to build a form on the server using data from a CRM provider and our own markup. The API response from the CRM looks like this:

{
  "submit_button_label": "Submit now!",
  "fields": [
    {
      "id": "first-name",
      "type": "text",
      "label": "First name",
      "required": true,
      "validation_message": "First name is required.",
      "max_length": 30
    },
    {
      "id": "category",
      "type": "multiple_choice",
      "label": "Choose all categories that apply",
      "required": false,
      "field_metadata": {
        "multi_select": true,
        "values": [
          { "value": "travel", "label": "Travel" },
          { "value": "marketing", "label": "Marketing" }
        ]
      }
    }
  ]
}

This example doesn’t use the exact data structure of any specific CRM, but it’s rather representative.

And let’s suppose we want our markup to look like this:

<form>
  <label class="field">
    <input type="text" name="first-name" id="first-name" placeholder=" " required>
    <span class="label">First name</span>
    <em class="validation" hidden>First name is required.</em>
  </label>
  <label class="field checkbox-group">
    <fieldset>
      <div class="choice">
        <input type="checkbox" value="travel" id="category-travel" name="category">
        <label for="category-travel">Travel</label>
      </div>
      <div class="choice">
        <input type="checkbox" value="marketing" id="category-marketing" name="category">
        <label for="category-marketing">Marketing</label>
      </div>
    </fieldset>
    <span class="label">Choose all categories that apply</span>
  </label>
</form>

What’s that placeholder=" "? It’s a small trick that allows us to track in CSS whether the field is empty, without needing JavaScript. As long as the input is empty, it matches input:placeholder-shown, but the user doesn’t see any visible placeholder text. Just the kind of thing you can do when we control the markup!

Now that we know what our desired result is, here’s the game plan:

  1. Get the field definitions and other content from the API
  2. Initialize a DOMDocument
  3. Iterate over the fields and build each one as required
  4. Get the HTML output

So let’s stub out our process and get some technicalities out of the way:

<?php
function renderForm ($endpoint) {
  // Get the data from the API and convert it to a PHP object
  $formResult = file_get_contents($endpoint);
  $formContent = json_decode($formResult);
  $formFields = $formContent->fields;

  // Start building the DOM
  $dom = new DOMDocument();
  $form = $dom->createElement('form');

  // Iterate over the fields and build each one
  foreach ($formFields as $field) {
    // TODO: Do something with the field data
  }

  // Get the HTML output
  $dom->appendChild($form);
  $htmlString = $dom->saveHTML();
  echo $htmlString;
}

So far, we’ve gotten the data and parsed it, initialized our DOMDocument and echoed its output. What do we want to do for each field? First off, let’s build the container element which, in our example, should be a <label>, and the labelling <span> which is common to all field types:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // Build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $label = null;

  // Add a `<span>` for the label if it is set
  if ($field->label) {
    $label = $dom->createElement('span', $field->label);
    $label->setAttribute('class', 'label');
  }

  // Add the label to the `<label>`
  if ($label) $element->appendChild($label);
}

Since we’re in a loop, and PHP doesn’t scope variables in loops, we reset the $label element on each iteration. Then, if the field has a label, we build the element. At the end, we append it to the container element.

Notice that we set classes using the setAttribute method. Unlike the Web API, there unfortunately is no special handing of class lists. They’re just another attribute. If we had some really complex class logic, since It’s Just PHP™, we could create an array and then implode it:
$label->setAttribute('class', implode($labelClassList)).

Single inputs

Since we know that the API will only return specific field types, we can switch over the type and write specific code for each one:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // Build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $input = null;
  $label = null;

  // Add a `<span>` for the label if it is set
  // ...

  // Build the input element
  switch ($field->type) {
    case 'text':
    case 'email':
    case 'telephone':
      $input = $dom->createElement('input');
      $input->setAttribute('placeholder', ' ');
      if ($field->type === 'email') $input->setAttribute('type', 'email');
      if ($field->type === 'telephone') $input->setAttribute('type', 'tel');
      break;
  }
}

Now let’s handle text areas, single checkboxes and hidden fields:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // Build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $input = null;
  $label = null;

  // Add a `<span>` for the label if it is set
  // ...

  // Build the input element
  switch ($field->type) {
    //...  
    case 'text_area':
      $input = $dom->createElement('textarea');
      $input->setAttribute('placeholder', ' ');
      if ($rows = $field->field_metadata->rows) $input->setAttribute('rows', $rows);
      break;

    case 'checkbox':
      $element->setAttribute('class', 'field single-checkbox');
      $input = $dom->createElement('input');
      $input->setAttribute('type', 'checkbox');
      if ($field->field_metadata->initially_checked === true) $input->setAttribute('checked', 'checked');
      break;

    case 'hidden':
      $input = $dom->createElement('input');
      $input->setAttribute('type', 'hidden');
      $input->setAttribute('value', $field->field_metadata->value);
      $element->setAttribute('hidden', 'hidden');
      $element->setAttribute('style', 'display: none;');
      $label->textContent = '';
      break;
  }
}

Notice something new we’re doing for the checkbox and hidden cases? We’re not just creating the <input> element; we’re making changes to the container <label> element! For a single checkbox field we want to modify the class of the container, so we can align the checkbox and label horizontally; a hidden <input>‘s container should also be completely hidden.

Now if we were merely concatenating strings, it would be impossible to change at this point. We would have to add a bunch of if statements regarding the type of element and its metadata in the top of the block. Or, maybe worse, we start the switch way earlier, then copy-paste a lot of common code between each branch.

And here is the real beauty of using a builder like DOMDocument — until we hit that saveHTML(), everything is still editable, and everything is still structured.

Nested looping elements

Let’s add the logic for <select> elements:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // Build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $input = null;
  $label = null;

  // Add a `<span>` for the label if it is set
  // ...

  // Build the input element
  switch ($field->type) {
    //...  
    case 'select':
      $element->setAttribute('class', 'field select');
      $input = $dom->createElement('select');
      $input->setAttribute('required', 'required');
      if ($field->field_metadata->multi_select === true)
        $input->setAttribute('multiple', 'multiple');
    
      $options = [];
    
      // Track whether there's a pre-selected option
      $optionSelected = false;
    
      foreach ($field->field_metadata->values as $value) {
        $option = $dom->createElement('option', htmlspecialchars($value->label));
    
        // Bail if there's no value
        if (!$value->value) continue;
    
        // Set pre-selected option
        if ($value->selected === true) {
          $option->setAttribute('selected', 'selected');
          $optionSelected = true;
        }
        $option->setAttribute('value', $value->value);
        $options[] = $option;
      }
    
      // If there is no pre-selected option, build an empty placeholder option
      if ($optionSelected === false) {
        $emptyOption = $dom->createElement('option');
    
        // Set option to hidden, disabled, and selected
        foreach (['hidden', 'disabled', 'selected'] as $attribute)
          $emptyOption->setAttribute($attribute, $attribute);
        $input->appendChild($emptyOption);
      }
    
      // Add options from array to `<select>`
      foreach ($options as $option) {
        $input->appendChild($option);
      }
  break;
  }
}

OK, so there’s a lot going on here, but the underlying logic is the same. After setting up the outer <select>, we make an array of <option>s to append inside it.

We’re also doing some <select>-specific trickery here: If there is no pre-selected option, we add an empty placeholder option that is already selected, but can’t be selected by the user. The goal is to place our <label class="label"> as a “placeholder” using CSS, but this technique can be useful for all kinds of designs. By appending it to the $input before appending the other options, we make sure it is the first option in the markup.

Now let’s handle <fieldset>s of radio buttons and checkboxes:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // Build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $input = null;
  $label = null;

  // Add a `<span>` for the label if it is set
  // ...

  // Build the input element
  switch ($field->type) {
    // ...  
    case 'multiple_choice':
      $choiceType = $field->field_metadata->multi_select === true ? 'checkbox' : 'radio';
      $element->setAttribute('class', "field {$choiceType}-group");
      $input = $dom->createElement('fieldset');
    
      // Build a choice `<input>` for each option in the fieldset
      foreach ($field->field_metadata->values as $choiceValue) {
        $choiceField = $dom->createElement('div');
        $choiceField->setAttribute('class', 'choice');
    
        // Set a unique ID using the field ID + the choice ID
        $choiceID = "{$field->id}-{$choiceValue->value}";
    
        // Build the `<input>` element
        $choice = $dom->createElement('input');
        $choice->setAttribute('type', $choiceType);
        $choice->setAttribute('value', $choiceValue->value);
        $choice->setAttribute('id', $choiceID);
        $choice->setAttribute('name', $field->id);
        $choiceField->appendChild($choice);
    
        // Build the `<label>` element
        $choiceLabel = $dom->createElement('label', $choiceValue->label);
        $choiceLabel->setAttribute('for', $choiceID);
        $choiceField->appendChild($choiceLabel);
    
        $input->appendChild($choiceField);
      }
  break;
  }
}

So, first we determine if the field set should be for checkboxes or radio button. Then we set the container class accordingly, and build the <fieldset>. After that, we iterate over the available choices and build a <div> for each one with an <input> and a <label>.

Notice we use regular PHP string interpolation to set the container class on line 21 and to create a unique ID for each choice on line 30.

Fragments

One last type we have to add is slightly more complex than it looks. Many forms include instruction fields, which aren’t inputs but just some HTML we need to print between other fields.

We’ll need to reach for another DOMDocument method, createDocumentFragment(). This allows us to add arbitrary HTML without using the DOM structuring:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // Build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $input = null;
  $label = null;

  // Add a `<span>` for the label if it is set
  // ...

  // Build the input element
  switch ($field->type) {
    //...  
    case 'instruction':
      $element->setAttribute('class', 'field text');
      $fragment = $dom->createDocumentFragment();
      $fragment->appendXML($field->text);
      $input = $dom->createElement('p');
      $input->appendChild($fragment);
      break;
  }
}

At this point you might be wondering how we found ourselves with an object called $input, which actually represents a static <p> element. The goal is to use a common variable name for each iteration of the fields loop, so at the end we can always add it using $element->appendChild($input) regardless of the actual field type. So, yeah, naming things is hard.

Validation

The API we are consuming kindly provides an individual validation message for each required field. If there’s a submission error, we can show the errors inline together with the fields, rather than a generic “oops, your bad” message at the bottom.

Let’s add the validation text to each element:

<?php
// ...
// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // build the container `<label>`
  $element = $dom->createElement('label');
  $element->setAttribute('class', 'field');

  // Reset input values
  $input = null;
  $label = null;
  $validation = null;

  // Add a `<span>` for the label if it is set
  // ...

  // Add a `<em>` for the validation message if it is set
  if (isset($field->validation_message)) {
    $validation = $dom->createElement('em');
    $fragment = $dom->createDocumentFragment();
    $fragment->appendXML($field->validation_message);
    $validation->appendChild($fragment);
    $validation->setAttribute('class', 'validation-message');
    $validation->setAttribute('hidden', 'hidden'); // Initially hidden, and will be unhidden with Javascript if there's an error on the field
  }

  // Build the input element
  switch ($field->type) {
    // ...
  }
}

That’s all it takes! No need to fiddle with the field type logic — just conditionally build an element for each field.

Bringing it all together

So what happens after we build all the field elements? We need to add the $input, $label, and $validation objects to the DOM tree we’re building. We can also use the opportunity to add common attributes, like required. Then we’ll add the submit button, which is separate from the fields in this API.

<?php
function renderForm ($endpoint) {
  // Get the data from the API and convert it to a PHP object
  // ...

  // Start building the DOM
  $dom = new DOMDocument();
  $form = $dom->createElement('form');

  // Iterate over the fields and build each one
  foreach ($formFields as $field) {
    // Build the container `<label>`
    $element = $dom->createElement('label');
    $element->setAttribute('class', 'field');
  
    // Reset input values
    $input = null;
    $label = null;
    $validation = null;
  
    // Add a `<span>` for the label if it is set
    // ...
  
    // Add a `<em>` for the validation message if it is set
    // ...
  
    // Build the input element
    switch ($field->type) {
      // ...
    }
  
    // Add the input element
    if ($input) {
      $input->setAttribute('id', $field->id);
      if ($field->required)
        $input->setAttribute('required', 'required');
      if (isset($field->max_length))
        $input->setAttribute('maxlength', $field->max_length);
      $element->appendChild($input);
  
      if ($label)
        $element->appendChild($label);
  
      if ($validation)
        $element->appendChild($validation);
  
      $form->appendChild($element);
    }
  }
  
  // Build the submit button
  $submitButtonLabel = $formContent->submit_button_label;
  $submitButtonField = $dom->createElement('div');
  $submitButtonField->setAttribute('class', 'field submit');
  $submitButton = $dom->createElement('button', $submitButtonLabel);
  $submitButtonField->appendChild($submitButton);
  $form->appendChild($submitButtonField);

  // Get the HTML output
  $dom->appendChild($form);
  $htmlString = $dom->saveHTML();
  echo $htmlString;
}

Why are we checking if $input is truthy? Since we reset it to null at the top of the loop, and only build it if the type conforms to our expected switch cases, this ensures we don’t accidentally include unexpected elements our code can’t handle properly.

Hey presto, a custom HTML form!

Bonus points: rows and columns

As you may know, many form builders allow authors to set rows and columns for fields. For example, a row might contain both the first name and last name fields, each in a single 50% width column. So how would we go about implementing this, you ask? By exemplifying (once again) how loop-friendly DOMDocument is, of course!

Our API response includes the grid data like this:

{
  "submit_button_label": "Submit now!",
  "fields": [
    {
      "id": "first-name",
      "type": "text",
      "label": "First name",
      "required": true,
      "validation_message": "First name is required.",
      "max_length": 30,
      "row": 1,
      "column": 1
    },
    {
      "id": "category",
      "type": "multiple_choice",
      "label": "Choose all categories that apply",
      "required": false,
      "field_metadata": {
        "multi_select": true,
        "values": [
          { "value": "travel", "label": "Travel" },
          { "value": "marketing", "label": "Marketing" }
        ]
      },
      "row": 2,
      "column": 1
    }
  ]
}

We’re assuming that adding a data-column attribute is enough for styling the width, but that each row needs to be it’s own element (i.e. no CSS grid).

Before we dive in, let’s think through what we need in order to add rows. The basic logic goes something like this:

  1. Track the latest row encountered.
  2. If the current row is larger, i.e. we’ve jumped to the next row, create a new row element and start adding to it instead of the previous one.

Now, how would we do this if we were concatenating strings? Probably by adding a string like '</div><div class="row">' whenever we reach a new row. This kind of “reversed HTML string” is always very confusing to me, so I can only imagine how my IDE feels. And the cherry on top is that thanks to the browser auto-closing open tags, a single typo will result in a gazillion nested <div>s. Just like fun, but the opposite.

So what’s the structured way to handle this? Thanks for asking. First let’s add row tracking before our loop and build an additional row container element. Then we’ll make sure to append each container $element to its $rowElement rather than directly to $form.

<?php
function renderForm ($endpoint) {
  // Get the data from the API and convert it to a PHP object
  // ...

  // Start building the DOM
  $dom = new DOMDocument();
  $form = $dom->createElement('form');

  // init tracking of rows
  $row = 0;
  $rowElement = $dom->createElement('div');
  $rowElement->setAttribute('class', 'field-row');

  // Iterate over the fields and build each one
  foreach ($formFields as $field) {
    // Build the container `<label>`
    $element = $dom->createElement('label');
    $element->setAttribute('class', 'field');
    $element->setAttribute('data-row', $field->row);
    $element->setAttribute('data-column', $field->column);
    
    // Add the input element to the row
    if ($input) {
      // ...
      $rowElement->appendChild($element);
      $form->appendChild($rowElement);
    }
  }
  // ...
}

So far we’ve just added another <div> around the fields. Let’s build a new row element for each row inside the loop:

<?php
// ...
// Init tracking of rows
$row = 0;
$rowElement = $dom->createElement('div');
$rowElement->setAttribute('class', 'field-row');

// Iterate over the fields and build each one
foreach ($formFields as $field) {
  // ...
  // If we've reached a new row, create a new $rowElement
  if ($field->row > $row) {
    $row = $field->row;
    $rowElement = $dom->createElement('div');
    $rowElement->setAttribute('class', 'field-row');
  }

  // Build the input element
  switch ($field->type) {
    // ...  
    // Add the input element to the row
      if ($input) {
        // ...
        $rowElement->appendChild($element);

        // Automatically de-duped
        $form->appendChild($rowElement);
      }
  }
}

All we need to do is overwrite the $rowElement object as a new DOM element, and PHP treats it as a new unique object. So, at the end of every loop, we just append whatever the current $rowElement is — if it’s still the same one as in the previous iteration, then the form is updated; if it’s a new element, it is appended at the end.

Where do we go from here?

Forms are a great use case for object-oriented templating. And thinking about that snippet from WordPress Core, an argument might be made that nested menus are a good use case as well. Any task where the markup follows complex logic makes for a good candidate for this approach. DOMDocument can output any XML, so you could also use it to build an RSS feed from posts data.

Here’s the entire code snippet for our form. Feel free it adapt it to any form API you find yourself dealing with. Here’s the official documentation, which is good for getting a sense of the available API.

We didn’t even mention DOMDocument can parse existing HTML and XML. You can then look up elements using the XPath API, which is kinda similar to document.querySelector, or cheerio on Node.js. There’s a bit of a learning curve, but it’s a super powerful API for handling external content.

Fun(?) fact: Microsoft Office files that end with x (e.g. .xlsx) are XML files. Don’t tell the marketing department, but it’s possible to parse Word docs and output HTML on the server.

The most important thing is to remember that templating is a superpower. Being able to build the right markup for the right situation can be the key to a great UX.