I’m not sure if “data massaging” is a real thing, but that’s how I think of what I’m about to describe.
Dave and I were thinking about a bit of a redesign for ShopTalk Show. Fresh coat of paint kinda thing. Always nice to do that from time to time. But we wanted to start from the inside out this time. It didn’t sound very appealing to design around the data that we had. We wanted to work with cleaner data. We needed to massage the data that we had, so that it would open up more design possibilities.
We had fallen into the classic WordPress trap
Which is… just dumping everything into the default content area:

We used Markdown, which I think is smart, but it was still a pile of rather unstructured content. An example:

If that content was structured entirely differently every time (like a blog post probably would be), that would be fine. But it wasn’t. Each show has that same structure.
It’s not WordPress’ fault
We just didn’t structure the data correctly. You can mess that up in any CMS.
To be fair, it probably took quite a while to fall into a steady structure. It’s hard to set up data from day one when you don’t know what that structure is going to be. Speaking of which…
The structure we needed
This is what one podcast episode needs as far as structured data:
- Title of episode
- Description of episode
- Featured image of episode
- MP3
  - URL
  - Running Time
  - Size in Bytes
- A list of topics in the show with time stamps
- A list of links
- Optional: Guest(s)
  - Guest Name
  - Guest URL
  - Guest Twitter
  - Guest Bio
  - Guest Photo
- Optional: Advertiser(s)
  - Advertiser Name
  - Advertiser URL
  - Advertiser Text
  - Advertiser Timestamp
- Optional: Job Mention(s)
  - Job Company
  - Job Title
  - Job URL
  - Job Description
- Optional: Transcript
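As a rough, language-neutral sketch of that shape, here is how the list above could be modeled in Python. This is not the ACF configuration itself. The class names, field names, and the label/URL split for links are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mp3:
    url: str
    running_time: str      # kept as text, e.g. "58:12"
    size_in_bytes: int


@dataclass
class Topic:
    timestamp: str         # e.g. "12:45"
    text: str


@dataclass
class Link:
    label: str             # assumed: each link has a label and a URL
    url: str


@dataclass
class Guest:
    name: str
    url: str = ""
    twitter: str = ""
    bio: str = ""
    photo: str = ""


@dataclass
class Advertiser:
    name: str
    url: str = ""
    text: str = ""
    timestamp: str = ""


@dataclass
class JobMention:
    company: str
    title: str
    url: str = ""
    description: str = ""


@dataclass
class Episode:
    title: str
    description: str
    featured_image: str
    mp3: Mp3
    topics: List[Topic] = field(default_factory=list)
    links: List[Link] = field(default_factory=list)
    # The optional, repeatable groups default to empty lists.
    guests: List[Guest] = field(default_factory=list)
    advertisers: List[Advertiser] = field(default_factory=list)
    jobs: List[JobMention] = field(default_factory=list)
    transcript: Optional[str] = None
```

The repeatable groups (guests, advertisers, job mentions) are lists of small records, which is roughly the role a repeating field plays in a CMS.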
Even that’s not perfect
For example: we hand-number the episodes as part of the title, which means when we need that number individually we’re doing string manipulation in the templates, which feels a bit janky.
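To make that concrete, here is a rough sketch of what that kind of string manipulation amounts to. It is Python rather than WordPress template code, and it assumes titles look like "123: Example Title", which is a guess at the format for the sake of the example. The function name is made up.

```python
import re
from typing import Optional, Tuple

# Assumed title format: "<number>: <rest of title>", e.g. "123: Example Title".
TITLE_PATTERN = re.compile(r"^\s*(\d+)\s*:\s*(.*)$")


def split_episode_title(title: str) -> Tuple[Optional[int], str]:
    """Return (episode_number, remaining_title); number is None if absent."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return None, title.strip()
    return int(match.group(1)), match.group(2).strip()
```

Storing the episode number as its own field would make this parsing unnecessary.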
Another example: guests aren’t a programmatic construct to themselves. A guest isn’t its own database record with an ID. Which means if a guest appears on multiple shows, that’s duplicated data. Plus, it doesn’t give us the ability to “display all shows with Rebecca Murphey” very easily, which is something we discussed wanting. There is probably some way to program our way out of this in the future, we’re thinking.
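A minimal sketch of the alternative, where each guest is a record with its own ID and episodes only reference those IDs. This is plain Python, not WordPress code. The names are hypothetical and the sample episodes in the usage lines are invented placeholders, not real show data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Guest:
    id: int
    name: str


@dataclass
class Episode:
    title: str
    guest_ids: List[int]   # references to Guest records, not copies of them


def episodes_by_guest(episodes: List[Episode]) -> Dict[int, List[str]]:
    """Build an index from guest ID to the titles of episodes they appear on."""
    index: Dict[int, List[str]] = defaultdict(list)
    for episode in episodes:
        for guest_id in episode.guest_ids:
            index[guest_id].append(episode.title)
    return dict(index)


# Made-up sample data, purely to show the lookup.
guests = {1: Guest(1, "Rebecca Murphey"), 2: Guest(2, "Another Guest")}
episodes = [
    Episode("Episode A", [1]),
    Episode("Episode B", [2]),
    Episode("Episode C", [1, 2]),
]
shows_with_guest_1 = episodes_by_guest(episodes)[1]
```

With the guest stored once, a bio or photo only needs updating in one place, and "all shows with this guest" becomes a simple lookup.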
Fortunately, that structure is easy to express in Advanced Custom Fields
Once you know what you need, ACF makes it pretty easy to build that out and apply it to whatever kind of page type you need to.
I’m aware that other CMSs encourage this kind of structuring by default. Cool. I think that’s smart. You should be very proud of yourself for choosing YourFavoriteCMS.
In ACF, our “Field Group” ended up like this:

We needed “Repeater” fields for data like guests, where there is a structure that needs to repeat any number of times. That’s a PRO feature of ACF, which seems like a genius move on their part.

Let the data massaging begin
Unfortunately, having the correct structure didn’t mean that all the old data instantly popped into place. There are a few ways we could have gone about this…
We could have split the design of show pages by date. If it was an old show, dump out the content like we always have. If it’s a new show, use the nice data format. That feels like an even bigger mess than what we had, though.
We could have tried to program our way out of it, perhaps with some scripts that would parse the old data, make intelligent guesses about what content should be ported to the new structure, and move it over. Definitely a non-trivial thing to write. Even if we could have written it, it may have taken more time than just moving the data by hand.
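For a sense of what such a script might involve, here is a very rough sketch in Python. It is not something we wrote or ran. It assumes topics appear as lines starting with a timestamp (like "- 12:45 Some topic") and links appear as Markdown links, which is a guess at the old format. The function name and output keys are made up.

```python
import re
from typing import Dict, List

# Assumed: a topic line is an optional bullet, a timestamp, then text.
TIMESTAMP_LINE = re.compile(r"^\s*[-*]?\s*(\d{1,2}:\d{2}(?::\d{2})?)\s+(.+)$")
# Standard inline Markdown link: [label](url)
MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def guess_structure(markdown: str) -> Dict[str, List[dict]]:
    """Make a best-effort guess at topics and links in unstructured Markdown."""
    topics: List[dict] = []
    links: List[dict] = []
    for line in markdown.splitlines():
        topic = TIMESTAMP_LINE.match(line)
        if topic:
            # Timestamped lines are treated as topics; any links inside them
            # stay in the topic text rather than being collected separately.
            topics.append({"timestamp": topic.group(1), "text": topic.group(2).strip()})
            continue
        for label, url in MARKDOWN_LINK.findall(line):
            links.append({"label": label, "url": url})
    return {"topics": topics, "links": links}
```

Even this toy version ignores guests, advertisers, and job mentions, and any post that strays from the assumed format would need a human to check it, which is part of why a script is non-trivial.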
Or… we could move the data by hand. So that’s what we ended up doing. Or rather, we hired someone to move the data for us. Thanks Max! Max Kohler was our data massager.

Hand moving really seemed like the way to go. It’s essentially data entry work, but required a little thought and decision making (hence “massaging”), so it’s the perfect sort of job to either do yourself or find someone who could use some extra hours.
Design is a lot easier with clean and structured data
With all the data nicely cleaned up, I was able to spit it out in a much more consistent and structured way in the design itself:

This latest design of ShopTalk Show is no masterpiece, but now that all this structural work is done, we should be able to focus the next design more on aesthetics and, perhaps, the more fun parts of visual design.
Hey Chris,
Chunking content out is always a good thing but one side effect of doing this is the feed that gets sent to podcast apps has less information in it by default.
May be worth looking into a bit of template wrangling to add the TimeJump & Links sections back into the RSS feed?
The best part of the ShopTalk feed is the ability to use the TimeJump links directly within a podcast app to jump to parts of the episode (using Overcast at least).
I’m guessing one of the most common ways people interact with the episode content is within their podcast app, not the site itself? Maybe call it designing for feed-first? :)
Yes that is an unfortunate side effect!
It’s totally able to be dealt with, though. For example, I put the description of the show back in like this:
I had a little trouble getting a Markdown parser thingy to work from within functions.php, but we’ll get that sorted eventually and get the feed back as full-featured as it was.
Clever idea! I bet more and more sites are doing that, as they build CMS’ in which the content is an API.
You can probably make this DRY-er. For example, some of your guests may return, and it might be silly to add their info again. I had to include writers of a magazine (not Authors). So I set up a CPT called Writers, adding ACF fields similar to yours. Then I attached the writer to the relevant post using ACF’s post object. ACF also allows you to attach multiple objects to a post. So, attaching multiple guests to a show in the order you define, shouldn’t be a problem. Plus, with a CPT, each writer has his/her own page including bio, social info, etc, and a queried list of all his/her articles.
Yep I mention that a bit in the post.
We were also considering using actual WordPress users as user, but ultimately didn’t go there.
As soon as I read your opening paragraph, I thought, “they should use ACF”.
Imagine my surprise!
If you want some more flexibility with the content blocks, you could also use ACF’s pro feature Flexible Content.
We are using this to give our clients the choice of what type of content blocks they want. So, text, image, embed, quote, two cols etc. This way the client has control over what blocks are used in a specific order.
Helps separate all the different content elements and really nail the design part on the front end.
Advanced Custom Fields is, hands down, the best WordPress plugin there is.
Great stuff. Time and time again I find myself using ACF to simplify relatively complex layouts. Clients LOVE how simple it is once it’s all set up.
If you’re using WordPress’s default search, beware of the impact this has on your search results. By default, WP only searches the post title and main post content fields.
If you’re not already using a custom search solution, you’ll probably want to look into something like searchwp.com for including your ACF fields and weighting results.
Nice one – next time you need to wrangle with CPTs a little bit more, take a look at https://de.wordpress.org/plugins/pods/ and relationships. Wouldn’t it be nice if, when you had the same advertiser again, all its data were already there in a separate CPT and you would just have a relation/link to it? Or if you have regular guests? Or create a listing of your guests and what shows they were on – or link from the job board to the show ;)
But yes Data Structure !important :D nice summary of your journey!
As mentioned above, breaking your content into meta fields can make that content less accessible for RSS readers and search.
In instances like this I like to hide the content editor, then when the post is created or updated I generate the markup from the post meta and save it in post_content. This gives you the best of both worlds – structured data entry/storage in the backend, and a single field storing the resulting markup.
Clever. Would like to see an example of that!