Let’s Create a Custom Audio Player

Avatar of Idorenyin Udoh
Idorenyin Udoh on

HTML has a built-in native audio player interface that we get simply using the <audio> element. Point it to a sound file and that’s all there is to it. We even get to specify multiple files for better browser support, as well as a little CSS flexibility to style things up, like giving the audio player a border, some rounded corners, and maybe a little padding and margin.

But even with all that… the rendered audio player itself can look a little, you know, plain.

Did you know it’s possible to create a custom audio player? Of course we can! While the default <audio> player is great in many cases, having a custom player might suit you better, like if you run a podcast and an audio player is the key element on a website for the podcast. Check out the sweet custom player Chris and Dave set up over at the ShopTalk Show website.

Showing a black audio player with muted orange controls, including options to jump or rewind 30 seconds on each side of a giant play button, which sits on top of a timeline showing the audio current time and duration at both ends of the timeline, and an option to set the audio speed in the middle.
The audio player fits in seamlessly with other elements on the page, sporting controls that complement the overall design.

We’re going to take stab at making our own player in this post. So, put on your headphones, crank up some music, and let’s get to work!

The elements of an audio player

First, let’s examine the default HTML audio players that some of the popular browsers provide.

Google Chrome, Opera, and Microsoft Edge
Blink
Mozilla Firefox
Firefox
Internet Explorer
Internet Explorer

If our goal is to match the functionality of these examples, then we need to make sure our player has:

  • a play/pause button,
  • a seek slider,
  • the current time indicator,
  • the duration of the sound file,
  • a way to mute the audio, and
  • a volume control slider.

Let’s say this is the design we’re aiming for:

We’re not going for anything too fancy here: just a proof of concept sorta thing that we can use to demonstrate how to make something different than what default HTML provides.

Basic markup, styling and scripts for each element

We should first go through the semantic HTML elements of the player before we start building features and styling things. We have plenty of elements to work with here based on the elements we just listed above.

Play/pause button

I think the HTML element appropriate for this button is the <button> element. It will contain the play icon, but the pause icon should also be in this button. That way, we’re toggling between the two rather than taking up space by displaying both at the same time.

Something like this in the markup:

<div id="audio-player-container">
  <p>Audio Player</p>
  <!-- swaps with pause icon -->
  <button id="play-icon"></button>
</div>

So, the question becomes: how do we swap between the two buttons, both visually and functionally? the pause icon will replace the play icon when the play action is triggered. The play button should display when the audio is paused and the pause button should display when the audio is playing.

Of course, a little animation could take place as the icon transitions from the play to pause. What would help us accomplish that is Lottie, a library that renders Adobe After Effects animations natively. We don’t have to create the animation on After Effects though. The animated icon we’re going to use is provided for free by Icons8.

New to Lottie? I wrote up a thorough overview that covers how it works.

In the meantime, permit me to describe the following Pen:

The HTML section contains the following:

  • a container for the player,
  • text that briefly describes the container, and
  • a <button> element for the play and pause actions.

The CSS section includes some light styling. The JavaScript is what we need to break down a bit because it’s doing several things:

// imports the Lottie library via Skypack
import lottieWeb from 'https://cdn.skypack.dev/lottie-web';

// variable for the button that will contain both icons
const playIconContainer = document.getElementById('play-icon');
// variable that will store the button’s current state (play or pause)
let state = 'play';

// loads the animation that transitions the play icon into the pause icon into the referenced button, using Lottie’s loadAnimation() method
const animation = lottieWeb.loadAnimation({
  container: playIconContainer,
  path: 'https://maxst.icons8.com/vue-static/landings/animated-icons/icons/pause/pause.json',
  renderer: 'svg',
  loop: false,
  autoplay: false,
  name: "Demo Animation",
});

animation.goToAndStop(14, true);

// adds an event listener to the button so that when it is clicked, the the player toggles between play and pause
playIconContainer.addEventListener('click', () => {
  if(state === 'play') {
    animation.playSegments([14, 27], true);
    state = 'pause';
  } else {
    animation.playSegments([0, 14], true);
    state = 'play';
  }
});

Here’s what the script is doing, minus the code:

  • It imports the Lottie library via Skypack.
  • It references the button that will contain both icons in a variable.
  • It defines a variable that will store the button’s current state (play or pause).
  • It loads the animation that transitions the play icon into the pause icon into the referenced button, using Lottie’s loadAnimation() method.
  • It displays the play icon on load since the audio is initially paused.
  • It adds an event listener to the button so that when it is clicked, the the player toggles between play and pause.

Current time and duration

The current time is like a progress indicate that shows you how much time has elapsed from the start of the audio file. The duration? That’s just how long the sound file is.

A <span> element is okay to display these. The <span> element for the current time, which is to be updated every second, has a default text content of 0:00. On the other side, the one for duration is the duration of the audio in mm:ss format.

<div id="audio-player-container">
  <p>Audio Player</p>
  <button id="play-icon"></button>
  <span id="current-time" class="time">0:00</span>
  <span id="duration" class="time">0:00</span>
</div>

Seek slider and volume control slider

We need a way to move to any point in time in the sound file. So, if I want to skip ahead to the halfway point of the file, I can simply click and drag a slider to that spot in the timeline.

We also need a way to control the sound volume. That, too, can be some sort of click-and-drag slider thingy.

I would say <input type="range"> is the right HTML element for both of these features.

<div id="audio-player-container">
  <p>Audio Player</p>
  <button id="play-icon"></button>
  <span id="current-time" class="time">0:00</span>
  <input type="range" id="seek-slider" max="100" value="0">
  <span id="duration" class="time">0:00</span>
  <input type="range" id="volume-slider" max="100" value="100">
</div>

Styling range inputs with CSS is totally possible, but I’ll tell you what: it is difficult for me to wrap my head around. This article will help. Mad respect to you, Ana. Handling browser support with all of those vendor prefixes is a CSS trick in and of itself. Look at all the code needed on input[type="range"] to get a consistent experience:

input[type="range"] {
  position: relative;
  -webkit-appearance: none;
  width: 48%;
  margin: 0;
  padding: 0;
  height: 19px;
  margin: 30px 2.5% 20px 2.5%;
  float: left;
  outline: none;
}
input[type="range"]::-webkit-slider-runnable-track {
  width: 100%;
  height: 3px;
  cursor: pointer;
  background: linear-gradient(to right, rgba(0, 125, 181, 0.6) var(--buffered-width), rgba(0, 125, 181, 0.2) var(--buffered-width));
}
input[type="range"]::before {
  position: absolute;
  content: "";
  top: 8px;
  left: 0;
  width: var(--seek-before-width);
  height: 3px;
  background-color: #007db5;
  cursor: pointer;
}
input[type="range"]::-webkit-slider-thumb {
  position: relative;
  -webkit-appearance: none;
  box-sizing: content-box;
  border: 1px solid #007db5;
  height: 15px;
  width: 15px;
  border-radius: 50%;
  background-color: #fff;
  cursor: pointer;
  margin: -7px 0 0 0;
}
input[type="range"]:active::-webkit-slider-thumb {
  transform: scale(1.2);
  background: #007db5;
}
input[type="range"]::-moz-range-track {
  width: 100%;
  height: 3px;
  cursor: pointer;
  background: linear-gradient(to right, rgba(0, 125, 181, 0.6) var(--buffered-width), rgba(0, 125, 181, 0.2) var(--buffered-width));
}
input[type="range"]::-moz-range-progress {
  background-color: #007db5;
}
input[type="range"]::-moz-focus-outer {
  border: 0;
}
input[type="range"]::-moz-range-thumb {
  box-sizing: content-box;
  border: 1px solid #007db5;
  height: 15px;
  width: 15px;
  border-radius: 50%;
  background-color: #fff;
  cursor: pointer;
}
input[type="range"]:active::-moz-range-thumb {
  transform: scale(1.2);
  background: #007db5;
}
input[type="range"]::-ms-track {
  width: 100%;
  height: 3px;
  cursor: pointer;
  background: transparent;
  border: solid transparent;
  color: transparent;
}
input[type="range"]::-ms-fill-lower {
  background-color: #007db5;
}
input[type="range"]::-ms-fill-upper {
  background: linear-gradient(to right, rgba(0, 125, 181, 0.6) var(--buffered-width), rgba(0, 125, 181, 0.2) var(--buffered-width));
}
input[type="range"]::-ms-thumb {
  box-sizing: content-box;
  border: 1px solid #007db5;
  height: 15px;
  width: 15px;
  border-radius: 50%;
  background-color: #fff;
  cursor: pointer;
}
input[type="range"]:active::-ms-thumb {
  transform: scale(1.2);
  background: #007db5;
}

Whoa! What does even mean, right?

Styling the progress section of range inputs is a tricky endeavor. Firefox provides the ::-moz-range-progress pseudo-element while Internet Explorer provides ::-ms-fill-lower. As WebKit browsers do not provide any similar pseudo-element, we have to use the ::before pseudo-element to improvise the progress. That explains why, if you noticed, I added event listeners in the JavaScript section to set custom CSS properties (e.g. --before-width) that update when the input event is fired on each of the sliders.

One of the native HTML <audio> examples we looked at earlier shows the buffered amount of the audio. The --buffered-width property specifies the amount of the audio, in percentage, that the user can play through to without having to wait for the browser to download. I imitated this feature with the linear-gradient() function on the track of the seek slider. I used the rgba() function in the linear-gradient() function for the color stops to show transparency. The buffered width has a much deeper color compared to the rest of the track. However, we would treat the actual implementation of this feature much later.

Volume percentage

This is to display the percentage volume. The text content of this element is updated as the user changes the volume through the slider. Since it is based on user input, I think this element should be the <output> element.

<div id="audio-player-container">
  <p>Audio Player</p>
  <button id="play-icon"></button>
  <span id="current-time" class="time">0:00</span>
  <input type="range" id="seek-slider" max="100" value="0">
  <span id="duration" class="time">0:00</span>
  <output id="volume-output">100</output>
  <input type="range" id="volume-slider" max="100" value="100">
</div>

Mute button

Like for the play and pause actions, this should be in a <button> element. Luckily for us, Icons8 also has an animated mute icon. So we would use the Lottie library here just as we did for the play/pause button.

<div id="audio-player-container">
  <p>Audio Player</p>
  <button id="play-icon"></button>
  <span id="current-time" class="time">0:00</span>
  <input type="range" id="seek-slider" max="100" value="0">
  <span id="duration" class="time">0:00</span>
  <output id="volume-output">100</output>
  <input type="range" id="volume-slider" max="100" value="100">
  <button id="mute-icon"></button>
</div>

That’s all of the basic markup, styling and scripting we need at the moment!

Working on the functionality

The HTML <audio> element has a preload attribute. This attribute gives the browser instructions for how to load the audio file. It accepts one of three values:

  • none – indicates that the browser should not load the audio at all (unless the user initiates the play action)
  • metadata – indicates that only the metadata (like length) of the audio should be loaded
  • auto – loads the complete audio file

An empty string is equivalent to the auto value. Note, however, that these values are merely hints to the browser. The browser does not have to agree to these values. For example, if a user is on a cellular network on iOS, Safari does not load any part of an audio, regardless of the preload attribute, except the user triggers the play action. For this player, we would use the metadata value since it doesn’t require much overhead and we want to display the length of the audio.

What would help us accomplish the features our audio player should have is the JavaScript HTMLMediaElement interface, which the HTMLAudioElement interface inherits. For our audio player code to be as self-explanatory as possible, I’d divide the JavaScript into two sections: presentation and functionality.

First off, we should create an <audio> element in the audio player that has the basic features we want:

<div id=”audio-player-container”>
  <audio src=”my-favourite-song.mp3” preload=”metadata” loop>
  <button id="play-icon"></button>
  <!-- ... -->
</div>

Display the audio duration

The first thing we want to display on the browser is the duration of the audio, when it is available. The HTMLAudioElement interface has a duration property, which returns the duration of the audio, returned in seconds units. If it is unavailable, it returns NaN.

We’ve set preload to metadata in this example, so the browser should provide us that information up front on load… assuming it respects preload. Since we’d be certain that the duration will be available when the browser has downloaded the metadata of the audio, we display it in our handler for the loadedmetadata event, which the interface also provides:

const audio = document.querySelector('audio');

audio.addEventListener('loadedmetadata', () => {
  displayAudioDuration(audio.duration);
});

That’s great but, again, we get the duration in second. We probably should convert that to a mm:ss format:

const calculateTime = (secs) => {
  const minutes = Math.floor(secs / 60);
  const seconds = Math.floor(secs % 60);
  const returnedSeconds = seconds < 10 ? `0${seconds}` : `${seconds}`;
  return `${minutes}:${returnedSeconds}`;
}

We’re using Math.floor() because the seconds returned by the duration property typically come in decimals.

The third variable, returnedSeconds, is necessary for situations where the duration is something like 4 minutes and 8 seconds. We would want to return 4:08, not 4:8.

More often than not, the browser loads the audio faster than usual. When this happens, the loadedmetadata event is fired before its listener can be added to the <audio> element. Therefore, the audio duration is not displayed on the browser. Nevertheless, there’s a hack. The HTMLMediaElement has a property called readyState. It returns a number that, according to MDN Web Docs, indicates the readiness state of the media. The following describes the values:

  • 0 – no data about the media is available.
  • 1 – the metadata attributes of the media are available.
  • 2 – data is available, but not enough to play more than a frame.
  • 3 – data is available, but for a little amount of frames from the current playback position.
  • 4 – data is available, such that the media can be played through to the end without interruption.

We want to focus on the metadata. So our approach is to display the duration if the metadata of the audio is available. If it is not available, we add the event listener. That way, the duration is always displayed.

const audio = document.querySelector('audio');
const durationContainer = document.getElementById('duration');

const calculateTime = (secs) => {
  const minutes = Math.floor(secs / 60);
  const seconds = Math.floor(secs % 60);
  const returnedSeconds = seconds < 10 ? `0${seconds}` : `${seconds}`;
  return `${minutes}:${returnedSeconds}`;
}

const displayDuration = () => {
  durationContainer.textContent = calculateTime(audio.duration);
}

if (audio.readyState > 0) {
  displayDuration();
} else {
  audio.addEventListener('loadedmetadata', () => {
    displayDuration();
  });
}

Seek slider

The default value of the range slider’s max property is 100. The general idea is that when the audio is playing, the thumb is supposed to be “sliding.” Also, it is supposed to move every second, such that it gets to the end of the slider when the audio ends.

Notwithstanding, if the audio duration is 150 seconds and the value of the slider’s max property is 100, the thumb will get to the end of the slider before the audio ends. This is why it is necessary to set the value of the slider’s max property to the audio duration in seconds. This way, the thumb gets to the end of the slider when the audio ends. Recall that this should be when the audio duration is available, when the browser has downloaded the audio metadata, as in the following:

const seekSlider = document.getElementById('seek-slider');

const setSliderMax = () => {
  seekSlider.max = Math.floor(audio.duration);
}

if (audio.readyState > 0) {
  displayDuration();
  setSliderMax();
} else {
  audio.addEventListener('loadedmetadata', () => {
    displayDuration();
    setSliderMax();
  });
}

Buffered amount

As the browser downloads the audio, it would be nice for the user to know how much of it they can seek to without delay. The HTMLMediaElement interface provides the buffered and seekable properties. The buffered property returns a TimeRanges object, which indicates the chunks of media that the browser has downloaded. According to MDN Web Docs, a TimeRanges object is a series of non-overlapping ranges of time, with start and stop times. The chunks are usually contiguous, unless the user seeks to another part in the media. The seekable property returns a TimeRanges object, which indicates “seekable” parts of the media, irrespective of whether they’ve been downloaded or not.

Recall that the preload="metadata" attribute is present in our <audio> element. If, for example the audio duration is 100 seconds, the buffered property returns a TimeRanges object similar to the following:

Displays a line from o to 100, with a marker at 20. A range from 0 to 20 is highlighted on the line.

When the audio has started playing, the seekable property would return a TimeRanges object similar to the following:

Another timeline, but with four different ranges highlighted on the line, 0 to 20, 30 to 40, 60 to 80, and 90 to 100.

It returns multiple chunks of media because, more often than not, byte-range requests are enabled on the server. What this means is that multiple parts of the media can be downloaded simultaneously. However, we want to display the buffered amount closest to the current playback position. That would be the first chunk (time range 0 to 20). That would be the first and last chunk from the first image. As the audio starts playing, the browser begins to download more chunks. We would want to display the one closest to the current playback position, which would be the current last chunk returned by the buffered property. The following snippet would store in the variable, bufferedAmount, i.e. the time for the end of the last range in the TimeRanges object returned by the buffered property.

const audio = document.querySelector('audio');
const bufferedAmount = audio.buffered.end(audio.buffered.length - 1);

This would be 20 from the 0 to 20 range in the first image. The following snippet stores in the variable, seekableAmount, the time for the end of the last range in the TimeRanges object returned by the seekable property.

const audio = document.querySelector('audio');
const seekableAmount = audio.seekable.end(audio.seekable.length - 1);

Nevertheless, this would be 100 from the 90 to 100 range in the second image, which is the entire audio duration. Note that there are some holes in the TimeRanges object as the browser only downloads some parts of the audio. What this means is that the entire duration would be displayed to the user as the buffered amount. Meanwhile, some parts in the audio are not available yet. Because this won’t provide the best user experience, the first snippet is what we should use.

As the browser downloads the audio, the user should expect that the buffered amount on the slider increases in width. The HTMLMediaElement provides an event, the progress event, which fires as the browser loads the media. Of course, I’m thinking what you’re thinking! The buffered amount should be incremented in the handler for the audio’s progress event.

Finally, we should actually display the buffered amount on the seek slider. We do that by setting the property we talked about earlier, --buffered-width, as a percentage of the value of the slider’s max property. Yes, in the handler for the progress event too. Also, because of the browser loading the audio faster than usual, we should update the property in the loadedmetadata event and its preceding conditional block that checks for the readiness state of the audio. The following Pen combines all that we’ve covered so far:

Current time

As the user slides the thumb along the range input, the range value should be reflected in the <span> element containing the current time of the audio. This tells the user the current playback position of the audio. We do this in the handler of the slider’s input event listener.

If you think the correct event to listen to should be the change event, I beg to differ. Say the user moved the thumb from value 0 to 20. The input event fires at values 1 through to 20. However, the change event will fire only at value 20. If we use the change event, it will not reflect the playback position from values 1 to 19. So, I think the input event is appropriate. Then, in the handler for the event, we pass the slider’s value to the calculateTime() function we defined earlier.

We created the function to take time in seconds and return it in a mm:ss format. If you’re thinking, Oh, but the slider’s value is not time in seconds, let me explain. Actually, it is. Recall that we set the value of the slider’s max property to the audio duration, when it is available. Let’s say the audio duration is 100 seconds. If the user slides the thumb to the middle of the slider, the slider’s value will be 50. We wouldn’t want 50 to appear in the current time box because it is not in accordance with the mm:ss format. When we pass 50 to the function, the function returns 0:50 and that would be a better representation of the playback position.

I added the snippet below to our JavaScript.

const currentTimeContainer = document.getElementById('current-time');

seekSlider.addEventListener('input', () => {
  currentTimeContainer.textContent = calculateTime(seekSlider.value);
});

To see it in action, you can move the seek slider’s thumb back and forth in the following Pen:

Play/pause

Now we’re going to set the audio to play or pause according to the respective action triggered by the user. If you recall, we created a variable, playState, to store the state of the button. That variable is what will help us know when to play or pause the audio. If its value is play and the button is clicked, our script is expected to perform the following actions:

  • play the audio
  • change the icon from play to pause
  • change the playState value to pause

We already implemented the second and third actions in the handler for the button’s click event. What we need to do is to add the statements to play and pause the audio in the event handler:

playIconContainer.addEventListener('click', () => {
  if(playState === 'play') {
    audio.play();
    playAnimation.playSegments([14, 27], true);
    playState = 'pause';
  } else {
    audio.pause();
    playAnimation.playSegments([0, 14], true);
    playState = 'play';
  }
});

It is possible that the user will want to seek to a specific part in the audio. In that case, we set the value of the audio’s currentTime property to the seek slider’s value. The slider’s change event will come in handy here. If we use the input event, various parts of the audio will play in a very short amount of time.

Recall our scenario of 1 to 20 values. Now imagine the user slides the thumb from 1 to 20 in, say, two seconds. That’s 20 seconds audio playing in two seconds. It’s like listening to Busta Rhymes on 3× speed. I’d suggest we use the change event. The audio will only play after the user is done seeking. This is what I’m talking about:

seekSlider.addEventListener('change', () => {
  audio.currentTime = seekSlider.value;
});

With that out of the way, something needs to be done while the audio is playing. That is to set the slider’s value to the current time of the audio. Or move the slider’s thumb by one tick every second. Since the audio duration and the slider’s max value are the same, the thumb gets to the end of the slider when the audio ends. Now the timeupdate event of the HTMLMediaElement interface should be the appropriate event for this. This event is fired as the value of the media’s currentTime property is updated, which is approximately four times in one second. So in the handler for this event, we could set the slider’s value to the audio’s current time. This should work just fine:

audio.addEventListener('timeupdate', () => {
  seekSlider.value = Math.floor(audio.currentTime);
});

However, there are some things to take note of here:

  1. As the audio is playing, and the seek slider’s value is being updated, a user is unable to interact with the slider. If the audio is paused, the slider won’t be able to receive input from the user because it is constantly being updated.
  2. In the handler, we update the value of the slider but its input event does not fire. This is because the event only fires when a user updates the slider’s value on the browser, and not when it is updated programmatically.

Let’s consider the first issue.

To be able to interact with the slider while the audio is playing, we would have to pause the process of updating it’s value when it receives input. Then, when the slider loses focus, we resume the process. But, we don’t have access to this process. My hack would be to use the requestAnimationFrame() global method for the process. But this time, we won’t be using the timeupdate event for this because it still won’t work. The animation would play forever until the audio is paused, and that’s not what we want. Therefore, we use the play/pause button’s click event.

To use the requestAnimationFrame() method for this feature, we have to accomplish these steps:

  1. Create a function to keep our “update slider value” statement.
  2. Initialize a variable in the function that was previously created to store the request ID returned by the function (that will be used to pause the update process).
  3. Add the statements in the play/pause button click event handler to start and pause the process in the respective blocks.

This is illustrated in the following snippet:

let rAF = null;

const whilePlaying = () => {
  seekSlider.value = Math.floor(audio.currentTime);
  rAF = requestAnimationFrame(whilePlaying);
}

playIconContainer.addEventListener('click', () => {
  if(playState === 'play') {
    audio.play();
    playAnimation.playSegments([14, 27], true);
    requestAnimationFrame(whilePlaying);
    playState = 'pause';
  } else {
    audio.pause();
    playAnimation.playSegments([0, 14], true);
    cancelAnimationFrame(rAF);
    playState = 'play';
  }
});

But this doesn’t exactly solve our problem. The process is only paused when the audio is paused. We also need to pause the process, if it is in execution (i.e. if the audio is playing), when the user wants to interact with the slider. Then, after the slider loses focus, if the process was ongoing before (i.e. if the audio was playing), we start the process again. For this, we would use the slider’s input event handler to pause the process. To start the process again, we would use the change event because it is fired after the user is done sliding the thumb. Here is the implementation:

seekSlider.addEventListener('input', () => {
  currentTimeContainer.textContent = calculateTime(seekSlider.value);
  if(!audio.paused) {
    cancelAnimationFrame(raf);
  }
});

seekSlider.addEventListener('change', () => {
  audio.currentTime = seekSlider.value;
  if(!audio.paused) {
    requestAnimationFrame(whilePlaying);
  }
});

I was able to come up with something for the second issue. I added the statements in the seek slider’s input event handlers to the whilePlaying() function. Recall that there are two event listeners for the slider’s input event: one for the presentation, and the other for the functionality. After adding the two statements from the handlers, this is how our whilePlaying() function looks:

const whilePlaying = () => {
  seekSlider.value = Math.floor(audio.currentTime);
  currentTimeContainer.textContent = calculateTime(seekSlider.value);
  audioPlayerContainer.style.setProperty('--seek-before-width', `${seekSlider.value / seekSlider.max * 100}%`);
  raf = requestAnimationFrame(whilePlaying);
}

Note that the statement on the fourth line is the seek slider’s appropriate statement from the showRangeProgress() function we created earlier in the presentation section.

Now we’re left with the volume-control functionality. Whew! But before we begin working on that, here’s a Pen covering all we’ve done so far:

Volume-control

For volume-control, we’re utilizing the second slider, #volume-slider. When the user interacts with the slider, the slider’s value is reflected in the volume of the audio and the <output> element we created earlier.

The slider’s max property has a default value of 100. This makes it easy to display its value in the <output> element when it is updated. We could implement this in the input event handler of the slider. However, to implement this in the volume of the audio, we’re going to have to do some math.

The HTMLMediaElement interface provides a volume property, which returns a value between 0 and 1, where 1 being is the loudest value. What this means is if the user sets the slider’s value to 50, we would have to set the volume property to 0.5. Since 0.5 is a hundredth of 50, we could set the volume to a hundredth of the slider’s value.

const volumeSlider = document.getElementById('volume-slider');
const outputContainer = document.getElementById('volume-output');

volumeSlider.addEventListener('input', (e) => {
  const value = e.target.value;

  outputContainer.textContent = value;
  audio.volume = value / 100;
});

Not bad, right?

Muting audio

Next up is the speaker icon, which is clicked to mute and unmute the audio. To mute the audio, we would use its muted property, which is also available via HTMLMediaElement as a boolean type. Its default value is false, which is unmuted. To mute the audio, we set the property to true. If you recall, we added a click event listener to the speaker icon for the presentation (the Lottie animation). To mute and unmute the audio, we should add the statements to the respective conditional blocks in that handler, as in the following:

const muteIconContainer = document.getElementById('mute-icon');

muteIconContainer.addEventListener('click', () => {
  if(muteState === 'unmute') {
    muteAnimation.playSegments([0, 15], true);
    audio.muted = true;
    muteState = 'mute';
  } else {
    muteAnimation.playSegments([15, 25], true);
    audio.muted = false;
    muteState = 'unmute';
  }
});

Full demo

Here’s the full demo of our custom audio player in all its glory!

But before we call it quits, I’d like to introduce something — something that will give our user access to the media playback outside of the browser tab where our custom audio player lives.

Permit me to introduce to you, drumroll, please…

The Media Session API

Basically, this API lets the user pause, play, and/or perform other media playback actions, but not with our audio player. Depending on the device or the browser, the user initiates these actions through the notification area, media hubs, or any other interface provided by their browser or OS. I have another article just on that for you to get more context on that.

The following Pen contains the implementation of the Media Session API:

If you view this Pen on your mobile, take a sneak peek at the notification area. If you’re on Chrome on your computer, check the media hub. If your smartwatch is paired, I’d suggest you look at it. You could also tell your voice assistant to perform some of the actions on the audio. Ten bucks says it’ll make you smile. 🤓

One more thing…

If you need an audio player on a webpage, there’s a high chance that the page contains other stuff. That’s why I think it’s smart to group the audio player and all the code needed for it into a web component. This way, the webpage possesses a form of separation of concerns. I transferred everything we’ve done into a web component and came up with the following:

Wrapping up, I’d say the possibilities of creating a media player are endless with the HTMLMediaElement interface. There’s so many various properties and methods for various functions. Then there’s the Media Session API for an enhanced experience.

What’s the saying? With great power comes great responsibility, right? Think of all the various controls, elements, and edge cases we had to consider for what ultimately amounts to a modest custom audio player. Just goes to show that audio players are more than hitting play and pause. Properly speccing out functional requirements will definitely help plan your code in advance and save you lots of time.