Those who follow the blog should be familiar with my distaste for industry buzzwords, particularly the overused word “collaboration,” which I term “hashtag collaboration.” A close second to that famously clichéd term is another, similarly vague buzzword: convergence (or should it be #convergence?). As digital technology has allowed different products to combine, many in my field have spoken at length about the idea of “convergence,” using the word any time two different ideas came together, from the crossover of AV and IT to devices like Crown amplifiers with on-board Digital Signal Processing (DSP).
Two concepts you might think have already been “converged” are audio and video. After all, “audiovisual” as a term has been around since 1937. However, when it comes to professional AV, these two have traditionally been separate systems (we’ve talked about the “convergence” of the AVL signal chain before), especially in the world of integrated AV and large matrix video switchers. That brings us to today’s “Tech Talk,” a series where we speak with the experts here at HARMAN about technology trends and challenges. I had the opportunity to sit down with Paul Hand, Product Manager, Video Distribution for HARMAN Professional Solutions, and talk about how the role of audio in video switching and distribution has changed in recent years, and how that relates to new technology, such as the audio switching board on the AMX Enova DGX Series digital matrix switchers.
[SKD] How have you seen technology needs change regarding traditional AV installations with video switching, especially the way audio works with video?
[PH] When the industry first provided digital video solutions, it was a big deal. In the past, it had always been that video was handled on a video switching plane and audio on an audio switching plane. When the HDMI format came along, you could actually have embedded audio with the video. That was kind of a big deal.
Some of the earlier video distribution systems had audio associated with them, but in many cases, and especially in the larger command and control operations, there were plenty of applications in which audio support or discrete audio switching wasn’t necessarily needed. If you had an embedded audio path with a video signal, you would be perfectly fine. The audio was always going to go to the same place as the video.
But then came the issue of, “How do we get easy access to that audio? How do we switch it?” If you wanted to break the audio away and amplify it for sound reinforcement or modify it in some way, there was no easy way to access the audio or get it into or out of the system at the origination point or the final destination point.
Instead, designers had to figure out a way to decouple the audio and video, and more importantly, consider how they could do the things that they wanted to do. Oftentimes, a designer might want to alter the audio a bit, maybe delay it to stay in sync with video processing happening somewhere else. They might want to tweak the audio from a sound perspective, maybe change the format from stereo to mono. There are plenty of things they might want to do that they could probably achieve by just extracting the audio and taking it to a DSP. The industry really wanted to be able to manipulate the audio separately and even distribute it separately, and there are some specific reasons why.
Take, for example, a room like a cafeteria. Stereo in a space like that doesn’t make sense, right? There is no such thing as left and right in a space like this—it depends on where you’re sitting. And if I just pass the left signal to one side and the right somewhere else, you’re only going to hear half of the content. So, being able to take what started as a left and a right, sum them, and send the result into the room in a mono format to all the speakers is a big deal.
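To make the summing Paul describes concrete, here is a minimal sketch in Python. The sample values and the simple halving are illustrative assumptions; a real DSP path operates on a continuous audio stream with proper gain staging.

```python
def sum_to_mono(left, right):
    """Sum left and right channels, halving the result to avoid clipping."""
    return [(l + r) / 2 for l, r in zip(left, right)]

# Illustrative sample values, not real audio data
left = [0.5, 0.25, -0.5]
right = [0.5, -0.25, 0.25]

mono = sum_to_mono(left, right)
print(mono)  # → [0.5, 0.0, -0.125]
```

The same mono feed then goes to every speaker, so listeners hear the full content no matter where they sit.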
Another example would be a room that is filled with soft sound absorption material. You’re going to have very different acoustical characteristics than a room with hard surfaces. You can have the room set up perfectly and as soon as it gets filled up with people, the entire setup changes because of the way bodies absorb or reflect sound. With 10-band parametric equalization on every output and the ability to change the acoustical attributes programmatically from a control system, you could have presets based on how many people are going to be in a room.
Back in the analog days, you just handled two separate planes: there was the video switching plane and then there was the audio switching plane, and audio was pretty basic because it was always stereo audio in, stereo audio out, and you just switched it. If you wanted to run some of those feeds to a DSP, you did that. But once we started combining and coupling those signals, the industry saw a lot of benefit from having them as embedded signals, with one HDMI cable connector carrying all that media. You almost took a step back. It was ease of use for the guy putting the cable in, but from a programmer’s, integrator’s, or audio designer’s perspective, it was difficult because now they didn’t have the access they did before. Now, we’re finally getting back to that spot.
It is still easy to put together from an integration standpoint, and it allows the designer to do everything they want to do on the video and audio side. It’s funny, in the digital world, we got back to the last generation and now we’ve leapfrogged. In this generation, now we have the ability to do everything we could before, but we’re doing even more, and that’s where the integrated audio processing capability comes into play.
It makes sense that we, as HARMAN, have awesome audio processing capabilities inside the Enova DGX platform. We’re already touching it; we already have access to all those audio paths. Why not give an integrator or programmer the ability to go in and either set something up once, based on the acoustical attributes of a room, or—this is the really cool thing from a programmer’s perspective—take advantage of the fact that everything is controllable and build presets based on your expected room population for a particular room configuration?
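As a sketch of what occupancy-based presets might look like from the control-system side, here is a hypothetical example. The preset names, gain values, and thresholds are all assumptions for illustration; the real Enova DGX control API and its 10-band parametric EQ parameters are different.

```python
# Hypothetical occupancy-based EQ preset selection. Preset names, gain
# values, and thresholds are illustrative, not real Enova DGX settings.
EQ_PRESETS = {
    "empty": {"low_shelf_gain_db": -2.0, "presence_gain_db": 1.0},
    "half":  {"low_shelf_gain_db":  0.0, "presence_gain_db": 2.0},
    "full":  {"low_shelf_gain_db":  1.5, "presence_gain_db": 3.0},
}

def preset_for_occupancy(people, capacity):
    """Pick a preset name based on how full the room is expected to be."""
    ratio = people / capacity
    if ratio < 0.25:
        return "empty"
    if ratio < 0.75:
        return "half"
    return "full"

name = preset_for_occupancy(people=120, capacity=150)
print(name, EQ_PRESETS[name])
```

A control system could run logic like this on a room-booking event and push the chosen preset to the relevant outputs before anyone walks in.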
[SKD] With what we know about the centralized infrastructure switching products like the Enova DGX, how does that relate to the rest of an audio system and the rest of a complete HARMAN system?
[PH] Looking at the flow, the ins and outs, there’s something really interesting that often gets overlooked. Inside a video distribution solution, oftentimes we want to mix in or provide additional audio ins and additional audio outs that may not have originated with video sources. I may want to tie into a public address system or some mixed microphones along the way. So, inevitably there is a need to bring in auxiliary audio sources, or to have auxiliary audio outs that feed further DSP work: true mixing with other sources, or mixing a couple of audio sources together so they can live on a common output.
Here’s the challenge. In most video distribution solutions on the market, every audio input is tied to a video input, and vice versa. Which means that if you want to bring in an audio-only source, you do it by taking up a video input. The problem is cost. If you look at the digital video backplane, there’s a lot of expense in that high-speed digital backplane required to deliver high-definition video. Audio doesn’t need all that. When you add in a few audio sources and take up those video spots, that is a lot of processing power designed for video switching that you aren’t leveraging. This is problematic, because moving to the next-sized system means a lot of additional cost for extra room on the backplane that you don’t need. It’s like buying another laptop so you can have an additional power cord. It doesn’t make sense from a design perspective.
The cool thing with the audio switching board on the Enova DGX 100 series is that we provide eight additional stereo inputs and eight additional stereo outputs that do not consume any of those very expensive video spots for audio sources or audio outputs only. They’re just like all the other video ins and outs, though—you can freely switch them. The user can have all the flexibility of consuming a video spot without actually doing that. We’re giving the customer that capability as well as the digital signal processing, and it’s all included with the audio switching board solution. That’s a big deal.
[SKD] Tying back to the design, let’s say you have a system where you have audio from a source device that’s going into a soundboard in a room like a mixed-use space. Systems like that also have microphones that you need in the mix, and you might want that combined signal over a PA or even combined with video for your video distribution outputs around the building. How would the system handle that?
[PH] In a system like that, you can output the audio using one of the audio-only outputs on the audio switching board. You could even peel off audio from multiple inputs and send them to the audio outputs. You would take that audio, go into a soundboard, mix it with microphones and other audio sources, and then have the post-mix go back into the input side of the Enova DGX through the audio switching board. Now you can switch that “mixed” signal freely to any output. You could have homogeneous sound going to the speakers and the displays, whether that feeds a mixed audio system or an independent audio system driven from the same mixed output. This is great for overflow rooms. It is incredibly versatile, and it gives you access points before and after the switch, with the ability to truly switch the audio and then embed the mixed or processed signal in any output you want.
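The routing Paul walks through can be sketched as a toy crosspoint map. The input and output names below are made up for illustration and do not correspond to real Enova DGX identifiers.

```python
routes = {}  # output name -> currently routed input name

def switch(input_name, output_names):
    """Route one input to any number of outputs (a one-to-many crosspoint)."""
    for out in output_names:
        routes[out] = input_name

# The post-mix feed from the soundboard returns on an audio-only input,
# then gets switched freely to displays, the PA, and an overflow room.
switch("audio_in_1", ["display_3_embed", "overflow_room_amp", "pa_zone_2"])
print(routes["overflow_room_amp"])  # → audio_in_1
```

Because the mixed feed re-enters the matrix as a regular switchable input, any output, embedded in video or audio-only, can carry the identical processed signal.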
A tremendous thanks to Paul Hand for taking the time to speak with us about this topic. It is fascinating to see how the AV industry has changed over recent years.
Do you have experience with digital matrix audio and video switching? Share your thoughts in the comments.