What Is "Spatial Audio"? How Do Apple and Sony Differ? An Easy-to-Understand Explanation | Mogura VR

You have probably been hearing the term "spatial audio" a lot recently. Apple, Amazon, Sony, and others have been actively promoting it, and in 2021 it became the most notable technical topic in the music field.

So what is this technology? It is of course related to VR and AR; in fact, you could say the technologies are continuous with one another in the first place. I would like to go over the outline again.

How people hear sound with their "ears"

First, let's start with the basics.

Humans hear sound with two ears. Put simply, that is stereo with two channels, left and right, but the reality is not that simple.

Sound in a real space reflects off various surfaces, and the mixed sound reaches the ears. There it is reflected and gathered by the shape of the ear (the so-called outer ear), vibrates the eardrum, is transmitted to the nerves as an electrical signal, and is processed by the brain into "sound".

The reason we perceive depth and space from sound is that, when the brain processes the sound entering the ears, it "reconstructs the space in the brain" from information that includes those reflections. It is a remarkable ability, and taken further it leads to "echolocation", the position-sensing ability of bats and dolphins. But that is a story for another time.

In short, there is no doubt that the sound that naturally surrounds us and the sound that comes out of speakers are "different".

That is why recording engineers have refined the world of so-called "stereo audio", trying to create the best possible experience with left and right stereo.

Initially, speakers were the norm: the two channels were played from speakers placed at some distance from the listener, and the sound field was reproduced in an approximate way.

What changed this was "headphone audio", with the arrival of devices typified by the Walkman. Wearing headphones close to the ears creates the sensation of being enveloped in sound, and music came to offer an experience different from live concerts and performances.

With headphones, the sound is played right next to the ears, so the position of the left and right sound field changes. The sense of a sound field that originally arose from the speakers plus the surrounding space disappears. This is generally called "in-head localization": the sound seems to come from inside the head, with no sense of depth. Although it is fundamentally unnatural, the "feeling of being wrapped in sound" is appealing, so we have tolerated it.

That is the situation on the "audio" side. Keep it in mind as we move on to the next phase.

It started with "multi-channel"

Creating a sound field with left and right stereo has an inherent limit. The speakers emit sound from only two places, whereas in reality sound comes from many places.

In movie theaters and similar venues, multiple speakers were installed to create a more powerful space and increase the density of expression. The relative positions of the speakers are specified to some extent, and the sound from each speaker is treated as a "channel". Stereo is "2 channels", and four speakers at the front, back, left, and right make 4 channels. Four at the front, back, left, and right, one in the center, and a dedicated subwoofer for bass make "5.1 channels".

Some familiar terms have probably come up by now.

(The image shows 5.1-channel home theater speakers)

Originally this was a technology for movie theaters, but since the 1980s there have been technologies that deliver each channel so it can be reproduced in a home theater. This kind of technology is commonly called "surround".

With surround technology, you need to place as many speakers as there are channels, in the appropriate arrangement. That is often difficult at home, so software processing was developed to reproduce the way surround sounds using fewer speakers or headphones. This is commonly called "pseudo-surround" or "virtual surround".
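As a rough illustration of the simplest software step, the sketch below folds one 5.1 sample frame down to two channels. The coefficients follow the commonly used ITU-style stereo downmix (center and surrounds at about 0.707, the bass channel dropped); the function name and channel keys are assumptions made for this example. Actual virtual surround products go further and also filter each channel to simulate direction, which this sketch does not attempt.

```python
import math

# Gain of about 0.707 (-3 dB) applied to the center and surround channels.
MINUS_3_DB = math.sqrt(0.5)


def downmix_5_1_to_stereo(frame):
    """Fold one 5.1 sample frame (dict of channel -> sample) to (left, right).

    Hypothetical helper: the keys "L", "R", "C", "LFE", "Ls", "Rs" are this
    example's own naming. The LFE (bass) channel is dropped, as is common
    in simple stereo downmixes.
    """
    left = frame["L"] + MINUS_3_DB * frame["C"] + MINUS_3_DB * frame["Ls"]
    right = frame["R"] + MINUS_3_DB * frame["C"] + MINUS_3_DB * frame["Rs"]
    return left, right


# One made-up frame: sound in front-left, center, and rear-right.
frame = {"L": 1.0, "R": 0.0, "C": 1.0, "LFE": 0.3, "Ls": 0.0, "Rs": 1.0}
left, right = downmix_5_1_to_stereo(frame)
print(round(left, 3), round(right, 3))
```

The center channel ends up equally in both outputs, which is why dialogue still appears to come from the middle after the fold-down.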

As this kind of technology spread, it became easy to set up such a listening environment for everything from movies to live music and games.

Increasing the number of channels to raise realism suits films and concerts, where the audience watches from a fixed seat.

On the other hand, when your orientation changes with the movement of the video, or when things in the video move interactively, there is a better approach than simply adding channels. Even for movies, new technology has to be considered if sound is also to be heard from above.

That is where the idea of "object-based audio" comes in.

This is easier to understand if you think of it as the way CG video is made, applied to sound.

In CG, objects and lights are placed in a virtual space, and the image seen from a camera (that is, you) in that space is "shot" by calculation, taking into account how the light from multiple light sources is reflected. That is how the video is created.

Object-based audio is, simply put, CG with the light sources replaced by "sound sources". A realistic sound is reproduced by placing sound sources in a virtual space and calculating how the sound from each one reaches the ears. Because sound does not require calculations as complex as video, the processing load is not that large.
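To make the idea concrete, here is a minimal sketch of "place a source, then calculate what reaches the ears" on a flat 2-D plane. It uses only inverse-distance attenuation and constant-power left/right panning, which is far simpler than what real renderers do (no reflections, no ear shape). The function name, coordinate convention, and `min_distance` parameter are assumptions made for this example.

```python
import math


def render_gains(source, listener=(0.0, 0.0), min_distance=1.0):
    """Return (left_gain, right_gain) for a point source on a 2-D plane.

    Assumed convention for this sketch: +x is the listener's right,
    +y is straight ahead. Sources behind the listener are panned the
    same as their mirror image in front.
    """
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    true_distance = math.hypot(dx, dy)

    # Inverse-distance law, clamped so very close sources do not blow up.
    distance = max(true_distance, min_distance)
    attenuation = min_distance / distance

    # pan is -1 (fully left) .. +1 (fully right); 0 when source is on top of us.
    pan = dx / true_distance if true_distance > 0 else 0.0

    # Constant-power panning: left^2 + right^2 stays equal to attenuation^2.
    angle = (pan + 1.0) * math.pi / 4.0
    return attenuation * math.cos(angle), attenuation * math.sin(angle)


front = render_gains((0.0, 2.0))        # 2 units straight ahead
right_side = render_gains((1.0, 0.0))   # 1 unit directly to the right
print(front, right_side)
```

Moving the source, or the listener, only changes the inputs to the same calculation, which is what makes the approach suit interactive content.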

This technology was a natural fit for games from the start. Having the sound change with the player's movement and with the position of sounds made by enemies adds to the realism of play. It is now commonly used in many games, and "PlayStation 5" puts particular emphasis on it. It is equipped with "Tempest 3D Audio", a high-performance processor for object-based sound, and uses it as a differentiating factor.

https://www.youtube.com/watch?v=gsg17wzbo1y

Being a good fit for games means it is also a good fit for VR/AR. When moving through a virtual space, sound information is even more valuable than the video, so naturally it is an important technology for VR/AR.

In VR, the sense of immersion comes from the position of a sound changing with your own movement. The same holds for AR, but in AR in particular, playing a sound from the direction you want the user to look or move toward lets sound, not just images, be used as "part of an extended reality".

"Object-based audio" determines how sound is heard from the position of the source

Object-based audio comes in two forms: generating the sound on the fly, as in games, and playing back audio data recorded in advance. The latter is mainly used for music and movies.

Placing multiple sound sources in a space is the same as in games, but here you cannot move around with a controller. However, since you hear three-dimensional sound centered on yourself, the sense of depth can be enhanced by linking it to the movement of your head.

In music, for example, the drums are heard from the drum kit at the back of the stage in front of you, the guitar from the guitarist on the right, and the vocals from the vocalist's position in the center. This brings it closer to the feeling of a band playing in front of you. It may look like conventional surround, but with an object-based approach it is much easier to reproduce a vocalist moving while singing, or to apply processing that mimics the reverberation of a concert hall or the difference between seats.

The data formats for this are "Dolby Atmos", defined by Dolby, and "MPEG-H 3D Audio", standardized by MPEG under ISO. Dolby Atmos is used in movies, UHD BDs, various video streaming services, and the music streaming services of Apple and Amazon. MPEG-H 3D Audio has been adopted by Sony, which deploys it under the brand name "360 Reality Audio (360RA)"; it is used by several music services such as Amazon, Deezer, and nugs.net.

"Binaural" records sound just as it reaches the ears

Quite apart from these, there are several ways to record three-dimensional sound.

One is to set up many microphones and record the sound from each direction. The sound heard at that spot is recorded per microphone direction, and the sense of depth is produced by mixing these recordings during playback.

Zoom's audio recorder "H3-VR" is a typical example of the equipment used for this kind of recording, and a look at the photo makes it easy to imagine how it is used.

(Zoom audio recorder "H3-VR")

Another method is to record in stereo while imitating the characteristics of "how sound is heard at the ears". In "binaural recording", microphones are fitted in both ears of a "dummy head" that imitates a human head, so that the recording includes the response of the outer ear. Such recordings are usually played back through headphones, which reproduce the conditions at the time of recording.

(Dummy head microphone "Neumann / KU100" for binaural recording)

The secret is in "HRTF"

We have now seen how sound is "captured" and "generated". So how is it delivered to your ears?

Movie theaters, as mentioned, place many speakers in an arrangement that conforms to a standard. This is the ideal, and home theaters follow the same approach, but the hurdles of equipment and installation are still high. As described above, there are also methods that simulate sound reflections well and achieve a sense of depth with a small number of speakers.

To make it even more accessible, you use "headphones". Object-based audio is now spreading across music, movies, and games because it can be enjoyed with headphones.

The point is that listening to spatial audio on headphones not only gives a sense of depth but also, to some extent, removes "in-head localization", the fundamental problem of headphones. The result has depth and also feels more natural. This is why spatial audio is attracting attention in music services.

So what mechanism gives a sense of depth on headphones? The answer is the "head-related transfer function (HRTF)".

When we hear a sound, the eardrum receives sound that has been shaped by the ear and sound that has been altered by the head itself. "HRTF" quantifies how sound changes as it passes the head and ears. By altering the frequency characteristics of a sound using an HRTF, sound from headphones becomes easier to perceive as 3D audio.
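In practice an HRTF is often applied in the time domain, where it takes the form of a pair of short impulse responses (one per ear) that the mono source is convolved with. The sketch below shows only that mechanism. The two impulse responses are made-up toy values for a source on the listener's right (the far ear hears it later and quieter); they are not measured data, and all names here are assumptions made for this example.

```python
def convolve(signal, kernel):
    """Plain discrete convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left-ear and a right-ear impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy (hypothetical) impulse responses for a source on the right:
# the right ear gets the sound immediately at full level,
# the left ear gets it 2 samples later at half level.
HRIR_RIGHT_EAR = [1.0, 0.0, 0.0]
HRIR_LEFT_EAR = [0.0, 0.0, 0.5]

left, right = binauralize([1.0, 0.5], HRIR_LEFT_EAR, HRIR_RIGHT_EAR)
print(left)   # left-ear signal: delayed and attenuated
print(right)  # right-ear signal: unchanged apart from trailing zeros
```

Measured responses are much longer and differ per direction, which is also where the individual differences discussed next come in.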

However, HRTFs vary a great deal between individuals. The shape of the ear appears to have a large effect, and if an HRTF does not suit you, the way you hear things changes and you do not perceive depth.

How do companies use "spatial audio" technology?

Here, Sony and Apple take contrasting approaches.

Sony has introduced a mechanism that measures each person's HRTF and optimizes for it, set up through the "Headphones Connect" app that the company provides. The app photographs your ears with a smartphone and calculates a personal HRTF from the images. In this case, tuning for the acoustic characteristics of each headphone model is also essential.

For that reason, Sony builds the function that optimizes 360RA into its own smartphone app and additionally matches it to its own headphones to achieve HRTF optimization. In other words, "you can listen with any headphones, but you get a more optimized sound with Sony's".

Sony has introduced a technology that photographs your ears and optimizes the HRTF from the images.

(The photo shows the Sony wireless noise-canceling earphones "WF-1000XM3", which support the above technology)

(The same technology is also used in the wireless neckband speaker "SRS-NS7", released on October 29)

This technology is available to headphone makers that sign a contract with the company. At present, in addition to Sony, Audio-Technica and Radius have been licensed, and compatible products have already been released by Audio-Technica.

Apple, on the other hand, uses an HRTF set to suit many people and combines it with the "motion sensors" in the headphones. The sense of depth is emphasized by changing the direction and position of the sound field according to the head movement detected by the motion sensors.
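The core of head tracking can be sketched in a few lines: to keep a sound fixed in the room, the renderer subtracts the head's rotation from the source's direction before choosing how to play it. This is a simplified, yaw-only illustration and not a description of Apple's implementation; the function name and the sign convention (positive angles to the right) are assumptions made for this example.

```python
def relative_azimuth(source_deg, head_yaw_deg):
    """Direction of a room-fixed source as seen from the turned head.

    Angles are in degrees, 0 = straight ahead in the room, positive = right.
    The result is wrapped into the range [-180, 180).
    """
    return (source_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


# A singer straight ahead in the room (0 degrees).
print(relative_azimuth(0.0, 0.0))    # head facing forward: singer is in front
print(relative_azimuth(0.0, 90.0))   # head turned right: singer is now to the left
print(relative_azimuth(170.0, -170.0))  # wrap-around case behind the listener
```

The resulting angle would then be used to pick or interpolate the HRTF for that direction, so the sound appears to stay put while the head moves.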

What exactly is different?

In both cases the "spatial audio experience" works with any headphones, but if you want the best experience, it is better to use "supported headphones" for each service.

Sony's "Headphones Connect" supports almost all of the company's headphones, and supporting other companies' headphones is relatively easy, but photographing your ears to optimize the HRTF is quite a chore.

Even within Sony, PlayStation 5 uses a fixed, preset HRTF design so that it is easier to handle and can be experienced without photographing your ears. The emphasis is on anyone being able to simply plug headphones into the controller and use it, but Sony is also considering introducing HRTF tuning matched to your ears in the future, as with its audio products.

Apple, on the other hand, has no chore like ear measurement, but the best spatial audio experience is limited to Apple headphones with motion sensors, specifically the "third-generation AirPods", "AirPods Pro", and "AirPods Max".

(The photo shows the "third-generation AirPods". Apart from these, only "AirPods Pro" and "AirPods Max" offer the best-quality spatial audio)

At present, I think Apple has the advantage, in both the amount of content and the ease of use. However, the situation will change as supported services and types of equipment change. Although no concrete agreement has been reached, Apple is in discussions with Sony about compatibility with "360 Reality Audio" songs, so things may change in various ways over the coming years.

For games, Embody provides "Immerse Spatial Audio" for Windows PCs. This too is a technology that optimizes the HRTF and provides spatial audio tailored to the individual, and it has drawn attention because it is officially supported by Square Enix's popular MMORPG "Final Fantasy XIV: Endwalker", released on December 7 this year.

Either way, this kind of ingenuity with headphones has made it easier to enjoy a wide range of sound experiences with a sense of spatial spread, from music to games.

It is a good fit for VR, but hardware-level use is still limited. At present it is used like a special effect within apps such as games. Stronger support on the platform side would make it easier to use, so each company is expected to build it into its products in the future.

In that sense, since PS5 is a platform built around spatial audio, there are hopes that the "next-generation PlayStation VR" being developed for PS5 will make good use of it.

Beyond games, there are also ideas such as using it in user interfaces to make them easier to operate, so I look forward to seeing what each company comes up with.

Writing: Muneyoshi Nishida