Today during Apple’s WWDC 2022 keynote, the company announced that iOS 16 will allow users of modern iPhones to scan the shape of their ears to create more accurate spatial audio. The feature is likely implemented as an HRTF; creating custom HRTFs for consumers was once impractical due to the need for sophisticated equipment, but advances in computer vision are making the technology much more accessible.
When it comes to digital spatial audio, there’s a limit to how accurate the sense of ‘position’ or ‘3D’ can be without taking into account the unique shape of the user’s head and ears.
Because everybody has a uniquely shaped head, and especially ears, incoming sound from the real world bounces off your head and into your ears in different and very subtle ways. For instance, when a sound is behind you, the precise geometry of the folds in your ear reflects sound from that angle in a unique way. And when you hear sound arriving at your ear in that particular way, you’re attuned to understand that the source of the sound is behind you.
To create a highly accurate sense of digital spatial audio, you need a model which accounts for these factors, such that the audio is mixed with the correct cues that are created by the unique shape of your head and ears.
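To make the idea of directional cues concrete, here’s a deliberately simplified sketch (not Apple’s method) that fakes the two coarsest cues, the interaural time difference and interaural level difference, for a mono sound panned to one side. A real HRTF also encodes the fine spectral filtering of the ear folds described above, which this toy model ignores.

```python
import numpy as np

def crude_binaural_pan(mono, sample_rate, azimuth_deg, head_radius_m=0.0875):
    """Apply rough interaural time/level differences for a source at azimuth_deg.

    Positive azimuth = source to the right. This captures only coarse
    left/right cues; it can't convey front/back or elevation the way a
    full HRTF can, because it ignores the ear's spectral filtering.
    """
    az = np.radians(azimuth_deg)
    speed_of_sound = 343.0  # m/s

    # Woodworth-style approximation of the extra path length around the head.
    itd_seconds = (head_radius_m / speed_of_sound) * (abs(az) + np.sin(abs(az)))
    delay_samples = int(round(itd_seconds * sample_rate))

    # Simple level difference: the far ear is attenuated as the source moves to the side.
    ild_db = 10.0 * abs(np.sin(az))
    far_gain = 10.0 ** (-ild_db / 20.0)

    near = mono
    far = np.concatenate([np.zeros(delay_samples), mono])[: len(mono)] * far_gain

    # Route the near/far signals to the correct ears.
    left, right = (far, near) if azimuth_deg > 0 else (near, far)
    return np.stack([left, right], axis=1)  # shape: (samples, 2)
```

Listening to a noise burst processed this way on headphones shifts the apparent source to one side, but it still sits ‘inside the head’; the missing ear-specific spectral cues are exactly what a personalized HRTF restores.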
Audiologists have described this phenomenon mathematically in a model known as a head-related transfer function (HRTF). Using an HRTF, digital audio can be modified to replicate the spatial audio cues that are unique to an individual’s ear.
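To illustrate what ‘applying an HRTF’ means in practice, here’s a minimal sketch of the generic textbook approach (not a description of Apple’s implementation). It assumes you already have a pair of head-related impulse responses (HRIRs, the time-domain form of an HRTF) measured for the desired direction, and simply convolves a mono signal with the left- and right-ear responses.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with per-ear HRIRs to produce binaural stereo.

    hrir_left / hrir_right are the listener's measured impulse responses for
    one source direction; convolving with them imprints that direction's
    spatial cues onto the signal.
    """
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    stereo = np.stack([left, right], axis=1)

    # Normalize to avoid clipping when the result is written out as a WAV file.
    peak = np.max(np.abs(stereo))
    return stereo / peak if peak > 0 else stereo
```

A real-time renderer runs the same idea per audio block, switching or interpolating between HRIRs as the source (or the listener’s head) moves.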
So while the math is well studied and the technology to apply an HRTF in real-time is readily available today, there’s still one big problem: every person needs their own custom HRTF. This involves accurately measuring each ear of each person, which isn’t easy without specialized equipment.
But now Apple says it will make use of advanced sensors in its latest iPhones to allow anyone to scan their head and ears, and create a custom spatial audio profile from that data.
Apple isn’t the first company to offer custom HRTFs based on a computer-vision model of the ear, but having it built into iOS will certainly make the technology much more widespread than it ever has been.
During the WWDC 2022 keynote, Apple announced the feature as part of the forthcoming iOS 16 update, which is due out later this year. It will work on iPhones with the TrueDepth camera system, which includes the iPhone X and beyond.
But just having an accurate model of the ear isn’t enough. Apple will need to have developed an automated process to simulate the way that real sound would interact with the unique geometry of the ear. The company hasn’t specifically said this will be based on an HRTF implementation, but it seems highly likely as it’s a known quantity in the spatial audio field.
Ultimately this should result in more accurate digital spatial audio on iPhones (and very likely future Apple XR headsets). That means a sound 10 feet from your left ear will sound more like it actually would at that distance, making it easier to distinguish it from, say, a sound 2 feet from your left ear.
This will pair well with the existing spatial audio capabilities of Apple products, especially when used with AirPods, which can track the movement of your head for a head-tracked spatial audio experience. Apple’s iOS and macOS both support spatial audio out of the box, which can take standard audio and make it sound as if it’s coming from speakers in your room (instead of inside your head), and can accurately play back sound that’s specially authored for spatial audio, such as Dolby Atmos tracks on Apple Music.
And there’s another potential upside to this feature, too. If Apple makes it possible for users to download their own custom HRTF profile, they may be able to take it and use it on other devices (a VR headset, for instance).
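There is already a standard container for exactly this kind of data: AES69 ‘SOFA’ (Spatially Oriented Format for Acoustics) files, which many research HRTF datasets and some VR audio tools use. Purely as a hypothetical illustration of how a downloadable profile could be reused elsewhere, the sketch below loads per-direction HRIRs from a SOFA file; the file name is made up, and Apple hasn’t said anything about exporting profiles in this (or any) format.

```python
import numpy as np
from netCDF4 import Dataset  # SOFA files are netCDF-4/HDF5 containers

def load_sofa_hrirs(path):
    """Read HRIRs and their source directions from a SimpleFreeFieldHRIR SOFA file."""
    with Dataset(path, "r") as sofa:
        hrirs = np.asarray(sofa.variables["Data.IR"][:])             # (directions, 2 ears, samples)
        positions = np.asarray(sofa.variables["SourcePosition"][:])  # (directions, 3): azimuth, elevation, distance
        sample_rate = float(np.asarray(sofa.variables["Data.SamplingRate"][:]).ravel()[0])
    return hrirs, positions, sample_rate

# Hypothetical usage: pick the measurement closest to 90 degrees to the left, at ear level.
hrirs, positions, fs = load_sofa_hrirs("my_personal_profile.sofa")
target = np.array([90.0, 0.0])  # azimuth, elevation in degrees
idx = np.argmin(np.linalg.norm(positions[:, :2] - target, axis=1))
hrir_left, hrir_right = hrirs[idx, 0], hrirs[idx, 1]
```

If that kind of export ever materializes, the loaded HRIRs could be fed straight into a convolution renderer like the one sketched earlier.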