Developer: Microsoft
Type: Lossy audio codec
Released: 2020
Open: No
Satin is a lossy speech codec developed by Microsoft. It was designed to supersede the earlier Silk codec in Microsoft's applications, and uses a neural network and novel signal processing to improve performance over its predecessor.[1]
Satin is designed to deliver good sound quality despite limited bandwidth or high packet loss, such as over unreliable Wi-Fi or cellular networks.[2] It produces output bitrates of 6 to 36 kbps and operates on super-wideband audio (a 32 kHz sampling rate). The encoder processes a sparse representation of the input, and the decoder uses a neural network to infer the high frequencies from the low ones. Because neural network inference is computationally expensive, the network had to be optimized and vectorized to achieve acceptable performance.[3] To improve resilience to packet loss, each packet is encoded independently, and the codec has its own packet loss concealment system.
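To put the stated bitrates in perspective, the following sketch works out how many bits Satin spends per input sample at 6 and 36 kbps, and how that compares with uncompressed audio. Only the bitrate range and the 32 kHz sampling rate come from the text above; the 16-bit PCM baseline and the 20 ms packet duration are illustrative assumptions and are not documented properties of Satin.

```python
SAMPLE_RATE_HZ = 32_000  # super-wideband sampling rate stated in the article


def bits_per_sample(bitrate_bps, sample_rate_hz=SAMPLE_RATE_HZ):
    """Average number of coded bits available per input sample."""
    return bitrate_bps / sample_rate_hz


def compression_ratio(bitrate_bps, sample_rate_hz=SAMPLE_RATE_HZ, pcm_bits=16):
    """Ratio of raw mono PCM bitrate to coded bitrate.

    pcm_bits=16 is an assumed baseline, chosen only for comparison.
    """
    return (sample_rate_hz * pcm_bits) / bitrate_bps


def payload_bytes(bitrate_bps, frame_ms=20):
    """Coded payload size per packet, ignoring all transport headers.

    frame_ms=20 is a hypothetical packet duration, not a Satin parameter.
    """
    return bitrate_bps * frame_ms / 1000 / 8


if __name__ == "__main__":
    for kbps in (6, 36):
        bps = kbps * 1000
        print(
            f"{kbps:>2} kbps: "
            f"{bits_per_sample(bps):.4f} bits/sample, "
            f"{compression_ratio(bps):.1f}x smaller than 16-bit PCM, "
            f"{payload_bytes(bps):.0f} payload bytes per 20 ms"
        )
```

At the low end of the range the codec has well under one bit per sample to work with, which suggests why the high frequencies are inferred at the decoder rather than transmitted.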
Silk was developed by Skype and can compress wideband speech at 14 kbps. Satin is considered Silk's successor, and was first announced and implemented for Microsoft Teams in 2020. According to Microsoft, a future release will add support for music in full-band stereo at bitrates of at least 17 kbps.
Microsoft claims that Satin's quality is significantly better than Silk's, achieving mean opinion scores up to 1.7 points higher in low-bitrate A/B testing. Microsoft also notes that Satin's bitrate savings allow more redundant data to be sent, increasing resistance to packet loss.[4]
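The trade-off between bitrate savings and redundancy can be illustrated with simple arithmetic. The sketch below is not Satin's actual redundancy scheme, which the article does not describe. It assumes a hypothetical fixed bandwidth budget, whole duplicate copies of each frame, and statistically independent packet losses, all of which are simplifications. The 14 kbps and 6 kbps figures are the Silk and Satin rates quoted in this article.

```python
def redundant_copies(budget_bps, stream_bps):
    """Extra full copies of the stream that fit in the budget after the primary copy."""
    if stream_bps <= 0:
        raise ValueError("stream_bps must be positive")
    total_copies = budget_bps // stream_bps
    return max(total_copies - 1, 0)


def frame_loss_probability(packet_loss, extra_copies):
    """Chance that every copy of a frame is lost.

    Assumes each copy travels in a separate packet and that losses are
    independent, which bursty real networks often violate.
    """
    if not 0.0 <= packet_loss <= 1.0:
        raise ValueError("packet_loss must be between 0 and 1")
    return packet_loss ** (extra_copies + 1)


if __name__ == "__main__":
    BUDGET_BPS = 14_000   # hypothetical budget, set to the Silk rate quoted above
    PACKET_LOSS = 0.10    # hypothetical 10% packet loss

    for name, rate in (("14 kbps stream", 14_000), ("6 kbps stream", 6_000)):
        extra = redundant_copies(BUDGET_BPS, rate)
        loss = frame_loss_probability(PACKET_LOSS, extra)
        print(f"{name}: {extra} redundant copy(ies), frame loss ~{loss:.2%}")
```

Under these assumptions, a 6 kbps stream leaves room for one duplicate of every frame within the same budget that a 14 kbps stream fills completely, so a frame is lost only when both copies are dropped.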
As of February 2021, Skype and Microsoft Teams implemented Satin for all two-person calls, and an expansion to larger Teams meetings was planned.