Bojan Belic

May 12, 2020 · 8 min read

Android Low-Latency Audio Post-Processing with Superpowered

Creating a responsive audio DSP application on Android can be a huge pain because there is no easy way to manipulate the audio framework from the Java layer.

After struggling with different approaches such as ExoPlayer (1/2), MediaCodec/MediaExtractor/MediaSync, OpenSL ES, etc., I’ve settled on the Superpowered SDK.

NOTE: ‘Just show me the code’

For anyone who just wants to dive right into the code, take a look at the following template project: Android Superpowered DSP Template

Otherwise, stick with the rest of the article, which gives some background information on the template project and might answer some questions that arise.

Using Android’s MediaCodec

When my plans to create a post-processing library started to unfold, I had to take some constraints and design choices into account. One of the main features that I wanted was ‘responsiveness’. Technically, this translates to (fairly) low latency.

So the first thing I had to find out was how big the latency is with the default Media framework. After messing around with it for a while, I got a working version by combining MediaCodec, MediaExtractor & MediaSync. To get the post-processing done here, I created a new class (EffectAudioTrack) on top of AudioTrack that overrides the write() method, and passed an instance of it to MediaSync.

This did work, but the problem was latency. Because MediaCodec was using buffers of 32768 bytes, the latency I experienced with my default files (WAV, 44.1/48 kHz) was around 360 ms, depending on the sample rate.

32768 bytes = 16384 short samples (stereo)
16384 samples / 44100 samples/sec ≈ 372 ms
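The same arithmetic can be sketched as a small helper. This is only an illustration: bufferDurationMs and its parameters are names I made up for this example. With channelCount = 1 it reproduces the estimate above (sample count divided directly by the sample rate); with channelCount = 2 it treats the samples as interleaved stereo frames, which halves the result.

```cpp
#include <cassert>
#include <cstddef>

// Duration in milliseconds of one buffer of PCM audio.
// Hypothetical helper, not part of any Android or Superpowered API.
double bufferDurationMs(std::size_t bufferBytes,
                        std::size_t bytesPerSample,
                        unsigned channelCount,
                        unsigned sampleRateHz) {
    assert(bytesPerSample > 0 && channelCount > 0 && sampleRateHz > 0);
    // Total number of individual samples stored in the buffer.
    const double samples = static_cast<double>(bufferBytes) / bytesPerSample;
    // One frame holds one sample per channel.
    const double frames = samples / channelCount;
    return frames / sampleRateHz * 1000.0;
}
```

For example, bufferDurationMs(32768, 2, 1, 44100) comes out at roughly 372 ms, and the same call with 48000 at roughly 341 ms, which brackets the latency of around 360 ms mentioned above.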

This wasn’t responsive enough for me, so I had to switch tracks and decided to look into Google’s ExoPlayer.

Using Google’s ExoPlayer

In all fairness, the ExoPlayer is fantastic. It is very powerful, has great codec & streaming support and is very easy to use. The downside, on the other hand, is that it is pretty huge and not made for the type of application (DSP) I wanted to use it for.

To keep the story short: I did get it working by butchering the ExoPlayer a bit and making the buffer size configurable by modifying the BUFFER_SEGMENT_SIZE within the Renderer (which implements the RendererBuilder interface).

But in the end I decided to step away from this track as well: the ExoPlayer was just plain overkill for my application, and I didn’t like the impact (VCS-wise) it had on the project or its architecture.

Using OpenSL ES

This wasn’t my first encounter with OpenSL ES. I had used it (and also struggled a bit with it) before, when I wanted to make a real-time post-processing app that added some delay to the microphone input and routed it to the output (similar to the audio-echo example in the Google NDK samples, but with added processing).

In this case I first recreated the MediaCodec chain in asynchronous mode and added the MediaExtractor to get access to the audio data. In the onOutputBufferAvailable() callback method, I then wrote the ByteBuffer to my OpenSL ES JNI implementation for output.

But whichever tricks I tried to get the desired effect, I ended up either speeding up the playback far too much, damaging the integrity of the audio data, or with too much latency, as in the initial implementation.

Along came Superpowered

Just when desperation started kicking in, after I had exhausted most of my ideas and thought I would have to resort to either deep diving into OpenSL ES to get it working or implementing my ExoPlayer butcher work, I re-encountered Superpowered.

I still remember reading a bit about it in the early days of Android Lollipop, when I needed a better HLS solution for Android 4.4, but I didn’t pay too much attention to it because its HLS support was license-based.

Fast-forward several years and I couldn’t be happier about rediscovering the SDK. Superpowered simply makes your life a lot easier and packs a lot of useful extra features. Also, considering the native code you write with Superpowered is pretty much cross-platform, you could maybe also deploy an iOS app by using something like the Multi-OS Engine?

Below I’ll discuss the template/turn-key project I made, into which you should be able to plug your own post-processing library or algorithm.

Template Project features

To make sure that the template overhead is not too large, I will only implement the following features & requirements:

armeabi-v7a & arm64-v8a support
Audio file compatibility: WAV/MP3, 44.1 kHz (Superpowered will take care of this for you!)
PermissionManager for READ_EXTERNAL_STORAGE
UriHelper to de-obfuscate a URI to the actual file path
Superpowered SDK integration using CMake
Integration of a Vibrato effect for demo purposes. This is just ‘a random effect’ I stumbled onto while searching through GitHub.

These features should normally result in a very concrete example with just the right amount of information to get you jumpstarted.

The Java Code

I’ve tried to keep the code here as simple & clear as possible, so it should be pretty much self-explanatory. All the code related to the Superpowered SDK has been centralised inside of the SuperPoweredPlayer class. The rest is mainly there for enabling the UI functionality and permissions.

Should you have any questions, feel free to PM me, comment or create an issue in the GitHub repo.

The Native Code

There are 4 key parts in this section:

The JNI & rendering code: This code is the interface between the Java SuperPoweredPlayer class and the native SuperPoweredRenderer class.
The Superpowered SDK: ‘The Belle of the Ball’, this is what it’s all about. It basically contains the free-to-download SDK, minus a few binaries (ARM ABIs only!) to reduce VCS bloat.
Vibrato-effect: I encountered this effect while searching for a very lightweight C/C++ based DSP effect on GitHub. I have the feeling it’s not in perfect working condition yet (due to some audible artifacts after the processing), but it does the job for demo purposes.
CMakeLists.txt: Once Android Studio added CMake support, it made life a lot easier for people who regularly include native code in their project. The experimental Gradle plugin wasn’t bad, but due to the scarce documentation and the modified API, it was more often a pain than not.

JNI & Rendering Code

This section will discuss the first 3 points considering they’re very closely tied to each other.

The JNI functions are extremely basic and are nothing more than a wrapper for some of the SuperPoweredRenderer methods. The rest of the code here is more interesting. The general inspiration came from stripping down & modifying the CrossExample project in the Superpowered SDK to fit the requirements of the template.

- Superpowered Components

There are 2 main components which are important for our application. These are the SuperpoweredAndroidAudioIO and the SuperpoweredAdvancedAudioPlayer class.

The first one takes care of any audio I/O you require, but we will only use the output sink to modify the output buffer through a callback. If you would like to make use of the microphone input, for example, then you’d have to initialise it through this class as well.

The latter is the actual audio player itself. As can be seen from the documentation inside of the header file, it’s an extremely powerful component which can by itself already cover a ton of use cases that you might want to incorporate in your application. There’s no use in reinventing the wheel, especially not if it’s high-grade & free!

- Initialisation

As soon as the SuperpoweredRenderer constructor gets called, the core of the application is already being set into motion:

We start off by defining our stereo buffer size using the specs defined inside of SuperpoweredAdvancedAudioPlayer’s header file. This is the buffer we’ll sink the processed data into. Once this has been done, we’ll create a new instance of the SuperpoweredAdvancedAudioPlayer.

We’ll create a pretty standard player, so no cached points for looping, no negative seconds for added silence, just none of that fancy stuff. We keep it very basic. As we can see in the Gist above, there are 4 arguments we pass:

&audioPlayer: We pass a reference to the audioPlayer object itself so we can access its methods & members from the callback should we require them.

playerEventCallbackA: This callback can be used as an event handler for the various events available in the SuperpoweredAdvancedAudioPlayer.

samplerate: Pretty straightforward

0: No cached points necessary

Once the player is instantiated, we call the ::open() method to pass it a reference to the file we want to play back, along with some properties such as the offset & total length (both in bytes).

Now that our player is ready to start processing the file, we need to initialise our SuperpoweredAndroidAudioIO. For our application, we will call it with the following arguments:

samplerate: Pretty straightforward

buffersize: The buffer size / number of samples we defined in the Java layer

false: We pass false here considering we don't want to make use of any input peripherals

true: We pass true here considering we are going to make use of the audio output

audioProcessing: This is going to be our implementation of the audioProcessingCallback. We are simply going to make sure that whenever the callback occurs, it calls our SuperpoweredRenderer::process() method, where we can apply our effect processing and write the resulting audio data into the audioIO buffer reference we get from the callback. The content of the audioIO buffer is what will be streamed to the device output.

this: We pass our SuperpoweredRenderer as a reference to be able to call the ::process() method from the callback

-1: No input stream

SL_ANDROID_STREAM_MEDIA: Defines the stream type for Android's audio system. This is basically the same as android.media.AudioManager.STREAM_MUSIC

0: Defines how many samples to keep in the internal FIFO buffer when using both input and output. Not relevant for our application.
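The callback wiring described in the argument list (a plain function plus a client data pointer that leads back to the renderer) can be sketched with stand-in types. Everything here is hypothetical: DemoRenderer and DemoAudioIO are not the Superpowered classes, and the callback shape is only modelled on the description above. The point is to show how the pointer passed as `this` is cast back inside the callback.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Callback shape modelled on the description above (an assumption):
// client data, 16-bit interleaved output buffer, frame count, sample rate.
using AudioCallback = bool (*)(void *clientData, int16_t *audioIO,
                               int numberOfFrames, int sampleRate);

// Hypothetical stand-in for the renderer class.
class DemoRenderer {
public:
    bool process(int16_t *audioIO, int numberOfFrames) {
        // A real renderer would decode, apply effects and convert here.
        for (int i = 0; i < numberOfFrames * 2; ++i) audioIO[i] = 0;
        ++callbackCount;
        return true;  // assumed meaning: the buffer holds audio to play
    }
    int callbackCount = 0;
};

// Free function handed to the audio I/O object; it only forwards the call.
static bool audioProcessing(void *clientData, int16_t *audioIO,
                            int numberOfFrames, int /*sampleRate*/) {
    return static_cast<DemoRenderer *>(clientData)->process(audioIO, numberOfFrames);
}

// Hypothetical stand-in for the audio I/O object: it stores the callback and
// the client data, and invokes them once per output buffer.
class DemoAudioIO {
public:
    DemoAudioIO(int sampleRate, int bufferSize, AudioCallback callback, void *clientData)
        : sampleRate_(sampleRate), callback_(callback), clientData_(clientData),
          buffer_(static_cast<std::size_t>(bufferSize) * 2) {}

    // Simulates the audio thread asking for one buffer of stereo output.
    bool renderOnce() {
        return callback_(clientData_, buffer_.data(),
                         static_cast<int>(buffer_.size() / 2), sampleRate_);
    }

private:
    int sampleRate_;
    AudioCallback callback_;
    void *clientData_;
    std::vector<int16_t> buffer_;
};
```

Constructing DemoAudioIO io(44100, 256, audioProcessing, &renderer) and calling io.renderOnce() ends up in renderer.process(), which is the same route the real callback takes to reach ::process().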

Once this is done, we just initialise the Vibrato-effect using the instructions provided in the project’s README.md.

- Processing

Now comes the most interesting part and this is where the magic happens. Take a look at the ::process() method below:

There are 3 blocks of operations happening here:

1. Retrieving data

We call the audioPlayer’s process method which will fill our stereoBuffer with the audio data from the file reference we passed in its ::open() method. All arguments are pretty straightforward and we don’t require any of the syncing arguments which can also be passed. If it managed to put something into stereoBuffer, it will return true and otherwise false. We’ll store this value into our silence variable. Keep in mind that the samples available in stereoBuffer are stereo interleaved.
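To make the "fill the buffer and report whether anything was written" contract concrete, here is a minimal sketch with a stand-in source. DemoSource is a hypothetical class that reads from memory instead of decoding a file, and its process() signature is simplified compared to the real player.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the player: hands out interleaved stereo floats
// (L, R, L, R, ...) from an in-memory vector.
class DemoSource {
public:
    explicit DemoSource(std::vector<float> interleaved)
        : data_(std::move(interleaved)) {}

    // Fills `buffer` with up to numberOfFrames stereo frames and pads the
    // remainder with zeros. Returns true if at least one sample was written.
    bool process(float *buffer, unsigned numberOfFrames) {
        const std::size_t wanted = static_cast<std::size_t>(numberOfFrames) * 2;
        const std::size_t available = data_.size() - position_;
        const std::size_t count = std::min(wanted, available);
        std::copy(data_.data() + position_, data_.data() + position_ + count, buffer);
        std::fill(buffer + count, buffer + wanted, 0.0f);
        position_ += count;
        return count > 0;
    }

private:
    std::vector<float> data_;
    std::size_t position_ = 0;
};
```

Once the source runs dry, process() returns false and the buffer contains only zeros, so the caller can skip the effect processing for that block.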

2. Effect processing

Once we have access to the audio data, we can finally apply some DSP/effect processing to it. First we check whether stereoBuffer actually contains any valid audio data, and if so we call the Vibrato-effect’s processOneSample() method and store the output back into our stereoBuffer. Afterward it’s time to write our result to the output.
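A minimal sketch of such an in-place effect loop over an interleaved stereo buffer could look as follows. DemoGainEffect is a hypothetical stand-in (a plain gain, not the Vibrato effect); only the method name processOneSample() is borrowed from the text. The sketch uses one effect instance per channel, which is the usual choice for stateful effects so that left and right samples do not share internal state.

```cpp
// Hypothetical stand-in for a per-sample effect.
class DemoGainEffect {
public:
    explicit DemoGainEffect(float gain) : gain_(gain) {}
    float processOneSample(float input) { return input * gain_; }

private:
    float gain_;
};

// Applies the effect in place to an interleaved stereo buffer (L, R, L, R, ...).
// Nothing happens when the buffer holds no valid audio.
void applyEffect(DemoGainEffect &left, DemoGainEffect &right,
                 float *stereoBuffer, unsigned numberOfFrames, bool hasAudio) {
    if (!hasAudio) return;
    for (unsigned frame = 0; frame < numberOfFrames; ++frame) {
        stereoBuffer[2 * frame] = left.processOneSample(stereoBuffer[2 * frame]);
        stereoBuffer[2 * frame + 1] = right.processOneSample(stereoBuffer[2 * frame + 1]);
    }
}
```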

3. Write audio to output buffer

Now we can write our stereoBuffer into the output buffer that will be passed on to Android’s audio output. Considering Android’s audio framework is based on the 16-bit short type, we need to convert our data as well. Luckily for us, Superpowered has a very convenient method called SuperpoweredFloatToShortInt(), which takes care of this for us and even copies the converted values into the output buffer.
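For readers curious what such a conversion involves, here is a minimal stand-in. It is not Superpowered's implementation: the function name, the clamping and the scaling factor of 32767 are my own assumptions for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Converts interleaved float samples in [-1.0, 1.0] to 16-bit PCM and writes
// them to `output`. Out-of-range values are clamped instead of wrapping around.
void floatToShortInterleaved(const float *input, int16_t *output,
                             unsigned numberOfFrames, unsigned channelCount = 2) {
    const unsigned total = numberOfFrames * channelCount;
    for (unsigned i = 0; i < total; ++i) {
        float sample = input[i];
        if (sample > 1.0f) sample = 1.0f;
        if (sample < -1.0f) sample = -1.0f;
        // Scale to the 16-bit range and round to the nearest integer.
        output[i] = static_cast<int16_t>(std::lround(sample * 32767.0f));
    }
}
```

The clamping matters because an effect can push samples beyond full scale, and an unclamped cast would turn those peaks into loud wrap-around artifacts.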

Once this is done, our native code is ready and we can focus on our CMake build file.

CMake

Considering our application is rather limited, the CMake file will be quite limited as well. Most of the calls are already commented on inside of the CMake file, but some extra info could be useful:

Before you create the CMake file, don’t forget to first modify your Gradle file to support CMake. I do realise that using file(GLOB …) is considered bad practice, but considering the number of source files in this project is limited and well-defined, it can’t really hurt.

In the CMake file we will order the build of a new shared library called superpoweredeffect. This is also the name we call in our MainActivity when calling loadLibrary (don’t forget to do this!). Make sure that when you link the libraries, you include both OpenSLES (which Superpowered is based on) and one of the static libraries inside of the SDK which aligns with your current target ABI. This can be implemented easily using the ${ANDROID_ABI} variable.
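As a rough sketch, a CMakeLists.txt along these lines would cover the points above. The folder layout, the SUPERPOWERED_DIR and SOURCES variable names and the static library file name pattern are assumptions on my side, so check them against your copy of the SDK.

```cmake
cmake_minimum_required(VERSION 3.4.1)

# Assumption: the SDK sits in a folder next to this file; adjust to your layout.
set(SUPERPOWERED_DIR ${CMAKE_SOURCE_DIR}/Superpowered)

# Collect the native sources; SOURCES is a hypothetical variable name.
file(GLOB SOURCES ${CMAKE_SOURCE_DIR}/*.cpp)

# The library name must match the one passed to loadLibrary in MainActivity.
add_library(superpoweredeffect SHARED ${SOURCES})
target_include_directories(superpoweredeffect PRIVATE ${SUPERPOWERED_DIR})

target_link_libraries(
    superpoweredeffect
    log
    OpenSLES
    # Assumed naming pattern of the per-ABI static library inside the SDK.
    ${SUPERPOWERED_DIR}/libSuperpoweredAndroid${ANDROID_ABI}.a
)
```

Because ${ANDROID_ABI} is set by the Android toolchain for each ABI being built, the same line picks the matching static library for every target.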

You’re now ready to venture into the world of Android DSP by yourself! If you liked this article, please feel free to share it so others can enjoy it as well. And should you still have any questions, feel free to contact me through whichever medium you prefer.