Friday, November 25, 2011

What went wrong this time? (Adventures with STK)

Sorry, no pretty pictures in this post... Also, this post is moderately technical in nature, so browse away now if you are not interested in audio-programming jargon. You'll be happier for it. If you know Qt and STK well, you might find this post slightly amusing. If you are new to Qt and STK, you might not know what I'm talking about at all. This post is personal, I guess.

I've made the first steps for physics, interaction and graphics, and since I'm interested in making desktop games, it is only natural to do audio at some point...

I thought it would be easy to integrate Qt's audio functionality when using the qmake toolchain... add qtmultimedia to the config in your .pro file, instantiate a QAudioOutput somewhere, and trigger sounds from that... Right?

Well, it kinda is... but not really... QAudioOutput is a reasonably low-level interface class. It can read from a custom QIODevice-derived class (pull mode, good for streaming... limited real-time application), or it provides you with a QIODevice which you write to each time a notify interval has elapsed (push mode... more real-time). A QIODevice "talks" in raw bytes... so either way you still need to take care of formatting your audio data correctly to match the stream parameters (code for this can be seen in the Generator::generateData method in Qt's Audio Output example). Also, no mixing/routing functionality is provided... you're expected to implement that yourself or use some third-party library like STK. So, although the QAudioOutput class provides a solid cross-platform starting point, it still requires you to implement a lot of the low-level, boilerplate, byte-array, endian-conscious code. Also, this part of the Qt framework is under active development, so it might change drastically in the main Qt distribution as projects become merged into it. It is inevitable that technology will pass you by... but it is all part of the learning process.
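To give a flavour of that byte-array, endian-conscious code: here is a minimal sketch (my own illustration, not Qt example code) of converting normalized float samples into the raw little-endian 16-bit PCM bytes a QIODevice would expect.

```cpp
#include <cstdint>
#include <vector>

// Convert normalized float samples (-1.0..1.0) into raw little-endian
// 16-bit PCM bytes -- the kind of buffer you push into a QIODevice.
std::vector<unsigned char> toPcm16LE(const std::vector<float>& samples) {
    std::vector<unsigned char> bytes;
    bytes.reserve(samples.size() * 2);
    for (float s : samples) {
        if (s > 1.0f) s = 1.0f;   // clip to the valid range
        if (s < -1.0f) s = -1.0f;
        int16_t v = static_cast<int16_t>(s * 32767.0f);
        bytes.push_back(static_cast<unsigned char>(v & 0xFF));        // low byte first
        bytes.push_back(static_cast<unsigned char>((v >> 8) & 0xFF)); // then high byte
    }
    return bytes;
}
```

Multiply that by every sample format and byte order QAudioFormat can describe, and you see why it gets tedious fast.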

For a while I've wanted to use Perry Cook & Gary Scavone's STK (https://ccrma.stanford.edu/software/stk/) to provide the basis for a soft-synth audio framework for my game engine. As a preference I decided on RtAudio and STK over PortAudio and CSound, though I know much is interchangeable and CSound has a great deal more features. I was drawn in by STK's well-balanced compromise between latency (#1 consideration for games), footprint and readability.

The aim was to have an audio engine class which gets instantiated along with the script environment, and which would also provide script interfaces to a range of STK classes and a way to dynamically connect them in a modular fashion... Since STK can be compiled without the RtAudio dependency, I thought it best to stick with the QtMultimedia module, since I was relying heavily on Qt already.

One can conceivably tie STK's non-real-time build in with a push-mode implementation involving QAudioOutput, but, after wasting much time, experimentation with RtAudio made me decide to use it instead of QAudioOutput. The biggest factor for me was that RtAudio already deals with floating-point samples and transparently does the conversion to raw audio for a wide range of audio formats. Also, it has a relaxed open-source license and is cross-platform (Linux/Mac/Win). My DSP courses dealt mostly with audio signals represented as numbers in the range -1.0 to 1.0, so RtAudio is a more intuitive fit in this regard too.
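That's the appeal in code form: an RtAudio-style render callback just fills a buffer of normalized floats, no byte packing. Below is a simplified sketch of that callback shape (the struct, function names and 440 Hz value are my own illustration, not RtAudio's actual signature).

```cpp
#include <cmath>
#include <cstddef>

// Per-stream state for a naive sine oscillator.
struct SineState {
    double phase;       // current phase in radians
    double frequency;   // Hz (illustrative value chosen by the caller)
    double sampleRate;  // Hz
};

// Shape of an RtAudio-style render callback: fill nFrames of normalized
// floating-point samples (-1.0..1.0) straight into the output buffer.
int renderSine(double* outputBuffer, std::size_t nFrames, SineState* state) {
    const double twoPi = 6.283185307179586;
    const double step = twoPi * state->frequency / state->sampleRate;
    for (std::size_t i = 0; i < nFrames; ++i) {
        outputBuffer[i] = std::sin(state->phase);  // already in -1.0..1.0
        state->phase += step;
        if (state->phase >= twoPi) state->phase -= twoPi;
    }
    return 0;  // in RtAudio, returning 0 keeps the stream running
}
```

The format conversion to whatever the device actually wants happens inside RtAudio, which is exactly the drudge work I didn't want to own.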

On the plus side, choosing RtAudio over QtMultimedia also provided ASIO support on Windows (although the ASIO support is compiled and tested, I'm defaulting to DirectSound on Win32 platforms for now) and RtMidi's cross-platform MIDI I/O (which has also been compiled in but is waiting untapped). The STK source files have been statically incorporated into my project not because I planned it, but because I was initially only using some elements of STK... Eventually all of the STK source got included. I might eventually move back to just linking against the STK library. Right now I don't care much, because the code footprint is small and all the STK classes compile once up front, so they don't affect subsequent compile times much (exception: lesson 1).

Lessons learned:

  1. Complacency can make you forget the classics: changed a define in your project file? CLEAN! REBUILD! (And don't forget the cleaning part. I'm too embarrassed to say how much time I wasted on puzzling bugs after forgetting to do this.)
  2. RtAudio's closeStream() method should never be called from the destructor of a class your RtAudio instance is a member of... it's safer to hold a pointer to an RtAudio instance which you manually delete after step 3.
  3. You need the "bool done" tick-data element you see in all the STK code examples... there's no escaping it if you use the tick() callback functionality. In the main tick() callback, check data.done and return -1 if it is true. The audio engine class destructor should set data.done to true and then sleep for a couple of microseconds to allow any pending callbacks to return -1... only then is it safe to delete the RtAudio instance without a hang (on my system at least).
  4. Building STK with Qt (Qt Creator/qmake) under MinGW: if, like me, you added all the stk/include and stk/src files to your Qt project, you'll need to add some defines and library dependencies to your .pro file to get everything compiling and linking right. To get the compiler settings and project defines for STK, run STK's configure script from within an msys shell to generate a makefile. Reading this makefile will tell you what defines and libraries you need to add to your Qt .pro file (you can also look at the VC project files that come with STK, but they are more verbose and the linker syntax is different).
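For reference, on my Win32/DirectSound setup the relevant .pro additions boiled down to something like the fragment below. Treat it as illustrative only... the authoritative list is whatever the configure-generated makefile says for your machine.

```
# Illustrative .pro fragment for STK + RtAudio (DirectSound) under MinGW.
# __WINDOWS_DS__ selects RtAudio's DirectSound backend; __LITTLE_ENDIAN__
# tells STK the host byte order. Your configure output may differ.
DEFINES += __WINDOWS_DS__ __LITTLE_ENDIAN__
LIBS    += -ldsound -lole32 -lwinmm
```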
So, what do I have?

Not much really, just getting everything compiled together and the first steps of simple integration for now. Tones are generating; instruments are playing; filters are filtering; effects are even effecting... but only via C++ tests hacked into the audio engine's initAudio() method for the interim. The initial tests show promise. No audio script interfaces or script-definable modular flow graphs have been implemented yet... that is next.
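The modular flow-graph idea can be surprisingly small at its core: units that transform one sample per tick(), chained in series. Here's a hypothetical sketch of that shape (my own toy classes, not STK's actual design, though STK's SineWave, FreeVerb and friends follow a similar one-sample tick() idiom).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical building block: one unit transforms one sample per tick().
struct Unit {
    virtual ~Unit() {}
    virtual double tick(double input) = 0;
};

// Trivial example unit: scales the incoming sample.
struct Gain : Unit {
    double amount;
    explicit Gain(double a) : amount(a) {}
    double tick(double input) { return input * amount; }
};

// A chain runs the sample through each unit in order.
struct Chain {
    std::vector<Unit*> units;   // non-owning, for brevity
    double tick(double input) {
        for (std::size_t i = 0; i < units.size(); ++i)
            input = units[i]->tick(input);
        return input;
    }
};
```

The script interface would then just be a way to build and rewire that units vector at runtime.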

I still have a lot to do with the sound-programming before my next update... sorry, I know I can ramble sometimes. This post went on for longer than intended.
