audio_system(9) - NetBSD Manual Pages

AUDIO_SYSTEM(9)        NetBSD Kernel Developer's Manual        AUDIO_SYSTEM(9)


NAME

     audio_system -- the NetBSD in-kernel audio mixer specification


INTRODUCTION

     This document aims to describe all aspects of the in-kernel audio mixer
     included with NetBSD 8 and onwards, describing its current behavior as of
     2018.


VIRTUAL CHANNEL (VCHAN)

     This is the most fundamental element of the mixer.  A vchan has all of
     the properties of the traditional single-open NetBSD audio channel.  It
     consists of playback and record rings along with audio_info structures.

     Upon opening of /dev/audio or /dev/sound, a new vchan and a new mixerctl
     structure are created.  In the case of /dev/sound, audio_info structures
     are inherited from the last open of /dev/audio or /dev/sound.

     All vchans are up or down sampled into the mix ring (intermediate) format
     before being sent to hardware.

     The arrangement is shown in the following diagram:

             VCHAN1---------\
                             \   VCHAN0
             VCHAN2-------------MIX RING ---- HARDWARE
                     ...     /
             VCHANn---------/

     In the case of sysctl(8) usemixer=0 (see below), there is only one vchan
     whose play and record rings are the hardware play/record rings.

     User accessible vchans are numbered starting at one (1).  Vchan 0 is used
     internally by the mixer for the mix ring and its ring buffers are not
     user accessible.

     The only limit to the number of open vchans is the speed of the computer
     and the number of free file descriptors.


BLOCK - SIZE / LATENCY

     A block is the basic unit of audio data.  Audio applications will not
     commence playback until three (3) blocks have been written.  This,
     together with the size of the audio data block, is the source of latency
     in the mixer.

     For normal uses of audio read/write there will be three blocks of audio
     data before playback commences: one in the vchan, one in the mix ring and
     one in the hardware ring.

     The size of the audio data block depends on the audio format configured
     by the application, the latency sysctl(8) and the underlying audio
     hardware.

     Some audio hardware devices only support a static block size; as such,
     the overall latency of the mixer for these devices cannot be changed.
     Other devices, such as those supported by hdaudio(4), allow the hardware
     block size to be changed, allowing the latency of the mixer to range from
     4 milliseconds (ms) to 128 ms with the mixer intermediate format being 16
     bit, stereo, 48 kHz.

     With regard to mmapped audio, blocks are played back immediately, so the
     latency presented to applications is one third of the latency sysctl(8)
     value.

     Latency can be calculated by the following formula:

             Latency (ms) =   blocksize(bytes) * num blocks * 1000
                             --------------------------------------
                             freq(Hz) * bytes per sample * channels
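
     As a worked example of the formula above, the following sketch computes
     the latency with integer arithmetic.  The function name latency_ms and
     the 768-byte block size used in the note below are illustrative
     assumptions; neither is taken from the kernel sources.

```c
#include <stdint.h>

/*
 * Latency in milliseconds for num_blocks blocks of blocksize bytes,
 * following the formula above.  Returns 0 if any divisor term is 0.
 * Integer division truncates any fractional millisecond.
 */
static uint64_t
latency_ms(uint64_t blocksize, uint64_t num_blocks, uint64_t freq_hz,
    uint64_t bytes_per_sample, uint64_t channels)
{
    uint64_t denom = freq_hz * bytes_per_sample * channels;

    if (denom == 0)
        return 0;
    return (blocksize * num_blocks * 1000) / denom;
}
```

     With a 16 bit, stereo, 48 kHz format, latency_ms(768, 1, 48000, 2, 2)
     gives 4 ms for a single block and latency_ms(768, 3, 48000, 2, 2) gives
     12 ms for the three blocks buffered before playback commences.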

     Latency in the mixer and latency presented to audio applications are
     consistent: they will be the same regardless of the audio format
     requested by the audio application.

     The default latency configured at boot time is 150 ms and is subject to
     the above constraints.


ADDED IOCTLS

     Two new ioctls have been added to accommodate mixing of multiple vchans:

     AUDIO_SETCHAN:
             Allows setting the target vchan to operate on for subsequent
             ioctl(2) calls.

     AUDIO_GETCHAN:
             Returns the current vchan number.

     These ioctls were necessary as some audio applications open both an
     audio(4) device and an audioctl(4) device in order to check buffer usage,
     samples played, etc.

     As opening an audioctl(4) device would represent vchan 0 (the mix ring),
     these ioctls allow setting the target vchan and audio_info structure to
     that of an existing vchan.


MIXERCTL INTERFACE / SOFTWARE VOLUME

     Mixerctl structures are allocated when a new vchan is created.  The mixer
     control structure allows for setting the software volume for playback
     (vchan.dacN) or recording (vchan.adcN).  These are 8 bit values, and the
     value is applied during mixing into the mix ring.

     The software volume is applied to all channels (1, 2, 4 etc.) in the
     vchan and at present (2018-05-04) there are no balance controls for user
     accessible vchans.
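
     The idea of applying an 8 bit software volume while mixing can be
     sketched as follows.  This is not the kernel implementation: the names
     clamp16 and mix_with_volume, the scaling by volume/255 and the
     saturating addition are all assumptions made for illustration, using
     16 bit signed samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate value to the signed 16-bit range. */
static int16_t
clamp16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/*
 * Add nsamples of src into mix, scaling each source sample by an 8-bit
 * volume (assumed here: 0 = silent, 255 = unchanged).  Samples of all
 * channels are interleaved, so the one volume covers every channel.
 */
static void
mix_with_volume(int16_t *mix, const int16_t *src, size_t nsamples,
    uint8_t volume)
{
    for (size_t i = 0; i < nsamples; i++) {
        int32_t scaled = ((int32_t)src[i] * volume) / 255;

        mix[i] = clamp16((int32_t)mix[i] + scaled);
    }
}
```

     Because the loop runs over interleaved samples, a single volume value
     scales every channel equally, which is consistent with the absence of
     per-vchan balance controls.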

     The first vchan corresponds to the vchan.dac1/adc1 mixer controls.

     Each vchan mixer control only affects the volume of its own vchan;
     writing to the outputs.master (or equivalent) control is required to
     change the volume of the hardware.

     Mixer controls are only present whilst the vchan is in use and numbering
     starts at one (1).  Mixer control numbers (e.g., dac1/adc1) correspond to
     their vchan number.


AUDIOCTL / AUDIO_INFO INTERFACE

     audioctl(1) allows access to the audio_info structure of a given device.
     With the introduction of the audio mixer, a -p flag was added to allow
     access to a given vchan's audio_info structure.  The values for -p are
     numbered starting at zero (0).

     Not specifying -p is the same as specifying -p 0 and will result in work-
     ing with vchan 0 (the mix ring).  This will display the audio parameters
     of the mix ring and allow setting the hardware gain and balance.

     This is for compatibility with existing applications and shell scripts
     that are unaware of the -p switch.

     The parameters for playback and recording only affect the particular
     vchan being operated on (gain, sample rate, channels, encoding, etc.),
     except for -p 0 (the mix ring).


ADDED SYSCTLS

     With the introduction of the audio mixer the following sysctl(7)s have
     been added:

     hw.driverN.frequency:

     hw.driverN.precision:

     hw.driverN.channels:
             Intermediate mixing format.  (see below)

     hw.driverN.latency:
             Expressed in milliseconds.  (see above)

     hw.driverN.multiuser:
             Off/On (0/1), defaults to off.  This sysctl(7) determines if mul-
             tiple users are allowed to access the sound hardware.  The root
             user is always allowed access (e.g., for wsbell).  The first user
             to open the audio device has full control of the audio device if
             this sysctl is set to off.  There is currently an outstanding PR
             about this affecting a privileged process (PR/52627).

             Ideally if root intervenes with the audio device, it should do so
             unaffected.

             If this control is set to on, then all users' audio data are
             mixed and all users have access to the audio hardware.
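
             The access rule described for this sysctl can be summarised by
             the sketch below.  It is an interpretation rather than the
             kernel code: the type user_id, the function may_open_audio and
             the assumption that other users are refused while multiuser is
             off are all hypothetical.

```c
#include <stdbool.h>

typedef unsigned int user_id;   /* hypothetical stand-in for uid_t */

/*
 * Decide whether uid may open the audio device.
 *   multiuser - value of the multiuser sysctl (false = off, true = on)
 *   has_owner - true once some user has already opened the device
 *   owner     - uid of that first user (ignored if !has_owner)
 */
static bool
may_open_audio(user_id uid, bool multiuser, bool has_owner, user_id owner)
{
    if (uid == 0)
        return true;    /* root is always allowed access */
    if (multiuser)
        return true;    /* all users share the hardware */
    if (!has_owner)
        return true;    /* first user takes control */
    return uid == owner;
}
```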

     hw.driverN.usemixer:
             Off/On (0/1), defaults to on.  This sysctl(7) enables or disables
             the audio mixer.  When set to off, the audio device can support
             only one vchan.  This vchan's play and record ring buffers are
             the hardware ring buffers.

             This option was added to aid older/slower systems where the extra
             overhead of the audio mixer might pose a problem.


INTERMEDIATE / MIXING FORMAT

     The initial concept was to handle incoming audio data similarly to that
     of a superheterodyne radio receiver:

                RF -> IF -> AF

     So the corresponding mixing concept is:

                vchan -> mixing format -> hardware

     The sysctl(7)s described above determine the format for mixing.  All
     vchans are up or down sampled to this format before mixing takes place.

     On most systems this defaults to 16 bit, stereo, 48 kHz.  The sysctl(7)s
     governing the mixing format may only be changed when there are no vchans
     in use.

     On faster systems the precision (8, 16, 32 bits) may be changed along
     with the sample rate and number of channels (mono, stereo, 4 etc.).

     On older/slower systems utilizing audio mixing, it may be required to
     lower the quality of this format to ease the amount of data processing
     whilst mixing.

     All possible audio formats (mulaw, alaw, slinear, ulinear, 8, 16, and 32
     bit precision) are converted for use by the audio mixer.
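
     To illustrate what converting a vchan format to the mixing format
     involves, the sketch below widens unsigned 8 bit linear, mono samples to
     signed 16 bit linear, stereo.  Sample rate conversion is deliberately
     left out, and the function names are hypothetical; the kernel's own
     conversion code is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one unsigned 8-bit linear sample to signed 16-bit linear. */
static int16_t
ulinear8_to_slinear16(uint8_t s)
{
    /* Remove the 128 offset, then scale up to the 16-bit range. */
    return (int16_t)(((int32_t)s - 128) * 256);
}

/*
 * Expand n mono ulinear8 samples into interleaved stereo slinear16.
 * dst must have room for 2 * n samples.
 */
static void
mono8_to_stereo16(int16_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int16_t v = ulinear8_to_slinear16(src[i]);

        dst[2 * i] = v;         /* left */
        dst[2 * i + 1] = v;     /* right */
    }
}
```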


MEMORY MAPPED PLAYBACK

     It is possible to use mmap for audio playback, achieving reduced latency.
     However, the audio application's selected format must match the
     mixing/intermediate format (see above).

     It is possible to obtain the audio_info for vchan 0, which contains the
     intermediate/mixing format, to make it easier for applications to
     configure mmapped audio.
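
     The check an application would need before using mmapped playback can be
     sketched as a simple comparison.  The structure fmt below is a
     hypothetical summary holding the three parameters named by the mixing
     format sysctl(7)s; it is not the audio_info structure, and the encoding
     would presumably also have to match.

```c
#include <stdbool.h>

/* Hypothetical format summary; not the kernel's audio_info structure. */
struct fmt {
    unsigned int frequency;     /* sample rate in Hz */
    unsigned int precision;     /* bits per sample */
    unsigned int channels;      /* 1 = mono, 2 = stereo, ... */
};

/* True if the application's format equals the mixing format. */
static bool
fmt_matches_mix(const struct fmt *app, const struct fmt *mix)
{
    return app->frequency == mix->frequency &&
        app->precision == mix->precision &&
        app->channels == mix->channels;
}
```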

     At present most applications don't use the mix ring's audio_info
     structure to obtain the required playback parameters, and some user
     intervention is required to set the audio format for the application.


HARDWARE DRIVER REQUIREMENTS

     Audio mixing requires signed linear support in the host's endianness.
     Driver authors should support slinear_le and slinear_be formats.
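
     The difference between the two formats is only the byte order of each
     sample.  The following sketch, which uses hypothetical helper names and
     16 bit samples, shows how the host's endianness can be detected and how
     one sample is laid out in each format.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 on a little-endian host, 0 on a big-endian host. */
static int
host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, sizeof(first));
    return first == 1;
}

/* Name of the signed linear format matching the host's endianness. */
static const char *
host_slinear_name(void)
{
    return host_is_little_endian() ? "slinear_le" : "slinear_be";
}

/* Store one signed 16-bit sample in the requested byte order. */
static void
put_slinear16(uint8_t out[2], int16_t sample, int little_endian)
{
    uint16_t u = (uint16_t)sample;

    if (little_endian) {
        out[0] = (uint8_t)(u & 0xff);
        out[1] = (uint8_t)(u >> 8);
    } else {
        out[0] = (uint8_t)(u >> 8);
        out[1] = (uint8_t)(u & 0xff);
    }
}
```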

     If the audio hardware is intended to be used with the mixer disabled,
     mulaw, 1 channel, 8000 Hz must also be supported.

     This is easily achievable with the auconv framework/filters.  All new
     drivers should consider the use of auconv where possible.


SEE ALSO

     audioctl(1), mixerctl(1), audio(4), audio(9)


AUTHORS

     Nathanial Sloss


SPECIAL THANKS

     Great appreciation goes to Onno van der Linden, isaki@, maya@, jmcneill@,
     pgoyette@, mrg@, riastradh@ and christos@ -- without their input, this
     code would not be what it is currently.

NetBSD 8.0                       May 28, 2018                       NetBSD 8.0