Audio, video, and graphics (AVG) processing is at the core of many CE products. The AVG requirements for CE devices differ from those for PCs and servers, notably with respect to footprint, input devices, interlacing, and streaming. Multiple graphics planes and video planes may be combined using, e.g., alpha blending and animation.


No single default or standard interface exists for AVG. A well-defined, well-supported interface for AVG devices will reduce fragmentation of solutions and encourage the CE community to develop solutions against conforming interfaces, so that they can be deployed across a wider range of systems.


Acronyms and terms




ALSA

Advanced Linux Sound Architecture: functional-level audio API, now standard in 2.6 Linux kernels, replacing OSS.


API

Application Programming Interface


ARIB

Association of Radio Industries and Businesses. Most relevant to AVG is its proposed graphics architecture for High Definition TV Broadcast (the 5-plane model).


ATSC

Advanced Television Systems Committee: American standards body for digital television broadcasting.

Back-end Scaler

A Scaler which manipulates the graphics planes and data, but does not give the host processor access to the (blended) end result, mainly for efficiency reasons.

CCIR 601

In 1982, CCIR 601 established a digital video standard that uses the Y, Cr, Cb color space (often incorrectly referred to as YUV). Unlike YUV, Cr and Cb range over [-0.5, 0.5]. A full conversion matrix is included below (*).


CE

Consumer Electronics: a class of devices used in the home or on the move. Includes DVD, DVR, PVR, PDA, TV, set-top box, cellular phones, etc.


DVB

Digital Video Broadcast: European standards body for digital television broadcasting.


DVD

Digital Versatile Disc: high-capacity multimedia data storage medium.


DVR

Digital Video Recorder: a consumer electronic device.


Framebuffer

Abstraction of video-out hardware with a low-level (ioctl) API. Standard in Linux kernels (>2.4); see the /usr/src/linux/Documentation/fb kernel tree directory for more information.

Front-end Scaler

A Scaler which manipulates the graphics planes and data and allows the host processor to access the (in-between and) end results.


HDTV

High Definition Television: provides a higher quality television broadcast, with progressive and interlaced (720p to 1080i) video and support for the 16:9 (movie) aspect ratio.


JPEG

Joint Photographic Experts Group: (lossy) still image compression standard.


MHP

Multimedia Home Platform: an API used together with MPEG-2 transmissions.


MIME

Multipurpose Internet Mail Extensions: a standard for identifying the type of data contained in a file. MIME is an Internet standard that allows binary files to be sent across the Internet as attachments to e-mail messages. This includes graphics, photos, sound, video files, and formatted text documents.


MP3

MPEG-1 Audio Layer 3: a popular audio compression standard.


MPEG

Moving Picture Experts Group: a family of compression standards for digital audio and video with varying levels of complexity and achievable compression ratios.


NTSC

National Television System Committee: American standard for analog television broadcasting.


PAL

Phase Alternating Line: European standard for analog television broadcasting.


PNG

Portable Network Graphics: (lossless) still image compression standard.


PVR

Personal Video Recorder: a consumer electronic device.


RGB[A]

Colorspace representation commonly used in computer graphics. It uses three orthogonal components (Red, Green and Blue) to represent colors in the human visible spectrum; e.g., by combining red and green as additive colors, it can fool the eye into seeing "yellow" light. An optional A at the end denotes the presence of per-pixel alpha. See also CCIR 601.


Scaler

Graphics hardware accelerator which may scale and reformat (e.g. convert from YCC to RGB) graphics data and merge multiple independent graphics planes for final display.


V4L

Video for Linux: low-level (ioctl) video input and overlay API, standard in 2.4 Linux kernels. Originally designed for control of analog video capture and tuner cards, as well as parallel port and USB video cameras. Incorporated in many higher-level APIs such as DirectFB.


V4L2

Video for Linux, second version: made to be more flexible and extensible. Added specifications for digital tuner control and capture.


YCbCr[A]

Colorspace representation commonly used in analog and digital video broadcasts, and in video compression technologies such as MPEG. It uses three orthogonal components, one for luminance (Y) and two for the color-difference signals (Cr, Cb). Since the eye is less sensitive to color than to luminance, the color-difference signals are often allocated a smaller bandwidth (or a lower pixel resolution in the digital domain). An optional A at the end denotes the presence of per-pixel alpha. See also CCIR 601.


YIQ

Colorspace representation commonly used in North American TV broadcast; it is similar to YUV (see definition of YUV). The relation with YUV is: I = 0.74 V - 0.27 U and Q = 0.48 V + 0.41 U.


YUV

Colorspace representation commonly used in European TV broadcast. It is similar to YCbCr, and the two are often (incorrectly) treated as the same, with U referring to Cb and V referring to Cr. With Y (luminance) defined as Y = 0.299 R + 0.587 G + 0.114 B, by definition U = B - Y, so U represents colors from blue (U>0) to yellow (U<0). Likewise V = R - Y, so V represents colors from magenta (V>0) to cyan (blue-green) (V<0).
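The YUV and YIQ definitions above can be sketched in C as follows. This is only an illustration of the formulas as stated in this glossary (unscaled U = B - Y and V = R - Y); the type and function names are hypothetical, and the inputs are assumed to be normalized RGB values in [0, 1].

```c
/* Hypothetical helper types; components are floating point. */
struct yuv { double y, u, v; };
struct yiq { double y, i, q; };

/* RGB (each assumed in [0, 1]) to YUV, using the glossary definitions:
 * Y = 0.299 R + 0.587 G + 0.114 B, U = B - Y, V = R - Y. */
struct yuv rgb_to_yuv(double r, double g, double b)
{
    struct yuv out;
    out.y = 0.299 * r + 0.587 * g + 0.114 * b;
    out.u = b - out.y;
    out.v = r - out.y;
    return out;
}

/* YUV to YIQ, using I = 0.74 V - 0.27 U and Q = 0.48 V + 0.41 U. */
struct yiq yuv_to_yiq(struct yuv in)
{
    struct yiq out;
    out.y = in.y;
    out.i = 0.74 * in.v - 0.27 * in.u;
    out.q = 0.48 * in.v + 0.41 * in.u;
    return out;
}
```

For any gray input (R = G = B), both color-difference pairs come out as approximately zero, which is a quick sanity check on the coefficients.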

(*) RGB to YCbCr conversion matrix (rows give Y, Cr, Cb; columns multiply R, G, B):

          R        G        B
Y       0.299    0.587    0.114
Cr      0.500   -0.419   -0.081
Cb     -0.169   -0.331    0.500
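As a minimal sketch, the conversion matrix can be applied per pixel as shown below. The code assumes that the three rows are Y, Cr and Cb in that order (the first row matches the definition of Y given for YUV) and that the inputs are normalized RGB values in [0, 1]; the struct and function names are hypothetical.

```c
/* Hypothetical pixel type in the Y, Cr, Cb representation. */
struct ycc { double y, cr, cb; };

/* Apply the 3x3 matrix to normalized RGB (each assumed in [0, 1]).
 * Y lands in [0, 1]; Cr and Cb land in [-0.5, 0.5]. */
struct ycc rgb_to_ycc(double r, double g, double b)
{
    struct ycc out;
    out.y  =  0.299 * r + 0.587 * g + 0.114 * b;
    out.cr =  0.500 * r - 0.419 * g - 0.081 * b;
    out.cb = -0.169 * r - 0.331 * g + 0.500 * b;
    return out;
}
```

Pure red (1, 0, 0) gives the maximum Cr of 0.5, and white (1, 1, 1) gives Y of about 1 with both color-difference components at about 0.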


Compliance classifiers

Terminology conventions are adopted here as they are defined in IETF RFC 2119, "Key words for use in RFCs to Indicate Requirement Levels" (by S. Bradner, March 1997). A compliance classifier from the following set may be used:


[O] Three target platforms are used or under consideration:

For the first two, the SystemSizeSpec_R2 page has a full description under "Definition - Platform".

Audio Specification

[O] No additional Audio specifications have been defined. ALSA, defined in kernel 2.6, may be used. Further evaluation is required before it can be considered for recommendation (see work in progress). Future extensions relate to AV streaming and synchronization.

Video-in/Capture Specification

[O] No additional Video input (capture) specifications have been defined. V4L2, as defined in kernel 2.6, may be used.

[O] Proprietary solutions may also be used for video capture and digital tuners if V4L2 does not suffice.

[O] DirectFB may be used as a higher level API.

Note: Video output can be seen as an (interlaced) sub-set of graphics. See graphics specification below for more details.

Video-out/Graphics Specification

[S] The standard Framebuffer is recommended for use in embedded CE devices.

[O] DirectFB may also be used in combination with the framebuffer.

Extensions to both are under consideration (see work in progress).
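As a rough sketch of the low-level (ioctl) framebuffer API, the current video mode can be read with FBIOGET_VSCREENINFO from <linux/fb.h>. The helper names below are hypothetical, and the frame-size calculation assumes tightly packed lines; real drivers may pad each line, in which case the line_length field of struct fb_fix_screeninfo should be used instead.

```c
#include <fcntl.h>
#include <linux/fb.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Bytes for one visible frame, assuming no per-line padding. */
unsigned long fb_frame_bytes(unsigned xres, unsigned yres,
                             unsigned bits_per_pixel)
{
    return (unsigned long)xres * yres * ((bits_per_pixel + 7) / 8);
}

/* Hypothetical helper: fills *var with the current mode of the
 * framebuffer device at 'path'. Returns 0 on success, -1 on failure. */
int fb_query_mode(const char *path, struct fb_var_screeninfo *var)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, FBIOGET_VSCREENINFO, var);
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

On success, var->xres, var->yres and var->bits_per_pixel can be passed to fb_frame_bytes to estimate how much memory to map.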

Graphics formats

[O] The framebuffer supports CLUT, RGB and RGBA packed data formats, but not YCbCr[A]. Hardware capable of accelerating the display of YCbCr[A] packed data may, for now, use its own extensions to the framebuffer.

[O] Also, the DirectFB framework which supports these formats may be used.

Multi-plane support

[O] Graphics hardware capable of multiple planes may be implemented with a single device driver or with multiple device drivers, using one device per plane, e.g. /dev/fb0, /dev/fb1, ..., /dev/fb5 for a device with that many planes. Front-end based scalers are recommended to use the DirectFB framework.

[O] Back-end scalers may add ioctls to their framebuffer drivers.
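To make the idea of merging planes concrete, the following sketch blends one pixel of an upper plane over one pixel of a lower plane using per-pixel alpha. It is an illustration only: the 0xAARRGGBB pixel layout, the non-premultiplied alpha and the function names are assumptions, not something this specification mandates.

```c
#include <stdint.h>

/* Blend one 8-bit channel: result = src * a + dst * (1 - a),
 * with alpha in 0..255 and rounding to nearest. */
uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

/* Blend an upper-plane pixel over a lower-plane pixel.
 * Assumed layout: 0xAARRGGBB, alpha not premultiplied.
 * The result is returned fully opaque. */
uint32_t blend_argb(uint32_t src, uint32_t dst)
{
    uint8_t a = (uint8_t)(src >> 24);
    uint8_t r = blend_channel((src >> 16) & 0xff, (dst >> 16) & 0xff, a);
    uint8_t g = blend_channel((src >> 8) & 0xff, (dst >> 8) & 0xff, a);
    uint8_t b = blend_channel(src & 0xff, dst & 0xff, a);
    return 0xff000000u | ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}
```

A back-end scaler performs this kind of operation in hardware for every pixel of every plane, which is why the blended result is typically not made available to the host processor.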

Work in progress

Both DirectFB and the Framebuffer can be extended with YCbCr formats and multi-plane blending features commonly found in embedded CE devices. However, it is likely that only one of them will be supported in the future.

Framebuffer specification

YCbCr Format

Resolution Support

The recommended formats are:

If any of these formats are used, the CCIR 601 standard must be used. It defines how the data is interleaved and the relative positions of the Cb/Cr samples in relation to the Y samples.

Memory representation

YCbCr may be stored in, e.g., a framebuffer in various ways:

Following CCIR 601, only the packed formats are recommended, with the possible exception of a separate alpha plane in some cases (see ARIB [O6] proposal).
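As an example of a packed representation, the sketch below stores two horizontally adjacent pixels of 4:2:2 data in four bytes, with one Cb/Cr pair shared by both luminance samples. The byte order Cb, Y0, Cr, Y1 is an assumption made for this illustration, as are the function names; the format actually used must follow the applicable standard and the hardware documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes per line for packed 4:2:2: two pixels share four bytes. */
size_t ycbcr422_line_bytes(size_t width)
{
    return ((width + 1) / 2) * 4;
}

/* Pack one pixel pair. Assumed byte order: Cb, Y0, Cr, Y1. */
void ycbcr422_pack_pair(uint8_t *dst, uint8_t y0, uint8_t y1,
                        uint8_t cb, uint8_t cr)
{
    dst[0] = cb;
    dst[1] = y0;
    dst[2] = cr;
    dst[3] = y1;
}

/* Luminance of pixel x within a packed line (every second byte). */
uint8_t ycbcr422_luma_at(const uint8_t *line, size_t x)
{
    return line[2 * x + 1];
}
```

With this layout a 720-pixel line occupies 1440 bytes, i.e. an average of 16 bits per pixel.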

Font rendering

Basic 2D acceleration

Video format control

Multi-plane support

DirectFB specification

DirectFB overview [G2] provides a list of currently supported features, summarized below.

Important Terminology


Surface

Memory region physically reserved for rendering pixels. Surfaces are used for regular rendering of pixels, sprites and so on.


SubSurface

Sub-region of a surface. No physical memory is allocated for it.

Primary Surface

Visible screen in full screen mode.


Layer

Each layer corresponds to a different region of video memory. Layers are alpha-blended and displayed.


Window/Windowstack

Each layer may have multiple windows. A windowstack is a stack of windows. Each window has a surface. The locations and stacking order of windows may be changed.
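The relationship between a surface and a sub-surface can be sketched in plain C as below. This is not the DirectFB API: all type and function names are hypothetical, and the code only illustrates that a sub-surface is a rectangular view into its parent's memory and allocates none of its own.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical surface: 32-bit pixels, pitch counted in pixels. */
struct surface {
    uint32_t *pixels;
    int width, height, pitch;
    int owns_memory;          /* 1 for a surface, 0 for a sub-surface */
};

/* Allocate a zero-filled surface. Returns 0 on success, -1 on failure. */
int surface_create(struct surface *s, int width, int height)
{
    s->pixels = calloc((size_t)width * height, sizeof *s->pixels);
    if (s->pixels == NULL)
        return -1;
    s->width = width;
    s->height = height;
    s->pitch = width;
    s->owns_memory = 1;
    return 0;
}

/* Create a view of a rectangle inside 'parent'. No memory is allocated.
 * The rectangle is assumed to lie fully inside the parent. */
struct surface subsurface_view(const struct surface *parent,
                               int x, int y, int width, int height)
{
    struct surface sub;
    sub.pixels = parent->pixels + (size_t)y * parent->pitch + x;
    sub.width = width;
    sub.height = height;
    sub.pitch = parent->pitch;   /* lines keep the parent's stride */
    sub.owns_memory = 0;
    return sub;
}

void surface_put(struct surface *s, int x, int y, uint32_t color)
{
    s->pixels[(size_t)y * s->pitch + x] = color;
}

uint32_t surface_get(const struct surface *s, int x, int y)
{
    return s->pixels[(size_t)y * s->pitch + x];
}

/* Free the pixel memory; sub-surfaces own none, so nothing is freed. */
void surface_destroy(struct surface *s)
{
    if (s->owns_memory)
        free(s->pixels);
    s->pixels = NULL;
}
```

Writing to position (1, 1) of a sub-surface placed at (2, 3) changes pixel (3, 4) of the parent, because both refer to the same memory.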

YCbCr Format

Resolution Support

Supported formats are:

Memory representation

Font rendering

(*) For example, 'Times New Roman Regular' and 'Times New Roman Italic' correspond to two different faces.

Basic 2D acceleration

Video format control

Multi-plane support

GFX Card Driver

void dfb_gfxcard_OPERATION()
{
        bool hw = false;

        /* check if acceleration is available, and then acquire the hardware */
        if (hardware_accel_available(OPERATION) && hardware_accel_acquire(OPERATION)) {
                hw = card->funcs.OPERATION();
        }

        /* if hardware acceleration is not available or the driver declined */
        if (!hw) {
                /* fall back to the software implementation of OPERATION */
        }
}
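The pseudocode above describes a "try hardware first, otherwise fall back to software" pattern. A compilable sketch of that pattern, using a function pointer for the driver hook, might look as follows. All names here are hypothetical and are not taken from the DirectFB sources; the example driver that only accelerates small rectangles is invented purely to exercise both paths.

```c
#include <stdbool.h>
#include <stddef.h>

/* Driver hook: returns true if the hardware performed the operation. */
typedef bool (*accel_fn)(int width, int height);

struct gfx_card {
    accel_fn fill_rectangle;   /* NULL when the card cannot accelerate */
};

enum render_path { RENDER_HW, RENDER_SW };

/* Stand-in for a software renderer; a real one would write pixels. */
void software_fill_rectangle(int width, int height)
{
    (void)width;
    (void)height;
}

/* Example driver hook that declines rectangles above 4096 pixels. */
bool small_only_fill(int width, int height)
{
    return width * height <= 4096;
}

/* Try the hardware path first, then fall back to software. */
enum render_path gfx_fill_rectangle(const struct gfx_card *card,
                                    int width, int height)
{
    bool hw = false;

    if (card->fill_rectangle != NULL)
        hw = card->fill_rectangle(width, height);

    if (!hw) {
        software_fill_rectangle(width, height);
        return RENDER_SW;
    }
    return RENDER_HW;
}
```

Letting the hook return false even when it is present allows a driver to decline individual requests it cannot handle, without the caller needing to know why.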

DirectFB benchmarks

For DirectFB benchmarks in various environments, see the Benchmark section of EvaluateDirectFbTaskPage.


G - Graphics/Video out:

V - Video in:

A - Audio in/out:

U - Users of AVG:

O - Other:

Note (1) - KD26 refers to the Linux 2.6.X kernel tree, which has a "Documentation" sub-directory.

Remaining Issues

See Work in progress.

