Archive:Hardware Accelerated Video Decoding Development

This article covers potential methods for, and the development work around, Hardware Accelerated Video Decoding.

"Hardware Accelerated Video Decoding" is when video-playback software offloads portions of the video decoding process to the GPU (graphics) hardware by executing specific algorithms on the GPU. In theory this should also reduce bus bandwidth requirements.

FFmpeg should probably be the reference and test platform for all hardware accelerated video decoding development, because XBMC uses FFmpeg as the base for its DVDPlayer playback core (video player), with FFmpeg doing the demuxing and decoding. Also, since both FFmpeg and XBMC Media Center are cross-platform code, we could possibly get help from non-XBMC developers as well.

Hardware Accelerated Video Decoding under Linux
Developers wanted! For more information on XBMC development please see the Development Notes article in this WIKI.

Video decoding processes which could be accelerated
XvMC for Linux could possibly be extended in the future to support the same processes:
 * Motion compensation (mo comp)
 * Inverse Discrete Cosine Transform (iDCT)
 * Inverse Telecine 3:2 and 2:2 pull-down correction
 * Inverse modified discrete cosine transform (iMDCT)
 * In-loop deblocking filter
 * Intra-frame prediction
 * Inverse quantization (IQ)
 * Variable-Length Decoding (VLD), more commonly known as slice level acceleration
 * Spatial-Temporal De-Interlacing, (plus automatic interlace/progressive source detection)
 * Bitstream processing (CAVLC/CABAC)
 * CABAC entropy decoding is probably not possible to offload on GPU via pixel shader.
 * NVIDIA and ATI/AMD GPUs use dedicated hardware blocks for entropy decoding.
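
As a rough illustration of one of the processes listed above, the following minimal sketch computes the inverse DCT (iDCT) of a single 8x8 coefficient block on the CPU in plain Python. It is a floating-point reference-style version written for this article, not code taken from FFmpeg or any driver, and the function names idct_1d and idct_8x8 are made up for the example. Each block is transformed independently of every other block, which is why this step maps well onto parallel hardware.

```python
import math

N = 8  # block size used by MPEG-2 style codecs


def idct_1d(coeffs):
    """Inverse DCT (orthonormal DCT-III) of one row or column of N coefficients."""
    out = []
    for x in range(N):
        total = 0.0
        for u in range(N):
            # The DC term (u == 0) uses a smaller scale factor than the AC terms.
            scale = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
            total += scale * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        out.append(total)
    return out


def idct_8x8(block):
    """Separable 2D iDCT: transform every row, then every column."""
    rows = [idct_1d(row) for row in block]
    # cols[c][r] holds the result for column c, row r.
    cols = [idct_1d([rows[r][c] for r in range(N)]) for c in range(N)]
    # Transpose back so the result is indexed as out[row][column].
    return [[cols[c][r] for c in range(N)] for r in range(N)]
```

A block that contains only a DC coefficient decodes to a flat block, which is a quick sanity check: with block[0][0] = 1024 and all other coefficients zero, every output sample is 128 (up to floating-point rounding).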

XvMC
X-Video Motion Compensation (XvMC) is an extension of the X video extension (Xv) for the X Window System. The XvMC API provides a simple way to add hardware accelerated video decoding to video-playback software. XvMC is thus probably the first step to take towards Hardware Accelerated Video Decoding in XBMC's Linux port project: an existing XvMC library such as libxvmc (from the openChrome project) could be integrated into XBMC's DVDPlayer video player core, and that fork of libxvmc could later serve as a base to be extended with support for additional codecs and additional hardware/decoding methods.

NVIDIA
NVIDIA's closed source binary device driver for Linux contains libXvMCNVIDIA.so, which currently only supports XvMC hardware acceleration of motion compensation (mo comp) and inverse discrete cosine transform (iDCT) for MPEG-2. NVIDIA's closed source binary device driver for Microsoft Windows, however, exposes many more video decoding processes that can be offloaded to modern GPUs (such as NVIDIA's GeForce 6-series, from 6150 and on) via Microsoft's DXVA (DirectX Video Acceleration) API, the Microsoft Windows equivalent to XvMC.

NVIDIA PureVideo Technology
NVIDIA's GeForce 6-series (from GeForce 6150 and on) features a video acceleration engine called "PureVideo". NVIDIA's GeForce 8-series (with the exception of GeForce 8800) features an updated version of this technology, which NVIDIA calls "PureVideo HD". Both PureVideo and PureVideo HD feature a true discrete programmable processing core inside the NVIDIA GPU, dedicated to video decoding. The NVIDIA PureVideo Technology is a combination of a hardware video processor and video decode software, meaning it only offloads parts of the video decoding to the GPU. Since the parts it does offload are the 'heavy', processor-intensive ones, CPU usage differs hugely between using PureVideo and not using it.

Since ForceWare version 85, NVIDIA's closed source binary device driver for Microsoft Windows has featured PureVideo Technology support for hardware accelerated video decoding of MPEG-2, MPEG-4 AVC (H.264), VC-1, and WMV9 (plus Spatial-Temporal De-Interlacing) via DXVA, which is Microsoft's equivalent of the XvMC API.

CUDA (Compute Unified Device Architecture)
CUDA (Compute Unified Device Architecture) is a GPGPU technology and API that NVIDIA introduced with the GeForce 8-series (G8X based) GPUs. CUDA allows a programmer to use the C programming language to code algorithms for execution on the GPU, using the GPU as a set of 32-bit (single precision) floating point vector processors. A video decoding process could be one such algorithm executed on the GPU via CUDA. Since CUDA is only supported by NVIDIA GeForce 8-series (G8X based) GPUs and later, it is not really a viable alternative for XBMC, at least at this time.

Intel (GMA)
Intel's open source graphics device drivers for Linux support motion compensation (mo comp), inverse discrete cosine transform (iDCT), and de-interlacing, but only for MPEG-2 (even though Intel GMA X3000 / G965 and later hardware also supports VLD + iDCT + MC hardware acceleration of VC-1). The nice thing about Intel is that its device drivers are fully open source and support "XvMCSurfaces" (note, however, that XvMC is disabled by default in these drivers). One drawback of Intel GMA (Graphics Media Accelerator) is that only GMA X3000 / G965 and later support Shader Model 3.0 (Vertex Shader Model and Pixel Shader Model), and only GMA X3500 / G35 and later support OpenGL 2.0 (which XBMC for Linux requires to run smoothly).
 * http://www.intel.com/cd/ids/developer/asmo-na/eng/popular/334680.htm

Future Intel Technology - Video Acceleration API (VAAPI)
A new video acceleration API is currently being developed in an effort led by Intel. This new API supports more complete offload (such as VLD) as well as iDCT and MC, and can support acceleration of MPEG-4, H.264, and VC-1 as well as MPEG-2. (Extending XvMC was considered, but because XvMC was originally designed for MPEG-2 MoComp only, it made more sense to design an interface from scratch that can fully expose the video decode capabilities in today's GPUs.) The website for this effort is: http://www.freedesktop.org/wiki/Software/vaapi

The first public version of Ubuntu Mobile and Embedded (UME) Edition will possibly feature this new API:
 * http://wiki.ubuntu.com/mobile-hw-decode
 * http://blueprints.launchpad.net/ubuntu/+spec/mobile-hw-decode
 * http://wiki.ubuntu.com/mobile-hw-decode-va-api
 * http://wiki.ubuntu.com/MobileAndEmbedded/Graphics
 * http://wiki.ubuntu.com/MobileAndEmbedded/MediaPlayer
 * http://softwarecommunity.intel.com/articles/eng/1490.htm

Existing Intel Technology - Intel Clear Video Technology
Intel® Clear Video Technology is a combination of video processing hardware and software technologies for a wide range of digital displays. This technology is available on all Intel® G965 Express (Intel GMA X3000) chipset-based hardware platforms and later (again, note that you really need Intel GMA X3500 / G35 or later to run XBMC for Linux because of the OpenGL 2.0 requirement). According to Intel, Clear Video Technology enables enhanced high-definition video playback, sharper images, precise color control, and advanced display capability.

Intel Clear Video Technology features and benefits:
 * MPEG-2 decode: iDCT + motion compensation, up to 2 streams (1 HD and 1 SD)
 * De-interlacing: advanced pixel adaptive (SD/HD-1080i)
 * Color control: ProcAmp (brightness, hue, saturation, contrast)
 * Video scaling: 4x4 scaling
 * Digital display support (through SDVO): Digital Video Interface (DVI), High-Definition Multimedia Interface (HDMI)
 * Display support: RGB (QXGA), HDMI, UDI, DVI, HDTV (1080i/p, 720p), Composite, Component, S-Video (via Intel Serial Digital Video Out), TV-out, CRT
 * Aspect ratio: 16:9, 4:3, letterbox
 * Maximum resolution support: 2048 x 1536 at 75 Hz, RGB (QXGA)

ATI/AMD
ATI/AMD's current display drivers for Linux do not support XvMC. Although all Radeon graphics chips have hardware support for MPEG-2 acceleration (e.g. motion compensation and iDCT decoding), ATI has never provided access to these capabilities in Linux.

ATI/AMD Avivo Technology
ATI/AMD Avivo Technology uses pixel shaders to assist in decoding the video. So far only ATI/AMD's closed source binary device driver for Microsoft Windows supports Avivo. In theory, however, using pixel shaders to assist in decoding video should be possible on any GPU (from any manufacturer) that supports Shader Model 3.0 (i.e. Pixel Shader 3.0 and Vertex Shader 3.0).

AMD COBRA video library
AMD's recently announced COBRA video library accelerates video transcoding. No more information has been found as of yet.

ATI/AMD's Close-to-the-Metal (CTM) Technology
ATI/AMD's Close-to-the-Metal (CTM) is a type of GPGPU technology. CTM is supported by all R5x series ATI Radeon graphics chips and later, and with CTM ATI introduced the concept of using the GPU as a set of 32-bit (single precision) floating point vector processors. CTM presents a virtual machine abstraction for GPUs: a thin interface to the hardware that hides the graphics-specific features of the device. ATI/AMD made CTM open source via the Close-to-the-Metal (CTM) open source project site on SourceForge.net on the 2nd of February 2007.

Alternative methods of hardware accelerated video decoding
Video decoding processes could possibly also be accelerated under Linux/UNIX (and Microsoft Windows) by methods other than the previously mentioned PureVideo Technology from NVIDIA (alternative methods could also be used in combination with PureVideo to run video decoding processes that PureVideo does not support). Programming shaders (Pixel Shader or Vertex Shader), with one shader for each video decoding process that one wishes to accelerate, is one such method. GPGPU (General-Purpose Computing on Graphics Processing Units) is another possible method. All these alternative methods require Shader Model 3.0 support on the GPU (which is one of the reasons why we made Shader Model 3.0 a minimum end-user requirement for XBMC on Linux, Mac, and Windows).
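
To make the "one shader per decoding process" idea more concrete, here is a minimal CPU-side sketch in Python of what a per-pixel motion compensation shader would have to compute: fetch a pixel from the reference frame displaced by the motion vector, add the decoded residual, and clamp to the 8-bit range. This is only an illustration under simplifying assumptions (a single motion vector for the whole frame, integer-pixel precision, edge clamping), whereas real codecs use per-block vectors and sub-pixel interpolation. The names clamp8, mc_pixel, and motion_compensate are hypothetical and do not come from XBMC, FFmpeg, or any driver.

```python
def clamp8(value):
    """Clamp a sample to the valid 8-bit range."""
    return max(0, min(255, value))


def mc_pixel(reference, residual, mv, x, y):
    """Compute one output pixel, the way a pixel shader would for (x, y).

    mv is an (dx, dy) integer motion vector; reads outside the reference
    frame are clamped to the nearest edge pixel.
    """
    height = len(reference)
    width = len(reference[0])
    src_x = min(max(x + mv[0], 0), width - 1)
    src_y = min(max(y + mv[1], 0), height - 1)
    return clamp8(reference[src_y][src_x] + residual[y][x])


def motion_compensate(reference, residual, mv):
    """Run the per-pixel function over the whole frame.

    Every pixel is independent of the others, so a GPU could evaluate
    them all in parallel.
    """
    height = len(reference)
    width = len(reference[0])
    return [[mc_pixel(reference, residual, mv, x, y) for x in range(width)]
            for y in range(height)]
```

Because mc_pixel only reads its inputs and writes a single output value, it has the same shape as a fragment program; the loop in motion_compensate stands in for the rasterizer that would invoke the shader once per pixel.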

GLSL (OpenGL Shading Language)
OpenGL Shading Language (GLSL, a.k.a. GLslang) is a high-level shader programming language (based on the C programming language) that can be used to write such shaders. GLSL was originally introduced as an extension to OpenGL 1.5, but the OpenGL ARB only formally included GLSL in the OpenGL 2.0 core. http://en.wikipedia.org/wiki/OpenGL_Shading_Language

Cg (C for Graphics)
Cg (or "C for Graphics") is another high-level shading language, created by NVIDIA for programming vertex and pixel shaders; it is compatible with hardware from other GPU manufacturers as well. Like GLSL, Cg is based on the C programming language, and although they share the same syntax, some features of C were modified and new data types were added to make Cg more suitable for programming graphics processing units. The Cg programming language seems to have survived the introduction of the newer shading languages very well, mainly due to its established momentum in the digital content creation area, although the language is seldom used in final products.

GPGPU (General-Purpose Computing on Graphics Processing Units)
General-Purpose Computing on Graphics Processing Units (GPGPU, also referred to as GPGP and to a lesser extent GP²) is a recent trend in computer science that uses the Graphics Processing Unit rather than the CPU to perform computations. The addition of programmable stages and higher precision arithmetic to the GPU rendering pipeline has allowed software developers to use the GPU for non-graphics applications. Because of the extremely parallel nature of the graphics pipeline, the GPU is especially useful for programs that can be cast as stream processing and real-time computing problems. The simplest way to enable GPGPU support is to use a library such as Lib Sh (GPGPU library for C++), Brahma, BrookGPU, Brook, or Brook+ (the Brook family, i.e. BrookGPU, Brook, and Brook+, is probably the better choice for XBMC video decoding purposes, though looking at all of them might be a good experience in its own right).
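
"Cast as stream processing" means that the same small function (a kernel) is applied to every element of an input stream, with no element depending on another. The sketch below shows this shape in plain Python for a colour space conversion, a task that commonly accompanies video decoding. It is an assumption-laden example: it uses the full-range BT.601 (JFIF) YCbCr to RGB equations, while broadcast video often uses a limited range with different scaling, and the names ycbcr_to_rgb_kernel and run_stream are invented for illustration and are not part of Brook, Lib Sh, or any other library mentioned here.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def ycbcr_to_rgb_kernel(pixel):
    """Kernel: convert one full-range BT.601 (JFIF) YCbCr pixel to RGB.

    The kernel only looks at its own input element, which is what lets
    a stream processor run many instances of it in parallel.
    """
    y, cb, cr = pixel
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return (clamp8(r), clamp8(g), clamp8(b))


def run_stream(kernel, stream):
    """Apply the kernel to every element of the stream.

    On a GPU this map would be executed by many shader units at once;
    here it is a plain sequential map for illustration.
    """
    return list(map(kernel, stream))
```

For example, run_stream(ycbcr_to_rgb_kernel, [(128, 128, 128), (255, 128, 128)]) yields mid grey and white, since neutral chroma (Cb = Cr = 128) leaves all three channels equal to the luma value.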

Other GPGPU developer resources

 * GPGPU.org
 * BrookGPU
 * Brook
 * Lib Sh (GPGPU library for C++)
 * Brahma
 * Jorik
 * Shallows library - a cross platform C++ layer on top of OpenGL 2.0 and GLSlang
 * GPGPU Programming Resources - This project maintains various libraries, utility classes, and programming examples.

Possible development tools and resources
Tools and resources that could possibly help in the development. Note that GLSL shaders will need to be created and tested in a development tool before they are integrated into the video-playback software that will use them (GLSL developer tools exist for this purpose, see "Development Tools" below).

Development Tools

 * Lumina - GLSL development tool (IDE). It is platform independent and the interface uses the Qt (toolkit).
 * NVIDIA ShaderPerf 1.8 and ShaderPerf 2.0 Alpha - handy utility that reports detailed shader performance metrics for a wide range of inputs. It is available both as a command line utility and with a user interface in FX Composer. Please note that ShaderPerf 2.0 Alpha only supports DirectX shaders written in HLSL or assembly (so either use version 1.8 or use HLSL2GLSL, which can convert an HLSL shader into a GLSL shader).
 * FX Composer - provides an IDE interface to create, compile and debug GLSL (as well as DirectX) shaders.
 * RenderMonkey - provides an IDE interface to create, compile and debug GLSL (as well as DirectX) shaders.
 * Blender - This popular open source 3D modeling and animation package can now use GLSL materials, thus allowing any shader developer to use it as a development tool.
 * OpenSceneGraph - open source multiplatform graphics and shader IDE (also see GLSL Shading with OSG — 1.20MB zipped PDF)
 * HLSL2GLSL - library and tool that converts HLSL (High Level Shader Language) shaders to GLSL (OpenGL Shading Language)
 * DirectX OpenGL Wrapper - emulates API calls through OpenGL commands and other platform specific commands in order to enable DirectX 8 applications to run on platforms other than Windows.

Open Source Device Drivers

 * intellinuxgraphics.org open source Linux Graphics Device Drivers from Intel (with XvMC for MPEG-2 acceleration support)
 * openChrome Project - open source device drivers for VIA (has updated XvMC with MPEG-2/MPEG-4 acceleration support)
 * Nouveau - open source device driver for NVIDIA-based graphic controllers (does not yet feature any XvMC support)

Source Code and Libraries

 * BrookGPU - GPGPU library in ANSI C for general purpose computations on GPU (OpenGL and DirectX compatible)
 * Lib Sh - GPGPU library in metaprogramming language and C++ for general purpose computations on GPU
 * Discrete Wavelet Transform (DWT) of JPEG 2000 (JasPer) on GPU written in Cg shader
 * OpenCV (Open Computer Vision Library) - a collection of algorithms and sample code for various computer vision problems. The library is compatible with Intel Image Processing Library (IPL) and utilizes Intel Integrated Performance Primitives for better performance. Features bi-linear interpolation and color space conversion functions in IPL (I also read that motion estimation with block matching and Hough transform is on the roadmap, so you might want to check out their CVS).
 * Anti-Grain Geometry - A High Quality Rendering Engine (High Fidelity 2D Graphics Renderer) for C++ (GPL licensed)
 * SDL_buffer - a SDL extension library that is useful when you have to resize an image multiple times.
 * SDL_Resize - basic image resizing library, high quality output suited for prerendering images.
 * SDL_Config - Library designed for reading and writing configuration (.ini) files in an easy, cross-platform way.
 * SDL_bgrab - SDL conversion of libbgrab (a framegrabber lib from the same author).
 * NVIDIA Shader Library (color space conversions, blurring, interpolation, anti-aliasing, etc.)

Online Documentation and Tutorials

 * GLSL (OpenGL Shader Language) Tutorial @ Lighthouse 3D
 * OpenGL specification and OpenGL Shading Language reference documents (3DLabs)

Books (hard-copy)

 * GPU Gems 2 (published by NVIDIA) for and by developers
 * There is also GPU Gems 1, but it does not cover GPGPU

XvMC

 * Wikipedia.org article on XvMC (X-Video_Motion_Compensation)
 * MythTV WIKI article on XvMC under Linux
 * bit-tech.net article on NVIDIA PureVideo Technology

GLSL

 * Wikipedia.org article on GLSL (OpenGL Shading Language)
 * shadertech.com - Shader development news, forums, tools, code, and links.