H.264/MPEG-4 AVC (Advanced Video Coding) is a standard for video compression. The final drafting work on the first version of the standard was completed in May 2003.
H.264/AVC is the latest block-oriented motion-compensation-based codec standard developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG), and it was the product of a partnership effort known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 AVC standard (formally, ISO/IEC 14496-10 - MPEG-4 Part 10, Advanced Video Coding) are jointly maintained so that they have identical technical content. H.264 is used in such applications as Blu-ray Disc, videos from YouTube and the iTunes Store, DVB broadcast, direct-broadcast satellite television service, cable television services, and real-time videoconferencing.
The intent of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (e.g. half or less the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement. An additional goal was to provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems.
The H.264 standard is a "family of standards", the members of which are the profiles described below. A specific decoder decodes at least one, but not necessarily all profiles. The decoder specification describes which of the profiles can be decoded.
The standardization of the first version of H.264/AVC was completed in May 2003. The JVT then developed extensions to the original standard that are known as the Fidelity Range Extensions (FRExt). These extensions enable higher quality video coding by supporting increased sample bit depth precision and higher-resolution color information, including sampling structures known as YUV 4:2:2 and YUV 4:4:4. Several other features are also included in the Fidelity Range Extensions project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-specified perceptual-based quantization weighting matrices, efficient inter-picture lossless coding, and support of additional color spaces. The design work on the Fidelity Range Extensions was completed in July 2004, and the drafting work on them was completed in September 2004.
Further recent extensions of the standard have included adding five new profiles intended primarily for professional applications, adding extended-gamut color space support, defining additional aspect ratio indicators, defining two additional types of "supplemental enhancement information" (post-filter hint and tone mapping), and deprecating one of the prior FRExt profiles that industry feedback indicated should have been designed differently.
Scalable Video Coding as specified in Annex G of H.264/AVC allows the construction of bitstreams that contain sub-bitstreams that conform to H.264/AVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. In this case, high-level syntax and inter prediction reference pictures in the bitstream are constructed accordingly. For spatial and quality bitstream scalability, i.e., the presence of a sub-bitstream with lower spatial resolution or quality than the bitstream, NAL (Network Abstraction Layer) units are removed from the bitstream when deriving the sub-bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality signal from data of the lower spatial resolution or quality signal, is typically used for efficient coding. The Scalable Video Coding extension was completed in November 2007.
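Temporal scalability, as described above, amounts to dropping whole access units. The following Python sketch illustrates the idea on a toy model. The `AccessUnit` class, its fields, and the layer assignment in the example are hypothetical illustrations, not structures defined by the standard, and the sketch does not parse a real bitstream.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AccessUnit:
    """Toy stand-in for one coded picture (hypothetical, not a real parser type)."""
    index: int        # position in decoding order
    temporal_id: int  # temporal layer; 0 is the base layer


def extract_temporal_substream(access_units: List[AccessUnit],
                               max_temporal_id: int) -> List[AccessUnit]:
    """Keep only access units whose temporal layer is at or below the target.

    Whole access units are dropped, which lowers the temporal sampling rate.
    This assumes the encoder built the stream so that no kept picture
    references a dropped one.
    """
    return [au for au in access_units if au.temporal_id <= max_temporal_id]


# Assumed dyadic layering over eight pictures: layer 0 every 4th picture,
# layer 1 halfway between them, layer 2 for the remaining pictures.
stream = [AccessUnit(i, t) for i, t in enumerate([0, 2, 1, 2, 0, 2, 1, 2])]

half_rate = extract_temporal_substream(stream, max_temporal_id=1)
quarter_rate = extract_temporal_substream(stream, max_temporal_id=0)
print([au.index for au in half_rate])     # [0, 2, 4, 6]
print([au.index for au in quarter_rate])  # [0, 4]
```

Spatial and quality scalability are not modelled here, since they remove individual NAL units rather than complete access units.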
The H.264 name follows the ITU-T naming convention, where the standard is a member of the H.26x line of VCEG video coding standards; the MPEG-4 AVC name relates to the naming convention in ISO/IEC MPEG, where the standard is part 10 of ISO/IEC 14496, which is the suite of standards known as MPEG-4. The standard was developed jointly in a partnership of VCEG and MPEG, after earlier development work in the ITU-T as a VCEG project called H.26L. It is thus common to refer to the standard with names such as H.264/AVC, AVC/H.264, H.264/MPEG-4 AVC, or MPEG-4/H.264 AVC, to emphasize the common heritage. The name H.26L, referring to its ITU-T history, is less common, but still used. Occasionally, it is also referred to as "the JVT codec", in reference to the Joint Video Team (JVT) organization that developed it. (Such partnership and multiple naming is not uncommon. For example, the video codec standard known as MPEG-2 also arose from the partnership between MPEG and the ITU-T, where MPEG-2 video is known to the ITU-T community as H.262.[1])
In early 1998 the Video Coding Experts Group (VCEG - ITU-T SG16 Q.6) issued a call for proposals on a project called H.26L, with the target of doubling the coding efficiency (which means halving the bit rate necessary for a given level of fidelity) in comparison to any other existing video coding standards for a broad variety of applications. VCEG was chaired by Gary Sullivan (Microsoft [formerly PictureTel], USA). The first draft design for that new standard was adopted in August 1999. In 2000, Thomas Wiegand (Heinrich Hertz Institute, Germany) became VCEG co-chair. In December 2001, VCEG and the Moving Picture Experts Group (MPEG - ISO/IEC JTC 1/SC 29/WG 11) formed a Joint Video Team (JVT), with the charter to finalize the video coding standard. Formal approval of the specification came in March 2003. The JVT has been chaired by Gary Sullivan, Thomas Wiegand, and Ajay Luthra (Motorola, USA). In June 2004, the Fidelity Range Extensions (FRExt) project was finalized. From January 2005 to November 2007, the JVT worked on an extension of H.264/AVC towards scalability by an Annex (G) called Scalable Video Coding (SVC). The JVT management team was extended by Jens-Reiner Ohm (Aachen University, Germany). Since July 2006, the JVT has worked on Multiview Video Coding (MVC), an extension of H.264/AVC towards free viewpoint television and 3D television.
The H.264 video format has a very broad application range that covers all forms of digital compressed video from low bit-rate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. With the use of H.264, bit rate savings of 50% [2] or more are reported. Digital Satellite TV quality, for example, was reported to be achievable at 1.5 Mbit/s, compared to the current operation point of MPEG 2 video at around 3.5 Mbit/s.[3] In order to ensure compatibility and problem-free adoption of H.264/AVC, many standards bodies have amended or added to their video-related standards so that users of these standards can employ H.264/AVC.
Both the Blu-ray Disc format and the now-discontinued HD DVD format include the H.264/AVC High Profile as one of 3 mandatory video compression formats. Sony has also chosen this format for their Memory Stick Video format.[4]
The Digital Video Broadcast project (DVB) approved the use of H.264/AVC for broadcast television in late 2004.
The Advanced Television Systems Committee (ATSC) standards body in the United States approved the use of H.264/AVC for broadcast television in July 2008, although the standard is not yet used for fixed ATSC broadcasts within the United States.[5] [6] It has since been approved for use with the more recent ATSC-M/H (Mobile/Handheld) standard, using the AVC and SVC portions of H.264.[7]
AVCHD is a high-definition recording format designed by Sony and Panasonic that uses H.264 (conforming to H.264 while adding additional application-specific features and constraints).
AVC-Intra is an intraframe compression only format, developed by Panasonic.
The CCTV (closed-circuit television) or video surveillance market has included the technology in many products. Before H.264, the compression formats used in the industry's DVRs (digital video recorders) generally offered poor compression capability. With the application of H.264 compression to video surveillance, the quality of video recordings improved substantially. Starting in 2008, some in the surveillance industry promoted H.264 as synonymous with "high quality" video.
In countries where patents on software algorithms are upheld, the vendors of products which make use of H.264/AVC are expected to pay patent licensing royalties for the patented technology[8] that their products use. This applies to the Baseline Profile as well.[9] A private organization known as MPEG LA, which is not affiliated in any way with the MPEG standardization organization, administers the licenses for patents applying to this standard, as well as the patent pools for MPEG-2 Part 1 Systems, MPEG-2 Part 2 Video, MPEG-4 Part 2 Video, and other technologies. The last US MPEG LA patents for H.264 will not expire until 2028[10].
On February 2, 2010 MPEG LA announced that H.264-encoded Internet Video that is free to end users would continue to be exempt from royalty fees until at least December 31, 2015. [11] However, other fees remain in place. The license terms are updated in 5-year blocks. [12]
In 2005, Qualcomm, which was the assignee of US Patents 5,452,104,[13] and 5,576,767[14] sued Broadcom in US District Court, alleging that Broadcom infringed the two patents by making products that were compliant with the H.264 video compression standard.[15] In 2007, the District Court found that the patents were unenforceable because Qualcomm had failed to disclose them to the JVT prior to the release of the H.264 standard in May 2003.[15] In December 2008, the US Court of Appeals for the Federal Circuit affirmed the District Court's order that the patents be unenforceable but remanded to the District Court with instructions to limit the scope of unenforceability to H.264 compliant products.[15]
Discussions are often held regarding the legality of free software implementations of formats like H.264, especially concerning the legal use of GNU LGPL and GPL implementations of H.264 and other patented formats. Consensus in discussions is that the allowable use depends on the laws of local jurisdictions. If operating or shipping a product in a country or group of countries where none of the patents covering H.264 apply, then using, for example, an LGPL implementation of the format is not a problem: There is no conflict between the software license and the (non-existent) patent license.
Conversely, shipping a product in the U.S. which includes (though not necessarily implements) a GPL H.264 decoder/encoder requires that the copyright terms of the GPL license be upheld, otherwise conveying the codec would be in violation of the software license of the implementation. In simple terms, LGPL and GPL licenses version 3.0 and above require that any rights held in conjunction with distributing the code also apply to anyone receiving the code,[16] and no further restrictions are put on distribution or use.[17] A product which incorporates GPLed code must not rely upon a discriminatory patent license that would prohibit the user from exercising rights granted to them by the GPL.[18] Thus, the right to distribute patent-encumbered code under those licenses as part of the product is revoked per the terms of the GPL and LGPL.[18] Note that the parties who could enforce any such breach are the copyright holders, that is, the authors of the code. Any suit over a breach of that clause would have to argue that valid, applicable patents cover the capabilities of the GPL-licensed code,[18] a stance copyright holders[nb 1] have not taken.[19]
H.264/AVC/MPEG-4 Part 10 contains a number of new features that allow it to compress video much more effectively than older standards and to provide more flexibility for application to a wide variety of network environments. In particular, some such key features include:
These techniques, along with several others, help H.264 to perform significantly better than any prior standard under a wide variety of circumstances in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video—typically obtaining the same quality at half of the bit rate or less, especially on high bit rate and high resolution situations.[22]
Like other ISO/IEC MPEG video standards, H.264/AVC has a reference software implementation that can be freely downloaded.[23] Its main purpose is to give examples of H.264/AVC features, rather than being a useful application per se. Some reference hardware design work is also under way in the Moving Picture Experts Group. The features mentioned above span all profiles of H.264/AVC. A profile for a codec is a set of features of that codec identified to meet a certain set of specifications of intended applications. This means that many of the features listed are not supported in some profiles. The various profiles of H.264/AVC are discussed in the next section.
The standard defines various sets of capabilities, which are referred to as profiles, targeting specific classes of applications.
Profiles for non-scalable 2D video applications include the following:
For camcorders, editing, and professional applications, the standard contains four additional all-Intra profiles, which are defined as simple subsets of other corresponding profiles. These are mostly for professional (e.g., camera and editing system) applications:
As a result of the Scalable Video Coding (SVC) extension, the standard contains three additional scalable profiles, which are defined as a combination of a H.264/AVC profile for the base layer (identified by the second word in the scalable profile name) and tools that achieve the scalable extension:
As a result of the Multiview Video Coding (MVC) extension, the standard contains two multiview profiles:
| Feature | CBP | BP | XP | MP | HiP | Hi10P | Hi422P | Hi444PP |
|---|---|---|---|---|---|---|---|---|
| B slices | No | No | Yes | Yes | Yes | Yes | Yes | Yes |
| SI and SP slices | No | No | Yes | No | No | No | No | No |
| Flexible macroblock ordering (FMO) | No | Yes | Yes | No | No | No | No | No |
| Arbitrary slice ordering (ASO) | No | Yes | Yes | No | No | No | No | No |
| Redundant slices (RS) | No | Yes | Yes | No | No | No | No | No |
| Data partitioning | No | No | Yes | No | No | No | No | No |
| Interlaced coding (PicAFF, MBAFF) | No | No | Yes | Yes | Yes | Yes | Yes | Yes |
| CABAC entropy coding | No | No | No | Yes | Yes | Yes | Yes | Yes |
| 8×8 vs. 4×4 transform adaptivity | No | No | No | No | Yes | Yes | Yes | Yes |
| Quantization scaling matrices | No | No | No | No | Yes | Yes | Yes | Yes |
| Separate Cb and Cr QP control | No | No | No | No | Yes | Yes | Yes | Yes |
| Monochrome (4:0:0) | No | No | No | No | Yes | Yes | Yes | Yes |
| Chroma formats | 4:2:0 | 4:2:0 | 4:2:0 | 4:2:0 | 4:2:0 | 4:2:0 | 4:2:0/4:2:2 | 4:2:0/4:2:2/4:4:4 |
| Sample depths (bits) | 8 | 8 | 8 | 8 | 8 | 8 to 10 | 8 to 10 | 8 to 14 |
| Separate color plane coding | No | No | No | No | No | No | No | Yes |
| Predictive lossless coding | No | No | No | No | No | No | No | Yes |
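The feature table above can be read as a lookup from coding tools to the profiles that permit them. As a minimal sketch, the Python snippet below encodes five of its rows and finds the profiles that allow every tool in a given set. The dictionary keys are shortened labels chosen for this example, and `profiles_supporting` is a hypothetical helper, not part of any H.264 library.

```python
# Profile abbreviations in the column order of the table above.
PROFILES = ["CBP", "BP", "XP", "MP", "HiP", "Hi10P", "Hi422P", "Hi444PP"]

# Five rows copied from the table; keys are shortened labels (an assumption
# made for readability), values are the profiles marked "Yes" in that row.
FEATURE_SUPPORT = {
    "B slices": {"XP", "MP", "HiP", "Hi10P", "Hi422P", "Hi444PP"},
    "FMO": {"BP", "XP"},
    "Interlaced coding": {"XP", "MP", "HiP", "Hi10P", "Hi422P", "Hi444PP"},
    "CABAC": {"MP", "HiP", "Hi10P", "Hi422P", "Hi444PP"},
    "8x8 transform": {"HiP", "Hi10P", "Hi422P", "Hi444PP"},
}


def profiles_supporting(features):
    """Return the profiles, in table order, that allow every listed feature."""
    candidates = set(PROFILES)
    for feature in features:
        candidates &= FEATURE_SUPPORT[feature]
    return [p for p in PROFILES if p in candidates]


print(profiles_supporting(["B slices", "CABAC"]))
# ['MP', 'HiP', 'Hi10P', 'Hi422P', 'Hi444PP']
print(profiles_supporting(["FMO", "CABAC"]))
# [] : no profile in the table combines FMO with CABAC
```

An empty result means no single profile in the table permits that combination of tools.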
| Level | Max macroblocks per second | Max macroblocks per frame | Max video bit rate (VCL) for BP, XP, MP (kbit/s) | Max video bit rate (VCL) for HiP (kbit/s) | Max video bit rate (VCL) for Hi10P (kbit/s) | Max video bit rate (VCL) for Hi422P, Hi444PP (kbit/s) | Examples for high resolution @ frame rate (max stored frames) |
|---|---|---|---|---|---|---|---|
| 1 | 1,485 | 99 | 64 | 80 | 192 | 256 | 128×96@30.9 (8) 176×144@15.0 (4) |
| 1b | 1,485 | 99 | 128 | 160 | 384 | 512 | 128×96@30.9 (8) 176×144@15.0 (4) |
| 1.1 | 3,000 | 396 | 192 | 240 | 576 | 768 | 176×144@30.3 (9) 320×240@10.0 (3) 352×288@7.5 (2) |
| 1.2 | 6,000 | 396 | 384 | 480 | 1,152 | 1,536 | 320×240@20.0 (7) 352×288@15.2 (6) |
| 1.3 | 11,880 | 396 | 768 | 960 | 2,304 | 3,072 | 320×240@36.0 (7) 352×288@30.0 (6) |
| 2 | 11,880 | 396 | 2,000 | 2,500 | 6,000 | 8,000 | 320×240@36.0 (7) 352×288@30.0 (6) |
| 2.1 | 19,800 | 792 | 4,000 | 5,000 | 12,000 | 16,000 | 352×480@30.0 (7) 352×576@25.0 (6) |
| 2.2 | 20,250 | 1,620 | 4,000 | 5,000 | 12,000 | 16,000 | 352×480@30.7 (10) 352×576@25.6 (7) 720×480@15.0 (6) 720×576@12.5 (5) |
| 3 | 40,500 | 1,620 | 10,000 | 12,500 | 30,000 | 40,000 | 352×480@61.4 (12) 352×576@51.1 (10) 720×480@30.0 (6) 720×576@25.0 (5) |
| 3.1 | 108,000 | 3,600 | 14,000 | 17,500 | 42,000 | 56,000 | 720×480@80.0 (13) 720×576@66.7 (11) 1,280×720@30.0 (5) |
| 3.2 | 216,000 | 5,120 | 20,000 | 25,000 | 60,000 | 80,000 | 1,280×720@60.0 (5) 1,280×1,024@42.2 (4) |
| 4 | 245,760 | 8,192 | 20,000 | 25,000 | 60,000 | 80,000 | 1,280×720@68.3 (9) 1,920×1,080@30.1 (4) 2,048×1,024@30.0 (4) |
| 4.1 | 245,760 | 8,192 | 50,000 | 62,500 | 150,000 | 200,000 | 1,280×720@68.3 (9) 1,920×1,080@30.1 (4) 2,048×1,024@30.0 (4) |
| 4.2 | 522,240 | 8,704 | 50,000 | 62,500 | 150,000 | 200,000 | 1,920×1,080@64.0 (4) 2,048×1,080@60.0 (4) |
| 5 | 589,824 | 22,080 | 135,000 | 168,750 | 405,000 | 540,000 | 1,920×1,080@72.3 (13) 2,048×1,024@72.0 (13) 2,048×1,080@67.8 (12) 2,560×1,920@30.7 (5) 3,680×1,536@26.7 (5) |
| 5.1 | 983,040 | 36,864 | 240,000 | 300,000 | 720,000 | 960,000 | 1,920×1,080@120.5 (16) 4,096×2,048@30.0 (5) 4,096×2,304@26.7 (5) |
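The example frame rates in the right-hand column follow from the two macroblock limits: a picture fits a level if its macroblock count does not exceed the per-frame limit, and the highest frame rate is the per-second limit divided by that count. The Python sketch below reproduces this arithmetic under the assumption of 16×16 macroblocks with dimensions rounded up. The function names are illustrative, and other constraints that the standard may impose are ignored.

```python
def mbs(samples: int) -> int:
    """Number of 16-sample macroblocks needed to cover a picture dimension."""
    return (samples + 15) // 16


def max_frame_rate(max_mbs_per_second: int, max_mbs_per_frame: int,
                   width: int, height: int):
    """Highest frame rate allowed by the macroblock limits alone.

    Returns None when the picture has more macroblocks than the level's
    per-frame limit permits.
    """
    frame_mbs = mbs(width) * mbs(height)
    if frame_mbs > max_mbs_per_frame:
        return None
    return max_mbs_per_second / frame_mbs


# Level 4 limits from the table: 245,760 MB/s and 8,192 MB per frame.
print(round(max_frame_rate(245_760, 8_192, 1920, 1080), 1))  # 30.1
# Level 3.1 limits: 108,000 MB/s and 3,600 MB per frame.
print(round(max_frame_rate(108_000, 3_600, 1280, 720), 1))   # 30.0
print(max_frame_rate(108_000, 3_600, 1920, 1080))            # None
```

The first two results match the 1,920×1,080@30.1 and 1,280×720@30.0 entries listed for Levels 4 and 3.1.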
Previously-decoded pictures are used by H.264/AVC encoders to provide predictions of the values of samples in other pictures. This allows the encoder to make efficient decisions on the best way to encode a given picture. Such pictures are stored in a virtual decoded picture buffer (DPB). The maximum capacity of the DPB in units of frames (or pairs of fields), as shown in parentheses in the right column of the table above, can be computed as follows:
capacity = Min(Floor(MaxDpbMbs / (PicWidthInMbs * FrameHeightInMbs)), 16)
where MaxDpbMbs is a constant value provided in the table below as a function of level number, and PicWidthInMbs and FrameHeightInMbs are the picture width and frame height for the coded video data, expressed in units of macroblocks. (This formula is specified in sections A.3.1.h and A.3.2.f of the 2009 edition of the standard.)
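| Level | MaxDpbMbs |
|---|---|
| 1 | 396 |
| 1b | 396 |
| 1.1 | 900 |
| 1.2 | 2,376 |
| 1.3 | 2,376 |
| 2 | 2,376 |
| 2.1 | 4,752 |
| 2.2 | 8,100 |
| 3 | 8,100 |
| 3.1 | 18,000 |
| 3.2 | 20,480 |
| 4 | 32,768 |
| 4.1 | 32,768 |
| 4.2 | 34,816 |
| 5 | 110,400 |
| 5.1 | 184,320 |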
For example, for an HDTV picture that is 1920 samples wide (PicWidthInMbs = 120) and 1080 samples high (FrameHeightInMbs = 68), a Level 4 decoder has a maximum DPB storage capacity of Floor(32768/(120*68)) = 4 frames (or 8 fields). Thus, the value 4 is shown in parentheses in the table above in the right column of the row for Level 4 with the frame size 1920×1080.
It is important to note that the current picture being decoded is not included in the computation of DPB fullness (unless the encoder has indicated for it to be stored for use as a reference for decoding other pictures or for delayed output timing). Thus, a decoder needs to actually have sufficient memory to handle (at least) one frame more than the maximum capacity of the DPB as calculated above.
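The DPB calculation above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, macroblocks are assumed to be 16×16 samples with dimensions rounded up, and the default cap of 16 frames is an assumption based on the largest capacity that appears in the level table.

```python
def size_in_mbs(samples: int) -> int:
    """Round a picture dimension up to whole 16-sample macroblocks."""
    return (samples + 15) // 16


def dpb_capacity_frames(max_dpb_mbs: int, width: int, height: int,
                        cap: int = 16) -> int:
    """Maximum DPB capacity in frames for a given picture size.

    max_dpb_mbs is the level-dependent MaxDpbMbs constant. The cap of 16
    is an assumption taken from the largest value in the level table.
    """
    pic_width_in_mbs = size_in_mbs(width)       # 1920 -> 120
    frame_height_in_mbs = size_in_mbs(height)   # 1080 -> 68
    frames = max_dpb_mbs // (pic_width_in_mbs * frame_height_in_mbs)
    return min(frames, cap)


# Level 4 example from the text: MaxDpbMbs = 32768, 1920x1080 picture.
capacity = dpb_capacity_frames(32768, 1920, 1080)
print(capacity)      # 4 frames (or 8 fields)
print(capacity + 1)  # 5: frame stores needed once the current picture is counted
```

The second printed value reflects the point made above that a decoder needs memory for at least one frame beyond the computed DPB capacity.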
Versions of the H.264/AVC standard include the following completed revisions, corrigenda, and amendments (dates are final approval dates in ITU-T, while final "International Standard" approval dates in ISO/IEC are somewhat different and slightly later in most cases). Each version represents changes relative to the next lower version that is integrated into the text. Bold faced versions are published (or planned to be published).
| Feature | QT | Nero | LEAD | x264 | DivX | Dicas | Elecard | TSE | VSofts | ProCoder | Avivo | Elemental | IPP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B slices | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| SI and SP slices | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Multiple reference frames | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| Flexible Macroblock Ordering (FMO) | No | No | No | No | No | No | No | No | Yes | No | No | No | No |
| Arbitrary slice ordering (ASO) | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Redundant slices (RS) | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Data partitioning | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Interlaced coding (PicAFF, MBAFF) | No | MBAFF | MBAFF | Yes | Yes | Yes | Yes | No | MBAFF | Yes | MBAFF | Yes | No |
| CABAC entropy coding | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| 8×8 vs. 4×4 transform adaptivity | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| Quantization scaling matrices | No | No | No | Yes | Yes | No | No | No | Yes | No | No | No | No |
| Separate Cb and Cr QP control | No | No | No | Yes | Yes | Yes | Yes | No | No | No | No | No | No |
| Monochrome (4:0:0) | No | No | No | No | No | No | Yes | No | No | No | No | No | No |
| Chroma formats (4:2:x) | 0 | 0 | 0 | 0 | 0, 2 | 0 | 0, 2 | 0, 2 | 0, 2, 4 | 0 | 0 | 0 | 0 |
| Largest sample depth (bit) | 8 | 8 | 8 | 8 | 10 | 8 | 8 | 8 | 10 | 8 | 8 | 8 | 12 |
| Separate color plane coding | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Predictive lossless coding | No | No | No | Yes | No | No | Yes | No | No | No | No | No | No |
| Film grain modeling | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Profile | QT | Nero | LEAD | x264 | DivX | Dicas | Elecard | TSE | VSofts | ProCoder | Avivo | Elemental | IPP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Constrained baseline | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Baseline | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Extended | No | No | No | No | No | No | No | No | No | No | No | No | No |
| Main | No | Yes/No | Yes/No | Yes | Yes/No | Yes | Yes | No | Yes/No | Yes | No | Yes | No |
| High | No | No | No | No | No | No | No | No | No | No | No | No | No |
Because H.264 encoding requires significant computing power, software encoders running on general-purpose CPUs are typically slow, especially when dealing with HD content. To offload the CPU or to encode in real time, hardware encoders may be employed.
A hardware H.264 encoder can be an ASIC or an FPGA. An FPGA is a general-purpose programmable chip; to use one as a hardware encoder, an H.264 encoder IP core is required. By 2009, a full HD (Main Profile, level 4.1, 1080p, 30 fps) H.264 encoder could run on a single low-cost FPGA.
ASICs with H.264 encoding functions are available from many semiconductor companies, but the H.264 encoder IP used in these ASICs is mostly licensed from a few IP vendors. Some vendors offer H.264 IP for FPGA only or ASIC only, and some for both.[25]