Compressed Video Communications, Part 4

4 Error Resilience in Compressed Video Communications

4.1 Introduction

Compressed video streams are intended for transmission over communication networks. With the advance of multimedia systems technology and wireless mobile communications, there has been a growing need for the support of multimedia services such as mobile teleconferencing, telemedicine, mobile TV and distance learning, using mobile multimedia technologies. These services require the real-time transmission of video data over fixed and mobile networks of varying bandwidth and error rate characteristics. Since coded video data is highly sensitive to information loss and channel bit errors, the decoded video quality is bound to suffer dramatically at high channel bit error ratios (BER). This quality degradation is exacerbated when no error control mechanism is employed to protect coded video data against the hostility of error-prone environments. A single bit error that hits a coded video stream could lead to disastrous quality deterioration for extended periods of time. Moreover, the temporal and spatial predictions used in most of the video coding standards today render the coded video stream even more vulnerable to channel errors. This vulnerability shows up as the rapid propagation of errors in both time and space and the quick degradation of the reconstructed video quality.

To mitigate the effects of channel errors on the decoded video quality, error-handling schemes must be efficiently applied at both the video encoder and decoder. Since real-time video transmissions are sensitive to time delays, re-transmitting the erroneous video data is ruled out. Therefore, other forms of error control strategy must be employed to mitigate the effects of errors inflicted on coded video streams during transmission.
Some of these error control schemes employ data recovery techniques that enable decoders to conceal the effects of errors by predicting the lost or corrupted video data from the previously reconstructed error-free information. These techniques are decoder-based and incur no changes on the transport technologies employed. Moreover, they do not place any redundancy on the compressed video streams and are thus referred to as zero-redundancy error concealment techniques (Wang and Zhu, 1998). Other error control schemes operate at the encoder and apply a variety of techniques to enhance the robustness of compressed video data to channel errors. These are known as error resilience techniques, and they are widely used in video communications today (Redmill et al., 1998; Talluri, 1998; Soares and Pereira, 1998; Weng et al., 1998). The last type of error control mechanism operates at the transport level and tries to optimise the packet structure of coded video frames in terms of their error performance as well as channel throughput. These techniques are the most complex, as they depend on the networking platforms over which coded streams are intended to travel and the associated network and transport protocols (Guillemot et al., 1999; Parthsarathy, Modestino and Vastola, 1997). In this chapter, we cover a variety of the error concealment and resilience techniques used in video communications today; the transport-based error control schemes will be examined in the next chapter.

Compressed Video Communications, Abdul Sadka. Copyright © 2002 John Wiley & Sons Ltd. ISBNs: 0-470-84312-8 (Hardback); 0-470-84671-2 (Electronic)

4.2 Effects of Bit Errors on Perceptual Video Quality

The error performance of most video coding standards is degraded mainly due to two major factors, namely the motion prediction and the bit rate variability discussed in Section 3.2.
In the motion prediction process of ITU-T H.263, for instance, motion vectors (MV) are sent in differential coordinates in both pixel and half-pixel accuracies. In other words, each MV is sent as the difference between the estimated MV components and those of the median of three candidate MV predictors belonging to MBs situated to the top, left and top-right of the current MB. If an error corrupts a particular MB, the decoder would be unable to correctly reconstruct a forthcoming MB whose MV depends on that of the affected MB as a candidate predictor. Similarly, the failure to reconstruct the current MB because of errors prevents the decoder from correctly recovering forthcoming MBs that depend on the current MB in the motion prediction process. The accumulative damage due to these temporal and spatial dependencies might be caused by a single bit error, regardless of the correctness of subsequent information.

Similarly, the variable bit rate nature of coded video streams is another predicament for error robustness in compressed video communications. If a variable-length video parameter is corrupted by errors, the decoder will fail to figure out the original length of this parameter, thereby losing its synchronisation.

Figure 4.1 PSNR values at different error rates with and without motion vector prediction (Y-PSNR in dB against error percentage; curves: actual motion vector, motion vector prediction residual)

The effects of a bit error on the decoded video quality can be categorised into three different classes, as follows. In the first class, a single bit error on one video parameter does not have any influence on segments of video data other than the damaged parameter itself. In other words, the error is limited in this case to a single MB that does not take part in any further prediction process.
One example of this category is encountered when an error hits a fixed-length INTRADC coefficient of a certain MB which is not used in the coder motion prediction process. Since the affected MB is not used in any subsequent prediction, the damage will be localised and confined only to the affected MB. Moreover, the decoder will not lose synchronisation, since it has skipped the correct number of bits when reading the erroneous parameter before moving to the next parameter in the bit stream. This kind of error is the least destructive of the three to the quality of service.

The second type of error is more problematic because it inflicts an accumulative damage in both time and space due to prediction. When the prediction residual of motion vectors is sent, bit errors in motion code words propagate until the end of the frame. Moreover, the error propagates to subsequent INTER coded frames due to the temporal dependency induced by the motion compensation process. This effect can be mitigated if the actual MVs are encoded instead of the prediction residual. As illustrated in Figure 4.1 for the 30 frames of the Foreman sequence encoded at 30 kbit/s, the quality of the decoded picture can be improved for error rates higher than 10\ when the actual MV values are transmitted. At lower error rates, the quality drops slightly, since the compression efficiency is decreased when no MV prediction is used. The damage to the picture quality depends on the number of successive frames that are INTER coded following the bit error position. Thus, PSNR values tend to decrease with time due to error accumulation. This category of error is obviously more detrimental to the quality of decoded video than the first one; however, it does not cause any state of de-synchronisation, since the decoder flushes the correct number of bits when reading the erroneous motion code words.
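
To make the second class of error concrete, the following Python sketch codes a small MV field as residuals against the median of the left, top and top-right candidates, then corrupts a single residual. It is a minimal illustration under stated assumptions, not the H.263 procedure itself: treating out-of-frame candidates as (0, 0), the function names and the toy field are all choices made here for the example.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]

def predictor(mvs, r, c):
    """Component-wise median of the left, top and top-right MVs.
    Assumption: candidates outside the frame count as (0, 0)."""
    def get(rr, cc):
        if 0 <= rr < len(mvs) and 0 <= cc < len(mvs[0]):
            return mvs[rr][cc]
        return (0, 0)
    cands = [get(r, c - 1), get(r - 1, c), get(r - 1, c + 1)]
    return tuple(median3(*(m[k] for m in cands)) for k in (0, 1))

def encode(mvs):
    """Residual field: actual MV minus its median predictor."""
    return [[tuple(mvs[r][c][k] - predictor(mvs, r, c)[k] for k in (0, 1))
             for c in range(len(mvs[0]))] for r in range(len(mvs))]

def decode(residuals):
    """Rebuild MVs in raster order from residuals and decoded neighbours."""
    rows, cols = len(residuals), len(residuals[0])
    mvs = [[(0, 0)] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p = predictor(mvs, r, c)
            mvs[r][c] = (p[0] + residuals[r][c][0], p[1] + residuals[r][c][1])
    return mvs

# Toy 2 x 4 field of MVs (horizontal motion only), chosen for illustration.
field = [[(1, 0), (2, 0), (3, 0), (4, 0)],
         [(2, 0), (3, 0), (4, 0), (5, 0)]]
residuals = encode(field)

# Corrupt the horizontal residual of a single MB.
corrupted = [row[:] for row in residuals]
corrupted[1][0] = (corrupted[1][0][0] + 1, corrupted[1][0][1])

decoded = decode(corrupted)
damaged = [(r, c) for r in range(2) for c in range(4)
           if decoded[r][c] != field[r][c]]
print(damaged)  # [(1, 0), (1, 1), (1, 2)]
```

In this toy field one corrupted residual damages three MBs in the same row, whereas a directly coded MV would have damaged only its own MB. How far the damage spreads depends on whether the corrupted value ends up being the median of later candidate sets.
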
The worst effect of bit errors occurs when the synchronisation is lost and the decoder is no longer able to figure out to which part of a frame the received information belongs. This category of error is caused by the bit rate variability characteristic. When the decoder detects an error in a variable length code word (VLC), it skips all the forthcoming bits, regardless of their correctness, in the search for the first error-free synch word to recover the state of synchronisation. Therefore, the corruption of a single bit is transformed into a burst of channel errors. The occurrence of a bit error in this case is manifested in two different scenarios.

The first scenario arises when the corrupted VLC word results in a new bit pattern that is a valid word in the Huffman table corresponding to that specific parameter. In this case, the error cannot be detected. However, the resulting VLC word might be of a different length, causing the decoder to skip the wrong number of bits before moving forward to the next piece of information in the bit stream, thereby creating a loss of synchronisation. This situation remains until an invalid code word is detected, implying the occurrence of an error and causing the decoder to stop its operation and search for the next error-free synch word.

The second scenario appears when the corrupted VLC word (possibly in conjunction with subsequent bits) results in a bit pattern that is not deemed legitimate by the Huffman decoder. In other words, the decoder fails to detect any valid VLC word for a particular video parameter within a segment of the bit stream that corresponds to the maximum length of the corrupted code word. In this case, the decoder signals the occurrence of an error, skips all the forthcoming bits and resumes decoding at the next intact synch word. Figure 4.2 illustrates these two scenarios. Figure 4.3 demonstrates the importance of synchronisation of an H.263 decoder to the reconstructed video quality.
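
The two VLC corruption scenarios described above can be reproduced with a toy prefix code. The sketch below is only an illustration: the four-symbol code table is invented for the example and is not a table from H.263 or MPEG-4, and the decoder simply reports where it found an invalid word instead of searching for a synch word.

```python
# Hypothetical prefix code; the pattern "1111" is deliberately left unused.
CODE = {"A": "0", "B": "10", "C": "110", "D": "1110"}
DECODE = {bits: sym for sym, bits in CODE.items()}
MAX_LEN = max(len(bits) for bits in DECODE)

def encode(symbols):
    """Concatenate the code words of the given symbols."""
    return "".join(CODE[s] for s in symbols)

def decode(bits):
    """Return (symbols, error_pos). error_pos is the bit index where an
    invalid or incomplete word starts, or None if none was found."""
    symbols, word = [], ""
    for pos, bit in enumerate(bits):
        word += bit
        if word in DECODE:
            symbols.append(DECODE[word])
            word = ""
        elif len(word) == MAX_LEN:
            return symbols, pos - MAX_LEN + 1
    return symbols, (len(bits) - len(word) if word else None)

def flip(bits, i):
    """Invert the bit at index i."""
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

# Scenario 1: the corrupted word is still valid but has a different length,
# so the error goes undetected and the symbol boundaries shift.
print(decode(flip(encode("BAC"), 0)))  # (['A', 'A', 'A', 'C'], None)

# Scenario 2: the corrupted bits match no code word within MAX_LEN bits,
# so the decoder can flag an error.
print(decode(flip(encode("ADB"), 4)))  # (['A'], 1)
```

In the first case three symbols were sent and four were decoded without any error being raised; in the second the decoder stops after one symbol, and in a real decoder everything up to the next synch word would be discarded.
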
The H.263 decoder is modified in a way that ensures resynchronisation just after the position of error. Therefore, the decoder is able to detect an error in a video parameter and look for the next error-free synch word. In other words, only video parameters such as MVs and DCT coefficients are corrupted without the decoder losing its synchronisation (Figure 4.2(b)). Administrative information such as COD, MCBPC, CBPY, synch word, etc., affects the synchronisation of the decoder although it might be fixed-length coded. If one of these control parameters is corrupted by errors, there is no means for the video decoder to detect it until it falls on an invalid Huffman code word later in the bit stream. This loss of synchronisation leads to a dramatic drop of perceptual quality. It is evident that, with maintained synchronisation, the average PSNR values are significantly higher for error rates above 10\ , again for the Foreman sequence encoded with H.263 at 30 kbit/s. Consequently, the synchronisation information is very sensitive to errors and hence crucial for the correct decoding of a compressed video stream. Therefore, a block-based video decoder must be made robust enough to detect the channel errors and resynchronise at the correct bit pattern very quickly and with minimal quality loss.

Figure 4.2 Bit errors leading to loss of synchronisation in the video decoder

4.3 Error Concealment Techniques (Zero-redundancy)

Error concealment or post-processing error control consists of a mechanism by which only the decoder fulfills the task of error control (Wang and Zhu, 1998). The encoder does not add any redundant bits onto the application layer coded stream for error protection purposes. On the other hand, no transmission or transport level mechanism is adopted in these techniques to reduce the severity of artefacts resulting from transmission errors.
Figure 4.3 PSNR values at different error rates with and without loss of synchronisation (Y-PSNR in dB against error percentage; curves: motion and DCT, all parameters)

Error concealment techniques are purely decoder-based, whereby the video decoder attempts to benefit from previously received error-free video information for the approximate recovery of lost or erroneous data without relying on additional information from the encoder. Some error concealment techniques are combined with other error control schemes to provide an interactive error handling mechanism in a video communication system (Wada, 1989). In this approach, the encoder relies on some kind of feedback channel signalling from the decoder that includes information about the corrupted MBs. In addition to post-processing error concealment, the encoder contributes to the error control mechanism by avoiding the use of damaged MBs in any further prediction process. However, in this section, we limit the discussion of error concealment to those techniques that are strictly decoder-based and hence redundancy-free.

In these error concealment algorithms, several techniques such as spatial and temporal interpolation, filtering and smoothing of available video data could be employed to estimate and sometimes predict missing video information such as coded shape data (Shirani, Erol and Kossentini, 2000), motion vectors, transform coefficients and administrative bits (Chhu and Leou, 1998; Lam and Reibman, 1995). For an error concealment technique to be activated, an error detection mechanism is required to indicate to the decoder the occurrence of errors. In the previous section, it was shown that the error detection is signalled by the loss of synchronisation due to error-corrupted VLC parameters.
In addition to loss of synchronisation, the video decoder claims an error when the number of AC coefficients of any 8 × 8 block of pixels is found to have exceeded 63 or when the decoded MV component or quantisation parameter is outside the acceptable range ([1,31] for the latter). However, transmission errors could also be detected using transport level headers, such as checksums, parity bits and cyclic redundancy check (CRC) codes for bit errors, or sequence numbers, temporal references, etc., for packet erasures. These codes are attached normally to packets, as defined by the transport protocol, and their values are indicators as to whether transmission errors have occurred.

Error concealment techniques take advantage of the fact that the human eye tolerates distortion in the high-frequency components of a video frame more than in the low-frequency components. Some techniques rely on multi-layer video coding to send low-frequency DC coefficients and motion vectors in the base layer and high-frequency AC coefficients in the enhancement layer (Kieu and Ngan, 1994). When the high-frequency components of the more error-prone enhancement layer are corrupted, the concealment technique recovers their values by using the DCT coefficients of the corresponding motion-compensated MBs in the previous frame. All of these techniques, however, make use of the spatial and/or temporal correlations between damaged MBs and their neighbouring MBs in the same and/or previous frame to achieve concealment (Lam and Reibman, 1995). Some of these techniques apply to INTRA coded MBs to recover the INTRADC coefficients of error-affected MBs, whereas other techniques apply only to INTER coded MBs to recover the corresponding motion data. Techniques have been proposed for the error concealment of the damaged shape data of MPEG-4 video coded sequences (Shirani, Erol and Kossentini, 2000).
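
The decoder-side syntax checks mentioned in this discussion (more than 63 AC coefficients in a block, a quantisation parameter outside [1,31], an MV component out of range) can be sketched as a small validation routine. The function name and the default MV range of -16 to 15.5 pixels are assumptions made for this example; the text above does not state the MV limits.

```python
def detect_syntax_errors(num_ac_coeffs, quant, mv, mv_range=(-16.0, 15.5)):
    """Return a list of reasons why decoded values look corrupted.

    num_ac_coeffs: AC coefficients decoded so far for one 8 x 8 block
    quant:         decoded quantisation parameter
    mv:            decoded (horizontal, vertical) MV components
    mv_range:      assumed legal range for each MV component
    """
    errors = []
    if num_ac_coeffs > 63:
        errors.append("too many AC coefficients")
    if not 1 <= quant <= 31:
        errors.append("quantiser out of range")
    if any(not mv_range[0] <= comp <= mv_range[1] for comp in mv):
        errors.append("motion vector out of range")
    return errors

print(detect_syntax_errors(10, 8, (1.5, -3.0)))   # []
print(detect_syntax_errors(64, 0, (20.0, 0.0)))
# ['too many AC coefficients', 'quantiser out of range',
#  'motion vector out of range']
```

A non-empty result would be the trigger for concealment; checks of this kind only catch errors that produce illegal values, which is why they complement rather than replace transport-level codes.
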
Error concealment methods attempt to reduce the visual artefacts in segments of a video stream that lie between two error-free synch words. If a synch word is inserted at the beginning of each GOB, then a damaged MB leads to the corruption of a whole slice of video. In this case, error concealment must be applied to reduce the effects of errors on the whole slice rather than on the affected MB only. In some transport schemes, the order of transmission of coded MBs is changed by means of interleaving. Despite the processing delay incurred by this technique and controlled by the interleaving depth, the use of interleaving allows the errors to disperse within the spatial area of a video frame, hence causing damage only to spatially disjointed blocks and reducing the likelihood of damaging a whole row of MBs. The choice of interleaving depth is therefore a trade-off between the associated delay and the spreading factor of error-affected MBs, and hence the efficacy of the concealment technique.

4.3.1 Recovery of lost MVs and MB coding modes

If the coding mode of the damaged MB is known to be INTER, then the simplest concealment method is to replace the erroneous MB by the spatially coinciding MB in the previous frame. This technique, despite its simplicity, might sometimes prove inefficient, as it leads to some inaccurate concealment results (annoying visual artefacts) especially in the presence of large motion in the video scene. Alternatively, if the motion data has been received free of errors, then the affected MB could be replaced by the motion-compensated MB, i.e. the MB pointed at by the actual motion vector of the lost MB. The latter technique can give good concealment results when error-free motion data is available.
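
Both replacement strategies just described amount to copying a block from the previous frame, either from the same position or from the position the MV points at. The sketch below assumes frames stored as lists of pixel rows, integer-pixel MVs given as (dx, dy) offsets into the previous frame, and clamping at the frame border; these conventions and the function name are choices made for the example.

```python
def conceal_mb(cur, prev, top, left, mv=(0, 0), size=16):
    """Overwrite the size x size block of `cur` at (top, left) with the
    block of `prev` displaced by mv. mv=(0, 0) gives plain replacement
    by the spatially coinciding MB."""
    height, width = len(prev), len(prev[0])
    dx, dy = mv
    for y in range(size):
        for x in range(size):
            # Clamp the source position so it stays inside the frame.
            src_y = min(max(top + y + dy, 0), height - 1)
            src_x = min(max(left + x + dx, 0), width - 1)
            cur[top + y][left + x] = prev[src_y][src_x]
    return cur

# Tiny 4 x 4 "frames" with 2 x 2 blocks, purely for illustration.
prev = [[r * 4 + c for c in range(4)] for r in range(4)]
cur = [[0] * 4 for _ in range(4)]

conceal_mb(cur, prev, 0, 0, mv=(0, 0), size=2)
print(cur[0][:2], cur[1][:2])  # [0, 1] [4, 5]

conceal_mb(cur, prev, 0, 0, mv=(1, 1), size=2)
print(cur[0][:2], cur[1][:2])  # [5, 6] [9, 10]
```

Half-pixel MVs would additionally need interpolation of the reference pixels, which is left out here.
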
However, in many circumstances, the motion vector of the error-damaged MB is also corrupted by transmission errors, and therefore the recovery of the erroneous MV is necessary for the reconstruction of the damaged INTER coded MB. This situation gets even worse when the coded/uncoded flag (COD) and/or the modes of coded MBs are also corrupted. If the motion data of a particular MB is corrupted, the simplest technique to restore its MV is to force a zero vector. This is equivalent to assuming that the spatially corresponding MB in the previous frame was the best match MB in the motion estimation process at the encoder. If the transform coefficients of the damaged MB have also been corrupted by errors, then error concealment is similar to replacing the erroneous MB by the spatially coinciding MB in the previous frame as indicated above. This method gives good concealment results in relatively small motion video sequences. Another method is to replace the lost MV by the MV of the spatially corresponding MB in the previous frame. A third method suggests using the average of MVs from the spatially adjacent MBs. However, if an MB is damaged by errors, adjacent MBs to the right (H.261) and below (H.263 and MPEG-4) are also affected due to motion prediction which uses three candidate predictors, as described in Chapter 2. Therefore, the MVs of only the left and top neighbouring MBs are used in the error concealment process. In some cases, instead of using the average, the median of MVs of spatially adjacent MBs is used to predict the lost or error-damaged MV. It has been found through experimentation that the last method produces the best reconstruction results of all the available MV recovery methods (Narula and Lim, 1993).
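
The four MV recovery methods listed above (zero vector, MV of the spatially corresponding MB in the previous frame, average of neighbouring MVs, median of neighbouring MVs) can be written down side by side. This is a minimal sketch: the function name, the dictionary keys and the sample MVs are invented for the example, and the caller is assumed to pass only neighbours whose MVs were received intact.

```python
from statistics import median

def candidate_mvs(neighbours, prev_collocated):
    """Build the candidate replacements for a lost MV.

    neighbours:      MVs (dx, dy) of undamaged spatially adjacent MBs
    prev_collocated: MV of the spatially corresponding MB in the previous frame
    """
    xs = [mv[0] for mv in neighbours]
    ys = [mv[1] for mv in neighbours]
    return {
        "zero": (0, 0),
        "previous_frame": prev_collocated,
        "average": (sum(xs) / len(xs), sum(ys) / len(ys)),
        "median": (median(xs), median(ys)),
    }

cands = candidate_mvs([(2, 0), (4, 2), (9, 1)], prev_collocated=(3, 1))
print(cands["average"])  # (5.0, 1.0)
print(cands["median"])   # (4, 1)
```

The example shows why the median can be preferable: the outlying neighbour (9, 1) pulls the average to 5.0 horizontally but leaves the median at 4. Rounding the average to the codec's MV precision is not handled in this sketch.
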
Optimal concealment techniques combine these four methods and choose the method that leads to the smallest boundary matching error (sum of boundary variations between recovered MB and neighbouring ones). A more sophisticated technique for recovering a lost MV consists of predicting its value from MVs of spatially adjacent MBs in the previous frame. The MV that best moves its corresponding MB in the direction of the damaged MB (MB with lost MV) is used as the value of the lost MV. This method is based on the assumption that if a portion of the picture in the previous frame is moving into the direction of the damaged MB then it is likely that it will continue to move in the same direction into the next frame. This method obviously fails when errors occur on the edge blocks or the boundaries of an object. Figure 4.4 shows the subjective quality obtained by three different MV recovery techniques. On the other hand, if the coding mode is damaged, the affected MB is treated as an INTRA coded block. The MB is then recovered using information from spatially adjacent undamaged MBs only. The reason for that is to avoid any error in predicting a coding mode in such cases as a scene change, for instance.

Figure 4.4 One-hundredth frame of Foreman coded with H.263 and subject to random errors with BER = 0.01 per cent: (a) no concealment, (b) zero-MV technique, (c) MV of spatially corresponding MB in previous frame, (d) MV of MB in previous frame that best moves in the direction of the lost MV

4.3.2 Recovery of lost coefficients

Lost coefficients in a damaged block can be interpolated from spatially corresponding coefficients in adjacent blocks. One method is to interpolate each lost coefficient from its corresponding coefficients in its four neighbour blocks. When only some coefficients in a block are damaged, coefficients in the same block could be used for the interpolation of the lost coefficient value.
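
Interpolating each lost coefficient from the corresponding coefficients of the neighbouring blocks can be sketched as a coefficient-by-coefficient average. Equal weighting of the neighbours is an assumption made here for simplicity, as is the function name; the text does not specify the interpolation weights.

```python
def interpolate_block(neighbour_blocks):
    """Estimate a lost coefficient block as the equal-weight average of the
    coefficients at the same position in each available neighbour block."""
    count = len(neighbour_blocks)
    size = len(neighbour_blocks[0])
    return [[sum(block[u][v] for block in neighbour_blocks) / count
             for v in range(size)]
            for u in range(size)]

def constant_block(value, size=8):
    """Helper for the demo: an 8 x 8 block filled with one value."""
    return [[value] * size for _ in range(size)]

# Four neighbouring blocks (top, bottom, left, right) with made-up values.
neighbours = [constant_block(8), constant_block(4),
              constant_block(0), constant_block(4)]
estimate = interpolate_block(neighbours)
print(estimate[0][0], estimate[7][7])  # 4.0 4.0
```

Because the same weights are applied at every coefficient position, this kind of frequency-domain averaging corresponds to averaging the neighbouring blocks pixel by pixel.
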
However, if all coefficients of a block are lost then this frequency-domain interpolation is equivalent to interpolating each pixel in the block from the corresponding pixels in four adjacent blocks rather than the nearest available pixels. Since the pixels used for interpolation are eight pixels away from the lost pixel value in four separate directions, the correlation between these pixels and the missing pixel is likely to be small, and therefore the interpolation may not be accurate. To improve the prediction accuracy, the missing pixel values could be interpolated from the four one-pixel wide boundaries of the damaged MB. The pixels in all of the four one-pixel wide boundaries could be used, or alternatively only those pixels in the two nearest boundaries, as shown in Figure 4.5. The spatial interpolation of lost coefficients is more suitable for INTRA coded blocks. For INTER coded blocks, the interpolation does not yield accurate results, since the high-frequency DCT coefficients of prediction errors in adjacent blocks are not highly correlated. Consequently, in INTER coded blocks only the zero-frequency DC coefficient and the lowest five non-zero frequency AC coefficients are estimated from the top and bottom neighbouring blocks, while the rest of the AC coefficients are all set to zero.

Figure 4.5 Error concealment of lost coefficients by spatial interpolation: (a) using pixels from four one-pixel wide boundaries, (b) using pixels from the nearest two one-pixel wide boundaries

4.4 Data Partitioning

To limit the effect of synchronisation loss on the decoded video quality, synch words are inserted in the video bit streams at regular fixed intervals. Unlike the core ITU-T H.263 standard which places synch words at the beginning of a frame or GOB, MPEG-4 streams are divided into a number of packets starting with a synch word and containing a regular number of bits.
Figure 4.6 shows the difference between the packet structures of H.263 and MPEG-4. Similarly to block-based video coders, the effects of errors on object-oriented compressed video streams depend on the type of the corrupted video parameter and the sensitivity of this parameter to errors. However, object-based video coded streams contain shape data, hence their increased vulnerability to errors. Since video data parameters have different sensitivities to errors, as established in Section 3.7, improvements in the error robustness of MPEG-4 could be achieved by separating the video data into two parts (Talluri, 1998). The shape and motion data of each video packet (VOP) is placed in the first partition, while the less sensitive texture data (AC TCOEFF) is placed in the second partition. The two partitions are separated by a resynchronisation code which is called a motion marker in INTER coded VOPs or a DC marker in INTRA coded VOPs. This [...]... beginning of each video packet. When the HEC flag is set, the header information is repeated in the video packet. If the header information at the beginning of the video packet matches the header information at the beginning of the video frame, the decoder assumes that header information has been correctly received. However, if the header data in the video frame is corrupted then the enclosed video data can... error protection (UEP). Since the video parameters of block-based and object-based video compression algorithms present different sensitivities to errors and different contributions to overall decoded quality, unequal error protection could be used for robust yet bandwidth-efficient video transmissions (Horn et al., 1999). As the name implies, UEP consists of protecting video data in unequal proportions...
Due to the variable length of video parameters in a compressed bit stream, the error-protected VLC word results in another variable length code. Consequently, if the channel decoder is unable to handle the error(s) affecting a particular VLC word, the video decoder loses synchronisation, since it finds no way to identify the original size of the corrupted video parameter... synch words into video packets: (a) H.263, (b) MPEG-4 synchronisation code is different from the code at the beginning of a video packet. The first partition is preceded by a synch code that indicates the start of a new VOP. This MPEG-4 video data structure is illustrated in Figure 4.7. The data-partitioning scheme enables the video decoder to restore the error-free motion and shape data of a video packet when... the video decoder has to skip all forthcoming bits in the stream until it resynchronises on finding the next error-free synch word. This results in a huge waste of bandwidth, resulting from discarding all the error-protected parameters in the skipped video segment, thereby reducing the efficiency of the employed FEC scheme. Because of their sensitivity to errors, motion vectors produced by block-based video. .. standardisation process.

4.5.2 Cyclic redundancy check (CRC)

FEC data could be inserted into a video stream for a variety of reasons. One reason is to enhance the robustness of video data to channel errors, as demonstrated above. Another reason is to aid the synchronisation at the decoder by inserting synch words at the beginning of each video packet or fixed-length segment. Despite the quality improvement, the insertion...
To reduce the accumulative effect of errors in a video sequence, the probability of error in a MV component should be minimised. The error resilience of a video bit stream could thus be improved by duplicating the MV data at different locations in the stream. Consequently, the probability of receiving an erroneous MV data bit can be reduced. To enable the video decoder to locate the duplicate motion data... contribution to overall video quality. In this case, the motion data and the transform coefficients of a block-based compressed video stream receive the same level of protection. This process makes the protection of highly sensitive data, such as motion vectors, less efficient, while leading to unnecessary waste of bandwidth by overprotecting less important data. To solve this problem, the video data parameters... performance and the error-free video quality is to make the coding rate of an FEC scheme adaptable to varying network conditions. One way of achieving this compromise is to use the rate-compatible punctured convolutional (RCPC) codes that are covered in the following subsection. FEC techniques normally apply equal error protection (EEP) onto various video parameters. In other words, the video parameters are protected... on the error robustness of an object-oriented video coder needs to be determined. The Stefan sequence is used here to analyse the error sensitivity of data in the first and second partitions of an MPEG-4 video packet. Stefan is a CIF (352 × 288) 30 frames/s fast-moving sequence that features a tennis player in the middle of a rally with two objects in the video scene, the player (foreground) and the background.

Date posted: 24/10/2013, 15:15
