Video Encryption Technology and Application 1616687347, 9781616687342


English Pages [110] Year 2014

Table of contents :
Title page
Contents
Preface
Introduction
Overview of Video Encryption
2.1. Definition and Performance Requirement
2.2. Overview of Cryptology
2.3. Overview of Video Coding
2.4. Digital Video Encryption Algorithm
Video Encryption Algorithms with Format Compliance
3.1. Definition and Significance of Format Compliance
3.2. Principles for Format Compliance
3.3. Video Encryption with Format Compliance
Visual Security Assessments for Visual Media
4.1. Definition
4.2. Visual Security Assessment Methods
4.3. Introduction to Three Kinds of Objective Visual Security Assessments
Content-Based Multi-Level Encryption and Multi-Level Authorization
5.1. Definition and Significance
5.2. Content-Based Multi-Level Encryption and Authorization Method
5.3. Multi-Level Encryption and Authorization Model Based on Remote Sensing Images
Conclusion
References
Index

Copyright 2010. Nova Science Publishers, Inc. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.

EBSCO Publishing : eBook Collection (EBSCOhost) - printed on 2/26/2021 12:56 PM via ISTANBUL TICARET NIVERSITESI AN: 359039 ; Xu, Zhengquan, Sun, Jing.; Video Encryption Technology and Application Account: s3468732

Media and Communications Technologies, Policies and Challenges

VIDEO ENCRYPTION TECHNOLOGY AND APPLICATION

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.

EBSCOhost - printed on 2/26/2021 12:56 PM via ISTANBUL TICARET NIVERSITESI. All use subject to https://www.ebsco.com/terms-of-use

MEDIA AND COMMUNICATIONS TECHNOLOGIES, POLICIES AND CHALLENGES

Intelligibility Research and Communication Issues in Emergency Situations, Samuel A. Fletcher (Editor), 2010. 978-1-61668-634-5
Videoconferencing: Technology, Impact and Applications, Adam C. Rayler (Editor), 2010. 978-1-61668-285-9
Mediating Health: The Powerful Role of the Media, Deborah Begoray, Mimi Cimon and Joan Wharf Higgins (Authors), 2010. 978-1-61668-324-5
Video Encryption Technology and Application, Zhengquan Xu and Jing Sun (Authors), 2010. 978-1-61668-331-3
Information and Communication Technologies Policies and Practices, Almas Heshmati and Sun Peng (Editors), 2010. 978-1-60876-671-0
Noise-Induced Hearing Loss in Youth Caused by Leisure Noise, Hannah Keppler, B. Vinck and I. Dhooge (Authors), 2010. 978-1-61668-200-2
Reality Television – Merging the Global and the Local, Amir Hetsroni (Editor), 2010. 978-1-61668-315-3


Videoconferencing: Technology, Impact and Applications, Adam C. Rayler (Editor), 2010. 978-1-61668-407-5
Noise-Induced Hearing Loss in Youth Caused by Leisure Noise, Hannah Keppler, B. Vinck and I. Dhooge (Authors), 2010. 978-1-61668-460-0
Spectrum Issues for the New Communications Age, Caroline D. Underwood (Editor), 2010. 978-1-61668-544-7
Mediating Health: The Powerful Role of the Media, Deborah Begoray, Mimi Cimon and Joan Wharf Higgins (Authors), 2010. 978-1-61668-614-7
Spectrum Issues for the New Communications Age, Caroline D. Underwood (Editor), 2010. 978-1-61668-725-0
Intelligibility Research and Communication Issues in Emergency Situations, Samuel A. Fletcher (Editor), 2010. 978-1-61668-732-8
Video Encryption Technology and Application, Zhengquan Xu and Jing Sun (Authors), 2010. 978-1-61668-734-2


Media and Communications Technologies, Policies and Challenges

VIDEO ENCRYPTION TECHNOLOGY AND APPLICATION

ZHENGQUAN XU AND JING SUN

Nova Science Publishers, Inc. New York


Copyright © 2010 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher.

For permission to use material from this book please contact us: Telephone 631-231-7269; Fax 631-231-8175 Web Site: http://www.novapublishers.com

NOTICE TO THE READER

The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers' use of, or reliance upon, this material.

Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication.

This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA

Available upon Request

ISBN: 978-1-61668-734-2 (eBook)

Published by Nova Science Publishers, Inc.  New York


CONTENTS

Preface
Chapter 1  Introduction
Chapter 2  Overview of Video Encryption
Chapter 3  Video Encryption Algorithms with Format Compliance
Chapter 4  Visual Security Assessments for Visual Media
Chapter 5  Content-Based Multi-Level Encryption and Multi-Level Authorization
Chapter 6  Conclusion
References
Index


PREFACE

Video encryption technology combines cryptography and video technology. Because of its particular requirements and the special properties of video data, video encryption has become a specialized research branch of the data encryption field. This chapter introduces research methods and technical solutions related to video encryption. It gives a thorough description of video encryption techniques, including their performance requirements, the principles for designing a secure video encryption algorithm, the primary encryption algorithms and their analysis, the latest research achievements, performance evaluation, and novel applications. Finally, the open problems and potential research areas of video encryption are discussed.


Chapter 1

INTRODUCTION

Zhengquan Xu, Jing Sun and Jin Liu

With the rapid development of information technology, applications of digital video have developed rapidly and become increasingly widespread. They have penetrated into people's work and daily life, and into all aspects of economic and political life. Video information encryption and content protection have become a real and pressing need [1, 2]. Compared with general data, video data are huge in volume and complex in structure, and often need to be processed in real time or online [3]. An ideal video encryption technology should not only ensure security and computational efficiency (to ensure real-time operation), but also maintain format compliance and keep the compression ratio unchanged, because encryption is generally carried out in the compressed domain [4, 5]. Therefore, video information encryption has become a specialized research branch of the data encryption field.

This chapter introduces research methods and technical solutions related to video encryption. Firstly, video encryption itself is reviewed, including its requirements, subjects and open problems, and its development and trends. Secondly, the frameworks and schemes of the various video encryption algorithms that have been developed are introduced, together with their advantages and disadvantages. The introduction focuses on the dominant selective encryption algorithms and scrambling encryption algorithms. It also analyzes and discusses the difficulties and solutions of video encryption based on variable-length codewords and of pseudorandom sequence generation. Thirdly, visual security assessment methods for cipher-images, based on the unintelligibility of cipher-images, are presented together with their significance in video encryption research. Content-based multi-level encryption and multi-level authorization methods for video images, and their models in a remote sensing application, are then introduced. Finally, the chapter summarizes current research on video encryption, predicts the prospects for its development, and points out the potential for further research.


Chapter 2

OVERVIEW OF VIDEO ENCRYPTION

2.1. DEFINITION AND PERFORMANCE REQUIREMENT

With the rapid development of computer network technology and the maturing of the MPEG/H.26x standards, applications based on video communication can be seen everywhere and have become an indispensable part of daily life. Video data are transmitted over insecure channels and are extremely vulnerable to attack. Given the urgent need for copyright and privacy protection, sensitive video data need to be protected before transmission or distribution. Video data encryption is the basic means of ensuring security. Video encryption means adopting classical or novel encryption algorithms to protect video content.

Generally, a video encryption system is composed of several components, as shown in Figure 1. The original video content is transformed into encrypted video content by the encryption algorithm under the control of the encryption key. Similarly, the encrypted video content is decrypted into the original video content by the decryption algorithm under the control of the decryption key. In addition, attacks may be mounted to break the system and obtain the original video content. Most research focuses on efficient encryption and decryption algorithms that are secure against attacks.

Research on practical video data encryption technology should be based on a full analysis, and full use, of the characteristics of the video source. The main features of a video streaming source are a huge amount of data, high redundancy, strict real-time requirements, and certain operational functions of the compressed video data, such as data location indexing and coding rate control. These characteristics mean that video data encryption should meet the following requirements [4, 5].

Security

Security is the basic requirement of video content encryption. For video encryption, it is generally accepted that an encryption algorithm may be regarded as secure when the cost of breaking it is larger than the price paid for the video content. Video data can be treated as ordinary binary data, so classical cryptographic algorithms can be used for video encryption. Because video content involves a large amount of data, an attacker can hardly avoid a large number of decoding operations when trying to decipher it, which greatly increases the difficulty of deciphering. Therefore, provided that security is ensured, some special and fast encryption algorithms can also be used. In addition, unlike text or binary data encryption, video encryption requires both cryptographic security and visual security. The former refers to security against cryptographic attacks [2, 6], and the latter means that the encrypted video content is unintelligible to human perception. Visual security based on cipher-images will be discussed in detail in Section 4.

Compression Ratio

If the data size remains unchanged before and after encryption, the algorithm is said to have compression ratio invariance. An encryption algorithm with this property changes neither the storage space required nor the transmission speed. Therefore, an ideal video encryption algorithm should have compression ratio invariance.

Real-Time

Because video data must be transmitted and accessed in real time, encryption and decryption algorithms cannot introduce too much delay. Therefore, video encryption algorithms should be efficient enough to meet the requirements of real-time interaction.


Format Compliance

Format compliance means that the format information in the video data remains unchanged before and after encryption, which guarantees that the video ciphertext can still be decoded successfully by a standard decoder without correct decryption. This has many advantages: format compliance makes time-based positioning within the video data possible, and can also support many functions such as insert, delete, cut and paste operations.

Video Information Direct-Operability

Direct operability of video information means that specific operations can still be performed on the encrypted video in the same way as before encryption, without first decrypting, performing the operation, and then encrypting again. Such operations include video encoding and decoding, multiple compressions, scalable coding, and interactive functions such as fast forward and fast backward. Their common feature is that they operate directly on compressed video information in a standard format.
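The compression ratio invariance requirement described above can be illustrated with a small sketch. It assumes a toy stream cipher (a SHA-256 counter-mode keystream XORed with the data), which is an illustration only and not one of the algorithms discussed in this book; the names `keystream`, `xor_cipher` and the sample data are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only, not a vetted cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (the operation is its own inverse)."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

compressed_video = bytes(range(256)) * 4   # stand-in for a compressed bitstream
key = b"demo-key"
cipher = xor_cipher(compressed_video, key)

# The ciphertext has exactly the size of the compressed data,
# so storage space and transmission time are unchanged.
print(len(compressed_video), len(cipher))
```

A stream-style cipher applied after compression keeps the size unchanged, whereas a scheme that adds padding or headers would not; format compliance is a separate property that this sketch does not address.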

2.2. OVERVIEW OF CRYPTOLOGY

Cryptology includes cryptography and cryptanalysis [2, 7]. The former is the science and technology of keeping information confidential; the latter is the science and technology of deciphering ciphertext.

A classical cryptographic system usually relies on keeping the encryption algorithm itself secret in order to keep information confidential. Such systems offer weak confidentiality and limited options, and cannot be standardized or have their security controlled. A modern cryptographic system is based on keyed algorithms: its security relies on the secrecy of the key, while the encryption algorithm can be made public and standardized, and security is easily controlled through the key.

Modern cryptographic systems mainly include symmetric ciphers and asymmetric ciphers. Symmetric ciphers can be divided into stream ciphers and block ciphers. In a symmetric cipher, the encryption key and the decryption key can be derived from each other, so key management is critical. Well-known symmetric ciphers include DES [8], AES [9] and IDEA [10]. In an asymmetric cipher, the encryption key can be made public, while the decryption key needs to be kept confidential. It is difficult to calculate the decryption key from the encryption key. Well-known asymmetric ciphers include RSA [11], ECC [12] and ElGamal [13].

Cryptanalysis focuses on methods for analyzing or breaking cryptographic schemes. It provides common and special methods for analyzing the security of a cryptosystem. Generally, a cryptosystem should withstand all known cryptanalysis methods before it is used in practice.

Classical encryption algorithms provide the research foundation for video encryption algorithms. Video data can be treated as general binary data and encrypted with classical encryption algorithms, but unlike text data, video data are enormous in volume and often processed in real time. Therefore, applying classical encryption algorithms directly is not appropriate for video data. If video data are treated as ordinary data and encrypted directly with classical algorithms, the huge amount of computation cannot satisfy real-time requirements, and the encryption damages the video format, so that other characteristics of the video information cannot be preserved.

To reduce as far as possible the amount of video data that must be encrypted, researchers have turned to using block ciphers or stream ciphers to encrypt only part of the information. This approach is the mainstream research direction in video encryption; it allows a balance to be struck between the amount of information encrypted and confidentiality. However, a compressed video stream contains a large number of variable-length codewords that represent almost all of the video content. If these codewords are encrypted directly, in the vast majority of cases they are turned into illegal codewords that do not exist in the code table of the standard, which inevitably results in an incompatible format.

To achieve format compliance of the video stream, one approach is first to select stream data of the same type and then to permute them randomly using a scrambling table generated from a random sequence. This approach is also one of the hot topics in video encryption. One problem is that the amount of same-type information is usually relatively small and the scrambling space is limited, so the scheme is relatively easy to attack. In addition, ensuring security requires the scrambling table to be replaced regularly, so problems of synchronizing keys between the encryption and decryption sides may arise.

Video encryption requires both cryptographic security and visual security, which distinguishes it from the encryption of ordinary data. Therefore, building on the study of classical encryption algorithms, special video encryption algorithms need to be researched that suit the characteristics of video data.
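The contrast between encrypting variable-length codewords directly and scrambling data of the same type can be sketched as follows. The code table `VLC_TABLE` is hypothetical (it is not taken from any video standard), the XOR mask is chosen by hand to show the failure case, and Python's `random` module stands in for the random sequence generator; it is not cryptographically secure.

```python
import random

# Hypothetical VLC table: an incomplete prefix code, so some bit patterns are illegal.
VLC_TABLE = {"A": "00", "B": "01", "C": "100", "D": "101", "E": "1100"}
LEGAL = set(VLC_TABLE.values())

def xor_bits(bits: str, mask: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join(str(int(b) ^ int(m)) for b, m in zip(bits, mask))

def scramble(codewords, key):
    """Permute codewords of the same type with a key-seeded scrambling table."""
    order = list(range(len(codewords)))
    random.Random(key).shuffle(order)
    return [codewords[i] for i in order], order

def unscramble(scrambled, order):
    """Undo scramble(); a receiver would regenerate `order` from the shared key."""
    restored = [None] * len(scrambled)
    for pos, src in enumerate(order):
        restored[src] = scrambled[pos]
    return restored

stream = [VLC_TABLE[s] for s in "ABECDAEB"]

# Direct encryption of one codeword: "1100" XOR "0011" is not in the table.
broken = xor_bits(VLC_TABLE["E"], "0011")
print(broken, broken in LEGAL)

# Scrambling only reorders legal codewords, so every codeword stays legal.
scrambled, order = scramble(stream, key=2010)
print(all(cw in LEGAL for cw in scrambled))
```

The sketch also hints at the weakness noted above: with only eight codewords the scrambling space is small, and the multiset of codewords is visible to an attacker.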


2.3. OVERVIEW OF VIDEO CODING

Original video data contain different types of redundant information, including spatial redundancy, temporal redundancy and visual redundancy. The basic idea of video coding is to apply a series of techniques that reduce this redundancy so that, under the requirements and conditions of a given application, the best possible image quality is obtained with the least amount of video data.

[Figure 1: block diagram with the labels Original video content, Encryption key, Encryption algorithm, Video content cipher, Decryption key, Decryption algorithm, Decrypted video content, Attacks.]

Figure 1. Architecture of video image encryption and decryption.

Almost all international video coding standards use a hybrid coding system that combines block-based predictive coding and transform coding. The basic principle of video encoding is shown in Figure 2. Image preprocessing includes image format conversion, image segmentation and image denoising. This step first converts the video image into a form with separate luminance and chrominance, subsamples the chrominance information appropriately to remove the visual redundancy of chrominance, and then divides a complete image into a large number of small image blocks for ease of encoding. Predictive coding can be inter-prediction across several image frames or intra-prediction within one image, and mainly eliminates temporal and spatial redundancy. Transform coding transforms data from the spatial domain to the frequency domain and is usually used together with quantization in order to reduce visual redundancy. Quantization is the main source of image distortion. Entropy coding converts the quantized coefficients into an output bit stream by means of variable-length coding (VLC). This process can greatly reduce the amount of data needed to represent video images.

In video communication and storage applications, compatibility problems are resolved by establishing unified standards. Industry standardization is a prerequisite for the success of commercial applications and a driving force for their popularity. In recent years, the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) have developed a number of video coding standards. The former are commonly referred to as Recommendations and are titled H.26x (including H.261, H.262, H.263 and H.264); the latter are known as MPEG-x (including MPEG-1, MPEG-2 and MPEG-4). The ITU-T Recommendations are targeted at real-time communication, such as video telephony and video conferencing, while the MPEG video standards are mainly intended for video storage and video broadcasting applications.

[Figure 2: block diagram. Original video sequence → Preprocessing → Predictive coding → Transform coding → Quantization → Entropy coding → Post-compression codestream.]

Figure 2. Basic principle of video coding.
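The transform and quantization stages described above can be sketched on a single row of eight samples. This is a minimal illustration, assuming an orthonormal one-dimensional DCT-II and a uniform quantizer with a hypothetical step of 10; real codecs use two-dimensional block transforms and standard-specific quantization, and the sample values are invented.

```python
import math

N = 8

def dct(x):
    """Orthonormal 1-D DCT-II of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                               for n in range(N)))
    return out

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        s = c[0] * math.sqrt(1 / N)
        s += sum(math.sqrt(2 / N) * c[k] * math.cos(math.pi * (n + 0.5) * k / N)
                 for k in range(1, N))
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform quantization: the only lossy step in this sketch."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [q * step for q in levels]

row = [52, 55, 61, 66, 70, 61, 64, 73]        # one row of pixel samples (invented)
coeffs = dct(row)
levels = quantize(coeffs, step=10)             # small integers passed to entropy coding
decoded = idct(dequantize(levels, step=10))

lossless = idct(coeffs)                        # transform alone is invertible
print(levels)
print([round(v, 1) for v in decoded])
```

The transform by itself reconstructs the row exactly (up to floating-point error); the distortion in `decoded` comes entirely from quantization, which matches the statement that quantization is the main source of image distortion.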

The latest video standard, H.264, also called MPEG-4 Part 10, has been developed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO Moving Picture Experts Group (MPEG), working together as the Joint Video Team (JVT). It is a coding standard for natural video oriented toward multiple bit rates: it can be applied to high bit-rate SDTV, HDTV and digital storage systems, and can also be used in low bit-rate real-time communication systems. Owing to its better performance than other standards, it is called the next-generation video compression standard. At the same quality, the bit rate can be reduced by about half; or at the same bit rate, the signal-to-noise ratio is improved. Since its advent, it has received wide attention.

Video coding technology is also one of the foundations for studying video encryption technology. Because of the special properties of video data, the video coding standard must be examined in order to analyze in depth the characteristics of the video stream format, such as its syntax and semantics. The video information can then be classified, so as to know which information can be encrypted directly and which information may cause format compatibility problems if encrypted directly. In the next-generation video compression standard H.264, compared with the earlier related standards, the semantic units in the encoded bit stream are more closely correlated with those before and after them. It is therefore very difficult to separate out key information for encryption, and encrypting randomly selected information or simply scrambling will inevitably destroy the video format. These problems need to be studied for the current mainstream video encryption algorithms.

2.4. DIGITAL VIDEO ENCRYPTION ALGORITHM

At present, scholars at home and abroad are actively researching video data encryption. According to the encryption method, video data encryption can be divided into classical methods, which use AES, RSA and similar algorithms, and non-classical methods, which adopt chaos theory, quantum cryptography and similar approaches. According to the content that is encrypted, it can be divided into direct encryption and selective encryption, the latter encrypting certain important information in combination with compression coding. The characteristics of the various types of encryption methods are analyzed below.

2.4.1. Direct Encryption

Direct encryption treats video data as general binary data and encrypts it pixel by pixel and bit by bit with IDEA, AES, RSA or other classical algorithms [14, 15]. The original video data can be encrypted directly, or it can be encrypted after compression. This method takes advantage of the high security of classical cryptographic algorithms, but for video data it has the disadvantages of a heavy computational load and poor real-time performance. At the same time, when encryption is applied before compression, it changes the statistical characteristics of the video data and thus affects the compression ratio, since current compression methods exploit the statistical redundancy and strong correlation of video data in the spatial and temporal domains. With the emergence of wireless video communication technology, embedded devices used as communication terminals impose strong constraints on computational power consumption and bandwidth, which makes direct encryption impractical.
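The effect of encryption on compression can be seen in a small experiment. The sketch below assumes `zlib` as a stand-in for a video codec, a synthetic highly redundant byte sequence as the "raw video", and a toy SHA-256 counter-mode keystream as the cipher; none of these is the method of the cited references.

```python
import hashlib
import zlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Smooth, highly redundant stand-in for raw image data: runs of 64 equal bytes.
raw = bytes([i // 64 % 256 for i in range(64 * 1024)])
key = b"demo-key"

compress_then_encrypt = xor_cipher(zlib.compress(raw), key)
encrypt_then_compress = zlib.compress(xor_cipher(raw, key))

print("raw:", len(raw))
print("compress, then encrypt:", len(compress_then_encrypt))
print("encrypt, then compress:", len(encrypt_then_compress))
```

Encrypting first removes the statistical redundancy that the compressor relies on, so the second result stays close to the raw size, while compressing first keeps the full compression gain.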


[Figure 3: block diagram with the labels Transform coding/predictive coding, Quantization, RLC, VLC, Entropy coding, Network packaging.]

Figure 3. Selective encryption points during compression coding process.

2.4.2. Selective Encryption

Current compression schemes use hybrid coding: first predictive coding, then transform coding of the prediction residuals, and finally statistical (entropy) coding of the transform coefficients. Figure 3 shows the coding stages at which encryption can be applied. Selective encryption takes full account of the characteristics of video data itself and the requirements of compression standards. Rather than encrypting pixel values directly, it encrypts key information generated during the coding process described above. At present, the objects selected for encryption include I-frames and I-blocks within frames, DCT coefficients, motion vectors, RLC coefficients, and so on [16-21].

I-Frame and I-Block Encryption

In the MPEG series of compression standards, video frames are divided into three types: I-frames, P-frames and B-frames. An I-frame is known as a key frame; a P-frame is obtained by predictive coding on the basis of an I-frame; a B-frame is obtained by interpolation on the basis of P-frames and I-frames. Maples T B and Spanos G A, et al. choose DES, RSA and other classical cryptographic algorithms to encrypt I-frames [16]. In theory, P-frames and B-frames alone are useless if the I-frame is unknown, so encrypting only the I-frames can achieve the purpose of encrypting the entire video stream while significantly reducing the computational complexity. However, Agi I [17] points out that, because video frames are very strongly correlated and non-encrypted I-blocks may exist in P-frames and B-frames, encrypting only I-frames is insecure, and the computational complexity remains high.
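The idea of encrypting only I-frames can be sketched as follows. The `Frame` type, the group-of-pictures pattern, the payload sizes and the toy XOR cipher are all hypothetical; the cited scheme uses DES, RSA or other classical ciphers on real MPEG data.

```python
from dataclasses import dataclass
import hashlib

@dataclass
class Frame:
    kind: str       # "I", "P" or "B"
    payload: bytes  # stand-in for the coded frame data

def xor_cipher(data: bytes, key: bytes, nonce: int) -> bytes:
    """Toy per-frame stream cipher: SHA-256 counter-mode keystream XORed with the data."""
    out = bytearray()
    block = 0
    while len(out) < len(data):
        out += hashlib.sha256(
            key + nonce.to_bytes(4, "big") + block.to_bytes(4, "big")
        ).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, out))

def encrypt_i_frames(frames, key):
    """Encrypt only I-frame payloads; P- and B-frames pass through untouched."""
    return [
        Frame(f.kind, xor_cipher(f.payload, key, idx)) if f.kind == "I" else f
        for idx, f in enumerate(frames)
    ]

# One group of pictures with invented sizes: the I-frame is much larger than the others.
gop = [Frame(k, bytes([i]) * (900 if k == "I" else 100))
       for i, k in enumerate("IBBPBBPBBPBB")]
key = b"demo-key"
protected = encrypt_i_frames(gop, key)

encrypted_bytes = sum(len(f.payload) for f in gop if f.kind == "I")
total_bytes = sum(len(f.payload) for f in gop)
print(f"{encrypted_bytes} of {total_bytes} bytes encrypted")
```

Because the toy cipher is an XOR, calling `encrypt_i_frames` again with the same key decrypts. The sketch shows only the reduction in encrypted data; it does nothing about the unencrypted I-blocks inside P- and B-frames that the text identifies as the weakness of this approach.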


Header Information Encryption

Qiao L et al. propose encrypting the header information of video streams in the MPEG series of standards (including the GOP header, the slice header and the macroblock header) using algorithms such as DES and RSA [18]. Compared with I-frame encryption, header encryption can significantly reduce the computational load, but it has two drawbacks. First, headers contain a lot of standard information that an attacker can easily guess. Second, headers contain a lot of control information that intermediate nodes use during transmission for synchronization and error monitoring; if it is encrypted, intermediate nodes have great difficulty processing the stream.

DCT Coefficient Encryption

This method encrypts the frequency-domain coefficients. Tang L proposed scrambling the DCT coefficients to achieve encryption, but did not give a method for generating the scrambling key [19]. The advantage of this method is that the syntax structure of the final bit stream does not change and the algorithmic complexity is low. The disadvantage is that it disturbs the magnitude ordering of the DCT coefficients produced by the zigzag (“Z”-shaped) scan, which greatly reduces the efficiency of the subsequent entropy coding. Another important defect is that it changes the energy distribution only within each 8 × 8 block, which does not change the energy distribution of the whole image, so the whole image may remain intelligible to the human eye. Through analysis and experiment, Qiao L, et al. show that replacing the zigzag scan with random scrambling not only greatly degrades the compression ratio but also leaves the cryptographic system unable to resist known-plaintext attacks [20]. Therefore, such a method alone cannot achieve good encryption security.

Shi C G and Bhargava B, et al. propose a method for encrypting the signs of DCT coefficients: the sign bits (“0” represents a positive number and “1” a negative number) are spliced into a bit stream or data segment, encrypted by a bitwise XOR with a randomly generated key stream, and finally written back into the original data [21]. They subsequently propose a method that encrypts both the signs of the DCT coefficients and the signs of the motion compensation vectors [22].
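The sign-encryption idea described above (collect the sign bits, XOR them with a key stream, write them back) can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: the key stream comes from a toy SHA-256 counter construction, zero coefficients are skipped because they carry no sign, and the coefficient values are invented.

```python
import hashlib

def key_bits(key: bytes, count: int):
    """Toy keystream bits from SHA-256 in counter mode (illustration only)."""
    bits = []
    counter = 0
    while len(bits) < count:
        digest = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for byte in digest:
            bits.extend((byte >> shift) & 1 for shift in range(8))
        counter += 1
    return bits[:count]

def encrypt_signs(coeffs, key):
    """XOR the sign bit of each nonzero coefficient with one keystream bit."""
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    stream = key_bits(key, len(nonzero))
    out = list(coeffs)
    for i, k in zip(nonzero, stream):
        sign = 1 if coeffs[i] < 0 else 0          # "0" = positive, "1" = negative
        new_sign = sign ^ k
        out[i] = -abs(coeffs[i]) if new_sign else abs(coeffs[i])
    return out

block = [-26, -3, 0, 6, 2, -4, 0, 1, 0, 0, -1, 5]   # quantized coefficients (invented)
key = b"demo-key"
cipher = encrypt_signs(block, key)
print(cipher)
```

Magnitudes and zero positions are untouched, so the run-length and variable-length coding of the block is largely unaffected; applying `encrypt_signs` again with the same key restores the original signs.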


2.4.3. Chaos Encryption

A chaotic system is a nonlinear dynamical system with the properties of high sensitivity to initial conditions, non-periodic trajectories that do not converge, and inherent randomness, which is in line with modern cryptographic theory. Because chaos shares many characteristics with modern cryptography, it has been widely used in cryptography in recent years. Fridrich F proposed using a two-dimensional Baker map to transform pixel locations, and then using an extended three-dimensional map to change the pixel values, to achieve encryption [23]. On this basis, Mao Y B and Chen G R, et al. proposed a fast image encryption scheme [24]. The scheme applies a three-dimensional Baker map; its encryption speed is 2 to 3 times that of the original scheme, and its security is somewhat improved. Thereafter, Chen G R, et al. presented an image encryption scheme based on a three-dimensional Cat map [25], which uses the three-dimensional Cat map to scramble the locations of pixels and uses other chaotic maps to confuse the relationship between the original image and the encrypted image. Many scholars are working in this area; for example, Li S J et al. proposed a chaos-based real-time digital video image encryption scheme [26]. The above schemes all encrypt the image or video data directly, with high complexity, and do not exploit the advantages of selective encryption.
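Position scrambling with a chaotic map can be illustrated with the classical two-dimensional Arnold cat map, which sends pixel (x, y) of an n × n image to ((x + y) mod n, (x + 2y) mod n). This is a simplified stand-in chosen for brevity; it is not the three-dimensional Cat map or Baker map scheme of the cited references, and it only moves pixels without changing their values.

```python
def cat_map(image, rounds=1):
    """Scramble pixel positions of a square image with the 2-D Arnold cat map."""
    n = len(image)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = image[x][y]
        image = out
    return image

def inverse_cat_map(image, rounds=1):
    """Undo cat_map() by reading each pixel back from its scrambled position."""
    n = len(image)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[x][y] = image[(x + y) % n][(x + 2 * y) % n]
        image = out
    return image

image = [[r * 8 + c for c in range(8)] for r in range(8)]   # 8 x 8 test pattern
scrambled = cat_map(image, rounds=3)
restored = inverse_cat_map(scrambled, rounds=3)
print(scrambled[0])
```

On a discrete grid the map is periodic: for this 8 × 8 image, six rounds return the original arrangement. Position scrambling alone is therefore weak, which is one reason such schemes add further chaotic maps to change pixel values, as the text describes.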


Chapter 3

VIDEO ENCRYPTION ALGORITHMS WITH FORMAT COMPLIANCE

3.1. DEFINITION AND SIGNIFICANCE OF FORMAT COMPLIANCE

Keeping a video encryption algorithm format compliant is an important requirement. Format compliance means that the syntactic and semantic information of the video ciphertext remains compatible with the standard format; its essential consequence is that the video ciphertext can still be decoded successfully by a standard decoder. Format compliance is important for the following reasons:

Maintaining format compliance guarantees that channel-related information in the original video, such as network synchronization and error-resilience information, is not changed, which preserves the network adaptability and fault tolerance of the ciphertext stream [27].

Keeping the video syntactic and semantic information compatible with the standard format leaves the start codes, format field information, and other markers of the video stream unchanged, which benefits the security of the visual information [28].

Maintaining format compliance supports effective and flexible control of video content in commercial applications [29].

Maintaining format compliance also preserves the direct operability of the video information, that is, operations can still be performed directly on the compressed video [27].


Maintaining format compliance ensures that the video stream can be decoded successfully by an ordinary decoder.

3.2. PRINCIPLES FOR FORMAT COMPLIANCE

This section builds on the theory of sets and mappings from modern algebra [30]. A video encryption algorithm that maintains format compliance is described as an injection from the set of video information in a standard video format to itself, and a transformation that preserves the format of a video codeword is described as a mapping from the set of format-compliant codewords to itself. Using the definition of the composition of mappings, the format-compliant video encryption algorithm is then decomposed into a composite of legal transformations of individual video codewords. Finally, the legal transformations of the different types of video code elements are analyzed to complete the framework of video encryption with format compliance.

3.2.1. Related Definition

Definition 3.1: The set of all possible n-bit binary data is $V_n$, that is, $V_n = \{x \mid \mathrm{bitslen}(x) = n\}$. Clearly, $V_n$ contains $2^n$ binary numbers $(0, 1, 2, \ldots, 2^n - 1)$, that is, $|V_n| = 2^n$.

Definition 3.2: The set of all video information that conforms to the video standard format S is $M_n$, that is, $M_n = \{x \mid \mathrm{stream\_decodes}(x) = \mathrm{true},\ x \in V_n\}$. Here stream_decodes(x) = true denotes that x conforms to the standard format and decodes successfully; otherwise stream_decodes(x) = false.

Definition 3.3: A mapping (also called a conversion) from the set $M_n$ of standard-format video information to itself is a transformation with format compliance. That is, a mapping $f: M_n \to M_n$ is a transformation that maintains format compatibility.

Definition 3.4: An injection $f$ (also called a conversion) from the set $M_n$ of standard-format video information to itself is a reversible transformation with format compliance, also called a video encryption algorithm with format compliance. That is, for any $x_1, x_2 \in M_n$ with $x_1 \ne x_2$ we have $f(x_1) \ne f(x_2)$; $f$ is reversible, and its inverse is written $f^{-1}$.

Definition 3.5: An injection $f_k$ (also called a conversion) from the set $M_n$ of standard-format video information to itself that depends on a key $k$ is called a key-based video encryption algorithm with format compliance.

Definition 3.6: An element $ve$ of video content information that is contained, directly or indirectly, in the video information $m$ ($m \in M_n$) is called a video code element.

Definition 3.7: A specific legal codeword corresponding to a video code element $ve$ is called a video codeword $vc$. The set of legal codewords of the video code element $ve$ is written $C_{ve}$, that is, $C_{ve} = \{x \mid \mathrm{code\_decodes}(x) = \mathrm{true}\}$. Here code_decodes(x) = true means that x conforms to the standard format and decodes successfully; otherwise code_decodes(x) = false.

Definition 3.8: A mapping (also called a conversion) from the legal codeword set $C_{ve}$ of a given type of video code element $ve$ to itself that does not affect the decoding of the codewords in its context is called an independent legal transformation of the video code element $ve$, written $eE$. That is, if for any legal codeword $ve$ the image $eE(ve) = ve'$ satisfies $ve' \in C_{ve}$, and $ve'$ does not affect the decoding of the other codewords in the context, then $eE$ is an independent legal transformation of $ve$.

Definition 3.9: A video code element in the video stream $m$ ($m \in M_n$) that always has a fixed bit length is called a fixed-length video code element, or fixed-length code element ($fve$). The specific legal codeword in the code stream that corresponds to a fixed-length code element is called a fixed-length codeword.

Definition 3.10: A video code element in the video stream $m$ ($m \in M_n$) whose bit length is variable is called a variable-length video code element, or variable-length code element ($vve$). The specific legal codeword in the code stream that corresponds to a variable-length code element is called a variable-length codeword.

Definition 3.11: A code element in the video stream $m$ ($m \in M_n$) whose successful decoding depends on its specific context is called a context-based video code element, or context code element ($cve$). The specific legal codeword in the code stream that corresponds to a context code element is called a context codeword. Note: Decoding a context code element usually depends on information derived from the context, such as the bit length of the code element and the choice of its variable-length code table.


3.2.2. Principles for Video Encryption with Format Compliance

Theorem 3.1: In a video stream $m$ ($m \in M_n$), if the composite $f$ of the independent legal transformations $eE_i$ of the video code elements $ve_i$ is reversible, then $f$ is a video encryption algorithm with format compliance.

Let $m = (ve_1, ve_2, \ldots, ve_n)$, and let $eE_i$ be the independent legal transformation of $ve_i$, that is, $eE_i(ve_i) = ve_i'$ and $eE_i(ve_j) = ve_j$ for $j \ne i$. Since $f$ is reversible, for any $m$ with $f(m) = m'$ we have $f^{-1}(m') = m$. By Definition 3.4, the composite $f$ of the $eE_i$ is an injection from the set $M_n$ of standard-format video information to itself, that is, a video encryption algorithm with format compliance.

Theorem 3.1 is the basic principle of video encryption with format compliance: find an independent legal transformation for each type of video code element, and then compose these transformations to form a format-compliant encryption algorithm for the entire video stream.
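The composition in Theorem 3.1 can be written out explicitly. The following is a sketch under the additional assumption, not stated in the theorem itself, that every individual $eE_i$ is invertible on its own codeword set:

$$f = eE_n \circ \cdots \circ eE_2 \circ eE_1, \qquad f^{-1} = eE_1^{-1} \circ eE_2^{-1} \circ \cdots \circ eE_n^{-1}$$

Because each $eE_i$ changes only $ve_i$ and leaves every other $ve_j$ untouched, $f(m) = (ve_1', ve_2', \ldots, ve_n')$, and applying the inverse transformations element by element recovers $m$.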

3.2.3. Principles for Video Code Element Encryption

Next we discuss how a video code element can achieve an independent legal transformation. According to Definition 3.8, if a video codeword is mapped to another legal codeword and the mapping does not affect the decoding of other codewords in the context, the mapping is an independent legal transformation.

For general n-bit binary data, the plaintext space and the ciphertext space are both equal to $V_n$ (Definition 3.1); their mapping relationship is shown in Figure 4. The figure is drawn in three dimensions, with a time axis, to show how the n-bit binary elements vary over the whole range of the mapping. When the plaintext space and the ciphertext space are both equal to $V_n$, a cryptographic algorithm can be applied directly.

Theorem 3.2: If the plaintext space of fixed-length n-bit binary elements is $V_n$ and the elements are encrypted with a cryptographic algorithm that does not change the bit length, then the results of the encryption mapping also lie in the plaintext space.

Existing cryptographic algorithms usually do not change the length of the plaintext. Therefore, according to Theorem 3.2, if the plaintext space $C_e$ of a certain element $e$ is not equal to $V_n$, it can first be mapped to the plaintext space $V_n$ of general n-bit binary data (by a mapping $f$). The n-bit binary data can then be encrypted directly with an existing cryptographic algorithm (with encryption mapping $E_k$ and decryption mapping $D_k$), and the inverse mapping $f^{-1}$ returns the result to the original plaintext space $C_e$. The encryption and decryption of the element $e$ are therefore composites of $f$, $E_k$ or $D_k$, and $f^{-1}$:

$$eE(e) = f^{-1}(E_k(f(e))) = e'$$
$$eD(e') = f^{-1}(D_k(f(e'))) = e$$

This is the basic principle of encryption when the plaintext space of a video code element is not equal to $V_n$.

Figure 4. Mapping relationships of plaintext space and ciphertext space for general binary data encryption. (Axes: plaintext space and ciphertext space, each ranging from 0 to $2^n - 1$, and time.)

For most video code elements that carry a special meaning, the plaintext space ($C_{ve}$) is not equal to $V_n$. For example, suppose that for a 3-bit video code element the standard allows only four legal codewords (assumed to be 001, 011, 101, and 110). For an independent legal transformation, the ciphertext space must be the same as the plaintext space, as shown in Figure 5, where the gray part represents the range of mapping points of the plaintext space and ciphertext space. If the 3-bit video code element is encrypted directly, illegal codewords are likely to be produced, which violates the condition for an independent legal transformation in Definition 3.8, so format compatibility is lost.
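The 3-bit example can be checked with a short Python sketch. It assumes the four legal codewords listed above, uses XOR with a key value as a stand-in for the cipher $E_k$, and takes the position of a codeword in the list as the mapping $f$ into the 2-bit space $V_2$; the names `LEGAL`, `direct_xor`, and `indirect_encrypt` are illustrative only.

```python
LEGAL = ["001", "011", "101", "110"]      # legal codewords assumed in the text


def direct_xor(code, key):
    """Encrypt the 3-bit codeword directly in V_3 (may leave the legal set)."""
    return format(int(code, 2) ^ key, "03b")


def indirect_encrypt(code, key):
    """f^{-1}(E_k(f(e))): index in LEGAL -> XOR in V_2 -> back to a codeword."""
    i = LEGAL.index(code)                 # f: C_ve -> V_2
    return LEGAL[i ^ (key & 0b11)]        # E_k on V_2, then f^{-1}


illegal = direct_xor("101", 0b010)        # "111", which is not a legal codeword
legal = indirect_encrypt("101", 0b10)     # index 2 XOR 2 = 0, giving "001"
```

Since XOR is its own inverse, calling `indirect_encrypt` again with the same key recovers the original codeword.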

Figure 5. Mapping relationships between plaintext space and ciphertext space (not equal to $V_n$) of a 3-bit video code element.

Figure 6. Mapping relationships between plaintext space, ciphertext space and $V_n$.

Figure 7. Mapping relationships between plaintext space and ciphertext space of a fixed-length code element.

A video code element can therefore be encrypted by mapping its plaintext and ciphertext space $C_{ve}$, which is not equal to $V_n$, into the space $V_n$, applying a cryptographic algorithm directly on $V_n$, and then mapping the result back to the original $C_{ve}$, as shown in Figure 6.

According to the existing video coding standards and the characteristics of the syntax elements related to video content, video code elements are divided into three categories: fixed-length code elements, variable-length code elements, and context code elements. The following subsections discuss the legal transformation method for each of the three types.

3.2.3.1. Fixed-Length Code Element Encryption

A fixed-length code element is a syntax element that encodes video content information with a fixed bit length. Elements that are usually coded as fixed-length code elements include the quantization coefficient and the intra-block DC coefficient in H.263/MPEG-4, and the 4 × 4 luminance block prediction mode and the pixel values of I_PCM macroblocks in H.264/AVC. A special and very important kind of fixed-length code element is a sign bit that is separate in the code stream: in the H.263/MPEG-4 standards, the sign bits of motion vectors and transform coefficients; in H.264, the sign bits of the trailing coefficients of a residual block. Note that in H.264 the signs of motion vectors and of transform coefficients are not independent in the code stream.

For the encryption of a fixed-length code element, the mapping between plaintext space and ciphertext space is shown in Figure 7. The time axis represents how the fixed-length code elements vary over the whole range of the mapping between plaintext space and ciphertext space. Assume that the mappings $E_k$ and $D_k$ denote a binary stream cipher. The independent legal transformation of the fixed-length code element $fve$ can then be described as follows:

$$FE_k(fve) = E_k(fve) = fve'$$
$$FD_k(fve') = D_k(fve') = fve$$

where $FE_k(\cdot)$ is the encryption algorithm and $FD_k(\cdot)$ is the decryption algorithm for the fixed-length code element $fve$, and $fve'$ is the ciphertext of $fve$.

Figure 8. Mapping relationships between plaintext space and ciphertext space of a variable-length codeword and the indirect encryption principle.

3.2.3.2. Variable-Length Code Element Encryption

A variable-length code element is a video code element whose bit length is not fixed and changes during coding. Variable-length coding is the main method of entropy coding for video information. Important video elements that use variable-length coding include the motion vectors and transform coefficients coded with Huffman codes in H.263/MPEG-4 and, in the Context-based Adaptive Variable Length Coding (CAVLC) mode of H.264/AVC, the motion vectors, quantization parameters, 16 × 16 luminance block prediction mode, and chroma block prediction mode coded with Exp-Golomb codes.

The set of legal codewords of a variable-length code element is not equal to $V_n$, so direct encryption does not give a one-to-one mapping of the plaintext space onto itself, and the element cannot be encrypted directly like general binary data. Instead, a substitution among codewords of equal length is considered. The basic idea is to map the variable-length codeword $vve$ to a fixed-length codeword $i$ in the set $V_n$, encrypt $i$ to obtain $i'$, and then map $i'$ back to a variable-length codeword $vve'$, which realizes indirect encryption of the variable-length codeword. This scheme is shown in Figure 8.

Suppose that the mapping from the legal space of the variable-length code to $V_n$ is $M$ and its inverse mapping is $M^{-1}$, so that $M(vve) = i$ and $M^{-1}(i) = vve$. The mappings $E_k$ and $D_k$ are a cryptographic algorithm for binary streams. The independent legal transformation of the variable-length code element $vve$ can be described as follows:

$$VE_k(vve) = M^{-1}(E_k(M(vve))) = vve'$$
$$VD_k(vve') = M^{-1}(D_k(M(vve'))) = vve$$

where $VE_k(\cdot)$ is the encryption algorithm and $VD_k(\cdot)$ is the decryption algorithm for the variable-length code element $vve$, and $vve'$ is the ciphertext of $vve$.
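A minimal Python sketch of this indirect encryption for Exp-Golomb codewords follows. It assumes that all codewords of the same length are legal for the syntax element (a real syntax element may restrict the value range further), takes the information bits after the prefix as the index $i$, and uses XOR with key bits in place of $E_k$; `encrypt_exp_golomb` is a hypothetical name.

```python
def encrypt_exp_golomb(codeword, key_bits):
    """Map an Exp-Golomb codeword to another codeword of the same length.

    codeword: bit string made of m zeros, a '1', and m information bits.
    key_bits: integer whose low m bits are used as key stream.
    """
    m = codeword.index("1")               # number of leading zeros = number of info bits
    if m == 0:
        return codeword                   # the 1-bit codeword has no free bits
    prefix, info = codeword[:m + 1], codeword[m + 1:]
    i = int(info, 2)                      # M: vve -> i in V_m
    i_enc = i ^ (key_bits & ((1 << m) - 1))   # E_k on V_m (XOR stream cipher)
    return prefix + format(i_enc, f"0{m}b")   # M^{-1}: i' -> vve'


cipher = encrypt_exp_golomb("00101", 0b10)    # info bits 01 XOR 10 = 11
plain = encrypt_exp_golomb(cipher, 0b10)      # the same key bits decrypt
```

The prefix is left untouched, so the codeword length, and with it the position of every following codeword in the stream, is preserved.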

3.2.3.3. Code Element Context Encryption

A context code element is characterized by the fact that the bit length or other important format information of the current codeword is determined by other codewords in its context. In H.264/AVC coding with CAVLC, the coefficient amplitude of a residual block is a context code element.


The set of legal codewords of a context code element is not only unequal to $V_n$; the codeword also affects the decoding of other codewords in its context, so a general cryptographic algorithm for binary data cannot be applied to it directly. Its encryption builds on the idea used above for variable-length codewords, but the ciphertext space must additionally be restricted so that the decoding of other codewords in the context is not affected. Because such codewords are context-adaptive, the restriction on the ciphertext space is dynamic, and so is the corresponding mapping from the variable-length codeword to the fixed-length codeword, as shown in Figure 9.

Suppose that the mapping from the context codeword to the fixed-length codeword is $M$. $M$ changes dynamically with the current context; if the associated context information is $r$, then $M$ is a function with parameter $r$, and its inverse mapping is $M^{-1}$. The mapping from the context codeword $cve$ to the fixed-length codeword $i$ is $M(cve, r) = i$, and the inverse mapping is $M^{-1}(i, r) = cve$. The mappings $E_k$ and $D_k$ are a cryptographic algorithm for binary streams. The independent legal transformation of the context codeword $cve$ can then be described as follows:

$$CE_k(cve) = M^{-1}(E_k(M(cve, r)), r) = cve'$$
$$CD_k(cve') = M^{-1}(D_k(M(cve', r)), r) = cve$$

where $CE_k(\cdot)$ is the encryption algorithm and $CD_k(\cdot)$ is the decryption algorithm for the context code element $cve$, and $cve'$ is the ciphertext of $cve$.

3.2.4. Summary of Principles and Means

As shown in Figure 10, an encryption algorithm with format compliance can be decomposed into the processing of the different types of video code elements, each of which is then encrypted with an existing cryptographic algorithm. Variable-length code elements and context code elements are first mapped to fixed-length codewords according to the legal transformations defined above and then encrypted. After encryption, the encrypted codewords are mapped back to legal code elements and assembled into the video ciphertext stream.


3.3. VIDEO ENCRYPTION WITH FORMAT COMPLIANCE

3.3.1. VLC Encryption with Random Sequence

This section introduces a variable-length codeword (VLC) encryption method based on a random sequence [31]. Video variable-length codewords are encrypted by randomly changing the variable-length coding table each time a codeword is coded. The codewords in the table are grouped according to their length. While the video event stream is coded, a random number is generated for every event to decide which codeword in its group the event is coded to. The randomness of the number sequence is ensured by deriving it from a binary random sequence. The algorithm solves the difficult problem of forbidden codewords being generated when video is enciphered, and it fully preserves video coding format compliance and the compression ratio. Its resistance to attack is the same as that of the direct stream cipher method, while its speed is more than ten times higher, which meets real-time requirements. The experiments show that the cipher pictures are fully disguised. The method can be applied to MPEG, H.26x, and JPEG signals and can be implemented as a separate module.

Figure 9. Mapping relationships between plaintext space and ciphertext space of code element context and the principle of indirect encryption.

Figure 10. Structure principle of video encryption algorithm with format compliance. (Block diagram: video plaintext → analysis of codestream → fve; vve → M(vve) = vi; cve → M(cve, r) = ci → encryption algorithm Ek → fve'; vi' → M^{-1}(vi') = vve'; ci' → M^{-1}(ci', r) = cve' → codestream synthesization → video ciphertext.)

3.3.1.1. Principle of Randomly Change Codewords Table Algorithms (RCTA)

Changing a VLC code table means changing the correspondence between video information events (event, difference DC, and so on) and their bit representations (codewords). RCTA does not need to construct keyed VLC code tables that are comparatively fixed and differ from the standard. Instead, RCTA randomly changes the correspondence between video events and codewords directly on the basis of the standard code table. On the one hand, the events and codewords remain legal because they are the same as those in the standard code table; on the other hand, the scheme is safer because a random code table is used for every codeword. The method consists of two parts:

a. Group the events of the standard VLC code table that are to be changed, together with their corresponding codewords, according to whether the event is the last non-zero AC coefficient of the block and according to the length of the codeword; order the entries within every group; and thus build the reference code table for encryption. This reference code table can be public, but it must be identical at the sender and the receiver.

b. The correspondence between events and codewords inside a group is determined by random numbers during encoding and decoding. With this method, the same event is coded to a legal codeword that differs randomly from the one given by the standard code table but has the same length.

3.3.1.2. Making Reference Code Table by Grouping Code Word

Variable-length code table. Group the codewords (and their corresponding events) of every standard code table. Codewords whose events have the same last flag (all 0 or all 1) and whose lengths are the same are placed in one group and ordered within the group. The number of codewords in a group is the modulus (mode length) of the group. In this way, every standard VLC code table yields a set of codeword groups and a reference code table that records the modulus and the order within the group. Table 1 illustrates the rules for making the reference code table, using the variable-length code table for intra luminance and chrominance TCOEF as an example [32 Table 568]. It lists the first three groups of the table, where 's' is the sign bit ('0' means positive and '1' means negative). Grouping by last = 0 or last = 1 ensures that the end-of-block flag is not corrupted. The event (last = 0; run = 0; level = ±1) of the first group has two 3-bit codewords (100 and 101), so the modulus is 2. The order inside the group is: 101 (negative) is 1 and 100 (positive) is 2; the order of the other codewords is assigned in the same way. The reference tables for the other code tables, for example the variable-length code table for inter luminance and chrominance TCOEF and the variable-length code table for MVD, are made in the same way [32].

For difference DC coding (shown in Table 2), the variable-length code table that represents the size [32, Table 5-49, Table 5-51] is not changed, and the variable-length codes that represent the values of the difference DC [32, Table 5-50 additional code] are grouped according to the size, with the positive part and the negative part of the same bit length placed in one group. The additional codes of the first group do not have the same length, but the complete merged codes do (011, 110, 111). Grouping in this way widens the range over which the high-probability codes can vary and improves security compared with putting the first row in a group of its own.

Fixed-length code table. If DC is coded with fixed-length coding (FLC), there is only one group.

The escape codes of low-probability events are left unchanged (under the experimental conditions of this section), which has almost no effect on confidentiality; alternatively, the fixed-length codes can be grouped by equal length.

Table 1. Example of VLC Reference Code Table

| Group Number | Variable Code | Length of Code | Last | Run | Level | Modulus | Sort Inside Group |
|---|---|---|---|---|---|---|---|
| 1 | 10s | 3 | 0 | 0 | 1 | 2 | 1, 2 |
| 2 | 110s | 4 | 0 | 0 | 2 | 2 | 1, 2 |
| 3 | 1111s | 5 | 0 | 0 | 3 | 4 | 1, 2 |
| 3 | 1110s | 5 | 0 | 1 | 1 | 4 | 3, 4 |

Table 2. Difference DC Reference Code Table

| Group Number | Type of Code Length | Additional Code | Length of Code/bit | Value of Difference DC | Modulus | Sort Inside Group |
|---|---|---|---|---|---|---|
| 1 | 011 | - | 3 | 0 | 3 | 1 |
| 1 | 11 | 0 | 3 | -1 | 3 | 2 |
| 1 | 11 | 1 | 3 | 1 | 3 | 3 |
| 2 | 10 | 00-01 | 4 | -3 ~ -2 | 4 | 1, 2 |
| 2 | 10 | 10-11 | 4 | 2 ~ 3 | 4 | 3, 4 |
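The grouping rule for the variable-length code table can be sketched in Python as follows. The sketch covers only the four TCOEF entries of Table 1, assumes the sign convention of the 101/100 example (the negative codeword is ordered first), and uses hypothetical names (`ENTRIES`, `build_reference_table`); it is not the authors' implementation.

```python
from collections import defaultdict

# (codeword without sign bit, last, run, level) for the entries of Table 1
ENTRIES = [("10", 0, 0, 1), ("110", 0, 0, 2), ("1111", 0, 0, 3), ("1110", 0, 1, 1)]


def build_reference_table(entries):
    """Group codewords by (last flag, codeword length) and order them within each group."""
    groups = defaultdict(list)
    for prefix, last, run, level in entries:
        for s, sign in (("1", -1), ("0", 1)):     # negative first, as in the 101/100 example
            key = (last, len(prefix) + 1)         # +1 for the sign bit
            groups[key].append((prefix + s, (last, run, sign * level)))
    return dict(groups)


table = build_reference_table(ENTRIES)
moduli = {key: len(group) for key, group in table.items()}   # modulus = group size
```

With these entries the groups of length 3, 4, and 5 have moduli 2, 2, and 4, which matches the Modulus column of Table 1.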

3.3.1.3. Encryption and Decryption Algorithms

During encoding, the format and parameter information is not changed (not encrypted) and is still coded according to the standard. For every event that needs to be encrypted, look up the reference table to find the corresponding modulus $M_j$ and the order within the group $P_j$. Generate a random number $R_j$ that is uniformly distributed in the range $[0, M_j - 1]$, add it to $P_j$ modulo $M_j$, and obtain the order within the group of the ciphertext:

$$C_j = (P_j + R_j) \bmod M_j \qquad (1)$$

Then output the codeword in the same group whose order is $C_j$ as the ciphertext for the event.


During decoding, the receiver can obtain the same $R_j$ as the sender because both sides share the same key. From Equation (1) it follows that:

$$\begin{aligned}
(C_j - R_j) \bmod M_j &= ((P_j + R_j) \bmod M_j - R_j) \bmod M_j \\
&= ((P_j + R_j) \bmod M_j - R_j \bmod M_j) \bmod M_j \\
&= (P_j + R_j - R_j) \bmod M_j \\
&= P_j \bmod M_j
\end{aligned}$$

Because $0 \le P_j \le M_j - 1$, $P_j \bmod M_j = P_j$, and therefore

$$P_j = (C_j - R_j) \bmod M_j \qquad (2)$$

Look up the reference table (the same as the sender's; it can be public) to find the modulus $M_j$ of the cipher codeword and its order within the group $C_j$. Compute $P_j$ according to Equation (2); the event whose order within the group is $P_j$ is the decoded output, that is, the original plaintext event.
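Equations (1) and (2) can be exercised with a small Python sketch. It assumes a single group of four 5-bit codewords, counts the order within the group from 0 so that $0 \le P_j \le M_j - 1$, and takes the random number $R_j$ as an argument; the names `GROUP`, `rcta_encrypt`, and `rcta_decrypt` are illustrative.

```python
GROUP = ["11111", "11110", "11101", "11100"]    # one group, modulus M_j = 4


def rcta_encrypt(codeword, r, group=GROUP):
    p = group.index(codeword)             # P_j: order within the group (0-based)
    c = (p + r) % len(group)              # Equation (1)
    return group[c]


def rcta_decrypt(codeword, r, group=GROUP):
    c = group.index(codeword)             # C_j: order of the cipher codeword
    p = (c - r) % len(group)              # Equation (2)
    return group[p]


cipher = rcta_encrypt("11110", 3)         # (1 + 3) mod 4 = 0
plain = rcta_decrypt(cipher, 3)           # (0 - 3) mod 4 = 1
```

Every output is a member of the same group, so the cipher codeword is legal and has the same length as the plaintext codeword.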

3.3.1.4. A Random Sequence Generation Algorithm of Variable Mode

The key sequence of a stream cipher cannot be used in RCTA directly; it must be converted into a random number sequence whose modulus (the range of random values) changes as the group modulus changes. The following describes an algorithm that generates a variable-modulus random number sequence from a fixed-modulus random sequence. Suppose that $V = v_0 v_1 \ldots v_k \ldots$ is a base-$b$ sequence (usually $b = 2$). If the random variables $v_k$ ($k = 0, 1, \ldots$) are uniformly distributed and mutually independent in the range $[0, b - 1]$, then $V$ is a random sequence. In practice a binary random sequence is used to ensure security. If a true random sequence is inconvenient to use, a high-performance pseudo-random sequence can be used instead. The steps for generating the variable-modulus random sequence are as follows:

a. Obtain the modulus $M_j$ corresponding to the event and codeword.

Do {
    V continues to generate $t_j$ digits, denoted $v_{j0}, v_{j1}, \ldots, v_{j(t_j-1)}$, where $t_j$ satisfies $t_j - 1 < \log_b M_j \le t_j$, and
    $w_j = v_{j(t_j-1)} v_{j(t_j-2)} \ldots v_{j1} v_{j0} = v_{j(t_j-1)} b^{t_j-1} + v_{j(t_j-2)} b^{t_j-2} + \ldots + v_{j0}$;
} While ($w_j > M_j - 1$)

b. Take $w_j$ as the value of the random variable $r_j$, which is uniformly distributed in the range $[0, M_j - 1]$: $r_j = w_j$. This gives the $R_j$ used in Equation (1) and Equation (2).

c. Repeat step a and step b to obtain the required sequence $R_j$ for the following events and codewords.
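Steps a to c amount to rejection sampling, which can be sketched in Python as follows. The sketch assumes that the key stream is available as an iterator of base-$b$ digits; the function name `next_r` is hypothetical.

```python
def next_r(digits, m, b=2):
    """Draw one value uniformly from [0, m - 1] using base-b digits from `digits`."""
    t = 0
    while b ** t < m:                     # smallest t with t - 1 < log_b(m) <= t
        t += 1
    while True:                           # Do { ... } While (w > m - 1)
        w = 0
        for i in range(t):
            w += next(digits) * b ** i    # v_j0 is the least significant digit
        if w <= m - 1:
            return w                      # r_j = w_j


stream = iter([1, 1, 0, 1, 1, 0])         # example key stream bits
r1 = next_r(stream, 3)                    # first draw w = 3 is rejected, second w = 2
r2 = next_r(stream, 3)                    # w = 1
```

Rejected draws are discarded rather than reduced modulo $M_j$, which is what keeps the accepted values uniformly distributed; the cost is that the number of key stream digits consumed per event varies.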

3.3.1.5. Experiments and Analysis

In this experimental scheme, escape codes are not encrypted; only the DC and AC coefficients of I blocks and the motion vectors of predictive frames are encrypted. Figure 11(a) is an I frame whose video data was encoded, encrypted, decrypted, and decoded with the same key stream at the sender and the receiver (that is, the plaintext obtained by correct decryption).

Table 3. Time Delay of Every Frame Encryption

| | Analysis of Input | Generation of Key | Retrieval of Output | Total Time Consuming | Stream Cipher |
|---|---|---|---|---|---|
| Average | 0.009 | 1.752 | 0.003 | 1.763 | 23.030 |
| Shortest | 0.000 | 0.116 | 0.000 | 0.117 | 6.970 |
| Longest | 0.212 | 8.224 | 0.082 | 8.518 | 51.980 |

Figure 11. Results of experiment.

a. Visual security: Figures 11(b-f) are the images obtained by decoding the ciphertext with the standard tables. Figure 11(b) shows the image when the AC table is changed randomly during encryption and the DC table is standard: the details are blurred but the outline is visible. Figure 11(c) shows the image when the DC table is changed randomly and the AC table is standard: some details and numbers can be recognized. Figure 11(d) shows the ciphertext of the I frame when the zero-value difference DC is grouped alone; Figure 11(e) shows the ciphertext of the I frame when the 0-value and ±1-value difference DC vary within the same group. Very little data is encrypted in a predictive frame, but because the frame is predicted from the I frame, its result is better. The decoded image of the 10th frame (a P frame) after the I frame is shown in Figure 11(f).

b. Compatibility of code stream: the ciphertext file_mpg_ecr is decoded directly to obtain a normal, complete video file (YUV format, 300 frames in this experiment); the results are shown in Figure 11(b-f). The standard decoder raises no errors during decoding, which shows that the ciphertext obeys the coding standard format, that is, it is fully compatible with the decoder.

c. Computational efficiency: the coded video stream (file_mpg) of 1,360,725 bytes needs 54,001 bytes of key stream. The video stream is 25.20 times as large as the key stream, which saves a great deal of key stream generation time. The time consumed per frame by RCTA and by the stream cipher is shown in Table 3. The stream cipher takes longer than RCTA: about 13 times as long on average, 59 times for the shortest frame, and 6 times for the longest frame.

d. Change in the size of the coded stream: the plaintext file_mpg_dcr is 1,360,725 bytes long and the ciphertext file_mpg_ecr is 1,360,727 bytes long. This shows that encryption essentially does not enlarge the code stream or reduce the compression ratio.
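The ratios quoted in item c can be reproduced from the reported figures, assuming that the last two columns of Table 3 are the total RCTA time and the stream cipher time per frame:

$$\frac{1\,360\,725}{54\,001} \approx 25.20, \qquad \frac{23.030}{1.763} \approx 13.1, \qquad \frac{6.970}{0.117} \approx 59.6, \qquad \frac{51.980}{8.518} \approx 6.1$$

The first value is the size ratio of video stream to key stream; the other three are the stream cipher to RCTA time ratios for the average, shortest, and longest frame.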

3.3.2. Video Encryption Algorithm Based on Spatial Shuffling

Video data security is very important for multimedia commerce such as video-on-demand and real-time video multicast, and many video encryption methods have been put forward. This section presents a video encryption approach based on the syntax characteristics of compressed digital video bitstreams [33]. With this approach, the critical data blocks in each video frame are shuffled with different shuffling tables generated from a cipher key. These critical data blocks have an important impact on image reconstruction, so the resulting images cannot be recognized without the valid cipher key. The simulation results show that the approach provides high overall security, high speed, size preservation, and format compliance, so it can meet the demands of security, real-time operation, and low cost at the same time.

3.3.2.1. Pseudo-Random Sequence and Encryption Shuffling Table

3.3.2.1.1. Shuffling Table Generation Based on Pseudo-Random Sequence

In this section, video bitstreams are encrypted dynamically using dynamic shuffling tables, which are generated from a pseudo-random sequence. The quality of that sequence directly determines the security and effect of shuffling encryption. Dynamic shuffling tables that are sufficiently secure and suitable for practical use should satisfy the following four requirements: 1) each shuffling table is different; 2) adjacent shuffling tables differ as much as possible; 3) the shuffling rule of each table is not determined in advance; 4) within each table, the shuffling rule shows no regular distribution and resembles white noise.

To verify the randomness of the pseudo-random sequence, and whether it meets the demands of shuffling table generation, we carry out the following experiment. Taking the output of the pseudo-random sequence eight bits at a time, in order, gives a large set of numbers between 0 and 255. The experimental result shows that the distribution of byte values is close to uniform: each byte value from 0 to 255 appears with nearly the same probability. Therefore, we can generate well-randomized shuffling tables from the pseudo-random sequence. A pseudo-random sequence with good randomness can be generated from an m-sequence or a chaotic sequence. The procedure for generating dynamic shuffling tables is as follows:

a. Analyze the structure of the video bitstream, obtain the total number M of critical data blocks, then compute k, which must satisfy $2^{k-1} < M \le 2^k$;
b. Generate k bits from the output of the pseudo-random sequence and treat them as a new random number;
c. Repeat the above step M times to obtain the other random numbers;
d. Map the M random numbers to the range [1, M]. If two or more random numbers are mapped to the same value, re-assign unused values, in a pre-determined order, to the conflicting entries;
e. Form a shuffling table from these M random numbers;
f. Repeat the above steps to obtain different shuffling tables, one for each video frame.
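Steps a to f can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: Python's `random.Random` stands in for the m-sequence or chaotic generator (it is not cryptographically secure), the mapping to [1, M] is taken to be `r mod M + 1`, and the "pre-determined order" for resolving conflicts is assumed to be ascending order. The names `bits_per_number` and `generate_shuffling_table` are hypothetical.

```python
import random


def bits_per_number(m):
    """Smallest k >= 1 such that 2**(k-1) < m <= 2**k (step a)."""
    return max(1, (m - 1).bit_length())


def generate_shuffling_table(m, next_bits):
    """Build a permutation of 1..m from k-bit pseudo-random numbers.

    next_bits(k) must return a k-bit integer; here it is an assumed
    stand-in for the pseudo-random sequence generator (PSG).
    """
    k = bits_per_number(m)
    raw = [next_bits(k) for _ in range(m)]       # steps b and c

    table = [None] * m
    used = set()
    for i, r in enumerate(raw):                  # step d: map to [1, m]
        value = r % m + 1                        # assumed mapping rule
        if value not in used:
            table[i] = value
            used.add(value)

    # Conflicting entries receive the unused values in ascending order
    # (the "pre-determined order" is an assumption of this sketch).
    spare = (v for v in range(1, m + 1) if v not in used)
    for i in range(m):
        if table[i] is None:
            table[i] = next(spare)
    return table                                 # step e


# Step f: one table per frame, all driven by the same seeded generator.
psg = random.Random(2024)
tables = [generate_shuffling_table(10, psg.getrandbits) for _ in range(3)]
```

Every table produced this way is a permutation of 1..M, so the inverse (anti-shuffling) table always exists; the same seed on the decryption side reproduces the same tables.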

3.3.2.1.2. Shuffling Table Application

The bitstreams are shuffled according to the shuffling tables by the following process:

a. Analyze the bitstream to be encrypted, obtain the critical data blocks to be shuffled, and number these critical data blocks from 1 to M;
b. Generate a shuffling table of M random numbers;
c. Re-arrange the critical data blocks in the order specified by the shuffling table: critical data block i is moved to the position given by the i-th random number of the shuffling table.
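The re-arrangement in step c, and the anti-shuffling used for decryption, can be illustrated as follows. The sketch treats each critical data block as an opaque item in a list and uses a hand-written table; the function names are hypothetical, and how blocks are cut from and returned to a real bitstream is not modelled.

```python
def shuffle_blocks(blocks, table):
    """Move block i (1-based) to the position given by table[i - 1]."""
    out = [None] * len(blocks)
    for i, position in enumerate(table):
        out[position - 1] = blocks[i]
    return out


def anti_shuffle_blocks(shuffled, table):
    """Undo shuffle_blocks: block i is read back from position table[i - 1]."""
    return [shuffled[position - 1] for position in table]


blocks = ["A", "B", "C", "D", "E"]
table = [3, 1, 5, 2, 4]                  # example table, a permutation of 1..5
cipher = shuffle_blocks(blocks, table)   # ['B', 'D', 'A', 'E', 'C']
plain = anti_shuffle_blocks(cipher, table)
```

Because the table is a permutation, no block is lost or duplicated, which is why the number of blocks in the stream is preserved.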

3.3.2.2. Video Encryption Approach and Critical Blocks of Bitstreams

3.3.2.2.1. Video Encryption Approach

The basic framework of the video encryption approach presented in this section is shown in Figure 12. From the implementation point of view, the basic process is as follows: the compressed video bitstream is input; the information about each critical data block is obtained by analyzing the bitstream structure; the critical data blocks are then spatially shuffled according to the dynamically changing shuffling tables; finally, the shuffled critical data blocks are returned to the original bitstream.

[Figure 12: block diagram with an encryption side and a decryption side connected through the Internet. Each side contains Cipher Seed, PSG, STG, VBSA and CDBR blocks; the encryption side contains SE and the decryption side contains ASD. The data paths are labeled #1 to #6.]

PSG: Pseudo-random Sequence Generator; STG: Shuffling Table Generator; VBSA: Video Bitstreams Structure Analysis; SE: Shuffling Encryption; ASD: Anti-shuffling Decryption; CDBR: Critical Data Block Return.

#1: video bitstreams input; #2: critical data block; #3: uncritical data; #4: encrypted bitstreams output; #5: encrypted bitstreams input; #6: decrypted bitstreams output.

Figure 12. Video encryption frameworks.

The only difference between encryption and decryption is that encryption uses shuffling tables for shuffling encryption (SE), whereas decryption uses anti-shuffling tables for anti-shuffling decryption (ASD).


3.3.2.2.2. Critical Data Blocks of Bitstreams

The basic idea of this approach is to spatially shuffle critical data blocks of the compressed bitstream. Shuffling critical data blocks selected from the compressed bitstream has no influence on the compression format and is secure against attacks. A ciphertext bitstream that maintains format compliance is then obtained by returning the critical data blocks to the positions specified by the shuffling table. To keep format compliance, we must first analyze the structure of the bitstream to know what would be affected by shuffling a given piece of information. The codec process is as follows: variable-length decoding of the DCT coefficients, inverse quantization, inverse DCT, obtaining the current macroblock according to its MV information, and finally obtaining the reference frame by ME (Motion Estimation). Therefore, the key information in a compressed video bitstream is mainly the DCT (Discrete Cosine Transform) coefficients and the MV (Motion Vector) codewords. They carry nearly all of the content and play a key role in decoding and reconstructing the video. In short, the DCT coefficients and MVs are essential to video frame reconstruction. The critical data blocks in this study are the MB (MacroBlock), the block, the MV value and so on. The simulation results show that shuffling only one type of critical data block cannot provide enough security, because some information can still be distinguished in the video picture. Choosing the right combination of shuffled data blocks, and organizing the combinations into security levels, can achieve good encryption performance.

Table 4. Security Level and Combination

Levels    Different Combination of Shuffling Units
Low       One type of critical data blocks only (such as MV or DC)
Medium    DC of I-frame and MV of PB-frame
          DC of I-frame and MB of PB-frame
          MB of I-frame and MV of PB-frame
          Block (blocks of all)
High      Macroblock (MB of all)

3.3.2.3. Security Level Design and Analysis

3.3.2.3.1. Security Level Design

Given the characteristics of video data, an encryption scheme that is to be practical must be judged by three metrics: speed, security, and bit rate overhead [34]. Speed and security conflict: to obtain a comparatively high security level we must shuffle more critical data blocks, which lowers the speed. As pointed out in [35], for many real-world applications such as pay-per-view, although the content data rate is very high, the monetary value of the bits is low; therefore, “very expensive attacks are not interesting to adversaries” and “hence, light-weight encryption algorithms which can provide sufficient security level and have an acceptable computation cost are attractive to MPEG video applications.” So the security level should be chosen according to the demands of the application. Different combinations of shuffled data blocks lead to different security levels. According to our experiments, the security levels and the corresponding combinations are as shown in Table 4.

3.3.2.3.2. Resultant Images of Different Security Level

Low Security Level: Only one type of critical data block is shuffled, so the security is not high but the speed is high. This level suits situations where the security requirement is modest but the real-time requirement is strict. For example, one of the major goals of content access control for entertainment purposes is to enable authorized users to view the video and to prevent unauthorized users from viewing it with satisfactory quality. Low-security-level encryption, as described in this section, is one feasible method for such access control.

Medium Security Level: The resultant image encrypted at the medium security level, shown in Figure 13(c-d), is satisfactory for many applications. Only very little information can be distinguished in the resultant image, and that information quickly becomes illegible as subsequent frames are refreshed. In short, the medium security level can meet general video encryption demands.

High Security Level: The high security level provides the strongest security. The resultant image is shown in Figure 13(e-f). Almost no information can be distinguished. This level suits situations such as top-secret communications, e.g., protection of military information.

3.3.2.3.3. Performance of Different Security Level

In summary, we have described new methods of performing selective encryption and of shuffling compressed bitstreams while preserving format compliance. We now test their performance and present some simulation results. To test the encryption speed, we measured the time cost of three modules: bitstream analysis, shuffling table generation and critical data block return. The sum of the three is the total time cost of spatial shuffling encryption. Table 5 shows the simulation results, in milliseconds. We can draw the following conclusions from Table 5: 1) Different security levels have different time costs, and the high security level may have a lower time cost than other security levels. 2) The three modules contribute differently to the total time cost; the generation of shuffling tables accounts for the largest share. 3) The time cost of bitstream analysis is very low.

Figure 13. Resultant images of different security levels: (a) original image; (b) resultant image of the low security level; (c) and (d) resultant images of the medium security level; (e) and (f) resultant images of the high security level.

Table 5. Time Cost of Three Modules (ms)

Levels   Bitstreams Analysis   Shuffling Tables Generation   Data Blocks Return   Total Time
High     0.76                  1.59                          0.62                 2.97
Mid 1    0.54                  1.72                          0.41                 2.67
Mid 2    1.04                  6.97                          0.63                 8.64
Mid 3    0.83                  4.48                          0.21                 5.52
Mid 4    0.69                  7.24                          0.55                 8.49
Low 1    0.49                  6.97                          0.69                 8.15
Low 2    0.69                  1.74                          0.27                 2.70
Low 3    1.04                  3.30                          0.56                 4.89

These conclusions can be interpreted as follows: 1) The total time cost of shuffling encryption depends on the number of critical data blocks to be shuffled and is not directly related to the security level. The more critical data blocks there are, the larger each shuffling table is, and the total time cost rises because more critical data blocks have to be shuffled. 2) The generation of the shuffling tables includes the generation and application of the pseudo-random sequence. In this study the process is simulated in software, so the time cost of shuffling table generation is high. In a practical implementation the process can be carried out in hardware, and the total time cost will fall sharply once the time cost of shuffling table generation is reduced. 3) In this section, only the syntax of the bitstream is analyzed when the compressed bitstream is encrypted by spatial shuffling. Bitstream analysis does not carry out image reconstruction, which includes IQ, IDCT, ME and MC, so its time cost is very low.
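The share of each module in the total can be checked directly from the figures in Table 5. The short script below only re-computes ratios from the published numbers; the variable and function names are illustrative.

```python
# Rows of Table 5: (bitstream analysis, table generation, block return, total), in ms.
TABLE_5 = {
    "High":  (0.76, 1.59, 0.62, 2.97),
    "Mid 1": (0.54, 1.72, 0.41, 2.67),
    "Mid 2": (1.04, 6.97, 0.63, 8.64),
    "Mid 3": (0.83, 4.48, 0.21, 5.52),
    "Mid 4": (0.69, 7.24, 0.55, 8.49),
    "Low 1": (0.49, 6.97, 0.69, 8.15),
    "Low 2": (0.69, 1.74, 0.27, 2.70),
    "Low 3": (1.04, 3.30, 0.56, 4.89),
}


def generation_share(rows):
    """Fraction of the total time spent generating shuffling tables, per level."""
    return {level: gen / total for level, (_, gen, _, total) in rows.items()}


shares = generation_share(TABLE_5)
for level, share in shares.items():
    print(f"{level}: {share:.0%} of the total time is table generation")
```

For every level in Table 5 the table generation share is above one half (roughly 54% for High up to about 86% for Low 1), which is consistent with conclusion 2 and with the remark that moving this step to hardware would reduce the total cost most.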

3.3.2.4. Performance Analysis

Trade-off between complexity and security: For multimedia content encryption, especially in real-time video communication, low processing overhead is an extremely important requirement. Because of the real-time constraint, it is important to select the most critical units to shuffle. Syntax compliance makes it easier to locate the different types of critical data blocks during bitstream analysis. In addition, this study presents diverse combinations of complexity and security. These combinations are divided into three security levels so that one can be selected conveniently according to the application demands.


Network Adaptation and Friendliness: Because some transmission channels offer no guarantees, on-line dynamic bandwidth adaptation usually has to be performed to guarantee quality of service (QoS) for multimedia applications. It is therefore desirable to make the encryption transparent to network adaptation processes such as transcoding or rate control. The spatial shuffling encryption approach adopted here maintains syntax compliance, so it does not change the characteristics on which network adaptation and rate control rely. The approach in this section is therefore fully friendly to network adaptation.

[Figure 14: flow diagram. The video plaintext passes through coding stream analysis, which extracts 4×4 PredMode, Trailing_ones, IPCM_MB_byte, Other_PredMode (last bit), quant (suffix), mvd (suffix) and Residual_level (variable-length suffix). The encryption algorithm Ei produces 4×4 PredMode', Trailing_ones', IPCM_MB_byte', Other_PredMode', Quant', Mvd' and Residual_level', which are recombined into the coding stream to give the video ciphertext.]

Figure 14. Principle of H.264/AVC self-adaptive selective encryption algorithm.

Security Analysis: Because of the good randomness of the pseudo-random sequence, the encryption approach presented in this study is equivalent to changing the cipher every time. According to the theory of cryptology, this is the most secure encryption approach. The approach presented here is therefore superior to other selective encryption methods that use a standard encryption algorithm. The dynamically changing shuffling tables are generated from a pseudo-random sequence, and different seeds give different pseudo-random sequences, so these seeds are the best encryption and decryption keys. The generation of pseudo-random sequences has been studied extensively, so we do not discuss it in this section.

Because of the special characteristics of real-time video communication, many aspects have to be considered. The basic idea of this approach is to spatially shuffle critical data blocks of the compressed bitstream in such a way that the resultant bitstream complies with the compression format. The ability to maintain format compliance provides one satisfactory solution to all of the requirements discussed above.

3.3.3. H.264/AVC Video Encryption Algorithm

The key information in an H.264 compressed stream, such as the quantization step, predictive mode, motion vectors and residual coefficients, is analyzed, and a corresponding format-compliant selective encryption scheme is proposed for each. A format-compliant residual block spatial permutation scheme is also proposed. Residual blocks are categorized into groups according to their nC values and their total numbers of non-zero coefficients. A random shuffling table is generated for each residual block group, and the residual blocks are permuted within it. The algorithm thus consists of two parts: selective encryption and residual block spatial permutation. The experimental results show that an H.264 compressed video stream can be encrypted in real time while the format is kept compliant and the stream size is unchanged.

3.3.3.1. Self-Adaptive Selective Encryption Algorithm for H.264/AVC

The principle of the H.264/AVC self-adaptive selective encryption algorithm is shown in Figure 14 [36]. The fixed-length code words, namely the three-bit predictive mode word of the 4×4 luminance block, the trailing ones of the residual block coefficients and the pixel values of I-PCM macroblocks (an intra-frame coding mode), are encrypted directly by a cryptographic algorithm. The intra-frame predictive mode words of the 16×16 luminance block and the chroma block, the quantization step and the motion vector difference are encrypted in some of their bits only. The number of encrypted bits in the level-suffix part of the residual coefficient amplitude is decided adaptively from the context.

3.3.3.1.1. Encryption of Quantization Step

The code words related to the quantization step in H.264 are coded individually with the Exp-Golomb method se(v). According to H.264, the k-bit suffix of an Exp-Golomb code word, which carries the actual information, is legal for any combination of 1s and 0s [37]. The suffix of the Exp-Golomb code word related to the quantization coefficient can therefore be encrypted selectively. However, the encrypted code words must still decode to values within the standard range, to preserve the compatibility of the video semantic format. According to Reference [37], the final coefficients used for decoding are brought into the legal range by taking the remainder modulo 52. So, for standard decoders with good fault tolerance, the suffix of the code word can be encrypted directly.
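The property used here, that any k-bit suffix yields a legal Exp-Golomb code word of the same length, can be seen in a small sketch. The code below builds ue(v) code words as bit strings, applies the standard se(v) sign mapping, and XORs only the suffix with key bits. The helper names, the bit-string representation and the example key bits are assumptions made for illustration; `wrap_qp` merely mirrors the modulo-52 remark above and is not a full model of the decoder.

```python
def ue_encode(code_num):
    """Unsigned Exp-Golomb code word: k zeros, a 1, then a k-bit suffix."""
    value = code_num + 1
    k = value.bit_length() - 1
    return "0" * k + format(value, "b")


def ue_decode(bits):
    """Inverse of ue_encode for a single code word."""
    k = bits.index("1")
    return int(bits[k:2 * k + 1], 2) - 1


def se_from_code_num(code_num):
    """Signed mapping of se(v): code numbers 0, 1, 2, 3, 4 -> 0, 1, -1, 2, -2."""
    magnitude = (code_num + 1) // 2
    return magnitude if code_num % 2 else -magnitude


def encrypt_suffix(bits, key_bits):
    """XOR the k-bit suffix with key bits; prefix and length are unchanged."""
    k = bits.index("1")
    prefix, suffix = bits[:k + 1], bits[k + 1:]
    if len(key_bits) < len(suffix):
        raise ValueError("not enough key bits for this suffix")
    flipped = "".join(str(int(b) ^ int(s)) for b, s in zip(suffix, key_bits))
    return prefix + flipped


def wrap_qp(previous_qp, delta):
    """Bring the resulting quantization parameter into 0..51 (modulo 52)."""
    return (previous_qp + delta) % 52


plain = ue_encode(5)                      # '00110', which se(v) reads as +3
cipher = encrypt_suffix(plain, "11")      # '00101', which se(v) reads as -2
assert len(cipher) == len(plain)
```

Applying `encrypt_suffix` again with the same key bits restores the original code word, so the same routine serves for decryption. The same sketch applies to any syntax element coded with se(v) or ue(v) whose full value range is legal.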

3.3.3.1.2. Encryption of Predictive Code Word

In H.264, intra-frame predictive coding has four predictive modes for the 16×16 luminance block, nine for the 4×4 luminance block and four for the chroma component [37].

a. For the 4×4 luminance block, the nine predictive modes are signalled in the H.264 coding stream by a one-bit flag word, prev-intra4×4-pred-mode-flag (flag), and a three-bit fixed-length code word, rem-intra4×4-pred-mode (mode) [37]. If flag=1, the predictive mode of the current block is the smaller of the modes of the left-neighboring block and the above-neighboring block, and the current block has no mode code word; if flag=0, the predictive mode of the current block is given by the mode code word that follows. So, if flag=0, this algorithm directly encrypts the three-bit predictive word rem-intra4×4-pred-mode.
b. For the 16×16 luminance block, the predictive mode is coded, together with the luminance CBP and chroma CBP, as the variable-length mb-type code word using the Exp-Golomb method ue(v) [37]. To keep the video format compatible, the value of CBP cannot be changed arbitrarily (the CBP value indicates how each sub-block of the luminance and chroma blocks is coded). Table 6 shows that, in H.264, the combination of luminance and chroma CBP is the same within every four rows. To maintain format compatibility and keep the code stream size unchanged, this algorithm groups every two rows together and directly encrypts the last bit of the variable-length code word in each group, thereby encrypting the predictive mode word.
c. The predictive mode word of the chroma block, intra-chroma-pred-mode, is coded individually using the Exp-Golomb method ue(v), as shown in Table 7. To maintain format compatibility, the whole word cannot be encrypted. For example, decoding gives a wrong result if encryption changes the suffix 00 of mode 3 into 01. So this algorithm encrypts only predictive modes 1 and 2, selectively encrypting the suffix (the last bit) of the code word in these two modes.
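Cases a and c can be sketched as follows. The chroma code words are taken from Table 7; the function names and the way key bits are supplied are assumptions for illustration only.

```python
# Code words of the chroma intra-frame predictive mode, as listed in Table 7.
CHROMA_CODEWORD = {0: "1", 1: "010", 2: "011", 3: "00100"}
CHROMA_MODE = {word: mode for mode, word in CHROMA_CODEWORD.items()}


def encrypt_chroma_mode(codeword, key_bit):
    """Flip the last bit only for modes 1 and 2; modes 0 and 3 pass through."""
    mode = CHROMA_MODE[codeword]
    if mode not in (1, 2):
        return codeword
    return codeword[:-1] + str(int(codeword[-1]) ^ key_bit)


def encrypt_rem_intra4x4(mode_word, key_bits):
    """XOR the three-bit fixed-length mode word; all eight values stay legal."""
    return (mode_word ^ key_bits) & 0b111


assert encrypt_chroma_mode("010", 1) == "011"      # Horizontal becomes Vertical
assert encrypt_chroma_mode("00100", 1) == "00100"  # Plane is left untouched
```

Restricting the chroma case to modes 1 and 2 keeps every output inside Table 7, which is the format-compatibility condition described in item c. The pairing of rows in Table 6 works in the same way: the two code words of a group differ only in their last bit.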


3.3.3.1.3. Encryption of Motion Vector Difference

The motion vector difference mvd is coded individually using the Exp-Golomb method se(v). The k-bit suffix that carries the actual information is valid for any combination of k bits. If the decoded mvd exceeds the legal range, it is brought back into the standard range according to the standard decoder protocol. So this algorithm encrypts the suffix of this code word directly.

3.3.3.1.4. Encryption of Residual Coefficient Relevant Information

This information consists of the trailing coefficient code word trailing-ones, the prefix of the non-zero coefficient amplitude, level-prefix, and the suffix of the non-zero coefficient amplitude, level-suffix. trailing-ones can be encrypted directly, but it carries little information and takes up a large proportion of the code stream. This algorithm therefore suggests not encrypting trailing-ones in the normal case, to reduce computational complexity. The coding and decoding of the context-based self-adaptive non-zero coefficient level is complex [38]. level-prefix is coded according to the variable-length code table of the protocol, so encrypting it directly would destroy format compatibility. levelSuffixSize, suffixLength and levelCode are the parameters used in coding. level-suffix is an unsigned integer whose length is levelSuffixSize. Usually the value of levelSuffixSize equals the value of suffixLength, but there are two exceptions: when the prefix is 14 and suffixLength is 0, levelSuffixSize is 4; when the prefix is 15, levelSuffixSize is 12. The variable suffixLength updates itself according to the context. After initialization, under the rules of Reference [38], a small change in level-suffix can cause a larger change in level. We can therefore encrypt level-suffix to encrypt the residual coefficients.

Table 6. Some Semantics of mb_type and Variable-Length Code Words in I Slices of H.264

mb_type  Type            Intra-frame 16×16   Chroma block  Luminance block  Variable    Group   Encryption
                         predictive mode     CBP           CBP              code word   number  bit
1        I-16×16-0-0-0   0                   0             0                010         1       Last bit
2        I-16×16-1-0-0   1                   0             0                011         1       Last bit
3        I-16×16-2-0-0   2                   0             0                00100       2       Last bit
4        I-16×16-3-0-0   3                   0             0                00101       2       Last bit
5        I-16×16-0-1-0   0                   1             0                00110       3       Last bit
6        I-16×16-1-1-0   1                   1             0                00111       3       Last bit
7        I-16×16-2-1-0   2                   1             0                0001000     4       Last bit
8        I-16×16-3-1-0   3                   1             0                0001001     4       Last bit
…        …               …                   …             …                …           …       …

Table 7. Semantics of Chroma Block Intra-frame Predictive Word and Variable-Length Code Word

Chroma block intra-frame predictive mode   Predictive mode   Variable code word   Encryption bit
0                                          DC                1                    none
1                                          Horizontal        010                  Last bit
2                                          Vertical          011                  Last bit
3                                          Plane             00100                none

To keep encryption compatible with the semantic format, we put forward the following self-adaptive encryption rules:

a. When prefix is 14 and suffixLength is 0, directly encrypt the 4-bit level-suffix.
b. When prefix is 15, directly encrypt the 12-bit level-suffix.
c. When suffixLength is not less than 6, suffixLength is no longer updated, so directly encrypt level-suffix.
d. Let thread denote the self-update threshold of suffixLength, thread = 3 << (suffixLength - 1), and let abs-diff denote the absolute value of the difference between level and thread. When abs-diff is bigger than (1 << k), we can encrypt the last k bits of level-suffix. This ensures that encryption does not affect the relationship between level and the threshold, and therefore does not affect the self-update decision for suffixLength, so the compatibility of the semantic format is maintained.
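Rules a to d amount to choosing how many trailing bits of level-suffix may be encrypted for each coefficient. The sketch below is one possible reading, not the authors' code: it assumes that abs-diff is taken on the magnitude of level, that the largest admissible k (capped at suffixLength) is chosen in rule d, and that nothing is encrypted when no suffix is present. The function name is hypothetical.

```python
def encryptable_suffix_bits(level_prefix, suffix_length, level):
    """Number of trailing level-suffix bits that may be encrypted (rules a-d)."""
    if level_prefix == 14 and suffix_length == 0:
        return 4                                   # rule a
    if level_prefix == 15:
        return 12                                  # rule b
    if suffix_length >= 6:
        return suffix_length                       # rule c: no further updates
    if suffix_length == 0:
        return 0                                   # no suffix bits to encrypt

    threshold = 3 << (suffix_length - 1)           # self-update threshold
    abs_diff = abs(abs(level) - threshold)         # assumed to use |level|
    k = 0
    # Largest k <= suffix_length with abs_diff > (1 << k); 0 if there is none.
    while k < suffix_length and abs_diff > (1 << (k + 1)):
        k += 1
    return k


# suffixLength = 2 gives threshold 6. A level of 1 is far from the threshold,
# so both suffix bits may be encrypted; a level of 5 is too close, so none are.
assert encryptable_suffix_bits(0, 2, 1) == 2
assert encryptable_suffix_bits(2, 2, 5) == 0
```

The closer a coefficient lies to the threshold, the fewer bits are encrypted, so the comparison that drives the suffixLength update gives the same answer before and after encryption.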

3.3.3.1.5. Encryption of I-PCM Mode Macroblock

In some special situations, H.264 can use an intra-frame coding mode called I-PCM. In this mode, the coder transmits the pixel values of the whole macroblock directly (first the 16×16 luminance block, then the two 8×8 chroma blocks). These values can therefore be encrypted directly. To reduce computational complexity, we choose to encrypt only the highest bit or a few high bits.

3.3.3.2. Residual Block Scrambling Algorithm for H.264/AVC

For secure streaming media applications, block spatial scrambling can be used as a practical video encryption approach. The scrambling approaches commonly used to encrypt H.263/MPEG-4 streams cannot be applied directly to H.264/AVC, because the neighboring macroblocks, residual blocks and codewords of H.264/AVC stream data are context-sensitive. A novel format-compliant residual block spatial scrambling algorithm for H.264/AVC stream encryption is proposed in this section [39]. In the algorithm, the residual blocks are categorized into groups and each residual block group uses a different random shuffling table, generated by a chaotic cryptosystem. The experimental results show that an H.264/AVC video stream can be encrypted in real time while the format is kept compliant.


3.3.3.2.1. H.264/AVC Residual Values Coding

In H.264/AVC, the total number of non-zero transform coefficient levels and the number of trailing-one transform coefficient levels are derived by parsing a given codeword. The VLC table selected for that codeword depends on a variable nC. The nC of the current residual block is derived from nA and nB: except for chroma DC coefficients, nA and nB are the values corresponding to the non-zero transform coefficients of the 4×4 block to the left of and the 4×4 block above the current block, respectively (see Figure 15). If the residual blocks were scrambled directly, two issues would arise. First, the changed nA or nB would yield an incorrect nC, so the wrong coding table would be looked up and the number of non-zero transform coefficients and the number of trailing-one transform coefficients would be decoded incorrectly; this breaks format compliance. Second, the number of non-zero transform coefficients derived from the permuted residual block would conflict with the corresponding macroblock_code_mode and the luma or chroma coded_block_pattern; this also breaks format compliance. In short, the scrambling approaches commonly used to encrypt H.263/MPEG-4 streams cannot be applied directly to H.264/AVC: they would lead to format incompatibility, because the neighboring MBs, residual blocks and codewords of H.264/AVC stream data are context-sensitive.

[Figure 15: the current block (nC) with its upper neighbor (nB) and its left neighbor (nA).]

Figure 15. Current residual values block nC is derived by nA and nB.
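The first issue can be made concrete with a small sketch. It assumes the averaging rule of H.264 CAVLC, nC = (nA + nB + 1) >> 1 when both neighbors are available, and the usual nC ranges for choosing the coeff_token table; the table labels and function names are placeholders for illustration, and `None` marks an unavailable neighbor.

```python
def derive_nc(n_a, n_b):
    """nC from the left (nA) and upper (nB) neighbors; None means unavailable."""
    if n_a is not None and n_b is not None:
        return (n_a + n_b + 1) >> 1
    if n_a is not None:
        return n_a
    if n_b is not None:
        return n_b
    return 0


def coeff_token_table(nc):
    """Placeholder label of the VLC table selected by nC."""
    if nc < 0:
        return "chroma DC table"
    if nc < 2:
        return "table for 0 <= nC < 2"
    if nc < 4:
        return "table for 2 <= nC < 4"
    if nc < 8:
        return "table for 4 <= nC < 8"
    return "fixed-length table"


# Swapping a left neighbor with 0 coefficients for one with 6 coefficients
# changes nC from 1 to 4, so the decoder would look up a different table.
before = coeff_token_table(derive_nc(0, 1))
after = coeff_token_table(derive_nc(6, 1))
assert before != after
```

This is why blocks can only be exchanged with blocks that leave every neighbor's nC unchanged, which motivates the grouping described in the next subsection.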

3.3.3.2.2. The Basic Principle of Algorithm

The analysis of the H.264/AVC codec characteristics shows that the decoding of adjacent blocks is significantly affected by the nC value of a residual block and by its number of non-zero coefficients. An adaptive encryption algorithm is therefore proposed. In the algorithm, the residual blocks are categorized into groups according to their total_coeff (number of non-zero coefficients) and nC value. Each group uses a different scrambling table, and finally all groups are scrambled together. The basic idea of the algorithm is as follows. For each coding slice, the residual blocks with the same total_coeff and the same nC value are placed in the same group; blocks that differ in total_coeff or nC fall into different scrambling groups. The number of scrambling groups corresponds to the number of combinations of total_coeff and nC value. Finally, the blocks of each scrambling group are permuted. The total_coeff of a 4×4 residual block lies in [0,16], giving 17 possible values. The nC value lies in [-1,16], giving 18 possible values. Therefore, if residual blocks with all possible attributes are to be scrambled, 17 × 18 scrambling groups are required. In different applications, depending on the confidentiality requirements, the number of scrambling groups may be reduced to lower the complexity of the procedure. For example, in most cases the nC value is -1, 0, 1, …, 8 and total_coeff lies within [0,9], so it suffices to take the 10 values in [-1,8] for nC and the 10 values in [0,9] for total_coeff. The classification into scrambling groups guarantees that scrambling takes place only within a group, which ensures that the encoded bitstream remains format compliant. The scrambling table of each group is generated from a random sequence; for example, it can be obtained by a shuffling algorithm from a pseudo-random sequence generated by a chaotic cryptosystem.

Original sequence:           1   2   3   4   5   6   7   8   9   10  ...
Residual blocks:             A1  B1  A2  C1  D1  B2  D2  B3  A3  C2  ...
Scrambling matrix:           3   8   1   10  7   6   5   2   9   4   ...
Scrambled residual blocks:   A2  B3  A1  C2  D2  B2  D1  B1  A3  C1  ...

Figure 16. An example of H.264/AVC residual blocks.


3.3.3.2.3. Algorithm Implementation

Because residual blocks are categorized into different groups, and each group is scrambled independently with its own scrambling table, group classification and block scrambling would have to be carried out repeatedly within a slice, which greatly increases the computational complexity. To reduce this complexity, all residual blocks in a slice are numbered together according to the order of their initial addresses in the slice. A scrambling table is then generated for each scrambling group from the random sequence produced by the chaotic cryptosystem; the size of each table equals the number of residual blocks in the group. The scrambling tables of all groups are integrated into one general scrambling table, and all residual blocks in the slice are then scrambled through this general table. This ensures that every residual block is permuted only within the scrambling group it belongs to, which is necessary for format compliance, and that all residual blocks are permuted in a single scrambling pass, which makes the algorithm efficient.

For example, in Figure 16, a section of the code stream in a slice is divided into four scrambling groups A, B, C and D. Group A contains three residual coefficient blocks {A1, A2, A3} and its scrambling table is {A2, A1, A3}. Group B contains three residual coefficient blocks {B1, B2, B3} and its scrambling table is {B3, B2, B1}. Group C contains two residual coefficient blocks {C1, C2} and its scrambling table is {C2, C1}. Group D contains two residual coefficient blocks {D1, D2} and its scrambling table is {D2, D1}. Each block is labeled with a number according to its order in the stream, which gives an initial numbered list. The initial numbered list and its corresponding block list are shown in the first and second rows of Figure 16. The order numbers of the member blocks of each group are then permuted according to that group's scrambling table, which yields the general scrambling table shown in the third row of Figure 16. Finally, the blocks in the stream are permuted through the general scrambling table to produce the ciphertext output shown in the fourth row.

Scrambling residual blocks perturbs the video image visually while keeping the stream format compliant. It can be used on its own, or as a complement to other selective encryption algorithms to improve confidentiality.
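The construction of the general scrambling table can be sketched in Python as follows. This is only an illustration under stated assumptions: the interleaving of the ten blocks in the stream is a hypothetical order (the real one is fixed by Figure 16), blocks are represented by their labels rather than by variable-length bit strings, and the function names are not taken from the original algorithm.

```python
def build_general_table(groups, group_tables):
    """Merge per-group scrambling tables into one table for the whole slice.

    groups[i] is the group label of the i-th residual block in stream order.
    group_tables[g][k] = n means: the k-th slot of group g receives the
    n-th block of that group (both counted in stream order, from 0).
    Returns a table such that output position p receives input block table[p].
    """
    slots = {}
    for index, label in enumerate(groups):
        slots.setdefault(label, []).append(index)
    table = list(range(len(groups)))
    for label, indices in slots.items():
        for k, n in enumerate(group_tables[label]):
            table[indices[k]] = indices[n]
    return table


def scramble(blocks, table):
    """Apply the general table in one pass over the slice."""
    return [blocks[src] for src in table]


def unscramble(blocks, table):
    """Invert the permutation on the receiver side."""
    original = [None] * len(blocks)
    for position, src in enumerate(table):
        original[src] = blocks[position]
    return original


# Hypothetical stream order for the ten blocks of the example (assumption).
stream = ["A1", "B1", "A2", "C1", "B2", "D1", "A3", "C2", "B3", "D2"]
groups = [name[0] for name in stream]
# {A2, A1, A3}, {B3, B2, B1}, {C2, C1}, {D2, D1} written as 0-based indices.
group_tables = {"A": [1, 0, 2], "B": [2, 1, 0], "C": [1, 0], "D": [1, 0]}

table = build_general_table(groups, group_tables)
cipher = scramble(stream, table)
```

Every slot that held a block of group A still holds a block of group A after scrambling, which is the property the text requires for format compliance; `unscramble(cipher, table)` restores the original order.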


Zhengquan Xu, Jing Sun and Jin Liu

3.3.3.3. Combination of Two Encryption Algorithms: AVCEA

AVCEA consists of self-adaptive selective encryption and residual block scrambling. Self-adaptive selective encryption can encrypt one or several kinds of key information, as discussed in Section 3.3.3.1. Residual block scrambling can be used alone, or as a supplement to selective encryption to enhance security. When the two are combined, the kinds of key information that are used in selective encryption and do not belong to the residual blocks can serve as the key seed for generating the random sequence from which the residual block scrambling tables are built. The advantage of this combination is that it makes the key self-synchronizing. The main information inside a residual block is the code words of the residual coefficients, so the quantization coefficient, the prediction mode words and the code words of the motion vector difference, which lie outside the residual blocks, can be used as the key seed and, at the same time, be encrypted by the block cipher.

Figure 17. Scheme of AVCEA. [Flowchart boxes: plaintext of the compressed video coding stream; key information extraction; record the position, length and other information of the residual blocks (slice as unit); key seed; block encryption; generate random sequence; write back the ciphertext of key information; group residual blocks according to nC and total_coeff; generate the corresponding scrambling table according to the number of residual blocks in each group; intermediate video ciphertext stream; implement the scrambling operation according to the scrambling table and the position and length of the residual blocks; final video ciphertext.]


The implementation of the combination of selective encryption and residual block scrambling is shown in Figure 17. First, the video coding stream is parsed to extract the position and length of each residual block, and the residual blocks are grouped according to their values of nC and total_coeff; at the same time, the key information is extracted as the key seed, which is used to generate the random sequence. Second, a scrambling table of the corresponding length is generated for every group that contains at least two residual blocks. Third, the key seed (the key information) is encrypted with the block encryption algorithm, and the ciphertext is written back into the original coding stream, which gives the intermediate video ciphertext produced by selective encryption. Finally, the intermediate video ciphertext is scrambled according to the length and position of the residual blocks in the coding stream and the scrambling table of each group, which gives the final video ciphertext.
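The grouping step and the generation of a scrambling table from a random sequence can be sketched in Python as below. The text does not say how the random sequence is turned into a table, so the Fisher-Yates shuffle used here is an assumption, as are the function names and the representation of each residual block by its (nC, total_coeff) pair. The modulo reduction introduces a small bias that a production design would have to address.

```python
from collections import defaultdict


def group_blocks(block_keys):
    """Group residual blocks by (nC, total_coeff).

    block_keys[i] is the (nC, total_coeff) pair of the i-th residual block
    in stream order. Only groups with at least two blocks are returned,
    because a single block cannot be scrambled.
    """
    groups = defaultdict(list)
    for index, key in enumerate(block_keys):
        groups[key].append(index)
    return {key: idx for key, idx in groups.items() if len(idx) >= 2}


def permutation_from_words(n, words):
    """Build a scrambling table of length n from keystream words.

    Fisher-Yates shuffle (an assumption, not the authors' stated method);
    `words` is any iterator of non-negative integers.
    """
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = next(words) % (i + 1)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


# Hypothetical slice with six residual blocks.
keys = [(2, 5), (0, 3), (2, 5), (4, 1), (0, 3), (2, 5)]
grouped = group_blocks(keys)
example_table = permutation_from_words(3, iter([7, 4]))
```

Both sender and receiver can rebuild the same tables as long as they derive the same keystream words from the key seed, which is what makes the scheme self-synchronizing.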

3.3.3.4. Performance Analysis

3.3.3.4.1. Visual Security

Three experimental schemes were applied to the foreman and mobile sequences: selective encryption; residual block scrambling; and the combination of residual block scrambling and selective encryption. The selective encryption scheme includes: encrypting only the quantization coefficients; encrypting only the prediction mode words (luminance and chroma); encrypting only the motion vector differences; encrypting only the residual coefficients (level_suffix and trailing ones); and jointly encrypting the four kinds of key information mentioned above. The experimental results for the two sequences are shown in Figure 18.

Figure 18 shows that with residual block scrambling alone the image becomes blurred and its colors confused; as the I-frame interval increases the concealing effect becomes stronger, but the outline of the image can still be distinguished. Residual block scrambling combined with selective encryption (part IV) and joint encryption of the four kinds of key information (part III) both achieve very good visual secrecy [5, 36]: in both I frames and P frames it is difficult to distinguish any content.

3.3.3.4.2. Semantic Format Compatibility

The encryption is carried out on the basis of maintaining semantic format compatibility. In principle the method does not encrypt format information, so it does not disturb the syntax and semantics of H.264 and is fully compatible with the H.264 standard. In the experiments, the cipher-streams encrypted from many H.264 sequences, such as foreman, mobile, and mother and daughter, were decoded smoothly by the decoder into cipher-images whose content cannot be identified.

Figure 18. Cipher-images of foreman sequence and mobile sequence. [Image panels: (I) foreman and mobile original frames, (a) original 1st frame (I) and original 37th frame (P) of each sequence; (II) selective encryption scheme, (b) encrypting only the quantization coefficients, (c) encrypting only the prediction mode words (luminance and chroma), (d) encrypting only the motion vector differences, (e) encrypting only the residual coefficients (level_suffix and trailing ones), (f) jointly encrypting the four kinds of key information; (III) scrambling residual blocks.]


3.3.3.4.3. Security Analysis

For the selective encryption part, the security of the method is based on the block cipher. For example, if the key information is encrypted with the internationally accepted IDEA cipher, an exhaustive known-plaintext attack needs about $2^{128}$ ($\approx 10^{38}$) encryption operations to recover the key [7]. For the residual block scrambling part, the security is based on the stream cipher; widely accepted stream ciphers such as A5, RC4 and SEAL can be used to ensure the security of this part [7]. In addition, attack analysis of video requires human judgment, which is several orders of magnitude slower than a machine, so the time needed for a ciphertext-only attack increases by orders of magnitude, which further enhances the security of the scheme.

3.3.3.4.4. Computational Complexity

Statistics were collected on the H.264 streams obtained by compressing foreman and other sequences. If all kinds of key information except the suffix of the residual coefficients are encrypted, the average ratio of encrypted data is about 7.57%. For the residual block scrambling part, the computational cost comes from the stream cipher that generates the scrambling table. The number of residual blocks in each slice is limited, and each slice needs only a few random bits from the random sequence to generate its scrambling tables, so the computational complexity of the residual block scrambling algorithm is very low.

Table 8. Ratio of encryption time to video encoding time (%)

Sequence                    Selective     Residual Block   Encryption
                            Encryption    Scrambling       (total)
Foreman.qcif                4.31          0.56             4.87
Mother and daughter.qcif    2.92          0.59             3.51
Mobile.qcif                 6.31          0.67             6.98
Foreman.cif                 5.11          0.60             5.72
Mother and daughter.cif     3.59          0.53             4.13
Mobile.cif                  7.12          0.73             7.85

Table 8 shows the encryption time relative to the H.264 encoding time (encryption of the quantization coefficients, prediction mode words and motion vectors, combined with residual block scrambling). The measured average delay per slice is less than 3 ms, so the encryption does not affect real-time H.264 encoding and transmission.


3.3.3.4.5. Effect on the Size of the Coding Stream

The encryption is independent of the video encoder, so it does not affect any characteristic of encoding and decoding, the picture quality or the compression ratio. In principle, the block cipher used in the selective encryption algorithm does not increase the length of the plaintext, and the residual block scrambling algorithm only changes the positions of blocks in the coding stream at random. In addition, the method does not need to transmit any additional information in the coding stream, so it does not increase the size of the coding stream.

3.4.4. Video Encryption Based on Chaos

Chaos-based encryption of digital video and images has been studied extensively at home and abroad, and several of the video encryption schemes described above also use chaotic sequences. Chaotic ciphers for video encryption have become a hot topic. Chaotic systems can produce chaotic sequences with excellent properties for video encryption. In practice, however, a chaotic system is implemented on a digital system with finite computing precision, which inevitably degrades the dynamical properties of chaos [40, 41] and thereby reduces the resistance of the chaotic sequence to attack. Video data are large in volume and demand high security, so the chaotic sequence used for encryption needs a longer period, higher complexity and better randomness, which makes the degradation problem particularly prominent. To address this key issue, a coupling structure of two chaotic generators has been proposed [42]. It effectively mitigates the degradation and yields chaotic sequences with a longer period and better properties, suited to the large data volume and high security demands of video.

3.4.4.1. Chaotic Sequences Generator Based on Coupled Logistic Maps

The major part of the design in this section is a newly proposed chaos-based pseudo-random keystream generator (PRKG) built on a pair of coupled chaotic systems. The structure of the PRKG is presented in Figure 19. The two logistic chaotic systems follow the same principle but use different initial values. In this design, the first logistic chaotic system generates random numbers that update the control parameter of the second whenever certain conditions are satisfied. The generator is described in the following.


(1) Quantification Method

A pseudorandom binary sequence is generated from the orbit of the logistic map

$$x_{n+1} = \mu x_n (1 - x_n) \tag{3}$$

for $x_n \in (0,1)$ and $\mu \in (3.569945, 4]$, where $\mu$ is the system control parameter and $x_0$ is the initial condition. Depending on the value of $\mu$, the dynamics of the system can change dramatically. Choosing $\mu$ in the range above places the system in its chaotic regime, and the output chaotic sequence $\{x_n\}$ has good randomness [43, 44]. A simple way of turning a real number $x_n$ into a discrete bit symbol $X_n$ is given by Equation (4): the fractional part of the real number is written in binary and $L$ bits are extracted from it, so that $X_n = \{b_{n1}, b_{n2}, b_{n3}, \ldots, b_{nL}\}$. The binary representation $X_n$ can be turned back into its real representation $x_n$ by the reverse operation of Equation (4).

$$x_n = 0.b_{n1} b_{n2} \ldots b_{nL} = 2^{-1} b_{n1} + 2^{-2} b_{n2} + \cdots + 2^{-L} b_{nL} \tag{4}$$
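The conversion of Equation (4) and its reverse can be sketched in Python as follows. The function names are illustrative, and the sketch assumes that the extracted bits are the leading $L$ bits of the binary fraction, as the right-hand side of Equation (4) suggests.

```python
def real_to_bits(x, nbits):
    """Return the first nbits binary digits of the fraction x in [0, 1)."""
    bits = []
    for _ in range(nbits):
        x *= 2.0
        bit = int(x)  # 0 or 1: the next binary digit
        bits.append(bit)
        x -= bit
    return bits


def bits_to_real(bits):
    """Reverse operation: sum of b_k * 2^(-k) for k = 1..len(bits)."""
    return sum(b * 2.0 ** -(k + 1) for k, b in enumerate(bits))
```

For instance, 0.625 is 0.101 in binary, so `real_to_bits(0.625, 4)` gives `[1, 0, 1, 0]`. Truncating to $L$ bits loses less than $2^{-L}$ of the original value.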

(2) Keystream Generator

The new generator uses two logistic maps to produce pseudorandom binary sequences:

$$x_{n+1} = \mu^{(1)} x_n (1 - x_n), \qquad y_{n+1} = \mu^{(2)} y_n (1 - y_n), \qquad n = 0, 1, 2, \ldots \tag{5}$$

for $x_n \in (0,1)$ and $\mu \in (3.569945, 4]$. Successive states of the first logistic map are evolved by $x_{n+1} = \mu^{(1)} x_n (1 - x_n)$ to obtain the real number $x_{n+1}$, which is turned into its binary representation $X_{n+1}$ by Equation (4). Suppose that $L = 45$, so that $X_{n+1} = \{b_1, b_2, b_3, \ldots, b_{45}\}$. Defining three variables whose binary representations are $X_l = b_1 \ldots b_{15}$, $X_m = b_{16} \ldots b_{30}$ and $X_h = b_{31} \ldots b_{45}$ respectively, the following equation is obtained.

$$X_{n+1}' = X_l \oplus X_m \oplus X_h \tag{6}$$

Suppose that $X_{n+1}' = \{p_1, \ldots, p_{15}\}$ after the XOR operation. $X_{n+1}'$ is turned into its real representation $x_{n+1}'$ by Equation (7). $x_{n+1}'$ is then multiplied by ten (Equation (8)) so that the result $x_{n+1}''$ can fall in $(3.569945, 4]$, the range in which the logistic map is chaotic. Next, the condition $3.569945 < x_{n+1}'' \le 4$ and $c \ge 100$ is checked, where the variable $c$ is the number of iterations since the last update of $\mu^{(2)}$. If the condition holds, $x_{n+1}''$ is used to update the parameter $\mu^{(2)}$ of the second logistic map (Equation (9)). In every iteration, $x_{n+1}'$ replaces the value $x_n$ of the previous iteration.

$$x_{n+1}' = 2^{-1} p_1 + 2^{-2} p_2 + \cdots + 2^{-15} p_{15} \tag{7}$$

$$x_{n+1}'' = x_{n+1}' \times 10 \tag{8}$$

$$\mu^{(2)} = x_{n+1}'' \qquad (3.569945 < x_{n+1}'' \le 4 \ \wedge \ c \ge 100) \tag{9}$$

The second logistic map performs the same operations except for Equation (8) and Equation (9). Successive states of the second logistic map are evolved by $y_{n+1} = \mu^{(2)} y_n (1 - y_n)$ to obtain the real number $y_{n+1}$, which is turned into its binary representation $Y_{n+1}$ by Equation (4). The value $Y_{n+1}'$ is then obtained by the XOR operation expressed in Equation (6). $Y_{n+1}'$ is the output binary sequence $z_i$; meanwhile, $Y_{n+1}'$ is turned into its real representation $y_{n+1}'$, which replaces the value $y_n$ of the previous iteration. Briefly, the algorithm can be expressed as follows:

1. $x_{i+1} = \mu^{(1)} x_i (1 - x_i)$, $y_{i+1} = \mu^{(2)} y_i (1 - y_i)$, $i = 0, 1, 2, \ldots$
2. $x_{i+1} \rightarrow X_{i+1}$, $y_{i+1} \rightarrow Y_{i+1}$
3. $X_{i+1}' = X_l \oplus X_m \oplus X_h$, $Y_{i+1}' = Y_l \oplus Y_m \oplus Y_h$
4. $X_{i+1}' \rightarrow x_{i+1}'$, $Y_{i+1}' \rightarrow y_{i+1}'$
5. $x_{i+1}'' = x_{i+1}' \times 10$, $\mu^{(2)} = x_{i+1}''$ $(3.569945 < x_{i+1}'' \le 4 \ \wedge \ c \ge 100)$
6. $x_i = x_{i+1}'$, $y_i = y_{i+1}'$
7. $z_i = Y_{i+1}'$
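The seven steps can be put together in a short Python sketch. It is a reading of the description above under stated assumptions, not a vetted cipher: the function and parameter names are illustrative, the 45 bits are taken as the leading bits of the binary fraction, and the guard that replaces an all-zero word by the smallest nonzero state is an addition (without it the logistic map would stay at 0 forever).

```python
MU_LOW, MU_HIGH = 3.569945, 4.0  # range of the control parameter in the text
L_BITS, WORD = 45, 15            # L = 45 bits, folded into one 15-bit word


def fold(value):
    """Equations (4) and (6): take 45 fraction bits, XOR the three 15-bit words."""
    bits = int(value * (1 << L_BITS)) & ((1 << L_BITS) - 1)
    mask = (1 << WORD) - 1
    return (bits >> (2 * WORD)) ^ ((bits >> WORD) & mask) ^ (bits & mask)


def to_state(word):
    """Equation (7), plus a guard (assumption): never return the fixed point 0."""
    return max(word, 1) / (1 << WORD)


def keystream(mu1, x0, mu2, y0, n_words, min_gap=100):
    """Return n_words 15-bit output words z_i of the coupled generator."""
    x, y, c = x0, y0, 0
    out = []
    for _ in range(n_words):
        xw = fold(mu1 * x * (1.0 - x))      # steps 1-3, first map
        yw = fold(mu2 * y * (1.0 - y))      # steps 1-3, second map
        x, y = to_state(xw), to_state(yw)   # steps 4 and 6
        c += 1
        candidate = 10.0 * x                # step 5, Equation (8)
        if MU_LOW < candidate <= MU_HIGH and c >= min_gap:
            mu2, c = candidate, 0           # Equation (9): perturb the second map
        out.append(yw)                      # step 7
    return out


# Hypothetical key: (mu1, x0, mu2, y0).
words = keystream(3.99, 0.37, 3.77, 0.61, 500)
```

The same key always reproduces the same word sequence, which is what allows the decoder to regenerate the keystream. Because each state is reduced to 15 bits per iteration, the sketch should be treated as a structural illustration only.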

3.4.4.2. Video Encryption Scheme and Experiment

Here the H.264/AVC adaptive selective video encryption algorithm with format compliance described in the previous section is used: all kinds of key information in the H.264 bitstream are selected and encrypted with the stream cipher. Figure 20 shows the principle of encryption. Video decryption is the reverse process: the decrypted key information is written back into the code stream according to its recorded location and length. The key stream generator and the cipher algorithm are the same for encryption and decryption. The decryption scheme is shown in Figure 21.
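The write-back of encrypted key information by recorded location and length can be illustrated with a minimal Python sketch. It simplifies the real scheme considerably: fields are assumed to be byte-aligned (in an H.264 stream they are bit fields inside code words), the cipher is a plain XOR with the keystream, and the function name and the field list are hypothetical.

```python
def xor_fields(stream, fields, keystream):
    """XOR the selected fields of a byte stream with successive keystream bytes.

    stream    : bytes of the coded video (plaintext or ciphertext)
    fields    : list of (offset, length) pairs recorded while parsing
    keystream : bytes from the key stream generator
    The same call decrypts, because XOR is its own inverse.
    """
    out = bytearray(stream)
    k = 0
    for offset, length in fields:
        for i in range(offset, offset + length):
            out[i] ^= keystream[k]
            k += 1
    return bytes(out)


plain = bytes(range(16))
fields = [(2, 3), (10, 2)]                      # hypothetical key-information fields
ks = bytes([0xFF, 0x0F, 0xF0, 0xAA, 0x55])
cipher = xor_fields(plain, fields, ks)
restored = xor_fields(cipher, fields, ks)
```

Bytes outside the recorded fields are untouched and the stream length does not change, so the format information around the key information is preserved.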

Figure 19. The keystream generator system. [Block diagram: the first logistic chaotic system, with initial value ($\mu^{(1)}$, $x_0$) and its own perturbation, supplies the replacement value of the control parameter of the second logistic chaotic system, with initial value ($\mu^{(2)}$, $y_0$) and its own perturbation, which outputs the binary sequence $z_i$.]


Figure 20. Video encryption scheme. [Block diagram: frame data (frames n+1, n) enter the encryption algorithm; the key stream generator, seeded by the initial key and under synchronous control, supplies the key stream; frame ciphertext (frames n-1, n-2) is sent over the transmission channel.]

Figure 21. Video decryption scheme. [Block diagram: frame ciphertext (frames n-1, n-2) arrives from the transmission channel and enters the decryption algorithm together with earlier frame ciphertext (frames n-3, n-4); the key stream generator, seeded by the initial key and under synchronous control, supplies the key stream.]

Here the 1st, 145th and 286th frames of the original sequence are used to study the effect of encryption. The performance evaluation of the adaptive selective encryption algorithm was given in the previous section and is not repeated here.


Chapter 4

VISUAL SECURITY ASSESSMENTS FOR VISUAL MEDIA

Video data security is very important for multimedia applications such as video surveillance and real-time videoconferencing. Although many video encryption methods have been reported, no systematic means of visual security assessment has been proposed. At present, security analysis of video encryption emphasizes only the encryption algorithm itself; however, the intelligibility of the encrypted video also needs to be investigated. Typically, an objective assessment is required to measure how much the video information in cipher-videos is distorted, shuffled and made unrecognizable. This chapter focuses on methods for evaluating the visual security of cipher-images.

Figure 22. Video security effects: (a) original; (b) encrypted.


Figure 23. Original images and cipher images. (a) is the original image, (b) is the reconstruction image, (c) is the cipher-image, (d-f) are cipher-images of the image produced by different encryption algorithms.

4.1. DEFINITION

Compressed bitstreams of images and videos become cipher-bitstreams when they are encrypted by selective encryption algorithms [45] that maintain bitstream format compatibility. If cipher-bitstreams are fed directly to a standard decoder and decoded without decryption, the resulting images are called cipher-images. Compared with the original image (Figure 23(a)), both the pixel values and the neighborhood distribution of the cipher-image (Figure 23(c)) have changed. For the same original image, however, different encryption algorithms produce different cipher-images (Figure 23(d-f)), in which the pixel values and the neighborhood distribution change to different degrees, so that the degree of unrecognizability differs greatly. The degree to which cipher-images are unintelligible to the human visual system is called Visual Security.

4.2. VISUAL SECURITY ASSESSMENT METHODS

Visual security assessment is a necessary part of the performance analysis of image and video encryption algorithms. Performance analysis based on cryptanalysis can establish, in theory, how hard it is for an attacker to break the encryption algorithm, but it cannot indicate the degree of visual security of the cipher-images. Video encryption requires both cryptographic security and visual security. Current security assessment methods for image and video encryption can be divided into three kinds: assessment based on cryptographic analysis, assessment based on subjective evaluation, and assessment based on video quality assessment.

4.2.1. Assessment Based on Cryptographic Analysis

Assessment based on cryptographic analysis uses cryptanalysis theory to analyze quantitatively the possibility of deciphering encrypted visual media. Reference [46] gives the possibilities of ciphertext-only, known-plaintext and chosen-plaintext attacks in the security analysis of encryption algorithms, and quantifies the computational complexity of a ciphertext-only attack by exhaustive search. Reference [47] presents two quantitative cryptanalytic findings on the performance of ciphers against plaintext attacks, based on a general model of permutation-only multimedia ciphers. From other perspectives, References [48, 49] also use cryptanalysis to analyze the possibility and complexity of an attacker successfully breaking the encryption algorithms. Assessments of this type, which analyze the security of encryption algorithms by means of cryptanalysis, are adopted in most performance analyses of visual media encryption algorithms.

4.2.2. Assessment Based on Subjective Evaluation

Assessment based on subjective evaluation analyzes the security of an encryption algorithm subjectively, by judging how unrecognizable the cipher-images decoded directly from the cipher-bitstreams are. After introducing its encryption algorithm, Reference [50] directly presents six cipher-images from three video test sequences to show that the algorithm distorts the visual information of the images and that the cipher-images are unrecognizable enough to meet the need for visual security. Similarly, Reference [51] presents more cipher-images, analyzes their degree of unrecognizability subjectively, and then compares the visual security of the cipher-images obtained with different encryption methods. Subjective assessment of the visual security of cipher-images can be influenced by the measuring environment and by subjective sensation. Moreover, because it is slow and costly, security analysis based only on subjective assessment is not very feasible in practical applications; cryptanalysis is also needed. Subjective assessment combined with cryptanalysis has already been applied extensively in visual media security evaluation, especially in the security evaluation of image encryption algorithms.

4.2.3. Assessment Based on Video Quality Assessment

Assessment based on video quality assessment theory is an objective security evaluation method, for which there are few studies or applications so far. Many video quality assessment methods exist, among which the method based on the peak signal-to-noise ratio (PSNR) is widely applied because it is easy to implement and has low computational complexity. At present, a few papers on visual media encryption use the PSNR value to analyze how unrecognizable the cipher-images are when evaluating an encryption algorithm. Reference [52] presents cipher-images for subjective assessment and, at the same time, analyzes the security of the cipher-images according to their PSNR values. It points out that the lower the PSNR value, the greater the difference between the cipher-images and the original images, the lower the intelligibility of the cipher-images, and hence the better the security level. Reference [53] gives the cipher-images and the curve of the PSNR value, and analyzes how recognizable the cipher-images obtained with different kinds of encryption algorithms are.
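For reference, the PSNR of an 8-bit cipher-image against its original can be computed with the standard definition, $\mathrm{PSNR} = 10 \log_{10}(255^2 / \mathrm{MSE})$. The Python sketch below assumes that both images are given as flat lists of gray levels of equal length; the function name is illustrative and the cited references may compute PSNR per color channel or per frame.

```python
import math


def psnr(original, cipher, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = sum((a - b) ** 2 for a, b in zip(original, cipher)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


small_error = psnr([0, 0, 0, 0], [1, 1, 1, 1])        # MSE = 1
large_error = psnr([0, 0, 0, 0], [200, 200, 200, 200])
```

A larger pixel error gives a lower PSNR, which is the sense in which a low PSNR is read as a sign of a less intelligible cipher-image.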

4.2.4. Visual Security Assessment vs Video Quality Assessment

Evaluating the security level of cipher-images differs from the video quality assessment of a video codec because the two have different research objects and goals. Video quality assessment measures the distortion introduced by lossy compression in a video codec. It reflects only the accumulated error between the original image and the reconstructed image, and is meant for images that differ little after compression and reconstruction. The aim of visual security assessment is to measure how much the video information in cipher-videos is distorted, shuffled and made unrecognizable; that is, visual security assessment emphasizes how unidentifiable the cipher-images are. Cipher-images change greatly not only in pixel values but also in spatial distribution. Therefore visual security assessment is different from video quality assessment, traditional video quality assessment algorithms are not appropriate for evaluating the security level of cipher-videos, and new objective assessment methods are needed.

4.3. INTRODUCTION TO THREE KINDS OF OBJECTIVE VISUAL SECURITY ASSESSMENTS

In this section, three visual security assessment methods for video encryption are presented [54, 55], based on structure distortion, image entropy and spatial correlation. Experimental results are presented and analyzed. They show that the scheme provides an objective assessment consistent with subjective assessment, and that it is also suitable for the security assessment of other selective video encryption algorithms.

4.3.1. Method Based on Structure Distortion

Wang et al. [56] presented a new video quality assessment method based on Structural Similarity (SSIM) in 2002. Because it uses the structural information of images, SSIM performs well on static images. In this section, structure distortion is derived from SSIM and used to assess the visual security of cipher-videos. Let $x$ and $y$ denote the original image and the cipher-image respectively; then the SSIM of the two images is

$$\mathrm{SSIM}(x,y) = l(x,y)\,c(x,y)\,s(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{10}$$

Here, $c_1$ and $c_2$ are two small constants that avoid the instability of a zero denominator, $l(x,y)$ is the brightness comparison function, $c(x,y)$ is the contrast comparison function, $s(x,y)$ is the structure comparison function, $\mu_x$ and $\mu_y$ are the means of image $x$ and image $y$ respectively, $\sigma_x^2$ and $\sigma_y^2$ are the variances, and $\sigma_{xy}$ is the covariance of the two images:

$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu_x)^2, \qquad \sigma_{xy} = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu_x)(y_i - \mu_y)$$

Equation (10) leads to the following conclusions. If the two compared images are identical, SSIM equals 1. The more the two images differ, the smaller SSIM is. For uncorrelated images SSIM approaches 0, which is treated here as its lower limit; a negative value arises only when the two images are negatively correlated. Therefore, the visual security measure of video encryption based on structure distortion is

$$\mathrm{SD} = 1 - \mathrm{SSIM} = 1 - \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{11}$$

The value of SD (Structure Distortion) indicates the degree of visual security. The larger the value of SD, the more disordered the pixels of the cipher-image are. Images with disordered pixels have higher visual security and cannot be recognized easily.
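Equation (11) can be evaluated directly, as the following Python sketch shows. It computes one global SSIM over the whole image, as the equations above are written, rather than averaging over local windows. The default values of `c1` and `c2` are the ones commonly used with 8-bit images and are an assumption here, since the text does not state them; images are passed as flat lists of gray levels.

```python
def structure_distortion(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """SD = 1 - SSIM for two equally sized images given as flat pixel lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    ssim = ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return 1.0 - ssim


original = [0, 255, 0, 255]
same = structure_distortion(original, original)
inverted = structure_distortion(original, [255, 0, 255, 0])
```

An identical image gives SD close to 0. The inverted image is negatively correlated with the original, so its SD exceeds 1, which illustrates the remark on negative SSIM values.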

4.3.2. Method Based on Local Image Entropy

Entropy indicates the uncertainty of an information source. It is the average amount of information over all the signals. Let $X$ be an information source with random signals $x_i$ whose probability is $p(x_i)$, $i = 1, 2, 3, \ldots, r$, $p(x_i) > 0$; then the entropy of $X$ is

$$H(X) = -\sum_{i=1}^{r} p(x_i) \log p(x_i) \tag{12}$$

For digital videos and images, the random signal $x_i$ of the information source $X$ is the gray level of an image pixel, $i = 0, 1, 2, \ldots, 255$. Different gray levels have different probabilities. If gray level $i$ has probability $p(x_i)$, the image entropy $E$ is

$$E = -\sum_{i=0}^{255} p(x_i) \log p(x_i) \tag{13}$$

Based on the theory of image entropy, we introduce partition entropy to describe the detail information of images. First, each image is partitioned into several small blocks, and the entropy of each block, called its partition entropy, is computed. Second, the partition entropies of the image are added up; the result is called the total image entropy. The calculation is as follows. Let $E_{ij}$ be the partition entropy of block $[i, j]$, calculated using Equation (13), and let $M$ and $N$ denote the number of rows and columns of blocks in the image. The total image entropy of each image is

$$E_{all} = \sum_{i=1}^{M} \sum_{j=1}^{N} E_{ij} \tag{14}$$

For visual security assessment of video encryption, the accumulated partition entropy describes well how unidentifiable the cipher-images are. As Figure 23 shows, the original image (a) contains areas of similar pixels, such as the hat, the background wall and the clothes, which make the partition entropy small. In cipher-image (c) or (d), however, encryption disorders the spatial distribution of the pixels, so the partition entropies increase and the total image entropy increases sharply. The increase of the total entropy of cipher-images indicates the change of visual security. The more disordered the pixels of a cipher-image are, the larger the entropy is. Images with larger entropy have higher visual security and cannot be recognized easily.

The precision of visual security assessment based on image entropy is related to the size of the partition blocks. Small partition blocks give good precision of image entropy and, in theory, a good description of visual security. However, the computational complexity of partition entropy increases when the partition blocks are too small. For visual security assessment of video encryption, we propose a partition block size of 8x8, because most video coding algorithms adopt an 8x8 DCT transform and blocks of this size follow the block boundaries of images in the video codec. The simulation results show that 8x8 partition blocks give good performance.
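The total image entropy of Equation (14) with 8x8 partition blocks can be sketched in Python as follows. The base of the logarithm is not given in the text, so base 2 (entropy in bits) is assumed; gray levels that do not occur in a block are skipped, and incomplete blocks at the right and bottom edges are ignored, which is also an assumption. The image is a list of rows of gray levels.

```python
import math
from collections import Counter


def block_entropy(pixels):
    """Equation (13) restricted to one block; base-2 logarithm (assumption)."""
    n = len(pixels)
    return sum(-(k / n) * math.log2(k / n) for k in Counter(pixels).values())


def total_entropy(image, block=8):
    """Equation (14): sum of the partition entropies of all complete blocks."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            pixels = [image[i][j]
                      for i in range(r, r + block)
                      for j in range(c, c + block)]
            total += block_entropy(pixels)
    return total


flat = [[100] * 16 for _ in range(8)]                    # two uniform blocks
mixed = [[100] * 8 + [0, 255] * 4 for _ in range(8)]     # uniform + two-level block
```

A uniform block contributes no entropy, while a block split evenly between two gray levels contributes one bit, so `total_entropy(mixed)` is larger than `total_entropy(flat)`.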

4.3.3. Method Based on Spatial Correlation

Definition A: Let $(i, j)$ and $(i+\alpha, j+\beta)$ denote two pixels in one image at distance $(\alpha, \beta)$ from each other, with pixel values $g_{i,j}$ and $g_{i+\alpha,j+\beta}$. Let the positive constant $m$ denote the difference of the pixel values (only the brightness of the pixel value is considered). Let

$$g(i,j,\alpha,\beta) = \begin{cases} 1, & |g_{i,j} - g_{i+\alpha,j+\beta}| \le m \\ 0, & |g_{i,j} - g_{i+\alpha,j+\beta}| > m \end{cases} \tag{15}$$

Then the two pixels are Similar if $g(i,j,\alpha,\beta) = 1$; otherwise the two pixels are not Similar.

Definition B: For each of the $(2d+1)^2$ points with $\alpha, \beta \in [-d, d]$, we determine whether it is similar to the center point $(i, j)$. Accumulating and normalizing the results, we obtain

$$f(i,j) = \sum_{\alpha,\beta \in [-d,d]} g(i,j,\alpha,\beta) \,/\, (2d+1)^2 \tag{16}$$

We call $f(i,j)$ the Correlation Coefficient of the pixel point $(i,j)$ on the rectangle with radius $d$.

Definition C: For an image of width $M$ and height $N$, let the positive constant $m$ denote the difference of the pixel values. We compute the Correlation Coefficient of each pixel, then accumulate and normalize the results to get

$$count_m = \sum_{(i,j) \in [M,N]} f(i,j) \,/\, (M \times N) \tag{17}$$


We call $count_m$ the m-level Spatial Correlation Coefficient of the image. By the definition of spatial correlation, the value of the spatial correlation coefficient depends not only on the spatial distribution of the image pixels but also on the values of d and m. Different values of d and m give different precision. For the rectangular radius d, a large value gives good precision of spatial correlation and, in theory, a good description of visual security. However, the computational complexity of spatial correlation increases when d is too large. For visual security assessment of video encryption, we propose a rectangular radius d of 3 or 4, because many video coding algorithms adopt an 8x8 DCT transform and a window of this size matches the block boundaries of images in the video codec. In this section, d is set to 3.

Different images have different spatial distributions of pixels, and their quantity of information also differs. Such differences exist within a single image as well. For example, in the original image (a) of Figure 23, the pixel values change slightly in the hat region, more in the background wall, and most in the face. Therefore the pixel value difference m can be assigned different values to satisfy different precision requirements. From the analysis of several video test sequences, three ranks of the pixel value difference m are designed.

High precision: High precision value of m represents the similarity of the image area in which the change of pixel value is smooth. For example, in the original image (a) of Figure 23, the hat region pixel value changes slightly, so we can set m a smaller value to get a higher precision. Selecting the pixel point (171, 45), and setting rectangular radius d as 3, can get (2d+1)2 pixel points for which where their pixel values are shown in Figure 24(a). Analyzing the pixel values, it can be seen that the value of pixel point (171, 45) is 236 and the value of most pixel points around (171, 45) are 235 or 236 with a difference span which does not exceed 2. Due to the discussion above, assigning pixel value difference m to 2 will assure us a higher precision. Medium precision: Medium precision value of m represents the similarity of most image regions. For example, in the original image

EBSCOhost - printed on 2/26/2021 12:56 PM via ISTANBUL TICARET NIVERSITESI. All use subject to https://www.ebsco.com/terms-of-use

64

Zhengquan Xu, Jing Sun and Jin Liu

4

5

6

(a) of Figure 23, the pixel value of the eyes (shown in Figure 24(d)), the nose and the background wall are closely similar to neighbor pixel on a changing trend, so we can set a medium value to get a medium precision. Selecting the pixel point (151, 146), and setting rectangular radius d as 3, can get (2d+1)2 pixel points for which their pixel values are shown in Figure 24(b). Analyzing the pixel values, we can see that the value of pixel point (151,146) is 64. The values of the pixel points around (151, 146) distribute in a wider span. The pixels in left eye of foreman distribute in the range of [61, 71], so the pixel value difference m can be set to 5. Due to the discussion above, assigning pixel value difference m to 5 will assure us a medium precision. Low precision: Low precision value of m represents the similarity of the image area in which the change of pixel value is large. As the areas at the top of the right collapsible shown in Figure 24(e), the pixel points distribute in a strip. The changes of pixel value within that area are in a wider span, and there are big differences with the pixel points outside these areas. Selecting the pixel point (141, 247), and setting rectangular radius d as 3, can get (2d+1)2 pixel points for which their pixel values are shown in Figure 24(c). Analyzing the pixel values, it can be seen that the value of pixel point (141,247) is 151. The changes of pixel value in that area are in a wider span, but the pixel points distribute at the edge of the collar distribute in one strip. The pixel values distribute between [140,160], so the pixel value difference m can be set to 10. Due to the discussion above, assigning pixel value difference m to 10 will assure us a low precision. It is necessary to set pixel value difference m to a bigger value in order to describe neighborhood similarity of complex images.

Through the above analysis, we designed three ranks of pixel value difference m: 2, 5, and 10. For different images, we should select the rank according to the precision demand. The high precision pixel value difference (m=2) is suitable for simple images, which carry less information. The low precision pixel value difference (m=10) is suitable for complex images, which carry more information. In many pictures, however, some regions are smooth and simple in structure while other regions contain much more texture information. We can adopt a weighted spatial correlation to describe such images, and adjust the weighted factors to meet the different demands of visual security assessment for different kinds of images.


Definition D: For an image, suppose the three-level spatial correlations are count_{m=2}, count_{m=5}, and count_{m=10} respectively, and the weighted factors a, b, and c satisfy a+b+c=1. We define

$$\mathrm{count} = a \cdot \mathrm{count}_{m=2} + b \cdot \mathrm{count}_{m=5} + c \cdot \mathrm{count}_{m=10} \qquad (18)$$

We then call count the Weighted Spatial Correlation Coefficient of this image. For an image or a video frame with width M and height N, its Neighborhood Similarity calculation process is as follows:

1) Choose an appropriate rectangular radius d and pixel value difference m based on the above analysis.
2) According to Definition B, calculate f(i, j), the Neighborhood Similarity of each pixel point (i, j) on the rectangle with radius d, by Equation (16).
3) According to Definition C, accumulate f(i, j) of each pixel (i, j) on the rectangle with radius d, and then get the Neighborhood Similarity Degree of the image or the video frame by Equation (17).

236 236 236 236 236 236 235
236 236 236 237 236 236 235
236 235 235 236 235 235 236
236 235 236 236 235 236 237
235 235 236 236 235 237 235
236 236 236 237 236 236 235
235 236 236 234 235 235 235

(a) High precision (m=2)

103 105 106 106 101  96  85
 99  94  87  77  72  68  66
 75  64  63  63  61  63  63
 70  68  66  64  63  65  65
 77  73  74  71  70  71  74
 87  82  79  79  84  84  85
 98  96  97  98  98  99  98

(b) Medium precision (m=5)

(c) Low precision (m=10): 7x7 pixel values around the pixel point (141, 247), ranging from 76 to 160

(d) Left eye of foreman

(e) Upside of right collar

Figure 24. Neighborhood characteristic of image.

For cipher-images produced by different encryption algorithms, we can obtain objective assessment results on visual security by comparing their Neighborhood Similarity Degrees. The larger the Neighborhood Similarity Degree of an image, the smaller its degree of distortion, the higher its recognizability, and the lower the visual security of the corresponding encryption algorithm. For video sequences, we calculate the Neighborhood Similarity Degree of each frame and plot the curve of the Neighborhood Similarity Degree for each encryption algorithm. Different encryption algorithms give different curves, and the smaller the Neighborhood Similarity Degree, the higher the visual security of the encryption algorithm used. By observing and comparing the changes in the curves, we can determine the visual security degree of the encryption algorithms.
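The calculation process above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact procedure: Equation (16) is not reproduced in this section, so f(i, j) is assumed here to be the fraction of neighbors inside the rectangle of radius d whose value differs from pixel (i, j) by at most m. The function names and the border handling (the rectangle is clipped at the image edge) are also illustrative choices.

```python
def neighborhood_similarity(image, d=3, m=2):
    """m-level spatial correlation of a grayscale image given as a list of rows.

    Assumption: f(i, j) is the share of neighbors within radius d whose
    value differs from pixel (i, j) by at most m.
    """
    height, width = len(image), len(image[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            center = image[i][j]
            similar = 0
            neighbors = 0
            # Clip the (2d+1) x (2d+1) rectangle at the image border.
            for y in range(max(0, i - d), min(height, i + d + 1)):
                for x in range(max(0, j - d), min(width, j + d + 1)):
                    if y == i and x == j:
                        continue
                    neighbors += 1
                    if abs(image[y][x] - center) <= m:
                        similar += 1
            total += similar / neighbors if neighbors else 0.0
    # Accumulate and normalize over all M x N pixels, as in Equation (17).
    return total / (width * height)


def weighted_similarity(image, d=3, weights=(0.8, 0.15, 0.05), levels=(2, 5, 10)):
    """Weighted combination of the three levels, as in Equation (18)."""
    return sum(w * neighborhood_similarity(image, d, m)
               for w, m in zip(weights, levels))
```

Under this assumed form of f(i, j), a perfectly flat frame scores 1, and a frame whose neighboring pixels differ strongly scores lower, which matches the reading that a smaller value indicates higher visual security.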


4.3.4. Results and Analysis

The cipher-videos used for visual security assessment in this section are produced by the algorithm in [33]. Reference [33] introduces a format-compliant selective video encryption method based on spatial shuffling. The method extracts key blocks from the compressed bitstream: intra-frame macroblocks (IMB), prediction-frame macroblocks (PMB), motion vectors (MV), and intra-frame DC coefficients (IDC). Shuffling these key blocks yields the cipher-videos shown in Figure 25. We use the foreman sequence as the video test sequence and the open-source Xvid codec as the video codec. The first row of Figure 25 shows the original video images; the other rows show IDC shuffling, MV shuffling, IMB shuffling, PMB shuffling, and MB shuffling respectively. Each row presents frames no. 48, no. 110, and no. 150 for one shuffling method.

Figure 26 is the security assessment graph of the cipher-videos based on Structure Distortion (SD); the x-axis is the frame number of the video sequence, and the y-axis is the value of structure distortion. The greater the value of SD, the higher the security and the less can be recognized from the cipher-videos. Figure 27 is the security assessment graph of the cipher-videos based on image entropy; the x-axis is the frame number, and the y-axis is the value of image entropy. The greater the value of image entropy, the higher the security and the less can be recognized from the cipher-videos. Figure 28 is the security assessment graph of the cipher-videos based on spatial correlation; the x-axis is the frame number, and the y-axis is the value of Spatial Correlation (SC). The smaller the value of SC, the higher the security and the less can be recognized from the cipher-videos. From Figure 25 and Figures 26-28, we can draw the following conclusions.

a) The three assessment methods (structure distortion, image entropy, and spatial correlation) perform well for visual security assessment. As shown in Figure 25, some information can easily be recognized under IDC shuffling, which has a low security level, whereas no information can be recognized under MB shuffling, which has a high security level. This conforms to the objective assessments of Figures 26-28.

b) The three assessment methods distinguish well between visual security levels. As shown in Figure 25, the visual security level of MB shuffling is better than that of PMB shuffling, and PMB shuffling is better than MV shuffling. The visual security level of PMB shuffling is close to that of MB shuffling. These minute differences are shown in Figures 26-28 and conform to the subjective assessment. In addition, the visual security level of MV shuffling is better than that of IDC shuffling but worse than that of MB shuffling. These obvious differences are also shown in Figures 26-28.

c) The three assessment methods give almost consistent results. Except for the result of IMB shuffling, all objective assessment results agree with the subjective assessment. The security levels of the encryption methods are ordered MB > PMB > MV > IDC. For IMB shuffling, the assessment based on structure distortion agrees best with the subjective assessment.

Figure 25. Result images of different encryption methods. Rows: ORG, IDC, MV, IMB, PMB, MB; columns: 48th, 110th, and 150th frames. Here, we only present three typical frames of each encryption method.
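The idea of shuffling key blocks can be illustrated with a short Python sketch. The details of the algorithm in [33] are not reproduced here, so this is only an assumed, generic keyed permutation: the names `keyed_permutation`, `shuffle_blocks`, and `unshuffle_blocks` are hypothetical, and Python's `random` module is used for brevity although it is not a cryptographically secure generator.

```python
import random


def keyed_permutation(n, key):
    """Return a permutation of range(n) determined by the key (illustrative only)."""
    order = list(range(n))
    # Assumption: random.Random is a stand-in; a real system would use a
    # cryptographically secure keyed generator.
    random.Random(key).shuffle(order)
    return order


def shuffle_blocks(blocks, key):
    """Reorder key blocks (e.g. macroblocks or motion vectors) with the keyed permutation."""
    order = keyed_permutation(len(blocks), key)
    return [blocks[src] for src in order]


def unshuffle_blocks(shuffled, key):
    """Invert shuffle_blocks when the same key is known."""
    order = keyed_permutation(len(shuffled), key)
    restored = [None] * len(shuffled)
    for dst, src in enumerate(order):
        restored[src] = shuffled[dst]
    return restored
```

Because only the order of the blocks changes, each block keeps its own coded form, which is consistent with the format compliance that the shuffling method is described as preserving.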


Figure 26. Visual security assessment based on structural distortion.

Figure 27. Visual security assessment based on image entropy.

Figure 28 is the security assessment graph of cipher-videos based on weighted spatial correlation. The weighted spatial correlation follows Definition D, with the weighted factors a = 0.8, b = 0.15, c = 0.05. Because the foreman video test sequence has a simple background that changes little, the weighted factor a is assigned a larger value, so that the high precision pixel value difference dominates and the result is likely to approach the subjective assessment. We also make a visual security assessment for cipher-videos of the mobile video test sequence. The mobile sequence has a complex structure and a diverse background, so the high precision weighted factor should be assigned a smaller value. The weighted factors are a = 0.5, b = 0.3, c = 0.2. The security assessment graph is shown in Figure 29. Compared against the cipher-videos of the mobile sequence, this graph gives a good assessment of the above encryption methods, which conforms to the subjective assessment.

Figure 28. Visual security assessment based on spatial correlation. Here, the weighted factor is as follows, a=0.8, b=0.15, c=0.05.

In video encryption research, whether a video encryption algorithm meets the needs of an application depends on the visual security of its cipher-videos. It is impractical to assess visual security by subjective assessment alone. The three visual security assessment methods for video encryption discussed in this study can serve as effective and efficient objective algorithms. The significance of putting forward the three visual security assessment methods together lies in the following aspects: 1) if all three assessments agree that an encryption algorithm is optimal in security, then this algorithm is the best of the compared algorithms in security; 2) if an encryption algorithm satisfies the subjective criterion of visual security assessment, then all encryption algorithms that the three assessments show to be better than it also satisfy the criterion of visual security. According to the result graphs, among the encryption algorithms in this study, those based on MB shuffling, PMB shuffling, and MV shuffling meet the demand of visual security. The results of IDC shuffling and IMB shuffling are not satisfactory, so these methods cannot be applied as independent encryption algorithms and should be used in combination with other key blocks.


The visual security assessment presented in this study takes the cipher-video image as its research object and is therefore independent of the video encryption algorithm. It can thus be applied to the visual security assessment of other video encryption algorithms and provide a reference for their performance analysis.

Figure 29. Visual security assessment based on spatial correlation. Here, the cipher-videos are from the mobile video test sequence, and the weighted factors are a=0.5, b=0.3, c=0.2.


Chapter 5

CONTENT-BASED MULTI-LEVEL ENCRYPTION AND MULTI-LEVEL AUTHORIZATION

5.1. DEFINITION AND SIGNIFICANCE

5.1.1. Concept of Content-Based

Visual media differs from an ordinary binary data code stream in that it contains standard syntax structure information and plenty of semantic information. The syntax structure information specifies all parameters and the status of the compressed code stream, and the semantic information describes the image or video objects in terms of color, texture, contour, movement, and so on. Semantic information expresses the majority of the image and video content, and it plays a decisive role when people analyze and recognize images. In this section, we define the semantic information in visual media as "content".

A content-based encryption method first fully analyzes the compressed code stream of the visual media and classifies the semantic information according to its importance. It then extracts the image or video data that play a decisive role in image reconstruction. Finally, it encrypts these key data in order to achieve content-based encryption protection.


5.1.2. Concept of Multi-Level Authorization

Currently, there are many image and video encryption algorithms, but the concept of multi-level authorization has not been reported. We put forward a new concept of multi-level authorization, which fulfills the function of "encrypt once, multi-level authorization". Specifically, it is described as follows:

1) During encryption of visual media, first analyze the compressed code stream and classify the information according to the importance of the image or video content, and then use encryption keys of different levels to accomplish the goal of "encrypt once".

2) During decryption of visual media, decide the extent of decryption according to the target user's security level, and thereby achieve "multi-level authorization". High-level users can decrypt all the ciphertext, so they can fully identify all the image and video information. Common users can decrypt only part of the ciphertext, so they can identify only part of the image and video information. Non-authorized users have no decryption key, so they can only see fully encrypted ciphertext images and video and cannot recognize any image or video information.

The method of multi-level authorization guarantees multi-level security control. With "encrypt once, multi-level authorization", visual media encryption is efficient and the data are easy to manage. It not only provides high-level, high-security encryption, but also meets ordinary application requirements such as free preview and copyright protection.
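The "encrypt once, multi-level authorization" idea can be sketched as follows. This is a toy illustration under our own assumptions: the level names A, B, and C, the function names, and the SHA-256 counter keystream are all hypothetical stand-ins chosen for brevity, not the encryption scheme used in this book and not a vetted cipher.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key plus counter (illustrative, not a vetted cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice restores the data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def encrypt_once(parts: dict, keys: dict) -> dict:
    """Encrypt each importance level of key information with its own key."""
    return {level: xor_with_key(data, keys[level]) for level, data in parts.items()}


def decrypt_for_user(cipher_parts: dict, user_keys: dict) -> dict:
    """Decrypt only the levels for which the user holds a key; the rest stay ciphertext."""
    return {
        level: xor_with_key(data, user_keys[level]) if level in user_keys else data
        for level, data in cipher_parts.items()
    }
```

A high-level user would hold all keys and recover every part, a common user would hold only some of them, and a non-authorized user with no keys would see the single ciphertext unchanged.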

5.1.3. Advantage of Content-Based Multi-Level Authorization

In this section, we put forward multi-level authorization based on content, whose advantages are as follows:

1) Content-based encryption preserves the syntax structure information of the visual media and encrypts only the "content", which enhances security. The syntax structure information of a compressed code stream consists of fixed fields that are public in the protocol or standard. Encrypting these fields would enable a known-plaintext attack and reduce the encryption security of the visual media.

2) Content-based encryption preserves the cipher stream's adaptability and fault tolerance in the network, because only the content of the original compressed stream is encrypted, while channel information, such as network synchronization and fault tolerance information, is not changed during the encryption process.

3) Content-based encryption ensures that applications of the cipher stream are not restricted. In storage-oriented or on-demand-oriented applications, the syntax structure information of the cipher code stream is unchanged, so the visual media do not need to be decrypted first to support playback, fast forward, fast backward, and other interactive features.

4) The cipher bitstream of multi-level encryption consists of only one code stream, so the ciphertext is easier to manage and distribute. This is very favorable for visual media distributors whose information processing capability is limited. Distributors only need to issue the same ciphertext, encrypted at the highest security level, to every user, and then distribute keys of different security levels to different users. With keys of different levels, users of different levels decrypt and obtain different plaintext, and the extent to which they can identify the content differs accordingly.

5) Multi-level authorization is conducive to flexible and effective control in visual media business applications. In video-on-demand, pay-TV, and other commercial applications, in order to attract customers to buy, it is often necessary to let unauthorized users play the code stream smoothly while the encryption protection keeps the viewing quality of the images or video unsatisfactory.


5.2. CONTENT-BASED MULTI-LEVEL ENCRYPTION AND AUTHORIZATION METHOD

5.2.1. Theoretical Model of Content-Based Multi-Level Authorization

Figure 30 gives the theoretical model of content-based multi-level authorization. In this model, the encryption and decryption of visual media are independent of the encoding and decoding. After the original image or video is imported, the first phase is compression encoding. The compressed data stream is then sent to the encryption module, which performs information extraction and multi-level encryption. The final phase obtains the encrypted ciphertext stream through code stream synthesis. The ciphertext stream can be sent directly to the network or stored locally. Decryption is the opposite operation of encryption. First, the cipher bitstream of the key information is analyzed and extracted from the ciphertext and then decrypted at multiple levels. After that, the decrypted content code stream and the non-content code stream are integrated by the code stream synthesis module. Finally, the synthesized decrypted code stream is transformed into a normal image by the visual media decoder.

We can view the visual effect of the encrypted visual media by sending the ciphertext, after multi-level encryption and code stream synthesis, to a standard decoder and decoding it without decryption, before the ciphertext is sent to the network or stored locally. The decoded images are the cipher-images of the visual media. Because the encryption idea is based on content, the syntax format of the cipher stream remains unchanged, so the standard decoder can decode the cipher stream successfully. The decoded cipher-images can then undergo visual security evaluation, which analyzes how much the encryption algorithm damages the visual information of the visual media. This provides evidence for evaluating the encryption effect and comparing encryption algorithms. The previous section introduced the method of evaluating visual security.

The basic modules associated with coding in the theoretical model include:

1) Coding: This module includes pre-processing, predictive coding, transform coding, and entropy coding, and outputs the compressed code stream after encoding. It is in fact a complete compression coding process. Pre-processing performs the component transformation of each frame (RGB color space to YUV color space), the amplitude shifting of pixel sample values (in JPEG), or the reversible color transformation (for standard lossless compression in JPEG2000), and divides each frame into macroblocks or sub-blocks (overlapping blocks in JPEG2000) to facilitate the subsequent predictive coding and transform coding. Predictive coding removes the correlation between neighboring pixels in each frame. The residual obtained after prediction is the difference between the predicted value and the actual value. The residual needs only a few bits to represent, which reduces the computational complexity and increases the data compression ratio. Transform coding and entropy coding complete the compression coding of the visual media. Transform coding converts spatial-domain coefficients to the frequency domain in order to concentrate the energy. Through run-length encoding and Huffman entropy coding, the frequency-domain coefficients are organized into a binary bitstream, which is the final compressed code stream.

2) Code stream synthesis: This module synthesizes the key information, the non-key information, and the structural information, so that the compressed code stream after encryption or decryption still conforms to the syntax format and the coding standard. Code stream synthesis is logically an independent process, but different data encryption algorithms call for different realizations. Since most of the key information to be encrypted is extracted in the form of bits, the extraction, encryption, and synthesis of the bitstream can be completed at the same time if a stream cipher is adopted. However, if a block cipher is used, the bit length and location of every piece of key information should first be saved during the extraction process. The extracted bits are then encrypted by the block cipher once their number reaches the block length. Finally, the encrypted bits are backfilled to their extraction positions in extraction order, which completes the stream synthesis.

3) Decoding: Decoding is the inverse process of encoding. Its basic operations include entropy decoding, inverse transform, prediction compensation, and post-processing. The operations of the various parts of the process are similar to those of encoding. The output of decoding is the reconstructed image.
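The block cipher case described under code stream synthesis (save the positions, encrypt full blocks, backfill in extraction order) can be sketched as follows. This is a minimal illustration under stated assumptions: the bitstream is modeled as a string of '0' and '1' characters, `toy_block_cipher` is a hypothetical XOR placeholder standing in for a real block cipher, and leaving a trailing incomplete block unencrypted is our own simplification.

```python
def extract_fields(stream: str, fields) -> str:
    """Collect key-information bits; fields is a list of saved (position, length) pairs."""
    return "".join(stream[pos:pos + length] for pos, length in fields)


def toy_block_cipher(block: str, key: str) -> str:
    """Placeholder for a real block cipher: bitwise XOR with the key (its own inverse)."""
    return "".join("1" if b != k else "0" for b, k in zip(block, key))


def encrypt_key_bits(bits: str, key: str) -> str:
    """Encrypt the collected bits block by block once a full block is available."""
    size = len(key)
    out = []
    for start in range(0, len(bits), size):
        block = bits[start:start + size]
        # Assumption: a trailing block shorter than the block length is left as is.
        out.append(toy_block_cipher(block, key) if len(block) == size else block)
    return "".join(out)


def backfill(stream: str, fields, new_bits: str) -> str:
    """Write the processed bits back to their extraction positions, in extraction order."""
    result = list(stream)
    cursor = 0
    for pos, length in fields:
        result[pos:pos + length] = new_bits[cursor:cursor + length]
        cursor += length
    return "".join(result)


def process_stream(stream: str, fields, key: str) -> str:
    """Extract, encrypt, and backfill; with the XOR placeholder this also decrypts."""
    return backfill(stream, fields, encrypt_key_bits(extract_fields(stream, fields), key))
```

Because the field lengths and positions are unchanged, the stream keeps its total length and all non-key bits, which is the property the synthesis step relies on to stay within the syntax format.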

Figure 30. Model of content-based multi-level authorization. Encryption path: Original images -> Encode (compression) -> Key information extraction (syntax information, non-key information, key information A, B, and C) -> Multi-level encryption -> MUX -> Net transport or storage. Decryption path: Cipher key information extraction (syntax information, non-key information, cipher key information A, B, and C) -> Multi-level decryption -> MUX -> Decode (decompression) -> Reconstructed images. Assessment path: Decode (decompression) -> Cipher-images -> Visual security assessment.

In addition to the above processes associated with coding, the theoretical model also includes three important components:

1) Content-based extraction of key information: This module analyzes the compressed data stream to identify the content and to determine the importance of the various kinds of information, so that it can provide the input for the subsequent multi-level encryption. At present, visual media generally need compression coding before transmission or storage, which transforms the time-domain, spatial-domain, or object-domain signal into a compression-domain code stream. The input of the content-based key information extraction process is the output stream of the compression encoder, and the key information extraction works in the compression domain. The key information analysis and extraction process is independent of encoding, which ensures that the theoretical model is universal: new technologies such as object-oriented coding can also achieve multi-level authorization on the basis of this model.

2) Multi-level authorization protection of visual media: After the content-based key information has been analyzed and extracted, multi-level data encryption keys can be used to apply multi-level encryption to information of different importance levels and thereby achieve multi-level authorization of visual media. As a result, the different key information has different effects on the visual media quality. The multi-level data encryption keys are generated by public key algorithms based on the idea of multi-level security.

3) Objective evaluation of visual security: A practical visual media encryption algorithm not only requires resistance to attack and analysis, but also needs to ensure that the cipher images of the visual media are not identifiable, which is called visual security. The better the visual security of the cipher images, the higher the security that the corresponding encryption algorithm gives to the visual media information.

The above three components are the core ideas of this section and form the technical route for realizing content-based multi-level authorization. The third component was introduced in detail in the previous section. The remaining two components are specified as follows.

5.2.2. Extraction of Key Information Based on Content

Key information extraction based on content is the first step of multi-level encryption. After the key information in the compressed code stream has been analyzed and extracted, it is encrypted by a high-security encryption algorithm to achieve content-based encryption protection. This part first analyzes the selection of the encryption point, that is, the location in the visual media coding process at which the key information is extracted and encrypted. It then introduces the analysis methods for the key information and gives examples of how to determine the key content information in visual media.

Figure 31. Selection of encryption points. Coding chain: Original image -> Prediction/Transform -> Quantization -> Entropy coding -> Compression code stream, with the candidate encryption points 1 to 4 marked along the chain.

5.2.2.1. Selection of Encryption Point

According to the relationship between the encryption process and the coding process, visual media encryption can be categorized as follows: 1) encryption of the original image pixels, 2) encryption of the transform coefficients after prediction and transform coding, 3) encryption within the entropy coding process, 4) encryption of the compressed code stream after entropy coding. The positions of these encryption points relative to the coding process are shown in Figure 31. For the visual media encryption in this section, we encrypt the compressed encoded data, so the encryption point is located after the entropy coding. In network-oriented applications, the compressed code stream is encrypted before network packaging and sending; in storage-oriented applications, the compressed stream is encrypted before being sent to storage.

5.2.2.2. Analysis of Key Information

The analysis and extraction process applied to the code stream after compression coding is very different from the decoding and playing process of the compressed code stream. As Figure 32 shows, decoding the compressed code stream requires variable-length decoding, inverse quantization, inverse transform, and so on, whereas encrypting the compressed code stream requires only variable-length decoding. Selectively encrypting the information obtained after variable-length decoding completes the encryption process. Because the encryption process avoids the heavier computation of inverse quantization and inverse transform, the complexity of selective encryption on the compressed stream is far lower than that of encoding and decoding, and the key information analysis and extraction process is highly efficient. At the same time, the content extracted for encryption does not contain syntax structure information, so backfilling after encryption does not change the compressed stream format.

The key information of a video compression code stream is mainly the DCT coefficients and the motion vectors. They represent almost all the information and play a key role in the reconstruction of the video image. Video decoding operates on macroblock units: it performs variable-length decoding (VLD) of the DCT coefficients, followed by inverse quantization (IQ) and the inverse transform (IDCT). Finally, it obtains the current macroblock by motion compensation, using the motion vector of this macroblock and the previous frame. Based on the above analysis of the syntax structure of video data, it can be concluded that DCT coefficients and motion vectors are very important to image reconstruction and contain almost all the video data information.

[Figure 32 diagram. Decoding path: compression codestream, VLC decoding, inverse quantization, prediction/transform, decoded image. Encryption path: compression codestream, VLC decoding, partial encryption, codestream backfill, compression ciphertext.]

Figure 32. Difference between decoding and encryption.
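
The encryption path of Figure 32 can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the stream is modeled as a list of (kind, value) fields, where "syntax" fields stand for headers and markers and "coeff" fields stand for content values obtained by variable-length decoding, and the SHA-256 based keystream is only a stand-in for a real cipher, not the algorithm used in this book.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from key (illustrative stand-in for a cipher)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def selective_crypt(fields, key: bytes):
    """XOR only the 'coeff' fields with a keystream and backfill them in place.

    fields: list of (kind, value) pairs with value in 0..255 (hypothetical layout).
    Syntax fields are copied unchanged, so the layout of the stream is preserved.
    XOR is its own inverse, so the same call also decrypts.
    """
    n_content = sum(1 for kind, _ in fields if kind == "coeff")
    ks = iter(keystream(key, n_content))
    return [(kind, value ^ next(ks)) if kind == "coeff" else (kind, value)
            for kind, value in fields]

stream = [("syntax", 0xB3), ("coeff", 17), ("coeff", 200),
          ("syntax", 0x00), ("coeff", 5)]
cipher = selective_crypt(stream, b"demo-key")
restored = selective_crypt(cipher, b"demo-key")
```

The sketch only shows that the syntax fields pass through untouched and that the content is recovered with the same key. It ignores the harder requirement that encrypted values remain valid codewords of the coding standard.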

(a) DC values are set to 128 (median)

(b) AC values are set to zero

(c) Original image

Figure 33. Significance analysis of DC coefficient and AC coefficient.

5.2.2.2.1. DC Coefficient and AC Coefficient

Both the DC coefficient and the AC coefficients affect image quality. In Figure 33(a), all DC values of the image are set to 128 (the median) and the AC values remain unchanged; the contours of the image can still be seen clearly. In Figure 33(b), all AC values are set to zero and the DC values remain unchanged; the average brightness and chroma of each macroblock can be seen. Therefore, both the DC values and the AC values have an impact on image quality: the AC coefficients determine the contours of the image, and the DC coefficient determines the average brightness and chroma of each macroblock. Encrypting the DC coefficient, the AC coefficients, or both can provide security. Hence the DC and AC coefficients can be used as key information for encryption.
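
The experiment of Figure 33 can be mimicked in the pixel domain without computing a DCT. The sketch assumes that the DC coefficient of a block is proportional to the block mean and that the AC coefficients carry the deviations from that mean. It also reads "DC set to 128" as forcing the block mean to mid-gray. The function names and the 2x2 sample block are illustrative only.

```python
def block_mean(block):
    """Mean pixel value of a block; the DC coefficient is proportional to it."""
    pixels = [p for row in block for p in row]
    return sum(pixels) / len(pixels)

def flatten_dc(block, level=128):
    """Keep the AC part (deviations from the mean) and force the mean to `level`."""
    m = block_mean(block)
    return [[p - m + level for p in row] for row in block]

def zero_ac(block):
    """Drop the AC part: every pixel becomes the block mean."""
    m = block_mean(block)
    return [[m for _ in row] for row in block]

block = [[10, 20], [30, 40]]      # mean is 25
dc_removed = flatten_dc(block)    # [[113.0, 123.0], [133.0, 143.0]]
ac_removed = zero_ac(block)       # [[25.0, 25.0], [25.0, 25.0]]
```

With the DC part flattened, the differences between neighbouring pixels (the contours) survive; with the AC part zeroed, only the average level of the block survives.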

5.2.2.2.2. Motion Vector (MV)

The motion vector used for motion compensation carries the motion information of the image and is very important for video reconstruction. If the motion vectors cannot be decoded correctly, the motion information of the video cannot be restored and the image quality drops rapidly. In Figure 34(d-f), the motion vectors of the Foreman sequence are set to zero before decoding; Figure 34(a-c) shows the corresponding original images. As Figure 34 shows, because the motion information is not restored, the video reconstruction is severely degraded. Therefore, the motion vector information can be used as key information for encryption and yields good encryption results.

(a) Original image (1st frame)

(b) Original image (155th frame)

(c) Original image (180th frame)

(d) MV are set to zero (1st frame)

(e) MV are set to zero (155th frame)

(f) MV are set to zero (180th frame)

Figure 34. Significance analysis of MV.
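
The effect of a lost motion vector can be seen in a one-dimensional toy example. The helper below is a sketch under simplifying assumptions (a single row of samples, integer motion vectors, no clipping); its name and parameters are hypothetical and not taken from any codec.

```python
def motion_compensate(reference, pos, size, mv, residual):
    """Rebuild one block: shifted reference samples plus the decoded residual."""
    start = pos + mv
    prediction = reference[start:start + size]
    return [p + r for p, r in zip(prediction, residual)]

reference = [0, 0, 50, 60, 70, 0, 0, 0]   # previous frame (one row)
current = [0, 0, 0, 0, 50, 60, 70, 0]     # the object moved two samples right
pos, size = 4, 3
true_mv = -2                               # block at 4 is predicted from position 2

# Residual as an encoder would compute it with the true motion vector.
predicted = reference[pos + true_mv:pos + true_mv + size]
residual = [c - p for c, p in zip(current[pos:pos + size], predicted)]

correct = motion_compensate(reference, pos, size, true_mv, residual)  # [50, 60, 70]
zeroed = motion_compensate(reference, pos, size, 0, residual)         # [70, 0, 0]
```

With the true vector the block is rebuilt exactly. With the vector forced to zero, the decoder predicts from the wrong position and the block is corrupted, which is the behaviour seen in Figure 34(d-f).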


5.2.2.2.3. Other Key Information

In addition to the DCT coefficients and motion vectors, different visual media coding standards also have their own specific content information. For example, in the H.264 standard, the prediction mode words and the quantization information may be considered key information; in the JPEG2000 standard, the wavelet packets of particular layers (such as resolution, component, waveband, sub-block) can also be used as key information. This information has a key effect on image reconstruction and can be studied further in the process of key information analysis.

To analyze and extract key information based on content, we need to analyze the format features of the different visual media in detail and extract from the compressed stream the key information that plays a key role in video reconstruction. The analysis of the characteristics of the visual media stream is therefore important. In general, the key information that can be encrypted includes DCT coefficients, motion vectors, etc. However, for each coding standard the compatibility of the encrypted code stream must be considered. For example, in JPEG, all DCT coefficients can be encrypted completely and the syntax structure of the cipher-stream remains unchanged after encryption; in MPEG4, the DC coefficients can be encrypted completely, but only some bits of the AC coefficients can be chosen for encryption; in H.264, the DCT coefficients cannot be encrypted as a whole, and only some residual bits can be chosen for encryption.

5.2.3. Frameworks of Multi-Level Authorization Protection of Visual Media

Figure 35 shows the basic framework of multi-level encryption authorization. To keep the block diagram simple, the figure shows only three types of extracted key information, which require high-, middle- and low-level encryption respectively. As shown, the three types of key information obtained by analysis and extraction are encrypted with the three level keys K1, K2 and K3, ordered from low to high. The cipher-stream of key information is then backfilled into the compressed stream by the code stream synthesis module. Figure 36 gives the basic framework of multi-level decryption authorization. As shown, the cipher-stream of key information is first analyzed and extracted, then decrypted using the three different level keys, and finally the plain stream of key information is returned to the position from which it was extracted.

[Figure 35 diagram. Multi-level encryption: the analysis and extraction module yields key information A, B and C; each is encrypted with one of the level keys K1, K2 and K3 into Cipher A, Cipher B and Cipher C, which are merged by codestream synthesization. Multilevel key generation: the public key pair generator produces SK and PK, and from the initial key K0 the keys K1 = Esk(K0), K2 = Esk(K1) and K3 = Esk(K2) are derived. Multi-level authorization: low-level users receive PK and K1, middle-level users PK and K2, high-level users PK and K3.]

Figure 35. Basic framework of multi-level encryption module.

[Figure 36 diagram. Multi-level decryption: the analysis and extraction module yields Cipher A, Cipher B and Cipher C, which are decrypted with the level keys K1, K2 and K3 into key information A, B and C and merged by codestream synthesization. Each user level holds PK and its own key: low-level users K1, middle-level users K2, high-level users K3. Lower-level keys are derived as K2 = Dpk(K3) and K1 = Dpk(K2).]

Figure 36. Basic framework of multi-level decryption module.

The three levels of data encryption keys K1, K2 and K3 are generated from the initial key K0 by a public-key encryption algorithm acting as a one-way function, and are used to encrypt and decrypt the three levels of key information. The specific process is as follows.

In the encryption module, the public key pair generator first generates a key pair consisting of the private key SK and the public key PK. The private key SK is then used as the key of the encryption function Esk. Applying Esk repeatedly to the initial key K0 generates the multi-level data encryption keys in order from low to high: K1 = Esk(K0), K2 = Esk(K1), K3 = Esk(K2). These three levels of data encryption keys are used to encrypt the three levels of key information. Finally, the public key PK and the three levels of data encryption keys are assigned to the three levels of users, which realizes multi-level encryption authorization.

In the decryption module, each user holds the public key PK, which serves as the decryption key, together with the data encryption key of his or her own level (K1, K2 or K3). Applying the decryption function Dpk to that key yields the data decryption keys of the levels below it (K2 or K1), and the user then uses these keys to decrypt the corresponding key information.

Because of the one-way property of the public-key cryptosystem, the encryption function Esk can encrypt K0 to generate the three level keys in order from low to high (K1, K2 and K3), whereas the decryption function Dpk can start from any one of K3, K2 or K1 and iteratively generate the keys in order from high to low. For example, K2 = Dpk(K3) and K1 = Dpk(K2). Therefore, low-level users, who do not have the private key SK, cannot obtain the high-level data decryption keys and thus cannot decrypt the ciphertext of high-level key information; they can only use the public key PK to compute the lower-level data decryption keys and decrypt the lower-level key information.
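
The key chain described above can be sketched with textbook RSA. The parameters below are a small classroom example and an assumption of this sketch: the text does not specify which public-key algorithm, key size or padding is used, and unpadded RSA with such small numbers is not secure. The function names are hypothetical.

```python
# Toy RSA parameters (textbook example values; far too small for real use).
P, Q = 61, 53
N = P * Q            # modulus, 3233
E = 17               # public exponent, part of PK
D = 2753             # private exponent, part of SK; E * D = 1 mod (P-1)(Q-1)

def e_sk(key: int) -> int:
    """E_SK: move one level up the chain using the private exponent."""
    return pow(key, D, N)

def d_pk(key: int) -> int:
    """D_PK: move one level down the chain using the public exponent."""
    return pow(key, E, N)

def generate_level_keys(k0: int, levels: int = 3):
    """Return [K1, ..., K_levels] with K_i = E_SK(K_{i-1})."""
    keys, k = [], k0
    for _ in range(levels):
        k = e_sk(k)
        keys.append(k)
    return keys

def keys_for_user(own_key: int, level: int):
    """Derive every key at or below `level` from the user's own key and PK."""
    keys = {level: own_key}
    for lower in range(level - 1, 0, -1):
        keys[lower] = d_pk(keys[lower + 1])
    return keys

k1, k2, k3 = generate_level_keys(k0=1234)
high_user = keys_for_user(k3, level=3)     # holds K3, derives K2 and K1
middle_user = keys_for_user(k2, level=2)   # holds K2, derives K1 only
```

A user who holds K2 and PK can walk down to K1, but walking up to K3 would require the private exponent, which only the key generator holds.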

5.3. MULTI-LEVEL ENCRYPTION AND AUTHORIZATION MODEL BASED ON REMOTE SENSING IMAGES

With the development of remote sensing technology and networks, the widespread civilian use of spatial information has drawn increasing attention to the security of remote sensing images. Because remote sensing data are large in volume yet must be transmitted or accessed in real time, this section introduces a scheme for authorizing the use of remote sensing images distributed over the Internet, based on multi-rank security [57]. The same encrypted remote sensing image data are distributed to users of different ranks, but users with different authorizations obtain information of different degrees of importance after decryption with their own decryption keys. Experiments and theoretical analysis show that the scheme offers high confidentiality and high computing efficiency, and can meet the requirements of practical applications.

5.3.1. Data Characteristics of Remote Sensing Image and Extraction of Confidential Region

Remote sensing image data are huge in volume; encrypting all of the information would consume a great deal of time and processing power and would hinder the application of the images. A region that needs to be kept confidential is called a confidential region. In the vast majority of remote sensing images, confidential regions account for only a small percentage of the entire image. Suppose that a remote sensing image consists of the regions (R1, R2, R3, ... Rn), where the confidential regions form the set $s$ and the non-confidential regions form the set $\bar{s}$:

$$s = \{R_1, R_2, R_3\}$$

$$\bar{s} = \{R_i \mid 3 < i \le n,\ i \in N\} = \{R_4, R_5, \ldots, R_n\}$$

Ordinary users can access only the non-confidential regions $\bar{s}$ and cannot access the confidential regions, so there is no need to encrypt the whole image data. We can use a content-based selective encryption algorithm to encrypt only the important regions of the image data, namely the confidential region set $s$. Because the important regions R1, R2, R3 of $s$ are distributed at different locations in the remote sensing image, selecting and describing them rapidly and accurately helps to improve the efficiency of data encryption and, at the same time, reduces the amount of data that must be encrypted.

The description and extraction of confidential regions relies on remote sensing image segmentation, which is a part of remote sensing image processing. The image is separated into characteristic regions in order to extract the target of interest. However, the complexity of remote sensing images means that no fully reliable model exists to guide their segmentation, which has to some extent hindered the application of segmentation technology in remote sensing image processing. Although many remote sensing image segmentation algorithms exist and researchers have tried various methods of automated segmentation, no algorithm yet produces satisfactory segmentation results under all conditions. In the experiments, we adopt the smallest external rectangle (minimum bounding rectangle) method to select and extract confidential regions from remote sensing images. Whatever the shape or surface features of a confidential region, this method can be used to select and extract it.
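
The smallest external rectangle of a region can be computed directly from a mask of its pixels. The sketch below assumes that the confidential region is given as a 2-D mask (1 for confidential pixels, 0 otherwise) and that the rectangle is axis-aligned; the text does not state either detail, and the function names are hypothetical.

```python
def bounding_rectangle(mask):
    """Return (top, left, bottom, right), inclusive, of the marked cells in mask.

    Returns None when the mask marks no confidential pixel.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, flag in enumerate(row) if flag]
    return rows[0], min(cols), rows[-1], max(cols)

def extract_region(image, rect):
    """Copy the pixels inside the rectangle so they can be encrypted separately."""
    top, left, bottom, right = rect
    return [row[left:right + 1] for row in image[top:bottom + 1]]

mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
image = [[10 * r + c for c in range(5)] for r in range(4)]
rect = bounding_rectangle(mask)          # (1, 1, 2, 3)
region = extract_region(image, rect)     # [[11, 12, 13], [21, 22, 23]]
```

The rectangle covers the irregular region regardless of its shape, at the cost of also encrypting a few non-confidential pixels inside the rectangle.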

5.3.2. Framework of Multi-Level Encryption and Authorization Model of Remote Sensing Images

In a multi-level security encryption authorization system, the confidentiality level of each region is designated during selection and description. According to the degree of importance of the selected region, the key of the corresponding level is used for encryption. Multi-level authorization of remote sensing images lets different users obtain information of different degrees of importance by using the decryption keys that correspond to their degree of authority. Each user can access only the remote sensing image information that corresponds to his or her authority. For example, the military and defense sector should be able to access all of the confidential regions of remote sensing images; government agencies and research institutes should be able to access the regions that do not involve national security; ordinary people should not access any confidential region. Refer to Figure 35 and Figure 36 in the previous section for the specific framework of the multi-level encryption authorization model.

Figure 37. Experiment results of remote sensing images: (a) original image, (b) extracted region, (c) encrypted region.


Figure 38. Experiment of multi-level security.

5.3.3. Experiment and Analysis

We use a remote sensing image of the Google headquarters to test the multi-level security authentication scheme for remote sensing images. The image, shown in Figure 37, comes from the GeoEye company and provides image data with an accuracy of 50 cm. Figure 37(a) is the original image, Figure 37(b) is a region extracted from the original image that needs to be protected, and Figure 37(c) is the image after encryption with the relevant key. After obtaining the same cipher image, users with different levels of authority use keys of different levels to decrypt it and obtain image information that is identifiable to different degrees. The users with the highest authority can derive the keys of the other users from their own key and the public key, and thus obtain the fully decrypted image, as in Figure 38(a). Users with other levels of authority use their own keys to derive, by the same method, the keys of the users below their own level, and obtain the plaintext of the remote sensing image at the corresponding security levels, as in Figure 38(b-d). An unauthorized user can see only the non-decrypted image, Figure 38(e).


Chapter 6

CONCLUSION

Image technology, video technology and applied cryptography have, after decades of rapid development, each acquired an established theoretical basis and mature techniques. The encryption of images and video, the two types of visual media, is a new research direction that combines them with applied cryptography. In recent decades the related research has received due attention and has become a hot spot. Although a number of good image and video encryption methods have been proposed, image and video encryption as a new research direction still lacks a complete theoretical system and mature applied technologies. Research and application need to go further in depth to address the following issues:

1) Applicability of video encryption algorithms to new video coding standards. With the rapid development of video technology, new coding standards will continue to appear and replace old ones. Video encryption algorithms for future video coding standards therefore need further study.

2) Objective evaluation algorithms for the visual security of video ciphertext. Objective evaluation of visual security is a new research topic on which work has only just begun, and further study is needed to arrive at better algorithms for objective evaluation.

3) Video encryption algorithms in different application environments. As the applications of video encryption technology continue to expand into different environments, continued study is needed. In wireless network environments, for example, issues such as the limited computing capability of terminal devices and the ease with which data can be wiretapped or counterfeited need to be resolved. Video encryption in wireless environments is an important question.

4) Combination of multi-level authorization of visual media with digital signature algorithms and watermarking technology. Multi-level encryption of visual media only protects the privacy of the visual content, while visual media data in network transmission may be subject to a variety of attacks, including eavesdropping or stealing, transmission errors, intentional destruction, and so on. Video encryption technology therefore needs to address issues such as the integrity and non-repudiation (anti-deniability) of video media data to meet practical requirements. Mature digital signature and watermarking technologies can be used to solve these issues.

Video encryption is a research direction that has grown out of the development of cryptography and video media technology. Although current video media encryption algorithms still have many issues to resolve, their study will attract more and more attention in the next several years. We have reason to believe that more and more research results on video media encryption will appear and that application demand will continue to grow.


REFERENCES

[1] Vanstone, SA; Menezes, AJ; Oorschot, PC. Handbook of Applied Cryptography; CRC Press: Boca Raton, FL, 1996.
[2] Mollin, RA. An Introduction to Cryptography; CRC Press: Boca Raton, FL, 2006.
[3] Furht, B. Handbook of Internet and Multimedia Systems and Applications; CRC Press: Boca Raton, FL, 1999.
[4] Lian, SG; Sun, JS; Wang, ZQ. Information and Control. 2004, vol. 33, 560-566.
[5] Xu, ZQ; Yang, ZY; Li, W. Geomatics and Information Science of Wuhan University. 2005, vol. 30, 570-574.
[6] Friedman, William, F. Military Cryptanalysis: Transposition and Fractionating Systems; Aegean Park Press: Walnut Creek, CA, 1993.
[7] Schneier, B. Applied Cryptography: Protocols, Algorithms, and Source Code in C; 2nd edition; China Machine Press: Beijing, 2000.
[8] FIPS 46-2. Data Encryption Standard. November, 1993.
[9] FIPS 197. Advanced Encryption Standard (AES). November, 2001.
[10] Lai, X; Massey, JL. LNCS. 1991, vol. 473, 389-404.
[11] RSA Labs Public Key Cryptography Standards (PKCS). November, 1993.
[12] IEEE P1363. ANSI X9.62. Elliptic Curve Cryptography (ECC). 1999.
[13] ElGamal, T. IEEE T INFORM THEORY. 1985, vol. 31, 469-472.
[14] Bhargava, B; Shi, CG; Wang, SY. MPEG video encryption algorithms; Kluwer Academic Publishers: Netherlands, 2002.
[15] Qiao, L; Nahrstedt, K. In Proceeding of the First International Conference on Imaging Science, Systems and Technology (CISST'97). Las Vegas, NV, July, 1997, 21-29.


[16] Maples, TB; Spanos, GA. In Proceedings of 4th International Conference on Computer Communications and Networks. Las Vegas, Nevada, September, 1995.
[17] Agi, I; Gong, L. In Proceedings of the Internet Society Symposium on Network and Distributed System Security. San Diego, February, 1996.
[18] Qiao, L; Nahrstedt, K. International Journal on Computers & Graphics. 1996, vol. 22, 219-230.
[19] Tang, L. In Proceedings of the Fourth ACM International Multimedia Conference (ACM Multimedia'96). Boston, November, 1996.
[20] Qiao, L; Nahrstedt, K; Tam, MC. In Proceedings of IEEE International Symposium on Consumer Electronics. Singapore, December, 1997.
[21] Shi, CG; Bhargava, B. In Proceedings of the 6th ACM International Multimedia Conference. Bristol, September, 1998.
[22] Shi, CG; Wang, SY; Bhargava, B. In Proceedings of the International Conference of Parallel and Distributed Processing Techniques and Applications (PDPTA'99). Las Vegas, June, 1999.
[23] Fridrich, F. INT J BIFURCAT CHAOS. 1998, vol. 8, 1259-1284.
[24] Mao, YB; Chen, GR; Lian, SG. INT J BIFURCAT CHAOS. 2004, vol. 14, 3613-3624.
[25] Chen, GR; Mao, YB; Charles, K. CHAOS SOLITON FRACT. 2004, vol. 21, 749-761.
[26] Li, SJ; Zheng, X; Mou, X; et al. In Proceedings of SPIE on Electronic Imaging. San Jose, January, 2002.
[27] Wen, J; Severa, M; et al. IEEE T CIRC SYST VID. 2002, vol. 12, 545-557.
[28] Qiao, L; Nahrstedt, K. International Journal on Computers & Graphics. 1998, vol. 22, 437-448.
[29] Lookabaugh, T; Sicker, DC; Keaton, DM; Guo, WY; Vedula, I. In Proceedings of the SPIE on Multimedia Systems and Applications VI Conference. Orlando, FL, 2003, 10-21.
[30] Durbin, John, R. Modern Algebra: An Introduction; 4th edition; John Wiley & Sons: New York, 1999.
[31] Yang, ZY; Chen, L; Li, W; Xu, ZQ. J. Huazhong Univ. of Sci. & Tech (Nature Science Edition). 2007, vol. 35, 8-11.
[32] Zhong, YZ; Wang, Q; He, YW. Object-Based Coding of Multimedia Data Compression of International Standards MPEG-4 and Its Verification Models; Science Press: Beijing, 2000.
[33] Yao, Y; Xu, ZQ; Li, W. In Proceedings of the 8th International Conference on Signal Processing. Guilin, China, November, 2006, 2546-2550.


[34] Qiao, LT; Nahrstedt, K. International Journal on Computers & Graphics. 1998, vol. 22, 437-448.
[35] Shi, C; Bhargava, B. In Proceedings of the 6th International Conference on Multimedia Technology (ACM Multimedia'98). 1998, 81-88.
[36] Li, W; Xu, ZQ; Yang, ZY; Liu, SB. J. Huazhong Univ. of Sci. & Tech (Nature Science Edition). 2007, vol. 35, 13-17.
[37] Wen, JT; Severa, M; Zeng, WJ. IEEE T CIRC SYST VID. 2002, vol. 12, 545-557.
[38] Bi, HJ. A New Generation of Video Compression Coding Standard H.264/AVC; Posts & Telecom Press: Beijing, 2005.
[39] Liu, SB; Xu, ZQ; Li, W; Liu, J. LNCS. 2008, vol. 5226, 1114-1121.
[40] Masuda, N; Aihara, K. INT J BIFURCAT CHAOS. 2002, vol. 12, 2087-2103.
[41] Chen, G; Mao, Y; Chui, CK. CHAOS SOLITON FRACT. 2004, vol. 21, 749-761.
[42] Liu, SB; Sun, J; Xu, ZQ; Liu, JS. CHIN PHYS. 2009.
[43] Baptista, MS. PHYS LETT A. 1998, vol. 240, 50-54.
[44] Geisel, T; Fairen, V. PHYS LETT A. 1984, vol. 105, 263-266.
[45] Wen, JT; Severa, M; Zeng, WJ; Luttrell, M. IEEE T CIRC SYST VID. 2002, vol. 12, 545-557.
[46] Lian, SG; Liu, ZX; Ren, Z; Wang, ZQ. LNCS. 2005, vol. 3768, 281-290.
[47] Li, SJ; Li, CQ; Chen, GR; et al. SIGNAL PROCESS IMAGE. 2008, vol. 23, 212-223.
[48] Li, Y; Liang, LW; Su, ZP; Jiang, JG. In Proceedings of the Seventh International Conference on Information and Communications Security (ICICS 2005). 2005, 1121-1124.
[49] Zou, YZ; Huang, TG; Gao, W; Huo, LS. IEEE T CONSUM ELECTR. 2006, vol. 52, 1289-1297.
[50] Ahn, J; Shim, HJ; Jeon, B; Choi, I. LNCS. vol. 3333, 386-393.
[51] Kwon, SG; Choi, WI; Jeon, B. LNCS. vol. 3656, 207-214.
[52] Lian, SG; Liu, ZX; Ren, Z; Wang, H. IEEE T CONSUM ELECTR. 2006, vol. 52, 621-629.
[53] Stutz, T; Uhl, A. In Proceedings of the Multimedia & Security Workshop 2007 (ACM MM&Sec'07). 2007.
[54] Yao, Y; Xu, ZQ; Sun, J. Informatica. 2009, vol. 33, 69-76.
[55] Yao, Y; Xu, ZQ; Li, W. In Proceedings of the International Workshop on Multimedia Security in Communication (Music'08). 2008, vol. 8.
[56] Wang, Z; Lu, L; Bovik, AC. In Proceedings of the 2002 International Conference on Image Processing. 2002, 65-68.


[57] Liu, J; Xu, ZQ; Sun, J. In Proceedings of the 2009 International Conference on Computer Sciences and Convergence Information Technology (ICCIT 2009). 2009, in press.

