Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos New York University, NY, USA Doug Tygar University of California, Berkeley, CA, USA Moshe Y. Vardi Rice University, Houston, TX, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
3903
Kefei Chen Robert Deng Xuejia Lai Jianying Zhou (Eds.)
Information Security Practice and Experience Second International Conference, ISPEC 2006 Hangzhou, China, April 11-14, 2006 Proceedings
Volume Editors Kefei Chen Shanghai Jiaotong University 1954 Hua Shan Road, Shanghai 200030, P.R. China E-mail: [email protected] Robert Deng Singapore Management University 469 Bukit Timah Road, 259756, Singapore E-mail: [email protected] Xuejia Lai Shanghai Jiaotong University 1954 Hua Shan Road, Shanghai 200030, P.R. China E-mail: [email protected] Jianying Zhou Institute for Infocomm Research 21 Heng Mui Keng Terrace, 119613, Singapore E-mail: [email protected]
Library of Congress Control Number: 2006922001
CR Subject Classification (1998): E.3, C.2.0, D.4.6, H.2.0, K.4.4, K.6.5
LNCS Sublibrary: SL 4 – Security and Cryptology
ISSN 0302-9743
ISBN-10 3-540-33052-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-33052-3 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media springer.com © Springer-Verlag Berlin Heidelberg 2006 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 11689522 06/3142 543210
Preface
This volume contains the Research Track proceedings of the Second Information Security Practice and Experience Conference 2006 (ISPEC 2006), which took place in Hangzhou, China, April 11–14, 2006. The inaugural ISPEC 2005 was held exactly one year earlier in Singapore. As applications of information security technologies become pervasive, issues pertaining to their deployment and operations are becoming increasingly important. ISPEC is an annual conference that brings together researchers and practitioners to provide a confluence of new information security technologies, their applications and their integration with IT systems in various vertical sectors. ISPEC 2006 received 307 submissions, probably the highest number of paper submissions to any information security-related technical conference. Due to this exceptionally large number of submissions and the high quality of the submitted papers, not all the papers that contained innovative ideas could be accepted. Each paper was sent to at least three Program Committee members for comments. Based on the reviewers’ comments and discussion by the Program Committee, 35 of the 307 submissions were selected for inclusion in these proceedings as research track papers, and another 21 papers were selected as industrial track papers and are published in the Journal of Shanghai Jiaotong University (Science). As always, the success of an international conference is made possible through the contributions of many individuals and organizations. We would like to thank all the authors who submitted papers. We are indebted to our Program Committee members and the external reviewers for the great job they did. We sincerely thank our General Chair Xuejia Lai for his support and encouragement, Ying Qiu for managing the website for paper submission, review and notification, Feng Bao for helping out in several “emergency” situations, and Jianying Zhou for his excellent work as Publication and Publicity Chair.
Our special thanks are due to the members of the Local Organizing Committee at Shanghai Jiaotong University, in particular to Yanfei Zheng, Meiju Chen and Zhihua Su for their great efforts to make the conference run smoothly. Last but not least, we are grateful to Shanghai Jiaotong University, the Institute for Infocomm Research and Singapore Management University for sponsoring the conference.
January 2006
Kefei Chen Robert H. Deng
ISPEC 2006 Second Information Security Practice and Experience Conference Hangzhou, China April 11-14, 2006
Organized by Shanghai Jiaotong University, China Sponsored by Shanghai Jiaotong University, China and Singapore Management University, Singapore and Institute for Infocomm Research, Singapore
General Chair Xuejia Lai . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shanghai Jiaotong University, China Program Chairs Kefei Chen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shanghai Jiaotong University, China Robert H. Deng . . . . . . . . . . . . . . . . Singapore Management University, Singapore Publication Chair Jianying Zhou . . . . . . . . . . . . . . . . . . . . Institute for Infocomm Research, Singapore
Program Committee Tuomas Aura . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Microsoft Research, UK Feng Bao . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I2R, Singapore Chin-Chen Chang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CCU, Taiwan Lily Chen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . NIST, USA Liqun Chen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . HP Bristol Labs, UK Xiaotie Deng . . . . . . . . . . . . . . . . . . . . . . .City U. of Hong Kong, Hong Kong, China Jintai Ding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . U. of Cincinnati, USA Xuhua Ding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SMU, Singapore
Dengguo Feng . . . . . . . . . . . . . . . . . . . . . . . . . . . Chinese Academy of Sciences, China Dieter Gollmann . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . TU Hamburg, Germany Guang Gong . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . U. of Waterloo, Canada Dawu Gu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shanghai Jiaotong U., China Yongfei Han . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Onets, China Yupu Hu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .Xidian U., China Jiwu Huang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sun Yat-Sen U., China Sushil Jajodia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . GMU, USA Kwangjo Kim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ICU, Korea Chi-Sung Laih . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . NCKU, Taiwan Dong Hoon Lee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Korea U., Korea Ninghui Li . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Purdue U., USA Tieyan Li . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I2R, Singapore Yingjiu Li . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SMU, Singapore Shengli Liu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shanghai Jiaotong U., China Javier Lopez . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . U. of Malaga, Spain Jianfeng Ma . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . Xidian U., China Wenbo Mao . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .HP Lab, China David Naccache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gemplus, France Masahiro Mambo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . U. of Tsukuba, Japan Chris Mitchell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . U. of London, UK SangJae Moon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kyungpook National U., Korea Hweehwa Pang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SMU, Singapore Reihaneh Safavi-Naini . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . UOW, Australia Kouichi Sakurai . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kyushu U., Japan Joerg Schwenk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ruhr U. Bochum, Germany Dawn Song . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CMU, USA Vijay Varadharajan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Macquarie U., Australia Serge Vaudenay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . EPFL, Switzerland Guilin Wang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I2R, Singapore Victor Wei . . . . . . . . . . . . . . . . . The Chinese U. of Hong Kong, Hong Kong, China Wenling Wu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chinese Academy of Sciences, China Yongdong Wu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I2R, Singapore Bo Yang . . . . . . . . . . . . . . . . . . . . . . . . . . 
South China Agricultural University, China Yiqun Lisa Yin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Independent Consultant, USA Moti Yung . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Columbia U., USA Huanguo Zhang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wuhan U, China Muxiang Zhang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Verizon Communications, USA Yunlei Zhao . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fudan U., China Dong Zheng . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shanghai Jiaotong U., China Yuliang Zheng . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . UNCC, USA Jianying Zhou . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I2R, Singapore Huafei Zhu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I2R, Singapore Yuefei Zhu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Information Engineering U., China
External Reviewers Amit Lakhani, Andre Adelsbach, Angela Piper, Anindo Mukherjee, Anyi Liu, Baodian Wei, Becky Jie Liu, Benoit Chevallier-Mames, Bo Yang, Bo Zhu, Cao Aixia, Chen Hua, Chunhua Pan, Dang Nguyen Duc, David M’Raihi, Dayin Wang, Debin Gao, Divyan M. Konidala, Dongvu Tonien, Duc Liem Vo, Eimear Gallery, Fangguo Zhang, Fei Yan, Gerardo Fernandez, Gildas Avoine, Honggang Hu, HUANG Qiong, Hui Li, Hyunrok Lee, Jae-Gwi Choi, James Newsome, Jason Crampton, Jean Monnerat, Jeffrey Horton, John Bethencourt, John Mao, Joonsang Baek, Jose A. Montenegro, Jose A. Oneiva, Joseph Pamula, Katrin Hoeper, Ken Giulian, Kenji Imamoto, Kishan Chand Gupta, Lei Hu, Lei Zhang, Leonid Reyzin, Liankuan Zhang, Libin Wang, Lifeng Guo, Lijun Liao, Lingyu Wang, Lionel Victor, Lizhen Yang, Lujo Bauer, Luke McAven, Mark Manulis, Martin Vuagnoux, Matthieu Finiasz, Mi Wen, Michael David, Michael Psarros, Min Gyung Kang, Nam Yul Yu, P. George, Patrick George, Qiang Li, Qingguang Ji, Rodrigo Roman, Ron Steinfeld, Ryuzou Nishi, Sankardas Roy, Satoshi Hada, Scott Contini, Sebastian Gajek, Shiping Chen, Shirley H.C. Cheung, Shuhong Wang, Siamak Fayyaz, Sujing Zhou, Tu Feng, Tzong-Chen Wu, Ulrich Greveler, Wang Chih-Hung, Wei Han, Weizhong Qiang, Wen-Chung Kuo, Wen-Guey Tzeng, Wenming Lu, Xi Chen, Xiangxue Li, xiaodong lin, Xiaofeng Chen, Xiaoming Sun, Xuan Hong, Yang Tommy Guoming, Yanjiang Yang, Yassir Nawaz, Yongbin Zhou, Yoshiaki Hori, Yoshifumi Ueshige, ZHANG Lei, Zhenfeng Zhang, Zhiguo Wan, ZHONG Xiang, ZHOU Juxiang, ZHU Xusong.
Table of Contents
Cryptanalysis DPA-Resistant Finite Field Multipliers and Secure AES Design Yoo-Jin Baek, Mi-Jung Noh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
Signed MSB-Set Comb Method for Elliptic Curve Point Multiplication Min Feng, Bin B. Zhu, Cunlai Zhao, Shipeng Li . . . . . . . . . . . . . . . . . . .
13
Diophantine Approximation Attack on a Fast Public Key Cryptosystem Baocang Wang, Yupu Hu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
25
Further Security Analysis of XTR Dong-Guk Han, Tsuyoshi Takagi, Jongin Lim . . . . . . . . . . . . . . . . . . . . .
33
Network Security I A Counting-Based Method for Massive Spam Mail Classification Hao Luo, Binxing Fang, Xiaochun Yun . . . . . . . . . . . . . . . . . . . . . . . . . . .
45
Model and Estimation of Worm Propagation Under Network Partition Ping Wang, Binxing Fang, Xiaochun Yun . . . . . . . . . . . . . . . . . . . . . . . . .
57
Tackling Worm Detection Speed and False Alarm in Virus Throttling Jangbok Kim, Jaehong Shim, Gihyun Jung, Kyunghee Choi . . . . . . . . .
67
Network Security II Using Data Field to Analyze Network Intrusions Feng Xie, Shuo Bai . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
78
Adversarial Organization Modeling for Network Attack/Defense Ji Wu, Chaoqun Ye, Shiyao Jin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
90
A Novel Dynamic Immunization Strategy for Computer Network Epidemics Zhifei Tao, Hai Jin, Zongfen Han, En Cheng . . . . . . . . . . . . . . . . . . . . . . 100 Preventing Web-Spoofing with Automatic Detecting Security Indicator Fang Qi, Feng Bao, Tieyan Li, Weijia Jia, Yongdong Wu . . . . . . . . . . . 112
Security Protocol Security Protocol Analysis with Improved Authentication Tests Xiehua Li, Shutang Yang, Jianhua Li, Hongwen Zhu . . . . . . . . . . . . . . . 123 A Protocol of Member-Join in a Secret Sharing Scheme Xiao Li, Mingxing He . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 More on Shared-Scalar-Product Protocols Huafei Zhu, Feng Bao, Tieyan Li, Ying Qiu . . . . . . . . . . . . . . . . . . . . . . . 142
Communication Security Efficient Public Key Broadcast Encryption Using Identifier of Receivers Jung Wook Lee, Yong Ho Hwang, Pil Joong Lee . . . . . . . . . . . . . . . . . . . 153 A Practical Clumped-Tree Multicast Encryption Scheme Ling Dong, Kefei Chen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 Trojan Horse Attack Strategy on Quantum Private Communication Jinye Peng, Guangqiang He, Jin Xiong, Guihua Zeng . . . . . . . . . . . . . . 177
Signature and Key Agreement Linkable Democratic Group Signatures Mark Manulis, Ahmad-Reza Sadeghi, J¨ org Schwenk . . . . . . . . . . . . . . . . 187 Identity-Based Key Agreement with Unilateral Identity Privacy Using Pairings Zhaohui Cheng, Liqun Chen, Richard Comley, Qiang Tang . . . . . . . . . . 202 Short (Identity-Based) Strong Designated Verifier Signature Schemes Xinyi Huang, Willy Susilo, Yi Mu, Futai Zhang . . . . . . . . . . . . . . . . . . . 214 Identity Based Key Insulated Signature Yuan Zhou, Zhenfu Cao, Zhenchuan Chai . . . . . . . . . . . . . . . . . . . . . . . . . 226
Application I Design and Implementation of an Extended Reference Monitor for Trusted Operating Systems Hyung Chan Kim, Wook Shin, R.S. Ramakrishna, Kouichi Sakurai . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
A Design and Implementation of Profile Based Web Application Securing Proxy Youngtae Yun, Sangseo Park, Yosik Kim, Jaecheol Ryou . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248 An Efficient and Practical Fingerprint-Based Remote User Authentication Scheme with Smart Cards Muhammad Khurram Khan, Jiashu Zhang . . . . . . . . . . . . . . . . . . . . . . . . 260
Application II Domain-Based Mobile Agent Fault-Tolerance Scheme for Home Network Environments Gu Su Kim, Young Ik Eom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 Using π-Calculus to Formalize Domain Administration of RBAC Yahui Lu, Li Zhang, Yinbo Liu, Jiaguang Sun . . . . . . . . . . . . . . . . . . . . . 278 An Efficient Way to Build Secure Disk Fangyong Hou, Hongjun He, Zhiying Wang, Kui Dai . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290 Practical Forensic Analysis in Advanced Access Content System Hongxia Jin, Jeffery Lotspiech . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Cryptographic Techniques Security Analysis of a Server-Aided RSA Key Generation Protocol Tianjie Cao, Xianping Mao, Dongdai Lin . . . . . . . . . . . . . . . . . . . . . . . . . 314 Integrating Grid with Cryptographic Computing Zhonghua Jiang, Dongdai Lin, Lin Xu, Lei Lin . . . . . . . . . . . . . . . . . . . . 321 Three-Round Secret Handshakes Based on ElGamal and DSA Lan Zhou, Willy Susilo, Yi Mu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
System Security Securing C Programs by Dynamic Type Checking Haibin Shen, Jimin Wang, Lingdi Ping, Kang Sun . . . . . . . . . . . . . . . . . 343 A Chaos-Based Robust Software Watermarking Fenlin Liu, Bin Lu, Xiangyang Luo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Privately Retrieve Data from Large Databases Qianhong Wu, Yi Mu, Willy Susilo, Fangguo Zhang . . . . . . . . . . . . . . . . 367 An Empirical Study of Quality and Cost Based Security Engineering Seok Yun Lee, Tai-Myung Chung, Myeonggil Choi . . . . . . . . . . . . . . . . . . 379 Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
DPA-Resistant Finite Field Multipliers and Secure AES Design Yoo-Jin Baek and Mi-Jung Noh System SW Lab., Samsung Electronics Co., Yongin, 449-711, Korea {yoojin.baek, mjnoh}@samsung.com
Abstract. The masking method is known to be one of the most powerful algorithmic countermeasures against the first-order differential power attack. This article proposes several new efficient masking algorithms applicable to finite field multipliers. Note that the finite field multiplier (more precisely, the finite field inversion) plays a crucial role in the confusion layer of many block ciphers including AES. The new algorithms are applied to implement AES DPA-securely in hardware and the detailed implementation results are presented.
1 Introduction
The differential power attack (DPA) [5] uses the power consumption data to derive some secret information in a cryptographic device. The primary reason why DPA works is that the power consumption signals from a device are strongly related to the internal state of the device; therefore they may leak crucial information about the keying parameters involved. In particular, this article adopts the power leakage model in which the power consumption information leaks the Hamming weight of the data being processed, and considers the first-order DPA, which investigates the statistical properties of the power consumption information at each sample time [7]. Various countermeasures against the first-order DPA have been proposed so far, and the masking method [6] is known to be one of its most powerful algorithmic countermeasures. The main idea of the masking method is that before a certain cryptographic operation involving a secret key is performed, the input data is masked using a randomly chosen value so that its Hamming weight looks random to the outside world, which clearly prevents an adversary from performing the first-order DPA. This article proposes new masking algorithms applicable to finite field multipliers. More precisely, efficient algorithms which solve the following problem are presented:

Masking Problem for the Multiplication in a Finite Field. For a given finite field GF(q) and (x', r), (y', s) ∈ GF(q) × GF(q) with x = x' + r and y = y' + s, compute (xy + t, t) for a random value t ∈ GF(q), with the constraint that the Hamming weight distributions of intermediate results must be independent of x and y.

Note that in most applications the field GF(q) is a binary field, i.e. q = 2^n for some n.

K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 1–12, 2006. © Springer-Verlag Berlin Heidelberg 2006
The related problem for a finite field inversion could also be considered. However, since the inversion can be realized using multiplications (and some linear functions), it is sufficient to consider the multiplication case.

Previous Works. Much work has been done on solving the above problem and its related problems [2, 3, 4, 9, 10, 13, 14]. For example, the work in [2, 4, 14] is mainly concerned with solving it at the software level. Hence, these solutions use byte-oriented or table-lookup operations, and so they are not adequate for hardware implementation (in view of critical path and size). On the other hand, our methods are highly hardware-oriented and mainly use bit-wise operations to reduce the hardware size and the critical path. The methods presented in [3, 13] use the idea of masking at the gate level. More precisely, to apply the masking method to AES, they first decompose the inversion over GF(2^8) into operations over GF(2). Noting that the only non-linear operation in GF(2) is the multiplication, i.e. the AND gate, they then devise an efficient masking algorithm for the AND gate. In doing so, they use the following simple identities:

(x ∧ y) ⊕ rz = (((rz ⊕ (rx ∧ ry)) ⊕ (rx ∧ y')) ⊕ (ry ∧ x')) ⊕ (x' ∧ y')

to get ((x ∧ y) ⊕ rz, rz) [13], and

(x ∧ y) ⊕ rx = (¬y' ∧ ((¬ry ∧ rx) ∨ (ry ∧ x'))) ∨ (y' ∧ ((ry ∧ rx) ∨ (¬ry ∧ x')))

to get ((x ∧ y) ⊕ rx, rx) [3], given (x', rx) and (y', ry) with x = x' ⊕ rx and y = y' ⊕ ry, where ∧, ∨, ⊕ and ¬ stand for AND, OR, XOR and NOT gates, respectively. Consequently, the methods can be implemented using 4 AND and 4 XOR gates [13], or 4 OR, 4 XOR and 3 NOT gates (equivalently, 6 AND, 3 OR and 2 NOT gates) [3], to mask the AND gate. Also, the method in [13] requires an additional random bit rz in its implementation. On the other hand, our method, restricted to GF(2), requires 4 NAND and 4 XOR gates for securing the AND gate, without any additional random bit.
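For concreteness, the first identity can be checked mechanically. The following Python sketch (our rendering of the masked AND gate of [13], not code from either paper) computes (x ∧ y) ⊕ rz from the shares alone and verifies the identity exhaustively:

```python
from itertools import product

def masked_and(xp, yp, rx, ry, rz):
    """Masked AND gate per the first identity: returns (x AND y) XOR rz.

    xp = x XOR rx and yp = y XOR ry are the masked inputs; no unmasked
    value of x, y, or x AND y is ever formed.
    """
    t = rz ^ (rx & ry)   # rz XOR (rx AND ry): independent of the secrets
    t ^= rx & yp         # still masked by rz
    t ^= ry & xp
    t ^= xp & yp
    return t

# Exhaustive check over all 32 combinations of (x, y, rx, ry, rz).
for x, y, rx, ry, rz in product((0, 1), repeat=5):
    assert masked_and(x ^ rx, y ^ ry, rx, ry, rz) == (x & y) ^ rz
```

The parenthesization matters for security: evaluating left to right keeps every intermediate value masked by rz, which is why the identity is written with that bracketing.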
Since inserting random values into a hardware circuit requires an additional register and may cause some troublesome implementation issues, our method is expected to be more suitable for real hardware implementations than that in [13]. Also, our method is expected to have a smaller critical path than that of [13]. The solution in [3] is highly comparable to our method in terms of hardware size, although the exact comparison depends heavily on the technology used. The final approach [9, 10] is similar to ours in that it uses binary extension field arithmetic over GF(2^2) or GF(2^4) to mask the inversion over GF(2^8), and it does not require any additional random value. But, as noted in [9], its critical path and hardware size are less efficient than those of [13]. Hence, our method is expected to have a smaller size and a lower critical path than the method in [9, 10]. The proposed algorithms are applicable to implementing, in a DPA-resistant manner, any cryptographic algorithm which uses finite field multipliers as internal (non-linear) operations. The typical example is AES (Advanced Encryption Standard) [1], to
which the new algorithm is applied to result in DPA-secure AES designs. The detailed implementation results are summarized in this paper. This paper is organized as follows. Section 2 introduces the finite field arithmetics and the overall architecture of AES. Brief explanations about DPA, the masking method and the related masking problem are given in the same section. The new masking algorithm for finite field multiplications can be found in Section 3. Section 4 presents several ways to apply the proposed algorithm to securely (and efficiently) implement the S-box of AES and the whole AES in hardware. The detailed implementation results are summarized in the same section.
2 Preliminaries

2.1 Finite Field Arithmetics
Many block ciphers including AES make extensive use of arithmetic over a finite field GF(2^n). This section gives a brief description of GF(2^n). Let f(x) be an irreducible polynomial of degree n over GF(2). Then one can construct the finite field GF(2^n) ≅ GF(2)[x]/(f(x)) as follows: as a set, GF(2^n) consists of 2^n elements, each of which can be uniquely written as a polynomial of degree less than n over GF(2). The addition operation in GF(2^n) is the usual polynomial addition over GF(2) (i.e. component-wise XORing), and the multiplication is the polynomial multiplication over GF(2), followed by a modular reduction by f(x). For any non-zero g(x) ∈ GF(2^n), one can find h(x), k(x) ∈ GF(2)[x] such that g(x)h(x) + f(x)k(x) = 1, using the extended Euclidean algorithm. Then h(x) (mod f(x)) is the multiplicative inverse of g(x) in GF(2^n).
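This arithmetic is directly implementable. As a small illustration (a Python sketch of our own, using the AES polynomial f(x) = x^8 + x^4 + x^3 + x + 1 as the example modulus):

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1

def gf_mul(a, b):
    """Multiply a, b in GF(2^8) = GF(2)[x]/(x^8 + x^4 + x^3 + x + 1)."""
    r = 0
    while b:
        if b & 1:          # add a * x^i whenever bit i of b is set
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:      # reduce modulo f(x) as soon as deg(a) reaches 8
            a ^= AES_POLY
    return r

def gf_inv(a):
    """Inverse via a^(2^8 - 2) = a^254; maps 0 to 0."""
    t, r = a, 1
    for _ in range(7):     # accumulate a^2 * a^4 * ... * a^128
        t = gf_mul(t, t)
        r = gf_mul(r, t)
    return r

# {53} * {CA} = {01}: 0xCA is the multiplicative inverse of 0x53.
assert gf_mul(0x53, 0xCA) == 0x01 and gf_inv(0x53) == 0xCA
```

Here the inverse is computed as a^254 rather than by the extended Euclidean algorithm; both give the same result, and the exponentiation form uses only multiplications and squarings.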
2.2 DPA and Masking Method
The differential power attack (DPA), which was first introduced by Kocher et al. in [5], tries to find some secret information from power consumption data obtained during some cryptographic operations. The reasons why DPA works are that 1) physical devices consume a different amount of power when operating on logical 1’s compared to operating on logical 0’s and 2) in applications such as smart cards, power supply for devices from the outside world makes the power consumption easy to measure. In particular, this paper adopts the power leakage model that the power consumption leaks the Hamming weight information of the data being processed. And, this paper considers the first-order DPA which investigates the statistical properties of power signals at each sample time [7]. In the sequel, DPA stands for the first-order DPA. DPA is performed as follows: first, power consumption signals for randomly chosen plaintexts are gathered and then the set of corresponding plaintexts is partitioned into two subsets, according to some (guessed) key-dependent bits.
Finally, the average power signal of each subset is analyzed to get some useful information. DPA works correctly since, for the rightly guessed subkey, the power consumption is statistically related to the data computed with the subkey and, for wrongly guessed subkeys, the relation between those data is likely to be random. The masking method [6] is known to be one of the most powerful algorithmic countermeasures against DPA. The main idea of the masking method is that before a certain cryptographic operation involving a secret key is performed, the input data is masked using a random value, so that the Hamming weight distribution of the data being processed looks random to the outside world. More precisely, to apply the masking method, we first scramble (or mask) the input plaintext using a random value (the masking phase), then perform the corresponding cryptographic operation with the masked data, and finally descramble (or unmask) the result to get the desired ciphertext (the unmasking phase). In doing so, the following masking technique (used in the masking phase) is usually used:

Definition 1. For a k-bit binary string x, a (Boolean) mask of x is any tuple (x', r) such that x' = x ⊕ r, where ⊕ denotes the bit-wise eXclusive-OR operation. For a cryptographic use, r must be random.

To apply the masking method, we usually encounter the following generic problem, called the masking problem: for a given function f : {0,1}^k → {0,1}^n, construct an efficiently computable (probabilistic) function F : {0,1}^{2k} → {0,1}^{2n} with the following properties: 1) Given any mask (x', r) of x ∈ {0,1}^k, F outputs a random mask of f(x), i.e. F(x', r) = (f(x) ⊕ s, s) for some random s ∈ {0,1}^n. 2) The Hamming weight distributions of intermediate results in the computation must be independent of the input x.
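Property 2) is precisely what Boolean masking provides: for a uniformly random mask r, the Hamming weight of x ⊕ r follows the same binomial distribution whatever x is. A quick numerical check (illustrative Python, not part of the original text):

```python
from collections import Counter

def hw(v):
    """Hamming weight of a non-negative integer."""
    return bin(v).count("1")

def masked_hw_distribution(x, width=8):
    """Distribution of HW(x XOR r) over all width-bit masks r."""
    return Counter(hw(x ^ r) for r in range(1 << width))

# The leakage distribution is identical for every secret byte x,
# so the Hamming weight of the masked value reveals nothing about x.
assert masked_hw_distribution(0x00) == masked_hw_distribution(0xAB)
```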
Hence, for example, if f is a linear function (with respect to ⊕), we can put F(x', r) = (f(x'), f(r)), which gives us a trivial solution to the masking problem for linear functions. Note that, for a linear function f and x' = x ⊕ r,

f(x') ⊕ f(r) = f(x ⊕ r) ⊕ f(r) = f(x).

As another example, if g is an affine function, i.e. there exists an n × k matrix A and a vector b ∈ {0,1}^n such that g(x) = A·x ⊕ b, we can set G(x', r) = (g(x'), A·r), which solves the masking problem for g. But applying the masking method to non-linear functions does not seem to be an easy task. In particular, many cryptographic primitives use the multiplication (more precisely, the inversion) in a finite field as an internal non-linear function. Therefore, this paper mainly focuses on solving the following problem:

Masking Problem for the Multiplication in a Finite Field. For a given finite field GF(q) and (x', r), (y', s) ∈ GF(q) × GF(q) with x = x' + r and y = y' + s, construct (xy + t, t) for some randomly chosen t ∈ GF(q) such that
the Hamming weight distributions of intermediate results must be independent of x and y. Note that in most applications the field GF(q) is a binary field, i.e. q = 2^n. But we do not confine our discussion to the binary field case, in order to allow more general applications.
2.3 Overview of AES
AES is a symmetric block cipher of SPN (Substitution Permutation Network) type. It uses keys of 128-, 192- or 256-bit length to encrypt data blocks of 128-bit length and repeatedly uses the following primitive functions [1]:

- ShiftRows: Each row in a 4 × 4 byte-array of data (called a state) is shifted 0, 1, 2 or 3 bytes to the left, according to the row index.
- SubBytes: Replaces each byte in a state with its substitute in an S-box.
- AddRoundKey: Simply XORs an input with a sub-key.
- MixColumns: Each column in a state is considered as a polynomial over GF(2^8) and is multiplied modulo x^4 + 1 with the fixed polynomial c(x) = {0x03}x^3 + {0x01}x^2 + {0x01}x + {0x02}.

ShiftRows and MixColumns are byte-wise linear transformations and AddRoundKey is a bit-wise affine function. Thus, as noted in Section 2.2, the masking method can easily be applied to these functions. On the other hand, SubBytes is a parallel application of non-linear S-boxes, each composed of the multiplicative inverse function in GF(2^8) ≅ GF(2)[x]/(x^8 + x^4 + x^3 + x + 1) and an affine transformation. Hence, if the masking method is efficiently applicable to the multiplicative inverse function in GF(2^8), it can also be applied to the whole AES algorithm, which is the main point of this paper.
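To make the inverse-plus-affine structure of SubBytes concrete, here is an illustrative Python sketch (our own, not the paper's hardware design) that reconstructs the S-box from the GF(2^8) inverse followed by the FIPS-197 affine map:

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1

def gf_mul(a, b):
    """Multiplication in GF(2^8) with the AES reduction polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
    return r

def gf_inv(a):
    """a^254 in GF(2^8); the AES convention maps 0 to 0."""
    t, r = a, 1
    for _ in range(7):  # a^254 = a^2 * a^4 * ... * a^128
        t = gf_mul(t, t)
        r = gf_mul(r, t)
    return r

def aes_sbox(x):
    """S-box = multiplicative inverse in GF(2^8), then the affine map."""
    inv = gf_inv(x)  # the only non-linear step, hence the masking target
    out = 0
    for i in range(8):
        bit = ((inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8))
               ^ (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8))
               ^ (0x63 >> i)) & 1
        out |= bit << i
    return out

assert aes_sbox(0x00) == 0x63 and aes_sbox(0x53) == 0xED
```

Everything in `aes_sbox` except `gf_inv` is affine, which is why masking the inverse is the whole difficulty.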
3 New Masking Algorithm
This section proposes a new masking algorithm for the finite field multiplier. The exact form of the new algorithm and its justification can be given as follows:

New Masking Algorithm for the Finite Field Multiplier
Input: x′ (= x + r), r, y′ (= y + s), s ∈ GF(q)
Output: (xy + t, t) for a random value t ∈ GF(q)
1. a = x′y′, b = ry′, c = x′s, d = rs;
2. a = a + y′, b = b + y′;
3. a = a + c, b = b + d;
4. Return (a, b).

The diagram of the new algorithm is given in Figure 1. Now, using the following propositions, we can justify the above algorithm.
6
Y.-J. Baek and M.-J. Noh
Fig. 1. Diagram of the Proposed Algorithm
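A minimal software sketch of the algorithm, instantiated in the AES field GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x + 1; the helper names (gf_mul, mask_mul) are ours, not from the paper:

```python
import secrets

def gf_mul(a, b, poly=0x11B, n=8):
    """Carry-less multiplication in GF(2^n) followed by reduction."""
    res = 0
    for i in range(n):
        if (b >> i) & 1:
            res ^= a << i
    for i in range(2 * n - 2, n - 1, -1):   # reduce modulo the field polynomial
        if (res >> i) & 1:
            res ^= poly << (i - n)
    return res

def mask_mul(xp, r, yp, s):
    """Given masked shares x' = x + r and y' = y + s, return (xy + t, t)
    without ever recombining the shares of x or of y (steps 1-4 above)."""
    a, b = gf_mul(xp, yp), gf_mul(r, yp)    # step 1: a = x'y', b = ry'
    c, d = gf_mul(xp, s), gf_mul(r, s)      #         c = x's,  d = rs
    a, b = a ^ yp, b ^ yp                   # step 2
    a, b = a ^ c, b ^ d                     # step 3
    return a, b                             # step 4: a + b == xy

# quick check: unmasking a + b recovers xy
x, y = 0x57, 0x83
r, s = secrets.randbits(8), secrets.randbits(8)
a, b = mask_mul(x ^ r, r, y ^ s, s)
assert a ^ b == gf_mul(x, y)
```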
Proposition 1. The output (a, b) of the above algorithm is a random mask of xy.
Proof. Since a + b = (y′ + x′y′ + x′s) + (y′ + ry′ + rs) = x′y′ + x′s + ry′ + rs = (x′ + r)(y′ + s) = xy, (a, b) is a Boolean mask of xy, and since

Pr(b = γ | x = α, y = β) = 1/q

for any α, β, γ ∈ GF(q), where the probability is taken over all the random choices of r and s, it is also a random mask of xy.

Lemma 1. The probability distributions of the intermediate values of the above algorithm are independent of x and y.
Proof. In the algorithm, there are 6 intermediate values to be considered, namely x′y′, x′s, ry′, rs, y′ + x′y′ and y′ + ry′. Now, for any α and β in GF(q),

Pr(x′y′ = γ | x = α, y = β) = Pr(x′s = γ | x = α, y = β) = Pr(ry′ = γ | x = α, y = β) = Pr(rs = γ | x = α, y = β) = Pr(y′ + x′y′ = γ | x = α, y = β) = Pr(y′ + ry′ = γ | x = α, y = β) = (2q − 1)/q² if γ = 0, and (q − 1)/q² if γ ≠ 0,

where the probability is taken over the random choices of r and s. The assertion now follows since the above probability is independent of α and β.
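Lemma 1 can be checked exhaustively for the smallest case q = 2, where the claimed distribution is (2q − 1)/q² = 3/4 for γ = 0 and (q − 1)/q² = 1/4 for γ ≠ 0; the enumeration below is our own illustration:

```python
from collections import Counter
from itertools import product

# For each fixed (x, y) in GF(2), tabulate every intermediate value of the
# algorithm over all random (r, s) and check the distribution is the same.
def intermediates(x, y, r, s):
    xp, yp = x ^ r, y ^ s               # masked shares x', y'
    return (xp & yp, xp & s, r & yp, r & s,
            yp ^ (xp & yp), yp ^ (r & yp))

dists = set()
for x, y in product((0, 1), repeat=2):
    for idx in range(6):
        c = Counter(intermediates(x, y, r, s)[idx]
                    for r, s in product((0, 1), repeat=2))
        dists.add((idx, c[0], c[1]))

# Every intermediate takes value 0 with probability 3/4 and 1 with 1/4,
# independently of (x, y).
assert dists == {(i, 3, 1) for i in range(6)}
```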
Proposition 2. The Hamming weight distributions of the intermediate values of the above algorithm are independent of x and y.
Proof. For a bit string x, let |x| denote the Hamming weight of x, and let f(x′, r, y′, s) stand for one of the intermediate values of the above algorithm. Then, for any α and β from GF(q) and any positive integer n,

Pr(|f(x′, r, y′, s)| = n | x = α, y = β) = Σ_{|γ|=n} Pr(f(x′, r, y′, s) = γ | x = α, y = β),

where the summation is taken over all γ ∈ GF(q) with |γ| = n and the probability is taken over all the random choices of r and s. Now, the assertion follows from Lemma 1.

Remark 1. In some special cases, the above algorithm can be optimized in terms of the hardware size. For example, in GF(2), the algorithm takes the following simplified form

(x′, r, y′, s) → ((y′ ⊕ (x′ ∧ y′)) ⊕ (x′ ∧ s), (y′ ⊕ (r ∧ y′)) ⊕ (r ∧ s)),

where ∧ stands for the bit-wise AND gate; hence it uses 4 XOR gates and 4 AND gates. But, in a CMOS cell library, a NAND gate is usually smaller than an AND gate in size. So the above algorithm can clearly be optimized as

(x′, r, y′, s) → ((y′ ⊕ ¬(x′ ∧ y′)) ⊕ ¬(x′ ∧ s), (y′ ⊕ ¬(r ∧ y′)) ⊕ ¬(r ∧ s)),

where ¬a means the bit-wise negation of a; the two negations in each output coordinate cancel under XOR. Note that the new version uses 4 XOR gates and 4 NAND gates.

Remark 2. The proposed algorithm has some variants with the same hardware size, the justifications of which can also be obtained in similar ways.
1. (x′, r, y′, s) → ((s + x′s) + x′y′, (s + rs) + ry′)
2. (x′, r, y′, s) → ((x′ + x′y′) + ry′, (x′ + x′s) + rs)
3. (x′, r, y′, s) → ((r + ry′) + x′y′, (r + rs) + x′s)

Remark 3. Recently, a new kind of power attack, the so-called glitch attack, was proposed in [8]. Since glitches naturally occur in CMOS logic circuits and the proposed masking algorithm is built from CMOS logic circuits, the proposed masking algorithms may be vulnerable to the glitch attack. Hence, as a next step, we will investigate countermeasures against the glitch attack based on the present masking algorithm.
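The q = 2 form of Remark 1 and its NAND variant can be compared bit by bit; the check below is our own illustration, with the negations of the two NANDs in each coordinate cancelling under XOR exactly as claimed:

```python
from itertools import product

def masked_and(xp, r, yp, s):
    """The AND/XOR form of the q = 2 masked multiplier (Remark 1)."""
    return ((yp ^ (xp & yp)) ^ (xp & s),
            (yp ^ (r & yp)) ^ (r & s))

def nand(a, b):
    return (a & b) ^ 1                  # 1-bit NAND

def masked_and_nand(xp, r, yp, s):
    """The NAND variant: negations cancel pairwise under XOR."""
    return ((yp ^ nand(xp, yp)) ^ nand(xp, s),
            (yp ^ nand(r, yp)) ^ nand(r, s))

for xp, r, yp, s in product((0, 1), repeat=4):
    a, b = masked_and(xp, r, yp, s)
    assert (a, b) == masked_and_nand(xp, r, yp, s)
    assert a ^ b == (xp ^ r) & (yp ^ s)   # shares recombine to x AND y
```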
4 Efficient and Secure S-Box Designs of AES
In designing secure AES implementations, this paper takes the following strategy: first, a 128-bit random value is chosen and XOR-ed with a plaintext. Then, a (modified) AES is applied to the scrambled data and finally, the resulting data is descrambled to give the desired ciphertext. Note that the descrambling operation occurs only once at the final stage, which means that only one 128-bit random value is needed for each message encryption. Within this procedure, three kinds of operations are encountered, namely linear (ShiftRows, MixColumns and a matrix multiplication in SubBytes), affine (AddRoundKey and an adding-constant operation in SubBytes) and non-linear (the S-box in SubBytes). As noted in Section 2.2, there are simple ways to apply the masking method to linear or affine functions. For example, for the XOR operation which takes two bit strings x and y as inputs and outputs x ⊕ y, the function F defined as F(x′, r, y′, s) = (x′ ⊕ y′, r ⊕ s) gives a complete solution to the masking problem. Hence, only the way to apply the masking method to the S-box, which is essentially the multiplicative inversion over GF(2^8), is considered in the sequel.

In applications such as smart cards, the hardware size is a very important factor that directly affects the cost and the power consumption. In general, the bottleneck affecting the size of AES implementations is the way the S-box is implemented. One approach for designing the S-box circuit is to construct a multiplicative inversion circuit and an affine transformation circuit independently, and then to combine these two circuits in series. In this direction, various methods for constructing a compact inversion circuit over GF(2^8) have been introduced, and the composite field inversion method [12] is known to be one of the most size-effective S-box implementations. The following subsections introduce several composite field inversion methods and show how to apply the proposed masking algorithm to them to obtain secure and efficient S-box designs for AES (and consequently whole AES designs).
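As a preview of the composite field approach, the following sketch (our own illustrative code, not the paper's circuit) inverts an element ax + b of GF((2^4)^2), using the tower GF(2^4) = GF(2)[y]/(y^4 + y + 1) with extension polynomial x^2 + x + ω, ω = (1001)_2, and checks the result by multiplication:

```python
from itertools import product

OMEGA = 0b1001                          # omega = y^3 + 1 in GF(2^4)

def gf16_mul(a, b):                     # multiplication in GF(2)[y]/(y^4+y+1)
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                 # reduce degrees 6..4
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r

def gf16_inv(a):                        # brute-force inverse in GF(2^4)
    return next(z for z in range(16) if gf16_mul(a, z) == 1)

def gf256_mul(g, h):                    # (ax+b)(cx+d) mod x^2 + x + omega
    a, b = g
    c, d = h
    ac = gf16_mul(a, c)
    t = gf16_mul(a, d) ^ gf16_mul(b, c) ^ ac          # coefficient of x
    return (t, gf16_mul(ac, OMEGA) ^ gf16_mul(b, d))  # constant term

def gf256_inv(g):                       # composite field inversion formula
    a, b = g
    delta = gf16_mul(gf16_mul(a, a), OMEGA) ^ gf16_mul(b, a ^ b)
    di = gf16_inv(delta)                # one inversion in the subfield
    return (gf16_mul(a, di), gf16_mul(a ^ b, di))

for a, b in product(range(16), repeat=2):
    if (a, b) != (0, 0):
        assert gf256_mul((a, b), gf256_inv((a, b))) == (0, 1)
```

The single GF(2^4) inversion plus a handful of GF(2^4) multiplications is exactly the operation count discussed in Section 4.1.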
It is emphasized that our implementations are optimized for power and area rather than performance. Our implementations were simulated and synthesized with Cadence NC-Verilog and Synopsys Design Compiler, respectively. Also, we used the Samsung smart-card library (smart130), which is a 0.18 µm CMOS technology. The resulting VLSI circuits achieve data rates up to 4 Mbps, keeping a current consumption of 1 mA, and a maximum throughput of 11.9 Mbps at a 16 MHz clock frequency. The detailed implementation results can be found in Table 1 for an S-box and in Table 2 for the whole AES.

Table 1. S-Box Implementation Results
                                                 Without Masking          With Masking
                                                 Gate Count  Crit. Path   Gate Count  Crit. Path
Using GF((2^4)^2)- and GF(2)-arithmetic             393       24.9 ns       1023       27.0 ns
Using GF(((2^2)^2)^2)- and GF(2^2)-arithmetic       343       26.7 ns       1319       34.1 ns
Using GF(((2^2)^2)^2)- and GF(2)-arithmetic         343       26.7 ns        954       31.1 ns
Table 2. Implementation Results of AES with Masking

                                                 Gate Count  Critical Path
Using GF((2^4)^2)- and GF(2)-arithmetic            25.7K        57.1 ns
Using GF(((2^2)^2)^2)- and GF(2^2)-arithmetic      26.0K        59.1 ns
Using GF(((2^2)^2)^2)- and GF(2)-arithmetic        25.6K        57.0 ns

Performance (128-bit key): 4 Mbps @ 5 MHz, 8 Mbps @ 10 MHz, 11.9 Mbps @ 15 MHz
In the sequel, the following field structures are used:

GF(2^2) ≅ GF(2)[x]/(x^2 + x + 1),
GF((2^2)^2) ≅ GF(2^2)[x]/(x^2 + x + φ), φ = (10)_2 ∈ GF(2^2),
GF(((2^2)^2)^2) ≅ GF((2^2)^2)[x]/(x^2 + x + λ), λ = (1100)_2 ∈ GF((2^2)^2),
GF(2^4) ≅ GF(2)[x]/(x^4 + x + 1),
GF((2^4)^2) ≅ GF(2^4)[x]/(x^2 + x + ω), ω = (1001)_2 ∈ GF(2^4),
GF(2^8) ≅ GF(2)[x]/(x^8 + x^4 + x^3 + x + 1).

Note that GF(2^8) ≅ GF(((2^2)^2)^2) ≅ GF((2^4)^2), and the isomorphisms σ1 : GF(2^8) → GF((2^4)^2) and σ2 : GF(2^8) → GF(((2^2)^2)^2) can be explicitly given by the 8 × 8 bit-matrices

σ1 = [1 0 1 1 1 0 1 1]    σ2 = [1 1 0 0 0 0 1 0]
     [0 1 0 1 0 0 0 0]         [0 1 0 0 1 0 1 0]
     [0 1 0 0 1 0 1 0]         [0 1 1 1 1 0 0 1]
     [0 1 1 0 0 0 1 1]         [0 1 1 0 0 0 1 1]
     [0 0 0 0 1 1 1 0]         [0 1 1 1 0 1 0 1]
     [0 1 0 0 1 0 1 1]         [0 0 1 1 0 1 0 1]
     [0 0 1 1 0 1 0 1]         [0 1 1 1 1 0 1 1]
     [0 0 0 0 0 1 0 1]         [0 0 0 0 0 1 0 1]
4.1 Using the Arithmetics of GF((2^4)^2) and GF(2)
Viewing GF(2^8) as GF((2^4)^2) via the isomorphism σ1, g ∈ GF(2^8) can be considered as g = ax + b for some a, b ∈ GF(2^4), and g⁻¹ = (ax + b)⁻¹ can be computed as

g⁻¹ = (a²ω + b(a + b))⁻¹ (ax + a + b),

where all the operations like a², a²ω, b(a + b) and (a²ω + b(a + b))⁻¹ take place in GF(2^4). So the inverse operation in GF(2^8) can be performed using only three multiplications, one squaring, one constant multiplication by ω, three field additions (XORs), and one inversion over GF(2^4) (and two isomorphism computations σ1, σ1⁻¹). Note that the squaring and the constant multiplication
by ω in GF(2^4) are linear functions, so there is no problem in applying the masking method to these functions. Hence, the only problems lie in applying the masking method to the multiplication and the inversion in GF(2^4). And, to solve them, we decompose the multiplication and the inversion in GF(2^4) into bit-wise operations. More precisely, for a(x) = a0 + a1x + a2x² + a3x³ and b(x) = b0 + b1x + b2x² + b3x³ in GF(2^4), c(x) = a(x)b(x) = c0 + c1x + c2x² + c3x³ and d(x) = a(x)⁻¹ = d0 + d1x + d2x² + d3x³ can be obtained as follows:

c0 = a0b0 ⊕ a3b1 ⊕ a2b2 ⊕ a1b3
c1 = a1b0 ⊕ (a0 ⊕ a3)b1 ⊕ (a2 ⊕ a3)b2 ⊕ (a1 ⊕ a2)b3
c2 = a2b0 ⊕ a1b1 ⊕ (a0 ⊕ a3)b2 ⊕ (a2 ⊕ a3)b3
c3 = a3b0 ⊕ a2b1 ⊕ a1b2 ⊕ (a0 ⊕ a3)b3

d0 = a1 ⊕ a2 ⊕ a3 ⊕ a1a2a3 ⊕ a0 ⊕ a0a2 ⊕ a1a2 ⊕ a0a1a2
d1 = a0a1 ⊕ a0a2 ⊕ a1a2 ⊕ a3 ⊕ a1a3 ⊕ a0a1a3
d2 = a0a1 ⊕ a2 ⊕ a0a2 ⊕ a3 ⊕ a0a3 ⊕ a0a2a3
d3 = a1 ⊕ a2 ⊕ a3 ⊕ a1a2a3 ⊕ a0a3 ⊕ a1a3 ⊕ a2a3

Note that the XOR operation is a linear function. Consequently, if the masking method can be applied to the multiplication in GF(2), i.e. the AND operation, it can also be applied to the whole inversion over GF(2^8). Therefore, the proposed algorithm with q = 2 plays its role in this place.
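The bit-wise formulas above can be verified exhaustively against reference arithmetic in GF(2)[x]/(x^4 + x + 1); the helper names below are ours:

```python
from itertools import product

def gf16_mul(a, b):
    """Reference multiplication in GF(2)[x]/(x^4 + x + 1)."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for i in (6, 5, 4):                 # reduce x^i using x^4 = x + 1
        if (r >> i) & 1:
            r ^= 0b10011 << (i - 4)
    return r

def bits(v):                            # v -> (v0, v1, v2, v3)
    return tuple((v >> i) & 1 for i in range(4))

def mul_formula(a, b):
    a0, a1, a2, a3 = bits(a)
    b0, b1, b2, b3 = bits(b)
    c0 = a0 & b0 ^ a3 & b1 ^ a2 & b2 ^ a1 & b3
    c1 = a1 & b0 ^ (a0 ^ a3) & b1 ^ (a2 ^ a3) & b2 ^ (a1 ^ a2) & b3
    c2 = a2 & b0 ^ a1 & b1 ^ (a0 ^ a3) & b2 ^ (a2 ^ a3) & b3
    c3 = a3 & b0 ^ a2 & b1 ^ a1 & b2 ^ (a0 ^ a3) & b3
    return c0 | c1 << 1 | c2 << 2 | c3 << 3

def inv_formula(a):
    a0, a1, a2, a3 = bits(a)
    d0 = a1 ^ a2 ^ a3 ^ a1&a2&a3 ^ a0 ^ a0&a2 ^ a1&a2 ^ a0&a1&a2
    d1 = a0&a1 ^ a0&a2 ^ a1&a2 ^ a3 ^ a1&a3 ^ a0&a1&a3
    d2 = a0&a1 ^ a2 ^ a0&a2 ^ a3 ^ a0&a3 ^ a0&a2&a3
    d3 = a1 ^ a2 ^ a3 ^ a1&a2&a3 ^ a0&a3 ^ a1&a3 ^ a2&a3
    return d0 | d1 << 1 | d2 << 2 | d3 << 3

for a, b in product(range(16), repeat=2):
    assert mul_formula(a, b) == gf16_mul(a, b)
for a in range(1, 16):
    assert gf16_mul(a, inv_formula(a)) == 1
```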
4.2 Using the Arithmetics of GF(((2^2)^2)^2) and GF(2^2)
If GF(2^8) is viewed as GF(((2^2)^2)^2) via the isomorphism σ2, g ∈ GF(2^8) can be considered as g = ax + b for some a, b ∈ GF((2^2)^2), and g⁻¹ = (ax + b)⁻¹ can be computed as

g⁻¹ = (a²λ + b(a + b))⁻¹ (ax + a + b),

where in this case all the computations like a², a²λ, b(a + b) and (a²λ + b(a + b))⁻¹ take place in GF((2^2)^2). So the inversion in GF(2^8) can be computed using only three multiplications, one squaring, one constant multiplication by λ, three field additions (XORs), and one inversion over GF((2^2)^2) (and two isomorphism computations σ2, σ2⁻¹). Note that the squaring and the constant multiplication by λ in GF((2^2)^2) are linear functions, so the masking method can easily be applied to these functions. Hence, it is sufficient to apply the masking method to multiplications and inversions over GF((2^2)^2), which can be realized using the arithmetic of GF(2^2) as follows: for a = a1x + a2, b = b1x + b2 ∈ GF((2^2)^2) with a1, a2, b1, b2 ∈ GF(2^2),

a⁻¹ = (a1²φ + a2(a1 + a2))⁻¹ (a1x + a1 + a2)    (1)

and

ab = ((a1 + a2)(b1 + b2) + a2b2)x + (a1b1φ + a2b2),    (2)
where all the operations like a1², a1²φ, a2(a1 + a2) and (a1²φ + a2(a1 + a2))⁻¹ occur in GF(2^2). Note that the squaring and the constant multiplication by φ in GF(2^2) are linear functions. Also, the inversion in GF(2^2) is a linear function because a⁻¹ = a² for all a ∈ GF(2^2). Hence, the masking method can easily be applied to these functions. Consequently, if it is applicable to the multiplication in GF(2^2), the masking method can also be applied to the whole inversion in GF(2^8), and so the proposed algorithm with q = 2² plays its role in this place.
4.3 Using the Arithmetics of GF(((2^2)^2)^2) and GF(2)
As noted above, the only non-linear operation in Equations (1) and (2) is the multiplication over GF(2^2). While the masking method for this operation can be realized using the arithmetic of GF(2^2) as in Section 4.2, it is also possible to do so using the (bit-wise) arithmetic of GF(2). First, the GF(2^2) multiplication is implemented using operations in GF(2) as follows: for a = a1x + a2, b = b1x + b2 ∈ GF(2^2) with a1, a2, b1, b2 ∈ GF(2),

ab = (a1b1 ⊕ a1b2 ⊕ a2b1)x ⊕ (a1b1 ⊕ a2b2).

Since the XOR operation is a linear function, it is sufficient to apply the masking method to the multiplication in GF(2), i.e. the AND gate, and so the proposed algorithm with q = 2 plays its role in this place.
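The GF(2^2) product formula can likewise be checked exhaustively; gf4_mul below is our own reference multiplication in GF(2)[x]/(x^2 + x + 1):

```python
from itertools import product

# An element a1*x + a2 of GF(2^2) is encoded as the 2-bit value (a1 << 1) | a2.
def gf4_mul_bits(a1, a2, b1, b2):
    """The bit-wise product formula of Section 4.3."""
    return ((a1 & b1) ^ (a1 & b2) ^ (a2 & b1),   # coefficient of x
            (a1 & b1) ^ (a2 & b2))               # constant term

def gf4_mul(a, b):
    """Reference multiplication in GF(2)[x]/(x^2 + x + 1)."""
    h = (a >> 1) * b                    # partial products as polynomials
    l = (a & 1) * b
    r = (h << 1) ^ l
    if r & 0b100:                       # reduce x^2 -> x + 1
        r ^= 0b111
    return r & 0b11

for a, b in product(range(4), repeat=2):
    c1, c2 = gf4_mul_bits(a >> 1, a & 1, b >> 1, b & 1)
    assert (c1 << 1) | c2 == gf4_mul(a, b)
```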
5
Conclusion
This paper considered first-order DPA and the masking method. New efficient masking algorithms applicable to finite field multipliers were proposed and applied to DPA-secure hardware implementations of AES. The detailed implementation results were also presented.

Acknowledgements. We thank the anonymous referees for helpful comments.
References
1. National Institute of Standards and Technology, Federal Information Processing Standards Publication 197, Announcing the Advanced Encryption Standard (AES), available at http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf, 2001.
2. M. Akkar and C. Giraud, An Implementation of DES and AES, Secure against Some Attacks, Proceedings of Cryptographic Hardware and Embedded Systems: CHES 2001, LNCS vol. 2162, Springer-Verlag, 2001, pp. 309–318.
3. J. Golić and R. Menicocci, Universal Masking on Logic Gate Level, Electronics Letters 40(9), 2004, pp. 526–527.
4. J. Golić and C. Tymen, Multiplicative Masking and Power Analysis of AES, Proceedings of Cryptographic Hardware and Embedded Systems: CHES 2002, LNCS vol. 2535, Springer-Verlag, 2002, pp. 198–212.
5. P. Kocher, J. Jaffe and B. Jun, Differential Power Analysis, Proceedings of Crypto '99, LNCS vol. 1666, Springer-Verlag, 1999, pp. 388–397.
6. T. Messerges, Securing the AES Finalists against Power Analysis Attacks, Proceedings of Fast Software Encryption Workshop 2000, LNCS vol. 1978, Springer-Verlag, 2000, pp. 150–165. 7. T. Messerges, Using Second-Order Power Analysis to Attack DPA Resistant Software, Proceedings of Cryptographic Hardware and Embedded Systems: CHES 2000, LNCS vol. 1965, Springer-Verlag, 2000, pp. 238–251. 8. S. Mangard, T. Popp and B. Gammel, Side-Channel Leakage of Masked CMOS Gates, Topics in Cryptology - CT-RSA 2005, LNCS vol. 3376, Springer-Verlag, 2005, pp. 351–365. 9. E. Oswald, S. Mangard and N. Pramstaller, Secure and Efficient Masking of AES A Mission Impossible?, Cryptology ePrint Archive, Report 2004/134, 2004, Available at http://eprint.iacr.org. 10. E. Oswald, S. Mangard, N. Pramstaller and V. Rijmen, A Side-Channel Analysis Resistant Description of the AES S-box, Proceedings of Fast Software Encryption Workshop 2005, LNCS vol. 3557, Springer-Verlag, 2005, pp. 413–423. 11. Samsung smart-card library (smart130). 12. A. Satoh, S. Morioka, K. Takano and S. Munetoh, A Compact Rijndael Hardware Architecture with S-Box Optimization, Proceedings of Asiacrypt 2001, LNCS vol. 2248, Springer-Verlag, 2001, pp. 239–254. 13. E. Trichina, Combinational Logic Design for AES Subbyte Transformation on Masked Data, Cryptology ePrint Archive, Report 2003/236, 2003, Available at http://eprint.iacr.org. 14. E. Trichina, D. de Seta and L. Germani, Simplified Adaptive Multiplicative Masking for AES and Its Secure Implementation, Proceedings of Cryptographic Hardware and Embedded Systems: CHES 2002, LNCS vol. 2523, Springer-Verlag, 2002, pp. 187–197.
Signed MSB-Set Comb Method for Elliptic Curve Point Multiplication

Min Feng1, Bin B. Zhu2, Cunlai Zhao1, and Shipeng Li2

1 School of Mathematical Sciences, Peking Univ., Beijing, 100871, China
{fengmin, zhao}@math.pku.edu.cn
2 Microsoft Research Asia, Beijing, 100080, China
{binzhu, spli}@microsoft.com
Abstract. The comb method is an efficient method to calculate point multiplication in elliptic curve cryptography, but it is vulnerable to power-analysis attacks. Various algorithms have been proposed recently to make the comb method secure against power-analysis attacks. In this paper, we present an efficient comb method and its Simple Power Analysis (SPA)-resistant counterpart. We first present a novel comb recoding algorithm which converts an integer to a sequence of signed, MSB-set comb bit-columns. Using this recoding algorithm, the signed MSB-set comb method and a modified, SPA-resistant version are then presented. Measures and precautions to make the proposed SPA-resistant comb method resist all power-analysis attacks are also discussed, along with a performance comparison with other comb methods. We conclude that our comb methods are among the most efficient comb methods in terms of the number of precomputed points and computational complexity.
1 Introduction
Elliptic curve cryptography (ECC) has gained increasing popularity in public key cryptography due to its shorter key sizes for the same level of security as compared to other public key cryptosystems. A key operation in ECC is point multiplication. Many efficient point multiplication methods have been developed [1]. One of them is the comb method [2]. The main idea is to use a binary matrix with rows and columns to represent a scalar and process the matrix columnwise. Unfortunately, the method is vulnerable to side-channel attacks, which were first introduced by Kocher et al. [3,4] and extended to ECC [5]. Side-channel attacks measure observable parameters such as timings or power consumptions during cryptographic operations to deduce the whole or partial secret information of a cryptosystem. Power analysis includes both Simple Power Analysis (SPA) and Differential Power Analysis (DPA) [4]. A particular target of side-channel attacks for ECC is the scalar in point multiplication, which computes a product kP where P is a point on an elliptic curve E(F) over a finite field F and k is a secret multiplier which is a positive integer. Higher-order and refined DPA attacks have also been proposed [6,7,8]. With power analysis, partial information or the exact value of the secret k can be deduced when the original comb method [2] or the scalar multiplication methods described in [1] are used.

K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 13–24, 2006.
© Springer-Verlag Berlin Heidelberg 2006
14
M. Feng et al.
Many countermeasures have been proposed to protect against side-channel attacks on ECC. Two major strategies have been proposed to protect against SPA attacks. The first strategy is to make the addition and doubling operations indistinguishable. A unified formula for computing both addition and doubling has been proposed in [9] for Jacobi-type and in [10] for Hesse-type elliptic curves. The second strategy is to remove the dependency of the intermediate steps of the scalar multiplication on the specific value of the secret multiplier k. Coron et al. [5, 11, 12, 13, 14] proposed schemes using addition chains to always execute point addition and doubling for each bit. Möller et al. [15, 16] modified window methods by making addition chains with a fixed pattern of nonzero digits. Hedabou et al. proposed SPA-resistant comb methods [17, 18, 19]. Chevallier-Mames et al. [20] proposed a scheme which divides point doubling and point addition into side-channel atomic blocks so that point multiplication appears as a succession of side-channel atomic blocks that are indistinguishable by SPA. An SPA-resistant method is not necessarily resistant to DPA attacks. Many countermeasures have been proposed to convert an SPA-resistant method into a DPA-resistant method. Coron [5] proposed to use random projective coordinates. Joye and Tymen [21] proposed to use a random isomorphism such as a random elliptic curve isomorphism or a random field isomorphism. In this paper, we propose a new comb recoding algorithm to convert each bit-column in the comb scalar matrix to a signed, Most Significant Bit (MSB)-set, nonzero bit-column. All nonzero bits in an arbitrary bit-column have the same sign. Using the recoding algorithm, we present a novel comb method which computes point multiplication more efficiently with fewer precomputed points than the original comb method [2].
The proposed comb method is then modified to be SPA-resistant by exploiting the fact that point addition and point subtraction are virtually of the same computational complexity in ECC and cannot be distinguished by SPA. We also describe measures to make our SPA-resistant comb method thwart all known side-channel attacks. Our comb methods are among the most efficient comb methods in terms of the number of precomputed points and computational complexity. This paper is organized as follows. In the next section, we introduce preliminaries for ECC and the original comb method. In Section 3, side-channel attacks and prior countermeasures are presented. Our novel comb recoding algorithm and comb point multiplication methods are described in Section 4. Security analysis and a performance comparison with other comb methods are also provided in this section. We conclude this paper in Section 5.
2 Preliminaries
2.1 Elliptic Curve Equations
An elliptic curve over a field F can be expressed by its Weierstrass form:

E : y² + a1xy + a3y = x³ + a2x² + a4x + a6,   ai ∈ F.

The set E(F) of points (x, y) ∈ F² satisfying the above equation plus the "point at infinity" O forms an abelian group with the point at infinity O as the zero,
and point addition as the group's binary operation. Given two points P1 and P2 in E(F), a third point P3 = P1 + P2 ∈ E(F) as the addition of P1 and P2 can be calculated with the chord-tangent process [1]. A special point addition where a point is added to itself is called doubling. The cost of point doubling is usually different from that of point addition. Point addition and doubling need to compute costly field inversions. By using the Jacobian projective coordinates, which represent a point P = (x, y) as P = (X, Y, Z), where x = X/Z² and y = Y/Z³, and the infinity point O as (θ², θ³, 0), θ ∈ F*, field inversions can be avoided at the expense of more field multiplications. A field multiplication is usually much faster than a field inversion, resulting in faster elliptic curve point addition and doubling. The group E(F) generated by an elliptic curve over some finite field F meets the public key cryptography requirement that the discrete logarithm problem is very difficult to solve. Therefore ECC has been used in many standards and applications. Elliptic curves used in cryptography are elliptic curves defined over fields F_{2^m} or fields F_p where m is a large number and p is a big prime. Over these two types of fields, the Weierstrass form reduces to the short Weierstrass form, and point addition and doubling are also simplified. For details of elliptic curve equations and point operations, interested readers are referred to [1].

Algorithm 1. Fixed-base Comb Method [2] (d = ⌈n/w⌉)
Input: A point P, an integer k = Σ_{i=0}^{n−1} b_i 2^i with b_i ∈ {0, 1}, and a window width w ≥ 2.
Output: Q = kP.
Precomputation Stage:
1. Compute [b_{w−1}, ..., b_1, b_0]P for all (b_{w−1}, ..., b_1, b_0) ∈ {0, 1}^w.
2. Write k = K^{w−1}|| ··· ||K^1||K^0, where each K^j is a bit-string of length d, padding with 0 on the left if necessary. Let K_i^j denote the ith bit of K^j. Define K_i ≡ [K_i^{w−1}, ..., K_i^1, K_i^0].
3. Q = O.
Evaluation Stage:
4. For i = d − 1 to 0 by −1 do:
5.   Q = 2Q,
6.   Q = Q + K_i P.
7. Return Q.
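A compact sketch of Alg. 1 follows. To stay self-contained it uses the additive group of the integers in place of an elliptic curve group, so "doubling" is multiplication by 2, the identity O is 0, and the result should equal k·P; a real implementation would substitute elliptic curve point operations. The function name comb_multiply is ours:

```python
from math import ceil

def comb_multiply(k, P, w, n):
    d = ceil(n / w)
    # Precomputation stage: [b_{w-1},...,b_0]P = sum_j b_j * 2^(jd) * P
    pre = {}
    for cols in range(2 ** w):
        pre[cols] = sum(((cols >> j) & 1) * (2 ** (j * d)) * P
                        for j in range(w))
    Q = 0                               # the group identity O
    # Evaluation stage: process bit-columns from left to right
    for i in range(d - 1, -1, -1):
        Q = 2 * Q                       # doubling
        Ki = sum((((k >> (j * d + i)) & 1) << j) for j in range(w))
        Q = Q + pre[Ki]                 # column lookup and addition
    return Q

assert comb_multiply(0b10110111, 5, w=4, n=8) == 0b10110111 * 5
```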
2.2 Scalar Multiplication
Adding a point P to itself k times is called scalar multiplication or point multiplication, and is denoted as Q = kP , where k is a positive integer. Many efficient methods have been proposed for scalar multiplication. Interested readers are referred to [1] for details. One of the proposed efficient point multiplication methods is the comb method proposed by Lim and Lee [2] in 1994.
Let k = Σ_{i=0}^{n−1} b_i 2^i with b_i ∈ {0, 1}. For an integer w ≥ 2, set d = ⌈n/w⌉. We define [b_{w−1}, b_{w−2}, ..., b_1, b_0] ≡ b_{w−1}2^{(w−1)d} + b_{w−2}2^{(w−2)d} + ··· + b_1 2^d + b_0, where (b_{w−1}, b_{w−2}, ..., b_1, b_0) ∈ {0, 1}^w. The comb method uses a binary matrix of w rows and d columns to represent an integer k, and processes the matrix columnwise. This comb method stores 2^w − 1 points in the precomputation stage. In the storage estimations in this paper, the input point P is always included. Let us estimate the time cost of the comb method. In the precomputation stage, [b_{w−1}, ..., b_1, b_0]P needs to be calculated for all (b_{w−1}, ..., b_1, b_0) ∈ {0, 1}^w. To achieve this, 2^d P, 2^{2d} P, ..., 2^{(w−1)d} P are first calculated, which costs (w − 1)d doubling operations. Then every possible combination of l (l > 1) nonzero bits in [b_{w−1}, ..., b_1, b_0]P is calculated by adding one point from P's point multiplications of l − 1 bit combinations and a point from P's point multiplications of a single bit. Therefore it costs 2^w − w − 1 point additions in the precomputation stage. In conclusion, the total cost of the precomputation stage is {(w − 1)d}D + {(2^w − w − 1)}A. To estimate the time cost of the evaluation stage, we assume the most significant column of {K_i} is not zero, i.e., K_{d−1} ≠ 0. Then the number of doubling operations in the evaluation stage is d − 1. If K_i = 0, then the point addition in Step 6 is not needed. If we assume k is uniformly distributed, the probability that K_i ≠ 0 is (2^w − 1)/2^w, and the average number of point additions is ((2^w − 1)/2^w)(d − 1). Therefore the average time cost of the evaluation stage is approximately {(d − 1)}D + {((2^w − 1)/2^w)(d − 1)}A. The total time cost of the comb method is {(w − 1)d + (d − 1)}D + {(2^w − w − 1) + ((2^w − 1)/2^w)(d − 1)}A = {wd − 1}D + {(2^w − w − 1) + ((2^w − 1)/2^w)(d − 1)}A.
3 Side-Channel Attacks and Countermeasures
3.1 Side-Channel Attacks
Two types of power analysis have been introduced by P. Kocher [3, 4]. One is Simple Power Analysis (SPA). The other is Differential Power Analysis (DPA).

Simple Power Analysis. SPA analyzes a single trace of the power consumption of a crypto-device during scalar multiplication. A branch instruction condition can be identified from the recorded power consumption data, such as a break in the continuity of elliptic curve doubling operations. For the comb method Alg. 1, SPA can detect whether K_i is zero or not, which leaks secret information.

Differential Power Analysis. DPA records many power traces of scalar multiplications, and uses correlation among the records and error correction techniques [4] to deduce some or all digits of the secret k. DPA is more complex yet more powerful than SPA. An SPA-resistant scalar multiplication method is not necessarily resistant to DPA attacks, but many countermeasures can be used
to transform an SPA-resistant method into a DPA-resistant method. A common practice is to make the execution, and thus the power consumption, different for identical inputs. Randomization is usually employed to achieve this effect. All of the following randomizing approaches are feasible: randomizing the input point in projective coordinates, randomizing the exponential parameter representation, randomizing the elliptic curve equation, and randomizing the field representation. This paper focuses on SPA-resistant scalar multiplication. All these randomizing approaches can be applied to transform our SPA-resistant methods to be resistant to DPA attacks.
3.2 Prior SPA-Resistant Comb Methods
Many countermeasures to SPA attacks have been proposed. A particular approach is to make the execution of scalar multiplication independent of any specific value of the multiplier k. All the proposed SPA-resistant comb methods, as well as the one to be proposed in this paper, take this approach. Those SPA-resistant comb methods are described next.

HPB's Comb Methods. Hedabou, Pinel and Bénéteau (HPB) [17, 18] proposed two comb methods recently to protect against SPA. The main idea is to extend K_i in the comb method Alg. 1 to a signed representation (K′_i, s_i), where each K′_i is nonzero. Their first method [17] uses the following procedure to generate such a signed representation (K′_i, s_i) for an odd integer k represented by K_i, 0 ≤ i < d, in the comb method: let s_0 = 1 and construct the rest by setting (K′_i, s_i) = (K′_{i−1}, s_{i−1}) and (K′_{i−1}, s_{i−1}) = (K′_{i−1}, −s_{i−1}) if K_i = 0, and (K′_i, s_i) = (K_i, s_i) otherwise. The second method [18] translates an odd scalar k into a representation Σ_{i=0}^{n} b_i 2^i with b_i ∈ {1, −1} by exploiting the fact that 1 ≡ 1¯1¯1 ··· ¯1, where ¯1 is defined as −1, and applies the original comb method to the new representation of the old scalar k. A bit-column K_i generated in this method can be represented by [b_{w−1}, ..., b_1, b_0], where b_j ∈ {1, −1}, 0 ≤ j < w. HPB's comb methods apply these signed representations to the original comb method to calculate (k + 1)P for even k and (k + 2)P for odd k. 2P is then calculated, and P or 2P is subtracted from the result produced by the original comb method to obtain the desired point kP. HPB's first method will be referred to as the signed Non-Zero (sNZ) comb method, and the second method as the signed All-Bit-Set (sABS) comb method in this paper. sNZ has the same time and space cost as the original comb method in the precomputation stage, i.e., storage of 2^w − 1 points and time cost of {(w − 1)d}D + {(2^w − w − 1)}A. Because elliptic curve point subtraction has the same complexity as point addition, the second method stores only 2^{w−1} precomputed points [b_{w−1}, b_{w−2}, ..., b_2, b_1, 1]P since the scalar is odd, where b_i ∈ {1, −1}.
The number of precomputed points in sABS is about half of that in sNZ. The time cost of sABS in the precomputation stage was estimated as {(w − 1)d}D + {(2^w − w)}A for w = 2, 3, 4, 5 in HPB's paper [18]. This means that sABS needs one more point addition than sNZ in the precomputation stage. The evaluation stage of both HPB comb methods costs d − 1 point additions and d − 1 doublings. The last stage after the original comb method costs one doubling and one subtraction. Therefore the total cost of sNZ is (w − 1)d + (d − 1) + 1 = wd doubling operations and (2^w − w − 1) + (d − 1) + 1 = 2^w − w + d − 1 adding operations. sABS costs {wd}D + {(2^w − w + d)}A. Compared with the original comb method Alg. 1, sABS stores about half as many precomputed points as the original comb method, but the time cost of its precomputation stage is a little higher due to the fact that all bits in sABS are set to either 1 or −1.

FZXL's Comb Methods. Feng, Zhu, Xu, and Li (FZXL) [19] proposed another comb method, referred to as the signed LSB-Set (sLSBS) comb method in this paper, and its variations, which are more efficient than the original comb method. In sLSBS, every odd scalar k is transformed into a representation of bit-columns {K_i} with the following properties: for each bit-column [b_{w−1}, ..., b_1, b_0], the least significant bit b_0 is either 1 or −1, and each of the remaining bits b_i is either 0 or has the same sign as b_0, 0 < i < w. In other words, K_i = ±[c_{w−1}, ..., c_2, c_1, 1], where c_i = 0 or 1 for 0 < i < w. By adding dummy operations, sLSBS is easily modified to be SPA-resistant. Both versions store 2^{w−1} precomputed points, with the time cost of the precomputation stage being {(w − 1)d}D + {(2^{w−1} − 1)}A. sLSBS requires (d − 1)D + (d − 1/2)A in the evaluation stage, and the total cost is (wd − 1)D + (2^{w−1} + d − 3/2)A. The corresponding values for the SPA-resistant version are dD + dA and wdD + (2^{w−1} + d − 1)A [19], respectively. The value d used in sLSBS or its SPA-resistant counterpart is equal to ⌈(n + 1)/w⌉ instead of the ⌈n/w⌉ used by other comb methods, which results in a d one larger than that used in other comb methods when n, the number of bits of k, is divisible by w. FZXL proposed several methods to deal with this issue while maintaining computational efficiency. Details can be found in [19]. In this paper, we compare our proposed comb methods with only sLSBS and its SPA-resistant counterpart.
4 Signed MSB-Set Comb Method
4.1 Recoding Algorithm
Like the aforementioned comb methods, our approach is also to represent a scalar k with a set of signed nonzero bit-columns {K_i ≡ [K_i^{w−1}, ..., K_i^1, K_i^0] ≠ 0}. The major difference is that every K_i generated by our novel recoding method is a signed MSB-set integer. More specifically, our recoding scheme generates K_i^{w−1} ∈ {1, ¯1} and K_i^j ∈ {0, K_i^{w−1}}, 0 ≤ j < w − 1, for each bit-column K_i. As shown later in this paper, a major advantage of our recoding method over the original fixed-base comb recoding method is that the precomputation stage
Signed MSB-Set Comb Method for Elliptic Curve Point Multiplication
needs to calculate and store only half of the points. The details of our recoding algorithm are described next for a window width w ≥ 2. The recoding algorithm first partitions a binary representation of the scalar k into w binary strings K^j of d bits each, 0 ≤ j < w, with 0 possibly padded on the left. Then, in Steps 3 to 5, it converts each bit of the highest d bits to either 1 or 1̄ by exploiting the fact that 1 ≡ 11̄1̄···1̄. In other words, each bit K_r^{w−1}, 0 ≤ r < d, in K^{w−1} is either 1 or 1̄. The rest of the recoding algorithm processes each bit from the least significant bit towards the {(w−1)d − 1}th bit. If the current ith bit b_i is 1 and has a sign different from that of the most significant bit b_{(i mod d)+(w−1)d} in the same bit-column K_{i mod d}, the current bit is set to 1̄ and 1 is carried into the next higher bit so that the value of k is unchanged. This process generates wd bits {b_i} and a δ that represent the n-bit integer k. Due to length limitation, the following theorems are given without proof.

Theorem 1. Given a scalar k, Alg. 2 outputs a δ ∈ {0, ±1}, a sequence of bits {b_i}, and bit-columns {K_r ≡ [K_r^{w−1}, ..., K_r^1, K_r^0]} such that k = δ·2^{(w−1)d} + Σ_{i=0}^{wd−1} b_i 2^i, and for each K_r, K_r^{w−1} ∈ {1, −1} and K_r^j ∈ {0, K_r^{w−1}}, where 0 ≤ j < w−1, 0 ≤ r < d, and K_r^j ≡ b_{jd+r}.

Theorem 2. δ has a probability of 1/2 to be zero when k is an integer drawn from [0, 2^n) with uniform distribution.
Algorithm 2. Signed MSB-Set Comb Recoding Algorithm (d = ⌈n/w⌉).
Input: An n-bit integer k > 0 and a window width w ≥ 2.
Output: k = δ·2^{(w−1)d} + Σ_{i=0}^{wd−1} b_i 2^i ≡ δ·2^{(w−1)d} + K^{w−1}‖···‖K^1‖K^0, where each K^j is a binary string of d bits and δ ∈ {0, ±1}. Let K_r^j denote the rth bit of K^j, i.e., K_r^j ≡ b_{jd+r}. Define the bit-column K_r ≡ [K_r^{w−1}, ..., K_r^1, K_r^0]. The output satisfies K_r^{w−1} ∈ {1, −1} and K_r^j ∈ {0, K_r^{w−1}} for 0 ≤ j < w−1 and 0 ≤ r < d.
1. Pad with 0 on the left if necessary to form a wd-bit representation k = Σ_{i=0}^{wd−1} b_i 2^i with b_i ∈ {0, 1}.
2. Set δ = b_{(w−1)d} − 1, b_{(w−1)d} = 1, and e = Σ_{i=0}^{(w−1)d−1} b_i 2^i.
3. For i = (w−1)d + 1 to wd − 1 by 1 do:
4.   if b_i = 1 then leave b_i = 1,
5.   if b_i = 0 then set b_i = 1 and b_{i−1} = 1̄.
6. For i = 0 to (w−1)d − 1 by 1 do:
7.   if e is odd and b_{(i mod d)+(w−1)d} = 1̄, then set b_i = 1̄ and e = ⌈e/2⌉,
8.   else set b_i = e mod 2 and e = ⌊e/2⌋.
9. δ = δ + e.
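The recoding above can be prototyped directly. The following Python sketch (function and variable names are our own, not from the paper) follows Steps 1–9 of Alg. 2, with ⌈e/2⌉ and ⌊e/2⌋ written as integer arithmetic; `recoded_value` checks that the output actually represents k.

```python
def msb_set_recode(k, w):
    """Signed MSB-set recoding (a sketch of Alg. 2).
    Returns (delta, bits) with bits[i] in {-1, 0, 1} and
    k = delta * 2^{(w-1)d} + sum(bits[i] * 2^i)."""
    n = k.bit_length()
    d = -(-n // w)                               # d = ceil(n/w)
    bits = [(k >> i) & 1 for i in range(w * d)]  # step 1: pad with 0s on the left

    # Step 2: record the borrow in delta, then force the anchor bit to 1.
    delta = bits[(w - 1) * d] - 1
    bits[(w - 1) * d] = 1
    e = k & ((1 << ((w - 1) * d)) - 1)           # low (w-1)d bits of k

    # Steps 3-5: make every bit of the top d-bit string 1 or -1 (1 = 1,-1,-1,...).
    for i in range((w - 1) * d + 1, w * d):
        if bits[i] == 0:
            bits[i] = 1
            bits[i - 1] = -1

    # Steps 6-8: recode the low bits so each nonzero bit matches the sign of
    # the most significant bit of its bit-column.
    for i in range((w - 1) * d):
        if e % 2 == 1 and bits[(i % d) + (w - 1) * d] == -1:
            bits[i], e = -1, (e + 1) // 2        # e = ceil(e/2)
        else:
            bits[i], e = e % 2, e // 2           # e = floor(e/2)
    return delta + e, bits                       # step 9

def recoded_value(delta, bits, w, d):
    """Evaluate delta * 2^{(w-1)d} + sum(bits[i] * 2^i)."""
    return delta * (1 << ((w - 1) * d)) + sum(b << i for i, b in enumerate(bits))
```

For example, k = 7 with w = 2 recodes to δ = 1 and bit-columns whose most significant bits are ±1, as Theorem 1 requires.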
4.2 Signed MSB-Set Comb Methods
By applying the recoding method of Sect. 4.1, we obtain the comb method Alg. 3. If the most significant bit of K_i is 1̄, we have K_i = −|K_i|; in this case, Step 6 in Alg. 3 actually computes Q = Q − |K_i|P.
M. Feng et al.
Algorithm 3. Signed MSB-Set Comb Method (d = ⌈n/w⌉).
Input: A point P, an integer k > 0, and a window width w ≥ 2.
Output: Q = kP.
Precomputation Stage:
1. Compute [1, b_{w−2}, ..., b_1, b_0]P for all (b_{w−2}, ..., b_1, b_0) ∈ {0, 1}^{w−1}. (Note that [1, 0, ..., 0, 0] = 2^{(w−1)d}.)
2. Apply Alg. 2 to k to compute the corresponding bit-columns K_0, K_1, ..., K_{d−1} and δ.
3. Q = O.
Evaluation Stage:
4. For i = d − 1 to 0 by −1 do:
5.   Q = 2Q,
6.   Q = Q + K_i P.
7. Return Q = Q + δ·2^{(w−1)d} P (i.e., return Q = Q + δ·[1, 0, ..., 0, 0]P).
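To check the structure of Alg. 3, here is a Python sketch in which plain integer addition stands in for the elliptic-curve group operations (so 2Q plays the role of point doubling and + of point addition). The recoding of Alg. 2 is repeated so the sketch is self-contained, and all names are illustrative, not from the paper.

```python
def msb_set_recode(k, w):
    """Recoding of Alg. 2 (see Sect. 4.1); returns delta and wd signed bits."""
    n = k.bit_length()
    d = -(-n // w)
    bits = [(k >> i) & 1 for i in range(w * d)]
    delta = bits[(w - 1) * d] - 1
    bits[(w - 1) * d] = 1
    e = k & ((1 << ((w - 1) * d)) - 1)
    for i in range((w - 1) * d + 1, w * d):
        if bits[i] == 0:
            bits[i] = 1
            bits[i - 1] = -1
    for i in range((w - 1) * d):
        if e % 2 == 1 and bits[(i % d) + (w - 1) * d] == -1:
            bits[i], e = -1, (e + 1) // 2
        else:
            bits[i], e = e % 2, e // 2
    return delta + e, bits

def comb_multiply(k, P, w):
    """Alg. 3 with integers standing in for curve points: returns kP."""
    delta, bits = msb_set_recode(k, w)
    d = len(bits) // w
    # Precomputation: [1, b_{w-2}, ..., b_0]P for all 2^{w-1} combinations.
    table = {}
    for m in range(1 << (w - 1)):
        col = 1 << ((w - 1) * d)          # the forced leading 1
        for j in range(w - 1):
            if (m >> j) & 1:
                col += 1 << (j * d)
        table[m] = col * P
    Q = 0
    for i in range(d - 1, -1, -1):        # evaluation stage
        Q = 2 * Q                         # "point doubling"
        sign = bits[(w - 1) * d + i]      # MSB of column K_i: +1 or -1
        m = 0
        for j in range(w - 1):
            if bits[j * d + i] != 0:      # lower bits share the MSB's sign
                m |= 1 << j
        Q = Q + sign * table[m]           # "point addition/subtraction"
    return Q + delta * ((1 << ((w - 1) * d)) * P)   # step 7
```

Note the table holds only 2^{w−1} entries: the sign of each column is applied at evaluation time, which is exactly why half the precomputed points suffice.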
Alg. 3 is not an SPA-resistant comb method, since SPA is able to detect whether δ is zero in Step 7 of Alg. 3. Since K_i ≠ 0 for all i, the operations in the for loop of Alg. 3 are a sequence of alternating point doublings (D) and point additions (A), DADA···DADA. By inserting a potential dummy operation after the for loop, we can easily convert the above SPA-nonresistant method into an SPA-resistant method, as described in the following algorithm.

Algorithm 4. SPA-Resistant Signed MSB-Set Comb Method (d = ⌈n/w⌉).
Input: A point P, an integer k > 0, and a window width w ≥ 2.
Output: Q = kP.
Precomputation Stage:
1. Compute [1, b_{w−2}, ..., b_1, b_0]P for all (b_{w−2}, ..., b_1, b_0) ∈ {0, 1}^{w−1}.
2. Apply Alg. 2 to k to compute the corresponding bit-columns K_0, K_1, ..., K_{d−1} and δ.
3. Q_0 = O.
Evaluation Stage:
4. For i = d − 1 to 0 by −1 do:
5.   Q_0 = 2Q_0,
6.   Q_0 = Q_0 + K_i P.
7. Set Q_1 = Q_0 − (−1)^{b_{(w−1)d}} · [1, 0, ..., 0, 0]P.
8. Return Q_{|δ|}.
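The post-processing of Steps 7–8 can be isolated in a short sketch (integer stand-in for points, names our own): the subtraction in Step 7 is performed unconditionally, and Step 8 merely selects Q_0 or Q_1, so the operation sequence does not depend on whether δ is zero.

```python
def spa_final_stage(Q0, delta, b_top, P, w, d):
    """Steps 7-8 of Alg. 4. Q0 is the for-loop output, which represents
    (k - delta * 2^{(w-1)d}) P; b_top is the original bit b_{(w-1)d} of k."""
    base = (1 << ((w - 1) * d)) * P
    Q1 = Q0 - ((-1) ** b_top) * base   # step 7: always one "point subtraction"
    return Q1 if delta != 0 else Q0    # step 8: return Q_{|delta|}
```

Since δ = b_{(w−1)d} − 1 + e_{(w−1)d}, the sign (−1)^{b_{(w−1)d}} makes Q_1 correct whenever δ = ±1, while for δ = 0 the subtraction is a dummy whose result is discarded.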
In Steps 7–8 of Alg. 4, we have exploited the fact that δ = b_{(w−1)d} − 1 + e_{(w−1)d} from Alg. 2, where b_{(w−1)d} and e_{(w−1)d} are in {0, 1}. This fact implies that b_{(w−1)d} must be 0 if δ is −1, and b_{(w−1)d} must be 1 if δ is 1.

4.3 Security Against Power Analysis
Security of our proposed SPA-resistant point multiplication method Alg. 4 is discussed in this section. We first consider its security against SPA, and then
describe how to convert the method to resist DPA, second-order DPA, and other side-channel attacks. Like other SPA-resistant methods [17, 15, 16], Alg. 4 exploits the fact that point subtraction is virtually the same as point addition with respect to power analysis. It performs one point addition (or subtraction) and one doubling in each iteration of the loop, and there is always one point addition (or subtraction) in Step 7. This means that the same operation sequence is executed regardless of the value of the scalar k. Therefore SPA cannot extract any information about the secret k by examining the power consumption during execution of Alg. 4's point multiplication; in other words, our comb method Alg. 4 is really SPA-resistant.

An SPA-resistant method is not necessarily resistant to DPA attacks, as shown for other SPA-resistant point multiplication methods [17, 15, 16]. This is also true for our SPA-resistant method Alg. 4. Typical measures such as randomized projective coordinates or random isomorphic curves can be used to convert Alg. 4 into a DPA-resistant method. The aforementioned randomization measures may not be enough to resist the second-order DPA attack proposed by Okeya and Sakurai [6]. This second-order attack exploits the correlation between power consumption and the Hamming weight of the loaded data to determine which K_i is loaded. To thwart it, we can use the same scheme proposed in [17] to protect HPB's methods: randomize all precomputed points after fetching a point from the table, so that there is no fixed Hamming weight. Goubin [7] recently proposed a refined DPA attack on many randomization schemes. This attack employs special points with one coordinate being zero.
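Whether a curve even admits Goubin-style special points with x = 0 can be checked with Euler's criterion. The small sketch below (our own helper names, not from the paper) tests whether b is a quadratic residue mod p, i.e., whether a point (0, y) exists on y^2 = x^3 + ax + b over F_p.

```python
def is_quadratic_residue(a, p):
    """Euler's criterion for an odd prime p and gcd(a, p) = 1:
    a is a QR mod p iff a^((p-1)/2) = 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1

def admits_zero_x_point(b, p):
    """A point (0, y) lies on y^2 = x^3 + ax + b over F_p
    iff b is 0 or a quadratic residue mod p."""
    return b % p == 0 or is_quadratic_residue(b % p, p)
```

Choosing b to be a quadratic non-residue (so `admits_zero_x_point` is false) is exactly the curve-selection criterion discussed next.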
To deal with Goubin's DPA attack, a simple approach is to choose elliptic curves E: y^2 = x^3 + ax + b defined over F_p (p > 3) with b not a quadratic residue modulo p, and to reject any point (x, 0) as an input point in applications of our proposed SPA-resistant method. If the cardinality #E(F_p) is a big prime number, points (x, 0) cannot be eligible input points since they are not on the curve. Another, more powerful attack, the zero-value point attack proposed in [8], also requires certain prerequisite conditions on the elliptic curve to be used, although these conditions are weaker than those of Goubin's attack [7]. Careful selection of the elliptic curve removes these security threats.

4.4 Efficiency
Both of our comb methods, Algs. 3–4, require storage of 2^{w−1} points. In the precomputation stage of our comb methods, 2^d P, 2^{2d} P, ..., 2^{(w−1)d} P are first calculated, which costs (w−1)d point doublings. Then all possible combinations [1, b_{w−2}, ..., b_1, b_0]P with (b_{w−2}, ..., b_1, b_0) ∈ {0, 1}^{w−1} are calculated in the same way as in the precomputation stage of Alg. 1, which costs 2^{w−1} − 1 point additions. The total cost of our comb methods in the precomputation stage is
Table 1. Comparison of space and average time costs for the SPA-nonresistant comb methods

            Original Comb                        sLSBS Comb            Alg. 3
d           ⌈n/w⌉                                ⌈(n+1)/w⌉             ⌈n/w⌉
Storage     2^w − 1                              2^{w−1}               2^{w−1}
Pre-Stage   (w−1)dD                              (w−1)dD               (w−1)dD
            (2^w − w − 1)A                       (2^{w−1} − 1)A        (2^{w−1} − 1)A
Eva-Stage   (d−1)D                               (d−1)D                (d−1)D
            ((2^w − 1)/2^w)(d−1)A                (d − 1/2)A            (d − 1/2)A
Total       (wd−1)D                              (wd−1)D               (wd−1)D
Cost        (2^w − w − 1 + ((2^w − 1)/2^w)(d−1))A  (2^{w−1} + d − 3/2)A  (2^{w−1} + d − 3/2)A
therefore {(w−1)d}D + {2^{w−1} − 1}A. The time costs of our comb methods in the evaluation stage vary a little due to the post-processing after the for loop. Assuming that the scalar k is uniformly distributed, the average cost in the evaluation stage is (d−1)D + (d − 1/2)A for Alg. 3 and (d−1)D + dA for Alg. 4. We first compare our comb method Alg. 3 with the other fixed-base comb methods without considering SPA-resistance. Table 1 lists the space and time costs of the original comb method Alg. 1, sLSBS [19], and Alg. 3. Compared with the original comb method Alg. 1, our comb method Alg. 3 stores 2^{w−1} points, about half of the 2^w − 1 points stored by the original comb method. In addition, Alg. 3 saves 2^{w−1} − w point additions in the precomputation stage. The evaluation stage has a similar time cost for both methods, as shown in Table 1. If we want to maintain about the same storage for precomputed points, our method Alg. 3 can choose w = w_1 + 1, one larger than the value w_1 used in the original comb method, resulting in similar storage (2^{w_1} vs. 2^{w_1} − 1) yet much faster computation in both the precomputation and evaluation stages, thanks to the smaller d used in our method. Compared with sLSBS, d changes from ⌈(n+1)/w⌉ in sLSBS to ⌈n/w⌉ in Alg. 3. When n is not divisible by w, both methods have the same value of d, resulting in the same time cost. When n is divisible by w, Alg. 3 uses a d that is one smaller than that used in sLSBS, resulting in smaller time costs in both the precomputation and evaluation stages. To deal with this problem, FZXL proposed several different schemes to ensure execution efficiency, whereas our proposed comb methods handle the issue in a uniform manner. Let us now compare our SPA-resistant comb method Alg. 4 with other SPA-resistant comb methods.
Table 2 lists the space and time costs of the two HPB methods, sLSBS's SPA-resistant counterpart, and our SPA-resistant comb method Alg. 4. All the comb methods except sNZ store 2^{w−1} precomputed points, while sNZ stores 2^w − 1 precomputed points, about twice as many as the other comb methods. Our Alg. 4 executes one less point doubling in the evaluation stage than the other three SPA-resistant comb methods, and requires far fewer point additions in the precomputation stage than both of HPB's methods. Alg. 4 shows the same advantage over sLSBS's
Table 2. Comparison of space and average time costs for SPA-resistant comb methods
            sNZ Comb            sABS Comb         sLSBS^a Comb        Alg. 4
d           ⌈n/w⌉               ⌈n/w⌉             ⌈(n+1)/w⌉           ⌈n/w⌉
Storage     2^w − 1             2^{w−1}           2^{w−1}             2^{w−1}
Pre-Stage   (w−1)dD             (w−1)dD           (w−1)dD             (w−1)dD
            (2^w − w − 1)A      (2^w − w)A        (2^{w−1} − 1)A      (2^{w−1} − 1)A
Eva-Stage   dD                  dD                dD                  (d−1)D
            dA                  dA                dA                  dA
Total Cost  wdD                 wdD               wdD                 (wd−1)D
            (2^w − w + d − 1)A  (2^w − w + d)A    (2^{w−1} + d − 1)A  (2^{w−1} + d − 1)A

^a SPA-resistant version
SPA-resistant counterpart as Alg. 3 has over sLSBS, due to the smaller d used in Alg. 4 when n is divisible by w. Table 2 shows that our Alg. 4 is the most efficient SPA-resistant comb method.
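The totals in Table 2 are easy to tabulate. The following sketch (formulas copied from the table, function name our own) returns the average-case (doubling, addition) counts for each SPA-resistant method; for example, with n = 160 and w = 4 it gives Alg. 4 the smallest counts in both columns.

```python
from math import ceil

def spa_comb_costs(n, w):
    """Average total (D, A) operation counts, per Table 2."""
    d = ceil(n / w)            # sNZ, sABS, and Alg. 4
    d1 = ceil((n + 1) / w)     # sLSBS uses d = ceil((n+1)/w)
    return {
        "sNZ":   (w * d,     2**w - w + d - 1),
        "sABS":  (w * d,     2**w - w + d),
        "sLSBS": (w * d1,    2**(w - 1) + d1 - 1),
        "Alg.4": (w * d - 1, 2**(w - 1) + d - 1),
    }
```

For n = 160, w = 4 this yields sNZ (160D, 51A), sABS (160D, 52A), sLSBS (164D, 48A), and Alg. 4 (159D, 47A).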
5 Conclusion
In this paper, we proposed a novel comb recoding algorithm that converts an integer into a representation with a set of signed MSB-set nonzero comb bit-columns. Using this recoding algorithm, we presented a signed MSB-set comb method and an SPA-resistant comb method to calculate point multiplication for ECC. Security of the proposed SPA-resistant comb method and comparisons of the proposed comb methods with other comb methods were also discussed. Our comb methods are among the most efficient comb methods in terms of the number of precomputed points and computational complexity. Combined with randomization techniques and certain precautions in selecting elliptic curves and parameters, our proposed SPA-resistant comb method can thwart all side-channel attacks.
References

1. I. F. Blake, G. Seroussi, and N. P. Smart, Elliptic Curves in Cryptography, Cambridge Univ. Press, 1999.
2. C. Lim and P. Lee, “More Flexible Exponentiation with Precomputation,” Advances in Cryptology – CRYPTO’94, LNCS 839, pp. 95–107, Springer-Verlag, 1994.
3. P. C. Kocher, “Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS and Other Systems,” Advances in Cryptology – CRYPTO’96, LNCS 1109, pp. 104–113, Springer-Verlag, 1996.
4. P. C. Kocher, J. Jaffe, and B. Jun, “Differential Power Analysis,” Advances in Cryptology – CRYPTO’99, LNCS 1666, pp. 388–397, Springer-Verlag, 1999.
5. J. S. Coron, “Resistance Against Differential Power Analysis for Elliptic Curve Cryptosystems,” Cryptographic Hardware and Embedded Systems – CHES’99, LNCS 1717, pp. 292–302, Springer-Verlag, 1999.
6. K. Okeya and K. Sakurai, “A Second-Order DPA Attack Breaks a Window-Method Based Countermeasure against Side Channel Attacks,” Proc. 5th Intl. Conf. on Information Security, LNCS 2433, pp. 389–401, Springer-Verlag, 2002.
7. L. Goubin, “A Refined Power-Analysis Attack on Elliptic Curve Cryptosystems,” Public Key Cryptography – PKC’2003, LNCS 2567, pp. 199–211, Springer-Verlag, 2003.
8. T. Akishita and T. Takagi, “Zero-Value Point Attacks on Elliptic Curve Cryptosystem,” Information Security Conference – ISC’2003, LNCS 2851, pp. 218–233, Springer-Verlag, 2003.
9. P. Y. Liardet and N. P. Smart, “Preventing SPA/DPA in ECC Systems Using the Jacobi Form,” Cryptographic Hardware and Embedded Systems – CHES’2001, LNCS 2162, pp. 391–401, Springer-Verlag, 2001.
10. M. Joye and J. J. Quisquater, “Hessian Elliptic Curves and Side-Channel Attacks,” Cryptographic Hardware and Embedded Systems – CHES’2001, LNCS 2162, pp. 402–410, Springer-Verlag, 2001.
11. K. Okeya, H. Kurumatani, and K. Sakurai, “Elliptic Curves with the Montgomery-Form and Their Cryptographic Applications,” Public Key Cryptography – PKC’2000, LNCS 1751, pp. 238–257, Springer-Verlag, 2000.
12. W. Fischer, C. Giraud, E. W. Knudsen, and J.-P. Seifert, “Parallel Scalar Multiplication on General Elliptic Curves over Fp Hedged against Non-Differential Side-Channel Attacks,” IACR Cryptology ePrint Archive 2002/007, http://eprint.iacr.org/2002/007, 2002.
13. T. Izu and T. Takagi, “A Fast Parallel Elliptic Curve Multiplication Resistant against Side Channel Attacks,” Public Key Cryptography – PKC 2002, LNCS 2274, pp. 280–296, 2002.
14. E. Brier and M. Joye, “Weierstrass Elliptic Curves and Side-Channel Attacks,” Public Key Cryptography – PKC 2002, LNCS 2274, pp. 335–345, 2002.
15. B. Möller, “Securing Elliptic Curve Point Multiplication against Side-Channel Attacks, Addendum: Efficiency Improvement,” http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/moeller/ecc-scaisc01.pdf, 2001.
16. K. Okeya and T.
Takagi, “A More Flexible Countermeasure against Side Channel Attacks Using Window Method,” Cryptographic Hardware and Embedded Systems – CHES’2003, LNCS 2779, pp. 397–410, 2003.
17. M. Hedabou, P. Pinel, and L. Bénéteau, “A Comb Method to Render ECC Resistant against Side Channel Attacks,” http://eprint.iacr.org/2004/342.pdf, 2004.
18. M. Hedabou, P. Pinel, and L. Bénéteau, “Countermeasures for Preventing Comb Method Against SCA Attacks,” Information Security Practice and Experience Conference – ISPEC’2005, LNCS 3439, pp. 85–96, Springer-Verlag, 2005.
19. M. Feng, B. Zhu, M. Xu, and S. Li, “Efficient Comb Methods for Elliptic Curve Point Multiplication Resistant to Power Analysis,” http://eprint.iacr.org/2005/222.
20. B. Chevallier-Mames, M. Ciet, and M. Joye, “Low-Cost Solutions for Preventing Simple Side-Channel Analysis: Side-Channel Atomicity,” IEEE Transactions on Computers, vol. 53, no. 6, pp. 760–768, June 2004.
21. M. Joye and C. Tymen, “Protections against Differential Analysis for Elliptic Curve Cryptography – An Algebraic Approach,” Cryptographic Hardware and Embedded Systems – CHES’2001, LNCS 2162, pp. 377–390, Springer-Verlag, 2001.
Diophantine Approximation Attack on a Fast Public Key Cryptosystem Wang Baocang and Hu Yupu Key Laboratory of Computer Networks & Information Security, Ministry of Education, Xidian University, Xi’an, 710071, P.R. China [email protected], [email protected]
Abstract. At ACISP 2000, H. Yoo et al. proposed a public key cryptosystem using matrices over a ring, which was analyzed using lattice basis reduction algorithms by Youssef et al. at ACISP 2001. In this paper, another attack, namely a Diophantine approximation attack, is presented. It is shown that the decryption of the cryptosystem can be transformed into solving the simultaneous Diophantine approximation problem, which can be approximated by lattice basis reduction algorithms; thus we give a heuristic argument that the scheme is insecure. Furthermore, our new attack is more general than the lattice attack.

Keywords: Public-key cryptosystem, Cryptanalysis, Simultaneous Diophantine approximation problem, Lattice basis reduction, Diophantine approximation.
1 Introduction

Traditional public key cryptosystems (PKC) such as RSA [1] and ElGamal [2] suffer from the drawback that their speed is relatively low, which motivates cryptographers to design faster PKCs. Several fast PKCs have been proposed in the literature, among them several lattice-based PKCs, such as Ajtai-Dwork [3], GGH [4], and NTRU [5], knapsack trapdoors, such as [6], braid-based PKCs [7], PKCs using finite non-Abelian groups [8], and so on. However, using the discrete logarithm problem and the integer factorization problem to construct fast PKCs has remained an open problem in the literature. At ACISP 2000, Yoo et al. proposed a fast public key cryptosystem using matrices over a ring [9]. The security of the proposed scheme was claimed to rest on the integer factorization problem in the original paper [9], and it is also the first fast asymmetric cryptographic scheme based on the integer factorization problem. It is shown in [9] that if integers of the form n = pq, with p and q prime, could be factored efficiently, the scheme would be totally broken immediately. At ACISP 2001, Youssef et al. presented a lattice reduction attack on it [10]; the factorization of the modulus N allows one to find the trapdoor information. This paper provides another attack, i.e., a Diophantine approximation attack. We show that if one can solve the simultaneous Diophantine approximation problem efficiently, one can also decrypt any given challenge ciphertext in polynomial time. However, no efficient algorithm is known for the problem. If this problem

K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 25–32, 2006. © Springer-Verlag Berlin Heidelberg 2006
B. Wang and Y. Hu
is viewed as involving lattices, lattice basis reduction algorithms provide an efficient way to approximate the problem. Unlike the lattice attack [10], our new attack is not restricted by the structure of N, so it is more general. We briefly describe the public key cryptosystem in Section 2. Section 3 reviews the attack described in [10]. Our new analysis is detailed in Section 4. Section 5 concludes.
2 Description of the Scheme

Throughout this paper, we use a mod n to denote the least nonnegative remainder of a divided by n, and ⟨a⟩_n to denote the absolute least remainder of a divided by n; for example, 5 mod 7 = 5 and ⟨5⟩_7 = −2. This section gives the description of the cryptographic scheme proposed in [9]; refer to [9] for more details.

Key generation. The proposed encryption scheme involves lattices of dimension n. The basic steps to choose the parameters run as follows. Randomly choose primes p and q of size 512 bits, a matrix D over the ring Mat_n(Z), and integers m, e, d_ii, i = 1, ..., n. The parameters satisfy the following conditions: N = pq; m ≈ q^0.4, e ≈ q^0.3, where m and e are upper bounds of the message and error vectors, respectively. D is a diagonal matrix with diagonal entries d_ii such that m

> ∆T, where ∆T denotes the expected valid time interval for transmission delay, then the remote system rejects the login request.
3. Computes C1* = h(h(IDi ⊕ x) ⊕ Tu). If C1* is equal to the received C1, the user is authentic and the remote system accepts the login request and performs step 4; otherwise, the login request is rejected.
4. For mutual authentication, the remote system computes C2 = h(h(IDi ⊕ x) ⊕ Ts) and then sends the mutual authentication message {C2, Ts} to Ui.
5. Upon receiving the message {C2, Ts}, the user checks whether Ts is invalid or Tu = Ts; if so, Ui terminates this session, otherwise it performs step 6.
6. Ui computes C2* = h(Bi ⊕ Ts) and compares C2* with C2. If they are equal, the user believes that the remote party is the authentic system and the mutual authentication between Ui and the remote server is completed; otherwise Ui terminates the operation.
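The login and authentication exchange above can be sketched with a concrete hash. In the snippet below, all names are our own, SHA-256 stands in for h(.), timestamps are encoded as 32-byte blocks so they can be XORed with digests, and ∆T is an illustrative constant; it mirrors steps 2–6.

```python
import hashlib

H = lambda b: hashlib.sha256(b).digest()               # stand-in for h(.)
xor = lambda a, b: bytes(p ^ q for p, q in zip(a, b))  # equal-length XOR
ts = lambda t: int(t).to_bytes(32, "big")              # timestamp as 32-byte block

DELTA_T = 60   # allowed transmission delay in seconds (illustrative)

def user_login(ID_i, B_i, t_u):
    """Login: the card holds B_i = h(ID_i xor x) and sends C1 = h(B_i xor T_u)."""
    return ID_i, H(xor(B_i, ts(t_u))), t_u

def server_authenticate(msg, x, t_now):
    """Steps 2-4: freshness check, verify C1, answer with C2 = h(h(ID_i xor x) xor T_s)."""
    ID_i, C1, t_u = msg
    if t_now - t_u > DELTA_T:
        return None                               # step 2: stale request
    B_i = H(xor(ID_i, x))                         # server recomputes h(ID_i xor x)
    if H(xor(B_i, ts(t_u))) != C1:
        return None                               # step 3: forged C1
    return H(xor(B_i, ts(t_now))), t_now          # step 4: {C2, T_s}

def user_verify(reply, B_i, t_u, t_now):
    """Steps 5-6: check T_s, then compare C2* = h(B_i xor T_s) with C2."""
    if reply is None:
        return False
    C2, t_s = reply
    if t_now - t_s > DELTA_T or t_s == t_u:
        return False
    return H(xor(B_i, ts(t_s))) == C2
```

The T_u = T_s check in `user_verify` is what blocks the parallel-session/reflection replay discussed later.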
An Efficient and Practical Fingerprint-Based Remote User Authentication Scheme
Password Change Phase. Whenever Ui wants to change or update his/her old password pwi to a new password pwi', he/she inserts the smart card into the terminal, enters IDi and pwi, and imprints the fingerprint biometric at the sensor. If Ui is verified successfully, the smart card performs the following operations without any help from the remote system:
1. Computes Bi = Vi ⊕ h(pwi ⊕ Si) = h(IDi ⊕ x).
2. Verifies whether Bi equals the stored
Ai. If they are equal, the smart card performs the further operations; otherwise it terminates the operation.
3. Computes Vi' = Bi ⊕ h(pwi' ⊕ Si).
4. Stores Vi' on the smart card, replacing the old value Vi. The new password is now successfully updated and this phase is terminated.
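The phase above can be sketched concretely. In the snippet below (function names our own, SHA-256 standing in for h, and all values assumed to be 32-byte strings), the card recovers Bi, checks it against the stored Ai, and re-masks it under the new password.

```python
import hashlib

H = lambda b: hashlib.sha256(b).digest()               # stand-in for h(.)
xor = lambda a, b: bytes(p ^ q for p, q in zip(a, b))  # equal-length XOR

def change_password(V_i, A_i, pw_old, pw_new, S_i):
    """Password change on the smart card (steps 1-4).
    Returns the new V_i', or None if verification fails."""
    B_i = xor(V_i, H(xor(pw_old, S_i)))    # step 1: B_i = V_i xor h(pw_i xor S_i)
    if B_i != A_i:                         # step 2: wrong password or biometric
        return None
    return xor(B_i, H(xor(pw_new, S_i)))   # steps 3-4: V_i' = B_i xor h(pw_i' xor S_i)
```

Because a wrong old password (or a non-matching fingerprint template S_i) yields a B_i that does not equal A_i, a stolen card alone cannot be used to set a new password.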
5 Security Analysis of the Proposed Scheme

In this section, we analyze the security of the presented scheme.
1. It is very difficult for anyone to derive the server's secret key x from the hash value Ai = h(IDi ⊕ x), because of the one-way property of hash functions [6].
2. To withstand replay attacks, neither the replay of an old login message {IDi, C1, Tu} nor the replay of the remote system's response {C2, Ts} will work; they would fail in steps 2 and 5 of the authentication phase, respectively, because of the time-interval validation.
3. From the login message {IDi, C1, Tu}, it is infeasible to compute Bi from the equation C1 = h(Bi ⊕ Tu), because h is a secure one-way hash function.
4. The proposed scheme protects against the forgery and impersonation attacks found in Lee et al.'s scheme. An attacker can attempt to modify the login message {IDi, C1, Tu} into {IDi, CA, TA}. However, this impersonation attempt will fail in step 3 of the authentication phase, because the attacker has no way of obtaining the value Bi = h(IDi ⊕ x) needed to compute a valid C1.
5. The server spoofing attack is completely solved by providing mutual authentication between the user and the remote system. The remote system sends the mutual authentication message {C2, Ts} to the user. If an attacker intercepts it and resends a forged message {CA, TA} to the user, this will be detected in steps 5 and 6 of the authentication phase because the value of C2 is computed as C2 = h(h(IDi ⊕ x) ⊕ Ts). In addition, replay of this message is exposed because of the timestamp.
M.K. Khan and J. Zhang
6. The proposed scheme can prevent the parallel session attack [14] and the reflection attack [28], because the remote server and the user check whether Tu = Ts, respectively.
7. In the password change phase, the user has to verify himself/herself by the fingerprint biometric, and it is not possible to impersonate a legal user, because the biometric is unique. Furthermore, the value of Bi is also compared with the value of Ai on the smart card; if these two values are not the same, the user is not allowed to change the password. Moreover, if the smart card is stolen or lost, unauthorized users cannot change to a new password. Hence, our scheme is protected from the denial-of-service attack through stolen smart cards [21].
6 Efficiency of the Proposed Scheme

The performance and efficiency of the proposed scheme are summarized in Table 1. Our scheme is based entirely on one-way collision-free hash functions, which are computationally faster than modular exponentiations [6]. In the registration, login, and authentication phases, without mutual authentication, Lee et al.'s scheme requires 8 modular exponentiations and 2 hash operations, while the proposed scheme requires only 9 hash computations including mutual authentication; the computational complexity of our scheme over the authentication life cycle is therefore lower than that of Lee et al.'s scheme. Furthermore, the proposed scheme enables users to choose their passwords freely at registration time, while Lee et al.'s scheme does not allow users to choose their passwords: there, passwords are computed by the remote system and are long and random, so users cannot easily remember them. In our scheme users can change or update their passwords securely without the help of the remote system. In contrast, in Lee et al.'s scheme, users have no way to change or update their passwords, which is an inefficient solution and does not fulfill the

Table 1. Comparisons of efficiency
                                      Lee et al.'s Scheme [8]   Proposed Scheme
Computation in registration phase     2 TExp                    2 TH
Computation in login phase            3 TExp, 1 TH              2 TH
Computation in authentication phase   3 TExp, 1 TH              5 TH (with mutual authentication)
Choose password                       Not allowed               Allowed
Password change phase                 Not supported             Supported
Mutual authentication                 Not supported             Supported
Public information on smart card      f(.), p                   h(.)
Length of authentication message      C = {IDi, C1, C2, T}      m = {IDi, C1, Tu}

TExp: the computation time for a modular exponentiation. TH: the computation time for a one-way hash function.
requirement of authentication protocols [18] [22]. In addition, compared with Lee et al.'s scheme, only our scheme supports mutual authentication to protect the system from the server spoofing attack [21] [26]; through mutual authentication the user also authenticates the remote party and establishes trust in the authenticity of the server. The memory space on smart cards is limited and costly [29], so, with this in mind, our scheme consumes less memory space on the smart card to store the user's public information. Besides, the length of the authentication message transmitted to the remote system is also shorter in our scheme. Hence, the proposed scheme is more efficient and more secure than Lee et al.'s scheme in terms of computation, performance, efficiency, and security.
7 Conclusion

In this paper, we demonstrated that a recently proposed fingerprint-based remote user authentication scheme by Lee et al. is vulnerable to attack and has some practical pitfalls. To overcome the discrepancies of their system, we proposed an efficient and practical fingerprint-based remote user authentication scheme using smart cards. The proposed scheme is based on one-way collision-free hash functions and does not maintain password tables on the remote server. Furthermore, users can choose their passwords freely and change or update them securely whenever they want. In addition, mutual authentication between the user and the remote system is introduced, so users can trust the authenticity of the remote party; hence, the proposed scheme is protected from the server spoofing attack found in Lee et al.'s scheme. Moreover, the lower computational cost, improved security, greater user-friendliness, and increased efficiency of our scheme represent a significant improvement over Lee et al.'s scheme.
Acknowledgements This project is supported by ‘Southwest Jiaotong University Doctors Innovation Funds 2005’.
References

1. Lamport, L.: Password Authentication with Insecure Communication, Communications of the ACM 24 (11) (1981) 770-772.
2. Hwang, M.S., Li, L.H.: A New Remote User Authentication Scheme Using Smart Cards, IEEE Transactions on Consumer Electronics 46 (1) (2000) 28-30.
3. El Gamal, T.: A Public-key Cryptosystem and a Signature Scheme Based on Discrete Logarithms, IEEE Transactions on Information Theory 31 (4) (1985) 469-472.
4. Wang, S.J., Chang, J.F.: Smart Card Based Secure Password Authentication Scheme, Computers and Security 15 (3) (1996) 231-237.
5. Yang, W.H., Shieh, S.P.: Password Authentication Schemes with Smart Cards, Computers and Security 18 (8) (1999) 727-733.
6. Sun, H.M.: An Efficient Remote User Authentication Scheme Using Smart Cards, IEEE Transactions on Consumer Electronics 46 (4) (2000) 958-961.
7. Lee, C.C., Hwang, M.S., Yang, W.P.: A Flexible Remote User Authentication Scheme Using Smart Cards, ACM Operating Systems Review 36 (3) (2002) 46-52.
8. Lee, J.K., Ryu, S.R., Yoo, K.Y.: Fingerprint-based Remote User Authentication Scheme Using Smart Cards, IEE Electronics Letters (12) (2002) 554-555.
9. Hsieh, B.T., Yeh, H.T., Sun, H.M., Lin, C.T.: Cryptanalysis of a Fingerprint-based Remote User Authentication Scheme Using Smart Cards, Proc. IEEE 37th Annual 2003 Int. Carnahan Conf. on Security Technology, Taipei, Taiwan, (2003) 349-350.
10. Shen, J.J., Lin, C.W., Hwang, M.S.: A Modified Remote User Authentication Scheme Using Smart Cards, IEEE Transactions on Consumer Electronics 49 (2) (2003) 414-416.
11. Chang, C.C., Hwang, K.F.: Some Forgery Attacks on a Remote User Authentication Scheme Using Smart Cards, Informatics 14 (3) (2003) 289-294.
12. Wu, S.T., Chieu, B.C.: A User Friendly Remote Authentication Scheme with Smart Cards, Computers & Security 22 (6) (2003) 547-550.
13. Leung, K.C., Cheng, L.M., Fong, A.S., Chan, C.K.: Cryptanalysis of a Modified Remote User Authentication Scheme Using Smart Cards, IEEE Transactions on Consumer Electronics 49 (4) (2003) 1243-1245.
14. Hsu, C.L.: Security of Chien et al.'s Remote User Authentication Scheme Using Smart Cards, Computer Standards and Interfaces 26 (3) (2004) 167-169.
15. Kumar, M.: New Remote User Authentication Scheme Using Smart Cards, IEEE Transactions on Consumer Electronics 50 (2) (2004) 597-600.
16. Yang, C.C., Wang, R.C.: Cryptanalysis of a User Friendly Remote Authentication Scheme with Smart Cards, Computers & Security 23 (5) (2004) 425-427.
17. Wu, S.T., Chieu, B.C.: A Note on a User Friendly Remote User Authentication Scheme with Smart Cards, IEICE Transactions Fundamentals 87-A (8) (2004) 2180-2181.
18. Yoon, E.J., Ryu, E.K., Yoo, K.Y.: Efficient Remote User Authentication Scheme Based on Generalized ElGamal Signature Scheme, IEEE Trans.
Consumer Electronics 50 (2) (2004) 568-570.
19. Hsu, C.L.: Security of Chien et al.'s Remote User Authentication Scheme Using Smart Cards, Computer Standards and Interfaces 26 (3) (2004) 167-169.
20. Lin, C.H., Lai, Y.Y.: A Flexible Biometrics Remote User Authentication Scheme, Computer Standards and Interfaces 27 (1) (2004) 19-23.
21. Yoon, E.J., Ryu, E.K., Yoo, K.Y.: An Improvement of Hwang-Lee-Tang's Simple Remote User Authentication Scheme, Computers and Security 24 (2005) 50-56.
22. Ku, W.C., Chang, S.T., Chiang, M.H.: Further Cryptanalysis of Fingerprint-based Remote User Authentication Scheme Using Smartcards, IEE Electronics Letters 41 (5) (2005).
23. Lu, R., Cao, Z.: Efficient Remote User Authentication Scheme Using Smart Card, Computer Networks (article in press), online April 2005.
24. Jain, A.K., Uludag, U.: Hiding Biometric Data, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (11) (2003) 1494-1498.
25. Jain, A.K., Hong, L., Bolle, R.: On-Line Fingerprint Verification, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (4) (1997) 302-314.
26. Asokan, N., Debar, H., Steiner, M., Waidner, M.: Authenticating Public Terminals, Computer Networks 31 (8) (April 1999) 861-870.
27. Anderson, R.J.: Why Cryptosystems Fail, Proc. of First ACM Conference on Computer and Communications Security, USA (Nov. 1993) 215-227.
28. Mitchell, C.: Limitations of Challenge-response Entity Authentication, Electronics Letters 25 (17) (Aug. 1989) 1195-1196.
29. Rankl, W., Effing, W. (Eds.): Smart Card Handbook, Third Edition, John Wiley & Sons, UK, 2003.
Domain-Based Mobile Agent Fault-Tolerance Scheme for Home Network Environments

Gu Su Kim and Young Ik Eom

School of Information and Communication Eng., Sungkyunkwan University,
300 Cheoncheon-dong, Jangan-gu, Suwon, Gyeonggi-do 440-746, Korea
{gusukim, yieom}@ece.skku.ac.kr
Abstract. Mobile agent technology is very useful in home network and ubiquitous environments. For several reasons, a mobile agent may fail to operate during its movement between platforms. A checkpoint can be used for mobile agent recovery; however, devices in home network environments generally have minimal or no secondary storage, so a checkpoint cannot be saved on a device within the home network. In this paper, a scheme that safely recovers a mobile agent using a checkpoint saved in the home gateway is presented. When the mobile agent enters a single home network environment, it registers its recovery policy and saves its checkpoint on the home gateway. At checkpoint-saving time, a symmetric key generated by the home gateway encrypts the checkpoint; this symmetric key is itself encrypted and stored together with the checkpoint. When the mobile agent terminates abnormally, or the device suddenly turns off from battery exhaustion or failure, the home gateway recognizes the exception in the mobile agent or device and recovers the mobile agent from the checkpoint according to the previously registered recovery policy.
1 Introduction
In a broad sense, an agent is any program that acts on behalf of a (human) user. A mobile agent is a program representing the user in a computer network, with the ability to migrate autonomously from node to node, performing computations on behalf of the user [1, 2]. Deploying mobile agent environments brings several advantages, such as reduction of network traffic, asynchronous and autonomous computation, dynamic adaptation capability, and robustness and fault-tolerance [3]. Therefore, by applying the mobile agent concept to home networks, remote interactions and network traffic among the home network devices are reduced [4, 5]. When a mobile agent moves among devices in a home network, a device may suddenly turn off due to battery exhaustion or failure, and the mobile agent may terminate abnormally. Therefore, fault-tolerance capabilities
This work was supported by National Center of Excellence in Ubiquitous Computing and Networking (CUCN), Korea.
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 269–277, 2006. c Springer-Verlag Berlin Heidelberg 2006
of the mobile agent need to be supported. In most systems supporting mobile agent fault-tolerance, the checkpoint is saved in a file system and the mobile agent is recovered using the checkpoint. However, devices in home network environments cannot save the checkpoint of the mobile agent in their own storage, because these devices have minimal or no storage as well as limited battery power. In this paper, a scheme is proposed that, in home network environments, safely recovers a mobile agent from failure caused by device battery exhaustion or failure, or by an exception in the mobile agent itself. The proposed scheme stores the checkpoints of mobile agents on the file system of the home gateway. For checkpoint confidentiality and security, the scheme encrypts the checkpoint when it is stored. In Section 2, related work on mobile agent fault-tolerance schemes is described. Section 3 describes our system architecture and the recovery process of the mobile agent using the checkpoint. Section 4 presents an analysis of the safety of the mechanism described in Section 3. Finally, Section 5 concludes with a summary.
2 Checkpoint Security of Mobile Agents
A mobile agent may terminate abnormally at any location due to failure of the host (the platform supporting the execution environment for the mobile agent) or of the mobile agent itself [6]. To provide fault-tolerance capability, a mobile agent replication scheme has been proposed [7]. This scheme replicates a mobile agent and executes multiple instances of the mobile agent concurrently. Even though one instance of the agent terminates abnormally, the other agents can complete the given job. However, this scheme increases the load on the system due to the execution of multiple instances, and it incurs a selection problem: a voting mechanism needs to be provided to select the correct result among the results produced by the many instances of the agent. Another scheme designed for fault-tolerance uses checkpoints to recover mobile agents from failure [8]; this scheme is adopted in commercial mobile agent systems such as Voyager [9], Tacoma [10], and Concordia [11]. In this scheme, the checkpoint of a mobile agent includes the execution image, byte code, and internal status of the mobile agent, and the system saves the checkpoint when required during execution. When a mobile agent fails to operate, the system restarts the mobile agent from the checkpoint. In this scheme, the mobile agent transmits a termination event to its home platform when it terminates abnormally, enabling the home platform to start the recovery procedure. Fault-tolerance schemes using checkpoints can be classified into agent-based schemes and platform-based schemes. In agent-based fault-tolerance schemes, a platform creates a special monitor agent. For example, Tacoma [10] creates a rear guard agent to monitor the execution of mobile agents on visiting platforms. If the rear guard agent finds an abnormal termination of a mobile agent, the rear guard
recovers the mobile agent from its latest checkpoint. If the rear guard terminates, the mobile agent platform simply recreates another rear guard. Using a monitor agent for fault tolerance has the merit that a different recovery policy can be applied for each mobile agent; however, it also has the disadvantage that frequent message exchanges occur between the monitor agent and the mobile agent. In platform-based fault-tolerance schemes, each platform manages the checkpoints involved in the recovery process of the mobile agents. This scheme is used in Voyager [9] and Concordia [11]. In this scheme, a mobile agent periodically stores its checkpoint in the file system of the visiting platform. When the platform detects an abnormal termination of the mobile agent, the platform recovers the mobile agent using the latest checkpoint. This scheme requires no message exchange between agents, but it has disadvantages: a platform can provide only one recovery policy to all mobile agents on the platform, and it is very difficult to provide a consistent recovery policy over all platforms.
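The selection problem of the replication scheme discussed earlier in this section requires a voting mechanism over the replica results. A minimal majority-vote sketch (the function name and the inputs are our own illustration, not part of any cited system):

```python
import collections

def majority_vote(results):
    # Select the most frequent result among the replica outputs;
    # ties are resolved arbitrarily by Counter ordering.
    if not results:
        raise ValueError("no replica returned a result")
    value, votes = collections.Counter(results).most_common(1)[0]
    return value, votes

# Three replicas of the agent ran; one returned a corrupted value.
result, votes = majority_vote(["42", "42", "corrupt"])
```

A real system would also need to decide how many replicas must agree before the result is accepted, which this sketch leaves open.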
Using the checkpoint concept can cause several security problems. Other mobile agents or programs can read or modify the checkpoints, because the checkpoints are saved in the file system. To keep the confidentiality and integrity of a checkpoint, it is necessary to encrypt it, as is done in Concordia. In Concordia, when an agent arrives at a platform, the Concordia Server creates a symmetric key and delivers it to the agent. The agent uses the symmetric key to encrypt its checkpoints. The symmetric key is itself encrypted with the public key of the platform and stored with the encrypted checkpoints. In the recovery stage, the checkpoints must be decrypted, and this decryption requires the server's private key. The system (or administrator) must ensure the confidentiality of the private key; this can be accomplished by storing the key on a secure file system or on removable media that can be physically secured by the administrator. The fault-tolerance scheme of Concordia guarantees the security of the checkpoints, but it still has the inherent disadvantages of platform-based schemes. Existing schemes save the checkpoint of the mobile agent in the file system of each platform and recover the mobile agent with the saved checkpoint. These schemes are not suitable for devices in home network environments, which have very small or no secondary storage and low CPU speed.
3 Domain-Based Fault-Tolerance Scheme of the Mobile Agent
In this section, a checkpoint-based mobile agent fault-tolerance scheme for home network environments is described.

3.1 System Architecture and Assumptions
We have developed a lightweight mobile agent system, called the KAgent System [12], which is executable on small devices based on Java 2 Micro Edition (J2ME) environments. The prototype of our domain-based fault-tolerance scheme is implemented in the KAgent System.
Fig. 1. System Architecture (HG: Home Gateway, ADF: Agent Directory Facility, ARM: Agent Recovery policy Manager, Cp: Checkpoint, MA: Mobile Agent; the domain is a single home network environment)
In the proposed scheme, a domain is defined as a single home network environment. A domain consists of a home gateway and several digital appliances. The home gateway has a fast CPU with abundant memory and secondary storage. Conversely, the digital appliances have slow CPUs with minimal memory and secondary storage. In addition, all devices in a domain are assumed to have a platform that supports the execution environment for mobile agents. The home gateway periodically monitors whether the devices in its domain are running or turned off. The home gateway manages the Agent Directory Facility (ADF), which provides a directory service for the mobile agents running in the domain. When a mobile agent moves between platforms, the target platform transmits the new location of the mobile agent to the ADF. The home gateway also has the Agent Recovery policy Manager (ARM) component for managing the recovery policies of mobile agents. Figure 1 presents the proposed system architecture. In this architecture, we assume that the home gateway in the domain is trusted and not malicious. Our scheme is not a mobile agent protection scheme but a protection scheme for the mobile agent's checkpoint saved in the home gateway; therefore, we do not address methods of protecting the mobile agent against malicious attackers on the appliances.

3.2 Checkpoint Management
When a mobile agent is created on its home platform, a unique verification value, called vkey, is obtained from the home platform. This value is generated as follows:

vkey = hash(MAimage, KMA)

MAimage is the code image of the mobile agent, and KMA is the private key of the mobile agent. The home platform stores the ID of the mobile agent and its vkey. The vkey is used to verify the checkpoint exception notification message during the recovery stage, discussed in Section 3.3.
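As a concrete illustration, the vkey computation can be sketched in Python. SHA-256 and byte concatenation are our assumed instantiation; the paper only specifies a generic hash over the code image and the agent's private key:

```python
import hashlib

def make_vkey(ma_image: bytes, ma_private_key: bytes) -> str:
    # vkey = hash(MAimage, KMA); SHA-256 over the concatenation is
    # an assumption, the paper does not fix a particular hash.
    return hashlib.sha256(ma_image + ma_private_key).hexdigest()

vkey = make_vkey(b"agent-bytecode", b"agent-private-key")
```

The home platform would store the pair (MA_ID, vkey) at agent creation and later recompute the same value to verify exception notification messages.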
When a mobile agent enters a domain, it moves to a platform in the domain through the home gateway. The home gateway first saves the checkpoint of the mobile agent in its file system. The checkpoint must be protected from eavesdropping by other agents or software entities, because it contains important private information. To ensure protection, the proposed scheme encrypts the checkpoint before storing it in the file system. The key used to encrypt the checkpoint must be safely managed. The recovery process is performed according to the recovery policy of the mobile agent. When a mobile agent enters a domain, it registers its ID, its recovery policy, and its vkey with the ARM. The recovery policy may take one of the following values:
– REPORT ONLY: the home gateway only reports an exception to the mobile agent user.
– REPORT RETURN: the home gateway transmits an exception report and the intermediate result of the mobile agent to the user.
– RECOVER RESTART: the home gateway recovers the mobile agent with its checkpoint and moves it again to its current visiting domain.
– RECOVER NEXTDOMAIN: the home gateway recovers the mobile agent with its checkpoint and moves it to the next domain to be visited.
The home gateway manages the checkpoint according to the recovery policy of the mobile agent. If the recovery policy is REPORT ONLY, then the home
MakeCheckpoint(MobileAgent ma)  // input: ma – mobile agent reference
{
    int MAID = ma.GetID();                  // extract the ID of ma
    int rpolicy = ma.GetRecoveryPolicy();   // extract the recovery policy of ma
    byte *vkey = ma.GetVKey();              // extract the verification key of ma
    SaveRecoveryPolicy(ARM, MAID, rpolicy, vkey);
    if (rpolicy != REPORT_ONLY) {
        Checkpoint cp = GenerateCheckpoint(ma);   // generate the checkpoint of ma
        SymKey Ks = GenerateSymmetricKey(MAID);   // generate the symmetric key for encryption
        byte *pecp = Encrypt(cp, Ks);             // encrypt the checkpoint with Ks
        if (rpolicy == REPORT_RETURN) {
            PublicKey Ku = GetPlatformPublicKey(ma);  // get the public key of the home platform of ma
            byte *pekey = Encrypt(Ks, Ku);            // encrypt Ks with Ku
            SaveCheckpoint(pecp, pekey);              // save the checkpoint and key
        } else {  // rpolicy is RECOVER_RESTART or RECOVER_NEXTDOMAIN
            PublicKey Ku = GetHGPublicKey();          // get the public key of the home gateway
            byte *pekey = Encrypt(Ks, Ku);            // encrypt Ks with Ku
            SaveCheckpoint(pecp, pekey);              // save the checkpoint and key
        }
    }
}
Fig. 2. The algorithm of the checkpoint generation according to the recovery policy
gateway doesn’t store the checkpoint. If the recovery policy is REPORT RETURN, the home gateway generates the symmetric key, Ks , and encrypts the checkpoint with Ks . The Ks is encrypted with the public key of the mobile agent’s + , and stored with the checkpoint in the home gateway file syshome platform, Khp tem. In this case, in order to decrypt the checkpoint, the private key of the home platform, Khp , is required. Therefore, only a mobile agent’s home platform can recover the mobile agent with the encrypted checkpoint. If the recovery policy is RECOVER RESTART or RECOVER NEXTDOMAIN, the home gateway encrypts Ks using its own public key, Kg+ , storing it with the checkpoint in the file system. The stored checkpoint has the following appearance: + Khp (Ks )||Ks (Checkpoint) or Kg+ (Ks )||Ks (Checkpoint)
The image of the mobile agent recovered from the checkpoint is identical to its image when it first entered the domain. Figure 2 shows the algorithm of checkpoint generation according to the recovery policy when the mobile agent enters a domain.

3.3 The Recovery Process of the Mobile Agent
A mobile agent may terminate abnormally at a place due to the failure of a device or of the mobile agent itself. When a mobile agent moves through a device in a domain, the device can suddenly turn off due to battery exhaustion or failure, causing abnormal termination. If a device is turned off, the home gateway detects the device failure, searches the ADF for the mobile agents on the failed device, and attempts to recover each mobile agent from its checkpoint according to the recovery policy registered in the ARM. The other failure case is the failure of the mobile agent itself. If the mobile agent is abnormally terminated by a software exception, the platform reports the mobile agent exception to the home gateway. The home gateway attempts to recover the mobile agent with its checkpoint according to its recovery policy registered in the ARM. Figure 3 presents the algorithm used when the home gateway recovers a specific mobile agent according to its recovery policy. If the recovery policy is REPORT ONLY, the home gateway only transmits the exception notification to the home platform of the failed mobile agent. If the recovery policy is REPORT RETURN, the home gateway transmits the exception notification message with the encrypted checkpoint. The exception notification message has the following structure:

<MA_ID, MAC, Khp+(Ks) || Ks(Checkpoint)>
MA_ID is the ID of the mobile agent, and MAC is the hash value of the checkpoint computed with vkey, saved in the ARM for checkpoint verification. Therefore, in this case, the recovery of the mobile agent can be performed only by its home platform. In the case where the recovery policy is RECOVER RESTART or RECOVER NEXTDOMAIN, the symmetric key Ks is encrypted with Kg+, the public key of
Recovery(int MAID)  // input: MAID – the ID of the mobile agent to recover
{
    int rpolicy = GetRecPolicy(MAID);   // get the recovery policy of the mobile agent from the ARM
    if (rpolicy == REPORT_ONLY) {
        // send only an exception notification message to the home platform of the mobile agent
        SendMessage(homeplatform, EXCEPTION_REPORT);
    } else {
        byte *ecp = GetCheckPoint(MAID);      // extract the checkpoint from the file system
        byte *vkey = GetVkey(MAID);           // extract the vkey of the mobile agent from the ARM
        byte *ekey = GetCheckPointKey(ecp);   // extract the encrypted symmetric key for decrypting the checkpoint
        if (rpolicy == REPORT_RETURN) {
            byte *pmac = GetMacValue(ecp, vkey);  // calculate the MAC value for the encrypted checkpoint with vkey
            // send the MAC value, the checkpoint of the mobile agent,
            // and the encrypted key to the home platform
            SendCheckpoint(homeplatform, MAID, pmac, ecp, ekey);
        } else {  // RECOVER_RESTART or RECOVER_NEXTDOMAIN
            byte *cp = DecryptCheckpoint(ecp, Kr);  // Kr is the private key of the HG
            MA ma = RecoveryFromCheckpoint(cp);     // recover the mobile agent from the checkpoint
            if (rpolicy == RECOVER_RESTART) {
                Migrate(mydomain, ma);    // migrate ma to the current domain
            } else {  // RECOVER_NEXTDOMAIN
                Migrate(nextdomain, ma);  // migrate ma to the next domain to be visited
                DeleteCheckpoint(MAID);   // delete the checkpoint from the file system
            }
        }
    }
}
Fig. 3. The algorithm of mobile agent recovery according to the recovery policy
the home gateway. The home gateway decrypts Ks using Kg, the private key of the home gateway, and then decrypts the checkpoint using Ks. If the recovery policy is RECOVER RESTART, the home gateway moves the mobile agent to its current visiting domain. If the recovery policy is RECOVER NEXTDOMAIN, the home gateway moves the mobile agent to the next domain to be visited. When the mobile agent leaves the domain, the home gateway removes the mobile agent's checkpoint from its file system and its recovery policy from the ARM.
4 Safety Analysis
A mobile agent stores its checkpoint in the file system of the home gateway. If the checkpoint were simply stored in the file system, other programs could read the private information of the mobile agent and/or modify the contents of the checkpoint. In this section, possible attacks are presented and methods to prevent these attacks are discussed.
(1) A situation where other programs attempt to read the contents of the checkpoint.
The home gateway encrypts the checkpoint of the mobile agent with Ks, producing {Khp+(Ks) || Ks(Checkpoint)} or {Kg+(Ks) || Ks(Checkpoint)} according to the recovery policy, and stores it in its file system. Therefore, for other programs to read the contents of the checkpoint, they must know Ks:

Ks ← Khp(Khp+(Ks)),  Checkpoint ← Ks(Ks(Checkpoint))
or
Ks ← Kg(Kg+(Ks)),  Checkpoint ← Ks(Ks(Checkpoint))
Therefore, only the home platform or the home gateway, possessing Khp or Kg respectively, can read the contents of the checkpoint, because Ks is encrypted with Khp+, the public key of the home platform, or Kg+, the public key of the home gateway, according to the recovery policy.

(2) A situation where other programs attempt to modify the checkpoint.
For other programs to modify the checkpoint, creating {Khp+(Ks) || Ks(Checkpoint')} or {Kg+(Ks) || Ks(Checkpoint')}, they must be able to obtain Ks. As in situation (1), only the home platform can obtain Khp and only the home gateway can obtain Kg. Therefore, other programs cannot modify the checkpoint, and the integrity of the checkpoint is guaranteed.

(3) A situation where the home platform receives {Khp+(Ks') || Ks'(Checkpoint')} instead of {Khp+(Ks) || Ks(Checkpoint)} from a malicious home gateway.
The home platform can receive the checkpoint of another mobile agent, Khp+(Ks') || Ks'(Checkpoint'), from a malicious home gateway at the recovery stage when the recovery policy is REPORT RETURN. The exception notification message then has the following structure:

<MA_ID, MAC', Khp+(Ks') || Ks'(Checkpoint')>,  where MAC' = hash(Checkpoint', vkey')
During the recovery stage, when the home platform receives the checkpoint of another mobile agent, it compares the MAC' in the exception notification message with the MAC that it computes itself by hashing Checkpoint' with the vkey that the original mobile agent registered with the home platform. When the two values do not match, the home platform discards the checkpoint and does not recover the mobile agent. Therefore, the home platform recovers the mobile agent from a checkpoint only when its MAC value was hashed with the correct vkey.
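This verification step can be sketched as follows. The SHA-256 instantiation and the helper names are our own; the paper only specifies MAC = hash(Checkpoint, vkey) and a comparison against the value computed with the registered vkey:

```python
import hashlib
import hmac

def mac_of(checkpoint: bytes, vkey: bytes) -> bytes:
    # MAC = hash(Checkpoint, vkey); SHA-256 over the concatenation
    # is an assumed instantiation of the paper's generic hash.
    return hashlib.sha256(checkpoint + vkey).digest()

def accept_checkpoint(checkpoint: bytes, received_mac: bytes,
                      registered_vkey: bytes) -> bool:
    # Home platform side: recompute the MAC with the vkey the agent
    # registered at creation; discard the checkpoint on a mismatch.
    expected = mac_of(checkpoint, registered_vkey)
    return hmac.compare_digest(expected, received_mac)

vkey = b"registered-vkey"
cp = b"encrypted checkpoint bytes"
ok = accept_checkpoint(cp, mac_of(cp, vkey), vkey)
bad = accept_checkpoint(b"substituted checkpoint", mac_of(cp, vkey), vkey)
```

`hmac.compare_digest` is used for the comparison so that the check runs in constant time.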
5 Conclusion
In this paper, a scheme that can recover mobile agents with a checkpoint during abnormal termination according to the recovery policy designated by the user is
presented. The proposed scheme uses the home gateway for saving the checkpoint, because devices in home network environments have minimal or no storage; the scheme therefore saves the checkpoint of the mobile agent in the file system of the home gateway of the domain. When the mobile agent enters a domain, the home gateway registers the recovery policy of the mobile agent in the ARM, encrypts the checkpoint of the mobile agent with a symmetric key, and stores it in the file system of the home gateway. The symmetric key used to encrypt the checkpoint is encrypted with the public key of the mobile agent's home platform or the public key of the home gateway, according to the recovery policy, and stored with the checkpoint. Consequently, the confidentiality of the stored checkpoint is guaranteed by the confidentiality of the corresponding private key. If a device fails from battery exhaustion or hardware failure, or an exception occurs in the mobile agent itself, the home gateway detects the abnormally terminated mobile agent and attempts to recover it with its checkpoint according to the recovery policy.
References

1. N. M. Karnik and A. R. Tripathi, "Design Issues in Mobile-Agent Programming Systems," IEEE Concurrency, Vol. 6, No. 3, Jul.-Sep. 1998.
2. N. M. Karnik and A. R. Tripathi, "Agent Server Architecture for the Mobile-Agent System," Proc. PDPTA'98, Jul. 1998.
3. D. B. Lange and M. Oshima, Programming and Deploying Java Mobile Agents with Aglets, Addison Wesley, 1998.
4. J. Yoo and D. Lee, "Scalable Home Network Interaction Model Based on Mobile Agents," Proc. PerCom'03, Mar. 2003.
5. K. Takashio, G. Soeda, and H. Tokuda, "A Mobile Agent Framework for Follow-Me Applications in Ubiquitous Computing Environment," Proc. Int. Conf. on Distributed Computing Systems Workshops, Apr. 2001.
6. W. Lin and Q. Yanping, "An Analytic Survey of Fault-Tolerance Mobile Agent Architecture," http://www.cs.concordia.ca/agent/FTMA/FTMAsurvey.pdf.
7. S. Pleisch and A. Schiper, "Modeling Fault-Tolerant Mobile Agent Execution as a Sequence of Agreement Problems," Proc. of the 19th IEEE Symposium on Reliable Distributed Systems, Germany, Oct. 2000.
8. A. R. Tripathi, T. Ahmed, and N. M. Karnik, "Experiences and Future Challenges in Mobile Agent Programming," Microprocessors & Microsystems (Elsevier), Vol. 25, No. 2, Apr. 2001.
9. ObjectSpace Inc., "ObjectSpace Voyager Core Package Technical Overview," Technical report, ObjectSpace Inc., Jul. 1997.
10. D. Johansen, R. van Renesse, and F. B. Schneider, "Operating System Support for Mobile Agents," Proc. of the 5th IEEE Workshop on Hot Topics in Operating Systems, May 1995.
11. J. Peng and B. Li, "Mobile Agent in Concordia," http://www.cs.albany.edu/mhc/Mobile/Concordia.pdf.
12. H. Cho, G. S. Kim, K. Kim, H. Shim, and Y. I. Eom, "Development of Lightweight Mobile Agent Platform for URC Environments," Proc. of the 2nd International Conference on Ubiquitous Robots and Ambient Intelligence (URAmI 2005), Daejeon, Korea, Nov. 2005.
Using π-Calculus to Formalize Domain Administration of RBAC

Yahui Lu(1,2), Li Zhang(1), Yinbo Liu(1,2), and Jiaguang Sun(1,2)

(1) School of Software, Tsinghua University, Beijing 100084, China
(2) Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
{luyh03, lizhang, lyb01, jgsun}@mails.tsinghua.edu.cn
Abstract. With the wide implementation of Role-Based Access Control (RBAC) models in information systems, access control for RBAC itself, i.e., the administration of RBAC, becomes more and more important. In this paper, we propose a Domain Administration of RBAC model, DARBAC, which defines an administrative domain for each administrative role. An administrative role can execute administrative operations on the users, roles, objects, and child administrative roles within its administrative domain. We then use the π-calculus to formalize the elements of the DARBAC model and their interactions. Although the π-calculus has been successfully used in many security areas, such as protocol analysis and information flow analysis, to the best of our knowledge our approach is the first attempt to use the π-calculus to formalize RBAC and its administrative model.
1 Introduction
Role-based access control (RBAC) models [1, 2] have been widely used in database management, security management, and network operating system products. In the RBAC model, access rights to objects are assigned to roles, and a user must be assigned to a role to obtain the access rights to objects. RBAC is more flexible and manageable than previous discretionary and mandatory models. From another perspective, the elements of the RBAC model, such as roles, users, objects, and their relationships, also need to be managed by access control mechanisms in order to guarantee system security. Several RBAC administrative models have been proposed to set up access control rules for the management of RBAC [3, 4, 5, 6, 7], but the administrative roles in these models can manage only parts of the elements of the RBAC model: ARBAC [3, 4] and SARBAC [5] focus only on roles, and modular administration [6] focuses only on users and objects. In this paper, we propose a domain administrative model, DARBAC, for RBAC. In DARBAC, we introduce the administrative domain, which includes users, roles, and objects in RBAC as well as child administrative roles in DARBAC. Each administrative role is assigned to an administrative domain and can only execute administrative operations within its administrative domain. The administrative domain is autonomous and extendable, so DARBAC can easily be used in distributed environments.

K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 278–289, 2006.
© Springer-Verlag Berlin Heidelberg 2006
There are many methods for formalizing RBAC administrative models, especially their administrative objects and administrative operations. ARBAC uses role ranges [3] and SARBAC uses administrative scope [5] to represent the roles managed by an administrative role. Wedde et al. use predicate logic [6] and Koch et al. use graph transformation [7] to represent administrative operations. In this paper, we use the π-calculus to formalize the RBAC administrative model. The π-calculus is a kind of process algebra; it can communicate data through channels and change the connection relationships between processes [9]. It has been successfully used in protocol analysis [12], information flow analysis [13], and workflow modeling [14]. To the best of our knowledge, our paper is the first attempt to use the π-calculus to formalize RBAC and its administrative model; this is the main contribution of our paper. This paper is organized as follows. Section 2 proposes the Domain Administration of RBAC model. Section 3 introduces the definition of the π-calculus. Section 4 describes the formalization of DARBAC in the π-calculus. Section 5 concludes our work and proposes further work.
2 DARBAC: Domain Administration of RBAC Model
In this section, we first introduce related work on RBAC administrative models and then describe the definition of the DARBAC model.

2.1 Related Work
ARBAC [3, 4] is the first hierarchical administrative model; it introduces administrative roles to manage the RBAC model itself. ARBAC includes three separate models: the URA model to manage user-role assignment, the PRA model to manage role-permission assignment, and the RRA model to manage roles and the role graph. ARBAC has several disadvantages. First, to represent roles it uses the concept of a role range, which makes the whole administrative model complicated and hard to implement. Second, it manages only roles and does not address how to manage users and objects. Ferraiolo et al. define administrative functions and their semantics in the proposed NIST RBAC standard [2], but they do not define how to implement these administrative functions with RBAC, and these functions are more conveniently implemented in centralized administrative environments. In SARBAC [5], Crampton et al. introduce the concept of administrative scope to represent the roles managed by administrative roles. Administrative roles can be combined with the role graph, so the administrative scope of each administrative role can be calculated directly from the role and the administrative role graph. SARBAC avoids the first disadvantage of ARBAC, but it still aims mainly at the administration of roles. Wedde et al. propose a model of modular authorization and administration [6]. They define units to organize users and objects and define authorization spheres to include these units. The authorization teams can only manage the units
within their authorization sphere. However, in their model, roles are not included in the authorization sphere, and the management of roles can only be controlled by predicate logic expressions. Koch et al. use graphs to formalize RBAC and use graph transformation rules to formalize the administrative operations on the elements of RBAC [7]. They also use the graph method to formalize the SARBAC model [8]. However, in their graph formalization, the user is public and can be created and deleted by every administrative role without any restriction. A main disadvantage of these administrative models is that their administrative roles can manage only parts of the elements of RBAC: ARBAC, SARBAC, and the graph formalization focus only on roles, and modular administration focuses only on users and objects. To overcome this disadvantage, we propose the Domain Administration of RBAC model, in which we introduce the administrative domain to include the users, roles, and objects that can be managed by administrative roles.

2.2 Definition of DARBAC
We give the definition of the DARBAC elements as follows (Figure 1). Detailed explanations of these elements in the π-calculus are given in Section 4.
Fig. 1. The DARBAC Model
Definition 1. The DARBAC model has the following components:
– U, R, AR, OBJ, OP: the sets of users, roles, administrative roles, objects, and operations on objects.
– P: the set of all permissions, P ⊆ 2^(OBJ×OP).
– URA ⊆ U × R: a many-to-many user-to-role assignment relation.
– RPA ⊆ R × P: a many-to-many permission-to-role assignment relation.
– RH ⊆ R × R: the role hierarchy.
– UARA ⊆ U × AR: a many-to-many user-to-administrative-role assignment relation.
– AOP: the set of all administrative operations, which includes CreateUser, DeleteUser, AssignRoleUser, DeAssignRolePerm, etc.
– AO ⊆ 2^(U×R×P×AR): the set of all administrative objects, which include child administrative roles, users, roles, and permissions.
– AD ⊆ 2^(AO×AOP): the set of all administrative domains. An administrative domain includes administrative objects and the administrative operations on these administrative objects.
– ARADA ⊆ AR × AD: a one-to-one administrative-role-to-administrative-domain assignment relation.
– AOARA ⊆ AR × AO: a many-to-many administrative-role-to-administrative-object assignment relation.
– AOUA ⊆ U × AO: a many-to-many user-to-administrative-object assignment relation.
– AORA ⊆ R × AO: a many-to-many role-to-administrative-object assignment relation.
– AOPA ⊆ P × AO: a many-to-many permission-to-administrative-object assignment relation.
The main part of DARBAC is the administrative domain. An administrative domain includes administrative objects and the administrative operations on them. An administrative object includes users, roles, permissions, and child administrative roles; these inclusion relationships are defined by the AOARA, AOUA, AORA, and AOPA relations. The administrative operations that can be used by an administrative role are also defined within the administrative domain.
Example 1. Figure 2 shows the administrative domains and administrative roles in a software development department. The administrative domain of Product Team 1 is defined as: PT1 ::= {{}, {Alice, Bob, Carl}, {Designer, Auditor, Employee}, {{Code, read, write}, {Requirement, read}}, {CreateUser, AssignRoleUser, AssignRolePerm, DeAssignRoleEdge}}. PT1 includes users, roles, permissions, and administrative operations; PT1 contains no child administrative roles. We assign the Project Security Officer PSO1 to the PT1 domain. PSO1 can give Alice the write right to Code by assigning Alice to the Designer role and assigning the Designer role to the permission (Code, write).
PSO1 can also create users in the PT1 domain, but PSO1 cannot delete any users in this domain. We can also define a Senior Security Officer SSO, who has the administrative domain SDD ::= {{PSO1}, { }, {Manager}, { }, {CreateRole, DeleteRole}}. SSO has all the administrative privileges of PSO1, and he can also assign the Manager role to users and permissions. In the DARBAC model, an administrative role can operate on the users, roles and permissions included in its domain. This is the main difference between DARBAC and other administrative models. Each administrative role can be flexibly defined and changed by modifying its administrative domain. So
Y. Lu et al.
Fig. 2. The administrative domain and administrative role
the DARBAC model is easier to implement in distributed environments. Furthermore, because each administrative role can only execute the specified operations within its domain, even an administrative role that executes wrong operations cannot influence users or permissions beyond its domain. Thus, the DARBAC model provides a safer mechanism for authorization and administration in RBAC.
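As an illustration, the domain-containment check above can be sketched in Python. This is a hypothetical model of Example 1, not code from the paper: the names AdminDomain and can_execute, and the flattening of the permission sets into (object, operation) pairs, are our own assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class AdminDomain:
    """One DARBAC administrative domain: its objects plus the allowed admin operations."""
    child_admin_roles: set = field(default_factory=set)
    users: set = field(default_factory=set)
    roles: set = field(default_factory=set)
    permissions: set = field(default_factory=set)
    admin_ops: set = field(default_factory=set)

# The PT1 domain of Example 1, assigned to PSO1.
pt1 = AdminDomain(
    users={"Alice", "Bob", "Carl"},
    roles={"Designer", "Auditor", "Employee"},
    permissions={("Code", "read"), ("Code", "write"), ("Requirement", "read")},
    admin_ops={"CreateUser", "AssignRoleUser", "AssignRolePerm", "DeAssignRoleEdge"},
)

def can_execute(domain: AdminDomain, op: str, user=None, role=None) -> bool:
    """An administrative role may run op only on entities inside its own domain."""
    if op not in domain.admin_ops:
        return False
    if user is not None and user not in domain.users:
        return False
    if role is not None and role not in domain.roles:
        return False
    return True

# PSO1 may assign Alice to Designer, but DeleteUser is not in PT1's operation set.
print(can_execute(pt1, "AssignRoleUser", user="Alice", role="Designer"))  # True
print(can_execute(pt1, "DeleteUser", user="Bob"))                         # False
```

The check mirrors the text: a wrong operation by PSO1 cannot touch anything outside PT1, because every argument is tested for membership in the domain first.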
3 Introduction to π-Calculus
The π-calculus is a mathematical model of processes whose interconnections change as they interact [9, 10, 11]. The π-calculus has two basic concepts. One is the name: channels, ports, variables and data are all names. The other is the process, which represents an entity in the system. Interaction between processes takes place through a pair of complementary ports.

Definition 2. The syntax of process expressions in the π-calculus is given below [10, 11]:
– Prefixes: π ::= ā⟨x⟩ | a(x) | τ
– Agents: P ::= 0 | π.P | P+Q | P|Q | if x=y then P else Q | if x≠y then P else Q | (νx)P | A(y1, · · · , yn)
– Definitions: A(x1, · · · , xn) ::= P, where i ≠ j ⟹ xi ≠ xj
– Polyadic: ā⟨x1, · · · , xn⟩.P ::= (νp)ā⟨p⟩.p̄⟨x1⟩.p̄⟨x2⟩. · · · .p̄⟨xn⟩.P and a(x1, · · · , xn).P ::= a(p).p(x1).p(x2). · · · .p(xn).P

Here are some explanations of the syntax:
1. 0 is inaction, which cannot perform any action.
2. The output prefix ā⟨x⟩: the name x is sent along the name a.
3. The input prefix a(x): a name is received along the name a, and x is a placeholder for the received name.
4. The unobservable prefix τ.P: the process can evolve to P invisibly to an observer.
5. Sum P+Q: the agent can enact either P or Q. A sum of several agents P1 + P2 + · · · + Pn is written as Σ_{i=1..n} Pi.
6. Composition P|Q: the components P and Q can proceed independently and can interact via shared names. A composition of several agents P1 | P2 | · · · | Pn is written as Π_{i=1..n} Pi.
7. Match if x=y then P else Q: this agent behaves as P if x and y are the same name, otherwise it behaves as Q.
8. Mismatch if x≠y then P else Q: this agent behaves as P if x and y are not the same name, otherwise it behaves as Q.
9. Restriction (νx)P: the scope of the name x is limited to P, and x can be used for communication between components within P. The channel may be passed over another channel for use by another process.
10. Identifier A(y1, · · · , yn): every identifier has a definition A(x1, · · · , xn) ::= P where the xi must be pairwise distinct, and A(y1, · · · , yn) behaves as P with yi replacing xi for each i.
11. The polyadic expression is an extension that allows multiple objects in communications. We also admit the case n=0, when there is no object at all, which we denote as ā and a().

Definition 3. For convenience, we define some further notations that will be used later.
– AND ::= then if. For example, if x=y AND z=s then P ⇔ if x=y then if z=s then P.
– x⃗n ::= (x1, x2, · · · , xn): the representation of an n-dimensional vector.
– x⃗n ↑ x ::= (x1, x2, · · · , xn, x). x⃗n ↑ x means we add x to the vector x⃗n to compose an (n+1)-dimensional vector.
– x⃗n ↓ xi ::= (x2, · · · , xn) for i = 1; (x1, · · · , xi−1, xi+1, · · · , xn) for 1 < i < n; (x1, · · · , xn−1) for i = n. x⃗n ↓ xi means we delete xi from the vector x⃗n to obtain an (n−1)-dimensional vector.
– if x ∉ x⃗n then P ::= if x ≠ x1 AND x ≠ x2 · · · AND x ≠ xn then P.
– if x ∈ x⃗n then P(yi) ::= if x = x1 then P(y1) else if x = x2 then P(y2) · · · else if x = xn then P(yn).
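The vector notations of Definition 3 can be made concrete with a small Python sketch (our own illustration, using tuples for vectors; the function names vec_append and vec_delete are assumptions):

```python
def vec_append(xs: tuple, x) -> tuple:
    """x⃗n ↑ x: extend an n-vector to an (n+1)-vector."""
    return xs + (x,)

def vec_delete(xs: tuple, x) -> tuple:
    """x⃗n ↓ xi: drop xi, covering the i=1, 1<i<n and i=n cases uniformly."""
    i = xs.index(x)
    return xs[:i] + xs[i + 1:]

r = ("r1", "r2", "r3")
assert vec_append(r, "r4") == ("r1", "r2", "r3", "r4")
assert vec_delete(r, "r1") == ("r2", "r3")   # i = 1
assert vec_delete(r, "r2") == ("r1", "r3")   # 1 < i < n
assert vec_delete(r, "r3") == ("r1", "r2")   # i = n
```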
4 Formalize DARBAC Using π-Calculus
The π-calculus representation of the elements in DARBAC is shown in Table 1 and Figure 3. In our formalization, every process has several ports that can be accessed by other processes. There are two kinds of ports: administrative ports such as du, dr, do and dar, which can be accessed by administrative roles, and access ports such as the role access port r and the object access ports opi.
Table 1. The π-calculus representation of DARBAC elements

DARBAC element              π-calculus   Port
User                        Process U    du
Role                        Process R    dr and r
Object                      Process O    do, op1, · · · , opn
Administrative Role         Process AR   dar
Object Operation            Channel      op1 · · · opn
Role Hierarchy              Channel      link with r port
Administrative Object       Channel      link with du, dr, do and dar
Administrative Operation    Process
User Role Assignment        Channel      link with r port
Role Permission Assignment  Channel      link with op1 · · · opn port
Fig. 3. The representation of DARBAC elements
4.1 User Process
The main tasks of the user process are:
– Receive a role access port name from the administrative role and link or unlink with this port
– Interact with role processes along the role access ports
– Receive the delete command from the administrative role and destroy itself

Definition 4. U(du, n, r⃗n) ::= Σ_{i=1..n} r̄i.U(du, n, r⃗n)
+ (νx) du(command, x). if command=DELETE then 0
else if command=ARU AND x ∉ r⃗n then U(du, n+1, r⃗n ↑ x)
else if command=DRU AND x ∈ r⃗n then U(du, n−1, r⃗n ↓ ri)

The User process has one administrative port du, which can be accessed by the administrative role process, and n links to role processes. It can access a role process along the channel ri, destroy itself if the administrative role process executes the DeleteUser command, or receive a role access channel from the administrative role process to link or unlink with that role process.

Example 2. We can define the following user processes for Example 1: Alice ::= U(du_alice, 1, r_des). Bob ::= U(du_bob, 0, 0). Carl ::= U(du_carl, 0, 0). The Alice process links with one role access port r_des, which means Alice is assigned the Designer role. The Bob and Carl processes have no role access ports.
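Definition 4 can be read as a small state machine; the following Python sketch (our own, not part of the formalization) keeps the port du and the vector r⃗n as fields and applies the DELETE/ARU/DRU branches:

```python
class UserProcess:
    """Toy reading of Definition 4: a User with its admin port and role-port vector."""

    def __init__(self, du: str, roles: tuple = ()):
        self.du = du          # administrative port name
        self.roles = roles    # r⃗n, the role access ports
        self.alive = True

    def on_command(self, command: str, x: str = None):
        if command == "DELETE":                         # evolve to 0
            self.alive = False
        elif command == "ARU" and x not in self.roles:  # link: r⃗n ↑ x
            self.roles = self.roles + (x,)
        elif command == "DRU" and x in self.roles:      # unlink: r⃗n ↓ ri
            i = self.roles.index(x)
            self.roles = self.roles[:i] + self.roles[i + 1:]

alice = UserProcess("du_alice", ("r_des",))   # Example 2: Alice linked to Designer
alice.on_command("ARU", "r_aud")
assert alice.roles == ("r_des", "r_aud")
alice.on_command("DRU", "r_des")
assert alice.roles == ("r_aud",)
alice.on_command("DELETE")
assert not alice.alive
```

Note that, as in the definition, ARU on an already-linked port and DRU on an unknown port are silently ignored.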
4.2 Object Process
The main tasks of the object process are:
– Receive accesses from roles
– Receive the query command from the administrative role and return the required object operation port name
– Receive the delete command from the administrative role and destroy itself

Definition 5. OBJ(do, n, o⃗pn, t⃗opn) ::= Σ_{i=1..n} opi().OBJ(do, n, o⃗pn, t⃗opn)
+ (νx) do(command, x, type). if command=DELETE then 0
else if command=QUERY AND type ∈ t⃗opn then x̄⟨opi⟩.OBJ(do, n, o⃗pn, t⃗opn)

The OBJ process has one administrative port do, which can be accessed by the administrative role process, and n operation ports, which can be accessed by role processes. The object process waits for access by role processes along the opi channels, destroys itself if the administrative role process executes the DeleteObject command, or receives the query command from the administrative role process and returns the operation port opi. Here we use topi to identify the type of each operation port. For example, the name of one operation port may be op1 and its type read. The operation channels opi of different object processes are all distinct, but their types topi may coincide. Thus the administrative role process does not need to store every operation port name of every object process; it only needs to store the do port and use topi to query the real operation port name of that object.

Example 3. We can define the following object processes for Example 1: Requirement ::= OBJ(do_req, 1, op5, read). Code ::= OBJ(do_code, 2, op1, op2, read, write). The Requirement object can be read via port op5. The Code object can be read via port op1 and written via port op2.

4.3 Role Process
The main tasks of the role process are:
– Receive the query command from the administrative role and return its access port name
– Receive accesses by users and other roles along its access port
– Receive an object access port name from the administrative role and link or unlink with this port
– Receive a child role access port name from the administrative role and link or unlink with this port
– Receive the delete command from the administrative role and destroy itself

Definition 6. R(dr, r, n, r⃗n, m, o⃗pm) ::= r().R(dr, r, n, r⃗n, m, o⃗pm)
+ Σ_{i=1..n} (r̄i.R(dr, r, n, r⃗n, m, o⃗pm)) + Σ_{j=1..m} (ōpj.R(dr, r, n, r⃗n, m, o⃗pm))
+ (νx) dr(command, x). if command=DELETE then 0
else if command=QUERY then x̄⟨r⟩.R(dr, r, n, r⃗n, m, o⃗pm)
else if command=DRP AND x ∈ o⃗pm then R(dr, r, n, r⃗n, m−1, o⃗pm ↓ opi)
else if command=ARP AND x ∉ o⃗pm then R(dr, r, n, r⃗n, m+1, o⃗pm ↑ x)
else if command=DRR AND x ∈ r⃗n then R(dr, r, n−1, r⃗n ↓ ri, m, o⃗pm)
else if command=ARR AND x ∉ r⃗n then R(dr, r, n+1, r⃗n ↑ x, m, o⃗pm)

The role process contains one administrative port dr, to be accessed by the administrative role process, one access port r, to be accessed by user and other role processes, n links to its child roles and m links to object access ports. The role graph can be maintained by storing the child roles' access ports and changed by the administrative role's AssignRoleEdge and DeAssignRoleEdge commands.

Example 4. We can define the following role processes for Example 1: Manager ::= R(dr_man, r_man, 2, r_des, r_aud, 0, 0). Designer ::= R(dr_des, r_des, 1, r_emp, 1, op2). Auditor ::= R(dr_aud, r_aud, 1, r_emp, 0, 0). Employee ::= R(dr_emp, r_emp, 0, 0, 0, 0). The Manager role process has the role access ports of its child roles: port r_des (Designer) and port r_aud (Auditor). The Designer role process has one role access port r_emp (Employee) and one object access port op2, which means Designer can write the Code object via op2.

4.4 Administrative Role Process
The main tasks of the administrative role process are:
– Store the administrative ports of the users, roles, objects and child administrative roles within its administrative domain
– Execute administrative operations in its administrative domain

Definition 7. AR(ar, s, d⃗ars, k, d⃗uk, l, d⃗rl, m, d⃗om, n, t⃗opn) ::= (ν du, dr, r, do, o⃗pn)
(CreateUser(ar, du) | AR(ar, s, d⃗ars, k+1, d⃗uk ↑ du, l, d⃗rl, m, d⃗om, n, t⃗opn))
+ (CreateRole(ar, dr, r) | AR(ar, s, d⃗ars, k, d⃗uk, l+1, d⃗rl ↑ dr, m, d⃗om, n, t⃗opn))
+ (CreateObj(ar, do, o⃗pn, t⃗opn) | AR(ar, s, d⃗ars, k, d⃗uk, l, d⃗rl, m+1, d⃗om ↑ do, n, t⃗opn))
+ Σ_{i=1..k} (DeleteUser(ar, dui).AR(ar, s, d⃗ars, k−1, d⃗uk ↓ dui, l, d⃗rl, m, d⃗om, n, t⃗opn))
+ Σ_{i=1..l} (DeleteRole(ar, dri).AR(ar, s, d⃗ars, k, d⃗uk, l−1, d⃗rl ↓ dri, m, d⃗om, n, t⃗opn))
+ Σ_{i=1..m} (DeleteObj(ar, doi).AR(ar, s, d⃗ars, k, d⃗uk, l, d⃗rl, m−1, d⃗om ↓ doi, n, t⃗opn))
+ (Σ_{i=1..k} Σ_{j=1..l} (AssignRoleUser(ar, dui, drj) + DeAssignRoleUser(ar, dui, drj))
+ Σ_{i=1..l} Σ_{j=1..l} (AssignRoleEdge(ar, dri, drj) + DeAssignRoleEdge(ar, dri, drj))
+ Σ_{i=1..l} Σ_{j=1..m} Σ_{t=1..n} (AssignRolePerm(ar, dri, doj, topt) + DeAssignRolePerm(ar, dri, doj, topt))).AR(ar, s, d⃗ars, k, d⃗uk, l, d⃗rl, m, d⃗om, n, t⃗opn)

The administrative role process has one administrative port ar, to be accessed by other administrative roles, s administrative ports of child administrative role processes, k administrative ports of user processes, l administrative ports of role processes, m administrative ports of object processes and n operation types of objects. The administrative role process is composed of administrative operations.
Example 5. We define the following administrative role processes for Example 1: SSO ::= AR(dar_sso, 1, dar_pso1, 0, 1, dr_man, 0, 0). PSO1 ::= AR(dar_pso1, 0, 3, du_alice, du_bob, du_carl, 3, dr_des, dr_aud, dr_emp, 2, do_code, do_req, 2, read, write). The SSO administrative role (port dar_sso) has one child administrative role PSO1 (port dar_pso1) and one role Manager (port dr_man) in its domain. The PSO1 administrative role (port dar_pso1) has three users (du_alice, du_bob, du_carl), three roles (dr_des, dr_aud, dr_emp) and two objects (do_code, do_req) in its domain.

4.5 Administrative Operation Process
The administrative operations are used by an administrative role to manage the users, roles, objects and their relations within its administrative domain. Process reduction rules can be used to explain the meaning of each administrative operation.

Definition 8. The administrative operation processes are defined as follows:
CreateUser(ar, du) ::= U(du, 0, 0)
CreateRole(ar, dr, r) ::= R(dr, r, 0, 0, 0, 0)
CreateObj(ar, do, o⃗pn, t⃗opn) ::= OBJ(do, n, o⃗pn, t⃗opn)
DeleteUser(ar, du) ::= d̄u⟨DELETE⟩
DeleteObj(ar, do) ::= d̄o⟨DELETE⟩.Π_{i=1..m} DeAssignRolePerm(dri, do)
DeleteRole(ar, dr) ::= d̄r⟨DELETE⟩.Π_{i=1..k} DeAssignRoleUser(dui, dr).Π_{i=1..l} DeAssignRoleEdge(dri, dr)
AssignRoleUser(ar, du, dr) ::= (νt)(d̄r⟨QUERY, t⟩.t(x).d̄u⟨ARU, x⟩)
DeAssignRoleUser(ar, du, dr) ::= (νt)(d̄r⟨QUERY, t⟩.t(x).d̄u⟨DRU, x⟩)
AssignRolePerm(ar, dr, do, top) ::= (νt)(d̄o⟨QUERY, t, top⟩.t(x).d̄r⟨ARP, x⟩)
DeAssignRolePerm(ar, dr, do, top) ::= (νt)(d̄o⟨QUERY, t, top⟩.t(x).d̄r⟨DRP, x⟩)
AssignRoleEdge(ar, dr1, dr2) ::= (νt)(d̄r2⟨QUERY, t⟩.t(x).d̄r1⟨ARR, x⟩)
DeAssignRoleEdge(ar, dr1, dr2) ::= (νt)(d̄r2⟨QUERY, t⟩.t(x).d̄r1⟨DRR, x⟩)

Here are some examples of administrative operations.

Example 6. SSO creates user David (Figure 4).
SSO ::= AR(dar_sso, 1, dar_pso1, 0, 1, dr_man, 0, 0)
⟹ CreateUser(dar_sso, du_dav) | AR(dar_sso, 1, dar_pso1, 1, du_dav, 1, dr_man, 0, 0)
⟹ · · · ⟹ U(du_dav, 0, 0) | AR(dar_sso, 1, dar_pso1, 1, du_dav, 1, dr_man, 0, 0) = David | SSO

Example 7. PSO1 assigns Bob to the Auditor role (Figure 5).
PSO1 | Auditor | Bob
⟹ AssignRoleUser(dar_pso1, du_bob, dr_aud) | Auditor | Bob | PSO1
⟹ (νt)(d̄r_aud⟨QUERY, t⟩.t(x).d̄u_bob⟨ARU, x⟩) | R(dr_aud, r_aud, 1, r_emp, 0, 0) | U(du_bob, 0, 0) | PSO1
⟹ (νt)((t(x).d̄u_bob⟨ARU, x⟩) | (t̄⟨r_aud⟩.R(dr_aud, r_aud, 1, r_emp, 0, 0))) | U(du_bob, 0, 0) | PSO1
⟹ · · · ⟹ R(dr_aud, r_aud, 1, r_emp, 0, 0) | U(du_bob, 1, r_aud) | PSO1 = Auditor | Bob | PSO1
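The rendezvous in Example 7 can be mimicked with a toy Python function (a sketch of ours, not the paper's semantics): the operation first obtains the role's access port, as if over a fresh channel t, then passes that port to the user. Names come from Example 7; everything else is an assumption.

```python
def assign_role_user(role_table: dict, user_table: dict, du: str, dr: str):
    """Toy AssignRoleUser: query the role for its access port, then link the user."""
    # Step 1: output on dr asks the role process for its access port.
    r_port = role_table[dr]
    # Step 2: the role replies on the private channel t; here the lookup stands in
    # for receiving x = r along t.
    # Step 3: the ARU command along du links the user with the received port.
    if r_port not in user_table[du]:
        user_table[du].append(r_port)

roles = {"dr_aud": "r_aud"}   # Auditor: administrative port -> access port
users = {"du_bob": []}        # Bob has no role ports yet

assign_role_user(roles, users, "du_bob", "dr_aud")
assert users["du_bob"] == ["r_aud"]   # Bob is now linked with the Auditor role
```

Running the same operation twice is harmless, matching the x ∉ r⃗n guard of the ARU branch in Definition 4.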
Fig. 4. Create User

Fig. 5. AssignRoleUser
Example 8. SSO deletes the Manager role.
SSO | Manager
⟹ DeleteRole(dar_sso, dr_man).AR(dar_sso, 1, dar_pso1, 0, 0, 0, 0) | Manager
⟹ · · · ⟹ AR(dar_sso, 1, dar_pso1, 0, 0, 0, 0)
5 Conclusion and Future Work
In this paper, we propose the DARBAC model, a domain administration model for RBAC. In this model, an administrative role can execute administrative operations on the users, roles, objects and child administrative roles within its administrative domain. We then use the π-calculus to formalize the elements of the DARBAC model; process reduction rules explain the meaning of each administrative operation. The work presented in this paper can be extended in several directions. First, in this paper we only use process reduction to represent the dynamic behavior of administrative roles. Other π-calculus analysis methods, such as bisimulation, congruence and the modal mu-calculus, could be used to further analyze the safety properties and expressive power of our DARBAC model. Second, the method in this paper could be used to formalize other access control models, such as DAC and MAC, and to compare the safety properties and expressive power of these models in a unified π-calculus setting.
Acknowledgement
The work described in this paper was partially supported by the National Basic Research Program of China (Grant No. 2002CB312006) and the National High-Tech R&D Program of China (Grant No. 2003AA411022).
References
1. Ravi S. Sandhu, Edward J. Coyne, Hal L. Feinstein, and Charles E. Youman. Role-based access control models. IEEE Computer, 29(2):38–47, February 1996.
2. David F. Ferraiolo, Ravi Sandhu, S. Gavrila, D. Richard Kuhn, and R. Chandramouli. Proposed NIST standard for role-based access control. ACM Transactions on Information and System Security, 4(3):224–274, August 2001.
3. Ravi S. Sandhu, Venkata Bhamidipati, and Qamar Munawer. The ARBAC97 model for role-based administration of roles. ACM Transactions on Information and System Security, 2(1):105–135, 1999.
4. Sejong Oh and Ravi S. Sandhu. A model for role administration using organization structure. SACMAT 2002: 155–162.
5. Jason Crampton and George Loizou. Administrative scope: A foundation for role-based administrative models. ACM Transactions on Information and System Security, 6(2):201–231, 2003.
6. H. F. Wedde and M. Lischka. Modular authorization and administration. ACM Transactions on Information and System Security, 7(3):363–391, August 2004.
7. M. Koch, L. V. Mancini, and F. Parisi-Presicce. A graph based formalism for RBAC. ACM Transactions on Information and System Security, 5(3):332–365, August 2002.
8. Manuel Koch, Luigi V. Mancini, and Francesco Parisi-Presicce. Administrative scope in the graph-based framework. SACMAT 2004: 97–104.
9. R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, Part I/II. Journal of Information and Computation, 100(1):1–77, September 1992.
10. Joachim Parrow. An introduction to the pi-calculus. Handbook of Process Algebra, Elsevier, 2001, pp. 479–543.
11. Davide Sangiorgi and David Walker. The Pi-Calculus: A Theory of Mobile Processes. Cambridge University Press, 2001.
12. Martin Abadi and Andrew D. Gordon. A calculus for cryptographic protocols: The spi calculus. ACM Conference on Computer and Communications Security 1997: 36–47.
13. M. Hennessy and J. Riely. Information flow vs. resource access in the asynchronous pi-calculus. ACM Transactions on Programming Languages and Systems, 24(5):566–591, September 2002.
14. Julian A. Padget and Russell J. Bradford. A pi-calculus model of a Spanish fish market: Preliminary report. First International Workshop on Agent Mediated Electronic Trading, AMET 1998: 166–188.
An Efficient Way to Build Secure Disk∗

Fangyong Hou, Hongjun He, Zhiying Wang, and Kui Dai

School of Computer, National University of Defense Technology, Changsha, 410073, P.R. China
[email protected]
Abstract. Protecting data confidentiality and integrity is important for secure computing. An approach that integrates encryption and hash-tree-based verification is proposed here to protect disk data. Together with sector-level operation, it provides protection characterized by online checking, high resistance against attacks, protection of arbitrary data, and a unified low-level mechanism. To achieve satisfactory performance, it adopts a specially structured hash tree and defines the hash sub-trees corresponding to frequently accessed disk regions as hot-access-windows. By utilizing hot-access-windows, simplifying the layout of the tree structure and buffering suitable portions of the hash tree nodes, it reduces the cost of protection sufficiently. At the same time, it supports fast recovery to maintain consistency effectively. The related model, approach and system realization are elaborated, as well as testing results. Theoretical analysis and experimental simulation show that it is a practical and available way to build a secure disk.
1 Introduction

Providing a privacy- and tamper-proof environment is a crucial factor for ensuring secure or trusted computing. In this paper, we focus on protecting the data confidentiality and integrity of a mass storage device, with the specific instance of a locally connected hard disk. Here, confidentiality means preventing unauthorized data disclosure, while integrity means protecting data from corruption or unauthorized modification. Confidentiality is usually provided through cryptography, and secret key cryptography (or symmetric cryptography, such as a block cipher like AES) is applied to protect a mass of data. Generally, the process of encryption/decryption is relatively straightforward. Providing solid integrity is a difficult task, especially when online checking and resistance against replay attacks are required. Online mode checks integrity after each access. It can avoid committing an erroneous result (for example, checking the integrity of a file cannot detect an invalid data block before all the data blocks of the file have been reached; if portions of the file take effect before the entire file is verified, errors may be committed), but requires more frequent checking than offline checking (which checks whether the untrusted storage device performed correctly after a

∗ This work is supported by National Laboratory for Modern Communications (No. 51436050505KG0101).
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 290 – 301, 2006. © Springer-Verlag Berlin Heidelberg 2006
sequence of operations has been performed). A replay attack means that an intruder stores a message and its signature (often a hash result), and later uses them to spoof users. Generally, integrity is verified by an integrity code, also referred to as a MAC (Message Authentication Code). However, a MAC alone is brittle against replay attacks. To resist replay attacks, the hash tree, or Merkle tree [1], is widely used. A hash tree regards the content of all files/blocks at some point as one continuous set of data, and maintains a single (all of the data, hash) pair. Relying on its trusted root, an arbitrarily large storage can be verified and updated [2]. In this way, an intruder cannot replace some of the files (or blocks) without being detected. Although a hash tree can provide online checking and resist replay attacks, a naive hash tree makes the system too slow to use, as its checking process involves many node accesses. Some optimizing measures have been put forward. For example, CHTree [3] uses the L2 cache to store a portion of the hash tree nodes to improve performance when applying a hash tree to memory verification. LHash/H-LHash [4] uses an incremental multiset hash method to maintain access logs and checks them at a later time (which should be seen as offline checking). There exist many systems that protect mass data. Among them, some only provide confidentiality, some provide integrity relying on MACs, and some use the principle of the hash tree to resist replay attacks. CFS [5] and Cryptfs [6] encrypt file data to provide confidentiality. Tripwire [7] uses the principle of a MAC to check the integrity of files. SFSRO [8] uses the hash of a file-data block as the block identifier to guarantee the integrity of content data. SUNDR [9] uses the hash of a block as the block identifier and a hash tree to provide data integrity at the file system level.
PFS [10] keeps a list called the block map, mapping a file system block number to the hash of a block, to protect data integrity. Arbre [11] builds a hash tree tightly into the file system design to protect the integrity of the entire file system. But nearly all of these existing systems fall short on integrity protection. For example, PFS cannot prevent replay attacks. Arbre inherits the limitations of other tree-structured file systems, which makes applications requiring frequent synchronization perform poorly. In this paper, we bring forward an approach to protect hard disk confidentiality and integrity. First, it operates at the sector level to give low-level protection, which lets meta-data and file data be protected at the same time and makes it easy to deploy into existing systems. Second, it applies encryption to prevent intruders from understanding data. Last, it constructs a hash tree to provide integrity in online mode, and verifies the whole protected data space as a single unit to resist the intractable replay attack. By simplifying the structure of the hash tree, utilizing the locality of hard disk accesses, buffering some of the hash tree nodes, and applying other measures such as asynchronous verification, it optimizes the hash tree checking process sufficiently to achieve high performance. The rest of this paper is organized as follows. Section 2 elaborates the fundamental method of our protection. Section 3 describes the specific system realization. Section 4 makes a performance evaluation. Section 5 discusses some related issues. Section 6 concludes this paper.
F. Hou et al.
2 Protection Approach

2.1 Model

Considering the case that a hard disk connects to the host through a local interface/bus (such as IDE or ATA), the protection model is shown in Fig. 1.

Fig. 1. The considered protection model. The trusted boundary lies between the host and the attached hard disk.
In Fig. 1, the hard disk is assumed to be vulnerable to attack. The trusted boundary lies between the host and the hard disk; that is, both the processor/controller and system memory are treated as trusted. This is reasonable when considering hard disk protection individually. Existing techniques [3, 12] can be used to verify or protect system memory, if necessary. A special component, "SecureDriver", lies in the inner part of the trusted boundary, and it serves to: (i) provide confidentiality by encrypting any data stored to the hard disk; (ii) provide integrity by checking that the value the host loads from a particular location of the disk is the most recent value that it stored to the same place. With the proposed model, our approach tries to achieve the following purposes.
− Low-level protection. We want the protection mechanism not to touch the file system and to work at the lowest level of disk operation. Directly operating upon each disk sector makes our mechanism a unified protection that can protect any data on the disk (meta-data, file data, as well as temporary regions like the swapped pages of virtual memory). At the same time, ignoring high-level data management gives transparent protection to existing systems, and such a low-level mechanism can be implemented straightforwardly on both legacy and modern storage systems (whether for a file system or a raw disk, and whether under UNIX or Windows).
− Data encryption. Any data stored on the hard disk should be encrypted.
− Integrity checking in online mode with resistance against replay attacks. To realize this purpose, a hash tree is perhaps the only feasible way. However, a hash tree may incur a big cost to complete its checking processes, which contradicts high performance. So, sufficient optimization of the hash tree becomes the key to disk protection.
− Consistency. As the hard disk is a permanent storage device, maintaining a consistent state in the event of a system crash is required.
− Performance.
To be worthwhile, protection must not impose too great a performance penalty; the related protection processes should themselves be fast.
2.2 Integrity Verification Through Hash Tree

Hash Tree Optimizing. Utilizing the character of the protection model, we propose a simple but effective method to optimize the hash tree checking process, as follows. We build a single hash tree over the whole protected disk space; then we define an access-window to be a hash sub-tree that corresponds to one hard disk sub-space. Thus, the entire hash tree consists of a number of access-windows. As disk access has a strong locality pattern, we define hot-access-windows to be the special access-windows that are frequently accessed during a given period, while the other access-windows are cool-access-windows. Based on these concepts, a tree-style hash scheme is illustrated in Fig. 2.
In fig.2, a special hash tree with fixed three levels is constructed. The lowest-level nodes are leaf nodes. Each middle-level node is the collision-resistance hash of the leaf nodes affiliated with it. The root node lies in the uppermost-level, and is created from hashing the result that concatenates all the middle-level nodes. The whole protected space is split into access-windows, and each access-window has one middle-level node as the top node of its hash sub-tree. For example, h1, h2, h3 and h4 compose an accesswindow. When being frequently accessed, it becomes a hot-access-window, such as hotWin1. In hotWin1, one middle-level node of hash tree, h5, becomes the top node of the hash sub-tree in hotWin1. The integrity of h1, h2, h3 and h4 can be verified by h5, and updating of them can be aggregated at h5. The integrity of h5, as well as other top nodes of access-windows (e.g., h6), can be verified by root, and updating of them can be aggregated at root. In such way, dynamically changing data in the whole protected space can be verified and updated. To make it work well, rules and processes are designed as below. − Rules. Access-window has fixed width. Hot-access-windows can be located continuously to cover a bigger protected region, or distributed discretely in different places to cover many small regions. − Process1-- Initialize. Do once for normal working cycle: (i) reads all the protected data blocks to construct all the nodes of hash tree; (ii) saves all the top nodes of access-windows to non-volatile storage device (such as flash memory disk, or be placed in a special region of hard disk); (iii) saves the root node to non-volatile and secure place (such as an equipped flash memory device in the inner part of trusted boundary).
− Process 2 -- Start. At the very beginning of each normal start: (i) loads the saved top nodes and fills them into the working buffer; (ii) concatenates all the top nodes of access-windows together, and computes the hash of the concatenated data; (iii) checks that the resulting hash matches the saved root node.
− Process 3 -- Prepare an access-window. (i) reads the protected data blocks covered by this access-window; (ii) hashes each block to get each leaf node; (iii) concatenates all the leaf nodes of this access-window together, and computes the hash of the concatenated data to get the top node.
− Process 4 -- Construct a hot-access-window. (i) executes as Process 3; (ii) checks that the newly generated top node matches the buffered top node of this hot-access-window; (iii) buffers all the generated leaf nodes.
− Process 5 -- Verify a read covered by a hot-access-window. (i) calculates the hash of the data block that is currently read; (ii) checks that the resulting hash matches the corresponding buffered leaf node.
− Process 6 -- Update for a write covered by a hot-access-window. (i) sets a flag to indicate that this hot-access-window has been modified; (ii) calculates the hash of the data block that is currently written; (iii) replaces the corresponding buffered leaf node with the resulting hash.
− Process 7 -- Remove a hot-access-window. For a hot-access-window that is no longer frequently accessed: (i) jumps to step (iv) directly if this hot-access-window has never been modified; (ii) concatenates all the leaf nodes of this hot-access-window together, and computes the hash of the concatenated data; (iii) updates the corresponding buffered top node of this hot-access-window with the resulting hash; (iv) withdraws this hot-access-window to abolish its buffered leaf nodes; (v) builds a new hot-access-window in another place (during working time, withdrawing an old hot-access-window implies creating a new one).
− Process 8 -- Verify a read covered by a cool-access-window. (i) executes as Process 3; (ii) checks that the newly generated top node matches the buffered top node of this cool-access-window.
− Process 9 -- Update for a write covered by a cool-access-window. (i) executes as Process 3; (ii) replaces the buffered top node of this cool-access-window with the newly generated one.
− Process 10 -- Exit. At each normal exit: (i) withdraws all the hot-access-windows as in Process 7 (without creating new hot-access-windows); (ii) concatenates all the top nodes of access-windows together, and computes the hash of the concatenated data; (iii) updates the root node with the resulting hash; (iv) saves back all the top nodes to their permanent storage place (as mentioned in Process 1).
The concept of the hot-access-window is similar to Hou et al. [13]. But here, as system memory is treated as trusted, the related processes differ significantly in order to make better use of a relatively big trusted buffer. It also has some similarities with the CHTree method, but its simplified structure and regular allocation scheme differ greatly. Additionally, its regularity gives more facilities for the recovery process than CHTree does. By buffering all the leaf nodes of a hot-access-window, checking in a hot-access-window needs no additional disk accesses. Checking in a cool-access-window only
An Efficient Way to Build Secure Disk
295
needs to read the disk region covered by one access-window. Top nodes of hot-access-windows are updated only on withdrawal, and the root node is updated only on exit. Thus, most of the costly updates of higher-level hash nodes are combined and delayed without affecting run-time performance, while online verification is still provided.
Fast Recovery. A consistent state can be characterized as: (i) a write to a disk sector is completed, and (ii) the corresponding hash tree update along the checking path up to the root node is completed. Recovery must ensure that these two steps stay synchronized (the encryption process cannot cause an inconsistent state). Although consistency could be restored by executing "Section 3.1, Algorithm 1" again after an unexpected crash, that approach has a large reading cost and offers no security assurance. Fast recovery makes recovery both fast and secure through a special data structure called the Update-Snap. For the moment, we only describe how fast recovery works; we discuss its security later. The Update-Snap contains three fields. The first is a flag, called UFlag, indicating whether the top node of an access-window is consistent with the protected disk region covered by that access-window (a value of "Y" means consistent, while "N" means inconsistent). The second, called UHash, holds the value of the top node of the access-window. The last field is an index associating each (UFlag, UHash) pair with its corresponding access-window. To maintain the Update-Snap, SecureDriver does the following.
− For any modification to the protected region covered by an access-window, it sets the corresponding UFlag to "N". This is done at the beginning of "Section 2.2, 1), Process6, Update for a write covered by a hot-access-window" and "Section 2.2, 1), Process9, Update for a write covered by a cool-access-window".
− For any update to a top node, it writes the value of that top node to the corresponding UHash, and sets its UFlag to "Y".
This is done at the end of "Section 2.2, 1), Process7, Remove a hot-access-window", as well as at the end of "Section 2.2, 1), Process9, Update for a write covered by a cool-access-window".
To recover from a crash, SecureDriver does the following.
− For each record in the Update-Snap, it fetches UHash directly if UFlag equals "Y". Otherwise, it reads the disk region covered by the corresponding access-window to recalculate its top node, then replaces UHash with the newly calculated value and sets UFlag to "Y".
− It replaces the saved top nodes mentioned in "Section 2.2, 1), Process1, Initialize" with the new values obtained in the above step.
− It concatenates all the top nodes together and computes the hash of the concatenated data; then it updates the root node with the resultant hash.
For fast recovery, the worst case is that when the system crashes, all the hot-access-windows have been modified but have not yet been removed or withdrawn, and the current update falls into a cool-access-window whose top node has not been updated in time. In that case, recovery only requires reading the disk regions covered by all the hot-access-windows and one cool-access-window. This cost is far smaller than reading the entire protected disk space. Additionally, recording the Update-Snap during
296
F. Hou et al.
the working time occurs only occasionally, and the related operations are lightweight. Fast recovery is therefore suitable for cases where recovery time is critical.
Asynchronous Verification. Online checking may stall the next disk access while waiting for the current check to complete, especially when disk accesses come in bursts. To give better performance, we allow limited asynchronous integrity checking. For this purpose, a queue holds several disk accesses, so that one access can start immediately after the previous one without waiting for the verification result, until the queue is full. The impact of verification delay on the system is thus reduced. "Limited" means that the queue is much shorter than in pure offline verification (which performs one check per operation sequence of about a million accesses, such as LHash). When asynchronous checking is applied, execution is effectively "speculative": an invalid sector may be used before the integrity violation is detected. Since the length of the asynchronous checking queue is limited, the probability of committing an erroneous result can be kept at an acceptable level. The existence of such a buffer may affect the correctness of fast recovery; a possible solution is to set UFlag immediately when a write transaction enters the queue, even though the related checking has not yet been executed.
2.3 Data Encryption
We apply a cipher in SecureDriver to encrypt/decrypt every data sector written to or read from the hard disk. The cipher used by SecureDriver is the AES block cipher. File or file-system encryption needs a complex key management process, covering, for example, how cryptographic keys are applied and how keys are revoked. Different systems control this process in various ways. In real usage, the secret key used by SecureDriver's cipher should be changeable through an interface to a high-level management program.
In this paper, we leave such management aside (assuming it is the task of others, such as the OS). For convenience, we assume that the AES cipher always uses the same 128-bit secret key. Like the root node of the hash tree, the AES secret key should also be kept in a secure, non-volatile memory device.
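Before turning to realization, the window-based hash tree of Section 2.2 can be condensed into a small, stdlib-only Python sketch (SHA-1 leaves and top nodes, as in the paper; the 4-sector window width and toy sector contents here are illustrative stand-ins for the real 128-sector configuration):

```python
import hashlib

SECTOR = 512          # bytes per leaf block (one disk sector, as in the paper)
WINDOW = 4            # sectors per access-window (128 in the paper; small here)

def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def leaf_nodes(window_sectors):
    # Process3 (i)-(ii): hash every sector of the window
    return [h(s) for s in window_sectors]

def top_node(window_sectors):
    # Process3 (iii): top node = hash of the concatenated leaves
    return h(b"".join(leaf_nodes(window_sectors)))

def root_node(tops):
    # Process1/2: root = hash of all concatenated top nodes
    return h(b"".join(tops))

# toy disk with two access-windows
disk = [[bytes([w * WINDOW + i]) * SECTOR for i in range(WINDOW)] for w in range(2)]
tops = [top_node(w) for w in disk]
root = root_node(tops)

# Process5: a read inside a hot window is checked against its buffered leaf
leaves0 = leaf_nodes(disk[0])
assert h(disk[0][2]) == leaves0[2]

# Process6/7: a write replaces the leaf; withdrawal refreshes top and root
disk[0][2] = b"\xAA" * SECTOR
leaves0[2] = h(disk[0][2])
tops[0] = h(b"".join(leaves0))
root = root_node(tops)

# tampering with a sector now mismatches the buffered leaf node
tampered = bytearray(disk[0][2]); tampered[0] ^= 1
assert h(bytes(tampered)) != leaves0[2]
print("ok")
```

Process8/9 (cool-access-windows) reduce to recomputing `top_node` over the freshly read window and comparing it with, or storing it into, the buffered top node.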
3 System Realization
A specific realization needs specific hash tree parameters. One is the block size covered by each leaf node, which is set to one disk sector (usually 512 B). Another is the width of the access-window. A wider window burdens the preparation of an access-window with more cost, and slows the process of hashing leaf nodes to match the top node. A narrower window gives more precise coverage of disk regions that are frequently accessed, but needs more buffer to hold more top nodes of access-windows. A compromise is to set the width to cover a 64 KB disk region, or 128 continuous sectors. Such a width makes matching the top node of an access-window quick enough, keeps disk coverage reasonable, and keeps the number of access-windows from growing too large. The last parameter is the number of hot-access-windows. More of them speeds up checking further, but requires more buffer to hold more leaf nodes; additionally, it may slow down fast recovery. For
common cases, maintaining 1K hot-access-windows is appropriate. This makes about (64KB * 1K = 64MB) of protected space covered by hot-access-windows at any time. Additionally, we use the SHA-1 hash function to produce hash tree nodes, with 160-bit (20 B) results. With these selections, the root node is held in secure memory with a capacity requirement of 20 B. For a 10 GB hard disk partition, there are about (10GB / 64KB ≈ 0.16M) access-windows. So, buffering the top nodes and the leaf nodes of hot-access-windows needs a buffer of about (0.16M*20B + 128*20B*1K ≈ 5.7MB). The modules of the SecureDriver realization are shown in Fig. 3 (Tree Checker, Node Buffer, Sector Buffer, Secure Memory, AES Cipher, Snap Recorder, Coupler, and the hard disk interface).
Fig. 3. Modules of the specific realization of SecureDriver. Such a security program operates at the disk driver layer, and uses system memory as its trusted buffer.
In Fig. 3, the "Coupler" gets sectors to/from the disk. According to the addresses (i.e., sector numbers), it determines which disk regions should be covered by hot-access-windows. To allow asynchronous checking, a "Sector Buffer" queue holds several outstanding disk accesses; a compromise value is to set the queue to 16 KB. The "Tree Checker" executes the optimized hash tree actions. The "Node Buffer" holds all the top nodes of access-windows, as well as the leaf nodes of hot-access-windows. To avoid reading the same disk region twice, it also holds the leaf nodes of the current cool-access-window (requiring an additional 128*20B). So, subsequent checks in the same cool-access-window, as well as converting it into a hot-access-window immediately, need not read the corresponding disk region again. The "Secure Memory" permanently holds the root node of the hash tree, as well as the AES secret key. The "AES Cipher" is a cryptographic routine that converts one sector into its ciphertext or vice versa. Finally, the "Snap Recorder" maintains the Update-Snap to prepare for fast recovery. For a 10 GB hard disk, the size of the Update-Snap is about (UFlag+UHash+Index) * (number of access-windows) = (1bit+160bit+20bit) * 0.16M ≈ 3.6MB.
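The sizes quoted above can be checked with a short back-of-envelope script (whether a megabyte is counted as 2^20 or 10^6 bytes moves the results slightly around the paper's ≈5.7 MB and ≈3.6 MB figures):

```python
# Back-of-envelope check of the Section 3 sizes (SHA-1 nodes = 20 B).
KB, MB, GB = 2**10, 2**20, 2**30

disk            = 10 * GB          # protected partition
window_bytes    = 64 * KB          # 128 sectors * 512 B per access-window
hot_windows     = 1024             # 1K hot-access-windows
sectors_per_win = 128
node            = 20               # SHA-1 digest size in bytes

n_windows = disk // window_bytes   # number of access-windows (~0.16M)

# Node Buffer: all top nodes + leaf nodes of every hot-access-window
buf = n_windows * node + hot_windows * sectors_per_win * node

# Update-Snap: (UFlag 1 bit + UHash 160 bit + index 20 bit) per window
snap = n_windows * (1 + 160 + 20) / 8

print(n_windows, round(buf / MB, 1), round(snap / MB, 1))
```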
4 Performance Simulation
Compared with the speed of disk access, encryption/decryption latency will not become the performance bottleneck. For integrity checking, the main cost comes from fetching disk sectors. Leaving aside initialization, system start, and exit, three cases affect performance. The first is constructing a new hot-access-window, which requires reading
the disk region covered by this hot-access-window to build the corresponding hash sub-tree (128 continuous sectors must be read). The second is checking an access in a new cool-access-window, which costs about the same as the first case. Fortunately, these two cases do not occur frequently relative to the total number of disk accesses. The last case is that the checking throughput of hot-access-windows cannot keep up with the disk access bandwidth: although accesses are covered by hot-access-windows, checking delay postpones the following accesses. Note, however, that calculating several hashes and making a match in system memory completes much more quickly than disk I/O. Additionally, the asynchronous checking queue greatly helps to eliminate the checking delays caused by the cases above.
To appraise performance and correctness, we built a simulation framework on a PC with a 1.7 GHz P4 CPU, 256 MB PC2100 DDR SDRAM, and an ATA100 disk. We wrote a block driver for the Linux kernel to implement our SecureDriver, and tested a 10 GB disk partition. To check correctness, we exit normally after running for some time, then re-construct the hash tree and compare it against the saved top nodes, as well as the root node; a match means the system works correctly. For simplicity, we did not implement fast recovery in the simulation. Simulation results are shown in Fig. 4.
Fig. 4. Performance simulation results for Trace-A, Trace-B and Trace-C. (a) Main performance results, measured for a 128-sector access-window width, 1024 hot-access-windows, and a 16 KB asynchronous checking queue; each sample is compared separately, with the unprotected case set to "1.0". (b) Performance results for different access-window widths, tested at 128, 256, and 2048 sectors per access-window, keeping the same working-buffer capacity (i.e., the number of hot-access-windows is adjusted accordingly), with the best one set to "1.0".
In Fig. 4, we use several disk traces to imitate different disk usages. Trace-A was captured from the Andrew benchmark [14] (obtained by I/O monitoring while running it on the same PC), while Trace-B and Trace-C were edited from HP Labs TPC-C and TPC-D [15] trace files, respectively (we cut some segments from these two traces). Fig. 4(a) shows a slight performance penalty of less than 5% for common cases (Trace-A and Trace-B). Trace-C shows a visible performance decline of about 12%. The reason is that Trace-C comes from TPC-D, which reflects the disk usage of DSS (Decision Support System) applications. TPC-D applications scan large disk spaces (e.g., searching or summing over a big database), and such scans do not always have strong physical locality. So, the utilization of hot-access-windows is greatly affected; that is, more hot-access-windows are frequently removed, and many disk accesses fall into cool-access-windows.
Some parameters were also adjusted for deeper investigation. Fig. 4(b) shows that too large an access-window width is not a good choice: with a bigger access-window, preparing an access-window requires reading more disk data, and the coverage of frequently accessed disk regions may worsen. Different selections of the other parameters give different results. Related simulations validate some intuitive conclusions: higher hash and AES throughput (e.g., using a special cryptography accelerator), a longer asynchronous checking queue, and more hot-access-windows can all improve performance. However, careless selections have drawbacks; for example, a very long asynchronous checking queue is hardly in accord with the meaning of online verification.
5 Discussions
At the current stage, the security of Update-Snap-based fast recovery has not been fully studied; this is one of our future tasks. Here, we just list the potential attacks and suggest possible solutions, as follows.
− Directly tampering with the UHash of a (UHash, UFlag = "Y") pair will incur an integrity violation when the corresponding protected disk region is later checked, because the regenerated top node will differ from the stored value of UHash.
− All the (UHash, UFlag) pairs should be encrypted with a secret key known only to the core of SecureDriver. Then, attempts to re-calculate the hash nodes of some maliciously modified sectors and replace the value of a (UHash, UFlag = "Y") pair can be detected, because an intruder cannot produce the proper ciphertext of the (UHash, UFlag = "Y") pair without the encryption key.
− To prevent an intruder from replacing a new pair with a copied old (UHash, UFlag = "Y", sectors covered by this access-window) tuple, an incremental hashing scheme [4, 16] can authenticate the Update-Snap at low run-time cost. That is, whenever the Update-Snap is modified (e.g., a UFlag is set to "N", or a UHash is updated and its UFlag is set back to "Y"), sign it with the incremental cryptographic method and save the signature in a secure place.
− Malicious modifications to disk sectors of a (UFlag = "N", sectors covered by this access-window) pair cannot be prevented directly: since the recovery process reads the corresponding sectors and re-calculates the hash sub-tree, an intruder could tamper with disk sectors and have the tampering take effect. A possible solution is to maintain an access log for each hot-access-window and the current cool-access-window. As soon as a UFlag is set to "N", the log records each modification to disk sectors; when the UFlag is set back to "Y", the corresponding log is cleared.
With these logs, the system can roll back to the most recent (correct UHash, UFlag = "Y", correct sectors covered by this access-window) state.
− Additionally, it is better to authorize only reliable users to perform recovery, to prohibit arbitrary "crash-recovery" operations.
In fact, the UHash contained in the Update-Snap can be the same as the permanent storage of top nodes mentioned in "Section 2.2, 1), Process1, Initialize". For clarity, we give UHash a separate logical name.
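Putting the Update-Snap rules and the suggested authentication together, a toy sketch might look as follows (HMAC-SHA1 over the whole snap stands in for the incremental hashing of [4, 16], purely for brevity; the key and the two-sector windows are illustrative):

```python
import hashlib, hmac

KEY = b"secret known only to SecureDriver"   # illustrative key

def h(b): return hashlib.sha1(b).digest()
def top_of(window): return h(b"".join(h(s) for s in window))

def sign(snap):
    # stand-in for the incremental authentication suggested in Section 5
    m = hmac.new(KEY, digestmod=hashlib.sha1)
    for idx in sorted(snap):
        uflag, uhash = snap[idx]
        m.update(bytes([idx]) + uflag + uhash)
    return m.digest()

# two access-windows of two one-byte "sectors" each
disk = {0: [b"a", b"b"], 1: [b"c", b"d"]}
snap = {i: (b"Y", top_of(w)) for i, w in disk.items()}
sig = sign(snap)

# a write: UFlag goes to "N" first; then the system crashes
# before the top node is written back
snap[0] = (b"N", snap[0][1]); sig = sign(snap)
disk[0][1] = b"B"                      # ...crash here...

# recovery: check the snap's signature, then recompute only "N" windows
assert hmac.compare_digest(sig, sign(snap))
for i, (uflag, uhash) in snap.items():
    if uflag == b"N":
        snap[i] = (b"Y", top_of(disk[i]))
sig = sign(snap)
root = h(b"".join(snap[i][1] for i in sorted(snap)))
print("recovered")
```

Only the flagged window is re-read; the "Y" windows contribute their stored UHash directly, which is exactly the cost argument made for fast recovery above.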
Besides operating on disk sectors directly, another option is to set the block size equal to several continuous sectors, such as a "cluster" (commonly used by file systems as the basic data unit). This still provides "low-level" protection for real usage, while greatly decreasing the number of leaf nodes and hence the cost of node buffering. The most flexible choice in our hash tree verification scheme is the width of the access-window. If run-time performance matters more than other considerations (e.g., memory occupation, recovery time, etc.), it is better to have more hot-access-windows with a smaller width. Purely from the point of view of tamper detection, the permanent storage of top nodes mentioned in "Section 2.2, 1), Process1, Initialize" needs no protection, since tampering with them causes a mismatch against the root node. However, to improve system availability, it is better to protect these nodes as well: if they are tampered with, the whole hash tree has to be rebuilt to match the root node, which takes a long time. We can put these nodes into a special disk region inaccessible through the "public" disk interface to give them some protection. According to the allocation schemes of common file/storage systems, disk files created on a mostly empty disk are likely to occupy sequential sectors. However, they may become fragmented after many operations or long run times. This affects the efficiency of our hash tree optimization method, as it deteriorates the locality of disk I/O. Regularly running a "Disk Defragmenter" program (which can combine file fragments into one continuous region) may help. Wang has found collisions for the MD5 hash function [17]; in fact, SHA-1 also has collisions.
For highly secure application scenarios, we should use SHA-2 to construct the hash tree, though it takes more space to store and buffer hash nodes (the output of SHA-2 is longer than that of MD5/SHA-1).
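Taking SHA-256 as the SHA-2 instance, the storage growth is easy to quantify:

```python
import hashlib

# digest sizes decide the per-node cost of every buffer and of Update-Snap
sha1 = hashlib.sha1(b"sector").digest()
sha256 = hashlib.sha256(b"sector").digest()
print(len(sha1), len(sha256))      # 20 B vs 32 B per hash tree node
print(len(sha256) / len(sha1))     # node buffers grow by this factor
```

So moving from SHA-1 to SHA-256 inflates the ≈5.7 MB node buffer and the ≈3.6 MB Update-Snap by the same 1.6x factor.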
6 Conclusions
By encrypting data and constructing a hash tree over the protected disk sectors, our approach provides solid protection at the lowest level of disk access. To achieve good performance, it uses the concept of the hot-access-window to speed up most of the integrity checking. Together with the simplified tree layout and proper buffering of hash tree nodes, the checking process is sufficiently optimized. For common cases, the performance penalty is less than 5%. As we have demonstrated and discussed, this approach is a practical and available way to protect a hard disk against information disclosure and tampering.
References
1. R. C. Merkle: Protocols for public key cryptosystems. IEEE Symposium on Security and Privacy (1980) 122-134
2. M. Blum, W. S. Evans, P. Gemmell, S. Kannan, and M. Naor: Checking the correctness of memories. IEEE Symposium on Foundations of Computer Science (1991) 90-99
3. B. Gassend, G. E. Suh, D. Clarke, M. van Dijk, and S. Devadas: Caches and Merkle trees for efficient memory authentication. Ninth International Symposium on High Performance Computer Architecture (2003)
4. G. E. Suh, D. Clarke, B. Gassend, M. van Dijk, and S. Devadas: Hardware Mechanisms for Memory Integrity Checking. Technical report, MIT LCS TR-872 (2003)
5. M. Blaze: A cryptographic file system for Unix. In 1st ACM Conference on Communications and Computing Security (1993) 9-16
6. E. Zadok, I. Badulescu, and A. Shender: Cryptfs: A stackable vnode level encryption file system. Technical report, Computer Science Department, Columbia University (1998)
7. Tripwire. http://www.tripwire.org
8. K. Fu, F. Kaashoek, and D. Mazieres: Fast and secure distributed read-only file system. In Proceedings of OSDI 2000 (2000)
9. D. Mazieres and D. Shasha: Don't trust your file server. 8th Workshop on Hot Topics in Operating Systems (2001)
10. C. A. Stein, J. H. Howard, and M. I. Seltzer: Unifying file system protection. In 2001 USENIX Annual Technical Conference (2001) 79-90
11. Fujita Tomonori and Ogawara Masanori: Protecting the Integrity of an Entire File System. First IEEE International Workshop on Information Assurance (2003)
12. G. E. Suh, D. Clarke, B. Gassend, M. van Dijk, S. Devadas: Aegis: Architecture for tamper-evident and tamper-resistant processing. 17th Int'l Conference on Supercomputing (2003)
13. Fangyong Hou, Zhiying Wang, Yuhua Tang, Jifeng Liu: Verify Memory Integrity Basing on Hash Tree and MAC Combined Approach. International Conference on Embedded and Ubiquitous Computing (2004)
14. J. H. Howard, M. L. Kazar, S. G. Menees, D. A. Nichols, M. Satyanarayanan, R. N. Sidebotham, M. J. West: Scale and performance in a distributed file system. ACM Transactions on Computer Systems, Vol. 6, February (1988) 51-81
15. HP Labs. Tools and traces. http://www.hpl.hp.com/research/
16. M. Bellare and D.
Micciancio: A New Paradigm for collision-free hashing: Incrementality at reduced cost. In Proceedings of Eurocrypt '97, Springer-Verlag LNCS 1233 (1997)
17. Xiaoyun Wang, Dengguo Feng, Xuejia Lai, and Hongbo Yu: Collisions for hash functions MD4, MD5, HAVAL-128 and RIPEMD. Crypto 2004 (2004)
Practical Forensic Analysis in Advanced Access Content System Hongxia Jin and Jeffery Lotspiech IBM Almaden Research Center, San Jose, CA, 95120 {jin, lotspiech}@us.ibm.com
Abstract. In this paper we focus on the use of a traitor tracing scheme for distribution models that are one-to-many: a networked broadcast system, or one based on prerecorded or recordable physical media. In this type of system, it is infeasible to mark each copy differently for each recipient. Instead, the system broadcasts limited variations at certain points, and a recipient device has the cryptographic keys that allow it to decrypt only one of the variations at each point. Over time, when unauthorized copies of the protected content are observed, a traitor tracing scheme allows the detection of the devices that have participated in the construction of the pirated copies. The authors have been involved in what we believe is the first large-scale deployment of the tracing-traitors approach in a content protection standard for the new generation of high-definition DVD optical discs. Along the way, we have had to solve both practical and theoretical problems that had not been apparent in the literature to date. In this paper we present this state of practice of traitor tracing technology and share some of our experience in bringing this important technology to practice.
1
Introduction
AACS [1], the Advanced Access Content System, was founded in July 2004 by eight companies: Disney, IBM, Intel, Matsushita, Microsoft, Sony, Toshiba, and Warner Brothers. It develops content protection technology for the next generation of high-definition DVD optical discs, and supports expanded flexibility in accessing, managing, and transferring content within a standalone or networked environment. Compared to the previous DVD CSS system, which is a flat "do not copy" technology, AACS is an enabling technology, allowing consumers to make authorized copies of purchased movie discs, and potentially enriching the experience of the movie with an online connection. The fundamental protection of the AACS system is based on broadcast encryption with a subset-difference tree using device keys and a media key block [2]. It allows unlimited, precise revocation without danger of collateral damage to innocent devices. The mechanism is designed to exclude clones or compromised devices, such as the infamous "DeCSS" application used for copying "protected" DVD Video discs. Once the attacker
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 302-313, 2006. © Springer-Verlag Berlin Heidelberg 2006
Practical Forensic Analysis in Advanced Access Content System
303
has been detected, they are excluded from newly released content, because the new media key blocks in the new content exclude the keys known to the attackers. However, the AACS founders do not believe that this level of renewability solves the piracy problem completely. What if an attacker re-digitizes the analogue output from a compliant device and redistributes the content in unprotected form? It can be an exact in-the-clear digital copy of the movie, with all of its extra navigation and features. In this case, the only forensic evidence available is the unprotected copy of the content. Also, because of the inherent power of revocation in the AACS system, attackers may forgo building clones or non-compliant devices and instead devote themselves to server-based attacks, where they try to hide the underlying compromised device(s). In one particular attack, the attackers could build a server that distributes per-movie keys. Of course, they would have to compromise the tamper-resistance of one or more players to extract these keys. This is progress, because such server attacks are inherently more expensive for the attackers. However, AACS found it desirable to be able to respond even to these types of attacks. These attacks are anonymous; the only forensic evidence available is the per-movie keys or the actual copy of the content. To help defend against these attacks, the AACS system uses tracing traitors technology. AACS uses the term sequence keys to refer to its tracing traitors technology against the anonymous attack; the suitability of the term will become apparent. To be consistent with the cryptographic literature, in this paper a device that engages in piracy will be called equivalently either a traitor or a colluder.
The AACS sequence key scheme allows the watermark to be applied early in the content publishing process while still providing traceability down to the individual content recipient. The traitor tracing problem was first defined by Fiat and Naor in a broadcast encryption system [3]. Such a system allows encrypted content to be distributed to a privileged group of receivers (decoder boxes); each decoder box is assigned a unique set of decryption keys that allows it to decrypt the encrypted content. What are the security problems with this system? A group of colluders can construct a clone pirate decoder that can decrypt the broadcast content. That problem is different from the one we are dealing with in this paper. The threat model this paper is concerned with is what AACS has called the "anonymous attack". As mentioned earlier, attackers can construct a pirated copy of the content (content attack) and try to resell it over the Internet. Or the attackers reverse-engineer the devices and extract the decryption keys (key attack); they can then set up a server and sell decryption keys on demand, or build a circumvention device and put the decryption keys into it. There are two well-known models for how a pirated copy (be it the content or the key) can be generated:
1. Given two variants v1 and v2 of a segment, the pirate can only use either v1 or v2, or produce something unrecognizable, but not any other valid variant.
2. Given two variants v1 and v2 of a movie segment (v1 ≠ v2), the pirate can generate any variant out of v1 and v2.
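The first model can be stated operationally: every segment of the pirated copy must equal some colluder's variant for that segment. A toy sketch (all parameter values are illustrative):

```python
import random

q, n, m = 16, 5, 3   # variations per segment, segments, colluders
random.seed(1)

# each colluder's device was assigned one variant per segment
colluders = [[random.randrange(q) for _ in range(n)] for _ in range(m)]

# Model 1: segment i of the pirated copy must come from some colluder
pirate = [random.choice([t[i] for t in colluders]) for i in range(n)]
assert all(pirate[i] in {t[i] for t in colluders} for i in range(n))
print(pirate)
```

Under model 2 the last assertion would not have to hold, since the pirate could emit a variant none of the colluders was assigned.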
304
H. Jin and J. Lotspiech
In this paper, we assume that the attackers are restricted to the first model. This is not an unreasonable assumption in the AACS application. To enable tracing for the content attack, each piece of content distributed in the AACS application needs to have different variations. Watermarking is one way to create different variations of the content. In a practical watermarking scheme, given some variants of a movie segment, it is infeasible for the colluders to come up with another valid variant, because they lack the essential information to generate one. Even if they mount other attacks, such as averaging two variants, they may end up with an unrecognizable version; it can hardly be another valid variant. And there are methods, such as [7], that make it very difficult for colluders to remove the marks. An important point, however, is that building a different variation is a media-format-specific problem, and watermarking is only one of the solutions. For example, in a blue-laser DVD format, the variation can simply be a different playlist; in this case it has nothing to do with watermarks, and is thus not restricted by watermark robustness requirements. For the key attack, the traitors need to re-distribute at least one set of keys for each segment. For cryptographic keys that are generated randomly, it is impossible to generate a valid third key by combining two other valid keys; this is equivalent to the first model. A tracing scheme is static if it pre-determines the assignment of the decryption keys for the decoders, or of the watermarked variations of the content, before the content is broadcast. The traitor tracing schemes in [4], [5] are static and probabilistic: they randomly assign the decryption keys to users before the content is broadcast.
The main goal of those schemes, in their context, is to make the probability of exposing an innocent user negligible under as many real traitors in the coalition as possible. Fiat and Tassa introduced a dynamic traitor tracing scheme [8] to combat the same piracy under the same business scenario considered in this paper. In their scheme, each user gets one of the q variations for each segment, but the assignment of each segment's variation to a user is dynamically decided based on the observed feedback from the previous segment. The scheme can detect up to m traitors, but it involves real-time computational overhead. Avoiding this drawback, sequential traitor tracing was presented in [10], and more formal analyses appear in [11], [12]. AACS uses a model similar to [10] that requires no real-time computation or feedback. However, AACS has designed a traitor tracing scheme that attempts to meet all the practical requirements. Existing traitor tracing schemes either need more bandwidth than the content provider can economically afford, accommodate too few players to be practical, or handle too few colluding traitors. Bringing this long-standing theoretical work to practice was the major effort we undertook in the AACS system. In the rest of this paper we first summarize our basic scheme. We then focus on discussing some other practical problems we encountered in implementing the AACS tracing traitors scheme. AACS has been a collaborative
effort amongst the eight companies involved. Although the authors were the individuals primarily involved in this aspect of AACS, we benefited extensively from discussions, reviews, and proposals from the other companies. We would like to especially acknowledge Toru Kambayashi from the Toshiba Corporation, who worked out the details of mapping the technology to the HD-DVD disc format, and Tateo Oishi from the Sony Corporation, who worked out the details of mapping the technology to the Blu-ray disc format.
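As a preview of the static approach discussed above, the following toy sketch randomly assigns one of q variations per segment to each device and then "traces" by scoring devices against an observed pirated copy. This illustrates static probabilistic tracing in general, not the AACS sequence-key scheme itself, and with such small illustrative parameters the top-scoring device is usually, though not provably, a traitor:

```python
import random

random.seed(7)
q, n, devices, m = 16, 15, 1000, 3   # illustrative parameters

# static, probabilistic assignment: one random variation per segment per device
keys = [[random.randrange(q) for _ in range(n)] for _ in range(devices)]

# a coalition of m traitors builds a model-1 pirated copy:
# each segment is taken from one of the traitors' assignments
traitors = random.sample(range(devices), m)
pirate = [keys[random.choice(traitors)][i] for i in range(n)]

# tracing: score each device by how many segments of the pirated
# copy it could have produced, and suspect the highest scorer
def score(d):
    return sum(keys[d][i] == pirate[i] for i in range(n))

suspect = max(range(devices), key=score)
print(suspect, suspect in traitors, score(suspect))
```

An innocent device matches a segment only by chance (probability 1/q), which is the intuition behind making the false-accusation probability negligible; the papers cited above quantify exactly how large n must be relative to q, m, and the device population.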
2
Overhead to Enable Tracing
As mentioned above, in order to enable tracing, the content must be prepared with different variations. These variations occupy extra bandwidth in a network broadcast system and extra space on physical optical media. Although the new generation of DVDs has substantially more capacity, the studios can use that capacity to provide a high-definition picture and to offer increased features on the disc. While it is perfectly reasonable in a theoretical context to talk about schemes that increase the required space by 200% or 300%, no movie studio would accept this. A traitor tracing scheme can be practical only if it requires an acceptable overhead. In general, most studios were willing to accept some overhead, for example below 10%, for forensics. As a nominal figure, we began to design assuming we had roughly 8 additional minutes (480 seconds) of video strictly for forensic purposes in a normal 2-hour movie. Theoretically we could use our 480 seconds to produce the most variations possible: at one particular point in the movie, we could produce 960 variations of 1/2-second duration. In reality this clearly would not work, since the attackers could simply omit that 1/2 second from the unauthorized copy without significantly degrading its value. We believe a better model is to have on the order of 15 carefully picked points of variation in the movie, each of duration 2 seconds, and each having 16 variations. Even so, one could argue that the attackers can avoid these 30 or so seconds of the movie. Our studio colleagues have studied whether these parameters are sufficient, and their answer is, frankly, "it depends"; different formats seem to prefer different parameters. As a result, when we mapped our scheme to the actual disc format, we made sure the duration of the variations was not pre-determined. Of course, longer durations require more overhead.
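The arithmetic behind these nominal parameters works out exactly to the 480-second budget (values taken from the text above):

```python
# Nominal AACS forensic-video budget
budget = 8 * 60                      # roughly 8 extra minutes = 480 s

# one extreme: 960 half-second variations at a single point
assert 960 * 0.5 == budget

# the preferred shape: 15 points x 16 variations x 2 s each
points, variations, duration = 15, 16, 2
assert points * variations * duration == budget

print(points * duration)             # only ~30 s of the movie carries marks
```

Longer or shorter variation durations scale this budget linearly, which is why the duration was left open in the disc-format mapping.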
But this is a tradeoff the studios can make. We should also mention that whether or not a given movie uses tracing traitors technology is always the studio's choice. In the absence of attacks, they would never use it, and the discs would have zero overhead for this purpose. To generalize the above observation, in the AACS model we assume that each movie is divided into multiple segments, among which n segments are chosen to have differently marked variations. Each of these n segments has q possible variations. Each playing device receives the same disc with all the small variations at the chosen points in the content. However, each variation is encrypted with a different set of keys such that any given device out of the player population has
H. Jin and J. Lotspiech
access to only one particular variation for each segment. Therefore, each device plays back the movie through a different path, which effectively creates a different movie version. Each version of the content contains one variation for each segment. The model described here is the same as that used in [8][10] for content tracing. Each version can be denoted as an n-tuple (x0, x1, . . . , xn−1), where 0 ≤ xi ≤ q − 1 for each 0 ≤ i ≤ n − 1. A coalition could try to create a pirated copy based on all the variations broadcast to them. For example, suppose that there are m colluders. Colluder j receives a content copy tj = (tj,0, tj,1, . . . , tj,n−1). The m colluders can build a pirated copy (y0, y1, . . . , yn−1) where the ith segment comes from some colluder tk; in other words, yi = tk,i where 1 ≤ k ≤ m and 0 ≤ i ≤ n − 1. Unfortunately, the variations (y0, y1, . . . , yn−1) associated with the pirated copy could happen to belong to an innocent device. A weak traitor tracing scheme aims to prevent a group of colluders from "framing" an innocent user. In the AACS scheme we deal only with strong traitor tracing schemes, which allow at least one of the colluders to be identified once such pirated copies are found. The AACS traitor tracing scheme, called sequence keys hereafter, is a static scheme. Like all tracing schemes in this category, it consists of two basic steps:

1. Assign a variation for each segment to devices.
2. Based on the observed re-broadcast keys or contents, trace back the traitors.
2.1 Basic Key Assignment
For the first step, AACS systematically allocates the variations based on an error-correcting code. A practical scheme needs to have a small extra disc space overhead, accommodate a large number of devices in the system, and be able to trace devices under as large a coalition as possible. Unfortunately these requirements are inherently conflicting. Assume that each segment has q variations and that there are n segments. A small extra bandwidth means a small q. We represent the assignment of segments for each user as a codeword (x0, x1, . . . , xn−1), where 0 ≤ xi ≤ q − 1 for each 0 ≤ i ≤ n − 1. Consider a code [n, k, d], where n is the length of the codewords, k is the source symbol size, and d is the Hamming distance, which corresponds to the minimum number of segments by which any two codewords differ. To defend against a collusion attack, intuitively we would like the variant assignments to be as far apart as possible. In other words, the larger the Hamming distance, the better the traceability of the scheme. On the other hand, the maximum number of codewords the [n, k, d] code can accommodate is q^k. In order to accommodate a large number of devices, e.g. billions, intuitively either q or k or both have to be relatively big. Unfortunately a big q means big bandwidth overhead, and a big k means a smaller Hamming distance and thus weaker traceability. It is inherently difficult to defend against collusions. In order to yield a practical scheme meeting all the requirements, AACS concatenates codes [9]. The variations in each segment are assigned following one code, namely the inner code, which is then encoded using another code, namely the outer code. We call the nested code the super code. The inner code effectively creates multiple movie versions for any movie, and the outer code assigns different movie versions to the users over a sequence of movies, hence the
Practical Forensic Analysis in Advanced Access Content System
term "sequence keys". This super code avoids the overhead problem by having a small number of variations at any single point. For example, both the inner and outer codes can be Reed-Solomon (RS) codes [9]. In an [n, k, d] RS code, d = n − k + 1. For our inner code, we can choose q1 = 16, n1 = 15 and k1 = 2, thus d1 = 14. For the outer code, we can choose q2 = 256, n2 = 255 and k2 = 4, thus d2 = 252. The number of codewords in the outer code is 256^4, which means that this example can accommodate more than 4 billion devices. Supposing each segment is a 2-second clip, the extra video needed in this example is 450 seconds, within the 10% constraint placed on us by the studios. So both q, the extra bandwidth needed, and q^k, the number of devices our scheme can accommodate, fit in a practical setting. The actual choices of these parameters used in the scheme depend on the requirements and are also constrained by the inherent mathematical relationship between the parameters q, n, k, d. In fact, there does not exist a single MDS code that can satisfy all the practical requirements. For an MDS code, n

    ... → t                       //function
    m ::= (t, l, i)               //member labeled l of type t at offset i
        | (l : n, i)              //bit field labeled l of size n at offset i
    ground ::= e{id1, id2, ..., idk}
        | uninitialized, error type, void *, char, unsigned char,
          short, int, long, float, double, ...

Fig. 1. Types in our type system
starting address of the active part of the structure. The active part shrinks when the pointer moves forward and expands when the pointer moves backward. Note that the active part of an element e can be null if the pointer points at the end of e.

The Type System. We employ a modified version of the type system described in Chandra et al. [8] in our dynamic type checking method. The types of our type system are shown in Figure 1. We have added the function type to the type system so we can catch potential errors caused by misuses of function pointers. We also add two special types to the ground types: uninitialized and error type. An element of type uninitialized has not been initialized by the programmer and does not have a definite value. An element of type error type indicates that an error has occurred during the propagation of dynamic types. An array in our type system is treated like a structure (each member is listed explicitly) since the dynamic types of its elements may differ from each other. An array of declared type t that has k elements can also be written as t[k] when we do not care about the dynamic types of its elements.

Assumptions. For ease of discussion, we assume that all offsets of structures and unions comply with the ANSI C standard. Another assumption is that all structure elements are explicitly padded. When a structure object is stored in accordance with the system's alignment restrictions, one or more padding bytes may be added to it. In an "explicitly padded" version of a structure object, each padding byte is declared as an anonymous bit field.

3.2 Introduced Variables and Auxiliary Functions
Several variables are introduced into our type system to capture the dynamic properties of a running program; we list them in Table 1. In the table we use expressions like var(e) to refer to the introduced variable var which is related to the internal
H. Shen et al.

Table 1. Introduced variables

  Variable   Explanation
  dtype(v)   records the dynamic type of each variable v during program execution
  alias(e)   a set which collects pointers that point to any part of the element e directly
  sptr(e)    remembers the starting address of each variable of structure or array type
Table 2. Introduced auxiliary functions

  Function          Explanation
  stype(e)          compiler-assigned type of a C expression e
  sizeof(t)         size in bytes of type or program element t
  deref(t)          dereferenced type of t if t is a pointer
  ispointer(t)      true iff t is a pointer
  isarray(t)        true iff t is an array
  isstruct(t)       true iff t is a struct
  obj(p)            object that pointer p points to
  update_alias(e)   updates the type of each element in set alias(e) so that it keeps
                    consistent with the type of e whenever the type of e changes
variable or program element e. A number of auxiliary functions are also defined in our type system; they are listed in Table 2. Besides the functions listed in Table 2, another function is defined to construct a type from a struct or an array. Let t be a struct or an array: t = s{m1, m2, ..., mn}. We define
  PostfixType(t, siz) =
      s{m_{k+1}, m_{k+2}, ..., m_n}   where sizeof(m_1) + · · · + sizeof(m_k) = siz
      error type                      where no such m_1, · · ·, m_k exist
3.3 Propagation of Dynamic Types
Initialization. When the program starts to run, the dtypes of all global elements get initialized. The rule for initializing dynamic types of program elements of ground types is quite straightforward: if the element e has a definite value, its dtype(e) is set to stype(e); otherwise it is set to uninitialized. On entry to a function, similar things happen to local variables, but the dynamic types of global variables and formal parameters are not affected. Propagation. The rule for propagating dynamic types among program elements of ground types is quite straightforward when no pointers are involved. Once pointers are involved, things get a little more complicated. To simplify the discussion, we assume that statements related to pointers in the input program have been normalized to consist of only a few simple forms which are listed in Table 3. By introducing temporary variables, any complex statement can be normalized. Some examples of normalization are shown in Table 4.
Securing C Programs by Dynamic Type Checking
Table 3. Statements related to pointers in the normal form

  Name of Statement             Normal Form
  Address-of                    p = &x
  Assignment                    p = Cast_opt q
  Pointer Dereference on rhs    p = *q
  Pointer Dereference on lhs    *p = q
  Plus                          p = q + k   (k ≥ 0)
  Minus                         p = q − k   (k ≥ 0)
Table 4. Examples of normalization of statements related to pointers

  Original statement      Normalization
  p = q->a;               tmp = (char *)q + Offset_a; p = *tmp;
  p = &(q.a);             tmp1 = &q; tmp2 = (char *)tmp1 + Offset_a; p = tmp2;
  (DstType *)(&p) = q;    tmp1 = &p; tmp2 = (DstType *)tmp1; *tmp2 = q;
The propagation rules for the statements listed in Table 3 are given in Figure 2. Besides the cases listed in the table, dynamic types are also propagated when a function call is made. A function declared as

  void func(formal_param1, formal_param2, ..., formal_paramn);

and called in the form

  func(act_param1, act_param2, ..., act_paramn);

can be viewed as a sequence of assignments:

  formal_param1 = act_param1;
  formal_param2 = act_param2;
  ...
  formal_paramn = act_paramn;
  call func;

When the function call is made, the dtypes of the actual parameters are propagated to the corresponding formal parameters. The propagation process continues when the statements in the function body are executed.

3.4 Dynamic Type Checking
Dynamic type checking is performed in the checking procedure. First we introduce some auxiliary functions used in this procedure: 1. is_subtype(p,q) is true iff type p is a subtype of q. The concept "subtype" here is identical to the "physical subtype" presented in Chandra et al. [8]. We also use the notation "t ≤ t′" to denote that t is a subtype of t′.
Address-of:
    alias(obj(p)) = alias(obj(p)) - p;
    p = &x;
    dtype(p) = ptr dtype(x);
    alias(x) = alias(x) ∪ p;
    update_alias(p);

Assignment:
    alias(obj(p)) = alias(obj(p)) - p;
    p = q;
    dtype(p) = dtype(q);
    alias(obj(q)) = alias(obj(q)) ∪ p;
    update_alias(p);

Pointer Dereference on rhs:
    if (ispointer(p))
        alias(obj(p)) = alias(obj(p)) - p;
    p = *q;
    dtype(p) = deref(dtype(q));
    if (ispointer(p))
        alias(obj(obj(q))) = alias(obj(obj(q))) ∪ p;
    update_alias(p);

Pointer Dereference on lhs:
    if (ispointer(*p))
        alias(obj(obj(p))) = alias(obj(obj(p))) - *p;
    *p = q;
    dtype(*p) = dtype(q);
    if (ispointer(*p))
        alias(obj(q)) = alias(obj(q)) ∪ *p;
    update_alias(obj(p));

Plus:
    alias(obj(p)) = alias(obj(p)) - p;
    p = q + k;
    if (isarray(q))
        dtype(p) = PostfixType(dtype(q), k * sizeof(q[0]));
    else if (isstruct(deref(q)))
        dtype(p) = ptr PostfixType(deref(dtype(q)), k * sizeof(stype(*q)));
    else
        dtype(p) = dtype(q);
    alias(obj(q)) = alias(obj(q)) ∪ p;
    update_alias(p);

Minus:
    alias(obj(p)) = alias(obj(p)) - p;
    p = q - k;
    if (isarray(q))
        dtype(p) = PostfixType(dtype(sptr(q)),
                               sizeof(dtype(sptr(q))) - sizeof(dtype(q)) - k * sizeof(q[0]));
    else if (isstruct(deref(q)))
        dtype(p) = ptr PostfixType(deref(dtype(sptr(q))),
                                   sizeof(deref(dtype(sptr(q)))) - sizeof(deref(dtype(q))) -
                                   k * sizeof(stype(*q)));
    else
        dtype(p) = dtype(q);
    alias(obj(q)) = alias(obj(q)) ∪ p;
Fig. 2. Propagation rules for statements
2. prototype(f) returns the prototype of function f.
3. compatible(t1, t2) checks if type t1 is compatible with t2 using the rules listed in Figure 3.

When to Perform Checking. When no pointers, structures, or unions are involved, there is no need to perform checking, because the compiler can guarantee the type safety of the program. So we consider only the statements in which these elements are involved. General pointer references and assignments can propagate errors, but they cannot generate errors, so Address-of statements and Assignment statements of pointers can also be skipped during checking. All other statements need to be checked.

How to Perform Checking. The main task of dynamic type checking is to check the compatibility of an operator and its operands, finding potential type errors and reporting them to the programmer. In most cases, what we need to
Basic rules: scalar ground types (all ground types except void ptr) are subtypes of themselves and not of other ground types. For example: int ≤ int, int ≰ double, int ≰ long

Inference rules:

[Reflexivity]              t ≤ t

[Void Pointers]            ptr t ≤ void *

[Member subtype]           m = (l, t, i)   m′ = (l′, t′, i′)   i = i′   t ≤ t′
                           ⟹  m ≤ m′

[First members]            t ≤ t′   m1 = (l, t, 0)
                           ⟹  s{m1, · · ·, mk} ≤ t′

[Flattened first members]  s{flatten(t)} ≤ s{flatten(t′)}
                           ⟹  t ≤ t′
                           where flatten(t) = t if t is not a structure;
                                 flatten(t) = {flatten(m1), · · ·, flatten(mk)} if t = s{m1, · · ·, mk}

[Integer pointers]         sizeof(int) = sizeof(void ∗)  ⟹  void ∗ ≤ int, ptr t ≤ int
                           sizeof(int) = sizeof(void ∗)  ⟹  void ∗ ≤ unsigned int, ptr t ≤ unsigned int

[Long pointers]            sizeof(long) = sizeof(void ∗)  ⟹  void ∗ ≤ long, ptr t ≤ long
                           sizeof(long) = sizeof(void ∗)  ⟹  void ∗ ≤ unsigned long, ptr t ≤ unsigned long

Fig. 3. Basic and inference rules for subtypes
do is to find the answer to the question “Can the value of an expression A be used as a value of type B?” for different A and B. 1. For unions: The dynamic type of a union object tracks the dynamic type of its active member, so there is no problem when its active member is referenced. When its non-active member is referenced, we need to check the compatibility of its active member and non-active member. If only part of
the non-active member is referenced, we should check the compatibility of the active member and the referenced part of the non-active member. So we formulate the checking process as follows: (1) construct the minimum target type that contains the type of the referenced part of the target member; (2) check the compatibility of the constructed type and the type of the active member.

2. For pointers: We consider only the normalized statements listed in Table 3. (1) In Assignment statements and Address-of statements the dtype is propagated but no check is made, since they cannot generate type errors. (2) In Plus statements and Minus statements, pointer arithmetic should be checked for out-of-bounds errors. This check is performed in the PostfixType function calls. If the result of the plus operation goes beyond the last byte of the referenced object, or the result of the minus operation goes beyond the first byte of the referenced object, the returned type will be set to error type. The error will be found once the resulting pointer is dereferenced. (3) When a pointer is dereferenced, the dereferenced expression must be a pointer, which is checked in deref: deref(p) returns error type if p is not a pointer. When error type is dereferenced, an error is reported. If deref(p) returns some pointer type, we need to check the compatibility of the source type and the target type. Which is source and which is target depends on the side on which the pointer dereference happens.

3. For function calls: Each actual parameter of the function call should be checked to see if it is compatible with the corresponding formal parameter. If the function is called through a function pointer, another check should be made to ensure that the function pointer has the same number of parameters as the function prototype, that the type of each parameter is compatible with that of the corresponding parameter of the function prototype, and that their return types are the same.

Checking the Compatibility of Two Objects.
Function compatible is used to check the compatibility of two objects. It has two parameters: the target type and the type we have. The rules to judge compatibility are as follows:

1. The type uninitialized is not compatible with any type, that is, compatible(t, uninitialized) always returns false.
2. The type error type is not compatible with any type, that is, compatible(t, error type) always returns false.
3. A subtype is compatible with its parent types.
4. If the compiler can automatically convert the source type to the target type, for example from char to int, they are compatible.
5. Two function types are compatible iff the following conditions hold: (1) they have exactly the same number of parameters; (2) the type of each parameter of one function is compatible with that of the corresponding parameter of the other; (3) the size of each parameter of one function equals that of the
corresponding parameter of the other; (4) their return types are compatible; (5) their return types have the same sizes.
6. In all other cases, the two given types are incompatible.

When compatible is called, it applies the rules listed above one by one to test the compatibility of the input types. If one rule cannot give an answer, the next one is used.
3.5 Updating Aliases
Pointer p is an alias of object o if p references o. dtype(p) should be updated when the dynamic type of o changes; otherwise type inconsistencies will result. To prevent this situation, we must update the dtypes of the elements in alias(e) by calling update_alias(e) whenever program element e is updated. Since update_alias(e) is itself recursive, all affected elements have their dtypes updated in one call.
4 Some Practical Problems

4.1 Relaxing the Rules
Although we can flatten structures when necessary, and we do not consider the labels of structure members when we perform type checking, the rules listed in Figure 3 may still be too restrictive for users. For example, some users may use array padding in their programs (see the ColorPoint example of [6]), so we add the following rules to the compatible rules list:

1. char[n] is compatible with type t iff sizeof(t) = n.
2. Two structure types are compatible if they can be divided into one or more parts, the size of each part of one structure equals that of the corresponding part of the other, and the type of each part of one structure is compatible with that of the corresponding part of the other.
4.2 Variable Argument Function
The rules for type checking of functions presented above cannot handle variable argument functions, because the number of arguments of such a function is variable. However, looking into an implementation of variable argument functions makes it clear that a variable argument function has little more than an ordinary function except for several wrapper macros which expand to assignment and pointer arithmetic statements. The C compiler puts the variable arguments in a variable argument list and passes it to the called function. The function uses a series of macros to extract arguments from the list and casts them to destination types. Here is an implementation of these macros that appears in Microsoft C:
typedef char *va_list;
#define _INTSIZEOF(n)   ((sizeof(n)+sizeof(int)-1)&~(sizeof(int)-1))
#define va_start(ap,v)  (ap=(va_list)&v + _INTSIZEOF(v))
#define va_arg(ap,t)    (*(t *)((ap += _INTSIZEOF(t))-_INTSIZEOF(t)))
#define va_end(ap)      (ap = (va_list)0)
Among these macros, va_start is used to adjust the pointer to the variable argument list and make it point to the first variable argument; va_arg is used to extract one argument from the list, cast it to the destination type, and move the pointer forward to the next argument in the list. _INTSIZEOF is used to align addresses. We can see from the code that these macros are just pointer arithmetic and casts. If the dynamic types of these arguments are provided, we can also perform dynamic type checking for variable argument functions. The dynamic types can be provided in this way: collect the dtype of each actual parameter and assemble them in an array. This array is passed to the called function with the other arguments. When an argument is extracted from the argument list, its type is also extracted from the array and type checking is performed. Then the dtypes of the arguments propagate as for ordinary functions.
4.3 External Libraries
Almost all C programs call functions from libraries. Variables defined in external libraries may also be used in C programs. Because dynamic types do not propagate inside library functions, some information is lost after calling them. The dtypes of referenced variables may no longer be accurate, and we can only assume that the dtypes of all variables referenced by these functions have not changed. This can result in false alarms, and many errors may go undetected. The problem can be solved by recompiling the library if its source code is available. In many cases, however, the source code cannot be obtained, and then we can only rely on careful programming.
5 Experimental Results
We implement the dynamic type checking system in the lightweight C compiler lcc [2] by modifying it; the version we use for the implementation is 4.2. The checking system, which is linked into every user program it compiles, is implemented as part of the compiler libraries. The necessary variable declarations and the code for dynamic type initialization, propagation, and checking are inserted by the modified lcc compiler. We apply the compiler to a set of small programs, including various sorting algorithms and the maze problem, to evaluate its performance; the results are shown in Table 5. Among the sorting algorithms, "Insertion1" is the direct insertion sorting algorithm, "Insertion2" is the binary insertion sorting algorithm, "Merge1" is the iterative version of merge sorting, "Merge2" is the recursive version of merge sorting, and the others are self-explanatory.
Securing C Programs by Dynamic Type Checking
353
Table 5. Performance and memory consumption measurements for small programs

  Program     Base time  Base size  Time     Size   Slowdown1  Aug1  Slowdown2  Aug2
  Insertion1  6.15       884        1450.54  4216   235.86     4.77  300.96     4.17
  Insertion2  6.29       884        1428.96  4216   227.18     4.77  266.26     4.18
  Quick       0.01       892        1.15     4220   115.00     4.73  93.00      4.14
  Shell       5.48       884        1109.26  4212   202.42     4.76  231.48     4.13
  Selection   6.03       884        708.53   4216   117.50     4.77  118.67     4.18
  Bubble      21.44      888        3502.22  4216   163.35     4.75  183.95     4.13
  Merge1      0.02       1084       2.10     3576   105.00     3.30  143.00     3.57
  Merge2      2.68       1164       140.35   8344   52.37      7.17  41.58      7.06
  Heap        0.02       892        1.81     4228   90.50      4.74  95.50      4.14
  Maze        12.97      16740      357.45   43248  27.56      2.58  30.12      2.12
Table 6. Error detection results when applying the tool to a test suite

  Bug Description                                        Detection Result
  Reading uninitialized locals                           Yes
  Reading uninitialized data on heap                     Yes
  Writing overflowed buffer on heap                      Yes
  Writing overflowed buffer on stack                     Yes
  Writing to unallocated memory                          Yes
  Returning stack object                                 No
  Overwriting ending zero of string                      No
  Function pointer with wrong number of arguments        No
  Function pointer with wrong returning type             Yes
  Vararg with wrong type of arguments                    Yes
  Vararg with wrong number of arguments                  Yes
  Bad union access/part of an object is uninitialized    Yes
  Bad union access/a complete uninitialized object       Yes
  Memory leakage                                         No
  Second free                                            No
  Bad type casts                                         Yes
All data are collected on a 2.0 GHz Pentium 4 with 256 MB of memory, running Mandrake Linux Limited Edition 2005 (kernel 2.6.11). In the table we also list the results of the tool of Loginov et al., taken from Wang et al. [3], for comparison. The columns "Base time" and "Base size" list the running time in seconds and the memory consumption in KB of each program compiled by the original lcc compiler; the columns "Time" and "Size" list the running time and memory consumption of the program compiled by the modified lcc compiler; the columns "Slowdown1" and "Aug1" list the running-time slowdown and memory-consumption augmentation of the "dynamically checked version" of each program; "Slowdown2" and "Aug2" list the corresponding data for the tool of Loginov et al. We also apply a test suite of small programs to evaluate the error detection ability of our compiler; each program in the suite contains one or more bugs caused by misuse of a flexible feature of C. The results are shown in Table 6.
6 Conclusion
The dynamic type checking method we presented in this paper is effective in detecting bugs caused by misuse of flexible features of C, and the overhead it brings is tolerable. Since many of these bugs are related to system vulnerabilities, the method contributes significantly to enhancing system security.
References

1. Alexey Loginov, Suan Yong, Susan Horwitz, and Thomas Reps. Debugging via run-time type checking. In Proceedings of the Conference on Fundamental Approaches to Software Engineering, pp. 217-232, 2001.
2. David R. Hanson, Christopher W. Fraser. A Retargetable C Compiler. Addison Wesley, 1995.
3. Jimin Wang, Lingdi Ping, Xuezeng Pan, Haibin Shen, and Xiaolang Yan. Tools to make C programs safe: a deeper study. Journal of Zhejiang University SCIENCE, Vol. 6A, No. 1, pp. 63-70, 2005.
4. Julian Seward. Valgrind, an open-source memory debugger for x86-GNU/Linux. Technical report, http://valgrind.kde.org/, 2003.
5. Michael Burrows, Stephen Freund, and Janet Wiener. Run-time type checking for binary programs. In International Conference on Compiler Construction, 2003.
6. Michael Siff, Satish Chandra, Thomas Ball, Krishna Kunchithapadam, and Thomas Reps. Coping with type casts in C. Lecture Notes in Computer Science, 1687:180-198, 1999.
7. Reed Hastings and Bob Joyce. Purify: fast detection of memory leaks and access errors. In Proceedings of the Winter USENIX Conference, 1992.
8. Satish Chandra and Thomas Reps. Physical type checking for C. In Proceedings of the ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, volume 24.5 of Software Engineering Notes (SEN), pp. 66-75, 1999.
9. Umesh Shankar, Kunal Talwar, Jeffrey S. Foster, and David Wagner. Automated detection of format-string vulnerabilities using type qualifiers. In Proceedings of the 10th USENIX Security Symposium, Washington, DC, 2001.
A Chaos-Based Robust Software Watermarking

Fenlin Liu, Bin Lu, and Xiangyang Luo

Information Engineering Institute, Information Engineering University,
Zhengzhou, Henan Province, 450002, China
[email protected], [email protected], [email protected]
Abstract. In this paper we propose a robust software watermarking based on chaos that addresses several limitations of existing software watermarking. The algorithm combines an anti-reverse engineering technique, a chaotic system, and the idea of Easter Egg software watermarks. Global protection for the program is provided by dispersing the watermark over the whole code of the program with chaotic dispersion coding; the resistance against reverse engineering is improved by using the anti-reverse engineering technique. In the paper, we implement the scheme on the Intel i386 architecture and the Windows operating system, and analyze the robustness and the performance degradation of the watermarked program. The analysis indicates that the algorithm resists various types of semantics-preserving transformation attacks and has good tolerance of reverse engineering attacks.
1 Introduction
Software piracy has received an increasing amount of interest from the research community [1, 2, 3]. Nowadays, software developers are mainly responsible for copyright protection themselves, using encryption, license numbers, key files, dongles, etc. [1, 4]. These techniques are vulnerable to crack attacks and make it hard to carry out pirate tracing. Moreover, software developers have to spend much time, many resources, and much effort on copyright protection. If there were a reliable system of software protection, comparable to a cryptosystem, software based on that system could be protected to a certain extent, and software developers could devote most of their resources and efforts to developing the software itself, without spending resources and effort on intellectual property protection. Software watermarking is an aspiring attempt in this direction [5].

There are several published techniques for software watermarking. However, no single watermarking algorithm has emerged that is effective against all existing and known attacks. The scheme of Davidson et al. [6] statically encodes the watermark in the ordering of the basic blocks that constitute the program; it is easily subverted by permuting the order of the blocks. A comparable spread spectrum technique was introduced by Stern et al. [7] for embedding a watermark by modifying the frequencies of instructions in the program. This scheme is robust to various types of signal processing; however, the data-rate is low and the scheme is easily subverted by inserting redundant instructions, code optimization, etc. Exploiting pointer aliasing effects, Collberg et al. [8] first proposed

K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 355-366, 2006.
© Springer-Verlag Berlin Heidelberg 2006
F. Liu, B. Lu, and X. Luo
dynamic software watermarking, which embeds the watermark in the topology of a data structure that is built on the heap at runtime given some secret input sequence to the program. This scheme is vulnerable to any attack that is able to modify the pointer topology of the program's fundamental data types. Cousot et al. [9] embed the watermark in local variables, and the watermark can be detected even if only part of the watermarked program is present. This scheme can be attacked by obfuscating the program such that the local variables representing the watermark cannot be located, or such that the abstract interpreter cannot determine what values are assigned to those local variables. Nagra et al. [5] proposed thread-based watermarking, with the premise that multithreaded programs are inherently more difficult to analyze and that the difficulty of analysis increases with the number of threads that are "live" concurrently. But the scheme needs to introduce a number of threads, and the resulting degradation of performance is not ignorable. In general, existing schemes have the following limitations: (A) the assumed threat model is almost always based on automated attacks (i.e., code optimization, obfuscation, data reconstruction, and so on), but hardly on manual attacks (such as reverse engineering attacks). (B) The watermark is embedded in just one module of the program, so that not all modules are protected and cropping attacks cannot be resisted. (C) The watermark is embedded in the source code; because of the recompilation this requires, the efficiency of embedding is rather low, especially for fingerprinting. (D) In the embedding procedure, programmers have to take on all the work, especially the complex watermark construction and embedding, so that watermarking is not always feasible.
This paper designs a new scheme that integrates a chaotic system, anti-reverse engineering techniques, and the idea of Easter Egg software watermarks: Chaos-based Robust Software Watermarking (CRSW). CRSW retains the facility and feasibility of Easter Egg software watermarks while resisting various types of semantics-preserving code transformation attacks. By involving a chaotic system, dispersing the watermark over the whole code provides global protection for the program. Furthermore, by involving anti-reverse engineering techniques, the resistance against reverse engineering attacks is improved. In addition, CRSW embeds the watermark into the executable code directly; the watermarked program need not be recompiled, which improves embedding efficiency. The analysis of the proposed algorithm shows that CRSW resists various types of semantics-preserving transformations, has good tolerance against reverse engineering attacks, and causes modest performance degradation.
2
The Structure of CRSW
Easter Egg watermarks, a kind of dynamic software watermark, are among the most widely used watermarking techniques [1, 8]. This watermarking, in essence, directly embeds a watermark detector (or extractor) into the program. When the special input sequence is received, the detector (or extractor) is activated, and the watermark extracted from the watermarked program is displayed in a visual way. Thus, the semantics of the detecting (or extracting) procedure is included in the semantics of the watermarked program, so that Easter Egg watermarks resist various semantics-preserving transformation attacks [10]. The main problem with Easter Egg watermarks is that once the right input sequence has been discovered, standard debugging techniques allow the adversary to locate the watermark in the executable code and then remove or disable it [8]. Moreover, the watermark is typically embedded in just one piece of the program; hence, cropping a particularly valuable module from the program for illegal reuse is likely to be a successful attack [10].
In this paper, building on the idea of Easter Egg software watermarks, we attempt to propose a more robust and feasible software watermark: CRSW. The watermark consists of 4 essential parts: the watermark W, the Input Monitoring Module Cm, the Watermark Decoding Module Cd, and the Anti-reverse Engineering Module Ca. Unlike other watermarking schemes, CRSW not only embeds the watermark W into the program, but also embeds Cm, Cd, Ca in the form of executable code into the program. In this section, we expatiate on the structure and the interrelations of the CRSW embedding code, which includes Cm, Cd, Ca (see Fig. 1).
Formally, let P be the considered program and {α1, α2, ...} be the set of acceptable inputs of the program. P′ = T(P, W, Cm, Cd, Ca) is the watermarked program (T is the watermarking transformation), and the extracting procedure is W̃ = D(Γ(P′)), where D is the extracting transformation and Γ is the code transformation: if the watermarked program has been attacked, Γ represents the attacking transformation; otherwise, Γ is the identical transformation. If D(Γ(P′)) ≡cp W holds, T resists Γ, where ≡cp is a user-defined equality relationship.
The Input Monitoring Module realizes the mapping Ψ: {α1, α2, ...} → {0, 1}. If Ψ(αi) = 1 holds, Cd is activated. α ∈ Σ = {αi | Ψ(αi) = 1} is defined as an activation key. To describe the Watermark Decoding Module clearly, we briefly describe the watermark embedding procedure.
Firstly, preprocess W: W′ = E(W, G), where G is a digital chaotic system; then embed W′ into the code of the program with chaotic dispersion coding, obtaining the code Iew = Ω(W′, I, G), where I is the code of the program (we discuss chaotic dispersion coding in the next section). The Watermark Decoding Module extracts the watermark W̃ from the watermarked program and performs it in the form of a visual action. The module consists of the Watermark Output Module (Cdo) and the Chaotic System Module (Cdc).
Fig. 1. The structure and interrelations of CRSW embedding code
In the extracting procedure, firstly, the module extracts W′ with reverse chaotic dispersion coding (W′ = Ω^{-1}(Iew, G)); then it gets the watermark W̃ by W̃ = E^{-1}(W′, G); at last, Cdo transforms W̃ into the visual action V_W̃ and displays V_W̃ to users.
The Anti-reverse Engineering Module, which consists of the Anti-static Analyzing Module (Cas) and the Anti-dynamic Debugging Module (Cad), protects Cm and Cd from reverse engineering attacks. Cas applies anti-static analyzing techniques, and Cad applies anti-dynamic debugging techniques.
3
Embedding and Extraction of the CRSW
In this section, we discuss how to embed W, Cm, Cd and Ca into P; the construction of Cm, Cd and Ca will be described in the next section. We present chaotic substitution and chaotic dispersion coding before describing the embedding and extraction.
3.1
Chaotic Substitution ∂
Chaotic substitution replaces i with c (c, i are two 8-bit binary integers); as a result, the value at i's position equals c, and we obtain the save byte s. With c and s, the original value of i is recovered by reverse chaotic substitution. Let G be a digital chaotic system; without loss of generality, let the state space of G be [a, b). Chaotic substitution can then be expressed by

s = ∂(i, c, G) = ⌊2^8 × (G(x, m) − a)/(b − a)⌋ ⊕ i,  x = c(b − a)/2^8 + a,  m = ⌊c/λ⌋ + 1   (1)

where ⊕ is XOR, G(x, m) is the state of G after being iterated m times from the initial value x, and λ is a parameter that adjusts the number of iterations. Reverse chaotic substitution is given by:

i = ∂^{-1}(s, c, G) = ∂(s, c, G)   (2)
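As an illustrative sketch (not the paper's exact construction), the following Python models ∂ with a 1D piecewise linear chaotic map standing in for G on the state space [a, b) = [0, 1); the map parameter p and the value of λ are assumptions made for demonstration:

```python
def pwlcm(x, p=0.37):
    # 1D piecewise linear chaotic map on [0, 1) (stand-in for G)
    if x >= 0.5:
        x = 1.0 - x
    return x / p if x < p else (x - p) / (0.5 - p)

def keystream_byte(c, lam=16, a=0.0, b=1.0):
    # x = c(b - a)/2^8 + a,  m = floor(c/lam) + 1  -- per Eq. (1)
    x = c * (b - a) / 256.0 + a
    for _ in range(c // lam + 1):
        x = pwlcm(x)
    # floor(2^8 (G(x, m) - a)/(b - a)), clamped into one byte
    return min(int(256 * (x - a) / (b - a)), 255)

def chaotic_sub(i, c, lam=16):
    # s = keystream XOR i  -- Eq. (1); the same call inverts it, Eq. (2)
    return keystream_byte(c, lam) ^ i
```

Since the keystream byte depends only on c, applying ∂ twice with the same c restores i, which is exactly the self-inverse property of Eq. (2).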
Generally, the set A = {a1, a2, ..., ak} replaces B = {b1, b2, ..., bk} with chaotic substitution, and we get the result:

R = {rj} = ∂(B, A, G) = {∂(bj, aj, G)}, j = 1, 2, ..., k   (3)
The reverse procedure is given by

B = ∂^{-1}(R, A, G) = ∂(R, A, G)   (4)

3.2
Chaotic Dispersion Coding ξ
Let X = {x1, x2, ..., xn} be a chaotic sequence; without loss of generality, suppose xj ∈ [a, b), j = 1, 2, ..., n. Chaotic dispersion coding disperses W over I (the code of the program), and is given by

[I′, S] = ξ(W, I, X)   (5)

where I′ is the resulting code and S is the save code.
Let the length of W be n bytes and the length of I be l bytes. Thus W = {w1, w2, ..., wn}, I = {i1, i2, ..., il}, S = {s1, s2, ..., sn}. The steps of ξ are as follows:
1) Initialization: L ← l, N ← n, m ← ⌊L/N⌋, j ← 1, d ← 0, I′ = {i′1, i′2, ..., i′l} = I
2) Let r = ⌊m × (xj − a)/(b − a)⌋, d = d + r, sj = ∂(i′d, wj, G), i′d = wj
3) The algorithm terminates if j = n is satisfied; otherwise go to 4)
4) Let L = L − r, N = N − 1, m = ⌊L/N⌋, j = j + 1, go to 2).
When the algorithm terminates, S = {s1, s2, ..., sn} and I′ = {i′1, i′2, ..., i′l}. Reverse chaotic dispersion coding, which recovers W and I from S and I′, can be expressed as: [I, W] = ξ^{-1}(S, I′, X).
3.3
Embedding
In CRSW, all of W, Cm, Cd, Ca are embedded into the executable code directly; the procedure is described below (Fig. 2 shows the change of the executable code after embedding):
Fig. 2. The drawing of embedding watermark
1) Given the key ⟨K1, K2⟩, where K1 is the activation key and K2 is the key for producing the chaotic sequence. Supposing the length of the watermark is n bytes, W can be expressed as {w1, w2, ..., wn}.
2) Construct the Watermark Decoding Module Cd and the Anti-reverse Engineering Module Ca; construct the Input Monitoring Module Cm with K1 (the details of these constructions are discussed in the next section).
3) Produce the chaotic sequence X = {x1, x2, ...}.
4) Apply chaotic substitution to embed Cm, Cd, Ca into the code of P. Let the code blocks that are replaced with Cm, Cd, Ca be Im, Id, Ia respectively. We get Sm = ∂(Im, Cm, G), Sd = ∂(Id, Cd, G), Sa = ∂(Ia, Ca, G), where G is the digital chaotic system.
5) Get the subsequence X(1) (of length n) from X and preprocess W: W′ = E(W, X(1)) = W ⊕ X(1) = {w1 ⊕ x1(1), w2 ⊕ x2(1), ..., wn ⊕ xn(1)}, where ⊕ is XOR and xj(1) denotes the j-th element of X(1).
6) Get the subsequence X(2) (of length n) from X; embed W′ into I (the whole code exclusive of the code that is replaced with Cm, Cd, Ca) with chaotic dispersion coding, obtaining [I′, SW] = ξ(W′, I, X(2)) (Fig. 2 shows the distribution of W′ in the watermarked program).
7) Save Sm, Sd, Sa and SW to the end of the executable code, and adjust the header of the executable code.
3.4
Extraction
Because the watermark extractor is embedded into the program, the extraction of the watermark is included in the execution of the watermarked program. We describe the execution of the watermarked program to illustrate the extraction.
1) The watermarked program runs.
2) The code of the Anti-reverse Engineering Module runs.
3) The code of the Input Monitoring Module runs, which monitors the input of the program.
4) Produce the chaotic sequence Y.
5) Get the subsequence Y(2) (of length n; Y(2) is the same as X(2) in the embedding algorithm) from Y, and recover the code that was replaced with W′; the procedure can be expressed by [I, W̃′] = ξ^{-1}(SW, I′, Y(2)).
6) Recover the code that was replaced with Cm, Cd, Ca; the procedure can be expressed by Im = ∂^{-1}(Sm, Cm, G), Id = ∂^{-1}(Sd, Cd, G), Ia = ∂^{-1}(Sa, Ca, G).
7) The watermarked program keeps on running.
8) If the input matches K1 (the activation key), get the subsequence Y(1) (of length n; Y(1) is the same as X(1) in the embedding algorithm) from Y, and put W̃′ through the inverse preprocessing to obtain W̃: W̃ = E^{-1}(W̃′, Y(1)) = E(W̃′, Y(1)).
9) Transform W̃ into V_W̃ (the visual action) and perform V_W̃.
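The dispersion and recovery pair used in embedding step 6 and extraction step 5 can be sketched as follows. This is a simplified model: the saved bytes are stored directly rather than through the chaotic substitution ∂, and the chaotic sequence X is supplied as plain floats in [0, 1); all names are illustrative:

```python
def disperse(W, I, X):
    """xi(W, I, X): scatter watermark bytes W over code bytes I at
    positions driven by chaotic values X in [0, 1). Returns the
    modified code I2 and the save bytes S."""
    I2, S = list(I), []
    L, N, d = len(I), len(W), -1
    for j in range(len(W)):
        m = L // N
        r = max(1, int(m * X[j]))  # step size from the chaotic value
        d += r
        S.append(I2[d])            # save the overwritten byte
        I2[d] = W[j]
        L, N = L - r, N - 1
    return I2, S

def recover(I2, S, X, n):
    """Inverse of disperse: walk the same chaotic positions,
    read the watermark back and restore the original code."""
    I, W = list(I2), []
    L, N, d = len(I2), n, -1
    for j in range(n):
        m = L // N
        r = max(1, int(m * X[j]))
        d += r
        W.append(I[d])
        I[d] = S[j]
        L, N = L - r, N - 1
    return I, W
```

Because both directions recompute identical positions from the same chaotic sequence, the extractor needs only S and the sequence key, never a stored position table.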
4
The Analysis of CRSW
This section discusses the robustness of CRSW and the performance degradation. Let the lengths of W, Cm, Ca, and Cd be n bytes, lm bytes, la bytes and ld bytes respectively.
Firstly, we analyze the robustness. Let R_P be the semantics of P, and let ω ∈ Γb = {ϕ | R_ϕ(P) = R_P} be a semantics-preserving transformation. In CRSW, because of the visual output V_W, R_VW ⊆ R_P holds. Then the following relation holds according to the definition of semantics-preserving transformation:

R_VW ⊆ R_P = R_ω(P)   (6)
Equation (6) indicates that semantics-preserving transformations cannot destroy the semantics of V_W, so CRSW can resist various types of semantics-preserving transformation attacks except attacks that can distinguish R_VW from R_P.
In the Anti-reverse Engineering Module, anti-static analyzing techniques and anti-dynamic debugging techniques are introduced to thwart reverse engineering attacks. The performance of the resistance against reverse engineering depends on the anti-reverse engineering techniques applied in CRSW. As the module is dynamic and scalable, more effective anti-reverse engineering techniques can be applied to it, and the resistance against reverse engineering will be enhanced. Moreover, the watermark is embedded into the code of the program by chaotic dispersion coding; by mixing instructions and data, this further improves the resistance to static analysis.
Because of the application of chaotic dispersion coding, the watermark will cause the program to fail if the adversary tries to reuse any part of the code alone. Since W is distributed uniformly over the code (the whole code exclusive of the code that is replaced with Cm, Cd and Ca), there is on average one byte of watermark per (l − lm − ld − la)/n bytes of code. Thus:

lv = (l − lm − ld − la)/n   (7)
where lv is the average length of the reusable code. If lv ≤ lT is to be ensured, n, the length of the watermark, must satisfy the inequation n ≥ (l − lm − ld − la)/lT.
It is difficult to locate the watermark because the positions of W are generated by a chaotic sequence. In addition, s = ∂(i, c, G) (chaotic substitution) can be viewed as encrypting i with G and c. If c is tampered with, i cannot be decoded correctly by i = ∂^{-1}(s, c, G). In CRSW, if W is tampered with, it is impossible to recover the code that was replaced with W correctly in the extracting procedure; if Cm, Ca or Cd is tampered with, it is likewise impossible to get back the code that was replaced with Cm, Ca and Cd correctly, which causes the program to fail. For a given G, c is assumed to be the secret key, so the key space for one byte is 2^lc (lc is the length of c); the key space is 2^{8(n+lm+ld+la)} in CRSW.
We analyze the performance degradation of the watermarked program below. From the point of view of space, embedding the watermark increases the size of the program.
In the embedding procedure, the size of the program increases by n + lm + ld + la bytes because of the chaotic substitution applied in our algorithm. From the point of view of runtime, embedding the watermark increases the running time of the program. The reason is that before the execution of the watermarked program, the original code must be recovered from Sm, Sd, Sa and SW, and the recovery time depends not only on the iteration efficiency of the digital chaotic system, but also on the contents of W, Cm, Ca and Cd. Let t be the time of one iteration. Then T1, the time of recovering code from Sm, Sd and Sa, satisfies the following inequation:

(lm + ld + la)t ≤ T1 ≤ (2^8/λ)(lm + ld + la)t   (8)
With recovering code from SW , chaotic sequence of n bytes should be generated for ξ −1 at first. Thus T2 , the time of recovering code from SW , satisfies: nt + nt≤T2 ≤nt +
28 nt λ
(9)
T1 + T2, the time of recovering all the code, satisfies

2nt + (lm + ld + la)t ≤ T1 + T2 ≤ nt + (2^8/λ)(n + lm + ld + la)t   (10)
If W, Cm, Ca and Cd are bit-balanced (bits 0 and 1 occur at the same frequency), the average time of the procedure is

T = nt + ((2^7 + 0.5)/λ)(n + lm + ld + la)t   (11)
In general, Cm, Ca and Cd are fixed, that is to say, lm + ld + la is constant, and t is also a constant for a given digital chaotic system, so equation (11) can be rewritten as follows:

T = nt(1 + (2^7 + 0.5)/λ) + ((2^7 + 0.5)/λ)(lm + ld + la)t = β1·n + β2   (12)
where β1, β2 are constants. Equation (7) shows that the larger n is, the smaller lv is, and the more intensive the protection is. Equation (12) shows that T increases linearly with n. Users who intend to apply CRSW should therefore make a trade-off between the intensity of protection and the performance degradation.
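To make the trade-off concrete, the following sketch evaluates the average recovery time of Eq. (11); the values of t, λ and the module sizes are invented purely for illustration:

```python
def avg_recovery_time(n, lm=1024, ld=2048, la=1024, t=1e-6, lam=16):
    # T = nt + ((2^7 + 0.5)/lambda) * (n + lm + ld + la) * t  -- Eq. (11)
    # lm, ld, la: module sizes in bytes; t: time per chaotic iteration
    return n * t + ((2**7 + 0.5) / lam) * (n + lm + ld + la) * t

# Doubling the watermark length n increases T by a fixed increment,
# i.e. T = beta1 * n + beta2 as in Eq. (12).
```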
5
Implementation of CRSW
The algorithm is implemented on the Intel i386 architecture under the Windows operating system. This section expatiates on the implementation of the Input Monitoring Module, the Anti-reverse Engineering Module, and the Watermark Decoding Module. Several problems arise when implementing these modules; the corresponding solutions are given at the end of this section.
5.1
Input Monitoring Module Cm
The purpose of Cm is to monitor the input of the program. When implementing Cm, we put the activation key K1 (or µ(K1), where µ is a one-way function) into this module. When the input α matches K1 (or µ(α) matches µ(K1)), the Watermark Decoding Module is activated.
5.2
Anti-reverse Engineering Module Ca
In theory, a sufficiently determined attacker can thoroughly analyze any software by reverse engineering; it is impossible to thwart reverse engineering attacks completely. The goal, then, is to design watermarking techniques that are "expensive enough" to break (in time, effort, or resources) that for most attackers, breaking them is not worth the trouble. Reverse engineering techniques fall into two kinds: static analyzing and dynamic debugging. Therefore, Ca consists of the Anti-Static Analyzing Module and the Anti-Dynamic Debugging Module.
Decompilation is the foundation of static analyzing techniques, so we can disable static analyzing by disturbing the decompiler, which is developed on the hypothesis that data and instructions are separated. However, data and instructions in the Von Neumann architecture are indistinguishable. Thus, we can mix data and instructions in order to disturb the decompiler by adding special data and instructions (we call them disturbing data) between instructions. In Fig. 3, (a) gives the source code in assembly language: lines 1, 5, 6 are original instructions, while lines 2, 3, 4 are disturbing data. (b) shows the instructions produced by the decompiler; we can see that there are errors from line 4 to the end. A number of kinds of disturbing data are described in [4]. In this paper, we insert several pieces of disturbing data into Cm, Ca and Cd. If code encryption, compression, etc. are also applied to the Anti-Static Analyzing Module, its performance will be further improved.
Dynamic debugging relies highly on debugging tools, so the general principle of anti-virus techniques can be introduced to detect, through the characteristics of the debugging tools, whether the program is being debugged. If it is being debugged, the program jumps to a wrong control flow in order to prevent debugging. We have implemented this detection based on the characteristics of SoftICE, Windbg, and Ollydbg; experiments demonstrate that it can resist these debugging tools.
The characteristics of other debugging tools can be incorporated into an improved implementation.
Fig. 3. Example of disturbing data
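The effect shown in Fig. 3 can be simulated with a toy linear-sweep disassembler over a made-up two-instruction encoding (this miniature encoding is an illustration, not real i386): prepending a lone "call" opcode byte as disturbing data makes the sweep consume the first genuine instruction as an operand and misread the listing:

```python
def linear_sweep(code):
    # toy disassembler: 0x90 -> NOP (1 byte), 0xE8 imm -> CALL (2 bytes)
    out, i = [], 0
    while i < len(code):
        if code[i] == 0x90:
            out.append("nop"); i += 1
        elif code[i] == 0xE8 and i + 1 < len(code):
            out.append(f"call {code[i+1]:#04x}"); i += 2
        else:
            out.append(f"db {code[i]:#04x}"); i += 1
    return out

clean = [0x90, 0x90, 0xE8, 0x05]   # nop; nop; call 0x05
disturbed = [0xE8] + clean         # one disturbing data byte prepended
```

On the clean bytes the sweep recovers the intended listing; on the disturbed bytes the first real `nop` is swallowed as the operand of a phantom `call`, so the listing no longer matches the true instructions.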
364
F. Liu, B. Lu, and X. Luo
There are several debug registers in processors of the i386 architecture. Several debugging tools implement useful functions, such as BPM¹ and hardware breakpoints, by using the debug registers [4]. In this paper, we modify the values of the debug registers and invalidate these functions. In addition, time-sensitive code and breakpoint detection are introduced into the implementation.
We have involved several kinds of anti-reverse engineering techniques. It is worth mentioning that this module is scalable, so more efficient anti-reverse engineering techniques can be introduced; they can further enhance the resistance against reverse engineering attacks.
5.3
Watermark Decoding Module Cd
The Watermark Decoding Module includes the Watermark Output Module and the Chaotic System Module. The Watermark Output Module transforms the watermark extracted from the watermarked program into visual output; the Chaotic System Module implements the digital chaotic system, which is applied only in the watermark extracting procedure. When chaotic systems are realized discretely in finite precision, serious problems arise, such as dynamical degradation, short cycle length and non-ideal distribution, so the dynamical degradation of the digital chaotic system must be compensated for. We apply 1D piecewise linear chaotic maps, together with the compensation scheme for degradation of [11], in our implementation.
5.4
Problems and Solutions
Because the watermark is embedded directly into the executable code, two problems arise when implementing Cm, Ca and Cd:
(1) After a module is embedded into various executable code, its code and data are loaded at different memory addresses, so the code cannot access its data in memory correctly. Therefore, it must do self-location (locate its memory address by itself).
(2) Since no recompilation takes place after embedding, the modules cannot rely on the compiler and loader to resolve the addresses of Windows APIs automatically, but must get the addresses by themselves.
The self-location of code and data can be implemented by call/pop/sub instructions. Fig. 4 gives the specific code. The register EBX is used to save the difference between the loading address and the design-time address; the loading address is the sum of EBX and the design-time address, which accomplishes self-location.
The procedure for getting the addresses of Windows APIs is as follows:
1) Get the loading base address of kernel32.dll. Windows provides structured exception handling (SEH): all exception handling functions are kept in a linked list, and the last element of the list is the default exception handling function, which resides in the module kernel32.dll. We can obtain the address of the default exception handling function by traversing the linked list, and from it we can get the loading base address of kernel32.dll.
¹ BPM, an instruction of SoftICE, can set a breakpoint on memory access or execution.
Fig. 4. The code of self-location
2) Get the addresses of LoadLibrary and GetProcAddress, which are Windows APIs, from the export table of kernel32.dll using its loading base address.
3) Get the address of an arbitrary Windows API with LoadLibrary and GetProcAddress.
6
Conclusion
A chaos-based robust software watermarking algorithm (CRSW) is proposed in this paper, in which anti-reverse engineering techniques and a chaotic system are combined with the idea of Easter Egg software watermarks. In CRSW, the Anti-reverse Engineering Module is open and scalable, so more efficient anti-reverse engineering techniques can be applied. The program is protected by embedding the watermark into the entire code with chaotic dispersion coding. It is difficult for an adversary to tamper with the data (including W, Cm, Cd and Ca) embedded in the program with chaotic substitution. The analysis of CRSW shows that the scheme can thwart various types of semantics-preserving transformation attacks, such as dead code wiping, code optimization, code obfuscation, and variable reconstruction. Furthermore, it improves the resistance against reverse engineering attacks to a certain extent.
Acknowledgement
The work is supported partially by the National Natural Science Foundation of China (Grant No. 60374004), partially by the Henan Science Fund for Distinguished Young Scholars (Grant No. 0412000200), partially by HAIPURT (Grant No. 2001KYCX008), and by the Science-Technology Project of Henan Province of China.
References
1. C. Collberg, C. Thomborson. Watermarking, tamper-proofing, and obfuscation: tools for software protection. IEEE Trans. Software Engineering, Vol. 28, No. 8, pages 735-746.
2. Zhang Lihe, Yang YiXian, Niu Xinxin, Niu Shaozhang. A Survey on Software Watermarking. Journal of Software, Vol. 14, No. 2, pages 268-277 (in Chinese).
3. Business Software Alliance. Eighth annual BSA global software piracy study: Trends in software piracy 1994-2002, June 2003.
4. Kan X. Encryption and Decryption: Software Protection Technique and Complete Resolvent. Beijing: Electronic Engineering Publishing Company, 2001 (in Chinese).
5. Jasvir Nagra and Clark Thomborson. Threading software watermarks. In 6th Workshop on Information Hiding, 2004, pages 208-223.
6. Robert L. Davidson and Nathan Myhrvold. Method and system for generating and auditing a signature for a computer program. US Patent 5,559,884, September 1996. Assignee: Microsoft Corporation.
7. Julien P. Stern, Gaël Hachez, François Koeune, and Jean-Jacques Quisquater. Robust object watermarking: Application to code. In 3rd International Information Hiding Workshop, 1999, pages 368-378.
8. C. Collberg and C. Thomborson. Software watermarking: Models and dynamic embeddings. In Proceedings of POPL'99, the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1999, pages 311-324.
9. Patrick Cousot and Radhia Cousot. An abstract interpretation-based framework for software watermarking. In ACM Principles of Programming Languages (POPL'04), Venice, Italy, 2004, pages 173-185.
10. Christian Collberg, Andrew Huntwork, Edward Carter, and Gregg Townsend. Graph theoretic software watermarks: Implementation, analysis, and attacks. In 6th Workshop on Information Hiding, 2004, pages 192-207.
11. Liu Bin, Zhang Yongqiang, and Liu Fenlin. A New Scheme on Perturbing Digital Chaotic Systems. Computer Science, Vol. 32, No. 4, 2005, pages 71-74 (in Chinese).
Privately Retrieve Data from Large Databases Qianhong Wu1 , Yi Mu1 , Willy Susilo1 , and Fangguo Zhang2 1
Center for Information Security Research, School of Information Technology and Computer Science, University of Wollongong, Wollongong NSW 2522, Australia {qhw, ymu, wsusilo}@uow.edu.au 2 School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510275, Guangdong Province, P.R. China [email protected]
Abstract. We propose a general efficient transformation from Private Information Retrieval (PIR) to Symmetrically Private Information Retrieval (SPIR). Unlike existing schemes using inefficient zero-knowledge proofs, our transformation exploits an efficient construction of Oblivious Transfer (OT) to reduce the communication complexity, which is a main goal of PIR and SPIR. The proposed SPIR enjoys almost the same communication complexity as the underlying PIR. As an independent interest, we propose a novel homomorphic public-key cryptosystem derived from the Okamoto-Uchiyama cryptosystem and prove its security. The new homomorphic cryptosystem has the additional useful advantage of enabling one to encrypt messages of changeable size with fixed extension bits. Based on the proposed cryptosystem, the implementation of PIR/SPIR makes PIR and SPIR applicable to large databases.
1
Introduction
Consider the following scenario. A user wants to obtain an entry from a database of n λ-bit strings but does not want the database to learn which entry it wants. This problem was formally defined as Private Information Retrieval (PIR) in [4]. If, while protecting the privacy of the user, the user is also not allowed to learn any information about the entries out of choice, the corresponding protocol is called a Symmetrically Private Information Retrieval (SPIR) protocol [10]. Usually, PIR schemes (e.g., [5], [8], [9]) are suggested to be converted into SPIR protocols by employing zero-knowledge techniques to validate the query.
The notion of Oblivious Transfer (OT) is similar to PIR. It was introduced by Rabin [13]: Alice has one secret bit m and wants Bob to get it with probability 0.5; additionally, Bob does not want Alice to know whether he got m or not. In 1-out-of-2 OT, Alice has two secrets m1 and m2 and wants
This work is supported by ARC Discovery Grant DP0557493 and the National Natural Science Foundation of China (No. 60403007).
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 367–378, 2006. c Springer-Verlag Berlin Heidelberg 2006
368
Q. Wu et al.
Bob to get one of them at Bob's choice. Again, Bob does not want Alice to know which secret he chooses. 1-out-of-n OT is a natural extension of 1-out-of-2 OT to the case of n secrets. 1-out-of-n OT is also known as all-or-nothing disclosure of secrets (ANDOS) [1], in which Alice is not allowed to gain combined information about the secrets, such as their exclusive-or. Clearly, a 1-out-of-n SPIR is also a 1-out-of-n OT. The reason we need two concepts is the different motivations for using these primitives (and the way they were historically defined). The early motivation of OT was to reduce intricate multi-party computation to simple cryptographic primitives. As SPIR is designed to retrieve entries from large databases, it is crucial to reduce the communication complexity and make it much less than n for a 1-out-of-n bit SPIR protocol.
Traditionally, there are two models for PIR protocols, i.e., the multi-database model and the single-database model. For PIR in the former model, there are ω > 1 copies of the database on different servers; the servers have unlimited computational power, but communication among them is not allowed. The best upper bound on communication in this model, n^{O(log log ω/(ω log ω))}, is due to Beimel et al. [3]. In the single-database model, it is assumed that the database server is computationally bounded and there is only one copy of the database. The first scheme in this model was proposed in [8], with its security based on the quadratic residuosity problem and with O(κn^ε) server-side communication complexity, where ε is any positive constant and κ is a security parameter. Stern proposed an SPIR protocol based on a semantically secure homomorphic public-key cryptosystem [14]; its total communication is super-logarithmic, depending on the ciphertext expansion ratio δ of the underlying homomorphic cryptosystem. Cachin et al. [6] constructed PIR with polylogarithmic communication complexity O(κ log^{≥4} n) under the Φ-hiding assumption.
Based on the Paillier cryptosystem [12], Chang proposed PIR with communication complexity O(κ2^d log n) on the server side and O(κd n^{1/d} log n) on the user side [5], where d > 3 can be any integer. Based on the Damgård-Jurik public-key cryptosystems [7], Lipmaa proposed PIR with complexity O(κ log² n) on the user side and O(κ log n) on the server side [9]. This is the best asymptotic result to date.
Following the second guideline, we reduce the concrete communication and computation overhead of PIR/SPIR protocols. The main contributions of this paper include:
– The notion of polishing public-key cryptosystems and a general transformation from PIR to SPIR. Unlike existing schemes relying on inefficient zero-knowledge proofs, our transformation employs an efficient construction of OT and meets the goal of reducing the communication complexity of SPIR. The SPIR has almost the same communication complexity as the underlying PIR.
– A novel efficient homomorphic public-key cryptosystem of independent interest. The new homomorphic cryptosystem enables one to encrypt messages of changeable size with fixed extension bits. Based on the proposed cryptosystem, efficient PIR/SPIR protocols are implemented.
The proposals outperform the state-of-the-art PIR/SPIR protocols and make PIR and SPIR applicable to large databases. For instance, to run a PIR or SPIR protocol with a database of 2^35 512-bit entries, the total communication is only about 134KB. The unavoidable computation cost that is linear in the scale of the database falls on the server side, which often has powerful computational resources; the computation cost on the user side, which often has limited computational power, is logarithmic in the scale of the database. Hence, the proposals are practical for private information retrieval from large databases.
The rest of the paper is organized as follows. In Section 2, we review the security definition of PIR. Section 3 presents a general transformation from PIR to SPIR with almost the same communication complexity as the underlying PIR. A novel homomorphic public-key cryptosystem is proposed and efficient PIR/SPIR protocols are implemented in Section 4, followed by conclusions in the last section.
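Before turning to the definitions, the selection-vector idea that homomorphic-encryption PIR protocols build on can be sketched as follows. The well-known Paillier cryptosystem is used here as a stand-in for the paper's Okamoto-Uchiyama-derived scheme, the primes are toy-sized and insecure, and the flat n-ciphertext query is purely illustrative (the schemes cited above shrink the query by recursion):

```python
import math, random

# --- toy Paillier cryptosystem (tiny primes, for illustration only) ---
def keygen():
    p, q = 1000003, 1000033            # toy primes; real use needs large primes
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    g = n + 1
    # mu = (L(g^lam mod n^2))^{-1} mod n, with L(u) = (u - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu, n)

def enc(pk, m):
    n, g = pk
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def dec(sk, c):
    lam, mu, n = sk
    return ((pow(c, lam, n * n) - 1) // n) * mu % n

# --- 1-out-of-n PIR from the additive homomorphism ---
def query(pk, i, n_items):
    # encrypted selection vector: E(1) at position i, E(0) elsewhere
    return [enc(pk, 1 if j == i else 0) for j in range(n_items)]

def answer(pk, q, db):
    n, g = pk
    # E(sum e_j * x_j) = prod E(e_j)^{x_j} = E(x_i): one ciphertext back
    acc = 1
    for c, x in zip(q, db):
        acc = acc * pow(c, x, n * n) % (n * n)
    return acc
```

The server never sees which entry was selected (all query ciphertexts are semantically secure encryptions), yet its single reply decrypts to exactly the chosen entry.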
2
Definitions of PIR/SPIR
In this section, we review the definition of 1-out-of-n bit PIR from [5], where each entry of the database is a single bit. The definition extends naturally to 1-out-of-n λ-bit PIR, in which each entry of the database is a λ-bit string.
For an integer a ∈ N, let [a] denote the set {1, ..., a}. We use the notation a ← A to denote choosing an element a uniformly at random from the set A, and use PPT to denote probabilistic polynomial time. A function f is negligible in κ if for any polynomial p(·) there exists a κ0 such that for all κ > κ0 we have f(κ) < 1/p(κ).
Informally, a private information retrieval (PIR) scheme is an interactive protocol between two parties, a database Server and a User. The Server holds a database of n λ-bit strings x = x1x2···xn, where xk ∈ {0, 1}^λ for k ∈ [n], and the User holds an index ı ∈ [n]. In its one-round version, the protocol consists of (1) a query sent from the User to the database, generated by an efficient randomized query algorithm taking as input the index ı and a random string; (2) an answer sent by the database answer algorithm, taking as input the query sent by the User and the database x; and (3) an efficient reconstruction function applied by the User, taking as input the index ı, the random string, and the answer sent by the Server. At the end of the execution of the protocol, the following two properties must hold: (I) the User obtains the ı-th λ-bit string xı; and (II) a computationally bounded database does not receive any information about the index of the User. We now give a formal definition of a PIR scheme.
Definition 1.
The single-database private information retrieval (PIR) is a protocol between two players: a Server, who has n λ-bit strings x = x1 x2 · · · xn, where xk is the k-th λ-bit string of x, and a User, who has a choice of index ı ∈ [n], that satisfies the following two properties: – Correctness: If the User and the Server follow the protocol, the User can learn xı, and the Server sends fewer than nλ bits to the User.
370
Q. Wu et al.
– Choice Ambiguity: For any PPT algorithm A and any ȷ ∈ [n], the following value is negligible in the security parameter λ: |Pr[A(1^λ; C(ı)) = 1] − Pr[A(1^λ; C(ȷ)) = 1]|, where C(σ) is the distribution of the communication from the User induced by an index σ ∈ [n].
An SPIR scheme is a PIR scheme satisfying an additional privacy property: choice insulation. Namely, a computationally bounded User does not learn any information about the entries other than the one of its choice. It can also be viewed as an OT protocol (e.g., [13], [1]) with communication overhead lower than the scale of the database. As SPIR is designed to retrieve entries from large databases, it is crucial to reduce the communication complexity and make it much less than nλ for a 1-out-of-n λ-bit string SPIR protocol. We now give a formal definition of an SPIR scheme.
Definition 2. The single-database symmetrically private information retrieval (SPIR) is a protocol between two players: a Server, who has n λ-bit strings x = x1 x2 · · · xn, where xk is the k-th λ-bit string of x, and a User, who has a choice of index ı ∈ [n], that satisfies the following three properties:
– Correctness: If the User and the Server follow the protocol, the User can learn xı, and the Server sends fewer than nλ bits to the User.
– Choice Ambiguity: For any PPT algorithm A and any ȷ ∈ [n], the following value is negligible in the security parameter λ: |Pr[A(1^λ; C(ı)) = 1] − Pr[A(1^λ; C(ȷ)) = 1]|, where C(σ) is the distribution of the communication from the User induced by an index σ ∈ [n].
– Choice Insulation: For any PPT algorithm A and any n λ-bit strings x′1 x′2 · · · x′n such that x′σ ≠ xσ for some σ ∈ [n], the following value is negligible in the security parameter λ: |Pr[A(1^λ; C(x1, x2, · · · , xn)) = 1] − Pr[A(1^λ; C(x′1, x′2, · · · , x′n)) = 1]|, where C(z1, z2, · · · , zn) is the distribution of the communication from the Server induced by a database of n λ-bit strings z = z1 z2 · · · zn.
3
General Constructions
In this section, we rewrite the Lipmaa PIR scheme [9] with fewer restrictions on the underlying semantically secure homomorphic public-key encryptions. Subsequently, we transform it into SPIR without using zero-knowledge proofs, as most existing schemes do.
Privately Retrieve Data from Large Databases
3.1
371
General PIR Based on Homomorphic Public-Key Cryptosystems
Assume that the database has n λ-bit strings x1, x2, · · · , xn, where n = d^I and d can be any constant. If n < d^I, one can append a string of λ(d^I − n) zeroes to the database. Let the User's choice be ı = a1 + a2 d + · · · + aI d^{I−1}, where ai ∈ [d] for i ∈ [I]. Let E_{yi}(·) : {0, 1}^{λ+Σ_{j=1}^{i−1} γj} → {0, 1}^{λ+Σ_{j=1}^{i} γj} for i ∈ [I] be I semantically secure homomorphic public-key encryptions, where γ0 = 0, γi is the expansion length of the i-th encryption, and yi is the corresponding public key. Denote the i-th decryption by D_{si}, where si is the corresponding secret key. The Server and the User run the PIR protocol as follows.
– For i ∈ [I], j ∈ [d], the User computes bi,j = E_{yi}(0) for j ≠ ai and bi,j = E_{yi}(1) for j = ai. It sends (bi,1, · · · , bi,d) as its query to the database Server.
– The Server does the following.
• Compute x0,1 = x1, · · · , x0,n = xn, J0 = n.
• For i = 1, · · · , I, compute: Ji = Ji−1/d, xi,1 = ⊗_{µ=1}^{d} bi,µ^{xi−1,µ}, · · · , xi,Ji = ⊗_{µ=1}^{d} bi,µ^{xi−1,µ+d(Ji−1)}.
• Return xI,1 to the User.
– The User computes x̃ı = D_{s1}(D_{s2}(· · · (D_{sI}(xI,1)) · · · )).
We first consider the correctness of the protocol. Assume that ı = a1 + a2 d + · · · + aI d^{I−1} = a1 + d ı1. In the first iteration, x1,j1 = ⊗_{µ=1}^{d} b1,µ^{x0,µ+d(j1−1)} = E_{y1}(x0,a1+d(j1−1) × 1 + Σ_{µ≠a1+d(j1−1)} x0,µ × 0) = E_{y1}(x0,a1+d(j1−1)) for j1 = 1, · · · , J1 = J0/d. Hence, by decrypting the (ı1 + 1)-th entry x1,ı1+1 of the J1 strings {x1,j1}, the User can extract D_{s1}(x1,ı1+1) = x0,a1+dı1 = xı. Hence, the 1-out-of-J0 PIR is reduced to a 1-out-of-J1 PIR. By repeating the reduction I times, the User can extract x̃ı = D_{s1}(D_{s2}(· · · (D_{sI}(xI,1)) · · · )) = xı. Then we consider the choice ambiguity. In the protocol, the choice of the User is encoded as (a1, a2, · · · , aI), and ai is encrypted as (E_{yi}(0), · · · , E_{yi}(0), E_{yi}(1), E_{yi}(0), · · · , E_{yi}(0)), with ai − 1 encryptions of 0 before E_{yi}(1) and d − ai after it.
Since the encryption is semantically secure, the Server learns nothing about the choice of the User. Hence, the choice is ambiguous, and we have the following result on security.
Theorem 1. The above PIR protocol is secure if the underlying public-key cryptosystem is homomorphic and semantically secure.
In the above PIR, the User needs d logd n encryptions and logd n decryptions. The Server needs (dn − d^2 + d − 1)/(d − 1) exponentiations. To query the database, the User sends λ logd n + Σ_{i=1}^{logd n}(logd n − i + 1)γi bits, and the Server sends λ + Σ_{i=1}^{logd n} γi bits to answer the query. The total communication complexity is about λ(logd n + 1) + logd n(logd n + 3)γ/2, where γ = max{γ1, · · · , γ_{logd n}}. For a sufficiently large n and an appropriate parameter d, the total communication is O(log_d^2 n) and less than nλ bits.
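The server-side folding step above can be exercised with any additively homomorphic scheme. The following sketch uses a textbook Paillier instance with tiny illustrative primes (an assumption for demonstration only — it is not the cryptosystem of Section 4, and the parameters are far too small for real use) to show one folding level: the User's encrypted unit vector selects one of d entries without revealing which.

```python
import math, random

# Toy Paillier keypair (illustrative tiny primes; real use needs >= 1024-bit primes).
p, q = 1009, 1013
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = n + 1                                  # standard choice g = n + 1

def enc(m):                                # E(m) = (1 + n)^m * r^n mod n^2
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def dec(c):                                # m = L(c^lam mod n^2) * mu mod n
    L = lambda x: (x - 1) // n
    mu = pow(L(pow(g, lam, n2)), -1, n)
    return L(pow(c, lam, n2)) * mu % n

# One PIR folding level: a row of d entries; the User wants entry a (1-based).
d, a = 4, 3
row = [111, 222, 333, 444]                 # entries must be < n
query = [enc(1 if j == a else 0) for j in range(1, d + 1)]   # User's query
answer = 1
for b, x in zip(query, row):               # Server: product of b_j^{x_j} mod n^2
    answer = answer * pow(b, x, n2) % n2   # = E(sum_j x_j * [j == a]) = E(x_a)
print(dec(answer))                         # -> 333
```

The Server never sees which slot carries E(1); it only raises each query element to the matching database entry and multiplies, which under the additive homomorphism yields an encryption of the selected entry alone.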
372
3.2
From PIR to SPIR
To convert PIR into SPIR, a popular method is to let the User run a zero-knowledge proof protocol to convince the Server of the validity of the query ([5], [8], [9]). However, this solution is inefficient in practice, as it additionally introduces several times the complexity of the underlying PIR protocol. In the following, we propose a general transformation from PIR to SPIR by embedding an efficient OT protocol. Our approach requires almost no additional communication overhead. To achieve a general construction of efficient OT protocols, we introduce the notion of polishing public-key cryptosystems. Let λ, λ0 be security parameters.
Definition 3. Let F ← G(1^λ), where Fy(·) : {0, 1}^{λ0} → {0, 1}^λ is a family of public-key encryptions, y ∈ Y is the public key, and Y is the public key space. Let s ∈ S be the corresponding secret key satisfying y = f(s), where S is the secret key space and f(·) is a one-way function. Fy(·) is a polishing public-key cryptosystem if for any PPT adversary A, the advantage |Pr[F ← G(1^λ), s ← S, y0 = f(s), y1 ← Y, b ← {0, 1}; b̃ ← A(1^λ, F, f, yb) : b̃ = b] − 1/2| is negligible. The probability is taken over the coin flips of G and A.
This notion is similar to the dense cryptosystem [15], in which the public key is distributed uniformly. However, in a polishing public-key cryptosystem, we only require that no PPT distinguisher can distinguish a correct public key from a random string in the public key space. In the following, we show how to achieve efficient 1-out-of-n OT and SPIR using polishing public-key cryptosystems. Let H(·) : {0, 1}* → Y be a cryptographic hash function and F ← G(1^λ). F_s^{−1}(·) denotes the inverse of Fy(·) under the secret key s. Assume that the User's secret choice is ı ∈ [n] and the Server has n λ0-bit messages mk for k ∈ [n]. The OT protocol between the two parties is as follows.
– The User randomly selects s ∈ S and computes y = f(s). It sends the Server y′ = y ⊕ H(ı) as its query.
Here ⊕ represents an efficient operation Y × Y → Y such that there exists another efficient operation ⊖ : Y × Y → Y satisfying (α ⊕ η) ⊖ η = α for any α, η ∈ Y.
– The database Server computes yk = y′ ⊖ H(k) for k ∈ [n]. It returns xk = F_{yk}(mk) to the User.
– The User extracts m̃ı = F_s^{−1}(xı).
Clearly, the above protocol is an OT protocol. First, the User can decrypt mı since it knows the secret key of the ı-th public key yı = y′ ⊖ H(ı) = (y ⊕ H(ı)) ⊖ H(ı) = y = f(s). Second, the Server cannot determine the choice of the User, as the underlying public-key cryptosystem is polishing and the database Server cannot distinguish yı from y1, · · · , yn. Finally, the User cannot learn any information about the messages other than its choice from the Server's ciphertexts, since it does not know the corresponding secret keys. Hence, we have the following result.
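A concrete sketch of this key-blinding OT in the spirit of the ElGamal-based special case [16]: the toy group, the hash-to-group map, and the hashed-ElGamal encryption standing in for F are illustrative assumptions, with ⊕/⊖ realized as multiplication/division in a prime-order subgroup.

```python
import hashlib, random

q, p = 1019, 2039                # toy subgroup order q and safe prime p = 2q + 1
g = 4                            # generator of the order-q subgroup of Z_p^*

def H(k):                        # illustrative hash into the subgroup (not a real hash-to-group)
    e = int.from_bytes(hashlib.sha256(str(k).encode()).digest(), "big") % q
    return pow(g, e, p)

def pad(elem):                   # KDF for the hashed-ElGamal layer (32-bit pad)
    return int.from_bytes(hashlib.sha256(str(elem).encode()).digest()[:4], "big")

msgs = [101, 202, 303, 404]      # Server's messages
i = 2                            # User's secret choice (1-based)

# User: key pair (s, y = g^s); blinded query y' = y * H(i)   (⊕ = multiplication)
s = random.randrange(1, q)
y_blind = pow(g, s, p) * H(i) % p

# Server: unblind a candidate key for every k and encrypt m_k under it
cts = []
for k, m in enumerate(msgs, start=1):
    y_k = y_blind * pow(H(k), p - 2, p) % p   # y' / H(k); equals g^s only for k = i
    r = random.randrange(1, q)
    cts.append((pow(g, r, p), m ^ pad(pow(y_k, r, p))))

# User: only the i-th ciphertext decrypts under its secret s
c1, c2 = cts[i - 1]
print(c2 ^ pad(pow(c1, s, p)))   # -> 202
```

For k ≠ i the unblinded key y_k is a group element whose discrete logarithm the User does not know, so the corresponding pads are unrecoverable, mirroring the choice-insulation argument above.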
Lemma 1. If there exists a polishing public-key cryptosystem, then there is a one-round 1-out-of-n oblivious transfer with complexity O(n).
The protocol is a generalization of the OT protocol in [16], which is a special case based on the ElGamal cryptosystem. The construction exploits a wider class of efficient public-key cryptosystems to build efficient OT protocols, for instance, OT from the NTRU cryptosystem. From the viewpoint of SPIR, the above protocol is unsatisfactory since its communication complexity is O(n). However, to enable the User to symmetrically retrieve the message mı from the Server's database of n λ0-bit messages mk for k ∈ [n], we can embed the above OT protocol into our PIR scheme and achieve an efficient SPIR scheme: the Server does not need to directly return all the ciphertexts xk = F_{yk}(mk) to the User for k ∈ [n]. It can compress the n ciphertexts into one ciphertext xI,1, as shown in Section 3.1, to enable the User to obtain xı and then decrypt mı. The detailed transformation is as follows.
– The User randomly selects s ∈ S and computes y = f(s), y′ = y ⊕ H(ı). For i ∈ [I], j ∈ [d], the User computes bi,j = E_{yi}(0) for j ≠ ai and bi,j = E_{yi}(1) for j = ai. It sends y′, (bi,1, · · · , bi,d) as its query to the Server.
– The database Server does the following.
• Compute yk = y′ ⊖ H(k), xk = F_{yk}(mk) for k ∈ [n].
• Compute x0,1 = x1, · · · , x0,n = xn, J0 = n.
• For i = 1, · · · , I, compute: Ji = Ji−1/d, xi,1 = ⊗_{µ=1}^{d} bi,µ^{xi−1,µ}, · · · , xi,Ji = ⊗_{µ=1}^{d} bi,µ^{xi−1,µ+d(Ji−1)}.
• Respond with xI,1 to the User.
– The User computes x̃ı = D_{s1}(D_{s2}(· · · (D_{sI}(xI,1)) · · · )) and extracts m̃ı = F_s^{−1}(x̃ı).
Clearly, the above protocol is an SPIR with communication O(log^2 n). Indeed, compared with the underlying PIR protocol, the SPIR protocol requires only one additional element of the public key space Y.
It is much more efficient than those relying on zero-knowledge proofs, which introduce several times the complexity of the underlying PIR protocol. From Lemma 1 and Theorem 1, we have the following result.
Theorem 2. If there exist a polishing public-key cryptosystem and a semantically secure homomorphic public-key cryptosystem, then there exists a one-round 1-out-of-n SPIR protocol with communication complexity O(log^2 n).
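Putting the two pieces together, the embedding can be sketched end to end (all parameters are illustrative assumptions and far too small for real use; a toy Paillier stands in for the Section 4 cryptosystem, and the masking uses a hashed pad so masked entries fit the homomorphic plaintext space): the Server masks every entry under its unblinded key, and a one-level homomorphic fold returns only the chosen masked entry.

```python
import hashlib, math, random

# --- toy Paillier: the homomorphic layer (illustration only) ---
p_, q_ = 1009, 1013
n, n2 = p_ * q_, (p_ * q_) ** 2
lam = math.lcm(p_ - 1, q_ - 1)

def enc(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def dec(c):
    Lf = lambda x: (x - 1) // n
    mu = pow(Lf(pow(n + 1, lam, n2)), -1, n)
    return Lf(pow(c, lam, n2)) * mu % n

# --- toy DL group: the key-blinding layer ---
q, p = 1019, 2039
g = 4
def H(k):
    e = int.from_bytes(hashlib.sha256(str(k).encode()).digest(), "big") % q
    return pow(g, e, p)
def pad(elem):               # 16-bit pad so masked values stay below the Paillier n
    return int.from_bytes(hashlib.sha256(str(elem).encode()).digest()[:2], "big")

msgs = [11, 22, 33, 44]      # Server's 16-bit messages
i = 4                        # User's choice (1-based)
d = len(msgs)

# User: blinded key y' = g^s * H(i) plus a one-level PIR query
s = random.randrange(1, q)
y_blind = pow(g, s, p) * H(i) % p
query = [enc(1 if j == i else 0) for j in range(1, d + 1)]

# Server: mask every entry under its unblinded key, then fold homomorphically
r = random.randrange(1, q)
gr = pow(g, r, p)
answer = 1
for k, m in enumerate(msgs, start=1):
    y_k = y_blind * pow(H(k), p - 2, p) % p   # y' / H(k)
    x_k = m ^ pad(pow(y_k, r, p))             # masked entry, < 2^16 < n
    answer = answer * pow(query[k - 1], x_k, n2) % n2

# User: decrypt the fold, then strip the mask using (g^r)^s = y_i^r
x_i = dec(answer)
print(x_i ^ pad(pow(gr, s, p)))               # -> 44
```

Only one ciphertext (plus the single group element g^r) crosses the wire from the Server, matching the claim that the SPIR adds almost nothing to the underlying PIR communication.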
4 Implementation Issues
4.1 A Novel Homomorphic Cryptosystem
In this section, we propose a novel homomorphic cryptosystem. It can be viewed as an extension of the Okamoto-Uchiyama cryptosystem [11] and enjoys similar security properties. However, our scheme is much more efficient and has the
additional useful property of enabling one to encrypt messages of variable size with a fixed number of expansion bits. The new cryptosystem relies on the difficulty of factoring N = P^t Q, where P and Q are ρ-bit primes. To the best of our knowledge, the most efficient algorithm to factor N = P^t Q runs in time 2^{k^{1−ε}+O(log ρ)} [2], where t = k^ε. This algorithm, due to Boneh et al. [2], requires only polynomial space in log N. When ε = 1/2, the algorithm asymptotically performs better than the Elliptic Curve Method (ECM). However, we can always set the parameters k and ε appropriately so that the running time is about 2^80, which is beyond the current computation power.
Let P be a prime, t > 0 an integer, and Γ = {x | x ≡ 1 mod P^{t−1} ∧ x ∈ Z*_{P^t}}. Note that Z*_{P^t} is a cyclic group of order P^{t−1}(P − 1), and hence #Γ = P. For any x ∈ Γ, define L(x) = (x − 1)/P^{t−1}. Clearly, L(x) is well-defined and has the following homomorphic property.
Lemma 2. For a, b ∈ Γ, L(ab) = L(a) + L(b) mod P^{t−1}. If x ∈ Γ satisfies L(x) ≠ 0 and y = x^m mod P^t for m ∈ Z_{P^{t−1}}, then m = L(y)/L(x) = (y − 1)/(x − 1) mod P^{t−1}.
Proof. From the definition of L, we have L(ab) = (ab − 1)/P^{t−1} = (a − 1)(b − 1)/P^{t−1} + (a − 1)/P^{t−1} + (b − 1)/P^{t−1} = L(a)(b − 1) + L(a) + L(b). Note that (b − 1) ≡ 0 mod P^{t−1}. It follows that L(ab) = L(a) + L(b) mod P^{t−1}. Then m = mL(x)/L(x) = L(x^m)/L(x) = L(y)/L(x) = (y − 1)/(x − 1) mod P^{t−1}. This completes the proof.
Let N = P^t Q, where P, Q are strong primes with gcd(P − 1, Q) = 1, gcd(P, Q − 1) = 1, and t > 1. Assume that P, Q ∈ {2^ρ, 2^ρ + 1, · · · , 2^{ρ+1}}, where ρ is a security parameter, and let γ = 2ρ + 2, λ = (t − 1)ρ. We randomly select an integer g ∈ Z*_N such that the order of g1 = g^{P−1} mod P^t is P^{t−1} and the order of g mod N is P^{t−1}(P − 1)(Q − 1)/2. The public key is (N, g, λ). The private key is P.
Encryption: To encrypt a message m ∈ {0, 1}^λ, one randomly selects r ∈ Z_N and computes C = g^{m+rN} mod N. The ciphertext is C.
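A runnable sketch of this cryptosystem with t = 2 (the Okamoto-Uchiyama case, as noted below) and the decryption rule m = L(c)/L(g1) mod P^{t−1} described next; the tiny primes are illustrative only, whereas the paper requires ρ-bit strong primes.

```python
import math, random

# Toy instance of the scheme with t = 2; real parameters need large strong primes.
P, Q, t = 1009, 1013, 2
N, Pt = P**t * Q, P**t

def L(x):                              # L(x) = (x - 1) / P^(t-1), for x in Gamma
    return (x - 1) // P**(t - 1)

# Select g whose component g1 = g^(P-1) mod P^t has order exactly P^(t-1).
while True:
    g = random.randrange(2, N)
    if math.gcd(g, N) != 1:
        continue
    g1 = pow(g, P - 1, Pt)
    if pow(g1, P**(t - 2), Pt) != 1:   # rules out orders dividing P^(t-2)
        break

def enc(m):                            # C = g^(m + r*N) mod N, with m < P^(t-1)
    r = random.randrange(1, N)
    return pow(g, m + r * N, N)

def dec(C):                            # C^(P-1) = g1^m mod P^t, since N = 0 mod P^(t-1)
    c = pow(C, P - 1, Pt)
    return L(c) * pow(L(g1), -1, P**(t - 1)) % P**(t - 1)

print(dec(enc(123)))                   # -> 123
print(dec(enc(100) * enc(20) % N))     # -> 120 (additively homomorphic)
```

Decryption works because raising to P − 1 kills the Q-component and the rN term (rN ≡ 0 mod P^{t−1}), leaving g1^m inside the subgroup Γ where the logarithm L is linear.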
Decryption: Given a ciphertext C ∈ Z*_N, compute c = C^{P−1} = g1^m mod P^t and m = L(c)/L(g1) mod P^{t−1}.
For the infeasibility of inverting the encryption function, we have the following result.
Theorem 3. Inverting the encryption function of our scheme is infeasible if and only if it is infeasible to factor N = P^t Q.
Proof. Clearly, if there exists a PPT algorithm factoring N with non-negligible probability, then with the same probability one can run this algorithm first and then invert the encryption function as the decryption algorithm does. The time to recover m is polynomial. Now assume that our scheme is insecure, i.e., there exists a PPT adversary A which can compute m from C with non-negligible probability. We will construct
a PPT algorithm B using A as a black box to factor N with non-negligible probability, as follows. B randomly selects g ← Z*_N. As P, Q are strong primes with gcd(P − 1, Q) = 1 and gcd(P, Q − 1) = 1, the order of g1 = g^{P−1} mod P^t is P^{t−1} and the order of g mod N is P^{t−1}(P − 1)(Q − 1)/2 with overwhelming probability. Thus (N, g, λ) is a correct public key with the same probability. B then randomly selects u ← Z_N and computes C = g^u mod N. We prove that C is a correct ciphertext with non-negligible probability. Let the order of g mod P^t be P^{t−1}P′ and the order of g mod Q be Q′, where P′ | (P − 1) and Q′ | (Q − 1). The distribution of C can be represented by (u1, u2), where u1 = u mod P^{t−1} and u2 = u mod lcm(P′, Q′), with gcd(P^{t−1}, lcm(P′, Q′)) = 1. Similarly, the distribution of a correct ciphertext C′ = g^v = g^{m+rN} mod N can be represented by (v1, v2), where v1 = v mod P^{t−1} and v2 = v mod lcm(P′, Q′). When u1 and v1 are fixed, the distributions of u2 and v2 are statistically close. As v1 is uniformly distributed in {0, 1}^λ and u1 is uniformly distributed in Z_{P^{t−1}}, where {0, 1}^λ ⊂ Z_{P^{t−1}} ⊂ {0, 1}^{λ+1}, C is a correct ciphertext with probability at least 1/2. Let A output 0 ≤ m ≤ 2^λ < P^{t−1} for the forged ciphertext C. Then m satisfies m = u mod P^{t−1}. If u > P^{t−1} (which holds with overwhelming probability as u is uniformly distributed in Z_N), then u − m is a multiple of P^{t−1}. It follows that gcd(N, u − m) is P^{t−1}, P^t or P^{t−1}Q. In each case, there are polynomial-time algorithms to find the factors P and Q. Hence, with the help of A, B can efficiently factor N with non-negligible probability. This completes the proof.
The semantic security of the scheme relies on the following Decisional N-Subgroup Assumption. This assumption is related to the P-subgroup assumption in [11] and the Decisional Composite Residuosity Assumption in [12].
Definition 4. Decisional N-Subgroup Assumption.
Let G(·) be a key generator for our scheme such that (N, g, λ) ← G(1^λ) is a public key as defined above. For any PPT adversary A, the value |Pr[(N, g, λ) ← G(1^λ), r ← Z_N, b ← {0, 1}, c = g^{b+rN} mod N; b′ ← A(1^λ, N, c) : b′ = b] − 1/2| is negligible in λ. The probability is taken over the coin flips of G and A.
Theorem 4. The above cryptosystem is semantically secure against chosen-plaintext adversaries if and only if the Decisional N-Subgroup Assumption holds.
Proof. Assume that the Decisional N-Subgroup Assumption does not hold, that is, a PPT algorithm A can distinguish E(0) and E(1) with non-negligible probability, where E(·) denotes the encryption defined above. Given {m0, m1} and C = E(m), where m ∈ {m0, m1}, we construct a PPT algorithm B using A as a subroutine to distinguish E(m0) and E(m1). B randomly selects α ← Z_N and computes C′ = C/g^{m0} mod N and g′ = g^{(m1−m0)−αN} mod N. With non-negligible probability, gcd((m1 − m0) − αN, P^{t−1}(P − 1)(Q − 1)/2) = 1. Hence, with the same probability, the distributions of g and g′ are statistically close, and β/((m1 − m0) − αN) mod lcm(P − 1, Q − 1) is defined for a random
integer β. Denote the encryption under N, g′ by E′(·). Therefore, when C = E(m1), C′ = g^{m1−m0} g^{rN} = g′^{(m1−m0)/((m1−m0)−αN)} g^{rN} = g′^{1+(r+α/((m1−m0)−αN))N} mod N = E′(1), if r + α/((m1 − m0) − αN) mod lcm(P − 1, Q − 1) is defined. When C = E(m0), we have C′ = g^{m0−m0} g^{rN} = g′^{0+(r/((m1−m0)−αN))N} mod N = E′(0), if r/((m1 − m0) − αN) mod lcm(P − 1, Q − 1) is defined. B runs A with (N, g′, C′) as input and obtains the answer whether C′ is E′(0) or E′(1), which immediately implies whether C is E(m0) or E(m1).
Conversely, assume that our scheme is insecure. Then there is a PPT algorithm A distinguishing E(m0) and E(m1) with non-negligible probability. We construct a PPT algorithm B using A as a subroutine to break the Decisional N-Subgroup Assumption. Let C be either E(0) or E(1). B randomly selects α ← Z_N and computes C′ = g^{m0+αN} C^{m1−m0} mod N. If C = E(0), then C′ = E(m0). If C = E(1), then C′ = E(m1). B runs A with (N, g, C′) as input and obtains the answer whether C′ is E(m0) or E(m1), which immediately implies whether C is E(0) or E(1). This completes the proof.
Clearly, the above encryption is homomorphic. The expansion rate of the scheme is 2/(t + 1). In the case t = 2, it is the Okamoto-Uchiyama cryptosystem [11]. For t > 2, our extension is more efficient than the original Okamoto-Uchiyama cryptosystem. Furthermore, by keeping ρ fixed and increasing t, one obtains a series of homomorphic encryptions of variable message length with the same expansion of γ = 2ρ + 2 bits. It is also more efficient than the schemes in [7] with a similar property.
4.2
Implementation of PIR
Let us assume the same settings as in Section 3.1. Following the general construction, the PIR protocol is implemented as follows.
– The User randomly generates I public keys (N1, g1, λ), (N2, g2, λ + γ), · · · , (NI, gI, λ + (I − 1)γ), where each Ni is generated as above. Denote the corresponding decryption procedures by Di(·) for i ∈ [I]. For i ∈ [I], j ∈ [d], the User randomly selects ri,j ∈ Z_{Ni} and computes bi,j = gi^{ξi,j+ri,j Ni} mod Ni, where ξi,j = 0 for j ≠ ai and ξi,j = 1 for j = ai. It sends (Ni, bi,1, · · · , bi,d) as its query to the Server.
– The database Server does the following.
• Compute x0,1 = x1, · · · , x0,n = xn, J0 = n.
• For i ∈ [I], j = 1, · · · , n/d^i, compute xi,j = Π_{µ=1}^{d} bi,µ^{xi−1,µ+d(j−1)}.
• Return cI = xI,1 gI^{rNI} mod NI to the User, where r ∈R Z_{NI}.
– The User extracts x̃ı = D1(D2(· · · (DI(cI)) · · · )).
The User sends about (d + 1)(λ logd n + γ logd n(logd n + 1)/2) bits. The Server sends about λ + γ logd n bits. For the User, the most time-consuming job is to generate the logd n public keys. Note that the public keys are reusable, and the User can generate sufficiently many public keys before the protocol is
run. After this pre-computation, the User requires about (d + 1) logd n (λ + iγ)-bit modular exponentiations. The Server needs about (n − 1)/(d − 1) d-base (λ + iγ)-bit modular exponentiations. We now analyze the practicality of the protocol with concrete parameters. Let γ = 1026, d = 32, λ = 512, n = 2^35. That is, a User will privately retrieve from a large database Server of 2^35 512-bit strings. The largest t is 8 = 512^{1/3}, i.e., ε = 1/3, and the running time to factor N is then about 2^73. In this scenario, the User sends about 133KB and the Server sends about 1KB. The User needs about 2^31 modular exponentiations. The Server needs about 2^35 modular exponentiations. This computation is heavy but unavoidable. However, in practice, the Server often has scalable computational power, and hence it is bearable.
4.3
Implementation of SPIR
Assume that the database has n λ-bit strings m1, m2, · · · , mn, where n = d^I and d can be any constant. The User's choice is ı ∈ [n]. Let G = ⟨g⟩ be a group of ℓ-bit prime order in which the discrete logarithm is difficult, where g is a generator of G. Let H(·) : {0, 1}* → G be a cryptographic hash function. First, the User randomly selects s ∈ {0, 1}^ℓ and computes y = (g ⊕ H(ı))^s. Here, ⊕ denotes the group operation, and we denote its inverse operation by ⊖. The User sends the Server y and (Ni, bi,1, · · · , bi,d) as in the PIR protocol of the previous section. The database Server selects a random integer r ∈ {0, 1}^ℓ and computes xk = mk ⊕ (g ⊕ H(k))^r for k ∈ [n]. Then it runs the PIR protocol of Section 4.2 over these masked entries. Finally, it returns z = y^r and xI,1 to the User. The User extracts x̃ı = D1(D2(· · · (DI(xI,1)) · · · )) and then recovers m̃ı = x̃ı ⊖ z^{1/s}. Compared with the underlying PIR protocol, the above SPIR requires only the additional bits introduced by y and z. As xk = mk ⊕ (g ⊕ H(k))^r can be pre-computed before the query from the User, the SPIR protocol has almost the same online complexity as the underlying PIR protocol. Hence the SPIR scheme is also practical for large database retrievals.
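The mask-and-unmask arithmetic of this instantiation, z^{1/s} = (g ⊕ H(ı))^r, can be checked in a toy group (the safe-prime subgroup and the hash-to-group map are illustrative assumptions; ⊕ is written as multiplication mod p):

```python
import hashlib, random

q, p = 1019, 2039                # toy group: order-q subgroup of Z_p^*, p = 2q + 1
g = 4                            # generator of the subgroup

def H(k):                        # illustrative hash into the subgroup
    e = int.from_bytes(hashlib.sha256(str(k).encode()).digest(), "big") % q
    return pow(g, e, p)

msgs = [15, 27, 38, 49]          # Server's messages, encoded in Z_p^*
i = 3                            # User's choice (1-based)

# User: y = (g * H(i))^s
s = random.randrange(1, q)
y = pow(g * H(i) % p, s, p)

# Server: x_k = m_k * (g * H(k))^r for every k, plus z = y^r
r = random.randrange(1, q)
xs = [m * pow(g * H(k) % p, r, p) % p for k, m in enumerate(msgs, start=1)]
z = pow(y, r, p)

# User: unmask m_i = x_i / z^(1/s), with 1/s computed mod the group order q
s_inv = pow(s, -1, q)
unmask = pow(z, s_inv, p)                    # = (g * H(i))^r
print(xs[i - 1] * pow(unmask, p - 2, p) % p) # -> 38
```

Since z lies in the order-q subgroup, exponentiation by s^{-1} mod q cancels the User's secret s exactly, yielding the single mask (g ⊕ H(ı))^r; all other masks remain hidden, which is the choice-insulation property.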
5
Conclusions
Private information retrieval is a useful cryptographic primitive. It implies other well-known cryptographic primitives such as one-way functions, oblivious transfer, and multi-party computation. It can also be directly employed in applications such as medical database retrieval, digital bank transactions, and so on. In this paper, we proposed a general efficient transformation from PIR to SPIR without exploiting zero-knowledge proofs. The proposals are implemented efficiently, and the resulting PIR/SPIR schemes are applicable to secure large database retrievals. As an independent interest, we also contribute a novel efficient homomorphic public-key cryptosystem. It can be used to encrypt messages of variable size with a constant number of expansion bits.
References
1. G. Brassard, C. Crépeau, J.-M. Robert. All-or-Nothing Disclosure of Secrets. In Proc. of Crypto'86, LNCS 263, pp. 234-238, Springer-Verlag, 1987.
2. D. Boneh, G. Durfee, and N. Howgrave-Graham. Factoring N = p^r q for large r. In Proc. of Crypto'99, LNCS 1666, pp. 326-337, Springer-Verlag, 1999.
3. A. Beimel, Y. Ishai, E. Kushilevitz, and J.-F. Raymond. Breaking the O(n^{1/(2k−1)}) barrier for information-theoretic private information retrieval. In Proc. of the 43rd IEEE Symposium on Foundations of Computer Science, 2002.
4. B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan. Private Information Retrieval. In Proc. of the 36th FOCS, 1995.
5. Y. Chang. Single Database Private Information Retrieval with Logarithmic Communication. In Proc. of ACISP'04, LNCS 3108, pp. 50-61, Springer-Verlag, 2004.
6. C. Cachin, S. Micali, and M. Stadler. Computationally Private Information Retrieval with Polylogarithmic Communication. In Proc. of Eurocrypt'99, LNCS 1592, pp. 402-414, Springer-Verlag, 1999.
7. I. Damgård, M. Jurik. A Generalisation, a Simplification and Some Applications of Paillier's Probabilistic Public-Key System. In Proc. of PKC'01, LNCS 1992, pp. 119-136, Springer-Verlag, 2001.
8. E. Kushilevitz and R. Ostrovsky. Replication is not needed: single database, computationally-private information retrieval. In Proc. of FOCS'97, pp. 364-373, 1997.
9. H. Lipmaa. An Oblivious Transfer Protocol with Log-Squared Communication. In Proc. of ISC'05, LNCS 3650, pp. 314-328, Springer-Verlag, 2005.
10. S. K. Mishra, P. Sarkar. Symmetrically Private Information Retrieval. In Proc. of Indocrypt'00, LNCS 1977, pp. 225-236, Springer-Verlag, 2000.
11. T. Okamoto, S. Uchiyama. A New Public-Key Cryptosystem as Secure as Factoring. In Proc. of Eurocrypt'98, LNCS 1403, pp. 308-318, Springer-Verlag, 1998.
12. P. Paillier. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Proc. of Eurocrypt'99, LNCS 1592, pp. 223-238, Springer-Verlag, 1999.
13. M. Rabin.
How to Exchange Secrets by Oblivious Transfer. Technical Report TR81, Aiken Computation Laboratory, Harvard University, 1981. 14. J. P. Stern. A New and Efficient All-or-nothing Disclosure of Secrets Protocol. In Proc. of Asiacrypt’98, LNCS 1514, Springer-Verlag, pp. 357-371, 1998. 15. A. De Santis and G. Persiano, Zero-Knowledge Proofs of Knowledge Without Interaction. In Proc. of FOCS’92, pp. 427-436, IEEE Press, 1992. 16. W. Tzeng. Efficient 1-out-of-n Oblivious Transfer Schemes. In Proc. of PKC’02, LNCS 2274, Springer-Verlag, pp. 159-171, 2002.
An Empirical Study of Quality and Cost Based Security Engineering Seok Yun Lee1, Tai-Myung Chung1, and Myeonggil Choi2,* 1
School of Information and Communication Engineering, Natural Science Campus Sungkyunkwan University, 300 Cheoncheon-dong, Jangan-gu, Suwon-si, Geonggi-do, 440-746, Korea [email protected], [email protected] 2 Department of Systems Management Engineering, INJE University, 607 Obang-dong, Gimhae, Gyeongnam, 621-749, Korea [email protected]
Abstract. For the reliability and confidentiality of information security systems, security engineering methodologies have been accepted in many organizations. A security institution in Korea faced problems with the effectiveness of security engineering. To solve these problems, the institution created a security methodology called ISEM and a tool called SENT. This paper presents the ISEM methodology, which considers both product assurance and production processes, taking advantages of both in terms of quality and cost. The ISEM methodology can make up for shortcomings of current security engineering methodologies. To support the ISEM methodology, the SENT tool, which is operated over the Internet, automatically supports the production processes and the product assurances which ISEM demands.
1 Introduction
Many organizations have invested substantial resources to increase the reliability and confidentiality of information security systems. As a series of efforts to obtain high quality of information security systems, security engineering methodologies such as CC, ITSEC, SSE-CMM and SPICE have been introduced [8,11]. The security engineering methodologies can be divided into two approaches in terms of the objects they assure. The first is a product assurance approach and the second is a production process approach. The product assurance approach focuses on the assurance of products through evaluating the functions and assurances of information security systems. CC (Common Criteria), ITSEC (Information Technology Security Evaluation Criteria) and TCSEC (Trusted Computer Security Evaluation Criteria) belong to the product assurance approach. Although the product assurance approach can assure high quality, it incurs high costs and long periods. The production process approach focuses on the assurance of the production process, shifting its focus from assuring products to assuring production processes. SSE-CMM (System Security Engineering-Capability Maturity Model),
Corresponding author.
K. Chen et al. (Eds.): ISPEC 2006, LNCS 3903, pp. 379 – 389, 2006. © Springer-Verlag Berlin Heidelberg 2006
380
S.Y. Lee, T.-M. Chung, and M. Choi
SPICE, and ISO 9000-3 (Guidelines for the development, supply and maintenance of software) belong to the production process approach. Although the cost and period of the production process approach are lower than those of the product assurance approach, its assurance level is lower than that of the first approach. The product assurance approach has been frequently introduced in developing highly reliable information security systems. To reduce the high engineering costs, many organizations have sought a cost-effective security engineering methodology. In nature, the two security engineering approaches can be supplemental [4]. This paper presents a security engineering methodology and a tool supporting the methodology, with which a security research institution in Korea has tried to solve a trade-off between cost and quality. The institution has created ISEM (High Secure Engineering Methodology), assuring both products and the production process. To support ISEM, SENT (Secure Engineering Tool) has been developed. ISEM makes up for shortcomings of the product assurance approach (CC, ITSEC, TCSEC) and reflects the advantages of the production process approach (SSE-CMM, SPICE). SENT directs the users participating in engineering to follow all the processes and to describe all the assurances ISEM demands.
2 Review of Security Engineering Methodology
This section briefly reviews the product assurance approach and the production process approach. In the early 1980s, TCSEC was developed in the United States. TCSEC was primarily applied to engineer trusted ADP (automatic data processing) systems. It was used to evaluate information security systems and the acquisition specifications of information security systems in public institutions. TCSEC has two distinct requirement sets: (1) security functional requirements, and (2) assurance requirements. The security functional requirements encompass the capabilities typically found in information processing systems employing general-purpose operating systems, which are distinct from application programs. However, the security functional requirements can also be applied to specific systems with their own functional requirements, applications or special environments. The assurance requirements, on the other hand, can be applied to systems that cover the full range of computing environments, from dedicated controllers to multilevel secure systems [2]. ITSEC is a European-developed set of criteria filling a role roughly equivalent to TCSEC. While ITSEC and TCSEC have many similar requirements, there are some important distinctions. ITSEC tends to place emphasis on integrity and availability, and attempts to provide a uniform approach for evaluating both products and systems. Like the production process approach, ITSEC also introduces a distinction between doing the right job (effectiveness) and doing the job right (correctness). To do so, ITSEC allows less restricted collections of requirements for a system, at the expense of more complex and less comparable ratings [3]. CC is the outcome of a series of efforts to develop IT security evaluation criteria that can be broadly accepted within the international community.
The sponsoring organizations of TCSEC and ITSEC pooled their efforts and began a joint activity to align their separate criteria into a single set of IT security criteria. CC has security functional requirements and security assurance requirements, and defines 7 Evaluation Assurance Levels. Notably, CC has been standardized as ISO/IEC 15408 [6, 7].
Fig. 1. SSE-CMM consists of three domains: risk process, engineering process and assurance process
ISO 9000-3, SPICE, and SSE-CMM are security engineering methodologies focusing on quality and controls in the production process [1, 9, 11]. SSE-CMM is based on SE-CMM. To handle the special principles of information system security engineering, SE-CMM was interpreted in respect of the information security area, and new domains of production process and practices were identified. As Fig. 1 shows, SSE-CMM consists of three domains: risk process, engineering process and assurance process. In the risk process, the risks of products and services are identified and prioritized. In the engineering process, solutions to manage the risks are suggested. In the assurance process, an assurance rationale is submitted to customers [5].
3 ISEM Methodology

A security research institution in Korea had adopted a production process methodology to develop highly reliable information security systems. To increase the reliability and confidentiality of the information security systems, the institution shifted its focus from assuring the quality of the production process to assuring the products themselves. After shifting its focus, the institution faced increased costs and prolonged periods in engineering the information security systems. To solve the problem, the institution sought to take the advantages of both the production process approach and the product assurance approach. The institution created a methodology called ISEM and a tool called SENT to solve the trade-off between product quality and engineering cost in developing the information security systems. ISEM accepts the advantages of the two approaches, so it focuses on both the production process and product assurance. As Fig. 2 shows, ISEM consists of a design stage and three stages of developing prototypes. ISEM adopts the three main production processes of SSE-CMM: the assurance process, the risk process, and the engineering process. The difference between ISEM and the two security engineering approaches lies in the granularity level. A production process approach such as SSE-CMM adopts the same granularity level of the security engineering process across an enterprise. The
S.Y. Lee, T.-M. Chung, and M. Choi
Fig. 2. ISEM Methodology consists of four stages
product assurance approach, such as CC and ITSEC, demands a granularity level of product dependent on the product rating. ISEM, in contrast, elevates the granularity level by going through the four stages. Consequently, the granularity level of the production process increases from the 1st stage to the 4th stage. A rating in CC, TCSEC, or ITSEC can be considered analogous to a stage in ISEM. ISEM demands a different level of assurance in each stage. The reliability of the information security systems is guaranteed through assurances in the form of documents. The assurance level of ISEM is lower than that of CC and TCSEC: ISEM requires the developers and the evaluators to describe only the essential items in the documents. Not all information security systems need to be engineered from the 1st stage to the 4th stage. Depending on the reliability demanded, the engineering stages of the information security systems can be decided. Highly reliable information security systems should be engineered in all four stages, whereas less reliable information security systems can be engineered in two or three stages.
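The stage-selection rule above — more demanded reliability means more ISEM stages — can be sketched as a small lookup. This is an illustrative sketch only, not part of ISEM or SENT; the level names and thresholds are hypothetical.

```python
# Hypothetical sketch of ISEM's stage-selection rule: highly reliable systems
# go through all four stages, less reliable systems through two or three.
# The reliability level names are invented for illustration.

def stages_required(reliability: str) -> list[int]:
    """Map a demanded reliability level to the ISEM stages to perform."""
    plan = {
        "low": [1, 2],         # less reliable systems: two stages
        "medium": [1, 2, 3],   # intermediate: three stages
        "high": [1, 2, 3, 4],  # highly reliable systems: all four stages
    }
    if reliability not in plan:
        raise ValueError(f"unknown reliability level: {reliability}")
    return plan[reliability]

print(stages_required("high"))  # -> [1, 2, 3, 4]
```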
3.1 The 1st Stage in ISEM
In the 1st stage, the developer designs the information security systems at a conceptual level. As Fig. 3 shows, the conceptual design should assure the reliability of the design. The 1st stage consists of production processes, product assurances, and evaluation. The production processes which the developer should observe are as follows: 1) surveying current trends of information security systems, 2) analyzing user requirements, 3) designing security mechanisms, 4) specifying target systems, and 5) designing information security systems at a conceptual level. The developer should survey technical trends to reflect in the target systems, and analyze the users' requirements and security environments. Based on the survey and the analysis, the developer should develop cipher algorithms and security protocols, and then specify the target information security systems. The specification of the target systems includes risk analysis, user requirements, functional requirements, and assurance requirements. The developers
Fig. 3. The production processes in the 1st stage
are able to design the information security systems at a conceptual level. To manage the configuration of the 1st stage, the developer should describe the design sub-system by sub-system. For product assurance, the developer should document the activities performed in the production process. The documents which should be described are as follows: 1) the note of the technical survey, 2) the analysis of user requirements, 3) the design of security mechanisms, 4) the specification of target systems, 5) the conceptual design of information security systems, and 6) the document of configuration management. The granularity level of product assurance in the 1st stage is low, in that the 1st stage does not demand a detailed description in the analysis of user requirements, the specification of target systems, or the conceptual design of information security systems. However, the 1st stage does demand a detailed description of the security mechanism design and configuration management. After the development activity is completed, the evaluator should evaluate the observance of the production processes, the reliability of the conceptual design, and the completeness of configuration management. To decrease periods and costs, the observance of the production processes can be evaluated through checklists and interviews. In the 1st stage, the focus of evaluation lies in security mechanisms and configuration management. The reason to focus on security mechanisms is that their design is the most important process in developing highly reliable information security systems.
3.2 The 2nd Stage in ISEM
Developing and evaluating the 1st prototype happen in the 2nd stage. As Fig. 4 shows, the developer should reflect the conceptual design in the 1st prototype. After developing the 1st prototype, the evaluator should mainly validate the correctness between the assurances of the security mechanisms and the 1st prototype. In the 2nd stage, the production processes are as follows: 1) specifying the 1st prototype, 2) designing security mechanisms for the 1st prototype, 3) developing the 1st prototype, and 4) testing functions of the information security systems. The production processes in the 2nd stage mainly implement the security mechanisms of the 1st prototype. To develop the 2nd prototype, the developer should design the security mechanisms using formal methods and verify them in a mathematical way. Therefore, the vulnerability of the security mechanisms and the related functions can be verified.
Fig. 4. The production processes in the 2nd stage
To assure the 1st prototype, the developer should describe the following documents: 1) the specification of the 1st prototype, 2) the design of security mechanisms, 3) the document of functional test results, 4) the conceptual design of systems, and 5) the document of configuration management. The assurance granularity of the 2nd stage is higher than that of the 1st stage. The specification of the 1st prototype and the design of security mechanisms should be described using formal methods. In particular, the design of security mechanisms should include the results of the vulnerability test and the security test in terms of security protocol operation. The document of functional test results only includes the results of the function test. Despite the increased granularity of assurances, the documents described in the production processes remain simple compared to those of CC. In the 2nd stage, the evaluator should validate the correctness between the assurances and the 1st prototype. To validate reliability and integrity, the evaluator should verify the security mechanisms and the functional results using documents and independent tests.
3.3 The 3rd Stage in ISEM
In the 3rd stage, the 2nd prototype should be developed and evaluated. As Fig. 5 shows, the production processes of the 3rd stage are similar to those of the 2nd stage. To bring the 2nd prototype closer to the target systems, the developer should modify the security mechanisms and improve the entire functions of the 1st prototype. In the 3rd stage, the production processes are as follows: 1) specifying the 2nd prototype, 2) modifying security mechanisms for the 2nd prototype, 3) developing the 2nd prototype, and 4) testing the entire functions of the information security systems. In the 3rd stage, the security mechanisms and the entire functions should be confirmed. In some cases, the 2nd prototype could already be the target systems, and the developer could complete the 2nd prototype as the target systems. To assure the 2nd prototype, the developer should describe the following documents: 1) the specification of the 2nd prototype, 2) the design of security mechanisms, 3) the document of functional test results, 4) the detailed design of information security systems, and 5) the document of configuration management. The difference between the 2nd stage and the 3rd stage is the assurance level of the products. In the 3rd stage, the detailed specification of the target systems and the detailed design of the information security systems should be described. To increase the assurance level of the 2nd prototype, the developer should provide quantitative criteria to evaluate the correctness of the security mechanisms and the entire functions.
Fig. 5. The production processes in the 3rd stage
The evaluator should verify the completeness of the security mechanisms and the entire functions in the 2nd prototype. Although the granularity in the specification of the target systems and the design of the systems increases, the costs and periods of the production process are similar to those of the 2nd stage.
3.4 The 4th Stage in ISEM
The 3rd prototype is developed and evaluated in the 4th stage. The 3rd prototype is the target system specified in the 1st stage. As Fig. 6 shows, the production processes of the 4th stage are as follows: 1) specifying the 3rd prototype, 2) developing the 3rd prototype, 3) testing the performance of the 3rd prototype, 4) testing hardware adaptation, and 5) testing the operation of the 3rd prototype in the target environment. The production processes in the 4th stage focus on the operation of the 3rd prototype in the target environments. Based on the results of the tests, the placement of the 3rd prototype can be decided.
Fig. 6. The production processes in the 4th stage
To assure the 3rd prototype, the developer should describe the following documents: 1) the specification of the 3rd prototype, 2) the document of performance test results, 3) the document of environmental test results, 4) the document of the operation test, and 5) the document of configuration management. To assure the completeness of the 3rd prototype, the developer should provide quantitative criteria for meeting the specification of the 3rd prototype in the performance test, the operational test, and the environmental test. After testing, the developers should describe the results of the tests. In the 4th stage, the documents of the tests should be described in detail.
The evaluator should verify the consistency between the assurances and the overall tests. After the 4th stage, the overall assurances of the products are complete. All the assurances of the products have been completed in the form of documents in each stage.
4 SENT Tool

To support the ISEM methodology, SENT was developed; it consists of the Process-Supporting Systems (SYS1), the Assurance-Supporting Systems (SYS2), and the Specifying/Evaluating Systems (SYS3), as shown in Fig. 7.
Fig. 7. SENT tool
The Process-Supporting Systems (SYS1) support the developers in observing the production processes in each stage. The Assurance-Supporting Systems (SYS2) support the developers and the evaluators in describing the assurances of products. The Assurance-Supporting Systems are able to generate a predefined form in which the developers simply describe the assurances, so that the documents remain consistent through all the stages. Therefore, the description level in every document can be consistent. The Specifying/Evaluating Systems (SYS3) support the users in specifying and evaluating the prototypes and target systems in each stage. SENT operates on a web server which includes a JSP container, Java Beans, and the Xindice database.

4.1 The Process-Supporting Systems (SYS1)

The Process-Supporting Systems are operated on a central server. The users are able to upload and download the documents. As Fig. 8 shows, the documents are saved in the File Systems, and information concerning the documents is saved in the Database. The Process-Supporting Systems provide two functions. First, the rules for the production process guide the users to observe the production process. The users can perform their tasks in accordance with the production processes. The users cannot skip or omit any production process before completing the previous production processes. Second, user authentication authorizes the users, who consist of the managers, the developers, and the evaluators, to access SENT. The roles of the users differ depending on the users' tasks, so that only the manager can review all the documents, grant the general users access authorization, and post news in the systems. The general users can search and edit the documents.
Fig. 8. The structure of the Process-Supporting Systems
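The two SYS1 rules — no production process may be started before its predecessors are complete, and access rights differ by role — can be sketched as follows. This is an illustrative sketch under assumed names (ProcessSupport, the role table), not SENT's actual JSP/Java implementation.

```python
# Hypothetical sketch of SYS1's two functions: enforcing the ordering of
# production processes and role-based access. Names are illustrative.

class ProcessSupport:
    def __init__(self, processes):
        self.processes = list(processes)  # ordered production processes
        self.completed = set()

    def start(self, name):
        idx = self.processes.index(name)
        # every earlier production process must already be completed
        if any(p not in self.completed for p in self.processes[:idx]):
            raise PermissionError(f"cannot start '{name}': previous processes incomplete")
        return True

    def complete(self, name):
        self.start(name)  # a process may only be completed if it was startable
        self.completed.add(name)

# only the manager may review all documents, grant access, and post news;
# general users (developers, evaluators) may search and edit documents
ROLES = {
    "manager": {"review_all", "grant_access", "post_news", "search", "edit"},
    "developer": {"search", "edit"},
    "evaluator": {"search", "edit"},
}

def allowed(role, action):
    return action in ROLES[role]

sys1 = ProcessSupport(["survey", "analyze", "design", "specify", "conceptual_design"])
sys1.complete("survey")
sys1.complete("analyze")
print(allowed("developer", "review_all"))  # -> False
```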
4.2 The Assurance-Supporting Systems (SYS2)

The Assurance-Supporting Systems provide the following functions. First, generating a document form supports the users in editing data in a predefined format. As Fig. 9 shows, the Presentation Layer, the Document Manager, and the Database Manager generate a form which can be drawn and modified by the authorized users. The data and the form of a document are saved in the Database and the File Systems, respectively. Second, managing configuration provides two functions: categorizing documents and managing configuration documents. This function categorizes the documents by form and title and manages the history of the configuration documents in detail. The configuration management function can issue a report of document alteration, which describes the modified items of the documents, the modification time, and the modifying users. As Fig. 9 shows, the Version Manager performs the categorization of documents by title and form and saves them.
Fig. 9. The structure of the Assurance-Supporting Systems
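The alteration report described above — modified items, modification time, modifying user per change — can be sketched as a small data model. This is an illustrative sketch with assumed names (Document, AlterationRecord), not SYS2's actual implementation.

```python
# Hypothetical sketch of SYS2's configuration management: each edit to a
# document is recorded with the modified items, the modifying user, and the
# modification time, and the history can be issued as an alteration report.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class AlterationRecord:
    items: list        # modified items of the document
    user: str          # modifying user
    time: datetime     # modification time

@dataclass
class Document:
    title: str
    form: str          # predefined form the document was generated from
    history: list = field(default_factory=list)

    def alter(self, items, user):
        self.history.append(AlterationRecord(items, user, datetime.now()))

    def report(self):
        # the document alteration report, one entry per recorded change
        return [(r.user, r.items) for r in self.history]

doc = Document("Design of Security Mechanisms", "stage-1 design form")
doc.alter(["section 2 revised"], "developer-kim")
print(doc.report())  # -> [('developer-kim', ['section 2 revised'])]
```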
4.3 The Specifying/Evaluating Systems

The Specifying/Evaluating Systems provide two important functions. First, they can save templates of the specification and the evaluation report, similar to those of the Assurance-Supporting Systems. Second, describing the specification can analyze the security environments, which include assumptions, threats, and organizational security policies. As Fig. 10 shows, the Inference Engine is able to present the security environments to the users. When the users input the threats, assumptions, and security policy of an organization to the Intelligent Manager, the Intelligent Manager passes them to the Inference Engine. On receiving them, the Inference Engine infers the security environments using the Database and returns the final specification of the target systems.
Fig. 10. Specifying/Evaluating Systems
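The inference step described above — mapping user-supplied threats, assumptions, and policies to security-environment statements held in a database — can be sketched as a rule lookup. The rule contents below are invented for illustration; the actual engine works against the Xindice database.

```python
# Hypothetical sketch of SYS3's Inference Engine: look up each input
# (threat, assumption, or policy) in a rule database and assemble the
# matching security-environment statements into a specification.

RULES = {
    "threat:eavesdropping": "The environment shall protect data in transit.",
    "threat:tampering": "The system shall detect unauthorized modification.",
    "assumption:trusted_admin": "Administrators are assumed non-hostile.",
    "policy:audit": "Security-relevant events shall be audited.",
}

def infer_environment(inputs):
    """Return the security-environment statements matched by the inputs."""
    matched = [RULES[i] for i in inputs if i in RULES]
    unmatched = [i for i in inputs if i not in RULES]
    return {"environment": matched, "unresolved": unmatched}

spec = infer_environment(["threat:eavesdropping", "policy:audit", "threat:unknown"])
print(len(spec["environment"]))  # -> 2
```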
5 Conclusion

This paper suggests the ISEM methodology and the SENT tool for engineering highly reliable information security systems. ISEM has been presented to take advantage of the two contrary approaches to information security engineering. Although the product assurance approach, such as TCSEC, ITSEC, and CC, can engineer information security systems in a precise way, it incurs high costs. Although the production process approach, such as SPICE and SSE-CMM, costs less than the product assurance approach, it cannot assure information security systems as precisely as the product assurance approach can. ISEM demands that the users describe the assurances of the information security systems and observe the suggested four stages. For feasibility of the approach, ISEM adjusts the assurance level of products and production processes depending on the reliability demanded of the information security systems. The reason that ISEM is suitable for engineering highly reliable information security systems is that such systems should be developed at a high assurance level and in a cost-effective way; ISEM provides both.
SENT supports all the production processes and product assurances. SENT can be operated over the Internet, so that the users can easily access it. The Process-Supporting Systems help the users observe the four stages, and the Assurance-Supporting Systems save the users effort in describing documents. SENT has proved useful in developing highly reliable information systems. Although the production process approach and the product assurance approach were introduced for developing special-purpose information security systems, they can also be applied to commercial information security systems. Due to the reliability and effectiveness of ISEM and SENT, they are suitable for developing highly reliable systems, including cipher systems, military systems, space systems, and so on.
References

1. Software Engineering Institute, Carnegie Mellon Univ.: SSE-CMM Appraisal Method, Version 2.0 (1999)
2. Department of Defense: Trusted Computer System Evaluation Criteria, DoD 5200.28-STD (1985)
3. European Commission: Information Technology Security Evaluation Criteria (ITSEC) (1992)
4. Eloff, M., Solms, S.H.: Information Security Management: A Hierarchical Framework for Various Approaches. Computers & Security, Vol. 19 (2000) 243-256
5. Hefner, R., Monroe, W.: System Security Engineering Capability Maturity Model. Conference on Software Process Improvement (1997)
6. ISO/IEC: Common Criteria for Information Technology Security Evaluation, Part 3: Security Assurance Requirements, Version 2.1 (1999)
7. ISO/IEC: Common Methodology for Information Technology Security Evaluation, Part 2: Evaluation Methodology, Version 1.0 (1999)
8. Piazza, C., Pivato, E., Rossi, S.: CoPS - Checker of Persistent Security. In: Jensen, K., Podelski, A. (eds.): Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, Vol. 2988. Springer-Verlag, Berlin Heidelberg New York (2004) 93-107
9. Pijl, G., Swinkels, G., Verijdt, J.: ISO 9000 versus CMM: Standardization and Certification of IS Development. Information & Management, Vol. 32 (1997) 267-274
10. Qadeer, S., Rehof, J.: Context-Bounded Model Checking of Concurrent Software. In: Halbwachs, N., Zuck, L.D. (eds.): Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, Vol. 3440. Springer-Verlag, Berlin Heidelberg New York (2005) 93-107
11. Wood, C., Snow, K.: ISO 9000 and Information Security. Computers & Security, Vol. 14, No. 4 (1995) 287-288