RESEARCH METHODOLOGY FOR SOCIAL SCIENCES

Research Methodology for Social Sciences provides guidelines for designing and conducting evidence-based research in social sciences and interdisciplinary studies using both qualitative and quantitative data. Blending the particularity of different sub-disciplines and the interdisciplinary nature of social sciences, this volume:

• Provides insights on epistemological issues and deliberates on debates over qualitative research methods;
• Covers different aspects of qualitative research techniques and evidence-based research techniques, including survey design, choice of sample, construction of indices, statistical inferences and data analysis;
• Discusses concepts, techniques and tools at different stages of research, beginning with the design of field surveys to collect raw data and then analysing it using statistical and econometric methods.
With illustrations, examples and a reader-friendly approach, this volume will serve as a key reference material for compulsory research methodology courses at doctoral levels across different disciplines, such as economics, sociology, women’s studies, education, anthropology, political science, international relations, philosophy, history and business management. This volume will also be indispensable for postgraduate courses dealing with quantitative techniques and data analysis.

Rajat Acharyya is Professor of Economics at Jadavpur University, Kolkata, and Director (additional charge) of the UGC-Human Resource Development Centre, Jadavpur University, India. He was the former Dean, Faculty of Arts, Jadavpur University (2013–2016). Professor Acharyya received his MSc (Economics) degree from Calcutta University in 1990 and PhD (Economics) degree from Jadavpur University in 1996. He was a Ford Foundation postdoctoral fellow at Rochester University, New York, USA, during 1997–1998. Professor Acharyya has written five books and published more than 60 articles in journals and edited volumes. His recent books include International Trade and Economic Development (co-authored with Saibal Kar, 2014) and International Economics: Theory and Policy (2013). He was awarded the EXIM Bank International Trade Research Award in 1997, the Global Development Network (Washington, D.C.) Research Medal in 2003, the Mahalanobis Memorial Medal in 2006 and the Shikhsaratna (Best University Teacher) Award by the Government of West Bengal in 2016.

Nandan Bhattacharya is Assistant Director of the UGC-Human Resource Development Centre, Jadavpur University, India. Dr Bhattacharya received his MSc (Zoology) degree in 1992 and PhD (Zoology) degree in 2004 from Vidyasagar University, India. He has published several articles in different reputed journals and delivered lectures at different colleges and institutes of higher learning within and outside West Bengal. He has coordinated and designed course curriculums for orientation programmes, workshops and short-term courses specially conducted for college and university teachers/librarians under the UGC Guidelines. His areas of research interest include ecology, education and communication skill development, amongst many others.
Contemporary Issues in Social Science Research
Series editors: Rajat Acharyya and Nandan Bhattacharya
UGC-Human Resource Development Centre, Jadavpur University, India
Contemporary Issues in Social Science Research is a series dedicated to the advancement of academic research and practice on emerging 21st-century social and cultural themes. It explores fresh perspectives on a legion of interdisciplinary social science themes, connecting subject areas that have hitherto been unexplored, underdeveloped or overlooked. The series aims to provide scholars, researchers and students a ready reference for new and developing themes in social science academia which have come to the fore as focal points of debate and discussion today.

Research Methodology for Social Sciences
Edited by Rajat Acharyya and Nandan Bhattacharya

Peace and Conflict Studies: Theory and Practice
Edited by Shibashis Chatterjee and Anindya Jyoti Majumdar

For more information about this series, please visit www.routledge.com/Contemporary-Issues-in-Social-Science-Research/book-series/CISSC
RESEARCH METHODOLOGY FOR SOCIAL SCIENCES
Edited by Rajat Acharyya and Nandan Bhattacharya
First published 2020
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN

and by Routledge
52 Vanderbilt Avenue, New York, NY 10017

Routledge is an imprint of the Taylor & Francis Group, an informa business

© 2020 selection and editorial matter, Human Resource Development Centre, Jadavpur University; individual chapters, the contributors

The right of Rajat Acharyya and Nandan Bhattacharya to be identified as the authors of the editorial material, and of the authors for their individual chapters, has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
A catalog record has been requested for this book

ISBN: 978-1-138-39051-5 (hbk)
ISBN: 978-0-367-40984-5 (pbk)
ISBN: 978-0-367-81034-4 (ebk)

Typeset in Bembo by Apex CoVantage, LLC
CONTENTS
List of figures vii
List of tables viii
List of contributors ix
Foreword xiii
Editors’ note xiv

Introduction 1
Rajat Acharyya

PART I
Epistemological issues 7

1 Methodological or epistemological issues in social research 9
Achin Chakraborty

PART II
Debates in research methods 25

2 Towards a pragmatic centre: debates on qualitative methodology 27
Samita Sen

3 Ethnographic fieldwork: the predicaments and possibilities 44
Amites Mukhopadhyay

4 Diversity in economics: an examination and defence of heterodox approach 57
Soumik Sarkar and Anjan Chakrabarti

PART III
Methods of conflict analysis and policy evaluation 89

5 Game theory: strategy design in conflict situations 91
Swapnendu Banerjee

6 Impact evaluation: a simple need and a difficult choice of methodology 111
Arijita Dutta

7 Construction of different types of indices in social science research: some numerical examples 122
Sushil Kr. Haldar

PART IV
Quantitative research methods and predictive analysis 165

8 Designing a primary survey-based research 167
Tanmoyee Banerjee (Chatterjee)

9 Sampling methods: a survey 181
Malabika Roy

10 An introduction to statistical inference 206
Sugata Sen Roy

11 Problems of endogeneity in social science research 218
Arpita Ghose

12 Quantitative methods for qualitative variables in social science: an introduction 253
Ajitava Raychaudhuri
FIGURES
5.1 Payoffs in battle of sexes game 94
5.2 Payoffs in prisoners’ dilemma game 96
5.3 Payoffs in matching pennies game 97
5.4 Payoffs in stag hunt game 98
5.5 Dynamic battle of sexes game 100
6.1 Impact evaluation: a graphical presentation 113
6.2 Before-after comparison 114
7.1 Lorenz curve 144
7.2 Concentration curve 151
8.1 Box plot diagram 179
9.1 Different methods of sampling 188
9.2 Snowball sampling 199
11.1 Scatter diagram of quantity and price 228
11.2 Identification of demand function 230
TABLES
4.1 Snapshot view of difference in economic theories 76
7.1 Goalposts for the GDI 128
7.2 Gender-specific human development indicators 129
7.3 Income (in Rs.) earned by members per day of different social groups 131
7.4 Poverty amongst different social groups 132
7.5 Dimensions and indicators of multidimensional poverty at the household level 137
7.6 Multidimensional poverty at the household level 138
7.7 SAHS data by income 153
7.8 Computations for estimation of CI 153
7.9 Estimation of CI for grouped data 154
7.10 Parameters of human poverty across states, 2015–16 156
7.11 HPI across sixteen major states using Anand and Sen (1997) methodology 156
7.12 HPI (weighted and un-weighted) and the ranks of states 157
A.7.1 SPSS output of PCA 162
A.7.2 Factor loadings 162
9.1 Example of probability sampling 187
9.2 Example of SRSR 189
9.3 All possible samples 189
9.4 All possible samples with probability and mean 190
9.5 Samples under systematic sampling 194
9.6 Systematic sampling with non-integer N/n 195
A.9.1 Probability distribution of sample mean 203
A.9.2 Stratification 204
10.1 Some standard distributions and their parameter estimators 210
12.1 Elasticities of change in lowest and highest probabilities (the marginal effects) 266
CONTRIBUTORS
Swapnendu Banerjee is Professor of Economics, Jadavpur University, Kolkata, India. His areas of interest include microeconomic theory, game theory and economics of contracts. He completed his PhD at Jadavpur University and subsequently did postdoctoral research at the National University of Singapore (2004–2005) and the London School of Economics (2016). He has published extensively in reputed international journals and given presentations and invited lectures at places like Cornell, LSE, Birmingham, Nottingham, National University of Singapore, Indian Statistical Institute, Delhi School of Economics, IIM Kolkata, IIM Bangalore and Indira Gandhi Institute of Development Research, amongst others.

Tanmoyee Banerjee (Chatterjee) is Professor of Economics, Jadavpur University, Kolkata, India. Her areas of interest are industrial organization, microeconomic theory, microeconometrics, microfinance, financial inclusion, gender analysis and economic growth. She has undertaken various empirical research projects funded by national agencies of India, such as ICSSR and UGC. She has published extensively in reputed international journals.

Anjan Chakrabarti completed his MSc in Economics from the University of Calcutta and his PhD in Economics from the University of California, Riverside. He is currently Professor of Economics, University of Calcutta, India. His interests span Marxian theory, development economics, Indian economics, financial economics, history of economic ideas and political philosophy. He has to his credit eight books and has published over 60 articles in peer-reviewed academic journals, edited books and handbooks. His latest co-authored book is The Indian Economy in Transition: Globalization, Capitalism and Development. He has published in journals such as Cambridge Journal of Economics, Rethinking Marxism, Economic and Political Weekly, Journal of Asset Management, Collegium Anthropologicum and Critical Sociology. He is presently a member of the editorial boards of Rethinking Marxism and Marxism 21. He is the recipient of the Dr V.K.R.V. Rao Prize in Social Science Research in Economics for the year 2008.

Achin Chakraborty is Professor of Economics and currently the Director of the Institute of Development Studies Kolkata (IDSK) in India, specializing in welfare economics, development economics, political economy and methodology of social science. Chakraborty obtained his PhD in economics from the University of California at Riverside, USA. He has published widely in journals such as Economic Theory, Social Indicators Research, Journal of Quantitative Economics, Environment and Development Economics, Economics Bulletin, Economic and Political Weekly and others. He has co-edited two recently published books, The Land Question in India: State, Dispossession and Capitalist Transition (2017) and Changing Contexts and Shifting Roles of the Indian State (2019). He has also co-authored the recently published book Limits of Bargaining: Capital, Labour and the State in Contemporary India (2019).

Arijita Dutta is Professor of Economics at Calcutta University, India. An alumna of the Centre for Economic Studies and Planning, Jawaharlal Nehru University, New Delhi, and the University of Calcutta, Arijita’s research interests are health economics, quantitative development economics and econometrics. Prof. Dutta has many publications in national and international journals to her credit and has completed many research projects. She has been closely associated with health policy making of the state of West Bengal and has published policy documents on health care in West Bengal. She has taught in a number of European universities and has been a visiting researcher at UNU-MERIT, Maastricht, the Netherlands, and the University of Bonn, Germany.

Arpita Ghose is Professor of Economics, Jadavpur University, Kolkata, and the former head of the Department of Economics, Jadavpur University. She also taught at the Department of Economics of the University of Calcutta and at Presidency University as guest faculty. She is an associate editor of Trade and Development Review, a peer-reviewed e-journal published by the Department of Economics, Jadavpur University. She earned her PhD from the Indian Statistical Institute, Kolkata, and worked in the area of dynamic disequilibrium macro models. She has published her research papers in reputed international and national journals and in different edited volumes from renowned international publishers in the areas of computable general equilibrium models, productivity and efficiency analysis using the data envelopment approach, stochastic frontier models, applied time series analysis and economics of education and health. She has authored three books and edited two books with reputed publishers. She has presented research papers and chaired sessions at different international and national conferences, organized national and international seminars/conferences/workshops, completed projects funded by, amongst others, DFID-UNCTAD-Government of India (for Jadavpur University’s Trade Research Capacity Building programme), the Department of Health and Family Welfare, Government of West Bengal, UGC and ICSSR, and supervised many PhD students. Her major areas of interest include econometrics, macroeconomics and empirical models, including international trade, economics of the social sector and regional development.

Sushil Kr. Haldar has been Professor of Economics at Jadavpur University, India, since 2004. His areas of teaching and research are research methodology in social sciences and economics of social sectors, including economics of health, education, poverty and human vulnerability. He has carried out numerous projects funded by the European Union, UNDP, the University Grants Commission – New Delhi, the Centre for Advanced Studies – Economics Department and UPE – Jadavpur University. He has supervised many PhD and MPhil students and authored several research articles in national and international journals and edited volumes.

Amites Mukhopadhyay teaches Sociology at Jadavpur University, Kolkata, India. Trained as an anthropologist, he works on ecology, landscape and governance in India, coastal wetlands in particular. Apart from publications in academic journals and edited volumes, Mukhopadhyay authored Living with Disasters: Communities and Development in the Indian Sundarbans, published in 2016. His research interests include narratives of governance and development, identity politics and institutional histories, science, and knowledge making in the South Asian context.

Ajitava Raychaudhuri is Professor of Economics and former Head and former Coordinator of the Centre for Advanced Studies, Department of Economics, Jadavpur University, Kolkata, India. He held an honorary Chair Professorship of the Planning and Development Unit sponsored by NITI Aayog. He was a Senior Fulbright Scholar at the Economic Growth Center, Yale University, and Reid Hall Scholar in Residence of Columbia University, Paris Campus. He has undertaken several important projects for UNESCAP, ADB, the World Bank, UNDP, UNCTAD, PEP, GDN, SANEI, the Finance Commission, the Planning Commission etc. He has also written numerous papers in international and national journals and edited books. Amongst his notable books are India’s Exports: An Analytical Study (with Sugata Marjit) and International Trade in Services in India: Implications for Growth and Inequality in a Globalising World (with Prabir De). His latest book (co-edited with Prabir De) is Myanmar’s Integration with the World: Challenges and Policy Options. He also co-authored two District Human Development Reports, of Nadia and Purba Medinipur, published by UNDP and the Planning Commission.

Malabika Roy is Professor of Economics at Jadavpur University, Kolkata. She received her PhD from the Indian Statistical Institute, India. Prof. Roy has conducted projects funded by organizations like UGC and ICSSR and has acted as a consultant for ESCAP, UN and ILO. She has published widely in reputed journals – both national and international. Her areas of interest in research and teaching are corporate finance, functioning of financial institutions, issues related to finance and development and industrial organization, especially in the context of developing economies.

Soumik Sarkar is pursuing a PhD in economics at the University of Calcutta and working as an economist in the Indian Chamber of Commerce (ICC), Kolkata, India. His research interest is in political economy and Indian economics. He has to his credit articles on the broad areas of Marxian theory and the Indian State and is one of the translators of The Country of First Boys by Amartya Sen.

Samita Sen is Vere Harmsworth Professor of Imperial and Naval History at Cambridge University, UK. She received her PhD from Cambridge in 1992. She was the first Vice Chancellor of Diamond Harbour Women’s University (2013–2015) and Dean of the Faculty of Interdisciplinary Studies, Law and Management at Jadavpur University (2016–2017). She has taught at Calcutta University and Jadavpur University. Her monograph on women’s employment in the jute industry in colonial Bengal was published in 1999 and won the Trevor Reese Prize in Commonwealth History. She is at present working on women’s migration in relation to tea and overseas plantations. Her various research publications cover issues such as education, the women’s movement, marriage, domestic violence, domestic workers, women in governance and women’s land rights. Her recent publications include the jointly written Domestic Days (2016) and the edited volumes Passage to Bondage (2016), Intimate Others (2010) and Mapping the Field (2011).

Sugata Sen Roy did his BSc with a major in Statistics from Presidency College, Kolkata, India. He then did his MSc in Statistics and subsequently his PhD from the University of Calcutta. He joined the Department of Statistics, University of Calcutta, as a lecturer in 1989 and is currently serving as a Professor and the Head of the department. He has also been a visiting professor at Indian and foreign universities/institutes. His research interests are primarily in the areas of time series analysis, regression analysis, survival analysis, development statistics, applied multivariate analysis and functional data analysis, and he has guided research students in these areas. He has also been involved in collaborative work with faculties from other institutes and universities in India and abroad.
FOREWORD
I congratulate Prof. Rajat Acharyya and his colleague Dr Nandan Bhattacharya at the UGC-Human Resource Development Centre of our University for initiating a series on Contemporary Issues in Social Science Research based on lectures delivered by experts in teacher-training programmes, such as orientation courses, refresher courses and short-term courses. The present volume is the first in the series. It addresses the crucial issue of research methodology for social sciences. Social science by its nature is subjective; any problematic in this discipline can be viewed differently, and the more scholarly debates there are, the more enriched the discipline becomes. But such debating exercises need to be undertaken on the basis of scientific research methods. This entails constant refinement of research methodologies in social sciences, and the present collection gains a particular significance in this perspective. At the same time, the increasing adoption of interdisciplinary approaches in social science research is making irrelevant the research methods and tools of analysis defined by the traditional rigid confines of particular subjects, such as economics, political science, sociology or history. But in the name of interdisciplinarity the specific flavour of a particular branch of social science need not be undermined. In terms of social science research methodology, what is perhaps required is a fine blending of the particularity of a subject and the interdisciplinarity of the discipline. I hope in this context, too, this volume will break new ground. I understand that selections on Peace and Conflict Studies: Theory and Practice, History and Philosophy of Science, Gender Studies and Disaster Management have been planned under the present series. We eagerly look forward to their publication. On my own behalf and on behalf of the university I also sincerely thank Routledge India for collaborating with the UGC-HRDC unit of Jadavpur University to publish this collection, which should benefit both the reading public in general and social scientists in particular.

– Suranjan Das, Vice-Chancellor, Jadavpur University
EDITORS’ NOTE
A wide variety of methods and techniques are applied in social sciences to analyse social and economic phenomena. Such methods range from census survey data to the analysis of a single agent’s social behaviour, from documentation of stylized facts and case studies to rigorous statistical and empirical analyses, and from collecting data from the field to analyses of secondary data. This volume on Research Methodology for Social Sciences takes researchers and market analysts through concepts, techniques and tools at different stages of research. With epistemological issues, debates over qualitative versus evidence-based quantitative research, survey design, choice of sample, methods of data collection, construction of indices, statistical inferences and quantitative analyses of both qualitative and quantitative data put together, the volume provides a useful guide for researchers in – but not limited to – the fields of commerce, economics, sociology, political science, international relations, strategic studies and history.

The contributors to this volume are experts in their respective fields who have developed their chapters from lectures delivered at the UGC-sponsored Short Term Courses on Research Methodology organized by the Department of Economics, Jadavpur University, in collaboration with the UGC-Human Resource Development Centre, Jadavpur University, during 2014–2016. The courses were targeted at and attended by MPhil and PhD research scholars working at different universities and institutes in India. Understandably, their feedback and observations on the lectures have helped the authors in developing their lectures into chapters for this volume.

We take this opportunity to thank all the authors for their support and cooperation in undertaking this endeavour. Without their contributions it would not have been possible to bring out this volume. Comments from the anonymous external reviewers engaged by Routledge India have been extremely useful as well. We also thank Ms Shoma Choudhury, commissioning manager of Routledge India, for her continuous encouragement and advice right from the inception of the series and this volume. Excellent support from Ms Brinda Sen and other technical staff at Routledge cannot be forgotten either.

Our sincere thanks go to Prof. Suranjan Das, Vice Chancellor of Jadavpur University, since it was his idea to disseminate knowledge generated in academic programmes conducted by the UGC-Human Resource Development Centre of Jadavpur University through the publication of edited volumes like this. He has always been a source of inspiration for us in endeavours like this. We also thank Dr Pradip Kumar Ghosh, Pro Vice Chancellor, Jadavpur University; Sri Gour Krishna Pattanayak, Finance Officer, Jadavpur University; and the external members of the Local Programme Planning and Management Committee of UGC-HRDC, Jadavpur University, for all their help and encouragement. We also cannot but happily remember the support provided by all the staff at HRDC, particularly Mr Prabir Chatterjee and Ms Chaitali Mukherjee, who made the process of publication of this volume smoother. Last but not least, we thank Ms Shrimoyee Ganguly, research scholar at the Department of Economics, Jadavpur University, for her academic inputs as well as conceptualization of the cover image for the book series, and Mr Prithwish Bhowmick of Amar Ekushe for logistic support in this regard.

Rajat Acharyya
Nandan Bhattacharya
May 2019
INTRODUCTION

Rajat Acharyya
Over the years research methodology as a subject matter of study has gained growing importance not only in pure academic discourses but also for research students, as it has been enforced upon the institutes of higher learning in India by the regulatory bodies, such as the University Grants Commission, as a necessary precondition for research degrees. But this aura of regulation in research, emphasizing too much the techniques and measurement, has somewhat bypassed more fundamental questions, such as what exactly constitutes research methodology in social sciences and the debates over quantitative and qualitative methodologies. At the same time, there have been rather insufficient academic attempts to put together whatever academic and scientific discourses we have on both these dimensions and on how the methods and tools at hand can be used for research questions that cut across different disciplines and fall in wider socio-political-economic contexts. In the Indian context, this insufficiency of academic discourses becomes even more apparent. Though there has indeed been some insightful exploration of debates over research methodology in the social and cultural contexts of India, there has been no sustained engagement.

Research methodology in social sciences is not and cannot be uniquely defined. Research methodologies are as diverse as the different disciplines or paradigms in social sciences. However, while research methodologies or techniques may differ widely in respect of approach, objective, language and expression, their common trait is the intention to train the researcher how to examine whether any causal relationship exists between different events or observations; how to evaluate, interpret or predict the outcome of an event; and how to produce objective or subjective knowledge from the occurrence of events and observations of facts.

Within this broad perspective of research methodologies, there are two distinctively different but equally important approaches. One is the qualitative, which focuses on reconstruction of causal relationships or impact evaluation through logical reasoning based on different accounts of observed facts and their subjective evaluations by other researchers; the other is the quantitative, which delves into the construction of causal relationships through quantification of observations, events and their outcomes in numeric forms. Whereas the former is primarily interpretative in nature, the latter is more of a predictive kind. Part I of this volume discusses the differences and debates in these two approaches and the epistemological issues involved in research methodologies in general. Part II deliberates upon debates on research methods, and Part III discusses methods of conflict analysis and evaluation of development programmes. Part IV of the volume introduces different quantitative techniques of research and shows how those can be used in analysing data of both qualitative and quantitative nature, and research questions in larger contexts that go beyond the boundaries of specific disciplines.

In the first chapter of this volume, ‘Methodological or epistemological issues in social research’, Achin Chakraborty raises certain epistemological issues critical to the understanding of diversity in research methods in social sciences. He argues that the ‘actual practice of research in social sciences is too rich in diversity and innovativeness to be disciplined by a few prescriptive norms’. There are several distinct motivations as well which drive research inquiries in social sciences. In this context, he distinguishes between explanatory (the ‘why’ question) and descriptive (the ‘what’ question) analysis or research questions. Then there is the research question of ‘evaluation and assessment of the good and the bad’ that leads to evaluative inquiry. All these types of research inquiries that he discusses fall in the positivist paradigm.

Part II of the volume consists of three chapters. In her chapter, ‘Towards a pragmatic centre: debates on qualitative methodology’, Samita Sen emphasizes the fundamental debates on research methodology across the world, which concern fundamental questions about whether there is a need for and a role of methodology in social science research, the paradigm wars and the more prosaic debate between quantitative and qualitative methodologies, all of which are closely interconnected. She talks about interconnected paradigm wars of three kinds. First is research methodology versus anti-methodological traditional scholarship, interpretation or judgements versus evidence. The second paradigm war draws us to the qualitative versus quantitative methods and scepticism over recent attempts to combine both. The third one is the conflict between evidence-based methodologies, mixed methods, and the interpretive and critical theory schools. These conflicts have been taken as the entry point for discussion of five commonly used qualitative methods of social sciences: ethnography and narrative methods, mixed method and grounded theory, and interviews and focus group discussions in the context of feminist method.

In ‘Ethnographic fieldwork: the predicaments and possibilities’, Amites Mukhopadhyay argues how conditions of governance under colonialism led to ethnography’s interest in other cultures. He provides an account of ethnography’s emergence in the Indian context and documents the debates of the late 1980s when ethnography as a positivist practice came under the scanner. Such debates marked a critical rethinking of ethnography as an instrument of anthropological imagination.
The question of diversity in research, particularly in the discipline of economics, has been analysed by Soumik Sarkar and Anjan Chakrabarti in their chapter, ‘Diversity in economics: an examination and defence of heterodox approach’. Often a normative perspective of research may warrant a heterodox approach, but their claim is stronger: theorizing the economy cannot but be heterodox. The authors put two epistemological questions at the centre of their discussion: ‘what is theory?’ and ‘what distinguishes one theory from another?’ They argue that the contrasting forms of determinism and non-determinism and the entry point serve as the foundation of constructing different theories and of inter-theoretical comparison.

Part III of the volume focuses on theories of conflict analysis in social sciences, measuring development, and evaluation of public development programmes. In real life, we encounter conflicts at every stage of our decision making, whether the decisions are economic, political or of any other kind and in any sphere of life. Potential conflict situations arise because, as individuals or as group members (whether of a social or an economic group), we pursue our self-interests, and our aspirations, capabilities and possession of means to achieve those interests differ from each other. The conception of individuals as rational agents pursuing their self-interest is a building block of a wide spectrum of social science research. It is because our self-interests differ and conflict with those of others that we, as rational agents, behave non-cooperatively even in situations when cooperation could have been better for the aggregate. Game theory provides us a tool for choosing strategies in best pursuance of our self-interest in conflict situations and non-cooperative environments. Such tools become more engaging when we do not have sufficient or complete information regarding the environment, including the other rational agents with whom we interact. Swapnendu Banerjee, in his chapter on ‘Game theory: strategy design in conflict situations’, introduces researchers to such game theoretic tools in a wide variety of situations, such as international relations and political diplomacy, power supremacy, crime detection, ethics and morality. With a non-technical introduction to optimal choices of non-cooperative strategies by two agents in terms of well-known and generalized games, he elaborates upon their applications in specific cases like regional conflict, bribery and corruption, surrogate motherhood and auctions.
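To make this logic concrete, consider the prisoners’ dilemma (treated in Chapter 5; the payoffs below are illustrative, not necessarily those used in the book). Each entry gives (row player’s payoff, column player’s payoff):

\[
\begin{array}{c|cc}
 & \text{Cooperate} & \text{Defect} \\
\hline
\text{Cooperate} & (3,\,3) & (0,\,4) \\
\text{Defect} & (4,\,0) & (1,\,1)
\end{array}
\]

Whatever the other player does, Defect pays more (4 > 3 and 1 > 0), so two rational agents end up at (1, 1) even though mutual cooperation at (3, 3) would have been better for both – precisely the non-cooperative outcome described above.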
In social sciences one major research question is the assessment or evaluation of the benefits of a development policy, whether it is a poverty eradication programme, a health programme, a targeted education programme (such as Kanyashree Prakalpa), an employment generation programme or a targeted policy of social inclusion. Impact evaluation is a technique of evidence-based research whereby the acceptability of a development programme amongst potential beneficiaries and its long-term outcomes are measured and assessed. Arijita Dutta, in her chapter titled ‘Impact evaluation: a simple need and a difficult choice of methodology’, discusses the challenges of impact assessment, the construction of counterfactuals and associated problems, and different methodologies of impact evaluation and their uses in India, such as randomized control trials, propensity score matching and difference-in-difference methods.

Construction of indices and studying their trends is an alternative approach to evaluating and predicting outcomes of a public policy programme. Indices are useful constructions to measure multi-dimensional socio-economic and demographic characteristics of an economy and its population, such as poverty, health, education and human development, in terms of a one-dimensional numeric value. The issues assume further relevance in social sciences as we often confront both qualitative and quantitative variables. A qualitative variable is measured ordinally, and it is subject to fluctuation and different scales or degrees in perception-based studies. Given these dimensions, appropriately constructed indices reflect the relative position of an individual based on various numeric and non-numeric characteristics. In ‘Construction of different types of indices in social science research: some numerical examples’, Sushil Haldar elaborates on the construction of different indices, their theoretical underpinnings and rationale and their limitations in reflecting different dimensions of development.
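As a sketch of the standard construction behind such indices (the min–max normalization used in human development-type indices; the specific goalposts and weights in Chapter 7 may differ), each dimension is first scaled to lie between 0 and 1 using chosen ‘goalposts’ and the scaled values are then aggregated:

\[
I_j = \frac{x_j - \min_j}{\max_j - \min_j}, \qquad
\text{Composite index} = \frac{1}{k}\sum_{j=1}^{k} I_j ,
\]

where $x_j$ is the observed achievement in dimension $j$ and $\min_j$, $\max_j$ are its goalposts. Weighted or geometric aggregations are common variants of the simple average shown here.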
Chapters in Part IV of the book discuss different dimensions of evidence-based research methods and predictive analysis. Data, which may be both quantitative and qualitative, is the key element of such research; as such, at the core of evidence-based research lies the method of data collection. However, in contrast to observation as a method of data collection in qualitative research, the survey and the questionnaire are the main methods of data collection in quantitative research. Tanmoyee Banerjee (Chatterjee), in the chapter on ‘Designing a primary survey-based research’, discusses different aspects of primary surveys to collect information regarding socio-economic features of a group of individuals on which official sources of information either are insufficient or shed no light. Such aspects concern ethical issues related to primary survey-based research, a comparison of different modes of survey – such as face-to-face and telephonic interviews and email – and finally the design of the questionnaire depending upon the research question at hand.

In evidence-based quantitative research, though the target is to make predictions about a population, surveys for data collection cannot be made over the entire target population, except in the case of a census, simply because of the time, effort and money that such complete enumerations would take. But census data may not always help a researcher get his or her required information either, since these capture only decadal information and may not cover all dimensions of the socio-economic attributes of a population. Thus, surveys are conducted on a sample or sub-group of the population with some representative attributes of the entire population. The issue at hand then is how to select the sample in the best way to make a prediction or an inference about the population characteristics based on information collected on the characteristics of the sample, so as to address a set of research questions. There are different sampling techniques, and the applicability of these techniques varies with the nature of the research question at hand. Malabika Roy introduces researchers to these techniques and their applicability in her chapter titled ‘Sampling methods: a survey’. Most relevantly, in the context of this volume, she discusses non-random sampling methods, which are more useful for addressing research questions in wide areas of social sciences but are not covered adequately in standard textbooks, in addition to probability sampling, which is the most adopted method in the physical sciences and in economics.
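As a minimal sketch of the logic of probability sampling (illustrative only; the chapter itself covers many more designs, including non-random ones), the following compares a simple random sample’s estimate of a population mean with the true value:

```python
import random
import statistics

# Hypothetical population: daily incomes (in Rs.) of 10,000 individuals.
# The distribution and its parameters are made up purely for illustration.
random.seed(42)
population = [random.lognormvariate(5.5, 0.6) for _ in range(10_000)]

true_mean = statistics.mean(population)

# Simple random sampling without replacement: every subset of size n
# has the same chance of being drawn.
n = 400
sample = random.sample(population, n)
sample_mean = statistics.mean(sample)

# The standard error shrinks with sqrt(n), which is why even a modest
# sample can estimate the population mean quite precisely.
std_error = statistics.stdev(sample) / n ** 0.5

print(f"population mean: {true_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f} (s.e. ~ {std_error:.1f})")
```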
Having designed a suitable sample and collected from it all information relevant for the research question at hand, a researcher needs to draw conclusions about the population characteristics based on the sample observations. Statistical inference concerns itself with this dimension of evidence-based quantitative research. There are two dimensions: estimation and hypothesis testing. In his chapter, ‘An introduction to statistical inference’, Sugata Sen Roy elaborates upon these dimensions with the help of quite a wide variety of examples. His discussion of estimation and hypothesis testing also takes readers through some essential mathematical rigour.

In a predictive analysis, the major problem faced by a researcher is the direction of cause-and-effect relationships between events (or variables capturing such events) and the identification of such relationships from observations or data on those events. This problem is prevalent not only in economics but also in a wide variety of disciplines, such as demography, sociology, physical education and development studies. An endogeneity bias arises when we estimate an incorrectly presumed unidirectional causality. In her chapter on the ‘Problems of endogeneity in social science research’, Arpita Ghose discusses the nature, sources and consequences of such endogeneity bias. The chapter also highlights two related issues: the problem of identification, which is concerned with whether all the parameters of the system can successfully be estimated, and the endogeneity problem arising out of omitted variables and measurement errors.

In the last chapter of the volume, titled ‘Quantitative methods for qualitative variables in social science: an introduction’, Ajitava Raychaudhuri introduces the quantitative techniques for analysing qualitative data. Qualitative data may take a wide variety of forms, such as the binary form – in research questions such as the probability of survival of cancer patients, for example, the variable under consideration would be whether the patient has survived or not – or ordered variables in terms of rank. In such cases standard methods of quantitative analysis fail, and one needs special techniques to estimate and analyse statistical relationships from the qualitative data. Raychaudhuri introduces researchers to two such techniques, called logit and probit, which transform discrete binary variables into continuous variables that are amenable to statistical estimation.
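In outline (a standard formulation, not specific to the chapter), both models posit a probability for the binary outcome $y \in \{0, 1\}$ given covariates $x$:

\[
\text{logit:}\quad P(y=1\mid x) = \frac{e^{x'\beta}}{1+e^{x'\beta}}, \qquad
\text{probit:}\quad P(y=1\mid x) = \Phi(x'\beta),
\]

where $\Phi$ is the standard normal distribution function. The logit version can be inverted into the continuous log-odds, $\ln\big[P/(1-P)\big] = x'\beta$, which is the sense in which a discrete variable is transformed into a continuous one amenable to estimation.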
PART I
Epistemological issues
1 METHODOLOGICAL OR EPISTEMOLOGICAL ISSUES IN SOCIAL RESEARCH

Achin Chakraborty
1 Introduction

The courses in research methodology seem to have been driven by the widely held notion that carefully drawn methodological principles would tell us how to do research scientifically (or social-scientifically?). And once they are drawn, the next obvious step would be to appraise an actual piece of research or a research programme in terms of those principles. In other words, the logical sequence turns out to be from a set of prescriptive principles to the practice that is supposed to follow those principles. In economics, for example, philosophers of science were believed to hold the key to how to do ‘economic science’, even though several groups outside mainstream economics (e.g. Marxists, Austrians, Institutionalists) have had their shared methodological approaches. What has been common amongst the practitioners of economic research – both within and outside the mainstream – is that both sides have viewed methodology as offering a set of prescriptions on what constitutes legitimate practice. In this view, the common concern of methodological discussions is essentially normative and is based on philosophers’ attempts to justify knowledge claims.

In this chapter, we take the opposite route. We argue that there is enough evidence to show that the actual practice of research in social sciences is too rich in diversity and innovativeness to be disciplined by a few prescriptive norms. Therefore, in section 2 we begin with the invocation of the diversity of practices and a rough classification of different types of research inquiries, each of which is ostensibly driven by a different motivation. Research inquiries are not always explanatory – or as economists tend to suggest, predictive. Apart from explanation and prediction there are several other motivations that drive social research. A major area, for example, deals with the normative issues involved in assessing states of affairs or changes therein. For example, an issue like how the development of a country or a region is to be assessed is evaluative in nature. Of course, the brief account of different types of social research presented here is far from exhaustive. In section 3 we discuss how an explanatory kind of research question is dealt with within the positivist-empiricist framework. In particular, the respective roles of theory (or explanatory framework), data and method are discussed. In section 4, we raise a few issues about the normative-evaluative kind of research. In section 5, we briefly discuss the post-positivist approaches in social research, and in section 6 we conclude.
2 From practice to methodology

The commonplace view about social research is overwhelmingly explanation-oriented, where the central question is ‘why’. Why is the labour force participation rate of women low in India? Why are some states better at human development than others? Why have so many farmers committed suicide in India in the recent past? Answers to these questions take a causal form, even though the method usually deployed to establish a causal explanation can accomplish the job only imperfectly. Nevertheless, most policy discussions are based on some understanding of the causes and their effects on various outcomes. In other words, the essential nature of inquiry here is explanatory. Inquiries of this kind end up indicating or ‘establishing’ some causal connections between choices or actions of agents (individuals, groups, governments, corporations etc.) and outcomes. However, the self-conscious practitioners of statistical or econometric techniques know rather well that at best their techniques establish some association between variables, rather than a causal connection. One must take a big leap of faith to claim an associational observation between, say, x and y, as a causal one, even though certain econometric techniques, such as the Granger causality test, claim to establish causal connections between variables. Thus, we might self-consciously seek to establish some association between entities, in which case the nature of inquiry would be associational or relational, rather than explanatory.

Besides explanation and finding association, one can identify several other motivations which drive research inquiries. When a study is designed primarily to describe what is going on or what exists, without entering into the analysis of underlying relationships or causal connections that are not so apparent, it is descriptive. A question such as ‘how has the GDP of India grown in the post-reform period’ falls in this category. To answer this question one has to describe the pattern of growth in India’s GDP between, say, 1991 and the present. However, there is no such thing as ‘pure description’, as description involves conscious methodological choice.1 As in this apparently simple question, one has to decide on whether the average annual rate of growth or the trend rate of growth should be calculated, whether the period should be divided into sub-periods and the average or the trend growth rates in the sub-periods should be noted and so on.
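To see that this choice is not innocuous, here is a minimal sketch (with made-up GDP figures, purely for illustration) computing the two measures; the average annual rate and the log-linear trend rate can differ noticeably when growth is uneven:

```python
import numpy as np

# Hypothetical GDP series (index numbers), purely illustrative.
gdp = np.array([100.0, 105.0, 104.0, 112.0, 121.0, 127.0, 138.0])
years = np.arange(len(gdp))

# Average annual growth: mean of the year-on-year growth rates.
yearly_growth = gdp[1:] / gdp[:-1] - 1
avg_growth = yearly_growth.mean()

# Trend growth: slope of the regression of log(GDP) on time,
# i.e. ln(GDP_t) = a + b*t, so the trend rate is exp(b) - 1.
slope, _ = np.polyfit(years, np.log(gdp), 1)
trend_growth = np.exp(slope) - 1

print(f"average annual growth: {avg_growth:.2%}")
print(f"trend growth rate:     {trend_growth:.2%}")
```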
Different underlying motivations seem to dominate different disciplines. In economics, for instance, prediction is considered to be the most important motivation behind theoretical and empirical inquiry. In mainstream economics, the standard methodological route is to set up a model of behaviour of agents (individuals, firms etc.). Starting from a set of axioms about the behaviour of the agents, conclusions are derived using deductive mathematical logic. The methodological approach is therefore called hypothetico-deductive. Such models based on deductive logic are expected to predict future outcomes. This dominant view was made explicit by Milton Friedman in his widely known paper ‘The Methodology of Positive Economics’ (Friedman [1953]). Friedman argued that the assumptions made by economists while modelling individual behaviour should be judged ‘by seeing whether the theory works, which means whether it yields sufficiently accurate predictions’, not by the ‘realism’ of the assumptions. Amartya Sen, however, holds a different view:

Prediction is not the only exercise with which economics is concerned. Prescription has always been one of the major activities in economics, and it is natural that this should have been the case. Even the origin of the subject of political economy, of which economics is the modern version, was clearly related to the need for advice on what is to be done on economic matters. Any prescriptive activity must, of course, go well beyond pure prediction, because no prescription can be made without evaluation and an assessment of the good and the bad.

(Sen [1986, p. 3])

Thus, ‘evaluation and an assessment of the good and the bad’ gives rise to yet another, altogether different kind of inquiry, which is evaluative. For an evaluative inquiry one applies certain normative criteria to judge states of affairs. For example, a question such as ‘Is gender inequality more in country A than in country B?’ apparently falls in the descriptive category. But on closer scrutiny, it becomes clear that there is no obvious way of assessing gender inequality with a comparative perspective. Even if one restricts oneself to this question, ignoring such related questions as why gender inequality is more in one country than in another, it turns out to be non-trivial, as explicit value judgements with moral philosophic underpinnings are deeply involved. Amartya Sen often makes a distinction between evaluative2 exercises and descriptive-analytic or predictive-prescriptive exercises, as in the earlier quotation, and emphatically points out that the motivation behind the evaluative type of inquiry is no less important than that behind others. The entire theoretical literature on the measurement of inequality, poverty and human development falls in this category.

All these types of research inquiries briefly described here roughly fall in the paradigm which can be called positivist. In the next section we elaborate on the notion of paradigm and the epistemology of positivism.
3 Positivist-empiricist practice

There is no simple formula to establish any connection between specific ‘causes’ and ‘effects’. Three basic ingredients of social research are (1) some ideas about how things are or how change takes place, (2) data or observations on ‘facts’ and (3) methods that integrate ideas and observations. By method we mean a set of tools or techniques, informed by an approach, which is applied in a research inquiry. Methodology, in contrast, is concerned with the framework within which particular methods are appraised. In other words, methodology deals with the broader question of ‘how we know what we know’ and is somewhat close in meaning to what we understand as epistemology.

Ideas are obtained from various theories. They may often look like commonsense. But if they are part of a theoretical framework one can expect logical coherence in the ideas, which commonsense does not guarantee. What is theory? Before we come up with an imprecise answer to this question, it would be helpful if we accept that theory can be defined only within a paradigm. Roughly speaking, a paradigm is a combination of a set of underlying beliefs about the way things are and specific ways of inquiring about how things are, how they change, how they are connected with or influenced by each other and so on. In other words, a paradigm can be identified with specific ontological and epistemological positions. For many of us who work in what is loosely called ‘development research’, a kind of positivism seems to be the underlying paradigm. In this version of positivism the core belief is that reality is out there and that by gathering ‘facts’ it is possible to find out what is happening in reality. The researcher is assumed to stand apart from the observed and produce objective knowledge. How does he or she go about it? First, the researcher identifies separate aspects of reality and expresses them as ‘variables’. Then he or she goes on to examine the relationships between variables. This involves both observation and reasoning based on arguments acceptable within the paradigm. Within the positivist paradigm, a theory is expected to answer our ‘how’ and ‘why’ questions in the most generalized way with a coherent logical structure. Generalizability is at the core of theoretical statements.

Someone with an empirical bent of mind and relying less on theory for illumination often tends to say ‘facts speak for themselves’. As a matter of fact, facts hardly speak for themselves. One has to sort out relevant from irrelevant facts at the outset. Without some prior idea about the nature of the phenomena, without some propositions, assumptions etc., there is no way this can be meaningfully done, according to a positivist. Deciding that observation X or Y is relevant marks the start of a theory. In this paradigm, theory means a logically valid chain of reasoning starting from certain premises called postulates. Postulates are taken as axiomatically given and contain certain terms that are representative of persons, organizations, things, actions, states etc. found in the world of experience. A meaningful analysis presupposes that the terms are unambiguously defined.

In this positivist-empiricist paradigm hypothesis testing seems to take pride of place. It is a commonly held view that any proposed research in social science must specify at the outset the hypotheses to be tested. Admittedly, certain types of social research do require the use of hypotheses. They can be useful in helping to find answers to ‘why’ questions and therefore are developed at the outset to set the direction. However, precise specification of the hypotheses is neither necessary nor appropriate in many cases.
In particular, when explanation is expected to come out in the form of a complex web of interconnections and mutual influences, a cut-and-dried kind of hypothesis testing may not give a better insight into a phenomenon. Hypotheses should ideally be derived from a theory of some kind. Hypotheses that are simply based on common sense or intuition, without making any reference to the existing state of knowledge, rarely make significant contributions to the development of knowledge.

Most mainstream economists believe that their methodology is positivist. The philosopher who has had the greatest influence on the methodology of economics is Karl Popper, as is evident from the frequent invocation of Popper by economic methodologists such as Mark Blaug (1992). Popper’s philosophy even influenced a major introductory textbook – Richard Lipsey’s An Introduction to Positive Economics. Popper’s philosophy of ‘scientific knowledge’ is concerned with what he calls ‘the problem of demarcation’, i.e. the problem of distinguishing science from non-science. Popper introduces falsification as the criterion to be applied for demarcation. A statement is in principle falsifiable if it is logically inconsistent with some finite set of true or false observation reports. Popper himself gives an example of a scientific statement: ‘All swans are white’. This is a falsifiable statement since the observation of a non-white swan would establish its falsity. There is an asymmetry between verifiability and falsifiability. A universal statement concerning an unbounded domain, such as ‘all swans are white’, may be falsifiable but not verifiable. For example, the observation ‘this swan is black’ falsifies the statement ‘all swans are white’. To verify the statement, we need to observe all possible swans, but the set of all swans is unbounded as it includes, inter alia, swans yet to be born. In other words, it is not possible to verify any truly universal statement, but one can falsify it or verify its negation (Hausman [1992]). Ironically, economists do not seem to practice what they preach. Hausman criticizes ‘the methodological schizophrenia that is characteristic of contemporary economics, whereby methodological doctrine and practice regularly contradict one another’. We take a different standpoint here, one which fundamentally disagrees with the view that economists should stick to falsificationism as the only criterion to assess knowledge claims in economics. In what follows, we present a glimpse of the variety of theoretical ideas and the corresponding methodological approaches that scholars have actually adopted in social sciences in general and in certain branches of economics in particular.
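The asymmetry Popper exploits can be stated compactly (a standard first-order rendering, added here only for clarity):

\[
\underbrace{\forall x\,\big(S(x) \rightarrow W(x)\big)}_{\text{`all swans are white'}}
\quad\text{is refuted by a single } x_0 \text{ with } S(x_0) \wedge \neg W(x_0),
\]

whereas verifying it would require inspecting every member of the unbounded domain of swans. Conversely, the negation $\exists x\,\big(S(x) \wedge \neg W(x)\big)$ is verifiable by a single observation but cannot be conclusively falsified by any finite set of observations.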
4 Explanatory framework and varieties of theory: structure versus agency

A broad classification of theoretical approaches in social sciences could be made in terms of the basic unit of analysis. Traditionally, the dominant view in social science happened to be that of understanding the functioning of the social system. From Adam Smith’s An Inquiry into the Nature and Causes of the Wealth of Nations to Max Weber’s The Protestant Ethic and the Spirit of Capitalism, most of the classics in social sciences focused on some kind of social (or economic) system. However, much of contemporary social research focuses on explaining individual behaviour. With the development of quantitative methods of research, dependence on individual-level data has increased significantly. If one still feels that the functioning of the system should remain the central problem for research inquiry, how does one go about explaining the system? From both ontological and epistemological points of view there can be two broad ways of explaining the system – systemic (or structuralist) and agency-based3. A structuralist mode of explanation generally rejects the view that the social, economic or political structure can be explained entirely as the aggregate of the actions of individual agents. The most well-known statement representative of an extreme form of structuralist explanation is perhaps contained in Marx’s preface to A Contribution to the Critique of Political Economy.

In the social production of their life, men enter into definite relations that are indispensable and independent of their will, relations of production which correspond to a definite stage of development of their material productive forces. The sum total of these relations of production constitutes the economic structure of society, the real foundation, on which rises a legal and political superstructure and to which correspond definite forms of social consciousness. The mode of production of material life conditions the social, political and intellectual life process in general. It is not the consciousness of men that determines their being, but, on the contrary, their social being that determines their consciousness.

(Marx [1859, 1977])

Here individuals seem to have no ‘agency’ role to play; relations of production corresponding to a definite stage of development of material productive forces determine what they do. Marx continues in the preface to further suggest that it is the interplay of the forces and relations of production that brings about social transformation – transition from one mode of production to another. In a similar vein, more recently, Terry Byres (1997) begins an article on India’s development planning as follows:

This is an essay on development planning in post-1947 India, written from a Marxist political economy perspective. It is not an essay about the ‘failure of planning in India’. Nor is its theme Indian planning’s achievements. It is rather a consideration of those contradictions that were inherent in Indian development planning at its inception, which centred on the state and class: contradictions which deepened as planning proceeded; and which continue to demand attention despite planning’s apparent demise.

A full assessment of planning in India, Byres demands, must be a historical exercise – in an analytical political economy sense, not necessarily in a chronological sense. In the analytical political economy sense it must be historical, according to him,
in three broad ways: (1) planning must be located with respect to the ‘laws of motion’ or ‘tendencies’ which mediate the Indian social formation, and their attendant contradictions, especially as these relate to accumulation and the sources of accumulation; (2) the instrumentality of planning must be judged in relation to the possibility of transition from economic backwardness to a dominant capitalist mode of production; and (3) it must adopt a long time horizon, sufficient to talk about the structural, the epochal, the trend – not the ‘moment’ or the conjunctural. The methodological agenda that Byres charts out is clearly very different from the one we introduced at the beginning and does not fall into the familiar theory-method-data kind of positivist-empiricist category. To take another example roughly in the same tradition as that of Byres, the key analytical concept that runs through the book by Chandrasekhar and Ghosh (2002) is ‘contradiction’, which has been extensively deployed to analyse the evolution of the Indian economy from the planned phase through ‘neoliberal reform’, using plenty of empirical material. Three mutually reinforcing and interrelated contradictions, which arose out of the various roles that the state had to play, led to the development impasse of the late 1960s and the 1970s, according to Chandrasekhar and Ghosh. The chapter on ‘the political economy of reform’ says in a nutshell that the ‘internal contradictions’ of the earlier policy regime generated increasing support within the powerful and affluent sections of society for changing this regime in the manner desired by the International Monetary Fund and the World Bank. An empirically oriented economist is likely to find this series of contradictions unpersuasive since it cannot be demonstrated easily by appealing to relevant empirical evidence that it is indeed ‘internal contradictions’ that explain the entire evolution of the Indian economy in the past decades. No amount of data is enough to refute the hypothesis of internal contradiction. Chandrasekhar and Ghosh have used a wide range of macroeconomic data to present an excellent narrative of the experience of economic reforms in India. But to what extent their data support the ubiquity of ‘internal contradictions’ remains a question. If one goes through the huge literature on agrarian reform, one is struck by the variety of methodological and epistemological perspectives – mostly implicit but occasionally explicitly articulated – that the scholars have taken. Sometimes they trigger curious sorts of exchanges in seminars and conferences. The paper by Griffin et al. (2002) presented at the International Conference on Agrarian Relations and Rural Development in Less-Developed Countries (Kolkata) was labelled as ‘neoclassical’ by several commentators, to which Griffin’s reply was as follows:

Several participants have described our analysis as ‘neoclassical’. . . . While labels do not matter terribly, it is slightly puzzling why anyone would think our analysis is neoclassical, given the emphasis we place on non-market clearing, uncompetitive behaviour, multifaceted labour market discrimination, organized interest groups of landowners, the exercise of political power and so on. The American variety of neoclassical economist would disown us!
Clearly, between Byres’s kind of pure Marxist political economy perspective and what Griffin’s commentators call ‘neoclassical’ there could be a variety of intermediate perspectives. It appears that, while Byres’s perspective is less likely to be contaminated by other perspectives, many would not mind going a step down the system level for more illumination. One significant attempt to capture long-term development in the capitalist world system in systemic terms, needless to say, is Amiya Kumar Bagchi’s Perilous Passage (Bagchi [2005]). He traces out when and how the divergence between countries occurred. Instead of focusing exclusively on divergences in the economic prosperity of nations, he looks at the living standard of people from the human development perspective. With the help of demographic and anthropometric data, he shows that divergence did take place from the Industrial Revolution onwards, although it is not a case of steady divergence throughout the period to date. From the methodological point of view the most significant aspect of the book is its narrative style, eschewing both the rigid form of determinism that is characteristic of Marx’s preface and the empiricism that comes out of positivist thinking. Even though the underlying theme is the ‘capitalist world system’, his narrative style seems to be consistent with any perspective that takes as an analytical entry point any level below the capitalist world system (for example, the evolution of institutions or the behaviour of corporations) (Chakraborty [2008]). In sum, what we have tried to illustrate here is, first, the variety of practices which do not subscribe to the standard positivist-empiricist method that combines theory, data and hypothesis-testing; and, second, that it is impossible to put different theoretical approaches into water-tight paradigmatic compartments.
5 Contested terrain in evaluative research: composite index as an illustrative example

The process of development is essentially qualitative in nature. However, in order to know how a country or a sub-region in a country is doing vis-à-vis others we need to identify certain quantifiable aspects of development. Whereas the expansion of a country’s productive capacity may be considered a necessary condition for development in the long run, it is not sufficient to ensure expansion of the real freedom of people to do what they have reason to value. Development is conceptualized as freedom, and freedom is multi-dimensional. Therefore, any measure of development has to be multi-dimensional. Each dimension is quantitatively represented by an indicator, and the indicators are often put together through certain aggregators to construct a composite index of development. Underlying any composite measure is some assumption regarding how various dimensions of development (or, interchangeably, ‘well-being’, ‘quality of life’, ‘human development’) are related to the overall index. There are mainly two ways of establishing such a relationship. One is the so-called data-driven method to derive a set of weights. Starting from observed data on the variables to be included in the measure of well-being, a principal component analysis is carried out, and the first principal
component, if it explains a significant proportion of the total variability of the variables, is taken to be a composite index of well-being, since the first principal component is a linear combination of the original variables. This method does not define any well-being function explicitly. The alternative approach, what we may call the ‘ethical approach’, starts from an explicit well-being function. We discuss a few plausible ethical positions that would give rise to different composite indices. Since the most widely known composite index is the Human Development Index (HDI), popularized by the United Nations Development Programme (UNDP), many of the issues that we discuss naturally draw heavily on the conceptual background of the HDI and the lessons one learns from its evolution and application over the past quarter of a century. Essentially there are three steps in the construction of a multi-dimensional measure of development. We briefly describe the steps here and illustrate them with the example of the HDI.
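Before turning to those steps, the data-driven (principal component) route described above can be illustrated with a minimal sketch. The indicator matrix here is synthetic and purely hypothetical, and scikit-learn is assumed to be available:

```python
# A minimal sketch of the data-driven (PCA) route to a composite index.
# The data are synthetic; in practice rows would be countries or districts
# and columns standardized well-being indicators.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))               # 30 units, 4 well-being indicators
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize before PCA

pca = PCA(n_components=1)
index = pca.fit_transform(X).ravel()       # first principal component scores

# The first component serves as a composite index only if it explains a
# significant proportion of the total variability, as the text cautions.
print("share of variance explained:", pca.explained_variance_ratio_[0])
print("implicit weights on the indicators:", pca.components_[0])
```

Note that the weights here are whatever the data deliver; no explicit well-being function is defined, which is precisely the contrast with the ethical approach.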
Identification

First, we need to identify the dimensions that we propose to include in the multidimensional measure. The dimensions are selected on the basis of the conceptual framework that underlies the measure. The HDI, for example, measures human development, which is conceptualized as expansion of people’s choices. Following this conceptualization, UNDP settles on three dimensions of human development, namely a long and healthy life, access to knowledge and a decent standard of living. The corresponding indicators are life expectancy at birth, a combination of mean years of schooling and expected years of schooling, and Gross National Income (GNI) per capita in purchasing power parity (PPP) dollars. It is obvious that the conceptual richness of human development cannot be fully captured by these three dimensions. As a matter of fact, the chief architects of the HDI, Amartya Sen and Mahbub ul Haq, have both written explicitly about the crudeness of the index. There are many variables of relevance to human development that are not included in the HDI, such as civil and political rights, nutritional status, autonomy, mobility, freedom from crime and violence and so on. Some of them could be included. But the basic idea was to keep the index simple and manageable. One must acknowledge the dilemma at this point. Should we aspire to a measure that is ‘complete’, scientifically perfect, logically correct and so on? Or should we try to find a measure that may not be perfect but is effective for advocacy and policymaking? The HDI is believed to strike a balance between the two.4 The income component of the HDI has been criticized on the following ground. The conceptual foundation of the HDI is based on the concepts of capabilities and functionings. A functioning refers to the state of being of a person or how the person is doing. In this sense, income is not a functioning. It may be instrumental in achieving some functioning, but it is the functioning itself that a person values or has reason to value. Income has only ‘derivative importance’ rather than any intrinsic importance. The rationale for including income, however, is that it acts as a close, feasible proxy for all choices other than those captured by the longevity and knowledge
indicators. The HDI is also criticized for mixing stock variables with flow variables. Life expectancy or expected years of schooling, for example, are stock variables in the sense that they refer to some point in time, not a duration of time. We do not say ‘life expectancy per year or per month’. But income is a flow variable. It refers to a duration, as income is expressed as ‘income per year’ or ‘per month’, for example.
Valuation

Once the dimensions are settled on and the component indicators are identified, the next question is whether the indicator values reflect our ethical intuition about the relative worth of different values. For example, if we interpret income as an indicator of the real standard of living, should we consider a country with a per capita income of 80,000 dollars as offering 80 times the living standard offered by another country whose per capita income is 1,000 dollars? We may not. It is a value judgement based on our understanding of the ethical worth of certain indicator values. The HDI treats the income component differently from the other components. Instead of taking real per capita income at its face value, it takes the logarithm of per capita income. This is a valuational exercise. In other words, the authors of the Human Development Report think that as a country’s per capita income rises, it improves people’s living standard, but at a diminishing rate. Without this logarithmic transformation, for high-income countries the value of the income component would be so large that the value of the composite index would be largely dominated by the income component, which would go against the very purpose of constructing an alternative index.
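A minimal sketch of this valuational step follows. The goalposts of 100 and 75,000 PPP dollars follow recent UNDP technical notes but should be treated here as illustrative assumptions, as should the function name:

```python
# A sketch of HDI-style income valuation: the log transform builds in a
# diminishing marginal contribution of income to the living standard.
# The goalposts (100 and 75,000 PPP dollars) are taken as illustrative.
import math

def income_index(gni_pc, lo=100.0, hi=75_000.0):
    """Map per capita income onto [0, 1] on a logarithmic scale."""
    raw = (math.log(gni_pc) - math.log(lo)) / (math.log(hi) - math.log(lo))
    return min(1.0, max(0.0, raw))

# An 80-fold income gap translates into far less than an 80-fold gap in
# the valued standard of living:
print(income_index(1_000))    # ~0.35
print(income_index(80_000))   # 1.0 (capped at the upper goalpost)
```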
Aggregation

The final step is aggregation. The multiple dimensions produce a vector of numbers. We cannot say whether country A ranks higher than country B in terms of human development achievement if, for example, the first indicator is higher in A than in B and the value of the second indicator is higher in B than in A. We can compare A with B only in the less likely situation where all the indicators in A show higher values than the corresponding values in B. This is the case of ‘vector dominance’. But in most cases of actual comparison we do not find vector dominance. Therefore, we need to combine the component indicators to construct a scalar number, which is called the composite index. The two broad types of aggregators are cardinal and ordinal. The HDI, for example, follows a cardinal aggregation procedure in which the geometric mean of the component indicators (after they are normalized) is taken as the composite index. An example of ordinal aggregation is the method proposed by Borda in the context of voting.5 From 2010, UNDP radically changed its method of aggregation – from the arithmetic mean to the geometric mean. The HDI first transforms the indicators into unit-free numbers and then takes the geometric mean of the three transformed variables (United Nations Development Programme, 2016). One implication of the arithmetic mean is that the three components are treated as perfect substitutes. In other
words, low longevity and high per capita income are considered as good as high longevity and low per capita income. Low achievement in one dimension is assumed to be perfectly compensated by high achievement in another dimension. Although UNDP had been using the arithmetic mean for almost two decades on the ground of ease of comprehension by policy makers, it was always felt that perfect substitutability between dimensions was not a very reasonable assumption – hence the realization that the geometric mean would make better sense. The geometric mean has the following implication: if the indicator that has the lowest value improves by one point, its contribution to the composite index is greater than that of the same one-point improvement in an indicator that has a higher value. In any composite index, we should know what weights are attached to the component dimensions to judge properly whether it has got the balance right. The weight on any given dimension can be defined as the index’s first partial derivative (slope) with respect to that dimension. Since the component dimensions have been rescaled so that they lie in the 0–1 interval, what really matters is the relative weights of the component dimensions. In other words, we need to know the assumed trade-offs, as given by the HDI’s marginal rate of substitution (MRS), i.e. how much of one desired component of the HDI must be given up for an extra unit of another component, keeping the overall index constant. If a policy or economic change entails that one of the positively valued dimensions increases at the expense of another dimension, it is the MRS that tells us whether human development is deemed to have risen or fallen. Martin Ravallion (2010) has pointed out that the implicit trade-offs often go contrary to what our value judgement would suggest. On a priori grounds it is unclear what effect relaxing perfect substitutability between the transformed indicators would have on the trade-offs in the core dimensions. Whether the MRS increases or decreases essentially depends on the data. The HDI’s new aggregation method based on the geometric mean hides partial success amongst countries doing poorly in just one dimension. As a dimension approaches its minimum value, the HDI approaches zero no matter what values are taken by the other dimensions. Consider, for example, Zimbabwe, which had the lowest 2010 HDI, 0.14 – the lowest by far, at about 60 per cent of the next lowest. Yet this is due to one component that scored very low at the time, namely income. Zimbabwe’s income index value of 0.01 is the lowest of any country by a wide margin (60 per cent of the next lowest value). However, the schooling index value is 0.52 and the longevity index value is 0.43 – both well above the bottom. Indeed, there are 56 countries with a lower schooling index than Zimbabwe’s, yet this relative success is hidden by the HDI’s new aggregation formula, given its multiplicative form. Using the arithmetic mean instead (with other data unchanged), Zimbabwe still has a low HDI, but it ranks higher than six countries.
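The Zimbabwe example can be checked with a minimal sketch using the component values quoted above; for the geometric mean, the implied MRS between any two dimensions works out to the ratio of their current values, so the trade-offs do indeed shift with the data:

```python
# A check of the aggregation argument, using the component index values
# quoted in the text for Zimbabwe in 2010: income 0.01, schooling 0.52,
# longevity 0.43.
income, schooling, longevity = 0.01, 0.52, 0.43

geometric = (income * schooling * longevity) ** (1 / 3)
arithmetic = (income + schooling + longevity) / 3

print(f"geometric mean HDI:  {geometric:.2f}")   # 0.13, close to the reported 0.14
print(f"arithmetic mean HDI: {arithmetic:.2f}")  # 0.32: schooling and longevity show through

# For HDI = (x1 * x2 * x3) ** (1/3), the partial derivative with respect to
# x_i is HDI / (3 * x_i), so the MRS between dimensions i and j is x_j / x_i:
# the implicit trade-off is data-dependent, as Ravallion's critique stresses.
```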
Reckoning change

The following figures are taken from the India Human Development Report 2011 presented by the Planning Commission of India. Between 1999 and 2007, the HDI
value in Bihar increased by 0.075 from 0.292, and in Kerala it increased by 0.113 starting from 0.677. Clearly Kerala made the greater absolute improvement. However, if we compare the percentage changes in the HDI in the two states, and with the national average as well, as the report did, the increase in Bihar (25.7 per cent) turns out to be greater than the increase in the national average (20.7 per cent). On the other hand, Kerala’s HDI improved by 16.7 per cent, which was below the national average of 20.7 per cent (the arithmetic is verified in the sketch at the end of this section). What kind of meaning can we attach to these percentages? Do they mean that Bihar’s improvement is more valuable than Kerala’s? Do they mean that Bihar has performed better than Kerala? The two questions are distinctly different, and they direct us to two very different ways of measuring improvement. If one holds the view that the marginal value of human development diminishes as human development improves, any improvement from a lower level is considered to be more valuable than a similar improvement from a higher level. In this view Bihar’s improvement would be judged more valuable than Kerala’s. However, if one is to judge how commendable the performances of the two states are, it is rather unlikely that a reasonable person would accept the view that Bihar had performed better than Kerala. Kerala clearly made more commendable progress as it managed to improve its HDI by 0.113 from an already high 0.677, whereas Bihar’s HDI increased by only 0.075 starting from a low 0.292. Moreover, in this example, the gap between the HDI values in the two states in fact increased rather than decreased. Therefore, just by looking at the relatively higher percentage changes in the HDI values in low-HDI states one cannot logically conclude that the low-HDI states made better progress than others and that ‘there is convergence taking place in HDI across states’.6 What we have presented here shows the contested nature of evaluative research. Each step of constructing a composite index of development is fraught with a range of conceptual and technical issues, and there is no magic formula that would resolve the ‘problems’. However, there is nothing defeatist about the multiplicity of possibilities and not having any way to privilege one over the others. In certain social science disciplines, such as economics, pluralism as a value position does not seem to be the most sought after.
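As flagged above, a quick arithmetic check of the Bihar–Kerala comparison, using the values quoted from the India Human Development Report 2011:

```python
# Verifying the percentage changes and the widening absolute gap in the
# Bihar-Kerala comparison discussed above.
bihar_start, bihar_gain = 0.292, 0.075
kerala_start, kerala_gain = 0.677, 0.113

print(f"Bihar:  {100 * bihar_gain / bihar_start:.1f}% increase")    # 25.7%
print(f"Kerala: {100 * kerala_gain / kerala_start:.1f}% increase")  # 16.7%

# Despite Bihar's larger percentage change, the absolute gap widens,
# which is why percentage changes alone cannot establish convergence.
gap_before = kerala_start - bihar_start
gap_after = (kerala_start + kerala_gain) - (bihar_start + bihar_gain)
print(f"gap before: {gap_before:.3f}, gap after: {gap_after:.3f}")  # 0.385 -> 0.423
```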
6 Pluralism and post-positivist approaches

Till this point we have taken the position that there is an underlying reality about which research can find out more and more. With effort and technical know-how we can achieve greater precision in our understanding. However, an alternative belief could be that different accounts or pictures of reality are simultaneously possible, based on different perspectives and interests. ‘Qualitative’ research methods are usually associated with this constructivist or interpretive view. The researcher tries to bring out and record different accounts from different viewpoints and then construct what appears to be a complex story. But the problem with this view is that one does not have a sure way of deciding which of several possible stories is the best.
In some areas of social research, the qualitative-quantitative distinction has led to protracted arguments with the proponents of each arguing the superiority of their kind of method over the other. The quantitative side argues that it is ‘rigorous’, ‘hard’, ‘scientific’ and so on. The qualitative approach, as claimed by its proponents, is ‘superior’ because it is ‘sensitive’, ‘nuanced’, ‘contextual’ and so on. Both qualitative and quantitative research rest on rich and varied traditions that come from multiple disciplines, and both have been employed to address almost any research topic one can think of. There is no reason to give primacy to one over the other. Different methods are required to address different problems, and a combination of techniques may yield greater insight than either one of them used in isolation. However, it must not be taken for granted that a combination of qualitative and quantitative methods should always be considered superior to either of them. Two things have to be separated in this context. As far as types of data are concerned, there is little difference between qualitative and quantitative data. All qualitative data can be coded and expressed in numeric form. Even if one retains qualitative data in non-numeric form and uses it for additional insight besides what is known from quantitative information, the methodological approach is still positivist-empiricist.7 The difference perhaps lies in the assumptions about reality and about the way one should acquire knowledge about reality. For instance, many researchers who follow the qualitative approach believe that the best way to understand any phenomenon is to view it in its context. Some in the qualitative tradition believe that the researcher cannot stand apart from the process and produce objective knowledge. Thus, the two approaches clearly differ in terms of ontological assumptions and epistemological positions. Ignoring the deeper issues of this kind, a blanket advocacy of the so-called ‘mixed method’ (apparently some ad hoc combination of qualitative and quantitative methods) smacks of superficiality. A wide range of scholars have advocated ‘methodological pluralism’ ever since Caldwell (1982) made a compelling case for change in the thinking about economic methodology. We believe the primary purpose of our methodological discussion is to enhance our understanding of what social science is all about and, by so doing, to improve it. The critical appraisal of methodological approaches plays an essential role in methodological pluralism. Criticism is not to be undertaken for the purpose of correcting the deviations from the so-called ideal, for the ideal in the absolute sense does not exist at all. Appraisal of scholarly practice is therefore a complex process of questioning and interpreting that cannot be based on any general so-called ‘scientific method’.8
7 Conclusion

In this chapter, we have argued that research methodology in social science must not be viewed as prescribing a set of rules about how to do social research. We have illustrated with examples from the literature the variety of practices which can hardly be seen as following a narrow set of normative criteria. Judging by the impact that various scholars have in the research community with their research
output, it seems that methodological appraisal as a scholarly enterprise can meaningfully contribute to social research if it draws on the richness of actual practice and on how scholars deploy their methodological strategies to persuade the members of their community. Moving away from a meta-theoretic perspective such as positivism is likely to facilitate better communication amongst practitioners in social sciences.
Notes
1 Sen (1980) explains why a ‘good’ description may not be a precisely true description.
2 This is not to be confused with programme evaluation or impact assessment.
3 For a comprehensive and lucid discussion on this see Hollis (2000).
4 Some of these issues and others are discussed in Chakraborty (2002).
5 For an application of the Borda method of ordinal aggregation see Chakraborty and Mishra (2003).
6 For further details see Chakraborty (2011).
7 This point was discussed in Chakraborty (1996).
8 Such an attempt was made in Chakraborty (2005).
References
Bagchi, A. K. 2005. Perilous Passage: Mankind and the Global Ascendancy of Capital. Rowman & Littlefield: Lanham.
Blaug, M. 1992. The Methodology of Economics: Or How Economists Explain, 2nd Edition. Cambridge University Press: Cambridge.
Byres, T. 1997. “State, Class and Development Planning in India”. In: T. Byres, ed., The State, Development Planning and Liberalization in India. Oxford University Press: New Delhi.
Caldwell, B. 1982. Beyond Positivism: Economic Methodology in the Twentieth Century. Allen & Unwin: London.
Chakraborty, A. 1996. “Quantitative, Qualitative, and Post-Modern Epistemology”. Economic and Political Weekly, Vol. 31(49), 132.
Chakraborty, A. 2002. “Issues in Social Indicators, Composite Indices and Inequality”. Economic and Political Weekly, Vol. 37(13), 1199–1202.
Chakraborty, A. 2005. “The Rhetoric of Disagreement in Reform Debates”. In: J. Mooij, ed., The Politics of Economic Reforms in India. Sage Publications: New Delhi.
Chakraborty, A. 2008. “In Search of Constitutive Plurality in Development Discourse”. Contemporary Issues and Ideas in Social Sciences, Vol. 4(2), 1–16.
Chakraborty, A. 2011. “Human Development: How Not to Interpret Change”. Economic and Political Weekly, December 17.
Chakraborty, A. and U. S. Mishra. 2003. “Making Inter-Country Comparison of Life Expectancy Inequality Sensitive”. Social Indicators Research, Vol. 64.
Chandrasekhar, C. P. and J. Ghosh. 2002. The Market That Failed: A Decade of Neoliberal Economic Reforms in India. LeftWord: New Delhi.
Friedman, M. 1953. Essays in Positive Economics. University of Chicago Press: Chicago.
Griffin, K., A. R. Khan, and A. Ickowitz. 2002. “Poverty and Distribution of Land”. In: V. K. Ramachandran and M. Swaminathan, eds., Agrarian Studies: Essays on Agrarian Relations in Less Developed Countries. Tulika: New Delhi.
Hausman, D. M. 1992. The Inexact and Separate Science of Economics. Cambridge University Press: Cambridge.
Hollis, M. 2000. The Philosophy of Social Science. Cambridge University Press: Cambridge. Published in India by Foundation Books: New Delhi.
Marx, K. 1859, 1977. Preface to A Contribution to the Critique of Political Economy. Progress Publishers: Moscow.
Ravallion, M. 2010. “Troubling Tradeoffs in the Human Development Index”. Policy Research Working Paper 5484. World Bank.
Sen, A. K. 1980. “Description as Choice”. Oxford Economic Papers, Vol. 32(3), 353–369.
Sen, A. K. 1986. “Prediction and Economic Theory”. Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences, Vol. 407(1832).
United Nations Development Programme. 2016. Human Development Report 2016 (Technical Note).
PART II
Debates in research methods
2 TOWARDS A PRAGMATIC CENTRE
Debates on qualitative methodology
Samita Sen
1 Introduction

At the turn of the 21st century, research methodology made a rather spectacular entry onto the stage of Indian higher education. This was linked to the then newly adopted national project for expansion in higher education, a project that has seen some remarkable vicissitudes in its short twenty-five-year history. In the meanwhile, research methodology has been inducted into curricula from the master’s degree to the research level and has become a major component in training and refresher courses. All this creates an aura of regulation in research that is somewhat new to Indian social sciences and even more so to the humanities. The pedagogy of methodology tends to be overly prescriptive, focusing on a range of practical ‘how to’ measures and sidestepping some of the fundamental debates that have attended the question across the world for many decades now. There have been many levels in the debates over qualitative methodology in recent years: there are fundamental questions about whether there is a need for, and a role of, methodology in social science research; there have been what are labelled ‘paradigm wars’; and then there is the more prosaic debate between quantitative and qualitative methodologies. These may appear to be separate debates, but they are really closely interconnected and reflect changing intellectual climates. These issues can only be flagged in brief in the purview of this chapter. Moreover, the focus is perforce on international debates, since there has been relatively little attempt to explore the specificity of social and cultural contexts of these debates for India, even though developments in the field in the last three decades have been driven in large measure by post-colonial theory. This is not to say that there has been none. There has been some insightful exploration of social science research methodology in the Indian context, but this is not voluminous, and there has been no sustained engagement.1 There has been some discussion about the relationship between
disciplines – such as a famous set of essays in the 1980s on the relationship of history and anthropology – which have implications for methodology even though this was not the chief focus.2 There can be no doubt that, in general terms, qualitative research methods have multiplied and proliferated in recent years, so much so that it is virtually impossible to describe these in any unified manner. There are multifarious interpretive approaches, widely divergent and even oppositional, including critical theories, such as of race and pedagogy, performance studies, standpoint epistemologies, queer, materialist, feminist and reflexive methods. Moreover, there are schools that have promoted different kinds of methodological assays, such as the Frankfurt School and the British cultural studies traditions. Then, there has been a return to empiricism in grounded theories. Some of these methodologies have emerged from the political engagements of groups of scholars, primarily against social injustice. In qualitative research, more than in the quantitative variant, there has been a greater play of ideology and a more avowedly humanist and social justice commitment. The proliferating plurality of qualitative methodology has emerged from three paradigm wars, which have led to an explosion in the literature on methodology across the world (Tashakkori and Teddlie [2003, 2011]). The first war was against positivism and may be termed a constructivist challenge (1970–1990). In perhaps the most famous statement of its kind, Michael Billig (1988) recommends anti-methodological ‘traditional scholarship’, which depends more on specialist knowledge than on a set of laid-down rules. He makes two insightful arguments – first, that methodology makes the scholar anonymous. It is presumed that if good methodology is followed properly then two scholars will reach the same conclusions, which is one way of talking about objectivity. His second point concerns the responsibility of the individual scholar’s judgement (rooted in ideology), which values interpretation over evidence.3 These questions have been greatly amplified in recent years by scholars who have questioned the very possibility of objectivity and asserted the overwhelming significance of the location and politics of the researcher. In the social sciences, some scholars have emphasized methodology as a means of upholding their claim to the designation ‘science’; equally, however, there have been detractors who have challenged the possibility of precision in researching the social. One form anti-methodology sometimes takes is the emphasis on an intuitive component, sometimes referred to as ‘headnotes’. The fissure in such debates is sometimes down the line of quantitative versus qualitative methodology. While there has been much ink spilled in discussing the comparative merits and demerits of these two, there has also been in recent years a move away from the ‘purity’ of either and a promotion of mixed methods of various kinds. This was in part a response to post-structuralist intervention, which challenged the possibility of an objective scholar turning observation into data. The new critical perspective taught us that there is data only when theory acknowledges them as data (Richardson [2000]; St. Pierre [2011]). The activity of research became replete with more doubt than certainty – no theory, method, discourse, genre or tradition could claim any longer to be universal or to be the ‘right’ or privileged form of authoritative knowledge
(Richardson [2000]). Indeed, all claims to universal truth were shown to be actually a replay of the contingent and/or the local. One source of this uncertainty was shown to be the fluid nature of meanings, always in motion, incomplete, partial and inconsistent. There can be no precision or accuracy, thus no validity or reliability, merely different representations of different representations. The evidence- and scientifically based research communities responded to this uncertainty and flux with variations of mixed methods, so that data could be both qualitative and quantitative, even at the same time. The second paradigm conflict occurred within the mixed methods community and involved disputes between quantitative and qualitative methods and whether they could be used together. The chief issue was whether these could be combined or whether the differences in paradigm assumptions were too radical to allow a combination. One may say the conflict was amongst competing constructivist and critical theory paradigms (1990–2005). In the 21st century, we arrived at a more pragmatic paradigm, in which quantitative and qualitative methods are believed to be compatible, and researchers can use both simultaneously in their empirical inquiries. Even in this ‘what works’ pragmatic stand, however, there remains the issue of the politics of evidence. This is the space that evidence-based research entered and that gave rise to the third phase of the war, in which we are now. There is conflict between evidence-based methodologies, mixed methods, interpretive and critical theory schools. There is upheaval and a new round of debate about ‘scientific’ research. On the one hand, there is a renewed focus on empiricism; on the other, there are innovative discussions on qualitative methods, which are radical in their questioning of objectivity and more reliant on a disruptive politics of representation. Such a paradigm, if it can be named as such, emphasizes social transformation and pursues an agenda of equality. One aspect of such research is to make itself accessible for public use and policy making, and we will return to this issue in the conclusion of the chapter. There has been no treaty to end the paradigm wars. Rather, the war is renewed in every generation as fresh questions, both intellectual and political, emerge. The challenge to qualitative methods is a reflection of the churn in the social sciences. Specific debates over methodology have in most cases inhabited the extremes of various positions. The tendency today is for scholars to congregate midway between these extremes, searching for a ‘soft’ and pragmatic centre. One aspect of this is the intermingling of paradigms. Thus, for instance, there is an attempt at an accommodation between mixed methodologies based on scientifically based research on the one hand and renewed calls for social justice inquiry from the critical social science tradition on the other, which continue to pull from opposite poles. A continued ‘performance turn’ in qualitative inquiry is likely in a historical context of rapid and unprecedented change in what now appears to be a new phase in capitalism. In considering these questions, we have to remember that qualitative research methods, usually considered to have originated in social and behavioural sciences such as sociology, anthropology and psychology, now have wide application. Indeed, they are applied in fields quite far removed from the social science contexts in which
the major debates have taken place and are taking place. Thus, for instance, one major use of ethnography is in marketing research. One should also make a distinction between methods and tools, especially since some of the tools are common to many of the methods, which adds to perceived similarities and overlaps between methods. For instance, the tools of in-depth interviews with individuals, group discussions (from two to ten participants is typical), diary and journal exercises and in-context observations are common to almost all qualitative methods. These instruments are used in both traditional and new ways. In addition to in-person interviews, for instance, there can be interviews by telephone, videoconferencing or email. As in many other arenas affected by new information and communication technologies, there is uncertainty about how a change in format may affect the functioning of the tool. Qualitative methods aim at gaining a deep understanding of a specific organization or event or a small group of people rather than a surface description of a large sample of a population. Thus, the focus is often on an explicit rendering of the structure, order and broad patterns found amongst a small group of participants. By and large, this kind of research does not involve introducing treatments or manipulating variables. Most qualitative methods attempt to describe perceptions and/or a range of behaviour. The results of qualitative research are usually descriptive and not predictive. Rather than imposing the researcher’s predetermined values, such methods are supposed to let meaning emerge from the participants. In this sense, methods have to be flexible and able to adjust to different contexts. One aim is to understand how the participants derive meaning from their surroundings, and how these meanings influence their behaviour. At the core of qualitative research is the use of observation as a method of data collection. As already indicated, this aspect of qualitative method has come under considerable controversy in recent years, but different modes of observation continue to be used extensively in studies by psychologists, anthropologists and sociologists. Earlier it was assumed that direct observation could be produced by an instrument (e.g. a questionnaire), but now it is recognized that this is a very complex process and that the claims of precision and accuracy are somewhat dubious. Moreover, it is increasingly recognized that the context or background of behaviour should be included in observations of both people and their environment. In this chapter, the three conflicts have been taken as the entry point for a discussion of five commonly used qualitative methods of the social sciences. The first moment, the constructivist challenge, had a major impact on ethnography and narrative methods. In the discussion of these two methods, other related methods, such as phenomenology, life history and oral history, have also been discussed. The second moment signalled controversy over and within mixed method schools. Grounded theory is discussed in this context because some of these debates took place within this school and because the inductive assumptions of grounded theory allowed more play for discussion of different kinds of evidence. The third moment is that of the pragmatic centre, but it has also witnessed an assertion of ideologies of justice and equality. This is taken as an entry point to discuss the protracted debate over the very possibility of a feminist method. Two methods, interviews and focus
group discussions, have been framed in these debates and have been discussed in the third section within such a context. It must also be said that the discussions in the following three sections have not always remained focused on ‘method’ as such but have moved between epistemology, methodology, method, tools and techniques because that is how the debates have taken place. Moreover, these distinctions cannot always be sustained – they move and blur into each other.
2 Ethnography and narrative: the constructivist challenge

In this section, I discuss two qualitative methods which were transformed by the first paradigm war against positivism. These two methods, ethnography and narrative, are themselves closely interrelated, with considerable similarities and overlaps. In terms of techniques, the staple of ethnography is participant observation and that of narrative is storytelling. Both, however, have weak claims to objectivity. The narrative method was in some respects vulnerable to the post-structuralist critique, which imbued the term with a range of new meanings. The relationship between the ‘text’ and the ‘narrative’ has raised a range of questions central to critical theory. The critique of ethnography came from two quite different directions – there was, first, the demand for decolonization of anthropology, a shift from the study of the exotic ‘other’ bound in a telos of progress; and second, there was the locus of the researcher in tension between participation and observation. The emphasis in ethnography is on studying an entire culture, tied to the notion of ethnicity and geographic location. It is a central assumption of ethnography (indeed perhaps of qualitative methods in general) that to understand individual or group behaviour, one must learn the culture within which it takes place. Cultures, however, cannot be formulated in comprehensive or explicit terms. This leads to notions of immersion – or at the very least a cultural habitus that researchers acquire over time in fieldwork (Hammersley [1997]).4 Earlier, the anthropologist addressed unfamiliar cultures. There has been considerable discussion of the colonial origins of this method of research, but it has been reinvented and honed for a broad range of purposes. It is no longer associated exclusively with the study of the ‘other’ as there are ‘emic’ possibilities as well as ‘auto-ethnography’, i.e. the insider’s view or the study of one’s own people. At one time, ethnography was deployed to study exotic people in faraway locales; the contemporary uses of ethnography include the study of workplaces, product design and consumer marketing. Clifford Geertz (1988) characterizes ethnography as ‘being there’. According to him, the power of ethnography lies in convincing the reader that the anthropologist had been there, that this ‘offstage miracle’ had taken place. Geertz shows how, despite apparent differences, some of the major concerns of the ethnographer remain. He compares Raymond Firth’s We, The Tikopia (1936) with Loring Danforth’s The Death Rituals of Rural Greece (1982). Firth is calm and magisterial and produces in great detail descriptions of his observations of people, behaviour and places. Yet there is nervousness about his relationship with his material and his writing. He reflects on the need for a more refined methodology and deprecates
his own inability to be more objective or dispassionate. Danforth, too, views the problem of his work as chiefly epistemological, and he reflects at length on the distance between the self and the other in the process of fieldwork. His writing, animated by constructivism and anti-colonialism, raises some now-familiar questions about the relationship between the researcher and the researched. Geertz shows that, despite great dissimilarities, there is a common thread in the concerns of the two ethnographers: ‘the delicate but successful establishment of a familiar sensibility, much like our own, in an intriguing but familiar place, not at all like our own’ (Geertz [1988]). Later in the book, Geertz writes of the resistance to acknowledging that the writing of ethnography involves ‘telling stories, making pictures, concocting symbolisms, and deploying tropes’ and that such resistance has to do with a philosophic confusion between fact and fiction – the assumption that fact is only literally so. Thus, the imagined is considered imaginary, and fiction is considered false (ibid., p. 140). Kamala Visweswaran draws a line from this to write about the kinship between autobiography and ethnography. In her formulation ‘fictions of feminist ethnography’, she points out that the relationship between fiction and ethnography is different for marginalized people. Indeed, fiction as much as ethnography may be part of the arch of imperialism (Visweswaran [1994]). In the context of South Asia, there has been a long history of colonial or ‘official ethnography’ undertaken by the British (but also by other European missionaries and travellers), which has inflected the attitude and approach of the modern Indian academy towards this branch of knowledge production. This history goes back to the International Exhibition in London in 1862, when exotica from India included people brought as exhibits. The period from 1860 to 1900 saw the British in India create official projects to document India’s past and present. These included the establishment of the archaeological survey, the search for Sanskrit manuscripts, the establishment of schools for arts and crafts to preserve artisan skills, the Linguistic Survey of India, the establishment (before 1900) of 50 museums and the decennial census. The findings of these investigative agencies were digested and normalized and appeared in dictionaries of industrial products – e.g. the Statistical Survey of India, the Imperial Gazetteer and the Recruiting Handbook for the Indian Army. They were also exhibited both live and in photographs and models at international exhibitions and world’s fairs. In most post-colonial societies, there is considerable suspicion of a method so deeply associated with race and power. M. N. Srinivas, the doyen of Indian anthropology, argued that, rather than social anthropology, sociology found a more hospitable ground in India because of these historical reasons. In ethnography, the insider/outsider problematic looms large. These issues have been resolved somewhat by the anthropologists’ focus on ‘tribes’ as the primitive other within Indian society (Srinivas [2004]). Stella Mascarenhas-Keyes (2004) has written at length about the particular dilemmas of the ‘native anthropologist’. If the challenge for the outsider is to become a ‘marginal native’ in the ‘social space’ of fieldwork, the problem is reversed for the ‘insider’, who has to transcend an ascribed social position in
order to relate to a whole spectrum of ‘native’ society as though an outsider. The anthropologist in one’s own society is thus caught in the double bind of a liminal space between outside and inside. In The Work of Culture, Gananath Obeyesekere writes of the nature of inter-subjective relations in anthropology. The claim to represent the ‘native’ point of view as part of an emic account of other societies and cultures is replete with assumptions about not only the unified subjectivity of the researcher but also that of the researched. He points out that the respondent is very like the scholar in that he provides answers about his own culture from assumptions of knowledge about the causality of institutions. These respondents, however, need not, and in fact do not, agree with each other; they share in the contentious nature of the scholar. The internal controversy within cultures may be hidden or explicit, but every society has a ‘cacophony of voices’, which ethnography cannot represent even though it purports to represent the native. He urges the researcher to go beyond the native’s point of view in a nomological move, which endeavours to move away from the specificity of one culture or another towards a more general understanding of society (Obeyesekere [2004]). The most common ethnographic approach is participant observation. The idea is that the ethnographer becomes immersed in the culture as an active participant and records extensive field notes. There is no preset limiting of what will be observed and no real ending point in an ethnographic study. Participant observation is conceived as a period of intensive social interaction between the researcher and the subjects, in the latter’s environment. It becomes the full-time occupation of the researcher. Participant observers are trained in techniques of observation, which distinguishes them from regular participants. One of the most common methods for qualitative data collection, participant observation is also one of the most demanding. It requires that the researcher become a participant in the culture or context being observed but at the same time maintain a distance which allows the scholar to be an observer. The process of training in participant observation stresses how to enter the context, the role of the researcher as a participant, the collection and storage of field notes and the analysis of field data. Participant observation often requires months or years of intensive work because the researcher needs ‘access’ and ‘acceptance’ before the process can be undertaken. For the participant observer, everything is noted as if it were happening for the first time, and everything is subject to enquiry. This is the basis of the self/other distance in ethnography, and it is founded on a theory of symbolic interaction. This assumes that social interaction is produced through interpretation and definition, and equally that such interactions create their own meaning. While objective structures may set conditions and fix consequences, they do not predetermine the individual’s actions. Thus, agency is often a key question in such research. The narrative approach is most common in the discipline of history, even though it is also used in many other social sciences. A narrative has at its core an attempt to write a story according to a plot. In this sense, the question of ethnography and fiction (and that of biography and autobiography, as previously discussed) ties together the disciplines of anthropology, literature and history. This has become more
and more evident in interdisciplinary practices in recent scholarship. A narrative weaves together a sequence of events, and this can apply to anything. In fact, the narrative method is often associated with journalism and different kinds of writing for the media. The aim of narration is to form a cohesive story. One can conduct in-depth interviews, read documents and look for themes. The aim of narratives is often to show how an individual story illustrates the larger life influences that created it. Sometimes, interviews are conducted over weeks, months or even years, but the final narrative is not necessarily chronological. Indeed, a narrative does not need to reconcile conflicts and contradictions; in the best narrative traditions, tensions and challenges are highlighted to draw attention to cracks and fissures. Thus, in recent years, narrative enquiry has become as concerned with the form of narration as it is with content. Indeed, storytelling is considered a ‘performance’ or a kind of discourse. This may take a specific form, which is popular in many disciplines – that of life history. Narratives may focus on the individual participant or may seek to generalize across experiences.5 There has been interest in working across studies, allowing a kind of meta-analysis of different narrative accounts, though of course this has its own challenges (Josselson [2006]). In addition to life history, narratives also invoke orality and oral history, and since a narrative is often reconstructed from ‘tellings’, the method has also involved considerable discussion about memory and the kind of evidence memory provides the researcher. These two methods are also closely linked with phenomenology, which is considered both a philosophical perspective and an approach to qualitative methodology. It has a long history in several social research disciplines, including psychology, sociology and social work. Phenomenology emphasizes people’s subjective experiences and interpretations of the world. That is to say, a phenomenologist wants to understand how the world appears to others. A phenomenological study is an appropriate qualitative method for the description of an event, an activity or a phenomenon and demands a combination of methods, such as conducting interviews, reading documents, watching videos or visiting places and events. The aim is to understand participants’ perspectives and to provide insight into their motivations. As in many qualitative methods, phenomenology is exploratory and mostly descriptive. One usually conducts interviews and collates them to look for emerging themes and to validate findings.
3 Grounded theory: revisiting the empirical

Grounded theory is a qualitative research approach developed by Glaser and Strauss (1967) in the 1960s. The self-defined purpose of grounded theory is to develop a theory about phenomena of interest. It prioritizes inductive methods over deductive methods. It describes itself as ‘grounded’ because the task of classifying and interpreting data begins with the data rather than from a pre-existing conceptual framework. Yet it is theoretical because it seeks to model the relationships between the categories that have been generated from the data. Glaser and Strauss developed this approach in clearly stated opposition to the hypothetico-deductive method, which was then the prevailing orthodoxy in social sciences.
Grounded theory is a complex iterative process. The research begins with the raising of generative questions which help to guide the research but are not intended to be either static or confining. As the researcher begins to gather data, a core theoretical concept (or concepts) is identified. The two key terms in grounded theory are saturation (the point beyond which further exploration yields no new insight) and axial coding (the systematic exploration of relationships amongst categories). The emphasis is on constant comparison as a means of defining, validating and limiting categories. Grounded theorists do not seek incremental accumulation of data; rather, they adopt purposive sampling to confirm or modify categories. This is sometimes called abduction, since it is not quite induction, and such research both generates and tests hypotheses. Ideally, one approaches ‘conceptually dense theory’ as new observations lead to new linkages, which in turn lead to revisions in the theory and further data collection. This kind of method is commonly applied in fields such as health, education, social work and management. The main point of contention in grounded theory is the induction assumption – that is to say, the assertion that there can be a technique of research which builds theory from the ground up rather than a top-down model in which theory leads to research questions and/or design. In grounded theory, as in other kinds of research, theory (or politics or ideology) plays an important role from the start of the project in terms of the selection of theme and thrust, even if these are not made explicit. The researcher cannot begin with a tabula rasa, or ‘blank slate’; therefore, the basic assumptions of induction in grounded theory are open to doubt, as is the implicit claim to greater objectivity. At the same time, grounded theory does not espouse the same kind of objectivity as prescribed in earlier positivist positions. Given the freedom to build theory from the ground up, grounded theory can offer a creative process with considerable opportunity for original thinking. Within grounded theory, there have been several divergent tendencies. In its original conception, grounded theory was capacious, admitting all kinds of data – documents, literature, interviews and/or statistics. In this view, ‘all is data’. In a redefined version, Strauss and Corbin (1990) have narrowed the field considerably, while attempting to address the nature of ‘theory’ in grounded theory. There had been some criticism that what was called theorization was actually a search for causality. Strauss and Corbin (ibid.) have offered instead a template for modelling through causal/contextual factors and agency. In another departure, Charmaz (2000) offers a narrative or constructivist approach, which has not been well received by purists in the field. Whereas grounded theory does offer a new way to approach empirical enquiry, it was not developed in conversation with post-modernism and does not address its concerns for explaining the epistemological location of a researcher or her assumptions about causality or notions of ‘theory’. The rigours of the method follow from residual assumptions of objectivity; it demands careful attention to empirical data with rigidly laid-down procedures, such as coding. The multiplicity and intermingling of methods also affects grounded theory – researchers now borrow successfully and effectively from grounded theory rather than follow it in its detail.
The intervention by Charmaz (ibid.) connects grounded theory to another set of debates. The stress on coding has led to its association with the use of computer software for qualitative analysis. The use of software for large-scale content analysis has been acceptable for a long time now. There has been a persistent fear, however, that expanding such usage will lead to methodological biases and distortions and alienate the researcher from the data. In the Indian context, the use of software for qualitative analysis is still not widespread. To many scholars, mechanical analysis of data appears to be at odds with a qualitative orientation of research. Indeed, some have gone so far as to invoke Frankenstein’s monster: the computer turning against its creator is a potent modern myth. Udo Kelle (1997) considers the argument to be overstated on both sides. He has pointed out that ‘coding’ is a bit of a misnomer in grounded theory. According to him, the process is similar to a more traditional technique of analysis, which is indexing and cross-referencing; for this purpose computer software is useful and should not be abjured. However, he also makes the point that there is no exclusive methodological link between grounded theory and the use of computer software. The use of software is a technique and not a method in itself. Such techniques can be used in diverse methodological contexts where coding and retrieval are required. The issue of the application of codes in qualitative analysis remains, however, a somewhat open question: do ‘codes’ represent the perspective of the researcher rather than clear-cut empirical content, as the proponents of grounded theory would have us believe? The fear of a ‘new orthodoxy’ in qualitative social science research lingers in the trail of empirical traditions that seek to hold out against constructivist or postmodern challenges.
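Kelle’s characterization of coding as indexing and cross-referencing can be made concrete with a minimal sketch; the codes and interview segments below are entirely hypothetical, and a dictionary-based index is only one of many possible designs:

```python
# A minimal sketch of 'code-and-retrieve': indexing text segments by
# analyst-assigned codes and cross-referencing them. All codes and
# segments here are hypothetical.
from collections import defaultdict

# Each segment of transcribed text is tagged with one or more codes.
segments = [
    ("R1: I kept the diary but never showed it to anyone.", {"privacy", "diary"}),
    ("R2: The group discussion felt safer than a one-to-one interview.", {"group", "safety"}),
    ("R3: Writing things down changed how I remembered them.", {"diary", "memory"}),
]

# Build an index from code to segments, so material can be retrieved and
# cross-referenced by code rather than read linearly.
index = defaultdict(list)
for text, codes in segments:
    for code in codes:
        index[code].append(text)

print(index["diary"])                              # retrieve by a single code
print(set(index["diary"]) & set(index["memory"]))  # cross-reference two codes
```

As Kelle argues, nothing here ties the technique to grounded theory: the same retrieval logic serves any methodological context where coding is used.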
4 Feminist fictions: interview and group discussion

Just as questions have been raised about research methodology in general, there has been a long-standing debate within feminism as to whether there is a distinctive research method which can be called feminist. In 1981, Lorraine Code posed an ‘outrageous’ question: ‘is the sex of the knower epistemologically significant?’ (Code [1981]). She formulated a feminist epistemological approach that was distinct from traditional mainstream epistemology (see Code [1991]). She was one of many feminists who were seeking a way to address masculinity, power and authority in knowledge creation. Their challenge resonated across many disciplines. The question was not only the content of knowledge, however, but also the conditions of its production, which drew a line from epistemology to methodology. In her well-known essay ‘Is there a Feminist Method?’, Sandra Harding (1987) answered her own question in the negative. A few years earlier, Ann Oakley (1981) had reached a similar conclusion, arguing that some methods work better for feminists but that there was no feminist method. According to Harding, the question had to be approached at the levels of epistemology, methodology and method. She argued that although there were important connections between these three, the distinctive feature of the best feminist research did not lie in any singularity of its research methods. She warned that to elaborate a theory of the distinctive nature of feminist enquiry that
disqualified the best feminist social science research for not fulfilling its criteria would be rather pointless. Instead, perhaps, one should identify the characteristics of such a ‘method’ from the most illuminating examples of feminist research. She concluded that there are methodological features in feminist research which emerge from the application of traditional methods to the questions of feminism, but that these do not amount to a unified ‘feminist method’. The ‘extraordinary explanatory power’ of feminist research, she argued, was due to the feminist challenge to grand theories and the andro-centric assumptions of traditional research.

A few years later, Harding (1991) propounded standpoint theory, which has become a key text for feminist methodology. Her main concern was to explore how feminist values can inform empirical enquiry from the perspective of feminist scientists and their critiques of science. There are three stages in the development of feminist epistemology, argued Harding. The first stage is that of feminist empiricism, when it is held possible to remove sexist and other biases from the process of research. The belief is that once this is done, an objective position can be achieved. Harding regarded this as an attempt to reform ‘bad’ science by adding women into existing frameworks. The second stage is that of the feminist standpoint. A standpoint, according to Harding, is a place from which human beings view and socially construct the world. The aim of feminist research is to understand women’s lives from an exploration of their experience of oppression – and this produces less distorted knowledge than that produced by men. Thus, women’s experience produces the conditions for a more complete understanding of the social reality of the oppressed; a ‘feminist standpoint’ reveals truths not comprehensible to the ‘ruling gender’. That is to say, social group membership affects people’s standpoints and their ability to produce knowledge. Moreover, all standpoints are partial; so (for example) standpoint feminism coexists with other standpoints. Harding’s third epistemological position is that of feminist postmodernism. According to her, the first two positions are locked in dialogue, and though she speaks of the second position as ‘transitional’, she is neither very clear nor very comfortable about postmodern alternatives. There was a general feeling, shared by Harding, that postmodernism did not facilitate feminist analysis or even feminist politics. Seyla Benhabib (1995) considered postmodernism to be incompatible with feminism, undermining its very possibility as an emancipatory politics by women.

In the 1990s, however, postmodernism had a profound impact on feminist epistemology, and this led to new thinking about research methodology. The position many feminists took in the 1980s, in common with Harding, that there was no distinctive feminist method, gave way in the 1990s to multiple renditions of feminist methodologies. Hilary Graham (1983) eloquently asked, ‘Do her answers fit his questions?’ She argued that surveys designed on the basis of men’s lives were being used to understand women’s experiences. A decade later, things had not changed greatly, since women’s lives were still being made to fit into male theories (Edwards 1990). In the 1980s, however, Ann Oakley (1981) wrote an essay on interviewing which remains one of the most influential interventions in feminist method. She argued, as pointed out earlier, that some methods were better suited to feminist aims. Nevertheless, she
challenged the masculine assumptions of ‘proper interviews’. Oakley suggested that, contrary to an objective, standardized and detached approach to interviewing, the goal of finding out about people through interviewing was ‘best achieved when the relationship of interviewer and interviewee is non-hierarchical and when the interviewer is prepared to invest his or her own personal identity in the relationship’ (ibid., p. 41). She discussed the interviews she conducted with expectant mothers, talking about the intensely personal experience of pregnancy, birth and motherhood, and argued that avoiding involvement and subjectivity was not helpful. She invoked ‘sisterhood’ to argue that she did not wish to exploit her interviewees as sources of information but that she wished to give them greater visibility.6 In addition, she took the decision to answer questions asked of her in order to establish a reciprocal relationship, which she considered to be a part of the ‘rapport’ researchers are urged to build. She concludes that there should be no mystification of the researcher and the researched as objective instruments of data production and that personal involvement between the two is more than dangerous bias – it is the sine qua non of social interaction, of which an interview is an example. Oakley argues that the mode of interview is a gender issue because the division between ‘objective’ (the formal interview of conventional sociology) and ‘subjective’ overlaps with the division of public and private (ibid.).

The one-on-one interview has dominated social science research, including feminist research. In recent years, however, there has been an effort to use focus group discussion (FGD) as a method for varying kinds of research. FGD meets the need for a more socially situated method which addresses feminist ethical concerns about both power and the imposition of meaning. It has the additional value of generating high-quality, interactive data and offering the possibility of theoretical advances regarding the co-construction of meaning between people. There is great potential for such group interaction in the development of feminist research methods, especially with regard to the dynamic negotiation of meaning in specific social contexts. The basis of the FGD is social constructivist rather than essentialist – it privileges relational rather than individual theories of the self, and it reduces hierarchy between researcher and researched because the latter is a group and the former an individual. The FGD establishes the importance of interaction, not only between the researcher and the researched but also amongst the researched themselves, which highlights the significance of conversation as data. The group discussion process is dynamic and engages respondents more actively than is possible in more structured surveys. Such group activities can take place within existing groups (such as a family) or through the creation of a group (Wilkinson [1998]).

At this moment it is particularly poignant to recall that Lin Farley identified and named ‘sexual harassment’ in the context of a consciousness-raising exercise (Farley [1978]). Feminist sociologist Carrie Herbert (1989) combined participant observation and diary writing with unorthodox group meetings to explore young women’s experience of sexual harassment in schools. The similarities between focus group discussions and consciousness-raising sessions in the early years of second-wave feminism have promoted interest in this method amongst feminist researchers. In a
way, group discussion was a critical aspect of second-wave feminism. The activity of meeting with others and sharing experiences promoted the recognition of commonalities in what had previously been considered individual and personal problems. It is through group discussion that it is still hoped that women will develop a clearer sense of the social and political significance of their experiences and of how they are constructed. This is an explicit political agenda, in that such groups promote organization and feminist activism (Wilkinson [1998]).

Feminist methodologies have been articulated through and within the challenge of postmodernism. This challenge can be summarized as three-fold. First and foremost, the ‘woman’ of the 1980s soon ceased to be. There was stringent critique of the notion of ‘woman’ as a unified object of theorizing and as a unified subject of knowing. Second, the strong claim for socially situated knowledge led to the formulation of the concept of reflexivity and to attention to the role of the researcher in constructing knowledge (Doucet and Mauthner [2007]). Third, the intersections between feminism and postmodernism have led to the articulation of a plurality of perspectives, none of which could claim objectivity or transcend into a ‘view from nowhere’ (Haraway [1991]). This push for plurality from postmodern and postcolonial critiques has also had a major impact on the feminist standpoint, which has come to acknowledge multiple situated standpoints. At the same time, however, the focus on empirical evidence and experience has helped maintain a link between empiricism and the feminist standpoint.

In the 1990s, the emphasis on Harding’s three epistemologies was partially replaced by reflections on ‘knowing, knowers and known’ (Hawkesworth [1989]). In this period, the three dominant issues in feminist methodology have been subjectivity, representation and legitimation. In the long run, the growing influence of postmodernism has challenged the fundamentals of feminist epistemology, undermining the possibility that there could be one or even several feminist ‘ways of knowing’. Thus, feminists, like other researchers, are faced with a profound triple ‘crisis of representation, legitimation and praxis’ (Denzin and Lincoln [2000]). The ‘reflexive turn’ or the ‘narrative turn’ has created great uncertainty with its challenge to all knowledge claims (Richardson [1997]). From the 1980s to the second decade of the 21st century, we seem to have moved full circle in terms of answering Harding’s question. There is once again great doubt as to whether feminists have or need their own epistemologies. To name a particular method, methodology or theory – as feminists calling for some unity of standards would – may exclude work that does not fit such criteria. Moreover, there is no feminist consensus on anything to do with method – even such basics as mutuality and equality in interviews.
5 Conclusion

Only a very quick sketch of the various debates attending qualitative methodology in social science research over the last few decades has been possible within the brief purview of this chapter. These debates have covered huge ground, ranging from fundamental questions of philosophy regarding the production and meaning
of knowledge to quibbling over the details of techniques of individual strategies of collecting, storing, retrieving and analysing data/evidence. Moreover, these debates have reflected major shifts in theory, from positivism to post-structuralism, postmodernism and post-colonialism, and a gradual drift towards a middle ground in recent years. Meanwhile, the development of increasingly sophisticated software has raised questions about computational techniques even within qualitative methodology, further blurring the division between quantitative and qualitative, which has been a major strand in methodological debates within the social sciences. Alongside this, there has been a robust tradition of anti-methodology amongst social scientists, who have emphasized the art and craft of scholarship and the creative process of thinking and writing.

The other major thrust, which has gained momentum in the last few decades, is a reframed question of ideology. There is a resonant echo of Marx in the portals of social science – to understand the world is only half the story; the goal of social transformation is the other, equally significant, part of it. Thus, this chapter has placed questions of discrimination, inequality, democracy and social justice in focus. Even as we see a renewed interest in empirical research, there are arguments not only about data (for and against) but also questions regarding whose data and data for what purpose. It is no longer possible to think of data outside a critical framework, which includes practice and politics and their ability to speak for and with those who are at the margins.

One aspect of the ‘practice’ argument takes the form of linkage with governance and policy. In the Indian context, this has been a powerful institutional argument within public-funded higher education and research. In a study of policy impact, however, researchers found the policy impact of social science research in India to be low to moderate. In this study, which examined documents from various ministries, it was found that policy documents drew, on average, about 40 per cent of their citations from social science research. Government tends to use its own research for designing policy, depending on academics not to make policy but to evaluate and modify it. Equally, academics, too, see their policy impact as low unless research is designed to be policy-driven (Thorat and Verma [2017]). We should not reduce the social impact of research to policy impact; yet these questions are not unrelated to the concerns of this chapter, since interdisciplinary and mixed-method research have much greater linkages with policy. The pragmatic centre in qualitative methodology emerging from the paradigm wars of recent years has produced some unintended and unexpected outcomes, one of which is a renewed concern with the practical uses of research.
Notes
1 For a review of methodological writing in India, see Madan (1972). There has also been discussion of appropriate methodologies in Mukherjee (1979), Bose (1995) and Mukherji (2000).
2 The most famous is a set of essays by Bernard Cohn (1987). Peter Kloos questions the importance given to dual synchronism. He argues that processes of change must be included
in ethnography and suggests that history and anthropology have to be combined. Thus, while anthropology relates to space, history relates to time, and a combination of the two is necessary. Every real society is a process in time (Kloos [2004]).
3 In a much-cited essay, C. Wright Mills (1959) avers that ‘methods’ are simply ways of asking and answering questions and ‘theory’ is paying close attention to the use of words. He also says that to master theory and method is to become a self-conscious thinker, but to be mastered by theory and method means to stop working. These are like the language of the country one lives in – to know it is not special, but not to know it is a disgrace and an inconvenience.
4 This is sometimes expressed as nomothetic or generalizing. It has given rise to a great emphasis on the comparative method in ethnography, a comparison of cultures for the understanding of both. Even when the comparison is not explicit, if a researcher from one culture studies another, there is always implicit comparison (Beteille [2004]).
5 James A. Freeman (2004) wrote at length of his experience of collecting life histories of untouchables in India, especially that of one respondent, Muli. He made two extremely valuable points. First, life history provides an immediate and concrete account of an individual that compels our attention and prompts reflection on human predicaments across different cultures. Second, in a counter-intuitive way, it draws out what is common rather than what is unique. For instance, in his work he found not the specificity of the Indian caste system but the experience of degradation amongst all victims of graded hierarchies.
6 Reflecting on her fieldwork with Balmiki women, Malavika Karlekar (2004) wrote of the sense of responsibility she felt about the possibility of her intervention causing anguish and frustration amongst her respondents, since a heightened awareness alone would not allow them to escape their situation. The researcher in this case emerged from the process of fieldwork with a greater understanding of the latent aggression and power in her role.
References
Benhabib, S. 1995. “Feminism and Postmodernism”. In: S. Benhabib, J. Butler, D. Cornell, and N. Fraser, eds., Feminist Contentions: A Philosophical Exchange. Routledge: New York, pp. 17–34.
Beteille, A. 2004. “The Comparative Method and the Standpoint of the Investigator”. In: V. Kumar Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 112–131.
Bose, P. Kumar. 1995. Research Methodology: A Trend Report. Indian Council of Social Science Research: New Delhi.
Charmaz, K. 2000. “Grounded Theory: Objectivist and Constructivist Methods”. In: N. Denzin and Y. Lincoln, eds., Handbook of Qualitative Research. Sage Publications: Thousand Oaks, London, pp. 509–535.
Code, L. 1981. “Is the Sex of the Knower Epistemologically Significant?” Metaphilosophy, Vol. 12, 267–276.
Code, L. 1991. What Can She Know? Feminist Theory and the Construction of Knowledge. Cornell University Press: Ithaca, NY.
Cohn, B. S. 1987. An Anthropologist among the Historians and Other Essays. Oxford University Press: New Delhi.
Danforth, L. 1982. The Death Rituals of Rural Greece. Princeton University Press: Princeton, NJ.
Denzin, N. and Y. Lincoln, eds. 2000. Handbook of Qualitative Research. Sage Publications: London.
Doucet, A. and N. S. Mauthner. 2007. “Feminist Methodologies and Epistemology”. In: D. B. Clifton and D. L. Peck, eds., The Handbook of 21st Century Sociology. Sage Publications: Thousand Oaks, London.
Edwards, R. 1990. “Connecting Method and Epistemology: A White Woman Interviewing Black Women”. Women’s Studies International Forum, Vol. 13, 477–490.
Farley, L. 1978. Sexual Shakedown: The Sexual Harassment of Women on the Job. Warner Books: New York.
Firth, R. 1936. We the Tikopia: A Sociological Study of Kinship in Primitive Polynesia. Allen and Unwin: London.
Freeman, J. A. 2004. “Collecting the Life History of an Indian Untouchable”. In: V. K. Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 389–412.
Geertz, C. 1988. Works and Lives: The Anthropologist as Author. Stanford University Press: Stanford.
Glaser, B. and A. Strauss. 1967. The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine: Chicago.
Graham, H. 1983. “Do Her Answers Fit His Questions? Women and the Survey Method”. In: E. Gamarnikow, D. Morgan, J. Purvis, and D. Taylorson, eds., The Public and the Private. Tavistock: London, pp. 132–147.
Hammersley, M. 1997. “Qualitative Data Archiving: Some Reflections on Its Prospects and Problems”. Sociology, Vol. 31 (1), 131–142.
Haraway, D. J. 1991. Simians, Cyborgs, and Women: The Reinvention of Nature. Sage: New York.
Harding, S. 1987. “Is There a Feminist Method?” In: Feminism and Methodology. Indiana University Press and Open University Press: Bloomington and Buckingham.
Harding, S. 1991. Whose Science? Whose Knowledge? Thinking from Women’s Lives. Open University Press: Milton Keynes.
Hawkesworth, M. 1989. “Knowers, Knowing, Known: Feminist Theory and Claims to Truth”. Signs: Journal of Women in Culture and Society, Vol. 14, 533–547.
Herbert, Carrie M. H. 1989. Talking of Silence: The Sexual Harassment of Schoolgirls. The Falmer Press: London.
Josselson, R. 2006. “Narrative Research and the Challenge of Accumulating Knowledge”. Narrative Inquiry, Vol. 16 (1), 3–10.
Karlekar, M. 2004. “Search for Women’s Voices: Reflections on Fieldwork, 1968–93”. In: V. K. Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 371–388.
Kelle, U. 1997. “Theory Building in Qualitative Research and Computer Programs for the Management of Textual Data”. Sociological Research Online, Vol. 2 (2).
Kloos, P. 2004. “Restudies in Anthropology”. In: V. K. Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 132–148.
Madan, T. N. 1972. “Research Methodology: A Trend Report”. In: A Survey of Research in Sociology and Social Anthropology. Popular Prakashan: Mumbai.
Mascarenhas-Keyes, Stella. 2004. “The Native Anthropologist: Constraints and Strategies in Research”. In: V. K. Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 421–435.
Mills, C. W. 1959. The Sociological Imagination. Oxford University Press: Oxford.
Mukherjee, R. 1979. What Will It Be? Explorations in Inductive Sociology. Allied Publishers: New Delhi.
Mukherji, P. N. 2000. Methodology in Social Research: Dilemmas and Perspectives. Sage: New Delhi.
Oakley, A. 1981. “Interviewing Women: A Contradiction in Terms”. In: H. Roberts, ed., Doing Feminist Research. Routledge: London.
Obeyesekere, G. 2004. “The First Intersubjectivity: The Anthropologist and the Native”. In: V. K. Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 85–93.
Richardson, L. 1997. Fields of Play: Constructing an Academic Life. Rutgers University Press: New Brunswick, NJ.
Richardson, L. 2000. “Writing: A Method of Inquiry”. In: N. K. Denzin and Y. S. Lincoln, eds., Handbook of Qualitative Research. Sage Publications: Thousand Oaks, pp. 923–948.
Srinivas, M. N. 2004. “The Insider versus the Outsider”. In: V. K. Srivastava, ed., Methodology and Field Work. Oxford University Press: New Delhi, pp. 413–420.
St. Pierre, E. A. 2011. “Anything Can Happen and Does”. Cultural Studies-Critical Methodologies, Vol. 11 (4), 386–389.
Strauss, A. and J. Corbin. 1990. Basics of Qualitative Research: Grounded Theory Procedures and Techniques. Sage: Newbury Park.
Tashakkori, A. and C. Teddlie. 2003. Handbook of Mixed Methods in Social and Behavioural Research. Sage Publications: Thousand Oaks, London.
Teddlie, C. and A. Tashakkori. 2011. “Mixed Methods: Contemporary Issues in an Emerging Field”. In: N. K. Denzin and Y. S. Lincoln, eds., Handbook of Qualitative Research, 4th Edition. Sage Publications: Thousand Oaks, London, pp. 285–299.
Thorat, S. and S. Verma. 2017. Social Science Research in India: Status, Issues and Policies. Oxford University Press: New Delhi.
Visweswaran, K. 1994. Fictions of Feminist Ethnography. University of Minnesota Press: Minneapolis.
Wilkinson, S. 1998. “Focus Groups in Feminist Research: Power, Interaction and the Co-Construction of Meaning”. Women’s Studies International Forum, Vol. 21 (1), 111–125.
3 ETHNOGRAPHIC FIELDWORK
The predicaments and possibilities
Amites Mukhopadhyay
1 Introduction

Ethnography is a particular way of doing fieldwork which has been characteristically associated with the discipline of anthropology. All pre-doctoral students of anthropology are aware that fieldwork – usually in some far-flung location – is the discipline’s centrepiece, the ultimate transformative experience through which they must pass if they wish to call themselves anthropologists (Gardner [1999, p. 49]). An ethnographer is expected to do ethnographic fieldwork in a faraway place, ideally amongst a close-knit community or in a village, where he or she would spend a considerable length of time interacting with the people, collecting information from them, looking minutely at their ways of life and picking up various anecdotes, incidents, events and occurrences. It is through such intricate details that a comprehensive account of people’s lives is documented. The word ethnography splits into two distinct parts – ethno (meaning people or man) and graphy (the documenting of human affairs in graphic detail). This graphic detailing of human affairs and close documenting of people’s lives has remained central to the disciplinary tradition of anthropology as it has evolved over time. As part of their training, anthropologists learn to immerse themselves in their fields and become part of the community that they study. It is this immersion that allows for the unfolding of the embedded or the hidden that would not be apparent and visible to an outsider.

This brings us to the second important facet of ethnography – as a method of research. Anthropologists are trained to enquire into the lives of the quintessential other. Anthropology is founded on the belief that one realizes one’s anthropological best by studying the other and not one’s own community. Anthropologists not only travel to far-flung places but also focus on groups or communities whose ways of life are distinct from their own. Participant or non-participant observation becomes a tool by which the ethnographer
unravels the ways of life of the other. It is about not only collecting data amongst a particular group or community but also immersing oneself in the field in such a way that people start to consider or perceive the anthropologist as someone who belongs to their own community. In other words, one carries out fieldwork as though one were part of the community. It is only with time and a long-drawn presence in the field, with interactions deepening over time, that the anthropologist eventually gains such recognition. One is expected to take note of every minute detail in order to be able to provide an in-depth understanding of the field and the people who inhabit it. To that extent, ethnography often suggests that anthropologists should go to the field as if they were a clean slate, free of presumptions. They should pick up the local terms that people use, the local dialects that people speak and the local artefacts they use, and provide an account of people’s lives as they go on. Anthropologists should make detailed notes in a diary and should not start with a questionnaire or a formal interview. Building rapport with the other, with the alien community, is the first task of an anthropologist. The longer he or she carries out fieldwork, the more he or she comes to terms with a life which was once foreign, unknown or unfamiliar. Thus, familiarizing oneself with ways of life one has never known forms the centrepiece of ethnography as a method of gaining anthropological knowledge. The idea is to focus on the mundane, because it is only through the unfolding of the ordinary and mundane that one gets to the most interior and innermost spheres of community life.
2 Western Enlightenment, anthropology and studying the other

Anthropological interest in other cultures, and ethnography as the method for sustaining that interest, was the result of the journeys that the West had undertaken to the East. The hunt for resources and their exploitation was the immediate need of industrializing Europe, but colonial rule was not consolidated simply through recourse to warfare. Colonialism evolved diverse methods of engaging the natives. Anthropology grew out of the European Enlightenment’s self-certainty, a certainty that instilled in the West a desire and curiosity to know about the other who lay outside the geophysical and epistemic space of Europe. The beginnings of anthropological journeys can be found in activities that were far from anthropological. The colonial interest of industrializing Europe made possible encounters between the West and the East. Apart from the discovery of sea routes and the establishment of trade links, a number of travellers came to the East inspired by an urge to know what civilization in the East might look like. Bernard Cohn’s book Colonialism and Its Forms of Knowledge (1997) offers insight into the different technologies of knowledge production in colonial India. Colonial rule was not certain, either, as to which method would yield results, because the societies it colonized were not clearly intelligible to it. It was only with time that colonial rule tried various methods to gather information about a country which appeared to be completely alien. Travellers, painters, artists and colonial administrators all attempted diverse portrayals of
India. These depictions were significant in that they pointed to the West’s increasing concern with what was empirically knowable about India. This concern with the empirical manifested itself in myriad ways long before the gazetteers and censuses emerged as new instruments of empirically governing India. These travellers, artists and administrators were not only present in India but also travelled widely across the colonized countries and documented very closely every conceivable aspect of the civilizations outside the West. They were ethnographers in every sense of the term long before ethnography was instituted as a method within the disciplinary traditions of anthropology. In the capacity of travellers, painters or administrators, they all demonstrated a concern with minute details – both the unique and the recurrent – the details that constituted the society, culture and politics of the colonized terrain. Their works captured the diverse and minute components of the physical and social environment, paving the way for the subsequent enterprises of science and knowledge building that unfolded in different colonized societies. India, in the words of Gyan Prakash, for example, was an ideal locus for experimenting western science (Prakash [1992, p. 155]; italics as in original). Western science – manifested in different branches of knowledge, such as zoology, botany, geology, archaeology and anthropology – was deployed in understanding and classifying the physical and social environment of colonial India. Thus, the early travellers and painters in many ways anticipated the subsequent unfolding of different branches of knowledge and of institutions for the pursuit of knowledge (such as the botanical, zoological, geological, archaeological and anthropological surveys) in colonial contexts.

All these different branches of western science evolved in response to the predicaments and contingencies of colonial governance. Anthropology, too, evolved in close conjunction with the practicalities of colonial state building. Anthropology was deployed as a science for classifying and studying races. Since governance needed to be founded on exhaustive knowledge of a people and its place, efforts were under way to document and collect human skulls of different shapes and sizes. The idea was not simply to collect them but to assess how each skull type might be indicative of a particular stage in human evolution. Evolutionism in biological studies informed race studies in physical anthropology. Not only did race studies institute the disciplinary boundaries of anthropology, but they also aided colonial governments in classifying the native populations. Such classification was integral to colonial governance, for it is fairly well known now that colonialism was sustained not by sheer force or warfare alone but also by constant attempts on the part of the colonizer to define and redefine an epistemological space for the colonized (Mukhopadhyay [2016, p. 31]). Therefore, it was not mere happenstance that the so-called objective and value-free sciences which lie at the core of western enlightenment evolved and often unfolded as an indispensable constituent of colonial governance. It was also no coincidence that anthropology was increasingly wedded to the colonial rule of difference, whereby race became the sole perceptible marker of difference between the colonizer and the colonized.
Race and evolutionism unmistakably informed anthropology’s forays into the native society and culture.
Talal Asad (1973) documents the ways in which anthropological thinking and practice were affected and influenced by British colonialism and the instrumental imperatives of colonial administration. According to him, the West’s access to cultural and historical information about the societies it has progressively dominated not only generates a certain kind of universal understanding but also reinforces the inequalities in capacity between the European and the non-European worlds – and, derivatively, between the Europeanized elites and the ‘traditional’ masses in the ‘Third World’ (Asad [1973]). Drawing instances from the African and Indian contexts, the book builds a narrative of the wedlock between colonial rule and the beginnings of anthropological imaginings. For example, anthropology and colonial administration (through enumeration, censuses and surveys) complemented each other in classifying India along two major axes, those of caste and tribe. Nicholas Dirks (2001) highlights what he calls the ethnographic writings of colonial administrators to trace the twists and turns in the discourses of caste in colonial India. According to Dirks, ethnographic writing is a site where anthropology and colonial administration collaborated to produce divergent accounts of caste. Anthropology also came in handy in helping the colonial administration classify tribes and produce accounts of tribal life. Anthropology’s interest in the ‘other’ culture has thus in many senses been a derivative of western enlightenment’s epistemic interest in non-European civilizations and of the imperatives of colonial administration, i.e. the practicalities involved in governing the native, the primitive and the exotic other (the West’s ‘other’). John Beattie’s Other Cultures (1964) brilliantly sums up anthropology’s interest in the other.

It was anthropology’s perennial interest in the other that helped it to establish and professionalize itself as a discipline. From the beginning of the twentieth century, particularly after World War I, the British school of anthropology grew under the leadership of stalwarts like Malinowski and Radcliffe-Brown. Ethnography became not only a popular academic pursuit but also the principal mode in which anthropology was instituted as a discipline in the British universities. Malinowski’s Argonauts of the Western Pacific (1922) remains a living testimony to how ethnography unfolded as a method for providing a vivid account of the life of communities on faraway islands in the western Pacific. Their life and living, their material and non-material worlds, all became the central theme of the monograph. Malinowski’s book demonstrates his ability to immerse himself in the field, in the life that the communities led and in their economic, social and political practices. The monograph is not only a portrayal of the Trobrianders’ life but also an account of ethnography as a method – an account of how well he mingled with the communities and how he cultivated the art of observing, interviewing and listening. In sum, Malinowski’s monograph tells us how one could provide a minute portrayal of a social world to which the ethnographer was primarily a stranger. Radcliffe-Brown’s The Andaman Islanders (1922) and his study of African kinship further consolidated ethnography as the method for opening up the disciplinary frontiers of anthropology as, quintessentially, the study of the other. The anthropology department at the University of Oxford, too, witnessed anthropological journeys
being undertaken by famous anthropologists. Evans-Pritchard’s study of the Azande in Africa (1937) points to the significance of the ethnographer’s presence amongst the so-called exotic other, i.e. those who are not the ethnographer’s own people. Not only does ethnography indicate the importance of the linguistic skill to be acquired by the anthropologist (learning a language with which he or she is not familiar), but it also emphasizes the significance of local words and terms and their power in constituting the world of the other. Anthropological theories and perspectives moved from the functionalist or structural-functionalist world of Malinowski and Radcliffe-Brown, through a transition from function to meaning under Evans-Pritchard, to structuralism in anthropology under Lévi-Strauss. From an evolutionist perspective, which looks at native life and practices in terms of their functional significance and the stage of their functional evolution, anthropology moved to a structuralism that studies cultural practices as manifestations of an underlying structure, as a complex of signs and symbols that can be understood in relation to the constituting elements of that structure. However, despite these changes in perspective, ethnography remained a method of enquiry into people’s lives and of collecting and documenting information about the life of the other. The researcher was expected to spend time observing the field, the people who inhabit that field and their material and cultural practices. To ensure a closer representation of the field and of people’s ways of life, the ethnographer picks up the minute details of all that goes on, on an everyday basis. Compared to disciplines such as economics, politics or other human studies, where the focus is on the wider and broader structure, the ethnographer focuses on the regular, recurrent, mundane and banal. It is believed that it is through the unfolding of the mundane and the everyday that the significant ingredients of local life can be noted and put together. Therefore, the image of an ethnographer that anthropology carefully built was that of a person who would move about with his or her diary and pen, writing down everything that he or she could see and hear as a matter of habit and noting everything that could possibly happen during his or her stay in the field.

If anthropology was about the life of the exotic other and ethnography meant immersing oneself in the field, building rapport remained the most significant aspect of the ethnographic encounter. Anthropological monographs are replete with anecdotes telling us a great deal about how fieldworkers built rapport with the community they studied. Here one is reminded of Clifford Geertz’s research on the cockfight in Bali in Indonesia, telling us in vivid detail how he found his way into Balinese society:

Early in April of 1958, my wife and I arrived, malarial and diffident, in a Balinese village we intended, as anthropologists, to study. A small place, about five hundred people, and relatively remote, it was its own world. We were intruders, professional ones, and villagers dealt with us as Balinese seem always to deal with people not part of their life who yet press themselves upon them: as though we were not there. . . . As we wandered around, uncertain, wistful,
eager to please, people seemed to look right through us with a gaze focused several yards behind us on some more actual stone or tree. . . . My wife and I were still very much in the gust of wind stage, a most frustrating, and even, as soon as you begin to doubt whether you are really real after all, unnerving one, when, ten days or so after our arrival, a large cockfight was held in the public square to raise money for a new school. . . . Now, a special few occasions aside, cockfights are illegal in Bali under the Republic. . . . As a result the fights are usually held in a secluded corner of a village in semisecrecy. . . . In the midst of the third match, with hundreds of people, including, still transparent, myself and my wife, fused into a single body around the ring, a superorganism in the literal sense, a truck full of policemen armed with machine guns roared up. . . . The superorganism came instantly apart as its components scattered in all directions. People raced down the road, disappeared headfirst over walls, scrambled under platforms, folded themselves behind wicker screens, scuttled up coconut trees. . . . On the established anthropological principle, ‘when in Rome’, my wife and I decided, only slightly less instantaneously than everyone else, that the thing to do was run too. We ran down the main village street, northward, away from where we were living, for we were on that side of the ring. About halfway down another fugitive ducked suddenly into a compound – his own, it turned out – and we, seeing nothing ahead of us but rice fields, open country, and a very high volcano, followed him. As the three of us came tumbling into the courtyard, his wife, who had apparently been through this sort of thing before, whipped out a table, a tablecloth, three chairs, and three cups of tea, and we all, without any explicit communication whatsoever, sat down, commenced to sip tea, and sought to compose ourselves.
(Geertz [1973, pp. 412–415])

This is a brilliant narrative on building rapport and striking a chord with the local community. The people who were otherwise oblivious to the couple’s presence amongst them, who always looked through them as if they never existed as really embodied persons, suddenly became aware of their presence. It was as if by becoming circumstantial fugitives the couple made an entry into Balinese life; as if by running and fleeing together, they gained the people’s confidence, a confidence that was so necessary for the couple to get access to Balinese life. There was no explicit communication. There was no specific exchange of words between them. Yet there grew a sense of camaraderie amongst them. These narrative excerpts from Geertz’s The Interpretation of Cultures (1973) are demonstrative of what Geertz would call thick description, a term he coined to describe anthropologists’ engagement in the field. What Geertz presented in the book is not simply a strategy or a simple description of how he struck a chord with the local people; in telling us how he developed the rapport, he in fact introduced readers to Balinese society in its diverse and multiple layers. For Geertz, an anthropologist can describe the field in primarily two ways: thin description, whereby
one describes the field as it appears; and thick description, whereby the hidden comes out and the embedded unfolds. In sum, the ethnographer’s encounter with the field becomes symbolically meaningful. Following Geertz, one could argue that winking can be explained away as simply the batting of eyelids and the contraction of muscles, but in a specific context winking can have a symbolic significance in that it can help unravel the layers of the social; it can help bring to the fore the hidden, the unspoken. By winking at each other, people indulge in acts that can be construed as culturally insightful and revealing. It is through the accounting of the mundane that one strikes a chord with the local community. Thus, ethnography is interested in what Geertz understands as deep play. An ethnographer should be able to engage the field at the symbolic level. One should be able to indulge in the thick description of people’s worlds and pick up the terms which a particular ethnic community uses to understand and describe its world. Over time, thick description has become a very significant method amongst anthropologists undertaking ethnography.

Not only does deep play allow for the deeper unfolding of the field, but it also unsettles many of the assumptions that anthropologists have about their field or the people they study. Katy Gardner (1999), in her study of the Bangladeshi community in Sylhet, sees in the purdah or veil (used by women to cover their faces in public) an occasion for unsettling her assumptions about so-called Third-World women. Gardner states that before she came to Sylhet, she had always looked upon the veil as an instrument of patriarchal control perpetrated upon women in the Third World. However, by becoming part of the purdah herself (i.e. veiling her own face as part of the everyday mundane ritual) during the course of her fieldwork, she realized a much deeper and more transformative significance of purdah than as simply an instrument of patriarchal power. Thus, the ethnographic encounter not only alerts one to the possibility of deeper layers but also transforms one and one’s thoughts and presumptions in new ways. In Gardner’s own words:

To insist that there is no single view of Bangladeshi women, or purdah, for they like the anthropologists, are continually in different roles, and continually changing, is a start. Likewise, we need to recognise the various levels at which social norms work, from encompassment. . . . to the manipulation and negotiation of overarching structures.
(Gardner [1999, p. 63])

Just as anthropologists step into different roles in response to the differing circumstances of the ethnographic encounter, so too is the world of the people they study deeply implicated in diverse social positions, shifting hierarchies and discrete levels of divergent interaction. Ethnography, as it has evolved through many anthropological journeys, does not stipulate any fixed format for carrying out fieldwork. Nor does it prescribe any set of rules for observing and studying the other. Ethnography not only prepares you for a field but also trains you to unlearn many things and shed many of your assumptions. Ethnography as deep play or thick description allows
for improvisation, facilitating engagements that may not always be time-tested. The ethnographer has to be a sensitive observer of the field, taking note of each event, spectacular or mundane. An ethnographer must cultivate the habit of being a listener. He or she may not be able to be present in two places at the same time to observe or study an event but can certainly make up for the loss by simply listening to how people speak about the event. An ethnographer must remain alert to multiple narrations of the same event and figure out how each symbolically configures or reconfigures the event. Malinowski might not have seen or witnessed the kula, the trade practice of the Trobriand Islanders. He might instead have documented the minute details of the trade – how the communities living on distant islands undertake boating expeditions towards the middle of the sea to exchange goods – by listening to the narratives the islanders offered about their economic practices.
3 Ethnography as a method of understanding post-independent India

Anthropology in post-independent India found itself busy trying to disentangle itself from its colonial legacy, a legacy founded on principles derived from the perspectives and traditions of fieldwork that had evolved and been instituted over time within the discipline. There was an increasing concern with the empirical realities of India. Srinivas’s work amongst the Coorgs in South India (1952) challenged Louis Dumont’s views about caste and India, which were based on his understanding of the ideological texts. In fact, Srinivas’s concept of Sanskritization challenged the colonial idea of a timeless and inertia-driven Indian society. The ethnography popularized by the British anthropological school (that of Radcliffe-Brown and Evans-Pritchard) was what informed Srinivas’s work on the Coorgs of South India. It was his long-drawn fieldwork amongst the Coorg community in Karnataka that earned Srinivas his reputation as a social anthropologist. Ethnographic interest increasingly revolved around the village as the structural reality of India. Bailey’s ethnographic work on two villages in Orissa produced a dynamic account of the villages of contemporary India. Instead of looking at caste and other relations through classical and Brahmanical texts, ethnographic fieldwork focused on villages as a microcosm of the post-independent Indian nation. Ethnographers from the West continued with their interest in the East, but their ethnographies documented the diverse realities of Indian villages.

Ethnography began to be conducted in diverse contexts and around diverse themes. Long before the environment became a motive force in developmental thinking in the late 1980s and early 1990s, ecology was invoked as a category for understanding villages and people in India. Scarlett Epstein conducted a longitudinal ethnographic study of two villages in South India. She studied the villages in the 1950s and revisited them in the 1970s and 1980s to document the changes that had occurred there. Epstein deployed ecology as a category in understanding different dimensions (caste, economic, familial, political etc.) of village life. Ralph Nicholas (1963) attempted a comparative study of two villages in deltaic Bengal (one selected from the moribund delta and the other from the active delta), invoking
ecology as a very potent social category. In fact, the new Indian nation on the threshold of independence traversed two distinct paths in understanding its own journey. At the macro-national level, economists were busy planning India’s development. They joined hands with policy makers in ensuring that economic growth could materialize through five-year plans. The sociologists and anthropologists, on the other hand, turned their attention away from the centre stage of macro policy making towards the villages of India (acting on the presumption that independent India still lived in its villages), analysing and documenting the journey of the new nation as it unfolded in the villages.

Remembering became a very important aspect of the ethnographic encounter. An ethnographer is required to note down everything that occurs lest he or she forget. Ethnography therefore becomes a painstaking process of translating everything that the ethnographer sees, observes and hears into field notes. Srinivas’s monograph The Remembered Village (1978) ranks in the annals of ethnography as a masterpiece, for Srinivas’s field notes had been destroyed, and he produced the monograph on the basis of all that he could remember about the village. The village comes alive through an act of reminiscing. The village in its different dimensions and ever-changing facets – caste, class, agrarian structure, kinship matrix, households – began to ignite the imaginations of anthropologists. Anthropology no longer remained the West’s curiosity about the East, as it used to be; ethnography also became an occasion for studying one’s own culture. Young anthropologists from the subcontinent, trained in the discipline in Indian and foreign universities, began to study their own people and societies. However, the question of insider-outsider remains central to ethnography. In other words, ethnography remains a transformative experience for the self even when one studies one’s own people. Thus, ethnography not only unsettles ethnographers’ assumptions about their own society and culture but also makes them constantly come to terms with the limits of their horizons.

Agrarian relations became one of the central themes around which much ethnographic writing grew. With time, anthropology’s interests diversified; its focus no longer remained confined to the villages and rural India. Cities and towns and urban culture increasingly became subjects of ethnographic research. Thus, the so-called primitive culture no longer remained the pivot around which anthropology revolved. A modernizing society such as India – under the long-standing influence of colonial modernity – shaped the imagination of anthropologists. Milton Singer’s When a Great Tradition Modernizes (1972) is an ethnographic work conducted in Madras City. The monograph looks into the ways in which the great Indian tradition has been in the throes of transition to modernity in the wake of colonialism, urbanization, modern forms of the market etc. Geertzian symbolic anthropology has also triggered a number of ethnographic studies of cultural practices, festivals and popular cultures in India. Àkos Östör conducted ethnographies of various cultural festivals, such as Durga pujo (Östör [1980]) and Manasa pujo (Östör [1984]) in Bengal, particularly in the context of these cultural forms’ encounter with modern institutions and practices. Ethnographic research has also been wedded to historical studies under the influence of
anthropologists like Surajit Sinha. Sinha’s ethnographic work on tribal life in central India demonstrates his sensitivity to the histories of tribal–non-tribal interaction in the region. History in his work does not appear simply as background information, as chronology or as a backdrop to ethnographic research; rather, it is a powerful constituter of the concepts and ideas that are integral to an understanding of tribal life in the region (Ghosh [2003, p. 113]).
4 The accusations of the 1980s and rethinking ethnography

The publication of three major texts – Anthropology as Cultural Critique (Marcus and Fischer [1986]), Writing Culture (Clifford and Marcus [1986]) and The Predicament of Culture (Clifford [1988]) – brought to the fore criticisms that question the efficacy of ethnography as an instrument of anthropological knowledge. These writings displace ethnography’s epistemic certainty – its arrogance as the source of authentic knowledge about the other. In the words of Clifford:

With expanded communication and intercultural influence, people interpret others, and themselves in a bewildering diversity of idioms – a global condition of what Mikhail Bakhtin [1953] called heteroglossia.1 This ambiguous and multifocal world makes it increasingly hard to conceive of human diversity as inscribed in bounded and independent cultures.
(Clifford [1988, pp. 22–23])

According to Clifford, what passes as ethnographic fieldwork is actually the ethnographic authority of the West imposed on other cultures. The West can no longer present itself as the unique purveyor of anthropological knowledge about others (ibid., p. 22). These accusations are in many ways informed by works such as Edward Said’s (1978) Orientalism, which problematizes the western scholarship that has produced and reproduced a particular knowledge about the Orient. Said shows how this process of production and reproduction demonstrates the West’s will to power. It is under the influence of such scathing critique of the West and of the epistemic violence it perpetrated (in the name of producing knowledge) upon the other (the Orient or the ‘non-West’) that ethnography as a method of knowledge production came under the scanner.

Ethnography stands critiqued as a purportedly systematic and objective account of the other. An ethnographic field and encounter are made possible by the presence of the self of the ethnographer. When it is the ethnographer’s entry into the so-called field that makes an ethnographic account possible, the question that arises is how the self of the ethnographer can be oblivious to his or her own complicity in producing knowledge about the other. The questions that loom large are as follows: How can the ethnographer (the self) absolve himself or herself of his or her will, intent or responsibility in producing knowledge about the other? How can the ethnographer ignore his or her involvement or implication in producing knowledge about the other and still portray his or her account as objective?
Ethnography has been dubbed positivistic, steeped in an essentialist idea of the field. Ethnography is believed to look upon its field as essentially bounded and territorially fixed, within which everything that an ethnographer observes is the reality. There is an assumed correspondence between the ethnographer’s presence and the generation of data. What the anthropologist sees, observes and hears is taken to be the reality, as if nothing exists beyond the ethnographic presence. Culture, it is believed, is inscribed within a bounded world. Just as nation states in Europe made societies appear coterminous with the emerging nations, anthropological curiosity similarly made culture appear synonymous with the idea of a bounded field. The key term here is participant observation, which allows the ethnographer the unique privilege of being the purveyor of the truth. As Clifford suggests,

Understood literally, participant observation is a paradoxical, misleading formula, but it may be taken seriously if reformulated in hermeneutic terms as a dialectic of experience and interpretation. . . . Experience and interpretation have, however, been accorded different emphases when presented as claims to authority. In recent years there has been a marked shift of emphasis from the former to the latter.
(1988, p. 34)

The field is understood to be primarily available to the ethnographer through his or her senses, and ethnography is about translating this experience into textual form. The process is complicated by the action of multiple subjects and by political constraints beyond the control of the writer. In response to these forces, ethnographic writing enacts a specific strategy of authorship. This strategy has classically involved an unquestioned claim to appear as the purveyor of truth in the text (ibid., p. 25). However, what provides the missing link between the experience of being there and its possible translation as truth is an act of interpretation. Interpretation engulfs us in an act of truth construction that is far more complex than the merely sensory. Interpretation enables an encounter with many realities and many voices that may not instantly exist in the field as the truth. Many perspectives and voices creep into the making of the text. The text demonstrates the author’s intent in negotiating diverse voices and perspectives. Interpretation problematizes the deeply ahistorical idea of the field and thereby points to the crisis of representation in ethnography.

Interpretative anthropology by viewing cultures as assemblages of texts loosely and sometimes contradictorily united, and by highlighting inventive poesis at work in all collective representations, has contributed significantly to the defamiliarization of ethnographic authority.
(ibid., p. 41)

Interpretation has the tendency to defamiliarize the experientially familiar. Ethnography is not an encounter where the field as experience can translate into a uniform,
Ethnographic fieldwork 55
monolithic text. Rather, it encounters informants or subjects as informed by multiple discourses. There is an increasing tendency to treat the field as a discursive reality, as a dynamic configuration and reconfiguration by juxtaposition of worlds characterized by continuities and discontinuities. The field is never complete, the culture is concretely open-ended, creative dialogue of sub cultures, insiders and outsiders and diverse factions (ibid., p. 46). It is in the light of these developments that history is increasingly viewed as being complementary to ethnography. It is increasingly being realized that field cannot simply be looked at in a synchronic fashion (as snap shot, ahistorical) but rather as a diachronic encounter. The field is co-produced and co-constituted historically by both the observer and the observed entangled continuously in overlapping identities. There are ethnographies of institutions and practices such as state, society, market and bureaucracy in the third-world context. These ethnographies aim not only to undertake field-centric fieldwork, but look into the institutions (such as state, market or society) being constituted by divergent discourses, discourses that make the post-colonial situation so characteristically peculiar. There have emerged ethnographies that look into everyday relations and negotiations around material objects and artefacts (as opposed to earlier anthropological studies on material and nonmaterial cultures), aiming to make ethnography appear as perpetually an unsettling exercise.
Note

1 Clifford invokes Bakhtin’s heteroglossia to question the centrality of the ethnographer’s presence and the authorial centrality of the anthropologist. Bakhtin’s heteroglossia (the notion originates in his ‘Discourse in the Novel’) looks upon language as having multiple possibilities, variations and perspectives. Speaking subjects speak to each other in the most innovative and unanticipated ways. Bakhtin views the novel as heteroglossic because multiple voices, divergent worlds and speaking subjects in multiple capacities figure in the novel beyond the authorial intention of the novelist.
References

Asad, T., ed. 1973. Anthropology and the Colonial Encounter. Ithaca Press: London.
Beattie, J. 1964. Other Cultures: Aims, Methods and Achievements in Social Anthropology. Routledge and Kegan Paul: London.
Clifford, James. 1988. The Predicament of Culture: Twentieth-Century Ethnography, Literature and Art. Harvard University Press: Cambridge, Massachusetts, and London.
Clifford, J. and George E. Marcus, eds. 1986. Writing Culture: The Poetics and Politics of Ethnography. University of California Press: Berkeley, Los Angeles.
Cohn, B. S. 1997. Colonialism and Its Forms of Knowledge: The British in India. Oxford University Press: New Delhi.
Dirks, Nicholas B. 2001. Castes of Mind: Colonialism and the Making of Modern India. Princeton University Press: Princeton.
Evans-Pritchard, E. E. 1937. Witchcraft, Oracles and Magic among the Azande. Clarendon Press: Oxford.
Gardner, K. 1999. “Location and Relocation: ‘Home’, the Field and Anthropological Ethics (Sylhet, Bangladesh)”. In: C. W. Watson, ed., Being There: Fieldwork in Anthropology. Pluto Press: London.
Geertz, C. 1973. The Interpretation of Cultures: Selected Essays. Fontana Press: New York.
Ghosh, A. 2003. “The Historical Anthropology of Surajit Sinha”. Journal of Indian Anthropological Society, Vol. 38, pp. 111–115.
Malinowski, B. 1922. Argonauts of the Western Pacific: An Account of Native Enterprise and Adventure in the Archipelagoes of Melanesian New Guinea. Routledge and Kegan Paul: London.
Marcus, George E. and Michael M. J. Fischer. 1986. Anthropology as Cultural Critique: An Experimental Moment in the Human Sciences. University of Chicago Press: Chicago.
Mukhopadhyay, A. 2016. Living with Disasters: Communities and Development in the Indian Sundarbans. Cambridge University Press: New Delhi.
Nicholas, R. W. 1963. “Ecology and Village Structure in Deltaic West Bengal”. Economic Weekly, July, pp. 1185–1196.
Östör, Á. 1980. The Play of the Gods: Locality, Ideology, Structure and Time in the Festivals of a Bengali Town. University of Chicago Press: Chicago.
Östör, Á. 1984. Culture and Power: Legend, Ritual, Bazaar and Rebellion in a Bengali Society. Sage: New Delhi, Beverly Hills.
Prakash, G. 1992. “Science ‘Gone Native’ in Colonial India”. Representations, Vol. 40, pp. 153–177.
Radcliffe-Brown, A. R. 1922. The Andaman Islanders: A Study in Social Anthropology. Cambridge University Press: Cambridge.
Said, E. 1978. Orientalism. Pantheon Books: New York.
Singer, M. 1972. When a Great Tradition Modernizes: An Anthropological Approach to Indian Civilization. Praeger Publishers: New York.
Srinivas, M. N. 1952. Religion and Society among the Coorgs of South India. Clarendon Press: Oxford.
Srinivas, M. N. 1978. The Remembered Village. Oxford University Press: Oxford.
4 DIVERSITY IN ECONOMICS

An examination and defence of heterodox approach

Soumik Sarkar and Anjan Chakrabarti
1 Introduction

The existence and unity of a discipline called economics reside in the eye and mind of the beholder. . . . Our view is that no discipline of economics exists. Or, rather, no unified discipline exists. The ‘discipline’ of economics is actually an agonistic and shifting field of fundamentally different and often conflicting discourses.
(Resnick and Wolff [2012])
This, in a nutshell, is what our intervention is about. Taking the body of the economy to be constituted by the production, distribution and consumption of goods and services (Krause [1982]), we seek to demonstrate the existence of different theories in the discipline of economics. We first take up the theoretical challenge of creating the basis for defining a theory and for distinguishing one theory from another. To this end, we expand on the criteria of epistemology and entry point pioneered by Resnick and Wolff (1987, 2006, 2012) and Madra (2016). To explain and defend heterodoxy by using this theory of theoretical differences in the field of economics, we concentrate on three economic theories – neoclassical, Marxian and Keynesian.1 Rather than view one theory as more true than others, we argue in favour of viewing the varied theories as distinct knowledges of the economy, each with its respective analysis of capitalism. This implies that all theories speak relative truth and not absolute truth about the economy.

Before proceeding with the substance of the central problem, some clarifications are required. The first point concerns a few terms that go into the constitution of a theory and underlie any research problem. Very broadly, epistemology is the study of the character and compass of knowledge, concerning questions such as: What constitutes knowledge? How is knowledge produced? How do we know what we know, and what justifies knowledge as such? Epistemology is, in short, the foundation of a theory, and different epistemologies mark different theories. Based on an epistemology, methodology focuses upon research objectives, questions and study design (Carter and Little [2007, p. 1316]); at times, the boundary between the two may become hazy. From another angle, Kaplan (1964, p. 18) argues that methodology is ‘the study – the description, the explanation, and the justification – of methods, and not the methods themselves’. Methodology justifies method, which is research in action – that is, the actual way in which facts are collected, sorted and analysed. Method may take a qualitative or quantitative route, and its associated techniques and interpretations remain circumscribed by the respective epistemologies and entry points within which the researcher functions. Ontology is the question of ‘being’ or ‘what is’. It is the study of the world, reality and its existence. It asks questions such as: What are the fundamental parts of the world, and how are they related? Finally, the entry point is the set of aspects chosen to enter into reality and produce knowledge in relation to it; the entry point captures the focus of a theory and as such helps fix the border of a theory. Specifically, economic theories differ from one another in terms of ‘how to conceptualize, know and justify’ (epistemology) and ‘what to know’ (entry point) of the economy, and these then operate with a certain methodology and method to present a picture of their respective understandings of the world. The gap between the epistemological and the ontological presents a vexing problem, as we shall see. Dissatisfied with this epistemology-led approach, one can also begin from and concentrate on the ontological or even the so-called phenomenological approach, but we refrain from discussing these here because of their limited presence in the creation of the various economic schools of thought. Our focus in this chapter remains limited to epistemology and entry point (in relation to the question of their connection to ontology), which we, following Resnick and Wolff, contend are sufficient to unpack the problem at hand.

The second issue concerns the differences within each theory. Take Marxism. Just as in economics, there are varieties of Marxian theory. Therefore, the ‘discipline’ of Marxism, too, is diverse, and from that field we pick one theory – class-focused Marxian theory – as representative of Marxism (Resnick and Wolff [1987, 2006, 2012]; Gibson-Graham [1996]). Keynesian economics, too, has disaggregated into multiple approaches – the post-war neoclassical-Keynesian synthesis, new Keynesian economics, post-Keynesian economics etc. From here, we pick one interpretation of Keynes – post-Keynesian economic theory – as embodying Keynesianism (Davidson [1994]; Harcourt and Kreisler [2013]). If we can show that post-Keynesian (henceforth, Keynesian) and class-focused Marxian (henceforth, Marxian) economics are legitimate theories that cannot be reduced to neoclassical or any other theories, we will have demonstrated, in general, that our claim of heterodoxy in economics is valid.

The urgency to grapple with questions of theoretical differences comes against the backdrop of a discussion on the relation between neoclassical and mainstream economics that has evolved in the last few decades.
Some now claim that neoclassical economics (which they qualify as Walrasian general equilibrium theory) has disintegrated from the 1970s onwards, both as a result of its own limitations (its inability to address the structurally positioned role of the auctioneer within the general equilibrium frame, the Sonnenschein–Mantel–Debreu theorem on excess demand functions and imperfect information) and through extension towards evolutionary game theory, experimental economics, behavioural economics etc. (reflecting a move towards a more Marshallian-type partial analysis).2 This differentiation towards a non-Walrasian horizon is associated with a parallel claim that there is now pluralism in economics that accommodates all possible varieties of doing economics. That is, mainstream economics has surpassed the narrow horizon of neoclassical economics and presently encompasses pluralism; hence heterodoxy, as a separate concern, withers away. Mainstream economics qua economics is already heterodox.

Building on Resnick and Wolff’s pioneering approach, we show that this claim is deceptively wrong in two ways. First, if we base the definition of theory on the standards of epistemology and entry point, the so-called differentiated existence of economics within a singular discipline – mainstream economics – is misleading. Rather, the so-called disintegration of the Walrasian general equilibrium approach into a Marshallian-type partial approach is actually a reassertion of neoclassical economics in new clothes, a reassertion that may be called Late Neoclassical Economics. Following Madra (2016), the deception is based on the untheorized claim that neoclassical economics is tout court Walrasian general equilibrium, thereby suppressing the tradition of the Marshallian approach within that school of thought. The seminal point remains that the two traditions – Walrasian and Marshallian – share a united philosophy, that of theoretical humanism (rational subject and equilibrium order, to be defined in detail later), which contains a common epistemology (determinism) and entry point (given preferences, endowments and technology). That is, in terms of the corpus of essential categories that constitute the (b)order of a theory, the mentioned differences emanate from within the same (b)order and hence do not represent a break from one theory to another. Dispersion within a theory, here late neoclassical economics, is nothing new. It is present in all theories; there may be differences over methodology, and particularly over method, within every theory containing a particular philosophical approach. That is why we have debates, dissent and changes in every theory; a theory that does not have this plasticity would lapse into dogmatism and rarely survive. But this must not become a ground for conflating plasticity within a theory with differences between theories. What is claimed as pluralism in mainstream economics consists of this dispersion emanating from within the commonly held philosophical approach: theoretical humanism. The suggested break from neoclassical economics is actually its sublimation into late neoclassical economics.

Additionally, this misplaced claim of differentiation constituting a unified discipline of economics serves another insidious objective: the exclusion of other economic theories – post-Keynesian, Marxian, Neo-Ricardian etc. – from the discipline of economics itself, economic theories that work with different epistemological and entry-point positions emanating from other philosophical traditions. By claiming pluralism within late neoclassical economics to be pluralism within the discipline of economics per se, what is thereby sought is the annihilation of competing theories from the discipline of economics. We therefore make a sharp distinction between heterodoxy and pluralism. Heterodox economics pertains to foundational differences between economic theories based on what each theory focuses upon (entry point) and how each seeks to produce knowledge (epistemology). Pluralism, on the other hand, may exist within each theory. Unpacking a theory of theoretical differences, this chapter will not only seek to produce and demonstrate the normative value of treating the discipline of economics as a heterodox field but also argue that the field cannot but be heterodox.

Finally, our approach can even be turned to interrogate the conception of ‘heterodox’ in a critical vein. For instance, one can appreciate the criticism of the new Keynesian macroeconomic approach by post-Keynesians (Davidson [1994]), who argue that the former cannot be treated as a distinct Keynesian theory since, from our vantage point, it shares the epistemology and entry point of the late neoclassical economic theory mentioned earlier. Similarly, the classical Marxian theory of historical materialism broadly shares with neoclassical and Keynesian economics the epistemology of determinism even though its entry point is different (Resnick and Wolff [2012]). By virtue of its dissimilar entry point and epistemology, it is not the same as class-focused Marxian theory.

We shall begin our discussion by exploring the issue of epistemology, marking a difference between deterministic epistemology (used in neoclassical and post-Keynesian economic theory) and the non-deterministic epistemology of overdetermination (used in class-focused Marxian economic theory). We then proceed to define entry point and explain its relevance in theory making, with a particular focus on economic theories. Subsequently, depending upon their diverse epistemologies and entry points, we show how various economic theories get constructed. The discipline of economics is revealed to be foundationally heterodox.
2 Overdetermination versus determinism: the logical structure

. . . a thousand threads one treadle throws.
Where fly the shuttles hither and thither,
Unseen the threads are knit together,
And an infinite combination grows.
– Goethe, ‘Faust’, Part I, Scene 4 (in Sigmund Freud’s The Interpretation of Dreams, p. 301)

The city is prior in the order of nature to the family and the individual. The reason for this is that the whole is necessarily prior to the part.
– Aristotle (Politics [1995, p. 11])

In the Cartesian world, that is, the world as a clock, the phenomena are the consequences of the coming together of individual atomistic bits, each with its own intrinsic properties, determining the system as a whole.
– Levins and Lewontin (The Dialectical Biologist, p. 2)
Let us begin by presenting two broad approaches concerning the logical structure: a deterministic epistemology that foregrounds an essentialist structure of causality along with its associated reductionist approach and a contrasting non-deterministic epistemology that foregrounds overdetermination and contradiction along with its non-reductionist mode of analysis. Later on, we will examine another aspect of epistemology, concerning how deterministic and non-deterministic logics are respectively related to reality and how these theories are connected to claims about truth.

Let us start with the category of overdetermination and the logic it introduces within a theory. Inspired by Marx’s (1973, pp. 83–111) commentary on the relation between production, distribution and consumption and by Freud (2010, pp. 295–326) in the context of dream analysis, overdetermination was developed as a non-deterministic epistemology by Althusser (1969) and Resnick and Wolff (1987). It rules out any presence of a priori privileging, centricity and origin in the order of explanation. We showcase the most developed discussion of overdetermination through Resnick and Wolff. The word epitomizing overdetermination is ‘constitution’ or ‘constitute’, which literally means ‘bringing an entity into existence’. It begins with the understanding that no process can exist a priori in any concrete situation. Rather, each process (say, the economic process of class or of the market) is brought into existence or is constituted by the combined effects of many other processes (say, the other economic processes as also cultural, political and natural processes), some even unknown; in turn, the former will affect the latter set of processes, thereby becoming their constituting process in that role. As such, processes mutually constitute one another; that is, to say the same thing, processes are overdetermined by one another. Looking at it from a different angle, every process provides conditions of existence for other processes. For example, an individual’s subjective disposition or consciousness is constituted by the combined effects of an infinite number of mutually constituting processes. This ranges from the brain/psyche of the individual (including the processes emanating from the unconscious and neuronal networks) to the effects of social and natural processes and the form of multilayered relationships in which the individual is embedded. In this regard, the constituting effects serve as the ‘conditions of existence’ of who we are and what becomes of us. Furthermore, these conditions of existence are unique, i.e. irreducible and in opposition to one another, thereby pulling and pushing the process being constituted in multiple directions. Conditions of existence are therefore said to be in contradiction to one another; by corollary, every process is a bundle of contradictions. Because overdetermination encapsulates contradiction, every process is said to be overdetermined and contradictory, which entails that processes are in a state of flux – their movement uncertain and prone to contingency and their direction uneven, i.e. non-teleological. There is only one thing that is constant in overdetermination: change. That is why the unit of analysis under overdetermination is the process: an entity in a state of change. Every process is changing and changing everything else. In this context, relationships (say, a relationship of friendship or love), practices (say, organizing a trade union in a factory) or activities (say, cooking food at home) are to be seen as the combined effects of a specific cluster of overdetermined and contradictory processes that comprise and transform them. A site is a conceptually defined space where groups of relationships, practices and activities occur whose combined effects constitute the site. The individual, the family, the school and university, religious institutions, parliament, the trade union, the capitalist enterprise, the state, the Indian economy etc. are illustrations of sites.

An initial thesis emerges: the logical structure of explanation under overdetermination is non-reductionist. No entity – its existence, role, movement etc. – can be reduced to a pregiven core process that stands beyond the scope of being affected by other processes. It also means that no process, under any situation, can be taken as completely characterized by itself and hence as a predetermined prior existing somehow independently of the rest of the processes to which it becomes ultimately related. There is no ‘independent’ variable to which the dependent variable can be ultimately reduced. For example, the generic representation of X and Y variables in economics through X = a + bY, with X dependent and Y independent, is inconsistent with overdetermination. If that were to happen, the a priori independent process would stand out as privileged in the order of explanation and the produced system of meanings. As an extension, overdetermination rejects mutual causality too, since even when it reverses the order of cause and effect (X determining Y and Y determining X), in each case the order and the mode of explanation (cause and effect being distinct, one existing prior to the other, and one being reduced to the other) are maintained. Take another example – are we who we are because we are 30 per cent nature and 70 per cent nurture? Answering the question this way would certainly be inconsistent with overdetermination, which would argue that we are constituted by both nature and nurture (Levins and Lewontin [1985]). Their mutually combined effects producing the human self cannot be legitimately separated out in strict statistical percentages.

Since all sites are clusters of overdetermined and contradictory processes and the ontological reality is seen as a configuration of all these sites, the socio-economic-natural formation/reality cannot be reduced to any historically deterministic laws or underlying rationale, with a purposive logic underlying the inexorable transition of economy and society from a preordained origin to a predestined end or telos (such as that articulated by the Rostow model, the Lewis model, the Solow model or classical Marxist varieties of historical materialism). Though we do not pursue the discussion any further here (see Chakrabarti and Cullenberg [2003]; Chakrabarti et al. [2015]; Chakrabarti and Dhar [2017]), it is pertinent to note that overdetermination renders deterministic history spurious, since there is no a priori rational element or inner law of motion to which the trajectory of economy and society can be reduced.

The contrast of the non-deterministic structure of causality with that proposed under the popular deterministic epistemology is palpable. In the latter, cause (X) and effect (Y) are strictly separate, such that the causal factor is primary/antecedent, whereas what it affects is secondary/subsequent. In any proposed order of explanation, the factor which causes is not the object of effects from other factors and hence is beyond interrogation. It is active because it causes other factors but passive because it remains outside the scope of determination by virtue of existing a priori; as such, the existence of the causing factor is pregiven and considered primary in that relation and therefore in the order of explanation. In contrast, overdetermination states that no entity is outside the grip of constitution and none can be envisaged as primary; every process is both cause and effect and hence changing.

Under the deterministic logical structure of causality that is a hallmark of positivism, it is held ‘that any apparent complexity – a person, a relationship, a historical occurrence . . . can be analysed to reveal a simplicity lying at its core’ (Resnick and Wolff [1987, pp. 2–3]). That core causal factor which determines others without being itself the target of determination by others is known as the essence. Not surprisingly, determinism, with its causal force of essence, operates via reductionism, since the explanation of the existence of anything and its functions (individual, structure, social norms, practice, institution, event, history etc.) is reduced to that of the essence. Therefore, if Y is taken as the essence and X the entity to be causally determined by it, the equation X = a + bY is a mere extension/representation of the underlying structure of causality. How the relation between X and Y will evolve is then guided by this deterministic structure, which says that the explanation of X is ultimately reducible to Y.

Take another example, in relation to human/thinking and nature/matter, to elaborate on the differences between the two logical structures. While there are many issues concerning how their relationship is conceptualized, let us focus on just one: morality. One of the dominant strands of the Enlightenment called for making the human the centre (Kant [1784]), as distinct from nature/matter (Bilgrami [2009]). Not only is the human distinctly exclusive, but the human is also one who can know and control nature for his or her gain (Bilgrami [2009]; Henry [2017]). Thus, amongst other things, utilitarianism was born, where nature was transformed into natural resources that could be experimented with and used for the utility of human beings. As one can anticipate, the birth of industrial capitalism would have been impossible without this epistemological break. This vision of the world is thus underpinned by a deep-seated, pregiven dualism between human and nature, such that thinking, structure, action and morality/ethics came to rest upon human desires, subjective utilities and interests, whether of earthly gains or of higher states of mind, such as sympathy and moral sentiments. Moreover, this human-centric understanding of nature (what we call humanism) means that morality/ethics/justice are derived from the human mind as ideas and codified in principles and laws that one is supposed to follow. Such a reductive understanding of morality/ethics/justice is rejected under overdetermination. Here, human and nature constitute one another; hence, there can be no morality/ethics/justice outside of their overdetermined and contradictory relations. This means that values are derived only in relation to nature or ‘this worldly’ existence.

This is a picture of values in which values are not merely something we create and ‘project’ onto the world . . . but they are things that are found in the world. . . .
a world of nature, of others who inhabit nature with us, and of a history and tradition that accumulates in the relations among these, and within which value is understood as being ‘in the world’.
(Bilgrami [2009, p. 49])

Values in the world condition and guide our emotional and practical relations and actions. Since our practical agency is influenced and guided by such normative demands, humans are also moral beings whose decisions and actions cannot be reduced to those of asocial, atomistic, calculating creatures. Interestingly, this philosophical rejection of the liberal category of the actor as an atomistic individual existing independently of ‘this worldly’ relations is perhaps as old as human civilization. Thus, in the Bhagavad Gita, we get the following:

All actions take place in time by the interweaving of the forces of Nature; but the man lost in selfish delusion thinks that he himself is an actor.
(The Bhagavad Gita [1992, pp. 27, 58])

Despite the different approaches and objectives, there seems to be an uncanny similarity between the underlying epistemology of this quotation and the current of overdetermination which, as we shall see later in this chapter, is adopted in class-focused Marxian theory.
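Before turning to economics, it may help to compress the three logical structures discussed in this section into schematic notation. The notation below is ours, offered only as a summary device, not as a formalism used by Resnick and Wolff or the other authors cited:

\[
\text{Determinism:}\qquad Y = f(X), \quad X \text{ pregiven (the essence) and itself unexplained;}
\]
\[
\text{Mutual causality:}\qquad Y = f(X) \ \text{and} \ X = g(Y), \quad \text{cause and effect still distinct and ordered;}
\]
\[
\text{Overdetermination:}\qquad P_i = F_i(P_1, \ldots, P_{i-1}, P_{i+1}, \ldots) \quad \text{for every process } P_i,
\]

with no privileged term, no fixed order of explanation and every \(P_i\) simultaneously cause and effect of all the others.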
3 Determinism and economics

The point remains that, in the modern history of knowledge, the deterministic structure of causality has had profound implications, particularly in normalizing the binary/dualistic way of thinking and the systems that rose on that foundation. The dualist frame has two components: first, the structure of two – whole and part, structure and individual, human and nature, man and woman, modern and tradition, capitalism and non-/pre-capitalism, the West and the Rest, etc.; second, a world view and mode of analysis whereby the structure of two is founded on and operates through the logic of one. The human–nature relation becomes anthropocentric; the man–woman relation, androcentric; the capitalism–noncapitalism relation, capital-centric; the industry–agriculture relation, industry-centric; and so on. The binary/dualist structure thus becomes monist in logical orientation. Overdetermination challenges this way of structuring our understanding of the world and our modes of reasoning. It is also the reason why this variety of knowledge production marks difference; it challenges the deterministic ‘consensus’ in ‘how’ we look at things and perceive relations, practice and history.

In the field of social science generally and in economics particularly, the deterministic epistemology unveils itself in two kinds of economic discourse – neoclassical and Keynesian. In the former, the whole is reduced to the parts (Cartesian totality), and in the latter, the parts are reduced to the whole (Hegelian totality) (Cullenberg [1996]). The debate between Cartesian and Hegelian totality has been particularly intense in the last couple of centuries; in economics, it is reflected in the fierce debate between Keynesian and (neo)classical theories in the last century, which boiled down to a tussle between theoretical humanism (individuals as parts determining the structure/whole so as to produce a spontaneous and harmonious socio-economic order, i.e. equilibrium) and structuralism (the structure/whole determining the individuals/parts). Either way, the two have something in common. They end up producing a centred totality, a totality centred on either the whole or the part. Overdetermination, on the other hand, produces a decentred and open-ended totality, one that cannot be reduced to either the parts or the whole.

Let us expand the contrast a bit more. The Hegelian totality is a variant of holistic social theory that can be traced as far back as Aristotle (Cullenberg [1994, Ch. 2]). Its core idea is that the whole is prior to and independent of the parts. The whole is centred on an inner essence that makes the whole a structured totality and gives it purpose. Parts appear and function as parts of the totality and, at times, as expressions of the inner essence structuring the whole. There are several kinds of holistic social theory; some, such as the Hegelian totality, additionally have a logic of history associated with the existence and function of the structured totality. It influenced a specific Marxian version of historical materialism. Another form of holistic social theory is Keynesian theory, which works with a structured totality and embodies features of expressive causality, as in Hegel, but has no added logic of history.

The Hegelian totality gives rise to a particular type of causality – an expressive causality (Althusser [1969]; Cullenberg [1994, 1996]). Expressive causality means that the parts in the totality act out or express their roles according to the pregiven nature of the totality (the ontological world). In this case, some aspects derived from or representative of the structure/totality serve as the essence, which functions as a virtual law of motion in terms of which the different parts – for example, individuals – appear, function and move. The parts thus exist in the end as expressions (appearances) of a prior, pregiven whole/structure; the whole causes the parts, i.e. the parts are derived from the whole. In this sense, the Hegelian totality is associated with a particular form of essentialist causality. Theory derived from the centrality of structure or from features governing the structure (such as some underlying laws or elements) is referred to in social science as structuralism. As mentioned earlier, an example of structuralism in economics is the post-Keynesian interpretation of Keynes’s The General Theory of Employment, Interest and Money, which is sustained by an a priori entry point of

(1) mass psychology (populations’ propensities to save out of their incomes for unforeseen contingencies that in turn cannot be calculated) and conventions for making economic decisions, (2) the power of institutions (e.g. labour unions and government) to motivate and shape individual behaviours in markets, and (3) the ‘animal spirits’ that govern individual investors’ decisions about when and where to invest and how much.
(Resnick and Wolff [2012, p. 348])
Not only are individual decisions derived from the given properties of social structures, but the structure/general economy also has features that cannot be reduced to the parts/individuals.3 Hence, business cycles, crises, gluts, booms, busts, overproduction, underconsumption, panics, bubbles, upturns, downturns, recessions and depressions cannot be explained merely from an individually derived economic system, which is the hallmark of the Cartesian system as followed by (neo)classical economics.

In contrast, the Cartesian totality considers the parts as primary, from which the whole is derived. This totality is an assertion about the ontological structure of the world where ‘any totality, whether animate, inanimate, or social is composed of a set of basic elements or atoms which exist prior to and independent from the totality’ (Cullenberg [1996, p. 127]). The totality, then, is nothing more than the resulting configuration of the patterns of interaction of its independently constituted parts/atoms. This approach is best captured by Descartes’s comparison of the world to a machine. In Part V of his Discourse on Method, Descartes outlined his approach by comparing the inanimate and animate world to a machine. Through it, Descartes construed modern thought as reflecting the idea that ‘the world was ultimately decomposable into a set of independently constituted parts, and therefore that the “proper” method for understanding reality was to discover, and then analyse one by one, its pre-existing parts’ (Cullenberg [1996, p. 128]). With its essential authenticity intact, each part is independent and autonomous of the totality – the machine. These parts, through a reductionist or inductive logic of causality connecting them, constitute the machine. In Part II of the Discourse on Method, Descartes put forward the fundamental rules of his method. These rules suggest breaking down any problem of understanding into its most robust and evidently true elements. It is then possible to build up knowledge in an orderly fashion from these basic elements. Whereas the understanding of the Cartesian totality is used in different fields – including natural sciences such as biology and physics (Levins and Lewontin [1985]) – in the field of social science the machine, a metaphor for the whole/structure, is best seen as nothing more than the summation of the reassembled individuals, individuals being a metaphor for atoms. Individuals thus emerge as the basic unit of analysis. Since the social is reduced to the parts, exploring the parts or individuals, and not the whole or structure, is the key to revealing the social. The fundamental assertion is that all social explanations are incomplete, and therefore incorrect, unless and until they can be derived from the behaviour of independently constituted individuals. Individuals are to be understood as ‘atoms’ of more complex social institutions, such as classes, governments, enterprises, whole societies and so on. Complex social institutions, their characteristics and their properties are to be deduced from the beliefs and dispositions of individuals and not vice versa. This commonly deployed mode of explanation, ‘methodological individualism’, is a form of social explanation based squarely on the ontological structure of the Cartesian totality (Elster [1985]; Roemer [1986]).
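Because methodological individualism is a mode of explanation rather than any single model, a toy computational sketch may make its commitments concrete. The following illustration is our own (written in Python; the functions, names and numbers are hypothetical, not drawn from the chapter or from any cited author): a market-level outcome – the clearing price – is computed from nothing but independently specified individual ‘atoms’, exactly the direction of derivation, from parts to whole, that the Cartesian totality prescribes.

def individual_demand(price, willingness_to_pay):
    # Each buyer is a self-contained 'atom': demand depends only on the
    # agent's own pregiven preference parameter and the price.
    return max(willingness_to_pay - price, 0.0)

def individual_supply(price, unit_cost):
    # Each seller likewise responds only to its own cost and the price.
    return max(price - unit_cost, 0.0)

def excess_demand(price, buyers, sellers):
    # The 'whole' (the market) is nothing more than the sum of its parts.
    demand = sum(individual_demand(price, w) for w in buyers)
    supply = sum(individual_supply(price, c) for c in sellers)
    return demand - supply

def clearing_price(buyers, sellers, lo=0.0, hi=100.0, tol=1e-6):
    # Bisection search for the price at which aggregate excess demand
    # vanishes: the social order 'emerges' from the pregiven atoms.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess_demand(mid, buyers, sellers) > 0:
            lo = mid  # demand exceeds supply, so the price must rise
        else:
            hi = mid
    return 0.5 * (lo + hi)

buyers = [10.0, 8.0, 6.0]   # hypothetical willingness-to-pay parameters
sellers = [2.0, 3.0, 4.0]   # hypothetical unit costs
print(round(clearing_price(buyers, sellers), 2))  # prints 5.5

Note what the sketch does not contain: no property of the market is specified anywhere except via the individual functions. An overdeterminist would object at precisely this point – the ‘preferences’ and ‘costs’ are treated as pregiven essences, themselves beyond constitution.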
In neoclassical economics (which we take as a modern/micro-foundational reincarnation of the classical vision of capitalism as a harmonious socio-economic order), the general equilibrium model in the tradition of Arrow and Debreu is an
example of an economy conceived of as a machine, where the economy is understood as the equilibrium result of the interactions of pregiven agents/atoms; in demonstrating that an economy in competitive general equilibrium is ‘Pareto-optimal’, it is claimed that such an arrangement will spontaneously lead to a harmonious social order that also maximizes individual welfare. Because the human, aka the individual, is the centre of everything that follows in neoclassical economic theory, its adopted approach is said to be that of theoretical humanism:

Theoretical humanism (is) a decidedly post-Enlightenment philosophical orientation that establishes a vision of social harmony premised upon a notion of a human subject who is self-conscious, self-transparent, rational, and autonomous. . . . to put it as economically as possible, the neoclassical tradition is structured around the theoretical problematic of reconciling the individual and the aggregate (collective, social, market) rationality.
(Madra [2016, p. 5])

In contrast to Keynesianism, business cycles and other forms of instability are explained from the decisions and actions of agents; our good and bad times are ultimately reduced to the totality of those decisions and actions and thus cannot be blamed on social or systemic characteristics. In the field of macroeconomics, the central point of the counterrevolution by the new classical school against Keynesianism of the synthesis variety was precisely the charge that the latter’s macro models lack micro-foundations traceable to self-conscious, self-transparent, rational and autonomous human subjects. This attack on Keynesianism did not owe its origin merely to the critique launched by new classical macroeconomists – prior to that, the philosopher Karl Popper and the philosopher-economist Friedrich Hayek had launched a critique of holistic social theory as a whole.

The long-standing battle between Keynesian and neoclassical economics over the representation of capitalism and its business cycle exemplifies the point that their differences and modes of analysis/reasoning are based on their specific adoptions of deterministic epistemology. While each of these approaches – here, Keynesian and neoclassical – differs in relation to what it considers the essence, whether a pregiven whole/structure or the rationality of a fully self-constituted agent, each shares with the other a reductionist approach, wherein social explanations are ultimately constructed on the basis of a rock-bottom essence. In both approaches, attention to contingent and varied institutional contexts, to social embeddedness and to the de-centred and mutual constitutiveness of structure and agent is generally avoided. Later on, we shall explore both theories further.

In contrast, our version of Marxian theory, which takes overdetermination as its epistemology, distinguishes itself from both determinisms on the relation between whole and part. Consider a process. Is it a whole or a part? The answer is both. In so far as a process is a site of the combined effects of all its constituting conditions of existence, it is a whole. On the other hand, in so far as it is one of the constituting effects of other processes, it is a part. This characteristic of every process being simultaneously whole and part is an extension of the non-deterministic logical structure where every process is both cause and effect.
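For readers who want the Arrow–Debreu claim invoked earlier in this section stated precisely, a textbook-standard formulation (our rendering, not the chapter’s) is the following. In an exchange economy with agents \(i = 1, \ldots, n\), each endowed with a commodity bundle \(e_i\) and a utility function \(u_i\), a competitive equilibrium is a price vector \(p^*\) and an allocation \((x_i^*)\) such that

\[
x_i^* \in \arg\max_{x_i} \, u_i(x_i) \quad \text{subject to} \quad p^* \cdot x_i \le p^* \cdot e_i, \qquad \text{and} \qquad \sum_i x_i^* = \sum_i e_i .
\]

The first welfare theorem then states that, under local non-satiation of preferences, any competitive equilibrium allocation is Pareto optimal: no feasible reallocation can make one agent better off without making another worse off. Note how the schema instantiates the Cartesian totality discussed above: the only primitives are the individual \(u_i\) and \(e_i\); the social order is derived.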
4 Truth and economics

Having clarified the difference of deterministic logic from overdetermination as a competing mode of logic and reasoning, we now move to the question of the relation of theory to reality and truth. This discussion is important because of the underlying untheorized claims of neoclassical and Keynesian economics – namely, that their produced conception of reality is the reality itself. Therefore, it is not surprising that much of the battle in economics is over which economic theory best mirrors ontological reality and hence is the absolute truth. One comes across ‘true versus false’, ‘scientific versus unscientific’, ‘rigorous versus substandard’ etc. being used by economists against one another, one theory against the others. The isomorphism between knowledge production (a claim about what reality is) and the ontological (where the claim is magically turned into the reality itself – the truth of the world) in turn flows from the two dominant post-Enlightenment epistemological traditions – empiricism and rationalism – that have gripped the foundations of diverse economic discourses. In contrast, an absolute truth claim is impossible if the claim of knowledge produced by each theory is never allowed to transmogrify into an ontological claim about what the world is. This is the case with overdetermination. This is not to say that there is no relation between the epistemological position of a theory and ontological reality under overdetermination; rather, it is to say that the relation cannot be taken as a straightforward isomorphism. What we have instead are relative truths, i.e. claims by each theory of its respective knowledge of the world, without any added condition that the world per se mirrors the claimed view of the world in theory. Here, we can showcase overdetermination to demonstrate conceptions and claims of truth as relative rather than absolute. This discussion of conceptions and claims of reality and truth, then, is one way of settling whether economics ought to be seen as a foundationally singular/unified discipline or as heterodox.

Let us start with overdetermination. Central to the deterministic field of rationalism and empiricism is the purported ontological division between thinking and being, mind and body, subject and object and so on. Overdetermination rejects such dichotomy with great consequences:

Overdetermination is more than a non-reductionist, mutually determined form of causality. Overdetermination . . . admits no causal essence, whether a pre-given Subject or an all-encompassing whole, nor does it admit any telos to which contradiction(s) are subservient. . . . As an epistemology, overdetermination implies that thinking (the realm of theory) and being (the material realm) mutually constitute one another. . . . there is no ontological gap between thinking and being as the traditional epistemologies of empiricism and rationalism maintain.
(Cullenberg [1996])
How is it that there is no ontological gap between thinking and being? The answer lies in the way overdetermination recasts the relation between whole and parts. Once we accept that thinking, too, is a process that occurs in conjunction with all other processes, it constitutes all other processes as much as they constitute it. In Marx, material reality is the combined effect of all processes, including the thinking process (it is the concrete-real or the ontological world qua the whole/totality); the thinking process, at any time, exists as an outcome of the combined effects in the concrete (it is the thought-concrete). This means that, even when abstracted, human thought cannot be taken as independent and autonomous of the concrete relations that constitute it (and that includes the effect of the thinking process already congealed in it). Rather, the concrete is already embodied in the abstracted thinking process. Instead of conceptualizing the individual in abstraction as independent and transparent, Marx suggests a conception of the socially embedded individual in abstraction (Marx [1845]). This means that Marx has a radically different theory of abstraction, and of the relation between the abstract and the concrete, in comparison to the more positivist renditions of abstraction. Moreover, the point remains that the two concretes – thought-concrete and concrete-real – are in an inalienable relation, each the cause and effect of the other. It is at the moment of their overdetermination and contradiction that knowledge is produced:

Knowledge, for Marx, is the process connecting concrete-real to the thought-concretes. The knowledge process that connects both concretes connects also the ceaseless transformation of both. . . . Thinking is a process of change: change in both the concrete-real and thought-concretes. Thinking cannot, therefore, be conceived as either the consequence of the concrete-real or its effect. Rather, thinking is both a creative, active constitutive part of the concrete-real and a process overdetermined in and by that concrete-real. This was a major theme of Marx’s critique of materialism.
(Resnick and Wolff [1987, pp. 55–56])

Evidently, the mind/subject/thinking and body/object/material dualistic division is non-functional here, since ‘all objects (are) overdetermined by the totality of social processes, including the thinking process of subjects’ (Resnick and Wolff [1987, p. 56]). Each theory, then, is a certain way in which the concrete-real is connected to the thought-concrete, thereby producing a distinct conception of the world. Because different theories connect concrete-real and thought-concrete in unique ways, different channels of knowledge production for understanding reality emerge. One thus can only have multiple theories and multiple truths.

To reveal the contrast, we now move to explore the two deterministic epistemological standpoints of theory – rationalism and empiricism – and the subsequent claims of absolute truth they make (Resnick and Wolff [1987, Chs. 1 and 2]; Resnick and Wolff [2012, Ch. 7]).

Empiricism considers reality to be causally prior to theory. It begins with the presumption that there is a real, factual world out there that can be known, i.e. experienced by means of our five senses – sight, smell, taste, hearing and touch. Our thinking produces ideas about the world, which are then compared to the facts that our senses have gathered through experience. Discovering the truth about the world is then a matter of which ideas best correspond to the gathered facts. The theory most successful in approximating reality through the neutral/scientific medium of the senses is the one capable of best fitting the facts and hence revealing the real truth. Hence, theory must mirror the facts/experience, i.e. reality (Rorty [1979]). That theory is truthful.

Rationalism inverts the causal order of empiricism, such that it is the theory that is causally prior and capable of expressing reality. According to rationalist epistemology, ‘human reasoning is structured in a manner isomorphic to the ontological structure of the world’ (Cullenberg [1996]). The mind is therefore adequate to the task of tracking the causal or rational order of the concrete world out there. Rather than by the standard of the senses, nature and society are knowable through human reason or the rational element that can be extracted from the storehouse of knowledge production and research. By isomorphism, the essence – the rationale extracted from the knowledge storehouse – forms the core around which the logical order of reality can be revealed and viewed. The theory that unpacks the true essence of reality is the one most capable of revealing the truth. Under rationalism, then, theory determines and validates facts. Facts contrary to the theory would be considered unimportant, misguided or irrelevant. In an opposite vein to empiricism, facts must mirror the true theory. A specific example of rationalist epistemology in a somewhat Hegelian vein is the classical Marxian historical materialist model, where the facts of society (state, law, property, class, surplus etc.) and their transition are ultimately explained either in terms of the forces of production, whose existence and development are taken as the given rational element of society and history (Cohen [1978]), or in terms of the mode of production (forces and relations of production together) (Olsen [2009]). The rest of the facts (gender, caste, market etc.) exist only to serve or move this rational element and hence are rendered relatively unimportant or secondary in the order of explanation or history.

Empiricism and rationalism thus differ in terms of the chosen (neutral) standard and the order of causality deployed to reveal the true reality. However, as in empiricism, under rationalism all other theories are excluded in favour of one specific theory – the theory considered most capable of revealing the essence (the core underlying the truth of the world). In other words, deterministic epistemology – empiricism or rationalism – tends to enshrine absolute truth. Consequently, theories that construct knowledge based on deterministic epistemology not only take recourse to reductionist reasoning, as we have seen earlier, but also search for truth in an absolutist sense, thereby introducing the division between ‘right’ and ‘wrong’ theory and between what is ‘science’ and what is ‘ideology’. In turn, this knowledge production in terms of ‘right’ against ‘wrong’ theory, or of science against ideology, leads one from epistemological essentialism to theoretical essentialism, in the sense that all other theories are excluded in favour of one specific theory considered capable of revealing or discovering the truth. In sharp contrast, by rejecting determinism, whether in the logical structure or in the process of knowledge production, overdeterminist epistemology moves away from the reductionist view of reality and from absolute truth claims.

This discussion is important because, in economics, both the neoclassical and the Keynesian theories we are interrogating have married their adopted deterministic structures of causality or logic (humanist versus structuralist, as discussed earlier) to rationalist and empiricist epistemologies (whether facts or theory is the primary determining element) so as to construct knowledge of economic reality and truth. The logic of relationship they elaborate in their descriptions tends to function in a scenario where the truth of reality conforms either to elements of theory (epitomized, for example, by a specific law of motion of capital accumulation, as in the Solow model, driving the path and predestined end of growth in all economies, rich or poor, with similar specifications) or to facts (knowledge deciphered through field surveys or laboratory testing). This is not to say that these instruments or means of knowledge production are per se illegitimate, but rather to point out that, when deployed in the context of rationalist and empiricist epistemologies, they tend to lend themselves to a deterministic logic of reduction and an unquestionable claim of absolute truth, where the theory’s claim about the world is transformed into a truth claim of the world itself. The contrast with overdeterminist epistemology could not be more palpable.
5 Economy and non-economy: the contesting perspectives

In abstraction, the economy can be defined as the ensemble of all the processes related to the production, distribution and consumption of goods and services. These economic processes occur within a set of relations with one another and with other social and natural processes. The key question concerns the character of their relationship, and here the economic representation diverges across the two perspectives – overdetermination and determinism.

Under overdetermination, the ‘economy’ – the processes of production, distribution and consumption – remains constituted by various non-economic processes, i.e. the political processes that structure the rules and regulation of authority over individuals and groups, the cultural processes that produce and disseminate meanings and, of course, the natural processes. The ‘economy’ in any existential situation can never be circumscribed by, and hence explained by, only ‘economic processes’. Changes in non-economic processes affect economic processes and vice versa. Consequently, an ‘economic’ analysis will remain economy-focused, but it can never be economy-specific. That is, one cannot confine economic explanation to merely economic processes. Nor can explanation be ultimately reduced to some economic process. That would be economic determinism. For example, under an influential variant of classical Marxism, the construction and explanation of superstructures (political and cultural domains) are seen as ultimately emanating from the economic/class relations of production, and/or the relations of production are seen as derived from the level of the forces of production (the stock of means of production and labour power) – the former is economic determinism; the latter, technological determinism. Moving further, it is claimed that structure (economic, political and cultural) determines the subject’s consciousness and subjectivity; this is akin to structural determinism or structuralism. In contrast, under overdetermination, no such determinisms can be accommodated, since the processes in each relationship are bonded through mutual constitution, none more important than the other; thus, the ‘economy’ (say, the complex combination of forces and relations of production) has excess meanings irreducible to the economic processes per se – by virtue of being constituted by non-economic processes (where each is the cause and effect of the other) – it is always already social. That is the only sense in which we can characterize the presence of the economy. Under Keynesian and neoclassical economics, the essential core to which the economy is reduced is something else, as we shall soon observe.

As an initial example of a variation within neoclassical theory, take Gary Becker. How does he view economic theory? Not as the study of the production of goods and services or even of the market economy. If we are to follow Becker, economic theory is defined by the method of analysis personified in the conduct of homo economicus, which takes the form of ‘maximizing welfare as they conceive it, whether they be selfish, altruistic, loyal, spiteful, or masochist’ (Becker [1993, p. 386]). This maximization of welfare takes place under constraints which are also diverse, including ‘income, time, imperfect memory and calculating capacities, and other limited resources, and also by the opportunities available in the economy and elsewhere’ (Becker [1993, p. 386]). Evidently, the rational conduct underlying homo economicus, transpiring through cost-benefit calculation, is what guides all kinds of behaviour:

Indeed, I have come to the position that the economic approach is a comprehensive one that is applicable to all human behaviour, be it the behaviour involving money prices or imputed shadow prices, repeated or infrequent decisions, large or minor decisions, emotional or mechanical ends, rich or poor person, men or women, adults or children, brilliant or stupid persons, patients or therapists, businessmen or politicians, teachers or students. The applications of the economic approach so conceived are as extensive as the scope of economics in the definition given earlier that emphasizes scarce means and competing ends. It is an appropriate approach to go with such a broad and unqualified definition, and with the statement by Shaw4 that begins this essay.
(Becker [1976, p. 8])

Given the Beckerian definition of economic theory and the universalization of human behaviour as a divine explanatory device, aspects which are traditionally seen as belonging to ‘non-economic’ domains (such as fertility, childbirth and child rearing, health, education, crime and punishment and leisure) now become a discursive field for economic analysis, because the individuals’ behaviour qua rational conduct can be seen here as revealing the outcomes and the explanation of why they are realized.
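The generic Beckerian choice problem gestured at in these quotations is often compressed into the following form (a standard textbook rendering, not a quotation from Becker):

\[
\max_{x_1, \ldots, x_n} \, U(x_1, \ldots, x_n) \quad \text{subject to} \quad \sum_j p_j x_j \le m \quad \text{and} \quad \sum_j t_j x_j \le T,
\]

where the \(x_j\) may be any objects of choice whatsoever – children, crimes, hours of leisure – the \(p_j\) money prices or imputed shadow prices, \(m\) income, \(t_j\) the time each activity absorbs and \(T\) the total time endowment. It is precisely the unrestricted domain of the \(x_j\) that licenses the imperializing move described in the next paragraph.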
In the neoclassical tradition, Becker’s economic reductionism exemplifies a humanist approach that, through the lens of the purported human behaviour, projects its derived understanding of capitalist society as an instance of spontaneous and harmonious order of capitalism itself. Evidently, underlying this humanist approach is an unsubstantiated albeit hidden isomorphism between the economic theory of society and the concrete society; the latter is taken as a mirror of the former. The strength of the Beckerian approach of analysis is its imperializing of the non-economic domains through their reduction to explanations emanating from economic theory, which in turn is reduced to the essence of human behaviour guided by cost-benefit calculation to resolve the choice problem amongst competing ends. In so far as everything is claimed to be derived from this core essential element, it makes not only other economic theories but also much of the other social sciences redundant. This kind of deterministic analysis and absolute truth claim reducible to economic theory cannot be accommodated under overdetermination. What, when, where and how we produce, distribute and consume are not simply matters of economy but are also constituted by processes related to authority (political), meaning (cultural) and nature – each constitutes the other. Any notion of an independent and autonomously procreating economy (or, as in Becker, extended to the whole society) that is reducible to one or a few core aspects – such as specific human behaviour (in the neoclassical tradition) or laws of motion of development of forces of production/relations of production (in the classical Marxian tradition) as and when it is imagined to exist – is rejected outright by the epistemology of overdetermination, and so is the legitimacy of any analysis and policy made with such an understanding of economy. Clearly, then, the choice of epistemology leaves its indelible mark on the produced knowledge of the economy and the kind of world attempted to be produced through practice and policy.
6 Entry point and differentiating theories

There is a problem though. Our conception of an overdetermined and contradictory economic reality implies encountering a wide array of processes beginning from nowhere and ending nowhere. That is, in so far as overdetermination describes economic reality as the cluster of an infinite number of economic and non-economic processes that constitute one another, it is impossible to produce knowledge/theory from within such a complex reality. Where does one begin and end the analysis? How does one break the mentioned impasse of knowledge production in an overdetermined reality? To enable construction of knowledge, we must choose one process or a composite of processes as an entry point to produce knowledge of society specific to and regulated by that entry point. All theories thus must have an entry point – a focus of analysis. Since ‘society is the totality of all designatable processes’ (Resnick and Wolff [1987, pp. 4–5]), any theory built on a chosen entry point process will focus upon and specify processes of society that connect with the process chosen as the entry point. The entry point, once specified, would connect with the adopted
epistemology – deterministic epistemology or the non-deterministic epistemology of overdetermination. In combination, they will give rise to a particular theory. Other than the adopted epistemology, entry point is thus another significant marker in terms of which theories are constructed and differentiated. Neoclassical economics, Keynesian economics and Marxian economics each have a distinct process or a composite of processes as their chosen entry point; consequently, their descriptions of the economy are produced in connection with their respective entry points of analysis (see Amariglio et al. [1990]). That, in turn, explains why these theories disagree over the character and description of the economy. They disagree, in part, because each is drawing attention to and, in fact, defending the relevancy of its respective entry point.
6.1 Entry point under overdetermination

While the idea of entry point has been considerably discussed elsewhere (Resnick and Wolff [1987, 2006]; Chakrabarti and Cullenberg [2003, Ch. 1]), it will be helpful to specify the steps that must be followed for producing any theory, such as an economic theory. First, one or more processes are chosen as the entry point specified by an adjective: ‘class’ process, ‘gender’ process, ‘caste’ process etc. For example, a Marxist theory takes ‘class’ defined in terms of surplus labour as an economic entry point (Resnick and Wolff [1987, 2012]); another contending classical Marxist theory takes forces of production as the entry point (Cohen [1978]). As we shall see later, some theories, such as neoclassical and Keynesian, take a unique combination of aspects as their respective entry point. Why does one choose a specific entry point? Like any other process, this choice, too, is overdetermined. Second, creation of knowledge would proceed by connecting the chosen entry point – the focus of analysis – to other processes, economic as well as non-economic. Thus, economic theory would be specified by the connection of the chosen entry point (such as class) to its related processes (other economic and non-economic processes). Each process in this connected chain would serve as both the cause and the effect; as such, from an overdeterminist perspective, (1) the entry point process remains in a state of change and hence cannot serve as undetermined origin or essence of any explanation and (2) explanation would be entry point–focused but not entry point–specific. That is to say, any explanation of class and non-class relation cannot be made by reducing it to class alone. With the chosen economic entry point of class, we have a distinct understanding of society that is class-focused but not class-specific. Clearly, then, entry point signals the order of discourse constituting the focus of analysis. This also tells us that if we place ourselves within the epistemology of overdetermination, no theory can claim to explain all the possible dimensions of society. As and when such totalizing and absolute claims are made, they are false and hence illegitimate. Unlike the deterministic epistemologies of rationalism and empiricism, there can be no such thing as absolute truth. There can be no claim of absolute truth, because no such truth is accessible or possible. Truth is always partial and relative, specific to the focus designated by the entry point.
6.2 Entry point under determinism

The relation of entry point process with other processes appears differently under a deterministic structure of causality. In such deterministic theories, typically, production of knowledge, method of reasoning and analysis take an additional leap by making their chosen entry point process the essence or the exclusive determining element of what is to be explained. In short, their underlying epistemology forces such theories to arrest overdetermination by turning the entry point process into the fundamental causal force of the relationship/institution/event/history to be explained. This causal force is pregiven and hence static/unchanging in the order of explanation. In this order of explanation, reductionism rules, whereby complex entities are reduced to some simple underlying core. In economics, notwithstanding their different entry points, neoclassical and Keynesian economics make their chosen entry points the a priori essence from which everything else follows. This reduction to the essential core qua entry point at the level of explanation also gets combined, more often than not, with the rationalist or empiricist idea of construction of theory. That is, the explanatory claim made by the theory about the world gets magically transformed into a claim about the world itself. Therefore, the claim of absolute truth (through either rationalism or empiricism) comes to be associated with the discovery of this essence and its reductionist property of deducing everything else from it and is seen as the approach to reveal the truth. Unlike in overdetermination, deterministic theories are evidently seekers and defenders of absolute truth. As we shall see later, Keynes castigates the classical theory for being wrong/untruthful on the representation of economy; the (neo)classicals pay back the Keynesians in the same coin (Lucas [1976]; Lucas and Sargent [1979]).
6.3 Differentiating economic theories

We surmise that two yardsticks differentiate theories: the chosen epistemology and the entry point. In economics, as explained earlier, Keynesianism and neoclassical economics are differentiated by their different entry points and their shared deterministic structure of explanation, which functions by reducing some aspects (what is affected) to a pregiven aspect (the cause), even though they differ in what are to be cause and effect (structuralism versus humanism). For example, the disagreement on the chosen entry point and forms of deterministic epistemology helps explain Keynes’s dismissal of the (neo)classical approach: For Classical Theory has been so accustomed to rest the supposedly self-adjusting character of the economic system on the assumed fluidity of money wages, and when there is rigidity, to lay on this rigidity the blame for maladjustment. . . . My difference from this theory is primarily a difference of analysis. (Keynes [1936, p. 257])
TABLE 4.1 Snapshot view of difference in economic theories

Theory         Entry point                                           Epistemology
Marxian        Class process of surplus labour                       Overdetermination; relative truths
Neoclassical   Wants (preferences) and scarcity (technology          Deterministic/Cartesian; absolute truth
               and endowments)
Keynesian      Social structures (mass psychology, exogenous         Deterministic/Structuralism; absolute truth
               uncertainty, social conventions and institutions)

Source: Modified from Resnick and Wolff (2012, p. 352).
Theories may differ not only on account of different kinds of determinisms but also on account of the chosen epistemology, which could be non-deterministic as against deterministic. Therefore, the class-focused Marxian theory differs from both Keynesian and neoclassical theory not only by virtue of its uniquely chosen entry point of class but also because of its adopted overdeterminist epistemology as against the deterministic ones that the other two follow. Table 4.1 presents a primary snapshot view of the definition of and differences amongst the three economic theories regarding their diversely produced understandings of economy and capitalism. With the differences in place, we now proceed to briefly clarify the connection of these yardsticks to the three theoretical formulations. What emerges is the following: construction of the economy and its social relations, identifying the object of analysis and deriving the outcomes are all specific to their respective focuses and modes of analysis.
7 Class-focused Marxian economics: a non-deterministic theory

Following Althusser (1969) and Resnick and Wolff (1987, 2006, 2012), we deploy the epistemology of overdetermination and the entry point of class to define a Marxian theory that is class-focused but not class-specific. By the overdeterminist epistemology, none of these class processes can occur alone in any concrete situation; they always occur in mutually constituting groups, groups that include non-class processes, including economic processes. As and when these non-class processes (say, caste, gender, income, property, ecological processes etc.) constitute and are constituted by class processes, they form the analytical terrain of Marxian analysis. Non-class processes matter as much as class processes in explaining class-focused relationships and practices. Moreover, as non-class processes change, so do class processes and vice versa, implying that class cannot be presumed as a pregiven essence. Rather, class process, like every other process, is simultaneously cause and effect and hence subject to change just as it subjects other processes to change. Resultantly, the hallmark of determinist epistemology – reductionism – is banished, and so are forms of economic/class reductionism. At
best, we can say that the entry point of class is said to have a discursive privilege but no epistemological or ontological privilege, i.e. no privilege in conceptualizing, representing and explaining class-related relationship in any concrete scenario. What is class process? Following the reading of the three volumes of Marx’s Capital: Critique of Political Economy, the entry point of Marxian theory has since come to be related to his most important discovery – surplus labour. During the labour process, performed labour time (working with the means of production) is divided into necessary labour and surplus labour. Whereas the equivalent of necessary labour is remunerated to workers (sometimes called direct producers), the surplus labour remains unremunerated, which Marx takes as his entry point. Surplus labour has four processes: performance, appropriation, distribution and receipt. Following Marx, these four processes are designated as class (Resnick and Wolff [1987, 2012]). Therefore, in this Marxian theory and unlike most other Marxian approaches (and of course the non-Marxian approaches), class is a process of surplus labour and does not refer to a group of people predicated on power, property or income. The diverse ways in which class process appears furthermore map out into slave, feudal, independent, capitalist, communist and communitic forms (Chakrabarti and Cullenberg [2003]). Class struggle is fundamentally a struggle over the class process of surplus labour between contingently situated/formed groups of people/individuals and not primarily that between homogenous groups of predefined, structurally fixed actors, as is often represented in other more conventional class analyses. Class identities – collectives that raise, analyse, embody and participate in class questions and class struggle – are not preformed political groups reducible to structural locations or human nature but are contingently and contextually constituted, actually and potentially. While surplus labour relates immediately to the domain of production (its labour process), it moves beyond production because surplus labour is in turn appropriated, distributed and received. Notably, depending upon whether wealth/use values are in commodity or non-commodity form, surplus labour can take the forms of surplus value or surplus product. Marxian theory has by now developed a complex and rich analysis that would use the entry point of class process to differentiate economic systems across time and space and in particular unpack the economic system of capitalism in a unique way. One may wonder what led Marx to choose class as his entry point concept. The answer in a nutshell is that he wanted to draw attention to a defining characteristic of capitalism – exploitation. This, he argued, was ignored in classical political economy, and the same continues to be true of modern theories of capitalism: Keynesian and neoclassical. By exploitation, Marx meant a scenario whereby the direct producers or workers are excluded/alienated from the process of appropriation of the surplus that they created; instead, the surplus is appropriated by a minority of non-performers (masters under the slave system, lords under the feudal system and capitalists under the capitalist system). Specifically, with respect to capitalism, Marx challenges the mainstream claim that deployed resources bought from the market get back in return what they have contributed in production.
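The division of performed labour time can be written schematically. A minimal sketch in conventional Marxian notation (the symbols are our assumption for exposition, not the chapter’s):

\[
L = L_{n} + L_{s},
\qquad
e = \frac{L_{s}}{L_{n}} = \frac{s}{v},
\]

where performed labour time \(L\) divides into necessary labour \(L_{n}\), whose equivalent returns to the direct producers as wages (the value \(v\) of labour power), and surplus labour \(L_{s}\), which is unremunerated and appears as surplus value \(s\) when wealth takes the commodity form. Exploitation obtains whenever the performers of \(L_{s}\) are excluded from its appropriation, and the ratio \(e\) gauges its intensity.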
Particularly, he shows that workers do not get back what they have contributed in production; they get back only a portion of created
value as wages. The rest is appropriated by non-performers and comprises what Engels called the discretionary fund of a society. Exploitation is wrong for what it is – an instance of theft or robbery, as Marx contended. This class injustice is associated with a class division at the point of production – between those who are direct producers of surplus and those who are appropriators of produced surplus. Class inequality is thus rooted in economic forms of societies dominated by an exploitative production structure. By the criterion of exploitation, despite considerable advancement in technology, division of labour and wealth creation, capitalism is similar to the feudal and slave organizations of surplus labour. Non-exploitation, in contrast, is a scenario where this schism between producers and appropriators of surplus is ended; no exclusion happens as the workers become both the appropriators and distributors of surplus. The end of class division implies the end of inequality between the two sets of performers and appropriators of surplus (to be careful, this is not an argument for equality of income, which Marx never made). At any point in time in a society, different class organizations of surplus – exploitative and non-exploitative – may coexist even though it is usually one form that dominates or tries to dominate. Marx was particularly interested in social formations in which the exploitative organization of surplus was dominant, which is not surprising since capitalism was becoming one of the dominant historical forms in Marx’s time and certainly it is the dominant one in our time. It is hardly surprising that Marx was interested in history as he traced the movement from slavery to feudalism to capitalism as indicative of a transition in the forms of exploitation, their organizations of surplus performance and appropriation and the systems to which they were wedded. The systems as they appear in historical periods, amongst other things, are different variations of an embodied systemic injustice – called exploitation – and all the problems associated with it. By showcasing the rise and demise of different historical systems, Marx wanted to deliver an important lesson: no system, including capitalism, is immortal. Coming back to the specific historical case of capitalism, Marxian theory shows that exploitation takes place in a scenario where means of production, labour power and final produce are all in commodity/value form. The exploiter/employer (e.g. an individual in sole proprietorship firms, partners in partnership firms or board of directors and big shareholders in corporations) is defined as the capitalist/employer; the worker/employee, as the exploited. Under the capitalist organization of surplus, the class division in production is associated with a sharp division in income distribution since those who come to appropriate the surplus tend to distribute most of it to themselves and their immediate supporting cohorts or condition providers. The class-income division tends to generate a wealth and social division, too, since the appropriators and their cohorts, with the discretionary income and wealth via the surplus at their disposal, arrogate the capacity to buy assets/property, power and security, education, health and institutions (including media); they can wield influence, as they do, to produce and disseminate information and opinion for fortifying the acceptance of an exploitative system as a natural outcome of human rationality and human nature.
In this way, the dominant capitalist organization of exploitation and the organization of society tend to constitute one another, thereby creating in
the process a capitalist system. The organization of capitalist exploitation thus tends to reshape the world it touches; the changed world in turn affects the organization of capitalist exploitation, sometimes deepening its hold and at other times bringing it into crisis through the system’s contradictory effects. Marx and Marxian theory further relate the capitalist system as an exploitative organization of surplus appropriation to alienation (from nature, means of production, produced use values and surplus), inequality (income and wealth, structural and social), violence (such as that through primitive accumulation), war/occupation (colonialism/imperialism), instabilities (business cycles) and breakdowns (crisis). Once this picture of capitalism is granted, it becomes clear why Marx was a critic of capitalism. The change he sought was not to transit from one dominant organization of exploitation to another form of exploitation (which is what he argued has happened so far) but to transcend the injustice of exploitation per se and the various irrationalities it produces in economy and society. Hence, he chose the class process of surplus labour as his entry point not only to produce a critique of capitalism but also to pose the possibility of post-capitalist futurity. It is evident, then, that whereas the neoclassical and Keynesian choices of respective entry points were initiated with the intent to prop up and defend capitalism, Marx’s entry point was chosen to critique and transcend capitalism. In short, the solution professed here is to cure the disease (the system itself) and not to address the symptom (suggesting changes within the system). Marxian theory’s examination and findings are thus unique, and this uniqueness is due to class and overdetermination as its organizing foundation, a foundation that no other theories have.
8 Keynes and structuralism: a deterministic theory

We concentrate on Keynes’s opening chapter in the General Theory of Employment, Interest and Money, where he flags his objective and contribution by way of a fundamental break from the (neo)classical theory in the very manner in which knowledge of ‘economy’ is to be produced so as to reflect the reality of economy: I have called this book the General Theory of Employment, Interest and Money, placing the emphasis on the prefix general. The object of such a title is to contrast the character of my arguments and conclusions with those of the classical theory of the subject, upon which I was brought up and which dominates the economic thought, both practical and theoretical, of the governing and academic classes of this generation, as it has for a hundred years past. I shall argue that the postulates of the classical theory are applicable to a special case only and not to the general case, the situation which it assumes being a limiting point of the possible positions of equilibrium. Moreover, the characteristics of the special case assumed by the classical theory happen not to be those of the economic society in which we actually live, with the result that its teaching is misleading and disastrous if we attempt to apply it to the facts of experience. (Keynes [1936, p. 1])
Keynes’s contribution can be isolated as seeking to build a conception of economy that will mirror the actually functioning capitalist economy, different from what (neo)classical economics produced. He believed that the challenge is to produce a theory that would correspond to the logical structure underlying the real capitalist economy. Though Keynes retained a deterministic epistemology, here of a rationalist variety, he nevertheless rejected the form of deterministic epistemology and explanatory system adopted by the (neo)classical variety, which, according to him, did not reflect the facts of experience. The difference that Keynes wanted to highlight was fundamentally foundational – one that would change ‘the (very) character of . . . arguments’. In short, (neo)classical economics epitomizes a false representation of capitalism – or what Keynes called the modern economy. This reading of Keynes is miles away from the ‘reconstructed’ Keynesianism of the Samuelson-Hicks-Solow-Tobin variety, which attempted to marry some of the outcomes pointed to by Keynes to the postulates of neoclassical economics – the famous synthesis. Evidently, this epistemological displacement of Keynes’s basic frame so as to render his approach palatable, i.e. ‘mainstream’, within a universal, all-encompassing ‘economics’ turned out to be parricidal: it ended up destroying the very objective and framing of economy that Keynes was attempting. It was no shock, then, that the attempt to synthesize the (neo)classical with Keynes proved impossible and ended in a grand failure (see Lucas [1976]; Lucas and Sargent [1979]; Davidson [1994] for divergent analyses of this failure). But if we get away from the synthesis version of Keynes and read him in terms of the spirit of the general theory, as the post-Keynesians claim to do, a somewhat radical representation of capitalist economy emerges. In the latter version, Keynes departed from some of the central tenets of classical theory that continue to more or less hold true today in the dominant macroeconomic representation of capitalism. Specifically, he rejects the postulates of (1) neutrality of money as a consequence of Say’s law (nominal variables do not affect real variables) in both the short and the long run; (2) the gross substitution theorem (money and financial assets have positive elasticity of substitution with producible commodities); and (3) the ergodic axiom (uncertainty is to be seen as a stochastic economic process implying that reliable prediction is possible by using a probabilistic analysis of past and current outcomes). Resultantly, Keynes rejects the existence of a spontaneous and harmonious economic and social order, i.e. of a representation of capitalism which is Euclidean in character. A manifestation of this harmonious order derived from the classical postulates in the theoretical humanist tradition rules out any meaningful explanation of unemployment in the real economy, except for job-searching friction (the natural rate of unemployment), which is a voluntary exit from employment, and short-term market friction (the ‘new’ Keynesian efficiency wage hypothesis). No involuntary unemployment in the competitive economic system is ultimately possible except through supply-side frictions. Keynes rejects the Euclidean representation of capitalism as unreal. Instead, he views the reality of modern capitalism as a structure characterized by instability and cycles, i.e. a non-Euclidean space.
His representation is based on certain postulates that he argues capture the real character of a modern capitalist economy. This is unlike the classical postulates whose economic representation and equilibrium
(that is, simultaneously market clearing), he points out, are a corner solution of his theorized economy. The postulates that Keynes counterposes are the following: (1) the modern economy is a money-wage-contract-based system; (2) money matters for the real economy in both the long and the short run; (3) the world is non-ergodic, i.e. the dynamics of economic systems are driven by uncertainty and statistically unpredictable futures; and (4) the money elasticity of production is zero, and the elasticity of substitution between liquid assets, such as money, and producible goods is zero. These alternative postulates, which are structural in nature (capturing the entry point we referred to earlier), imply that monetary liquidity (non-producible assets) is needed to meet contractual obligations regarding purchase of goods or debt in an uncertain environment, thereby providing a rationale for holding money (to meet unforeseen contingencies), which in turn means that demand and supply of goods and services will equate at a point (the effective demand point) that will not be able to clear the labour market. Involuntary unemployment thus exists. The source of this scenario is not the supply side but the demand side, marked by the concept of effective demand. More concisely, Keynes’s concept of effective demand shows that, in a non-ergodic world of a money-wage-contract-based capitalist system, it is the existence of non-producible assets held for liquidity purposes, and for which the products of industry are not gross substitutes, that constitutes the fundamental cause of involuntary unemployment. Unlike conventional economics – which bases its argument on microfoundations (rational actors personifying rational expectations) that are supposed to characterize and drive the system of capitalism and its cycles – here, it is the social structures (mass psychology, exogenous uncertainty, social conventions and institutions) that are taken as the entry point. That then becomes the fundamental causal force – the expressive causality – determining everything else that appears, including the decisions and motivations of actors, unemployment, economic cycles etc. It is then structure that not only determines the subject but also rules the somewhat unstable dynamics of the economic system of capitalism. The structure (macroeconomy/capitalism) thus cannot be reduced to the individual subjects and therefore is in need of a unique theoretical explanation with a different theoretical frame, which was precisely what Keynes was trying to craft. Structuralism, not humanism, is thus the way to capture the real truth of capitalism – its dynamics, problems and solutions. Since that real truth represents a somewhat unstable and disharmonious picture of capitalism that is prone to recurring cycles, one needs a permanent role of the state to not only regulate and control the functioning of an otherwise market-driven economy but also impose policy solutions to preempt and absorb any potential and actual crisis stemming from that instability. Keynes’s objective in presenting a disharmonious picture of capitalism was thus not to overthrow it but to save it from itself.
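Before moving on, postulate (4) above can be stated in elasticity form. The following is a hedged formalization in our own notation, following the post-Keynesian rendering of these axioms rather than Keynes’s exact text:

\[
e_{m} \;=\; \frac{\partial \ln Q_{m}}{\partial \ln P_{m}} \;\approx\; 0,
\qquad
\sigma \;=\; \frac{\partial \ln (M/G)}{\partial \ln (P_{G}/P_{M})} \;\approx\; 0,
\]

where \(e_{m}\) is the elasticity of production of money (a rise in the ‘price’ of money does not call forth employment to produce more of it) and \(\sigma\) is the elasticity of substitution between holdings of money \(M\) and producible goods \(G\). With both near zero, a shift of demand towards liquidity neither employs labour in ‘producing’ money nor spills back into demand for producibles, so effective demand can settle at a point short of full employment.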
9 Neoclassical economics and humanism: a deterministic theory

Neoclassical economics works with the Cartesian totality that resembles the billiard ball metaphor of mechanistic causality, where some things come first and others follow – one ball in motion setting the other balls in motion (Cullenberg [1996]).
In the neoclassical approach, a general equilibrium economy resembles a machine (the totality) coming into existence as an equilibrium outcome of the interactions of pregiven agents (the parts), where the agents (demanders and suppliers) come to be situated in terms of a pregiven preference structure, technology and endowment, all assumed unchanged at the level of subsequent analysis. This given preference structure, technology and endowment becomes the chosen entry point of neoclassical economics, and it also emerges as the essence of economic analysis since the economy and explanations regarding its objects of analysis (prices, chosen quantities, consumption, investment etc.) are ultimately produced as consequences or effects of this pregiven entry point (Amariglio et al. [1990]; Resnick and Wolff [2012]). Any model regarding the economy, in any spatial-temporal plane, functions with the presumed entry point of preference, endowment and technology; in the course of that explanation, the nature of the entry point does not change. This is dramatically unlike class-focused Marxist theory, which integrates the change in class process in the course of its analysis. The difference here pertains to adopting and working with diverse epistemologies. In the neoclassical school of thought, the concrete economy is supposed to somewhat mirror the model economy produced in terms of its chosen entry point. As in the case of Keynesian economics, the knowledge of economy that its theory produces is presented as, by sleight of hand, the economy itself, i.e. the claim of a theoretical representation is assumed to mirror reality. This claimed isomorphism between its economic theory and ontological reality remains unexplained. Given this claim of absolute truth of economy, the entry point that is also the causal factor is thus an essence, and the mode of explanation is reductionist in nature. Further phenomena (for example, demand and supply) are shown to be a result of these pregiven entry points (preference structure, endowment and technology), and then again those competing plans of the buyers and producers are coordinated through the market to determine the prices of the commodities and so on. Ultimately, by virtue of Walras’s law, it is claimed that the market for every commodity clears, an economy we refer to as the general equilibrium (Debreu [1959]; Arrow and Hahn [1971]; Hildenbrand and Kirman [1976]). From the entry point onwards, each step of analysis is marked by a deterministic structure of causality that essentializes the causal factor and reduces the existence of subsequent elements to that factor. Through this reductionist method, the structure (the economy) is represented as a result of the interaction of autonomous and competing individual choices-decisions functioning through the institution of the market, which is what makes this approach humanist. This basic economic structure of neoclassical economic theory is, of course, a utopia, but it is a necessary utopia without which the standard/yardstick of assessing and judging the mechanism and quality of choices and outcomes cannot be explicated. Depending upon their chosen reference or departure point (which basically boils down to which assumptions of the general equilibrium economy to retain or drop), different trends within neoclassical economics emerge.
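The market-clearing claim just invoked can be stated compactly. A brief sketch in standard general-equilibrium notation (the symbols are our assumption, not the chapter’s):

\[
p \cdot z(p) \;=\; \sum_{i=1}^{n} p_i \, z_i(p) \;=\; 0
\quad \text{for every price vector } p,
\]

where \(z_i(p)\) is the aggregate excess demand for commodity \(i\), itself derived from the pregiven preferences, endowments and technology. Because the value of excess demands sums identically to zero, if \(n-1\) markets clear and the remaining price is positive, the \(n\)-th market must clear as well; a general equilibrium is a price vector \(p^{*}\) at which \(z(p^{*}) = 0\). Note how the derivation runs in one direction only, from the entry point to prices, which is the deterministic structure of causality described in the text.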
However, these different trends neither depart from nor question the basic foundation comprising the mentioned entry point and the deterministic epistemology that produces and drives the reductionist logic of the general equilibrium economy; the foundation remains unquestioned no matter the variations thereafter that are supposed to capture the ‘real’ economy. This is what Resnick and Wolff (2012) and Madra (2016) have referred to as late neoclassical economics, which takes the general equilibrium epistemology and entry point as the foundation for subsequent analysis. In this regard, we do not describe these as a case of pluralism in economics but rather as pluralism within neoclassical economics. We have already provided one example by way of Becker’s theoretical humanist approach. Let us list a few more trends.5 The most faithful to general equilibrium theory of all macro theories is the real business cycle approach. It rejects any possibility of market failure, of the demand side as a source of permanent shocks and of involuntary unemployment. Assigning numbers to certain variables in the general equilibrium economy, which they call the basic economy, this quantitative macroeconomics approach produces a simulated economy mimicking the real economy (such as that of the United States or India) construed as functioning across time. The simulated real economy (the presumed ontological field) is constructed as the image of the basic economy (the theoretical construct). Consequently, the simulated real economy owes its origin to the pregiven entry points of preference, endowment and technology and works within the Cartesian form of deterministic epistemology. Exploring this simulated economy as if it were the real economy, we find that results are predicted and policies are prescribed on their basis. Importantly, it accommodates the possibility of permanent supply-side shocks (technology, labour-leisure choice etc.) as a cause of business cycles (Kydland and Prescott [1982]; Plosser [1989]). The asymmetric information approach challenges the assumption of perfect and complete information, which is needed for the (general equilibrium) economy. By perfect information, we mean the knowledge that agents possess equally about all the possible static or dynamic states of the economy, and by complete information we mean the knowledge that agents possess uniformly about all the possible outcomes of the economy in a dynamic and static setup. In this way, information is assumed to be symmetric under the (general equilibrium) economy. Dropping this assumption of symmetric information, this approach presumes that asymmetric information of all kinds (adverse selection and moral hazard being popular varieties) is embedded in generic economic practice and claims to be capturing the ‘real’ economic scenario under which peculiar kinds of market failure are brought about by the free choices of the interacting actors themselves (Akerlof [1970]; Mankiw et al. [1991]; Ledyard [2008]; Wilson [2008]). This problematization of the (general equilibrium) economy does not amount to the rejection of the neoclassical entry point and the manner in which the analysis is constructed. Rather, the asymmetric information approach, too, presumes and works with the same entry point as the general equilibrium approach and with a similar deterministic structure of causality. The retention of the neoclassical entry point and its structure of causality is again the reason why the ‘new’ Keynesian school (focusing on the short run) cannot be disassociated from the foundation of neoclassical economics (Mankiw et al. [1991]).
It says that market rigidity/failures leading to macroeconomic fluctuations can transpire for several reasons, such as asymmetric information, menu costs etc., in the short run. Their alignment with Keynes on account of showing fluctuations in relation to market rigidity/failures has, however, been criticized by post-Keynesians as misdirected. Problematizing its association with Keynes, Davidson (1994) argues that Keynes’s framing and results are not a matter of the short or long run (also see Nachane [2013]). It is a radical departure from the foundation of (neo)classical economics, as our earlier analysis of Keynes revealed. Based on that alternative foundation, Keynes’s construction of the ‘real economy’ boiled down to a clear rejection of the (neo)classical postulates. He rejected Say’s law, dismissed the principle of ergodicity (hence the entire genre of expectation formation theories that came later in neoclassical economics) and discarded the gross substitution axiom (entailing that any good, including money, is a substitute for any other good); his analysis resulted in the dismissal of the faith in the classical dichotomy and in foregrounding the presence of involuntary unemployment as a structural feature of a lack of effective demand in a real-time capitalist economy. In contrast, the new Keynesians neither dissociate themselves from the (neo)classical postulates nor challenge the classical result about neutrality of money and full employment, which they, wittingly or unwittingly, assume will hold in the long run if market imperfections are removed. They might argue that the long run never comes and that the short run is the permanent state of affairs, but that is at best an assumption/faith not backed up by an explanation of why the long run would fail to arrive if the short-run rigidities were allowed to disappear in their theory. In short, they accept and share the (neo)classical foundation of constructing the idea of economy as a variant of the theoretical humanist approach and then look for and explain temporary deviations from it by tracing them to the optimizing behaviour of agents under conditions of rational expectations. Rejecting the humanist approach with its specific deterministic epistemology, Keynes, on the other hand, challenged the very conception of the (neo)classical economy itself, irrespective of short or long run. Sharing the entry point and epistemology is why these so-called diverse approaches of macroeconomics within neoclassical economics can converse even as they find post-Keynesian economics (not to speak of Marxian economics) strange, redundant and at times false. But then we have shown that this strangeness and redundancy is due only to the failure to recognize the role of the yardsticks of epistemology and entry point in producing and differentiating theories. What these alternative theories explain and how they explain it differ from neoclassical economics. The charge of falseness, as and when made, is only a consequence of neoclassical economics’ chosen deterministic epistemologies, which finally come down to essentializing its entry point as the means of capturing the true nature of the economy. This belief, when embraced, is in effect delusional.
10 Conclusion

In this chapter, we have defended the heterodox approach by producing and upholding a theory of theoretical differences. We conclude that not taking a heterodox
approach will always be a case of a glass half-filled and half-empty; hence, it will by no means serve to generate an adequate understanding of the diverse field of economics and the various issues in it. There is no one way to look at aspects such as capitalism, market, price, value, competition, inequality, unemployment etc. This realization makes it imperative that the discipline of economics be considered as agnostic and shifting. Our educated guess is that our theoretical explication based on epistemology and entry point, which we exemplified in the case of the discipline of economics, would also hold true for many of the other disciplines. What happens if we impose a singular theory to rule the discipline, taking it as the singular truth? Suppose there is a family of five made up of father, mother and three children.6 One of the children is a career success and thinks his success is due to his family; the second is career-wise mediocre and thinks that he could have done better if his family had been reformed; and the final one is a failure who thinks he owes it to the dystopic family. If asked to describe and analyse the family, a partial picture would emerge if only the view of the successful child is considered. It is only when the contesting and clashing views are accounted for that a complete picture of the family appears, for there is not one but many truths about the family. Similarly, if we view the capitalist economy only through the humanist perspective of neoclassical economics, which operates within and celebrates a competitive market economy driven by the interaction of asocial, self-conscious, self-transparent, rational and autonomous individuals, a partial picture will appear. It is only when we complement this perspective with the critical ones of Keynesianism (which defends capitalism – but not that proposed by [neo]classical economics) and Marxism (which, while acknowledging its strength, shows the dysfunctionality of capitalism per se and seeks to overcome it) that a complete picture of capitalism as an economic system emerges. Will the students and researchers of economics (or for that matter any discipline) be better off if they only learn to know and celebrate the virtues of the competitive market-based economic system of capitalism, or will their education and the research field as a whole be more enriched and complete if they complement this understanding with its critique and alternative understandings? Just as the analysis of the family would remain incomplete, the analysis of capitalism, too, will fall hopelessly short if the diverse analyses and positions with respect to it are not accounted for. To use the central precept of neoclassical economics, it is a matter of choice and the freedom to choose amongst many contending options and find out what they have to offer. Does our discipline practice what it preaches? Is it democratic? One wonders.
Notes
1 There are, of course, many more economic theories such as, to name a few, the Neo-Ricardian, Austrian and capability theories. Our choice of the three is sufficient to address our problem.
2 See Bowles and Gintis (2000); Colander (2000); Colander and Rosser (2004) and Davis (2006).
3 The notable exception is the case of investment, which is fundamentally derived from an individual decision-making process. However, its nature is very different from neoclassical
theory. Owing to fundamental uncertainty, investment decisions cannot be probabilistically calculated; they are heavily influenced by the masses’ perception of the future, leading to (irrational) scenarios of optimism and pessimism. Thus, in the end, investment decision making cannot be accounted for without confronting the issue of mass psychology.
4 ‘Economy is the art of making the most of life’ (George Bernard Shaw in Becker [1976, p. 1]).
5 There are many other approaches in the theoretical humanism tradition that claim to have challenged various aspects of neoclassical economics. These include theories of motivational diversity, bounded rationality and behavioural economics. Resnick and Wolff (2012, Ch. 5) argue that the former two are not really able to break away from the neoclassical tradition but are essentially variants of it since they retain the latter’s entry point and epistemological standpoint. Behavioural economics holds the most promise of breaking away from neoclassical economics, even though at present it has essentially been showing that the imposed axiom of the asocial, atomistic, independent and rational individual (which is not itself challenged) may not be functional at the point of making choices; therefore, the explanation of choice making needs to be supplemented by other phenomena such as herd behaviour, money illusion, animal spirits etc. This puts the idea of the individual being autonomous of social aspects in trouble, since the latter must now be seen as constituting the individual. Whether this possibility is taken to its logical conclusion, i.e. whether it ends up challenging the basic edifice of neoclassical theory or not (which at present is not the case), remains to be seen. It is of course worth remembering that behavioural economics is more of a challenge to neoclassical economics from within and is no substitute for other theoretical approaches like Marxian and Keynesian theories.
6 We owe this example to Richard Wolff.
References
Akerlof, G. A. 1970. “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism”. Quarterly Journal of Economics, Vol. 84 (3), 488–500.
Althusser, L. 1969 (1965). For Marx. Trans., Ben Brewster. Allen Lane The Penguin Press: London.
Amariglio, J., S. A. Resnick, and R. D. Wolff. 1990. “Division and Difference in the ‘Discipline’ of Economics”. Critical Inquiry, Vol. 17 (Autumn).
Aristotle. 1995. Politics. Trans., Ernest Barker. Oxford University Press: New York.
Arrow, K. J. and F. Hahn. 1971. General Competitive Analysis. Holden-Day: San Francisco.
Becker, G. 1976. The Economic Approach to Human Behaviour. University of Chicago Press: Chicago and London.
Becker, G. 1993 (1964). Human Capital: A Theoretical and Empirical Analysis with Special Reference to Education, 3rd Edition. University of Chicago Press: Chicago and London.
The Bhagavad Gita. 1992. Trans., Juan Mascaró. Penguin Group: New Delhi.
Bilgrami, A. 2009. “Value, Enchantment, and the Mentality of Democracy: Some Distant Perspectives from Gandhi”. Economic and Political Weekly, Vol. 44 (51).
Bowles, S. and H. Gintis. 2000. “Walrasian Economics in Retrospect”. The Quarterly Journal of Economics, Vol. 115 (4), 1411–1439.
Carter, S. M. and M. Little. 2007. “Justifying Knowledge, Justifying Method, Taking Action: Epistemologies, Methodologies, and Methods”. Qualitative Health Research, Vol. 17.
Chakrabarti, A. and S. Cullenberg. 2003. Transition and Development in India. Routledge: New York.
Chakrabarti, A. and A. Dhar. 2017. “Transition”. In: David M. Brennan, David Kristjanson-Gural, Catherine P. Mulder, and Erik K. Olsen, eds., Routledge Handbook of Marxian Economics. Routledge: London.
Chakrabarti, A., A. Dhar, and B. Dasgupta. 2015. The Indian Economy in Transition: Globalization, Capitalism and Development. Cambridge University Press: New Delhi and London.
Cohen, G. A. 1978. Karl Marx’s Theory of History: A Defense. Oxford University Press: Oxford.
Colander, D. 2000. “The Death of Neoclassical Economics”. Journal of the History of Economic Thought, Vol. 22 (2).
Colander, D. and B. Rosser. 2004. “The Changing Face of Mainstream Economics”. Review of Political Economy, Vol. 16 (4), 485–499.
Cullenberg, S. 1994. The Falling Rate of Profit: Recasting the Marxian Debate. Pluto Press: London.
Cullenberg, S. 1996. “Althusser and the Decentering of the Marxist Totality”. In: A. Callari and D. F. Ruccio, eds., Postmodern Materialism and the Future of Marxist Theory: Essays in the Althusserian Tradition. Wesleyan University Press: Hanover and London.
Davidson, P. 1994. Post Keynesian Macroeconomic Theory. Edward Elgar: Cheltenham.
Davis, J. 2006. “The Turn in Economics: Neoclassical Dominance to Mainstream Pluralism?” Journal of Institutional Economics, Vol. 2 (1), 1–20.
Debreu, G. 1959. Theory of Value: An Axiomatic Analysis of Economic Equilibrium. Yale University Press: New Haven.
Elster, J. 1985. Making Sense of Marx. Cambridge University Press: Cambridge.
Freud, S. 2010 (1955). The Interpretation of Dreams. Trans., James Strachey. Basic Books: New York.
Gibson-Graham, J. K. 1996. The End of Capitalism (as we knew it): A Feminist Critique of Political Economy. Blackwell Publishers Ltd: Oxford and Cambridge.
Harcourt, G. C. and P. Kriesler. 2013. The Oxford Handbook of Post-Keynesian Economics, Volumes 1 and 2. Oxford University Press: Oxford.
Henry, J. 2017. Knowledge Is Power: How Magic, the Government and an Apocalyptic Vision Helped Francis Bacon to Create Modern Science. Icon Books: Cambridge.
Hildenbrand, W. and A. Kirman. 1976. Introduction to Equilibrium Analysis. North-Holland: Amsterdam, The Netherlands.
Kant, I. 1784. What is Enlightenment? http://www.columbia.edu/acis/ets/CCREAD/etscc/kant.html
Kaplan, A. 1964. The Conduct of Inquiry: Methodology for Behavioral Science. Chandler Publishing: San Francisco.
Keynes, J. M. 1936. The General Theory of Employment, Interest and Money. Palgrave Macmillan: London.
Krause, U. 1982. Money and Abstract Labor. Verso: London.
Kydland, F. E. and E. C. Prescott. 1982. “Time to Build and Aggregate Fluctuations”. Econometrica, Vol. 50 (6), 1345–1370.
Ledyard, J. O. 2008. “Market Failure”. In: The New Palgrave Dictionary of Economics, 2nd Edition. Palgrave Macmillan: London.
Levins, R. and R. Lewontin. 1985. The Dialectical Biologist. Harvard University Press: Cambridge, MA.
Lucas, R., Jr. 1976. “Econometric Policy Evaluation: A Critique”. Carnegie-Rochester Conference Series on Public Policy, Vol. 1 (1), 19–46.
Lucas, R., Jr. and T. J. Sargent. 1979. “After Keynesian Macroeconomics”. Quarterly Review, Federal Reserve Bank of Minneapolis, Vol. 3 (2), 1–16.
Madra, Y. M. 2016. Late Neoclassical Economics: The Restoration of Theoretical Humanism in Contemporary Economic Theory. Routledge: London and New York.
Mankiw, N. G. and D. Romer, eds. 1991. New Keynesian Economics, Volumes 1 and 2. The MIT Press: Cambridge.
Marx, K. 1845. The German Ideology. Marxist Internet Archive. www.marxists.org/archive/marx/works/1845/german-ideology/index.htm.
Marx, K. 1973. Grundrisse. Penguin Classics: London.
Nachane, D. 2013. “Global Crisis and the New Consensus Macroeconomics”. Economic and Political Weekly, Vol. 48 (1), 5.
Olsen, E. K. 2009. “Social Ontology and the Origins of Mode of Production Theory”. Rethinking Marxism, Vol. 21 (2), 177–195.
Plosser, C. I. 1989. “Understanding Real Business Cycles”. Journal of Economic Perspectives, Vol. 3 (3), 51–77.
Resnick, S. A. and R. D. Wolff. 1987. Knowledge and Class: A Marxian Critique of Political Economy. University of Chicago Press: Chicago.
Resnick, S. A. and R. D. Wolff. 2006. New Departures in Marxist Theory. Routledge: London and New York.
Resnick, S. A. and R. D. Wolff. 2012. Contending Economic Theories: Neoclassical, Keynesian and Marxian. The MIT Press: Cambridge.
Roemer, J. E. 1986. Analytical Marxism. Cambridge University Press: Cambridge.
Rorty, R. 1979. Philosophy and the Mirror of Nature. Princeton University Press: Princeton.
Wilson, C. 2008. “Adverse Selection”. In: The New Palgrave Dictionary of Economics, 2nd Edition. Palgrave Macmillan: London.
PART III
Methods of conflict analysis and policy evaluation
5 GAME THEORY
Strategy design in conflict situations
Swapnendu Banerjee
1 Introduction

Consider the following regional conflict scenario between India and Pakistan: both countries know that each would do better by spending less on defence and more on economic development. But each also knows that if it does not spend on arms, it can possibly be destroyed by the other if conflict arises. What should they do? Spend more on overall development or spend more on securing themselves? Take one more example: Person A works for Company B. Person A is paid by Company B and is expected to serve the interests of the company. One needs to remember that Person A’s interest and Company B’s interest might not be aligned. Therefore Person A might take an action that benefits him/her at the cost of his/her company. Given that not all actions of all employees are observable, it might be optimal for Person A to serve his own interest and at the same time go scot-free. This ‘vested interest’ on the part of Person A is a strategic decision which is taken ‘rationally’. Scenario 3: Two criminals control two areas in a city where they run an extortion racket. Both know that if they do not enter each other’s area they can peacefully extort money from their respective zones and have a lavish life. But both also know that if somehow one can finish off the other, then he/she can control a larger area, thus leading to more extortion possibilities and more money. What should be the optimal strategy for each of the goons: to restrict himself to his own area and thus avoid possible confrontation, or to try and capture the other’s area? What will be the outcome if the players interact once? What will be the outcome if the players interact repeatedly? The previous examples point to the fact that in reality there are often conflict scenarios, and an agent has to decide on his/her optimal action keeping in mind his/her objective of maximizing his/her own well-being. In the process, ‘rational’ agents and entities behave strategically, maximizing their objective/payoff sometimes at the cost
of others’ well-being/payoff. This is sometimes referred to as behaving rationally and strategically, and one can predict how a ‘rational’ agent will behave and interact under certain situations. ‘Game theory’ (non-cooperative) is a tool to predict and analyse such ‘rational’ strategic behaviour of agents (economic or otherwise) in the presence of possible ‘conflict of interests’.
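The arms-race and extortion scenarios above share the structure of a prisoner’s dilemma, and their logic can be made concrete with a few lines of code. The following is a minimal sketch in Python; the payoff numbers are illustrative assumptions, not taken from this chapter:

```python
# Illustrative 2x2 arms-race game; the payoff numbers are assumptions.
# Rows: country 1's action; columns: country 2's action.
# Each cell holds (payoff to country 1, payoff to country 2).
ACTIONS = ["develop", "arm"]
PAYOFFS = {
    ("develop", "develop"): (3, 3),  # both divert spending to development
    ("develop", "arm"):     (0, 4),  # the unarmed side is vulnerable
    ("arm",     "develop"): (4, 0),
    ("arm",     "arm"):     (1, 1),  # costly armed standoff
}

def best_responses(player, rival_action):
    """Return the set of actions maximizing `player`'s payoff
    against a fixed action of the rival (player is 0 or 1)."""
    def payoff(action):
        profile = (action, rival_action) if player == 0 else (rival_action, action)
        return PAYOFFS[profile][player]
    best = max(payoff(a) for a in ACTIONS)
    return {a for a in ACTIONS if payoff(a) == best}

# A profile is a pure-strategy Nash equilibrium when each action is a
# best response to the other player's action.
equilibria = [
    (a1, a2)
    for a1 in ACTIONS
    for a2 in ACTIONS
    if a1 in best_responses(0, a2) and a2 in best_responses(1, a1)
]
print(equilibria)  # -> [('arm', 'arm')]
```

With these assumed numbers, each side’s best response is to arm whatever the rival does, so the unique pure-strategy Nash equilibrium is (arm, arm) even though (develop, develop) gives both a higher payoff; this is exactly the tension in the India-Pakistan and extortion examples.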
1.1 Brief history of game theory

In 1838 the French economist Cournot constructed a model with two firms that choose quantities simultaneously. There was ‘strategic interaction’ in that model (which is at the core of any game theoretic model), but unfortunately Cournot didn’t realize it in his lifetime. Later, Bertrand in 1883 constructed a similar model with two price-setting firms which had similar strategic interaction dimensions. Like Cournot, he also did not realize that his model had the seeds of a mathematical tool that would become famous and popular in a hundred years or so. In the late 1930s, the Hungarian-American mathematician John von Neumann became interested in mathematical applications in economic theory, and this brief tryst with economic theory convinced him that it was in need of a new sophisticated mathematical tool. The fruit of this venture was immediately realized when he, along with Oskar Morgenstern, wrote their now classic book Theory of Games and Economic Behavior in 1944. ‘Game theory’ (a subfield of mathematics) was formally born. In that book they developed the concept of ‘two-person zero-sum’ games and other ‘cooperative game’ theoretic concepts. To give a brief introduction, two-person zero-sum games are games where one person’s gain is exactly equal to the other person’s loss. That is, the sum of the net payoffs of the two players is equal to zero. But soon economists found out that the phenomenon of ‘one person’s gain is the other person’s loss’ was too restrictive in many applications. Later John Nash1 took the next giant step and gave a generalized solution concept for a broader class of games which need not be zero-sum. Thus ‘Nash equilibrium’ revolutionized what we now know as modern non-cooperative game theory and laid the foundation for further development in the field. Nash focused only on simultaneous move games with complete information. Interestingly, the optimal outcomes in both the Cournot and Bertrand models are in fact Nash equilibria, and the Cournot equilibrium is also sometimes referred to as the ‘Cournot-Nash’ equilibrium. Later Reinhard Selten extended and refined the Nash equilibrium concept for sequential (dynamic) games of complete information and proposed what is now famously known as the ‘subgame perfect Nash equilibrium’ (SPNE). Subsequently the concept of Nash equilibria was extended to games with incomplete information by Harsanyi, which led to the concepts of Bayesian Nash equilibria and perfect Bayesian equilibria. Nash, Selten and Harsanyi jointly received the Nobel Prize in Economics in 1994 for their contribution to game theory. Game theory (non-cooperative) is sometimes known as interactive decision theory. It deals with situations where people with different (mostly competing) goals try to take into account others’ actions in deciding on the optimal course of
action. Game theory has wide applications in economics, computer science, biology, political science, sociology and other social sciences.

In terms of information, games are broadly of two types. (a) Complete information games: in this class of games all players know everything, and all players know that all players know everything.2 This is known as common knowledge. Put simply, in complete information games all players have common knowledge about each other (i.e. about the structure and the possible payoffs of the game). Complete information games, although simple, are elegant enough to capture a multitude of economic and other strategic interactions realistically. (b) Incomplete information games: in this class of games players might not know everything about each other, i.e. the common knowledge assumption is relaxed. Incomplete information games, although more realistic, come at the cost of increased complexity, which is beyond the scope of this chapter.

In this chapter we will discuss different complete information games with different types of examples. We will also discuss how these games can be applied to fields other than economics. But before proceeding further we need to state some basic assumptions that are made in standard game theory. First, we assume all players to be rational. Rationality means that every player knows what is best for him/her and can 'do' the needful to achieve it. One can have models where players are not fully rational or are 'boundedly rational', but we will abstract from those issues in this chapter. Second, it is assumed that this 'rationality' is 'common knowledge'. That is, everybody (all players) knows that everybody is rational, and everybody knows that everybody knows that everybody is rational. Given these two basic assumptions we turn to static games (simultaneous move games) of complete information.
2 Static games of complete information

Before going into static games we distinguish between a 'pure strategy' and a 'mixed strategy'. A 'pure strategy' is one where a player plays a strategy (takes an 'action' in static games) with probability 1, whereas a mixed strategy is one where a player 'randomizes over pure strategies'. As an example of a mixed strategy, consider the following from the game of cricket: Anil Kumble bowling 'a standard leg spinner with probability 1/3, a flipper with probability 1/3 and a googly with probability 1/3'. This kind of strategy makes Kumble more potent as a bowler, and the batsmen are always kept guessing about what is going to come next. A pure strategy, by contrast, would be bowling a leg spinner always (with probability 1), or perhaps a googly always (with probability 1). Such a strategy, if employed, would have made Kumble predictable and ineffective. Technically put, a mixed strategy is a probability distribution over pure strategies. A game might have both types of equilibria, pure strategy equilibria
and/or mixed strategy equilibria, but to keep things simple, tractable and less technical we will henceforth focus on pure strategies only. In static games all players move simultaneously, and all players have common knowledge about each other's strategy sets and payoff functions. One famous example of a static game of complete information is the 'battle of the sexes'.
2.1 The battle of sexes game

This is a story about a couple planning an evening out. The couple has two choices: they can either go to watch a 'football' match or go to watch a 'ballet'. They decide simultaneously, implying that the husband is not sure what the wife has decided, and the wife is likewise not sure about the husband's decision. The preferences of the husband and the wife are as follows: the husband prefers football over ballet whereas the wife prefers ballet over football, but they hate to end up at different places. We formalize the game as follows:
                            Wife (Player 2)
                            Football     Ballet
Husband      Football       2, 1         0, 0
(Player 1)   Ballet         0, 0         1, 2

FIGURE 5.1  Payoffs in battle of sexes game
Note that when both players go to football the husband gets 2 and the wife gets 1. If both end up at the ballet, the wife gets 2 and the husband gets 1. But if they end up at different places, e.g. the husband at football and the wife at ballet, both get 0, since they hate to end up at different places even if each is at his or her preferred venue. Here the payoffs are simply numerical representations of the level of satisfaction or utility that the players get. This way of depicting a game is known as the normal form. Simply put, the normal form is very similar to a matrix depiction and is useful for simultaneous move games.

To find the Nash equilibria of the game we need to explain the concept of Nash equilibrium. At a Nash equilibrium each player's action is optimal given the other's action; that is, no player should have a 'unilateral' incentive to deviate. The game has two Nash equilibria: one is (football, football) and the other is (ballet, ballet). Note that when the husband chooses football it is optimal for the wife to
choose football. The wife will not unilaterally deviate to ballet since that will fetch her 0, whereas she gets at least 1 by choosing football. The same argument holds for the husband: it is optimal for him to choose football if the wife chooses football. The same logic holds for the (ballet, ballet) equilibrium, where both players get a positive payoff by sticking there compared to either player 'unilaterally' deviating from ballet. Note the assumption that both players are concerned only with their own payoffs. They do not derive any pleasure or loss of utility from increasing or reducing the other's payoff, i.e. the players are 'self-regarding'. Now how do we know which equilibrium will actually occur? The standard Nash equilibrium concept cannot answer this. For static games it can only predict the set of possible equilibria, not exactly which one will be played. Thus the battle of the sexes game is also known as a 'coordination problem' or a 'coordination game'. In the static set-up we do not really know how the couple manage to coordinate on (football, football) or (ballet, ballet). We can only say that the outcome will be either (football, football) or (ballet, ballet), that is all. The game has multiple Nash equilibria.
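The 'no unilateral deviation' test is mechanical enough to automate. Below is a minimal sketch (my own illustration, not from the chapter; the function name and the game representation are assumptions) that enumerates the pure strategy Nash equilibria of any two-player normal form game by brute force:

```python
from itertools import product

def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return all action pairs from which no player gains by deviating unilaterally.
    payoffs[i][j] = (payoff to row player, payoff to column player)."""
    equilibria = []
    for i, j in product(range(len(row_actions)), range(len(col_actions))):
        u1, u2 = payoffs[i][j]
        # Row player deviates across rows, holding the column fixed.
        row_ok = all(payoffs[k][j][0] <= u1 for k in range(len(row_actions)))
        # Column player deviates across columns, holding the row fixed.
        col_ok = all(payoffs[i][k][1] <= u2 for k in range(len(col_actions)))
        if row_ok and col_ok:
            equilibria.append((row_actions[i], col_actions[j]))
    return equilibria

# Battle of the sexes (Figure 5.1): the husband is the row player.
bos = [[(2, 1), (0, 0)],
       [(0, 0), (1, 2)]]
print(pure_nash_equilibria(bos, ["football", "ballet"], ["football", "ballet"]))
# -> [('football', 'football'), ('ballet', 'ballet')]
```

Running it on Figure 5.1 recovers exactly the two equilibria found above.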
Next we go over to another interesting game which, apart from economics, has wide applications in different fields of study in the social sciences, such as sociology, political science, international relations and development studies, amongst others. The game we are referring to is famously known as the 'prisoners' dilemma'.

2.2 Prisoner's dilemma

Two criminals have allegedly committed a crime and are caught by the police. The police officer investigating the matter wants to extract confessions from both criminals/prisoners. If both confess then the crime is unambiguously proved and both criminals get five years in jail. If neither confesses then the original crime is not proved, but both can be booked for a lesser crime and sentenced to two years in jail. The interesting twist comes now. If one prisoner confesses and the other does not, then the confessing prisoner is rewarded for cooperating with the investigation and is given a lighter sentence of one year in jail, whereas the non-cooperating prisoner goes to jail for ten years. It is as if the confessing prisoner turns approver (very similar to becoming a 'rajshakshi') and gets favourable treatment, whereas the other is penalized for non-cooperation. The game is depicted in normal form in Figure 5.2.
                                Prisoner 2
                                Confess      Don't confess
Prisoner 1   Confess            –5, –5       –1, –10
             Don't confess      –10, –1      –2, –2

FIGURE 5.2  Payoffs in prisoners' dilemma game
One can easily note that whatever prisoner 2 does, prisoner 1 will always 'confess', and the same holds for prisoner 2. That is, 'confess' is a strictly dominant strategy whereas 'don't confess' is a strictly dominated strategy. Thus it is optimal for both to confess, the unique Nash equilibrium of this game is (confess, confess), and both go to jail for five years. The striking feature of this game is that both prisoners know that both could have done better by cooperating with each other, playing (don't confess, don't confess) and going to jail for two years. This is common knowledge. But both also know that if one plays 'don't confess' and the other backstabs and deviates to 'confess', then the deviating prisoner gets an even lighter sentence of one year and the other goes to jail for ten years. Both know this, and both end up playing 'confess', acting like 'rational fools'. Trust does not work in this game, and the (confess, confess) outcome continues to hold even if the game is played more than once (for any finite number of periods). Noteworthy is that (confess, confess) is the unique Nash equilibrium of this game, and the (don't confess, don't confess) outcome, although it makes both prisoners better off, is non-Nash and is never played.
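Strict dominance can also be verified mechanically. The short sketch below (again my own illustration; all names are assumptions) checks, for the payoffs of Figure 5.2, whether a player has an action that is strictly better than every alternative against everything the opponent might do:

```python
actions = ["confess", "dont_confess"]
payoff = {("confess", "confess"): (-5, -5),
          ("confess", "dont_confess"): (-1, -10),
          ("dont_confess", "confess"): (-10, -1),
          ("dont_confess", "dont_confess"): (-2, -2)}

def u(player, own, other):
    """Payoff to `player` (0 or 1) when playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return payoff[profile][player]

def strictly_dominant_action(player):
    """Return an action strictly better than all alternatives, if one exists."""
    for a in actions:
        if all(u(player, a, c) > u(player, b, c)
               for b in actions if b != a for c in actions):
            return a
    return None

print(strictly_dominant_action(0), strictly_dominant_action(1))
# -> confess confess: mutual confession is the unique Nash equilibrium.
```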
This game has wide applications in different fields of study. Consider the following examples.

Regional conflict revisited

Revisit the game between India and Pakistan stated at the beginning of this chapter. The conflict game fits the prisoners' dilemma mould, since both countries 'rationally' end up spending more on defence and less on development. At the equilibrium both countries (and their people) suffer, even though both know they could have done better by cutting down spending on the arms race. This is therefore a classic example of the prisoners' dilemma applicable to conflict studies, international relations and/or political science.
Political myopia

Two political parties A and B are competitors in electoral politics. Both parties know that taking hard decisions on development will lead to the betterment of the economy as a whole. But both also know that taking populist measures will appease a sizeable section of voters and might enable them to win future elections; this, although not good for the economy as a whole, is good for the party. Note that a political party (the ruling party) and the government are two different entities and might have conflicting interests. The party runs the government, whose job is the betterment of the people, but at the same time the party has its own interests.
At the optimum it might well be that both parties stick to populist measures while the overall development of the people takes a back seat.
Erosion of ethics/morality

Consider two applicants X and Y competing for a single job opening. Both candidates know that the other can employ unfair means to get the job. If X decides to be honest and Y employs unfair means, then X might lose out on the job. The same argument holds for candidate Y, and both candidates know this. Therefore it might be sub-optimal for one to remain honest given that the other candidate can always get through by deviating from honest practices. Thus it might be optimal for both to employ unfair means, leading to an overall erosion of ethical and moral standards.

One can construct and model similar situations, such as environmental degradation, cultural degradation and the proverbial 'race to the bottom', as examples of the prisoners' dilemma. I leave it to the future researcher to formulate a problem that fits the prisoners' dilemma story. Next we go over to a game which is famous for an altogether different reason.
2.3 The matching pennies game

Two players simultaneously put down a coin each. The possible actions for both players are heads or tails. If the coins match then player 1 gives player 2 Re. 1. If the coins do not match then player 2 gives player 1 Re. 1. The game is described in Figure 5.3:
                        Player 2
                        Heads        Tails
Player 1    Heads       –1, +1       +1, –1
            Tails       +1, –1       –1, +1

FIGURE 5.3  Payoffs in matching pennies game
This game is an example of the 'zero sum' games mentioned at the beginning of this chapter. Note that a player's gain is exactly equal to the other player's loss, and therefore the sum of payoffs is always exactly equal to zero. The game does not have a pure strategy Nash equilibrium: from any cell one can always find a profitable 'unilateral' deviation by at least one player. The upshot is that a
game might not have any pure strategy Nash equilibrium. But every finite game has at least one Nash equilibrium, possibly in mixed strategies (this is Nash's existence theorem). In this game a pure strategy for player 1 is to play heads (or tails) with probability 1. A mixed strategy is to play heads with probability 1/2 and tails with probability 1/2, or more generally to play heads with probability p and tails with probability (1 − p), the probabilities always adding up to 1. This game has a mixed strategy equilibrium where both players play heads and tails with probability 1/2 each, but it has no pure strategy Nash equilibrium, which is one of the reasons why this game is famous.
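The equilibrium mixing probabilities come from an indifference condition: each player randomizes so that the opponent's two pure strategies yield the same expected payoff. A hedged sketch (my own; it assumes an interior mixed equilibrium exists) for any 2×2 game:

```python
def mixed_equilibrium_2x2(payoffs):
    """payoffs[i][j] = (u1, u2). Returns (p, q): p = P(row plays action 0),
    q = P(col plays action 0), assuming an interior mixed equilibrium exists."""
    (a, e), (b, f) = payoffs[0]
    (c, g), (d, h) = payoffs[1]
    # Row player indifferent between rows: q*a + (1-q)*b = q*c + (1-q)*d
    q = (d - b) / (a - b - c + d)
    # Column player indifferent between columns: p*e + (1-p)*g = p*f + (1-p)*h
    p = (h - g) / (e - g - f + h)
    return p, q

# Matching pennies (Figure 5.3).
mp = [[(-1, 1), (1, -1)],
      [(1, -1), (-1, 1)]]
print(mixed_equilibrium_2x2(mp))  # -> (0.5, 0.5): both mix 50-50
```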
2.4 The stag hunt game

The stag hunt describes a conflict between safety and social cooperation. Jean-Jacques Rousseau first described the situation, and the game is also known as the 'assurance game', 'coordination game' or 'trust dilemma'. Two individuals go out on a hunt. Each can individually choose to hunt a stag or a hare, and they decide simultaneously. If an individual hunts a stag, he must have the cooperation of his partner in order to succeed. An individual can catch a hare by himself, but a hare is worth less than a stag. This is taken to be an important analogy for social cooperation. The game is depicted in Figure 5.4. If both hunters choose to hunt the stag they get 6 each. If hunter 1 chooses to hunt a hare while hunter 2 goes for the stag, hunter 1 gets 4 and hunter 2 gets 0. If both choose hare then both get 2 each.
                     Hunter 2
                     Stag        Hare
Hunter 1    Stag     6, 6        0, 4
            Hare     4, 0        2, 2

FIGURE 5.4  Payoffs in stag hunt game
A stag hunt game has two pure strategy Nash equilibria: one is risk dominant and the other is payoff dominant. The strategy pair (stag, stag) is payoff dominant, since payoffs are higher for both players compared to the other pure Nash equilibrium, (hare, hare). On the other hand, (hare, hare) risk dominates (stag, stag): if uncertainty exists about the other player's action, going for the hare provides a higher expected payoff. The more uncertainty players have about the action of the other player, the more likely they are to choose the safe hare.

The stag hunt differs from the prisoners' dilemma in that it has two pure strategy Nash equilibria: one in which both players cooperate and one in which both defect. In the prisoners' dilemma, by contrast, despite the fact that both players cooperating (not confessing) is efficient for both, the only Nash equilibrium is for both players to backstab (confess). Still, there is a substantial relationship between the two games. In biology, many circumstances that have been described as prisoners' dilemmas might also be interpreted as stag hunts, depending on how payoffs are measured, and some human interactions that look like prisoners' dilemmas may in fact be stag hunts. Occasionally, players who defect against cooperators are punished for their defection, and punishment can convert one game into the other: for example, if a punishment of –2 is inflicted on the defecting (non-cooperating) party, the imposition of this punishment turns the prisoners' dilemma of Figure 5.2 into a stag hunt.
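Risk dominance can be made concrete with a small belief calculation. Using the payoffs of Figure 5.4 (the code below is my own worked example), suppose a hunter believes his partner will hunt the stag with probability p; the stag is worth choosing only when p is at least 1/2:

```python
def expected_payoffs(p, stag_stag=6, stag_hare=0, hare_stag=4, hare_hare=2):
    """Expected payoff of each action when the partner hunts stag with probability p."""
    ev_stag = p * stag_stag + (1 - p) * stag_hare   # 6p
    ev_hare = p * hare_stag + (1 - p) * hare_hare   # 4p + 2(1 - p) = 2 + 2p
    return ev_stag, ev_hare

for p in (0.25, 0.50, 0.75):
    ev_stag, ev_hare = expected_payoffs(p)
    print(p, ev_stag, ev_hare, "stag" if ev_stag >= ev_hare else "hare")
# 6p >= 2 + 2p only when p >= 1/2: with enough doubt about the partner,
# the safe hare (the risk dominant action) is the better choice.
```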
Till now we have concentrated on static (simultaneous move) games. We now turn briefly to dynamic (sequential move) games.

3 Dynamic games: a brief introduction

Dynamic games are those where players move sequentially. Consider the following simple dynamic version of the 'battle of sexes' game. Suppose we make the wife the first mover: the wife decides first, lets the husband know her decision, and then the husband makes his decision. Here the husband knows what the wife has decided. Therefore this is a game of 'complete and perfect' information; a perfect information game is one where the 'history' of the game is known to all, which is the case here. The wife knows that she decides first, the husband knows the wife's choice, both know everything, and both know that both know everything, and so on. We depict the game in the 'extensive form' (similar to a flow chart) in Figure 5.5. In the diagram, F stands for football and B stands for ballet. The first payoff is the first mover's (i.e. the wife's) and the second payoff is the husband's.
FIGURE 5.5  Dynamic battle of sexes game. [Extensive form tree: the wife moves first, choosing F or B; the husband then chooses F or B after observing her move. Payoffs (wife, husband): (F, F) → (1, 2); (F, B) → (0, 0); (B, F) → (0, 0); (B, B) → (2, 1).]
The concept of strategies in dynamic games is slightly complicated and we will not go into it here; we will simply talk about the optimal actions chosen by each player in each period. The first mover wife reasons as follows: 'If I choose football in period 1, it is optimal for my husband to choose football in period 2, in which case I get 1. But if I choose ballet in period 1, then my husband, being a rational player, will also choose ballet, in which case I get 2. Therefore it is optimal for me to choose ballet in period 1.' The wife thus optimally chooses ballet in period 1 and the husband optimally responds by choosing ballet in period 2. The unique subgame perfect outcome of this dynamic 'battle of sexes' game with a first mover wife is (ballet, ballet). This way of reasoning, employed here by the wife, is known as 'backward induction'. We now get a unique prediction, which we did not get in the static version of this game. Note that the wife is concerned about her own payoff only and wants to maximize it; she is not concerned about her husband's payoff. Technically the wife is 'self-regarding', and the same assumption holds for the husband. If we wanted to model 'other-regardingness', in the sense that the husband and the wife care about each other's payoffs, we would have to re-model the game accordingly, but that would be a different game. For this chapter we abstract from such issues. Finally, one can easily check that the equilibrium outcome will be (football, football) if we make the husband the first mover.

The majority of the strategic interactions that we see in reality are dynamic in nature. In some games the first mover has a distinct advantage; one can also think of games with a second mover advantage, while some games are neutral. Tic-tac-toe is a dynamic game which is neutral; there is no distinct first or second mover advantage, i.e. if both players play rationally the game should always end in a draw. One player can win if and only if the other player makes a mistake. Chess is another example of a dynamic game, sufficiently complex in terms of strategies and reasonably neutral, although there is a common perception that the player with the white pieces (the first mover) has an advantage.

In reality, various economic, political and sociological situations can be modelled as dynamic games. A political party makes a set of promises before an election; then the voters vote. After coming to power, that political party might or might not keep its promises. Depending on the reputation of the party and their 'beliefs' about it, the voters decide whether to vote for it in the first place. Another example is 'crime and punishment'. While passing an anti-crime law, one has to keep in mind the fallout, that is, whether the law is going to be productive or counter-productive. As an example, suppose person X commits a crime that affects person Y adversely, and suppose there is a death penalty for that crime. After committing the crime, person X might have an
incentive to eliminate person Y in order to remove evidence of the crime committed. This anti-crime law, although stringent, may therefore lead to an escalation of crime which otherwise would not have happened. Law-makers should therefore factor in such future possibilities while passing a truly effective law.
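Before moving on, the backward induction used in the dynamic battle of the sexes is easy to mechanize: solve the last mover's problem at every node, then let the first mover choose among the resulting continuations. A minimal sketch (my own; the tree encoding is an assumption) for the game of Figure 5.5:

```python
# Payoffs are (wife, husband), indexed by (wife's move, husband's move).
payoffs = {("F", "F"): (1, 2), ("F", "B"): (0, 0),
           ("B", "F"): (0, 0), ("B", "B"): (2, 1)}

def backward_induction():
    # Step 1: the husband's best reply at each of his decision nodes.
    husband_reply = {w: max(["F", "B"], key=lambda h: payoffs[(w, h)][1])
                     for w in ["F", "B"]}
    # Step 2: the wife picks the branch whose continuation is best for her.
    wife_move = max(["F", "B"], key=lambda w: payoffs[(w, husband_reply[w])][0])
    return wife_move, husband_reply[wife_move]

print(backward_induction())  # -> ('B', 'B'): the (ballet, ballet) SPNE outcome
```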
3.1 Repeated interaction

Repeated interaction can also be modelled as a dynamic game of complete information where the participants play the same game repeatedly over time. Scenario 3, described at the beginning of the introduction, concerns a game between two goons that might have a different kind of outcome if the goons play the game repeatedly. Let us solve the game intuitively here. If the goons interact for one period,3 then the unique Nash equilibrium of that game is for each to try to finish off the other so as to control a larger area. But in the process both might get hurt (or even die), leading to a 'sub-optimal' outcome. This game is in essence a prisoners' dilemma: both take the equilibrium strategy, which is strictly worse than the 'no-conflict' outcome, and both are strictly worse off playing the equilibrium 'conflict' strategy. Now, if the players play the game repeatedly and for infinite periods (meaning that neither player knows when the game will end), then one can show that the 'no-conflict' outcome is optimal if the players are 'patient'. In other words, if both goons value their respective future payoffs, and staying alive is necessary for that, they will refrain from going at each other and will take the 'no-conflict' route. This idea that players cooperate in the long run (with slight abuse of the term) if they value future payoffs is known as the 'folk theorem',4 and one can support the 'no-conflict' outcome as an equilibrium of the infinitely repeated game.
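How patient is 'patient'? A standard way to check is the grim-trigger calculation: cooperate until the rival defects, then defect forever. My sketch below reuses the payoffs of Figure 5.2 as per-period payoffs; the numbers and the grim-trigger strategy are illustrative assumptions, not the chapter's model. With discount factor delta, cooperation is sustainable iff delta ≥ (T − R)/(T − P), where R is the mutual cooperation payoff, T the temptation payoff and P the mutual defection payoff:

```python
def grim_trigger_threshold(R, T, P):
    """Smallest discount factor sustaining cooperation under grim trigger.
    Requires T > R > P: deviating once yields T + delta*P/(1-delta), while
    cooperating forever yields R/(1-delta)."""
    return (T - R) / (T - P)

# Reading 'don't confess' as cooperation in Figure 5.2: R = -2, T = -1, P = -5.
print(grim_trigger_threshold(R=-2, T=-1, P=-5))  # -> 0.25
# Players with delta >= 0.25 sustain (don't confess, don't confess); the same
# logic keeps the two goons of Scenario 3 peacefully inside their own areas.
```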
The next section presents some applications of static and dynamic games and shows how they can be used to construct models that capture a slice of reality.

4 Some applications

In this section we discuss some game theoretic applications to issues of importance to academics and researchers. We will keep things sufficiently non-technical; the only exception will be the final part of our analysis, on auctions. Readers not interested in technical details can skip that part without losing much of the intuition and general concepts. We start with a problem that affects all of us in our daily lives.
4.1 Bribery and corruption

Consider the following story. An entrepreneur needs two licenses to start a business; to give an example, one can be an environmental permit and the other a trade permit. Each permit can be obtained by paying an official price set by the government, say c1 and c2, but also needs the approval of a corrupt bureaucrat (one for each
permit) who demands a price, p1 and p2 respectively, greater than c1 and c2. Thus (pi − ci) is the amount that bureaucrat i plans to pocket, which is effectively the bribe that he demands to issue that particular permit. Suppose entrepreneurs are of different types, good or bad, and the government sets c1 and c2 in such a way that only good entrepreneurs, with higher returns, can afford to apply for the permits and bad entrepreneurs are screened out. The government also needs to ensure that the corrupt officials do not charge so high a margin (pi − ci) that some good entrepreneurs fail to obtain the permits. The timing of the game can be formalized as follows:

Period 1: The government chooses {c1, c2}. The government's objective is to maximize social welfare, which depends on market conditions and entrepreneurial talent and also includes the welfare of the bureaucrats (who are part of the society).

Period 2: Prospective entrepreneurs procure the licenses from the bureaucrats. The prices of the licenses are simultaneously chosen by the bureaucrats; the optimal prices demanded by the bureaucrats depend on c1 and c2.

The detailed game has been analysed in Mukherjee et al. (2013). Analysing the game sequentially, one can derive the optimal c1 and c2 set by the government in period 1 and the p1 and p2 set by the bureaucrats in period 2. One important result the game throws up is that the government sets {c1, c2} such that the bureaucrats get a positive bribe, i.e. (pi − ci) > 0, implying that the welfare maximizing {c1, c2} are set such that some corruption exists at the optimum. Put differently, the government optimally allows some degree of corruption. This result might shock students of some other fields of study, but students of economics will realize that the bureaucrats' welfare is included in total social welfare, and also that eliminating corruption can entail huge costs in terms of effort and monitoring for the government. Therefore it might be optimal for the government to reduce corruption, but totally eliminating it might be prohibitively costly. In addition, the paper shows that the extent of corruption allowed by the government falls as the country prospers. This supports what we observe in the real world: the incidence of corruption in the developed world is probably lower than in the developing world, where the problem is more acute and endemic. Overall, these results point to the fact that the existence of corruption can be endogenous, i.e. the government can optimally and rationally allow some degree of corruption; it can be a rational choice on the part of the government rather than exogenously given. Similar models of corruption can be constructed where an inspecting official is bribed for a favourable report (or to suppress illegal activities), a company bribes an auditor to suppress financial misdeeds, a bank official is bribed to sanction a huge loan that otherwise should not be sanctioned, and so on. Next we cross over to an issue which will be of interest to students of economics as well as students of gender studies and sociology.
4.2 Surrogate motherhood

India is witnessing an outsourcing of a different kind: growing numbers of infertile couples from the US, Singapore, Britain, Taiwan, the Nordic countries and elsewhere are flocking to India in search of 'wombs for rent'. India enjoys an annual business of US$450 million (2006 data) and offers decent guidelines on assisted reproductive technology (ART). Most importantly, in India commercial surrogacy is not illegal, and the ICMR guidelines recognize the intended parents as the legal custodians of the child. Moreover, decent medical facilities and cost effectiveness make India a preferred destination. But what is surrogate motherhood? According to the Merriam-Webster online dictionary, a surrogate is a substitute. Loosely speaking, therefore, a surrogate mother is one who carries a baby and gives birth for another woman who cannot conceive and/or carry a baby successfully. The husband and wife who want a genetic child are known as the intended parents or the commissioning couple. There are two kinds of surrogacy: (1) straight (natural/traditional) and (2) gestational (or host). In straight surrogacy, the sperm of the intended father is used to inseminate the surrogate. The surrogate is thus both the genetic and the gestational mother; while the intended father is also the genetic father, the intended mother is only a rearing mother. In gestational surrogacy, the egg and sperm of the intended parents are fertilized externally and the embryo is transferred to the uterus of the surrogate. This process is known as 'in vitro fertilization-embryo transfer' (henceforth IVF-ET). Here the surrogate is only a gestational carrier and the intended parents are the biological (or genetic) parents.

Given this preamble, let us now focus on a simple problem of gestational surrogacy. Suppose a set of intended parents (the commissioning couple) hire a surrogate to carry their baby to term. According to the law, the surrogate has to hand over the baby to the commissioning couple in lieu of a handsome compensation after the successful childbirth. The commissioning couple and the surrogate enter into a contract that specifies the dos and don'ts for the surrogate and the obligations of the commissioning couple. The problem is that the surrogate's 'effort', her dos and don'ts, is not always observable; put differently, it is very difficult to monitor the activities of the surrogate. Also, after the successful childbirth there is no guarantee that the commissioning couple will keep their commitment, and given an imperfect legal system, enforcing the contract is difficult and time-consuming for both parties. This leads to ex-ante mistrust, and often we see opportunistic behaviour from both contractual parties. The surrogate might put in sub-optimal effort (care for the unborn foetus), which might put the unborn child at risk. To circumvent these contractual frictions, infertility clinics have flourished, and these intermediary agencies are making hay taking advantage of both parties' insecurities. Game theoretic models, and specifically contract theoretic models (a sub-field of game theory), provide a rich framework to analyse such issues. For a detailed modelling approach see Banerjee and Basu (2009) and Banerjee (2013).

Finally we go over to our last example, which is on auctions. Till now I have not used any mathematics. I will start explaining different auction mechanisms as
non-technically as possible. Later, to illustrate the different sets of equilibria in different types of auctions, I will use some mathematics, but will keep it reasonably simple. Readers not interested in technical details can skip the technical final part without losing any intuition about the overall auction mechanisms.
4.3 Auctions

An auction is a mechanism by which an indivisible object, or possibly multiple objects, is allocated to prospective buyers in lieu of a payment. In what way is an auction different from other forms of market transaction? For simplicity let us concentrate on a single object. Every individual buyer has a valuation for that particular object, which can be regarded as the maximum price that she is willing to pay for it. The seller does not know the valuation of each individual buyer. In a normal transaction, if the seller sets too low a price then she might lose out on some revenue that she could otherwise have earned; on the other hand, if she sets too high a price then the object might go unsold. So what is the optimal price for the seller to set, such that she gets a decent enough price and the good gets sold? An auction provides a mechanism by which the seller can circumvent this problem: the prospective buyers bid for the object and the good goes to the highest bidder. The seller can thus elicit some willingness to pay from each buyer and may end up getting a sufficiently high price while the good gets sold. Auction theory is a huge field in itself and we can hardly do justice to it in this chapter, but for beginners we explain some popular auction mechanisms that are often observed in reality.

Auctions can be sealed bid, where buyers submit their willingness to pay in sealed bids and the best bid wins the auction; these are simultaneous move games. Buyers can also bid sequentially, one after another, and those auctions are similar to sequential move games. The allocation rules can vary, and two common auction forms are the first price auction and the second price auction. In the first price auction the highest bidder gets the good and pays her own bid. In the second price auction the highest bidder gets the good but pays the second highest bid. Another form of auction, which resembles several real-life situations, is the all-pay auction: the highest bidder gets the good, but everybody has to pay her bid even if she does not get the good. That is why it is known as an 'all-pay' auction. To give a real-life example of an all-pay auction, take the case of 'corruption'. Suppose two or three (or any finite number of) parties are competing and bidding for a project order/tender. All parties bribe an influential official to get the order, but only the party that pays the highest bribe gets it. It is unlikely that the corrupt official will return the bribes of the losing competitors/bidders: everybody's bribes (which are conceptually bids) are gone, and only the highest bribe giver (bidder) gets what she wanted. This is conceptually an all-pay auction, and one can conceive of other similar examples in reality.
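The three payment rules differ only in who pays what. The toy function below (my own sketch; the function name and the tie-ignoring simplification are assumptions) makes the difference concrete for two bidders:

```python
def auction_payoffs(bids, values, rule):
    """Net payoffs under 'first', 'second' or 'all_pay' rules for two bidders.
    Ties are ignored here for brevity."""
    winner = 0 if bids[0] > bids[1] else 1
    loser = 1 - winner
    price = bids[winner] if rule in ("first", "all_pay") else bids[loser]
    payoffs = [0.0, 0.0]
    payoffs[winner] = values[winner] - price
    if rule == "all_pay":
        payoffs[loser] = -bids[loser]   # the loser forfeits her bid too
    return payoffs

bids, values = [7, 5], [10, 10]
for rule in ("first", "second", "all_pay"):
    print(rule, auction_payoffs(bids, values, rule))
# first   [3.0, 0.0]    winner pays her own bid
# second  [5.0, 0.0]    winner pays the losing bid
# all_pay [3.0, -5.0]   everyone pays; only the top bidder gets the object
```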
Now let us turn to sequential auctions. A sequential auction in which the bidders gradually raise their bids (ascending bids) is known as an English auction; the IPL auctions for a particular player are roughly of this nature. An auction in which the seller (auctioneer) gradually reduces the price until one party accepts, takes the good and ends the auction is a descending auction, famously known as a Dutch auction.

In reality, auctions are incomplete information games, since the seller and the buyers do not know each other's valuations; in technical parlance, each valuation is its holder's 'private information'. But analytically there are complete information versions of the sealed bid auctions that one can analyse. Since this is a chapter on research methodology, in what follows we analyse the sealed-bid versions of the complete information first price, second price and all-pay auctions, and see how the set of Nash equilibria changes with varying auction rules. Some portion of what follows is borrowed from my joint work with one of my students, Sandeep.5 Without loss of generality we focus on two bidders, but we conjecture that our results will go through even if we consider more than two bidders.
4.3.1 Complete information first price sealed bid auction

Consider two bidders, bidder 1 and bidder 2, bidding for an indivisible object. The bidders have valuations ν1 and ν2 respectively, and the valuations are assumed to be common knowledge. The bidders submit sealed non-negative bids, bi ≥ 0; the highest bidder gets the good and pays her bid. If there is a tie then the object is assigned through a coin toss. The loser does not pay anything. Thus we are analysing the first price sealed bid auction with complete information. The strategy space of each bidder can be written as Si = {bi ≥ 0}, i = 1,2. Without loss of generality we assume bidders to be symmetric, i.e. ν1 = ν2 = ν, and henceforth we focus on symmetric Nash equilibria in complete information auctions. Given these fundamentals, the payoffs of the bidders can be formalized as
πi = ν − bi             if bi > bj
   = (1/2)(ν − bi)      if bi = bj          (1)
   = 0                  if bi < bj

where i = 1,2, j = 1,2, i ≠ j. Given this, one can formalize the best responses of both players. To fix ideas, we first chalk out the best response of bidder 1 given bidder 2's bid. The best response of bidder 1 is a function of all possible bids of bidder 2 and can be put as follows. If b2 < ν, then b1 = b2 + ε, where ε > 0 and ε ≈ 0, since bidding slightly higher than b2 gives bidder 1 the maximum positive payoff. If b2 = ν, then b1 ≤ b2, since it is impossible for bidder 1 to win the auction with a positive payoff; bidder 1 can optimally lose the auction, or bid exactly b2, and in all such cases gets a payoff of zero. Finally, if b2 > ν, then b1 < b2, since in this situation it is best for bidder 1 to lose the auction. Since the bidders are assumed to be symmetric, the best response of bidder 2 is similar. Therefore we can state our first result:

Result 1 In the complete information first price sealed bid auction the unique (weak) Nash equilibrium is b1 = b2 = ν. The bidders bid their true valuations.

Proof We first show that given b2 = ν, the optimal b1 is also ν. To achieve this we make extensive use of the unilateral deviation logic. When b2 = ν, if b1 = ν then there is a tie, and bidder 1 wins the auction with probability 1/2 and gets an expected payoff equal to zero. Given b2 = ν, if b1 > ν then bidder 1 wins the auction but gets a negative payoff. Finally, when b2 = ν, if b1 < ν then bidder 1 loses the auction and again gets zero. Thus bidder 1 has no incentive to deviate from b1 = ν. The same logic applies for bidder 2: given b1 = ν, b2 = ν is (weakly) optimal. Therefore b1 = b2 = ν is a weak Nash equilibrium. For uniqueness we argue as follows: ∀bj < ν the optimum is bi = bj + ε, pushing bids up until {b1 = ν, b2 = ν}, i = 1,2, j = 1,2, i ≠ j. Again, ∀bj > ν, bi = ν is optimal, i = 1,2, j = 1,2, i ≠ j. Thus {b1 = ν, b2 = ν} is the unique (weak) Nash equilibrium. (QED)

Thus, in the first price sealed bid auction with complete information, the unique symmetric equilibrium is where the bidders bid their true valuations.
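Result 1 can be sanity-checked numerically. The sketch below (my own; a discrete bid grid replaces the continuum, and coarse grids can admit spurious near-tie equilibria, so this complements rather than replaces the proof) tests a profile for profitable unilateral deviations:

```python
v, step = 10.0, 0.25
grid = [i * step for i in range(int(2 * v / step) + 1)]   # candidate bids in [0, 2v]

def payoff_first_price(bi, bj):
    if bi > bj:
        return v - bi
    if bi == bj:
        return 0.5 * (v - bi)   # coin-toss tie, as in equation (1)
    return 0.0

def is_weak_ne(b1, b2):
    u1, u2 = payoff_first_price(b1, b2), payoff_first_price(b2, b1)
    no_dev_1 = all(payoff_first_price(b, b2) <= u1 + 1e-12 for b in grid)
    no_dev_2 = all(payoff_first_price(b, b1) <= u2 + 1e-12 for b in grid)
    return no_dev_1 and no_dev_2

print(is_weak_ne(v, v))          # True: truthful bids survive every deviation
print(is_weak_ne(v - 1, v - 1))  # False: either bidder gains by bidding a bit more
```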
Next we go over to the second price auction and characterize the set of Nash equilibria in such games.

4.3.2 Complete information second price sealed bid auction

We now consider the second price auction with complete information, with two bidders, 1 and 2, bidding for an indivisible object. The structure is exactly similar to the first price auction described in section 4.3.1, except that the highest bidder gets the good and pays the second highest bid. If there is a tie then the object is assigned through a coin toss, and the loser does not pay anything. Once again we assume bidders to be symmetric, i.e. ν1 = ν2 = ν, and we search for the set of pure strategy equilibria of this auction. Given these fundamentals, the payoffs of the bidders can be formalized as
πi = ν − bj             if bi > bj
   = (1/2)(ν − bj)      if bi = bj          (3)
   = 0                  if bi < bj
where i = 1,2, j = 1,2, i ≠ j. Given this, one can formalize the best responses of both players. Similar to our earlier approach, we chalk out the best response of bidder 1 given bidder 2's bid as follows. If b2 < ν, then any b1 > b2 is a best response, since any bid higher than b2 gives bidder 1 the maximum positive payoff ν − b2 > 0. If b2 = ν, then any b1 ≥ 0 is a best response, since it is impossible for bidder 1 to win the auction with a positive payoff; bidder 1 can optimally lose the auction or bid any amount, and in all such cases gets a payoff of zero. Finally, if b2 > ν, then it is best for bidder 1 to lose the auction, and therefore b1 < b2. Since the bidders are assumed to be symmetric, the best response of bidder 2 is similar. As we will see, this game is plagued with multiple equilibria. In our next result we characterize the set of weak Nash equilibria.

Result 2 (a) In the complete information second price sealed bid auction, one set of weak Nash equilibria can be characterized as {bi ≥ 0, bj = ν} ∀i = 1,2, j = 1,2, i ≠ j. (b) The other set of weak Nash equilibria can be characterized as {bi < ν, bj > ν} ∀i = 1,2, j = 1,2, i ≠ j.

Proof (a) Given bj = ν, any bi ≥ 0 gives player i a payoff of zero, so player i has no unilateral incentive to shift. Again, given any 0 ≤ bi ≤ ν, player j does not gain from shifting away from bj = ν, because she wins and pays bi in any case. If bi > ν, player j again cannot gain, since by bidding bj = ν she gets the best possible payoff, which is zero; otherwise she can end up with a negative payoff by outbidding player i. Therefore {bi ≥ 0, bj = ν} ∀i = 1,2, j = 1,2, i ≠ j are weak Nash equilibria. (b) Given bi < ν, any bj > ν gives payoff ν − bi to player j, so player j has no incentive to deviate. Again, given bj > ν, any bi < ν gives player i a payoff of zero. Thus neither player i nor player j has any incentive to deviate, and {bi < ν, bj > ν} ∀i = 1,2, j = 1,2, i ≠ j is a set of weak Nash equilibria. (QED)

Note that the truthful revelation equilibrium {bi = ν, bj = ν}, i ≠ j, is a Nash equilibrium, similar to the first price auction described earlier. But in addition we have a plethora of other (weak) Nash equilibria which are not there in the first price auction, and thus we get an infinite number of Nash equilibria in this case.
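The multiplicity in Result 2 can again be checked on a bid grid. In the sketch below (mine; same discretization caveat as before), three very different profiles, including the truthful one and an asymmetric one from family (b), all survive unilateral deviations:

```python
v, step = 10.0, 0.25
grid = [i * step for i in range(int(2 * v / step) + 1)]

def payoff_second_price(bi, bj):
    if bi > bj:
        return v - bj           # the winner pays the second highest bid
    if bi == bj:
        return 0.5 * (v - bj)
    return 0.0

def is_weak_ne(b1, b2):
    u1, u2 = payoff_second_price(b1, b2), payoff_second_price(b2, b1)
    return (all(payoff_second_price(b, b2) <= u1 + 1e-12 for b in grid) and
            all(payoff_second_price(b, b1) <= u2 + 1e-12 for b in grid))

for b1, b2 in [(v, v), (0.0, v), (2.0, 14.0)]:
    print((b1, b2), is_weak_ne(b1, b2))
# (10, 10) True; (0, 10) True; (2, 14) True: a continuum of weak equilibria.
```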
4.3.3 Complete information sealed-bid all-pay auction

We now consider the all-pay auction with two bidders. The structure is exactly similar to the first and second price auctions explained previously. Both bidders have valuations ν1 and ν2 respectively, and the valuations are common knowledge. The bidders submit sealed non-negative bids, bi ≥ 0, and the highest bidder gets the good and pays her bid. In an all-pay auction the loser also pays her bid. If there is a tie then the object is assigned through a coin toss, but again the loser has to pay her bid. Thus we are analysing an all-pay auction with complete information. The
strategy space of each bidder is Si = {bi ≥ 0}, i = 1,2. Once again the bidders are symmetric, i.e. ν1 = ν2 = ν, and we search for symmetric Nash equilibria of this game. Given these fundamentals, the payoffs of the bidders can be formalized as
πi = ν − bi                     if bi > bj
   = (1/2)(ν − bi) − (1/2)bi    if bi = bj          (5)
   = −bi                        if bi < bj
where i = 1,2, j = 1,2, i ≠ j. Once again we formalize the best responses of both players, stating the best response of bidder 1 given bidder 2's bid. The best response of bidder 1 is a function of all possible bids of bidder 2 and can be put as follows. If b2 < ν, then b1 = b2 + ε, where ε > 0 and ε ≈ 0, since bidding slightly higher than b2 gives bidder 1 the maximum positive payoff. If b2 = ν, then it is best for bidder 1 to lose the auction by bidding zero and thus get a net payoff of zero: if bidder 1 bids a positive amount and loses, she has to pay her bid, and if she matches bidder 2 by bidding b1 = ν she gets an ex-ante negative payoff. Thus b1 = 0 is optimal. Finally, if b2 > ν, then once again b1 = 0 and the previous intuition applies. Again, since the bidders are symmetric, the best response of bidder 2 is similar. Now we can state our next result:

Result 3 In the complete information all-pay auction there does not exist any pure strategy Nash equilibrium.

Proof Let us focus on profiles with 0 ≤ bi ≤ ν and 0 ≤ bj ≤ ν, since no bidder will bid beyond ν. We show that a profitable unilateral deviation exists at every such {bi, bj}. Start from bi = 0 and bj = 0: given bi = 0, the optimal response is bj = 0 + ε. Put differently, ∀bi < ν the optimal response is bj = bi + ε, ∀i = 1,2, j = 1,2, i ≠ j, so bids are pushed upward. If bi = ν, then the optimal bj = 0, since any other bj gives player j a negative expected payoff; thus bi = bj = ν cannot be a Nash equilibrium. But if bj = 0, then again the optimal response is bi = bj + ε, and the cycle restarts until someone reaches ν. Thus there exists no {bi, bj} with 0 ≤ bi ≤ ν, 0 ≤ bj ≤ ν at which no unilateral deviation exists, and hence there is no pure strategy Nash equilibrium in this game. (QED)

Thus, in the all-pay auction no pure strategy Nash equilibrium exists, and this non-existence turns out to be a striking feature of such an auction.
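Result 3, too, can be confirmed by brute force. The grid search below (my sketch; the grid step is kept below ν/2 so the discretization does not create artificial equilibria) finds no pure profile that survives unilateral deviations:

```python
v, step = 10.0, 0.5
grid = [i * step for i in range(int(v / step) + 1)]   # nobody bids beyond v

def payoff_all_pay(bi, bj):
    if bi > bj:
        return v - bi
    if bi == bj:
        return 0.5 * v - bi    # pay your bid, win the object half the time
    return -bi                 # the loser pays her bid too

def is_pure_ne(b1, b2):
    u1, u2 = payoff_all_pay(b1, b2), payoff_all_pay(b2, b1)
    return (all(payoff_all_pay(b, b2) <= u1 + 1e-12 for b in grid) and
            all(payoff_all_pay(b, b1) <= u2 + 1e-12 for b in grid))

print(any(is_pure_ne(b1, b2) for b1 in grid for b2 in grid))  # -> False
```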
One takeaway from the previous analysis is that in the complete information first price sealed bid auction we get a unique pure strategy Nash equilibrium, where both bidders bid their true valuations; in the complete information second price sealed-bid auction there are infinitely many pure strategy Nash equilibria; and in the complete information sealed-bid all-pay auction there is no pure strategy Nash equilibrium at all. Thus the nature of the Nash equilibria changes drastically with a change in the auction environment, even in the presence of complete information. Although we constructed our model with two bidders, we expect our results to generalize to more than two bidders.

We assumed complete information throughout. As a generalization one can consider auction mechanisms under incomplete information, but this is beyond the scope of this chapter, although one must mention that auctions in reality are incomplete information games. Each bidder might not know the valuations of the other bidders, and therefore one has to bring in probability distributions when analysing auctions under incomplete information. To keep things simple we focused on complete information auctions. Advanced readers can consult the book by Vijay Krishna (2010).
5 Conclusion

In this chapter we discussed the evolution of game theory as a discipline and how simple game theoretic tools provide a rich framework to analyse economic, political, social and other strategic interactions amongst agents in a simple but realistic way. We focused on complete information games, where agents are assumed to know everything about each other; this simple but strong assumption is hardly an impediment in our pursuit of models that capture a slice of reality effectively. We discussed simultaneous move games and dynamic games with examples relevant to economics, development studies, sociology, political science/international relations/conflict studies, and auctions. What we did not discuss is incomplete information games. One can have a similarly rich class of models in the incomplete information framework that provide additional intuition, but keeping in mind the interdisciplinary readership I decided to abstract from incomplete information games. Advanced readers can consult Gibbons (1992) and then Fudenberg and Tirole (1993) for more on game theory and its applications.
Notes
1 For more see Nash (1950a, 1950b, 1951).
2 This argument can be stretched to infinity (see Aumann (1976)).
3 Or, for that matter, any finite number of periods.
4 The term comes from the word 'folklore': economists had known the idea from times immemorial, but no one knows who first formalized it, hence the name. There are several versions of the theorem; to know more, consult any standard book on game theory.
5 Sandeep (2018), unpublished MPhil dissertation, Department of Economics, Jadavpur University.
References
Aumann, R. 1976. "Agreeing to Disagree". The Annals of Statistics, Vol. 4(6), 1236–1239.
Banerjee, S. 2013. "Gestational Surrogacy Contracts: Altruistic or Commercial? A Contract Theoretic Approach". The Manchester School, Vol. 81(3), 438–460.
Banerjee, S. and S. Basu. 2009. "Rent a Womb: Surrogate Selection, Investment Incentives and Contracting". Journal of Economic Behavior and Organization, Vol. 69, 260–273. (Also see SSRN Working Paper #894022 (2006).)
Fudenberg, D. and J. Tirole. 1993. Game Theory. The MIT Press: Cambridge, MA.
Gibbons, R. 1992. Game Theory for Applied Economists. Princeton University Press: Princeton, NJ. (Also available as A Primer in Game Theory, Prentice Hall, 1994.)
Krishna, V. 2010. Auction Theory, 2nd Edition. Academic Press: Burlington, MA.
Mukherjee, V., S. Mitra and S. Banerjee. 2013. "Corruption, Pricing of Public Services and Entrepreneurship in Economies with Leakage". MPRA Working Paper #49049, University Library of Munich, Germany.
Nash, John F., Jr. 1950a. "Equilibrium Points in n-Person Games". Proceedings of the National Academy of Sciences, Vol. 36, 48–49.
Nash, John F., Jr. 1950b. Non-Cooperative Games. PhD thesis, Mathematics Department, Princeton University.
Nash, John F., Jr. 1951. "Non-cooperative Games". Annals of Mathematics, Vol. 54, 286–295.
Sandeep. 2018. Some Auction Mechanisms with Jealousy. Unpublished MPhil dissertation, Department of Economics, Jadavpur University.
6 IMPACT EVALUATION
A simple need and a difficult choice of methodology

Arijita Dutta
1 Introduction

The introduction of development economics as a main theme of study within economic theory opened up the possibility of numerous policy interventions by public, semi-government and private agencies to improve quality-of-life outcomes in all major economies. Development programs and policies are typically designed to change outcomes such as raising incomes, improving learning, or reducing illness. Even in an era of a minimalist approach to public control and the hegemony of the market mechanism, national and local governments, along with multilateral international agencies, have introduced different types of policies, particularly in the social sectors, to improve indicators pertaining to education, learning outcomes, health care utilization, child care practices, marriage age etc. Whether or not these changes are actually achieved is a crucial public policy question, but one that is not very often examined. In times of fiscal austerity and extremely tight public budgets, governments become keen to know whether their investment actually produced the changes that were targeted, and international agencies likewise prefer to know about the effectiveness of the grants they provide to low and middle income countries. Programme managers, however, merely focus on measuring and reporting the inputs and immediate outputs of a program: how much money has been spent and whether it was utilized at the right time, how many cycles were distributed to improve retention of girls in high school, how many mothers delivered their babies in institutions; rather than on assessing whether the specific programs have achieved their intended goals of improving final outcomes. Monitoring of this kind can help only to improve policy design and implementation. Impact evaluation, on the other hand, is a systematic and objective assessment of the results achieved by a program: it seeks to establish that changes in the targets are due only to the specific policies undertaken. Thus monitoring can identify probable policy loopholes in the process, while impact evaluation identifies the acceptability
of the programme amongst potential beneficiaries and measures the exact long term outcome. For example, the Mukhya Mantri Cycle Yojna of Bihar, launched in 2006 by Chief Minister Nitish Kumar, is one of the most discussed policy interventions introduced in Bihar in the current era. According to the government's monitoring department, seventy lakh cycles were distributed amongst girls in the first ten years. But the impact evaluation (IE) of this programme carried out by Muralidharan and Prakash (2016) finds that being in a cohort exposed to the cycle programme increased girls' age-appropriate enrolment in secondary school by 32 per cent and reduced the corresponding gender gap by 40 per cent. They also find that the increases in enrolment mostly took place in villages that were further away from a secondary school, suggesting that the mechanism of impact was the reduction in the time and safety costs of school attendance made possible by the bicycle. This essentially calls for further policy intervention: concentrated efforts are needed for such far-off, remote villages.

Thus, IE is part of the broader agenda of identifying the specific impact attributable solely to a programme, and hence of generating a trend of evidence-based policy making. This global trend is marked by a shift in focus from inputs and their utilization to final outcomes and results, and hence allows evaluating and reshaping the entire paradigm of public policy. The obvious need for impact evaluation is to help policy makers understand whether programs are generating their intended effects; to promote accountability in the allocation of resources across public programs; and to fill gaps in our basic understanding of what works, what does not, and how measured changes in well-being are attributable to a particular project or policy intervention. The last part is particularly crucial in a development setting where many simultaneous programs are introduced and government officials often remain confused about which programme produces which result. Impact evaluation allows culling out the results of a specific program, rather than the concerted effects of all of them together. This helps the authorities to streamline and change each programme by identifying its individual contribution. The benefits of a well-designed impact evaluation are therefore long term and can have substantial spillover effects.

The formal literature on impact evaluation methods and practices is large, with a few very useful overviews (for example, Blundell and Dias [2000]; Duflo et al. [2008]; Ravallion [2008]; Khandker [2010]; Gertler et al. [2016]). However, the majority of these articles and books refer to the specific problems of Latin American and African countries, where targeted programs and conditional cash transfers have been introduced since the 1980s. Thus, there is an urgent need to put the theory into an Indian perspective, with specific examples of policy evaluations for India, in a hands-on fashion for practitioners. This chapter attempts to fill that gap with some specific examples of methodologies and their applications in the Indian context.
2 The prime issue of impact evaluation

The main challenge of an impact evaluation is to determine what would have happened to the beneficiaries if the programme had not existed. That is, if we want to evaluate the impact of a microfinance programme on the per capita
household income of beneficiaries, a beneficiary's outcome in the absence of the intervention would be its counterfactual. The phrase 'in the absence of the intervention' hints at an impossibility: a beneficiary's outcome with the programme is observable, but the same person's outcome without the programme is impossible to observe. At any given moment in time, a unit either participated in the programme or did not. The unit cannot be observed simultaneously in two different states (in other words, with and without the program). This is called the counterfactual problem. Thus, the main problem of IE is that the first best evaluation would be to calculate the difference between a beneficiary's outcome with the programme and the outcome of the same person had he not accepted the program. Though the first part is observable, the second part is not! The challenge of an impact assessment is therefore to create a convincing and reasonable comparison group for beneficiaries in light of this missing data.

There are two simple options for handling the situation. First, one can compare the outcomes of a person treated under the programme with those of a person who is not treated (we normally call them controls). Second, one can compare the outcomes before and after the intervention for the treated person. We discuss the problems associated with each in turn. Consider the oft quoted policy of Janani Suraksha Yojna (JSY), which offers conditional cash transfers to pregnant mothers who deliver their babies within health care institutions. Figure 6.1 depicts the case where participants' and controls' institutional delivery (ID) is treated as the outcome. The share of ID amongst JSY beneficiaries increased from Y0 to Y4, while the corresponding share for those who did not participate in the programme (the control group) increased from Y1 to Y3. If we follow the outlined strategy, the impact would be Y4 – Y3. However, in reality some mothers receive JSY and deliver in institutions while others still continue to deliver at home, so there existed a difference in the nature of outcomes before the
FIGURE 6.1  Impact evaluation: a graphical presentation. [Line graph: share of institutional delivery (ID) over time for participants and controls, before and after the programme.]
programme itself because the treatment and control groups are not their perfect clones.The more educated and socially advanced women accept the incentive, while less educated women living in a remote village do not accept the incentive scheme, even if they would have been better off with the program.Thus they self select into the program, depending upon their own household and personal level characteristics.The actual impact would be Y4 – Y2, while Y2 refers to the point which assumes that the controls would have changed similarly as the treated group (drawing of two parallel lines). However, Y2 is not directly observable, which depicts the position which a control would have reached had he chosen the programme for herself. The second option is to compare before and after situations of beneficiaries with data on their outcomes either with comparable survey or with retrospective data collected only after survey. As shown in Figure 6.2, one then has two points of observations for only the beneficiaries of an intervention: pre-intervention ID (Y0) and post-intervention ID (Y2). Accordingly, the program’s effect might be estimated as (Y2 – Y0). However, it is in no way conclusive about the impact of the programme of JSY alone. Such a simple difference method would not provide us with an accurate assessment because many other factors (outside of the program) might have changed over the period. For example, the probability of ID might have improved because of more education reaching the young girls under Sarva Siksha Abhiyan and they are being aware of the safety and cleanliness of hospitals for their children and for themselves during delivery; or because of improvement of connectivity by roads from the programme of golden quadrilateral. Not controlling for those other factors means that one might not attribute a part of the impact that was achieved outside the program. Thus, before-after comparison cannot really cull out the impact that is only attributable to the programme one is evaluating. Once we understand neither of the two conventional methods of comparison can yield good impact assessment, it becomes clear again that one needs to handle Participants Y2
[Figure 6.2 Before-after comparison. The figure plots participants' outcome against time, rising from Y0 at the baseline study to Y2 after the programme, with an intermediate level Y1 flagged 'Impact?'.]
the missing data problem of the counterfactual efficiently. In that endeavour, one attempts to identify a comparison group that is very similar to the treated group, such that those who received treatment would have had outcomes similar to those in the comparison group in the absence of treatment. Mathematically, equation (1) presents the basic evaluation problem comparing outcomes Y across treated and non-treated individuals i:

Yi = αXi + βDi + εi    (1)
Here, D is a dummy equal to 1 for those who participate in JSY and 0 for those who do not, so those who delivered in hospitals and received JSY grants have D = 1; X is a set of other observed characteristics of the individual (such as the woman's education, her husband's education, location of the household, and the social caste and religion group of the household); and ε is an error term reflecting unobserved characteristics that also affect Y. The problem with estimating equation (1) is that treatment assignment is often not random because of self-selection into the programme. A woman might want to deliver her child in hospital, but she is often not allowed to by her in-laws or husband for fear that a male doctor might examine her. Self-selection could be based on observed characteristics, unobserved factors, or both. Thus, in equation (1), the dummy variable D would depend on Xi, violating the assumptions of OLS regression. The average treatment effect (ATE) of the programme would be defined as:

ATE = E(Yi(1) | Di = 1) − E(Yi(0) | Di = 0)    (2)
Here the ATE corresponds to a situation in which treated and control units are randomly chosen from the population, so that participating and non-participating households have an equal probability of receiving the treatment; this is actually not the case. Self-selection into the programme is determined not randomly but by different attributes of the individual and the household. The problem with equation (2) is that the treated and control groups may not have been the same before the intervention, so the expected difference in ID share between those groups may not be due entirely to the JSY intervention. Adding and subtracting the expected outcome for treated members had they not participated in the programme, E(Yi(0) | Di = 1), one gets:

E(Yi(1) | Di = 1) − E(Yi(0) | Di = 0)
= [E(Yi(1) | Di = 1) − E(Yi(0) | Di = 1)] + [E(Yi(0) | Di = 1) − E(Yi(0) | Di = 0)]
= ATT + B. Here B is the selection bias operating, and unless one knows its extent, the exact difference in outcomes between the treated and the control groups
remains an inaccurate estimate of the programme effect. The first bracketed term is the ATT, or average treatment effect on the treated:

ATT = E(Yi(1) | Di = 1) − E(Yi(0) | Di = 1)

that is, the average effect amongst those who, owing to their observed or unobserved attributes, actually participate in the programme.
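To make the decomposition concrete, the following sketch simulates self-selection on an unobserved trait ('motivation') and recovers the naive difference, the ATT and the bias B. It is a hypothetical illustration, not taken from the chapter or from JSY data; every name and magnitude in it is invented.

```python
# A minimal simulation of the selection-bias decomposition:
# naive treated-control difference = ATT + B.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

motivation = rng.normal(size=n)             # unobserved characteristic
d = motivation + rng.normal(size=n) > 0     # self-selection into the scheme
effect = 0.3                                # true homogeneous treatment effect

y0 = 0.5 * motivation + rng.normal(size=n)  # outcome without the programme
y1 = y0 + effect                            # outcome with the programme
y = np.where(d, y1, y0)                     # only one state is ever observed

naive = y[d].mean() - y[~d].mean()          # E[Y | D=1] - E[Y | D=0]
att = (y1 - y0)[d].mean()                   # 0.3 by construction
bias = y0[d].mean() - y0[~d].mean()         # B: selection on unobservables

print(f"naive = {naive:.3f}")               # well above the true 0.3
print(f"ATT = {att:.3f}, B = {bias:.3f}, ATT + B = {att + bias:.3f}")
```

Because the more motivated units select into treatment, the naive contrast overstates the true effect by exactly the selection bias B.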
3 Different methodologies of impact evaluation and their usages in India and the sub-continent

Though the principle of randomized control trials (RCTs) is not scientifically new, their large-scale use in developing countries is unprecedented. These methods, borrowed from medical science and used for public policy evaluation in developed countries since the 1960s, have been tailored mainly to developing countries' issues and circumstances. Methodologically, randomized evaluations are the best approach to estimating the effect of a programme. Before the programme starts, eligible individuals are randomly assigned to two groups that are statistically identical ex ante. Thus, the treated and control groups are identified randomly, neither by offering the programme according to a specific criterion nor by allowing participants to decide on their own whether to self-select into the programme. Because members of the treatment and control groups do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the programme rather than to other factors. Thus, under a randomized evaluation, the bias term B above actually vanishes. However, in practice researchers can only partially reduce the bias, as it becomes extremely difficult to locate a perfect counterfactual. Statisticians have proposed a two-stage randomization approach reflecting these priorities. In the first stage, a sample of potential participants is selected randomly to ensure the external validity of the experiment. In the second stage, individuals in this sample are randomly assigned to treatment and comparison groups, ensuring internal validity, in that subsequent changes in the measured outcomes are due to the programme rather than to other factors. Banerjee et al. (2007) report the results of randomized evaluations of two programmes run by Pratham that provide supplementary inputs to schools catering to children from poor families in urban India. The first intervention, targeted towards the weakest children, is a remedial education programme in which a young woman from the community (balsakhi) works on basic skills with poorly performing children from standards three and four. These children are taken out of the regular classroom to work with this young woman for two hours per day (the school day is about four hours). The second intervention is addressed to all children but is adapted to each child's current level of achievement: a computer-assisted learning programme in which children are offered two hours of shared computer time per week. The ideal randomized experiment to separate the direct and indirect effects of remedial education would have been
possible if the students too had been assigned to these classes randomly, after the balsakhis were randomly assigned to the schools. However, this design was not feasible even in this setting, since teachers were not prepared to assign children in the abstract, without knowing whether or not they were going to get a balsakhi. The authors started by predicting a child's assignment to these additional teachers as a flexible function of his or her position in the pre-test score distribution and then estimated how the treatment effect varies as a function of the same variables. There were two possible channels: an indirect effect, the impact of being in a balsakhi school without being assigned to the balsakhi, and a direct effect, the impact of working with the balsakhi over and above the effect of being in a balsakhi school. Thus the problem operated at two levels: school and student. The study found that merely being in a balsakhi school has no effect for children who were not themselves sent to the balsakhi; the effect of the programme is concentrated amongst children who actually worked with the balsakhi. The computer-assisted learning programme was also very effective, increasing average math scores significantly. This evaluation carries a strong message for policy, as such interventions are cheap. The balsakhis are not as trained and educated as teachers, yet their classes helped the weaker students perform better, because the students share a common background with the balsakhis but not with the teachers. Moreover, teachers teach to the prescribed curriculum and may not take time to help students who are behind catch up. This study deals effectively with impact evaluation by randomized experiment, adjusting the methodology by adding more instruments. However, given the spectacular popularity of RCTs as the prime tool of impact evaluation across the developing world, a few concerns have emerged slowly but steadily. Since an RCT relies on random assignment of the treatment, the evaluation nearly always has to be designed ex ante. Most policy interventions and programmes initiated in developing countries, including India, are carried out under political compulsion, and almost no randomization is done a priori. The need for evaluation arises after two to three years of running the programme, by which time only ex post end-line information is available, making an RCT almost impossible to run.

Since RCTs are currently fashionable, one may encounter cases of less well-informed managers asking for an impact evaluation of a completed project, requesting an RCT methodology. It has to be explained that this is not possible. In the case of a prospective impact evaluation of health insurance in India, the staff of the health ministry told us very clearly that we could assign the intervention how we liked, but that the Minister was sure to change it. So there was no sense in embarking on an RCT.
(White [2013, p. 9])

The unit of randomization in most cases may not be the same as the unit of analysis, requiring a clustered design. For example, as in the earlier chapter, a treatment may be randomized at the school level, while the intervention takes place at the classroom
level with outcomes measured on individual students. The standard errors must then be adjusted for clustering. In the absence of an experiment, the propensity score matching (PSM) method compares treatment effects across participant and matched non-participant units, with the matching conducted on a range of observed characteristics. PSM thus assumes that selection bias is based only on observed characteristics; it cannot take care of unobserved factors affecting participation in the programme. These observed characteristics are used to calculate the propensity, or probability, of self-selecting into the programme, and treated and control members are matched individually in pairs. The validity of PSM depends on two conditions: (a) conditional independence (meaning that unobserved factors do not affect participation); and (b) the existence of a sizable common support, or overlap, in propensity scores across the participant and non-participant samples. The main advantage of PSM is that it can be used post-treatment: it mimics an experiment in a quasi-experimental setting. The main idea is to locate, from a large group of non-participants, individuals who are observationally similar to participants in terms of characteristics not affected by the programme. Normally, one first estimates the propensity score for each individual in the sample (both those who actually participated and those who did not) by a probit or logit model and then matches individuals. Regression-based techniques on the sample of participants and non-participants, using the propensity score as weights, are also used under PSM. The most oft-quoted example of evaluation using PSM is Dehejia and Wahba (2002), who studied a nationally supported work programme of the US government providing work experience to poor and socially vulnerable individuals. For India, Ravallion and Jalan (1999) used cross-sectional data from a 1993–1994 nationally representative survey of 33,000 rural households from 1,765 villages to estimate the impact of piped water on the incidence of diarrhoea. There is a behavioural component: poor households may be less able to benefit from piped water because they do not store water properly. Another example of PSM is Dutta and Ghosh (2017), who estimated the impact of the Integrated Child Development Services (ICDS) programme in West Bengal. Using propensity score matching on primary data from 390 households in two districts of West Bengal, the study finds that the impact of being included in the programme and receiving supplementary feeding is insignificant for child stunting measures; the programme can break the intractable barriers of child stunting only when the child not only receives the supplementary feeding but his or her caregiver also collects crucial information on nutritional awareness and the growth trajectory of the child. The regular availability of eggs in the feeding diet, too, can reduce protein-related under-nutrition. The study contributes to the existing literature by identifying the divergence between self-selecting into the programme and receiving the services under the programme. It is often found in developing countries that though beneficiaries get
themselves enrolled in a programme, in many cases they do not receive the components of services they are expected to receive. Thus the treatment variable itself differs. Dutta and Ghosh (ibid.) defined seven treatment variables and find that children with access to ICDS who regularly received eggs more than thrice a week, and children with simple access to ICDS whose mothers received information about child nutrition from Anganwadi workers, are significantly better off in terms of anthropometric measures of nutrition. Those with mere access to ICDS were in no way better off than those without access. The third important method of evaluation is difference-in-differences (DID). This method can handle the unobservable differences underlying the participation problem, which is not possible in PSM. DID essentially compares treatment and control groups in terms of outcome changes over time relative to the outcomes observed in a pre-intervention period. That is, given a two-period setting where t = 0 before the programme and t = 1 after programme implementation, Y1T and Y1C are the respective outcomes for programme beneficiaries and non-treated control units after the intervention, and Y0T and Y0C are the respective outcomes before the intervention. The DID method measures the average programme impact as the double difference

DD = E(Y1T − Y1C) − E(Y0T − Y0C)

The DID estimator allows for unobserved heterogeneity (the unobserved difference in mean counterfactual outcomes between treated and control units) that may lead to selection bias, but assumes this unobserved heterogeneity is time-invariant, so that the bias cancels out through differencing twice. Instead of comparing outcome levels between the treatment and comparison groups after the intervention, as in PSM, the difference-in-differences method compares trends between the treatment and comparison groups. The main problem in DID is that there is no way to prove that the treatment and control groups would have moved in tandem in the absence of the programme. The reason has already been discussed in section 2 of this chapter, where I identify the issue of the counterfactual: one cannot observe what would have happened to the treatment group in the absence of the treatment. But one validity check of the equal-trend assumption can be made if one repeats the observation at two time points before the programme is launched. If we find that the trend has remained unchanged, there is a high probability that it would remain unchanged over the next phase, after programme implementation, too. The other check is to compare with a fake treatment group that is not expected to be affected by the programme. In the ideal case the estimated impact should be zero, which would confirm the equal-trend assumption; if we find a significant change in this DID, we are forced to reject the hypothesis of equal trends. This test is called a placebo test. Duflo (2001) is a classic example of using DID to evaluate a policy in the education sector. The paper evaluates the impact of a large school-building programme in Indonesia beginning in 1973. There are two sources of variation in the intensity of the programme for a given individual. First, there is regional variation in the intensity of building new schools. Second, children who were older than twelve years in 1972 did not benefit from the programme; the younger a child was in 1972, the more he or she benefited, because he or she spent more time in the new schools. To solve
the problem, the author stratified the regions into high and low according to the intensity of new schools built, and the children into young and old groups, the first group receiving more exposure to the programme. The paper found that for each school built per 1,000 students, average educational attainment increased by 0.12–0.19 years and average salaries increased by 2.6–5.4 per cent. A unique problem emerged when Dutta and Sen (2018) attempted to evaluate the impact of the Kanyashree programme of the West Bengal government, a conditional cash transfer programme for increasing the average age of marriage and reducing dropout from school amongst adolescent girls. They used girls aged 18–21 years in the survey data as the treatment group, who were exposed to the scheme at some point in time. The control group consists of girls aged 22–25, who were above 18 when the scheme was announced and thus could not avail themselves of its benefits. The main identification problem was that if a girl had continued her education and not got married till age 17 even without the scheme, and then received the programme, she might be more motivated than the average girl and in all probability might have continued her studies and remained unmarried for another year even without the programme. To address this issue, they matched each girl in the treatment group with one in the control group by PSM on the basis of household characteristics. The first difference of their study represents the average difference in outcomes across these matched pairs after the programme. The same procedure was followed for the baseline data, and the difference of these two differences served as the estimate of the impact of the scheme on underage marriage and dropout. Thus a novel combination of PSM and DID was able to provide the right evaluation of the programme.
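A minimal sketch of the double difference on a toy two-period dataset may help fix ideas; the numbers below are invented and are not from the Kanyashree or Indonesia studies.

```python
# DD = (Y1T - Y0T) - (Y1C - Y0C), computed from group-period means.
import pandas as pd

df = pd.DataFrame({
    "treated": [1, 1, 0, 0, 1, 1, 0, 0],
    "post":    [0, 1, 0, 1, 0, 1, 0, 1],
    "y":       [0.40, 0.62, 0.35, 0.45, 0.44, 0.66, 0.31, 0.43],
})

m = df.groupby(["treated", "post"])["y"].mean()
dd = (m.loc[(1, 1)] - m.loc[(1, 0)]) - (m.loc[(0, 1)] - m.loc[(0, 0)])
print(f"DID estimate: {dd:.2f}")  # 0.11
```

The same estimate is the coefficient on the treated × post interaction in a regression of y on treated, post and their product, which is the form normally used when covariates or clustered standard errors are required.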
4 Conclusion

In this chapter, I have put forward the case for impact evaluations as a worthwhile investment for many programmes. Along with monitoring and other forms of evaluation, they enhance our understanding of the final effectiveness of particular policies and identify their loopholes, if any. However, a successful impact evaluation requires a robust methodology, derived from the operational rules of the programme, for locating the right counterfactual. It is always best to introduce a new policy within an experimental framework, collecting baseline characteristics in detail; but in developing countries this most often remains a far cry. For large universal programmes, the treatment group itself differs in many respects from the control groups, and hence bias reduction is the main objective of such an evaluation. Alternative methodologies like PSM, DID and regression discontinuity can be used in quasi-experimental settings, but which one should be used depends upon the exact nature of the programme and the possible spillover of its impact. Even with an experimental design, the quality of the impact analysis ultimately depends on how the analysis is designed. Often, problems of compliance, spillovers and unobserved sample bias hamper the clean identification of programme effects even under randomization. Then one needs to stratify
the experiment into different groups. Non-experimental methods such as propensity score matching, double difference and instrumental variables have their own strengths and weaknesses and hence are potentially subject to bias for various reasons. Thus, this evidence-based research methodology must be very clear in identifying the particular outcome it considers, and one must be conscious of the requirements before applying the right method of evaluation.
References

Banerjee, A. V., S. Cole, E. Duflo, and L. Linden. 2007. "Remedying Education: Evidence from Two Randomized Experiments in India". The Quarterly Journal of Economics, Vol. 122(3), 1235–1264.
Blundell, R. and M. Costa Dias. 2000. "Evaluation Methods for Non-experimental Data". Fiscal Studies, Vol. 21(4), 427–468.
Dehejia, R. H. and S. Wahba. 2002. "Propensity Score-Matching Methods for Nonexperimental Causal Studies". Review of Economics and Statistics, Vol. 84(1), 151–161.
Duflo, E. 2001. "Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment". American Economic Review, Vol. 91(4), 795–813.
Duflo, E., M. Kremer, and J. Robinson. 2008. "How High Are Rates of Return to Fertilizer? Evidence from Field Experiments in Kenya". American Economic Review, Vol. 98(2), 482–488.
Dutta, A. and S. Ghosh. 2017. "Impact of Integrated Child Development Scheme on Child Malnutrition in West Bengal, India". Maternal & Child Nutrition, Vol. 13(4), e12385.
Dutta, A. and A. Sen. 2018. "Kanyashree Prakalpa in West Bengal: Justifications and Impact". Report submitted to IGC, LSE, UK. Grant No. 35321.
Khandker, S. 1988. Fighting Poverty with Microcredit: Experience in Bangladesh. Oxford: Oxford University Press.
Khandker, S., G. B. Koolwal, and H. Samad. 2009. Handbook on Impact Evaluation: Quantitative Methods and Practices. Washington, DC: The World Bank.
Muralidharan, K. and N. Prakash. 2016. "Cycling to School: Increasing Secondary School Enrolment for Girls in India". American Economic Journal: Applied Economics, forthcoming.
Ravallion, M. 2008. Evaluation in the Practice of Development. Washington, DC: The World Bank.
Ravallion, M. and J. Jalan. 1999. Does Piped Water Reduce Diarrhea for Children in Rural India? Washington, DC: The World Bank.
White, H. 2013. "An Introduction to the Use of Randomised Control Trials to Evaluate Development Interventions". Journal of Development Effectiveness, Vol. 5(1), 30–49.
7 CONSTRUCTION OF DIFFERENT TYPES OF INDICES IN SOCIAL SCIENCE RESEARCH
Some numerical examples

Sushil Kr. Haldar
Introduction

An index is a set of items that structures or focuses multiple yet distinctly related aspects of a dimension or domain of behaviours, attitudes, feelings or perceptions. Single or multiple indicators may exist to capture a specific dimension. An indicator is something that points to, measures or otherwise provides a summary overview of a specific concept; a set of indicators that are combined is referred to as an index. Thus, an index is a single score made by combining several other scores, sometimes by straightforward addition but often in more complex ways, in order to measure some given variable. It is possible to use statistical techniques (like factor analysis) to give indices better construct validity (or factor weights). Index construction in social research is not done arbitrarily; rather, it follows certain stages:

(a) Define the concept or conceptualize development;
(b) Identify the dimensions of the concept;
(c) Choose appropriate indicators in each dimension;
(d) Aggregate the indicators and construct the index.
These four stages or processes can be well explained if we consider the problem of measuring social development in general, and human development in particular, across the countries of the world; contrary to development, one can proceed in the same way by conceptualizing human deprivations. Suppose our objective is to measure social or human development across countries at a particular point in time. By the term social development, we generally mean the non-monetary measure of social progress. Therefore, the indicators used in measuring social development are called social
indicators of development. Social indicators may be of different types: some are used to measure the health status of the community, and some the educational and cultural progress of the community. Some indicators may be considered outputs (achievements) and some inputs (processes). The concept of human development belongs to social development, where human beings are both the means and the ends of the development process. The goal of development is viewed as the 'promotion and expansion of valuable capabilities' (Sen [1990]). Human development is a process of enlarging people's choices. In principle, these choices can be infinite and change over time. But at all levels of development, the three essential ones are for people to lead a long and healthy life, to acquire knowledge and to have access to the resources needed to maintain a decent standard of living. If these choices are not available, many other opportunities remain inaccessible. In the second stage, we define the dimensions. Here, we have three dimensions: longevity, educational attainment and access to resources. Longevity captures the health dimension; educational attainment captures the knowledge dimension; and access to resources captures the dimension of standard of living or economic well-being. Once the dimensions are well defined, we have to explore the underlying indicators of each dimension. Practically, there may be multiple indicators capturing a specific dimension of development. For example, the health dimension can be captured by indicators like life expectancy at birth, disability-adjusted life years, quality-adjusted life years, infant mortality rate, child or maternal mortality rate, morbidity incidence or prevalence rate etc. In the same way, the knowledge dimension may be captured by the literacy rate, enrolment ratios at different levels (namely primary, secondary, tertiary etc.) and mean years of schooling. Similarly, the standard of living can be measured by indicators like income, assets, ownership of durable goods etc. Keeping in mind the basic question as outlined, our primary objective is to determine the relative position of countries in respect of social vis-à-vis human development. In the last phase, we estimate a relative score in each dimension of development for the specific country based on the range (namely, maximum − minimum). A relative distance with a value between 0 and 1 is calculated for each dimension of a particular country; after that, a mean of the three dimension scores gives the human development index (HDI).
1 Development indices

In this section we elaborate upon the construction of three development indices: the physical quality of life index (PQLI), the human development index (HDI) and the gender development index (GDI).
1.1 Physical quality of life index (PQLI)

Before the UNDP's human development index (HDI) of 1990, the physical quality of life index (PQLI) was used to measure the quality of life quantitatively. The value of the PQLI is the average of three basic indicators: literacy rate, infant mortality
(IMR) and life expectancy at birth, all equally weighted on a 0 to 100 scale. This index was developed by Morris (1980) and popularized by the Overseas Development Council (ODC). The index of IMR is IIMR = (166 − IMR) × 0.625, whereas the index of life expectancy at birth is ILEB = (life expectancy − 42) × 2.7. With equal weights placed on each of the three components, the PQLI is measured simply as their average:

PQLI = (LR + IIMR + ILEB)/3    (1)
The PQLI has been criticized on various grounds. The weights attached to the IMR and life expectancy indices are arbitrary, and there is considerable overlap between IMR and life expectancy. Moreover, an important dimension like the standard of living, and a corresponding indicator, is missing from the PQLI.
A numerical example of PQLI

The following information is given for a country pertaining to the physical quality of life. Let the literacy rate (LR) = 50, IMR = 75 and life expectancy at birth (LEB) = 52. The index of IMR is IIMR = (166 − 75) × 0.625 = 56.87, and the index of life expectancy is ILEB = (52 − 42) × 2.7 = 27. Therefore, the PQLI of the country is (50 + 56.87 + 27)/3 = 44.62.
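The arithmetic can be checked in a few lines; this sketch simply transcribes equation (1) and the two component indices as stated in the text.

```python
# PQLI = average of literacy rate, IMR index and life expectancy index.
def pqli(literacy_rate: float, imr: float, life_expectancy: float) -> float:
    i_imr = (166 - imr) * 0.625           # index of infant mortality
    i_leb = (life_expectancy - 42) * 2.7  # index of life expectancy at birth
    return (literacy_rate + i_imr + i_leb) / 3

print(round(pqli(50, 75, 52), 2))  # 44.62
```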
1.2 Human development index (HDI)

The methodology of human development measurement has been evolving since 1990; here we consider the updated methodology developed by the UNDP (2010). The 2010 HDI uses a revised formula combining three dimensions: a long and healthy life, captured by life expectancy at birth (LE); knowledge, captured by mean years of schooling and expected years of schooling; and access to resources, or a decent standard of living, captured by gross national income (GNI) per capita (measured in $PPP). Each dimension index is estimated following the relative distance, or range equalization, principle:

Dimension Index = (Actual − Minimum)/(Maximum − Minimum)
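This range-equalization step recurs in every index that follows, so it is worth a one-line helper; a minimal sketch:

```python
def dimension_index(actual: float, minimum: float, maximum: float) -> float:
    """Range-equalised relative distance, lying in [0, 1]."""
    return (actual - minimum) / (maximum - minimum)

print(round(dimension_index(80, 20, 85), 3))  # life expectancy of 80 -> 0.923
```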
In its 2010 Human Development Report, the UNDP began using a new method of calculating the HDI, based on the following three sub-indices:

(a) Life expectancy index (LEI):

LEI = (LE − 20)/(85 − 20)    (2)
LEI is 1 when life expectancy at birth is 85 and 0 when life expectancy at birth is 20; the maximum and minimum life expectancies across countries are assumed to be 85 and 20 respectively.

(b) Education index (EI):

EI = (MYSI + EYSI)/2    (3)

where MYSI is the mean years of schooling index, MYSI = MYS/15, with MYS the mean years of schooling and 15 the projected maximum of this indicator for 2025; and EYSI is the expected years of schooling index, EYSI = EYS/18, with EYS the expected years of schooling and 18 the number of years required to complete a master's degree in most countries of the world.
(c) Income index (II):

II = (ln PCGNI − ln 100)/(ln 75,000 − ln 100)    (4)
where PCGNI = per capita gross national income, measured in $PPP. The income index is 1 when GNI per capita is $75,000 and 0 when GNI per capita is $100; it is assumed that $75,000 is the maximum country income and $100 the minimum. Finally, the HDI is the geometric mean of the three normalized indices:

HDI = (LEI × EI × II)^(1/3)    (5)

Thus, the HDI is a composite index measuring average achievement in three basic dimensions of human development. The index is best seen as a measure of people's ability to live a long and healthy life, to communicate and participate in the life of the community and to have sufficient resources to obtain a decent standard of living. It is a minimal measure. For a country that has achieved a high value of HDI, the question then arises about other dimensions in which people can grow. Therefore, human development goes beyond the essential choices that the HDI captures: the concept of human development is larger than the measure contained in the index.
A numerical example of HDI

We consider hypothetical data for a country pertaining to its human development parameters and estimate the country's HDI. Life expectancy at birth = 80; mean years of schooling = 9; expected years of schooling = 14.5; gross national income per capita (GNIPC), measured in $PPP, = 14,000.
Life expectancy index (LEI) = (80 − 20)/(85 − 20) = 0.923
Mean years of schooling index (MYSI) = 9/15 = 0.60
Expected years of schooling index (EYSI) = 14.5/18 = 0.805
Education index (EI) = (MYSI + EYSI)/2 = (0.60 + 0.805)/2 = 0.702
Income index (II) = (ln 14,000 − ln 100)/(ln 75,000 − ln 100) = (9.546 − 4.605)/(11.225 − 4.605) = 4.941/6.620 = 0.746

Therefore, the HDI of the country = (0.923 × 0.702 × 0.746)^(1/3) = 0.784.
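A short script reproduces this example with the 2010 formulas exactly as stated above; computing with unrounded intermediates gives 0.785, while the text's 0.784 comes from rounding the intermediate indices.

```python
import math

def hdi(le: float, mys: float, eys: float, gni_pc: float) -> float:
    lei = (le - 20) / (85 - 20)                     # life expectancy index
    ei = (mys / 15 + eys / 18) / 2                  # education index
    ii = (math.log(gni_pc) - math.log(100)) / (math.log(75_000) - math.log(100))
    return (lei * ei * ii) ** (1 / 3)               # geometric mean, eq. (5)

print(round(hdi(80, 9, 14.5, 14_000), 3))  # 0.785
```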
1.3 Gender development index (GDI)

Just like the methodology for human development, the methodology for measuring gender development has been evolving since 1990; the UNDP (2010) uses a revised formula for the GDI. Just like the HDI, the GDI considers three dimensions: health, knowledge and access to resources. The gender development index measures gender inequality in achievement in three basic dimensions of human development: health, measured by female and male life expectancy at birth; education, measured by female and male expected years of schooling for children and female and male mean years of schooling for adults aged 25 years and older; and command over economic resources, measured by female and male estimated earned income. The first step in the calculation of the GDI is to index the variables for life expectancy and educational attainment. The maximum value for male life expectancy is fixed at 82.5 years and the minimum value at 17.5 years. There is strong evidence that the maximum potential life expectancy of women is greater than that of men, given similar care, including health care and nutritional opportunities; therefore, for female life expectancy, the maximum value is fixed at 87.5 years and the minimum at 22.5 years. The values for women and men are indexed following the relative distance method:

Dimension Index = (Actual Value − Minimum Value)/(Maximum Value − Minimum Value)
The second step is the calculation of the educational attainment variable, which is a composite index. It includes expected years of schooling (EYS) and mean years of schooling (MYS). The maximum values of EYS and MYS are 18 and 15 respectively; the minimum values of both indicators are zero. The two indices are
estimated separately for males and females. For education, the dimension index is first obtained for each of the two subcomponents, and then the unweighted arithmetic mean of the two resulting indices is taken. The third step is the calculation of the income indices for males and females, which is somewhat more complex and rests on some deliberate assumptions. Female and male shares of earned income are derived from data on the ratio of the average female wage to the average male wage and on the female and male percentage shares of the economically active population aged 15 and above. It is assumed that female and male employment levels are proportional to female and male participation in the labour force (an assumption forced by the lack of data on employment by gender). The two shares are derived from the following identity:

W·L = Wf·Lf + Wm·Lm    (6)

where W = average wage, L = total labour force, and the subscripts f and m denote female and male. The target is to calculate the ratio Wf/Wm. Because wage data disaggregated by sex are generally unavailable, the UNDP has assumed that if the male wage rate is Rs. 1, the female wage rate is Rs. 0.75; symbolically,

Wf/Wm = 75/100 = 0.75
From (6), the female share of earned income (namely the female share of the wage bill) is

Sf = (Wf/Wm)·EAPf / [(Wf/Wm)·EAPf + EAPm]    (7)
where EAPf and EAPm are the economically active populations of females and males respectively. In the same way, the male share of the wage bill, or male share of earned income, is

Sm = 1 − Sf    (8)

The estimated female earned income per capita (GNIPCf) and male earned income per capita (GNIPCm) are obtained from gross national income per capita (GNIPC) as:

GNIPCf = (Sf/Pf)·GNIPC    (9)
GNIPCm = (Sm/Pm)·GNIPC    (10)

where Pf = Nf/N and Pm = Nm/N are the female and male shares of the population respectively. The next step is to construct the dimension index of income for both sexes, given the maximum and minimum levels of income across the countries of the world; in 2016, the maximum and minimum incomes were taken as $75,000 PPP and $100 PPP. Next, the HDIs for females and males are estimated using the following formulae:

HDIf = (LEIf × EIf × IIf)^(1/3)    (11)
HDIm = (LEIm × EIm × IIm)^(1/3)    (12)

where LEI = life expectancy index, EI = education index and II = income index. The last step in calculating the GDI is to take the ratio of the female HDI to the male HDI:

GDI = HDIf / HDIm    (13)
Numerical example towards estimation of GDI

The following goalposts relating to the human development indicators are fixed; we estimate the GDI following the updated 2010 methodology.
TABLE 7.1 Goalposts for the GDI

Indicator | Minimum | Maximum
Expected years of schooling | 0 | 18
Mean years of schooling | 0 | 15
Estimated earned income, PPP$ (in natural log) | 100 | 75,000
Female life expectancy | 22.5 | 87.5
Male life expectancy | 17.5 | 82.5

Source: UNDP (2010).
The female and male values corresponding to the human development indicators are given in Table 7.2.

TABLE 7.2 Gender-specific human development indicators

Indicator | Female | Male
Life expectancy at birth (years) | 73 | 66
Mean years of schooling | 9 | 10
Expected years of schooling | 11 | 12
Wage | 100 | 110
Gross national income per capita (GNIPC), $PPP | 25,000 (overall) |
Share of economically active population | 0.391 | 0.609
Share of population | 0.499 | 0.501

Source: UNDP (2010).
Female life expectancy index (LEIf) = (73 − 22.5)/(87.5 − 22.5) = 50.5/65 = 0.776
Male life expectancy index (LEIm) = (66 − 17.5)/(82.5 − 17.5) = 48.5/65 = 0.746
Female mean years of schooling index (MYSIf) = 9/15 = 0.60
Female expected years of schooling index (EYSIf) = 11/18 = 0.611
Female education index (EIf) = (0.60 + 0.611)/2 = 0.605
Male mean years of schooling index (MYSIm) = 10/15 = 0.666
Male expected years of schooling index (EYSIm) = 12/18 = 0.666
Male education index (EIm) = (0.666 + 0.666)/2 = 0.666

From these data, the female-to-male wage ratio is Wf/Wm = 100/110 = 0.909. The female share of the wage bill (using equation 7) is

Sf = (0.909 × 0.391)/(0.909 × 0.391 + 0.609) = 0.355/0.964 = 0.368

Using equation (9), the estimated female earned income per capita is GNIPCf = (0.368/0.499) × 25,000 = 18,436.87. Similarly, the estimated male earned income per capita is GNIPCm = ((1 − 0.368)/0.501) × 25,000 = 31,536.92.

Estimated female earned income index (IIf) = (ln 18,436.87 − ln 100)/(ln 75,000 − ln 100) = (9.822 − 4.605)/(11.225 − 4.605) = 5.217/6.62 = 0.788
Estimated male earned income index (IIm) = (ln 31,536.92 − ln 100)/(ln 75,000 − ln 100) = (10.358 − 4.605)/(11.225 − 4.605) = 5.753/6.62 = 0.869

Female HDI = (0.776 × 0.605 × 0.788)^(1/3) = 0.717
Male HDI = (0.746 × 0.666 × 0.869)^(1/3) = 0.755

Therefore, the GDI of the country is 0.717/0.755 = 0.949.
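The whole GDI pipeline condenses into a few lines. This sketch uses the Table 7.2 inputs and the stated goalposts; tiny differences from the hand calculation above reflect rounding of intermediates.

```python
import math

def income_index(gni_pc: float) -> float:
    return (math.log(gni_pc) - math.log(100)) / (math.log(75_000) - math.log(100))

# Earned-income shares, equations (7)-(10)
wf_wm = 100 / 110                                  # female-to-male wage ratio
eap_f, eap_m = 0.391, 0.609                        # shares of the active population
s_f = wf_wm * eap_f / (wf_wm * eap_f + eap_m)      # female share of the wage bill
gnipc_f = s_f / 0.499 * 25_000                     # female earned income per capita
gnipc_m = (1 - s_f) / 0.501 * 25_000               # male earned income per capita

# Gender-specific HDIs, equations (11)-(12)
hdi_f = ((73 - 22.5) / 65 * ((9 / 15 + 11 / 18) / 2) * income_index(gnipc_f)) ** (1 / 3)
hdi_m = ((66 - 17.5) / 65 * ((10 / 15 + 12 / 18) / 2) * income_index(gnipc_m)) ** (1 / 3)

print(round(hdi_f / hdi_m, 3))  # GDI ~ 0.95
```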
2 Deprivation indices

2.1 Index formation using a single variable

It is generally argued that an index consists of multiple indicators, but a single variable may also underlie an index number. For example, the incomes of n individuals from different socio-ethnic groups are given in a survey, and one is interested in finding Sen's index or the Foster–Greer–Thorbecke (FGT) index of poverty based on income or consumption. Sen (1976) suggested a measure of poverty that captures its severity, popularly known as the Sen index (SI):

SI = HCR × [R + (1 − R)·GP]    (14)

where HCR = n*/n stands for the headcount ratio, n* being the number of poor people whose incomes are less than x* (namely the threshold or poverty-line income) and n the total number of persons; R is the poverty gap ratio (PGR), R = (x* − x̄P)/x*, where x̄P is the mean income of the poor; and GP is the Gini coefficient of the incomes of the poor, which captures income inequality amongst them. If income inequality amongst the poor is zero, the SI reduces to the poverty gap index (PGI):

PGI = n*(x* − x̄P)/(n·x*) = HCR × R    (15)
But Foster et al. (1984) developed an index that is considered a generalized measure of poverty because it captures the incidence, depth and severity of poverty for various values of a sensitivity parameter (α). The FGT index is defined as:

FGT(x; x*, α) = Σ_{xi < x*} (x* − xi)^α / [n(x*)^α]    (16)

where xi is the income of the i-th poor person, x* is the exogenously given poverty line as defined earlier, and α is the sensitivity parameter defining the FGT index.
If we insert α = 0, we get the 'incidence' of poverty:

FGT(0) = HCR = n*/n    (17)

If α = 1, we obtain the second measure of poverty, namely the poverty gap index (PGI), which indicates the 'depth' of poverty:

FGT(1) = PGI = n*(x* − x̄P)/(n·x*) = HCR × R    (18)
When α = 2, we get the poverty measure that reflects the 'severity' of poverty:

FGT(2) = HCR × [R² + (1 − R)²·CVP²]    (19)

Here CVP is the coefficient of variation of the incomes of the poor. It can be proved that as α tends to infinity, FGT captures only one person, the poorest of the poor (see appendix). Here we have considered two inequality measures: the Gini coefficient and the coefficient of variation (CV). The Gini index falls under the Lorenz class, whereas the CV falls under the entropy class. One can easily calculate the SI and the FGT index of income poverty from the data given in Table 7.3. We hypothetically consider four social groups (namely Scheduled Caste (SC), Scheduled Tribe (ST), Other Backward Caste (OBC) and General); the income (in Rs.) earned per day by the members of each social group is given, assuming an equal number of members in each group. We assume that the threshold or cut-off level of income (x*) determining poverty is set at Rs. 100.00.

TABLE 7.3 Income (in Rs.) earned by members per day of different social groups

SC | ST | OBC | General
150 | 80 | 170 | 200
80 | 90 | 140 | 110
125 | 40 | 180 | 210
200 | 110 | 70 | 90
90 | 140 | 140 | 160
100 | 50 | 60 | 80
80 | 30 | 110 | 120
110 | 110 | 120 | 150
130 | 140 | 150 | 160
70 | 60 | 50 | 70

Source: Hypothetical data generated by author.
The HCR, mean income of the poor (x̄P), poverty gap ratio (R), poverty gap index (PGI), standard deviation of the income of the poor (SDP), coefficient of variation of the income of the poor (CVP), FGT(2), Gini coefficient of the income of the poor (GP) and Sen's index (SI) are tabulated in Table 7.4. To facilitate the numerical calculation, we work through the SC community, following column 1 of Table 7.4.

Amongst the SC community, there are four persons whose income is less than the cut-off or poverty-line income (x* = 100); their incomes are 80, 90, 80 and 70. Therefore, HCR = 4/10 = 0.4: 40% of the SC community is poor. The mean income of the poor is x̄P = (80 + 90 + 80 + 70)/4 = 80, and the poverty gap ratio is R = (100 − 80)/100 = 0.2. The poverty gap index (PGI), or depth of poverty, is FGT(1) = 0.4 × 0.2 = 0.08. The SD of the income of the poor amongst the SC is 8.16, so the CV of the income of the poor is 8.16/80 = 0.102. FGT(2) captures the severity of poverty; for the SC community, FGT(2) = 0.4 × [0.04 + 0.64 × 0.0104] = 0.018. The Gini coefficient can be estimated by various methods; here the following formula is used:

Gini = [1/(2n²x̄)] Σ_{i≠j} |xi − xj|

where n stands for the number of individuals, x̄ is the mean income and xi is the income of the i-th person. Thus, for the poor amongst the SC community,

Gini = [|80 − 90| + |80 − 80| + |80 − 70| + |90 − 80| + |90 − 80| + |90 − 70| + |80 − 80| + |80 − 90| + |80 − 70| + |70 − 80| + |70 − 90| + |70 − 80|] / (2 × 4² × 80) = 120/2560 = 0.046
TABLE 7.4 Poverty amongst different social groups

Measures of poverty | SC | ST | OBC | General
HCR or FGT(0) | 0.4 | 0.6 | 0.3 | 0.3
Mean income of the poor | 80 | 58.33 | 60 | 80
R | 0.2 | 0.42 | 0.4 | 0.2
FGT(1) or PGI | 0.08 | 0.252 | 0.12 | 0.06
SD of income amongst the poor | 8.16 | 23.16 | 10 | 10
CV of income amongst the poor | 0.102 | 0.397 | 0.167 | 0.125
Square of CV | 0.0104 | 0.157 | 0.027 | 0.015
FGT(2) | 0.0186 | 0.137 | 0.051 | 0.015
Gini coefficient amongst the poor | 0.046 | 0.178 | 0.074 | 0.055
SI | 0.094 | 0.314 | 0.133 | 0.073

Source: Author's calculations.
Therefore, Sen's index (SI) of poverty amongst the SC = 0.4 × [0.2 + 0.8 × 0.046] = 0.094.
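For larger datasets it is convenient to compute Table 7.4 programmatically. The sketch below implements equations (14) and (17)–(19) for the SC column of Table 7.3; following the text, the CV uses the sample standard deviation, so the last decimals differ slightly from the rounded hand calculations.

```python
import numpy as np

def poverty_measures(incomes, z=100.0):
    x = np.asarray(incomes, dtype=float)
    poor = x[x < z]
    hcr = poor.size / x.size                      # FGT(0), eq. (17)
    r = (z - poor.mean()) / z                     # poverty gap ratio
    pgi = hcr * r                                 # FGT(1), eq. (18)
    cv = poor.std(ddof=1) / poor.mean()           # CV amongst the poor
    fgt2 = hcr * (r**2 + (1 - r)**2 * cv**2)      # severity, eq. (19)
    diffs = np.abs(poor[:, None] - poor[None, :])
    gini = diffs.sum() / (2 * poor.size**2 * poor.mean())
    sen = hcr * (r + (1 - r) * gini)              # Sen index, eq. (14)
    return hcr, pgi, fgt2, gini, sen

sc = [150, 80, 125, 200, 90, 100, 80, 110, 130, 70]
print([round(v, 3) for v in poverty_measures(sc)])  # [0.4, 0.08, 0.019, 0.047, 0.095]
```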
2.2 Multidimensional poverty and human deprivation: Sen and Anand's human poverty index (HPI)

The goal of development is viewed as the 'promotion and expansion of valuable capabilities' (Sen [1990]), a view that has taken root in, and is reflected in, the successive Human Development Reports starting from 1991. The capability poverty measure is a simple index comprising three indicators that reflect the percentages of the population with capability shortfalls in three basic dimensions of human development (UNDP [1996]): living a healthy and well-nourished life; having the capability of safe and healthy reproduction; and being literate and knowledgeable. The three corresponding indicators are (1) the percentage of children under 5 who are underweight (U5CUW); (2) the percentage of births unattended by trained health personnel (BUTHP); and (3) the percentage of women aged 15 years and above who are illiterate (WIL). The capability poverty measure differs from the human development index in that it focuses on people's lack of capabilities rather than on the average level of capabilities in a country. The human development index uses income, but the capability poverty measure does not. Comparing results of the capability poverty measure with those of the human development index shows that some countries have done relatively better in raising average capabilities than in reducing capability poverty, and others the reverse (UNDP [1996]). A composite index is made by giving equal weights to the three variables cited earlier. Thus, the capability poverty measure is

CPM = (U5CUW + BUTHP + WIL)/3    (20)
The CPM has been criticized because it places more emphasis on women and children, and standard-of-living deprivation is not incorporated in it. Anand and Sen (1997) formulated an index to measure capability deprivation in the three basic dimensions of human life already reflected in the human development index, namely longevity, knowledge and a decent standard of living. The first deprivation relates to survival: the vulnerability to death at a relatively early age. The second relates to knowledge: being excluded from the world of reading and communication. The third relates to a decent standard of living in terms of overall economic provisioning. This is the human poverty index (HPI), comprising three dimensions, which was introduced in the Human Development Report of 1997. The HPI combines basic dimensions of poverty and reveals interesting contrasts with income poverty; it is written as a weighted mean of three different proportions. The human development index uses income, but the CPM and HPI do not. The human development index measures average
achievement, whereas the HPI measures deprivation in the three basic dimensions of human development captured in the human development index. The HPI takes into account both 'output' (or attainment) and 'input' (or process) indicators.
Algebraic formulation of the HPI and its components

The human poverty index (HPI) measures deprivation in three basic dimensions of human development: longevity, knowledge and economic provisioning. The first two indicators used are the percentage of people not expected to survive to age 40 (P1) and the percentage of adults who are illiterate (P2). The third indicator (P3) is composed of three variables: the percentage of people without access to safe water (P31), the percentage of people without access to health services (P32) and the percentage of moderately and severely underweight children under five (P33):

P3 = (P31 + P32 + P33)/3
The human poverty index, P(α), is a weighted mean of order α of P1, P2 and P3. Let Wi > 0 be the weight on Pi (≥ 0) for i = 1, 2, 3. P(α) is defined as follows:

P(α) = [(W1·P1^α + W2·P2^α + W3·P3^α)/(W1 + W2 + W3)]^(1/α)    (21)

With W1 = W2 = W3 = 1, P(α) = [(P1^α + P2^α + P3^α)/3]^(1/α).

It should be mentioned that P(α) cannot be interpreted as a headcount or incidence of poverty with respect to a poverty line (hyper-plane) drawn in the product space of the three variables, though P1, P2 and P3 are the headcounts or incidences of poverty in each of the three separate dimensions. P(α) is an average, albeit of order α, of the three sub-indices P1, P2 and P3. If the incidence of poverty happened to be the same number P in every dimension, then P(α) would clearly equal this common number, since P(α) = [(W1·P^α + W2·P^α + W3·P^α)/(W1 + W2 + W3)]^(1/α) = P. This observation allows us to interpret P(α) as the degree of overall poverty that is equivalent to having a headcount ratio of P% in every dimension (HDR [1997]). This is what the UNDP suggested in 1997 for measuring the HPI for the developing world. However, the criteria and the indicators used in constructing the HPI differ between developed and developing countries. In India, the Planning Commission measures the human poverty index (HPI). It also considers three dimensions: health, education and economic provisioning (HDRC [2004]). The health indicator is the percentage of people not expected to survive beyond age 40 (P1). The second indicator (P2) comprises
two variables: the illiteracy rate for the population aged 7 years and above, and the proportion of children in the age group 6–18 years not enrolled in school. The economic deprivation indicator (P3) is a composite of four parameters: the proportion of people below the poverty line; the proportion of children in the age group 12 to 23 months not fully vaccinated; the proportion of people living in kutchha houses; and the proportion of people who do not have access to safe drinking water, electricity and health facilities. It should be mentioned that, over time, the indicators used in measuring the HPI vary, depending on the availability and accessibility of data across countries or across regions within a particular country. The UNDP has suggested α = 3 for the calculation of the HPI. The HPI as suggested by the UNDP (1997) has been considered a good index for capturing human deprivation, and P(α) satisfies the following important properties:

(1) Min{P1, P2, P3} ≤ P(α) ≤ Max{P1, P2, P3}
(2) As α → ∞, P(α) → Max{P1, P2, P3}
(3) P(α) is homogeneous of degree 1 in {P1, P2, P3}
(4) ∂P(α)/∂Pi > 0 for each i = 1, 2, 3
(5) P(α) is convex with respect to each Pi: ∂²P(α)/∂Pi² > 0 for i = 1, 2, 3
(6) For any i, ∂P(α)/∂Wi ≥ 0 as Pi ≥ P(α); similarly, ∂P(α)/∂Wi ≤ 0 as Pi ≤ P(α)
(7) For given P1, P2 and P3 that are not all equal, if α > γ > 0, then P(α) > P(γ)
(8) The HPI is not sub-group decomposable: for α ≥ 1, Σ_{j=1}^{m} (nj/n)·Pj(α) ≥ P(α), where nj is the population of the j-th group, n = Σ_{j=1}^{m} nj, and Pj(α) is the HPI of the j-th group
(9) The elasticity of substitution (σ) between any two of the poverty sub-indices P1, P2 and P3 is constant and equal to 1/(α − 1)

The HPI (P(α)) also suffers from problems of aggregation, choice of dimensions, overlapping and multicollinearity (Alkire [2002]; Krishnaji [1997]; Pillai [2004]; Rippin [2009]; Roy and Haldar [2010]). Moreover, an arbitrary weighting scheme is unscientific. To overcome this problem, one can employ principal component analysis (PCA), through which one can reduce the number of variables on the one hand and take care of the weights of the PCs on the other (Johnson and Wichern [2006]; Raychaudhuri and Haldar [2009]; Haldar and Raychaudhuri [2015]). Here the weights are not given arbitrarily; they are determined endogenously from the data matrix. The weights of the PCs are the corresponding eigenvalues (λ). We shall develop an index using PCA later on.
Numerical example towards estimation of HPI (P(α))

The following information pertaining to the poverty sub-indices of a particular country, A, is given: P1 = health deprivation, measured by the percentage of people who do not survive to age 40 = 19.4; P2 = knowledge deprivation, measured by the adult illiteracy rate (%) = 48.8; P31 = population without access to safe drinking water (%) = 19; P32 = population without access to health services (%) = 15; P33 = children under age five who are underweight (%) = 53. Since deprivation in economic provisioning is measured by the mean of P31, P32 and P33, P3 = (19 + 15 + 53)/3 = 29.

Using equation (21) and assuming W1 = W2 = W3 = 1 and α = 3, we have:

P(3) = [(19.4³ + 48.8³ + 29³)/3]^(1/3) ≈ 36.67
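Equation (21) with equal weights and α = 3 takes one line to verify:

```python
def hpi(p1: float, p2: float, p3: float, alpha: float = 3.0) -> float:
    return ((p1**alpha + p2**alpha + p3**alpha) / 3) ** (1 / alpha)

p3 = (19 + 15 + 53) / 3               # economic provisioning sub-index = 29
print(round(hpi(19.4, 48.8, p3), 2))  # 36.67
```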
2.3 Multidimensional poverty and human deprivation: Alkire and Foster's multidimensional poverty index (MPI)

In 2010, the twentieth anniversary year of the Human Development Report, a new international measure of poverty was developed by Alkire and Foster (2009) and Alkire and Santos (2010), known as the multidimensional poverty index (MPI), which directly measures the combination of deprivations that each household experiences. This has been considered an improvement over the HPI (UNDP [2010]), and the new MPI supplants the HPI used in previous Human Development Reports. Unlike the HPI, the MPI has the advantage that it can easily be implemented at the household level. The MPI is constructed using ten indicators covering three dimensions (health, education and standard of living), assigning equal weights (namely 1/3) to each dimension. The indicators and the corresponding weights in each of the three dimensions are given in Table 7.5. The MPI assesses the nature and intensity of poverty at the individual level, with poor people being those who are multiply deprived, and the extent of their poverty being measured by the extent of their deprivations. The MPI creates a vivid picture of people living in poverty within and across countries, regions and the world. It is the first international measure of its kind, and offers an essential complement to income poverty measures because it measures deprivations directly (Alkire and Foster [2009]). We assign the value '1' for deprivation in each indicator and '0' otherwise. The maximum total deprivation score (d) is 100.
TABLE 7.5 Dimensions and indicators of multidimensional poverty at the household level

Dimension (weight) | Indicator | Weight
Health (100/3 = 33.33) | At least one member suffers from malnutrition | 16.67
 | One or more children have died during the last five years | 16.67
Education (100/3 = 33.33) | No one has completed five years of schooling | 16.67
 | At least one school-age child not enrolled in school | 16.67
Living condition (100/3 = 33.33) | No electricity | 5.56
 | No access to safe drinking water | 5.56
 | No access to improved sanitation | 5.56
 | House has dirt wall/floor | 5.56
 | Household uses dirty cooking fuel (dung, firewood or charcoal) | 5.56
 | Household has no car and does not own at least one of: bicycle, motorcycle, radio, refrigerator, telephone or television | 5.56
Total weight (33.33 × 3) | | 100

Source: Compiled from UNDP Human Development Report, 2010.
According to the UNDP, a household (and every member of it) is said to be multidimensionally poor if the sum of the weighted deprivation scores (WDS) for the household is 33.33% or more (UNDP [2015]; Alkire and Santos [2010]). The multidimensional poverty headcount ratio (H) is the proportion of multidimensionally poor people in the total population. (Recall that the ordinary headcount ratio is HCR = q/n, where q is the total number of poor whose income is less than the threshold income determining the poverty line and n is the total number of persons; it measures the incidence of poverty.) The intensity of multidimensional poverty (A) reflects the proportion of the weighted component indicators in which, on average, poor people are deprived. It is the sum of the deprivation scores of the poor divided by the total number of poor:

A = Σ_{i=1}^{q} ci / q

where ci is the deprivation score that the i-th poor person experiences. The deprivation score ci of the i-th poor person can be expressed as the sum of the weights associated with each indicator j (j = 1, 2, . . ., 10) in which person i is deprived: ci = ci1 + ci2 + . . . + ci10. Finally, the multidimensional poverty index (MPI) is obtained by multiplying the multidimensional poverty headcount ratio (H) by the intensity of multidimensional poverty (A). Therefore, MPI = H × A.
The contribution of dimension k to multidimensional poverty can be expressed as:

Contribution_k = [(1/n) Σ_{i=1}^{q} Σ_{j∈k} cij] / MPI
The multidimensional headcount ratio (H), the intensity of multidimensional poverty (A) and the MPI are estimated below for the hypothetical data shown in Table 7.6. The weighted deprivation scores of the four households are:

Weighted deprivation score of HH-1 = (1 × 16.67) + (1 × 5.56) = 22.2 per cent
Weighted deprivation score of HH-2 = (3 × 16.67) + (4 × 5.56) = 72.2 per cent
Weighted deprivation score of HH-3 = (1 × 16.67) + (4 × 5.56) = 38.9 per cent
Weighted deprivation score of HH-4 = (2 × 16.67) + (3 × 5.56) = 50.0 per cent

The UNDP (2010) has fixed the multidimensional poverty cut-off at 33.3 per cent; therefore, HH-2, HH-3 and HH-4 are multidimensionally poor. The multidimensional headcount ratio is

H = (7 + 5 + 4)/(4 + 7 + 5 + 4) = 0.800

Thus, 80 per cent of people live in poor households. The intensity of multidimensional poverty is

A = [(72.2 × 7) + (38.9 × 5) + (50.0 × 4)]/(7 + 5 + 4) = 56.3

This means that the average poor person is deprived in 56.3% of the weighted indicators.

TABLE 7.6 Multidimensional poverty at the household level
Dimension(Indicator)
HH-1
HH-2
HH-3
HH-4
Size:4
Size:7
Size:5
Size:4
Education No one has completed six years of schooling At least one school-aged child not enrolled in school
0 0
1 1
0 0
1 0
Health At least one member suffers from malnutrition One or more children have died
0 1
0 1
1 0
0 1
Living Conditions No electricity No access to clean drinking water No access to adequate sanitation House has dirt floor Household uses dirty cooking fuel Household has no radio/tv/phone/cycle/bike
0 0 0 0 1 0
1 0 1 0 1 1
1 1 1 0 1 0
1 0 0 0 1 1
Source: UNDP’s HDR (2013).
Therefore, MPI = H × A = 0.8 × 0.563 = 0.45.

The contributions of deprivation in education, health and living conditions to the MPI can be estimated as follows (only the deprivations of the poor households HH-2, HH-3 and HH-4 are counted; those of the non-poor HH-1 are censored):

Education = [(16.67 × 7 × 2) + (16.67 × 4)]/(4 + 7 + 5 + 4) / 45 = 33.3%

Health = [(16.67 × 7) + (16.67 × 5) + (16.67 × 4)]/(4 + 7 + 5 + 4) / 45 = 29.6%

Living Conditions = [(5.56 × 7 × 4) + (5.56 × 5 × 4) + (5.56 × 4 × 3)]/(4 + 7 + 5 + 4) / 45 = 37.1%
The deprivation score and MPI across geographical settings or amongst different social groups are helpful for policy intervention. Contribution of each dimension to multidimensional poverty provides information that can be useful for revealing a country’s configuration of deprivations and can help with policy targeting.
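These computations are easy to script. Below is a minimal Python sketch (numpy assumed) that reproduces H, A and the MPI from the Table 7.6 data; the indicator ordering and variable names are our own choices for illustration, not part of the UNDP methodology:

```python
import numpy as np

# Indicator weights: two education and two health indicators at 16.67 each,
# six living-condition indicators at 5.56 each (Table 7.5 convention)
weights = np.array([16.67, 16.67, 16.67, 16.67] + [5.56] * 6)

# Deprivation matrix from Table 7.6: rows = households HH-1..HH-4,
# columns = (edu1, edu2, malnutrition, child death, living conditions 1-6)
D = np.array([
    [0, 0, 0, 1, 0, 0, 0, 0, 1, 0],   # HH-1
    [1, 1, 0, 1, 1, 0, 1, 0, 1, 1],   # HH-2
    [0, 0, 1, 0, 1, 1, 1, 0, 1, 0],   # HH-3
    [1, 0, 0, 1, 1, 0, 0, 0, 1, 1],   # HH-4
])
size = np.array([4, 7, 5, 4])          # household sizes

score = D @ weights                    # weighted deprivation score (percent)
poor = score >= 33.33                  # multidimensional poverty cut-off

H = size[poor].sum() / size.sum()                        # headcount ratio
A = np.average(score[poor], weights=size[poor]) / 100    # average intensity
print(f"H = {H:.3f}, A = {A:.3f}, MPI = {H * A:.3f}")
# H = 0.800, A = 0.563, MPI = 0.450
```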
3 Inequality indices

Different types of inequality measures are used in social research. Inequality indices can be broadly categorized into three classes: social welfare-based measures, the Lorenz class and the entropy class. Numerical examples are provided for each class of measure. Below I discuss the Atkinson index, which falls under the welfare-based measures, and the Gini index, which belongs to the Lorenz class; the mean log deviation, the Theil(1) index and the coefficient of variation belong to the entropy class. The merits and demerits of each class of inequality measure are also discussed in this section.
3.1 Atkinson index

The basic assumptions in constructing this index are as follows: (a) assume that there are two mutually exclusive and exhaustive groups; here the two groups are male and female. (b) The objective of the social planner is to find an optimum trade-off between higher overall achievement and a lower gender differential in any development outcome. Following Atkinson (1970), Sen (1993), Blackorby and Donaldson (1984) and UNDP (1995), let X be the indicator of development, say the literacy rate. Assume X_f and X_m stand for the female and male literacy rates respectively, and
further assume that X_f < X_m; N_f and N_m are the female and male population sizes. Therefore, the overall or mean literacy rate (X̄) is given by:

X̄ = (N_f X_f + N_m X_m)/(N_f + N_m)   (22)
It can be written as X̄ = P_f X_f + P_m X_m, where P_f and P_m are the proportions of the female and male populations respectively. According to assumption (b), the planner wants to increase X̄ on the one hand and to reduce (X_m − X_f) on the other. In order to solve this problem, we consider a social valuation function for achievement that is additively separable, symmetric and of the constant elasticity marginal valuation form:

V(X) = X^(1−ε)/(1 − ε), for ε ≥ 0 but ε ≠ 1   (23a)

V(X) = ln X, for ε = 1   (23b)

Only values of ε ≥ 0 are considered, so as to reflect a preference for equality in the social valuation function. For any pair (X_f, X_m) of female and male achievements, we can construct an 'equally distributed equivalent achievement', X_EDEA, defined as the level of achievement that, if attained equally by women and men as (X_EDEA, X_EDEA), would be judged exactly as valuable socially as the actually observed achievements (X_f, X_m). According to the formula for social valuation, for a given ε, X_EDEA is thus defined as:

X_EDEA = {P_f X_f^(1−ε) + P_m X_m^(1−ε)}^(1/(1−ε))   (24)
Here, X_EDEA is formed from (X_f, X_m) by taking what we shall call a '(1 − ε) average' of X_f and X_m rather than a simple arithmetic average of the female and male achievements. The optimum trade-off between higher achievement and gender equality is achieved through X_EDEA for ε > 0. When ε = 0, X_EDEA reduces to X̄, the simple arithmetic average; here there is no concern for equality and the arithmetic mean measures the social achievement. But when ε > 0, there is a social preference for equality (or an aversion to inequality) that is measured by the magnitude of the parameter ε; thus ε is interpreted as the preference-for-equality or inequality-aversion parameter. From equation (24), assuming P_f = P_m = 1/2, we have the following results:

If ε = 0, X_EDEA = X̄; if ε = 1, X_EDEA = √(X_f · X_m) = GM of X_f and X_m; similarly, for ε = 2, X_EDEA = 2/(1/X_f + 1/X_m) = HM of X_f and X_m.
Therefore, we can write X_EDEA ≤ X̄ for ε ≥ 0. That is,

X_EDEA/X̄ ≤ 1 for ε ≥ 0   (25)

Now, instead of two groups, we can assume n groups each having the population share 1/n; equation (24) can then be represented as:

X_EDEA = [(1/n) ∑_i X_i^(1−ε)]^(1/(1−ε))   (26)

for ε ≥ 0 but ε ≠ 1. If ε = 1,

X_EDEA = ∏_{i=1}^{n} X_i^(1/n)   (27)

Given any achievement (say an income distribution), the EDEA can therefore be easily calculated for different levels of inequality aversion. Different levels of inequality aversion ε give different values of X_EDEA. For ε = 0, the equally distributed equivalent achievement (income) is simply the average level of income. With ε > 0, X_EDEA decreases (for a convex social welfare function its level is always below average income) and the Atkinson index increases. Therefore, the Atkinson index is

A(ε) = 1 − X_EDEA/X̄ = 1 − [(1/n) ∑_i (X_i/X̄)^(1−ε)]^(1/(1−ε))   (28)

for ε ≥ 0 but ε ≠ 1. If ε = 1,

A(ε) = 1 − ∏_i X_i^(1/n) / X̄   (29)
For any income distribution, if we know the mean income and the value of A(ε), the level of welfare can be calculated and compared. In this sense, X_EDEA is a direct measure of welfare. The value of the Atkinson index depends on the inequality aversion parameter ε: as ε tends to infinity, X_EDEA approaches the minimum of the X_i; if that minimum X_i is zero, X_EDEA will be zero and the Atkinson index will be unity. Thus, the Atkinson index lies between zero and one.
Numerical example: estimation of the Atkinson index from hypothetical data

Let us consider the first column of Table 7.3, which gives the income earned by members belonging to the SC community. There are 10 individuals and their incomes are: 150, 80, 125, 200, 90, 100, 80, 110, 130, 70. Before calculating the Atkinson index A(ε), we have to choose the inequality aversion parameter ε. Suppose we consider the values ε = 0.5, 1 and 2. For ε = 0.5, the expression in (28) is applicable. Expanding (28) with ε = 0.5 and mean income X̄ = 113.5, we have:

A(ε)|_{ε=0.5} = 1 − [(1/n) ∑_{i=1}^{n} (X_i/X̄)^0.5]²
= 1 − [{(150/113.5)^0.5 + (80/113.5)^0.5 + (125/113.5)^0.5 + . . . + (70/113.5)^0.5}/10]²
= 1 − [(1.149 + 0.839 + 1.049 + . . . + 0.785)/10]²
= 1 − (9.887/10)²
= 0.025

If we consider ε = 1, the expression in (29) is applicable and the estimation is straightforward:

A(ε)|_{ε=1} = 1 − ∏_i X_i^(1/n) / X̄
= 1 − (X₁·X₂·X₃ · · · X₁₀)^(1/10) / X̄
= 1 − (150 × 80 × 125 × · · · × 70)^0.1 / 113.5
= 1 − 107.643/113.5
= 0.051
If we consider ε = 2, the expression in (28) reduces to:

A(ε)|_{ε=2} = 1 − [(1/n) ∑_{i=1}^{n} (X_i/X̄)^(−1)]^(−1)
= 1 − n/[X̄ (1/X₁ + 1/X₂ + . . . + 1/X_n)]
= 1 − HM/AM

Inserting the given numerical values, we have:

A(ε)|_{ε=2} = 1 − 103.25/113.5 = 0.090
Therefore, as the preference for equality (ε) increases, the value of the Atkinson index rises slowly.
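A minimal Python sketch of the same computation (numpy assumed). It reproduces the worked values up to rounding, except that the geometric mean computed here gives A(1) ≈ 0.048 rather than the 0.051 reported above:

```python
import numpy as np

def atkinson(x, eps):
    """Atkinson index A(eps) via equations (28)-(29)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    if eps == 1:
        ede = np.exp(np.log(x).mean())   # geometric mean, equation (29)
    else:
        ede = mean * np.mean((x / mean) ** (1 - eps)) ** (1 / (1 - eps))  # (28)
    return 1 - ede / mean

incomes = [150, 80, 125, 200, 90, 100, 80, 110, 130, 70]
for eps in (0.5, 1, 2):
    print(f"A({eps}) = {atkinson(incomes, eps):.3f}")
# A(0.5) = 0.025, A(1) = 0.048, A(2) = 0.090
```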
3.2 Gini index

The Gini index or Gini coefficient was developed by the Italian statistician Corrado Gini in 1912. It is defined as a ratio with values between 0 and 1: the numerator is the area between the Lorenz curve of the distribution and the uniform distribution line (the line of equality, or egalitarian line); the denominator is the area under the egalitarian line. Graphically (as shown in Figure 7.1), the Gini coefficient or Gini index is defined as A/(A + B). Since A + B = (1/2) × 1 × 1 = 0.5, the Gini coefficient is G = 2A = 1 − 2B. If the Lorenz curve is represented by the function Y = L(X), the value of B can be found by integration:

Gini = 1 − 2 ∫₀¹ L(X) dX   (30)

If the Lorenz curve is discontinuous (i.e. discrete), then area B is the sum of the areas of trapeziums and the Gini is defined as:

Gini = 1 − ∑_{i=1}^{n} (P_i − P_{i−1})(Q_i + Q_{i−1})   (31)

where P and Q are the cumulative proportions of persons and income respectively. For estimation purposes, various formulas have been devised for the Gini index.
FIGURE 7.1 Lorenz curve: the cumulative share of income earned (0–100%) plotted against the cumulative share of people from lowest to highest incomes; A is the area between the line of equality and the Lorenz curve, and B is the area below the Lorenz curve.
One formula is in terms of the covariance between income levels and the cumulative distribution of income. In particular:

Gini = (2/X̄) Cov(X, F(X))   (32)

where Cov is the covariance between income levels X and the cumulative distribution of the same income, F(X), and X̄ is average income. The Lorenz derivation of the Gini index has a direct correspondence with another, though cumbersome, way to calculate the Gini index:

Gini = 1 + 1/n − (2/n)[x_n/X + 2(x_{n−1}/X) + 3(x_{n−2}/X) + . . . + n(x₁/X)]   (33)

where X = ∑_{i=1}^{n} x_i and x_n > x_{n−1} > x_{n−2} > . . . > x₁.
Note the peculiarity of the last round bracket, where each income share, from the highest to the lowest, is multiplied by the rank of the individual in the income distribution from the lowest to the highest, so that the largest share has rank 1 and the smallest share has rank n. This formula is applicable only if the incomes of the individuals are distinct. The problem of ties may be overcome with the following formula:

Gini = [1/(2n²x̄)] ∑_i ∑_j |x_i − x_j|   (34)
Following Cowell (1977) and Yitzhaki (1983), the Gini index has the following properties:
(a) Gini has zero as its lower limit and (n − 1)/n as its upper limit;
(b) Gini is scale invariant but not translation invariant;
(c) Gini satisfies the Pigou-Dalton transfer principle;
(d) Gini reacts less to transfers occurring amongst individuals with closer ranks;
(e) Generally, Gini is not sub-group decomposable and it is not distribution sensitive.
Numerical example: estimation of the Gini index from hypothetical data

We consider the same income data already used in the estimation of the Atkinson index. The covariance formula (equation 32) is the least cumbersome and easiest to estimate. First we sort the income data in ascending order: 70, 80, 80, 90, 100, 110, 125, 130, 150, 200. The mean income is x̄ = (70 + 80 + . . . + 200)/10 = 113.5, and the covariance between income and the cumulative distribution of the same income, Cov(X, F(X)), is 10.125. Therefore, Gini = (2/113.5) × 10.125 = 0.178.

On the other hand, applying (34), we can estimate the Gini index as:

Gini = [1/(2n²x̄)] ∑_i ∑_j |x_i − x_j|
= [|70 − 80| + |70 − 80| + |70 − 90| + |70 − 100| + |70 − 110| + |70 − 125| + |70 − 130| + |70 − 150| + |70 − 200| + |80 − 70| + |80 − 80| + |80 − 90| + |80 − 100| + |80 − 110| + |80 − 125| + |80 − 130| + |80 − 150| + |80 − 200| + . . . + |200 − 125| + |200 − 130| + |200 − 150|] / (2 × 10² × 113.5)
= 4050/22700
= 0.178
In order to implement the formula in (33), we slightly change the income data, since two individuals have the same income of 80: we subtract 0.5 from the second individual's income and add it to the third individual's income, so that the new income distribution becomes 70, 79.5, 80.5, 90, 100, 110, 125, 130, 150, 200. Therefore, with X = ∑ x_i = 1135:

Gini = 1 + 1/10 − (2/10)[200/1135 + 2(150/1135) + 3(130/1135) + 4(125/1135) + 5(110/1135) + 6(100/1135) + 7(90/1135) + 8(80.5/1135) + 9(79.5/1135) + 10(70/1135)]
= 1.1 − 0.2 × (0.176 + 0.264 + 0.343 + 0.440 + 0.484 + 0.528 + 0.555 + 0.567 + 0.630 + 0.616)
= 1.1 − 0.2 × 4.603 = 1.1 − 0.9206 = 0.179

Therefore, all the Gini formulas yield essentially the same result.
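The two Gini formulas used above can be checked with a short Python sketch (numpy assumed; the function names are our own):

```python
import numpy as np

def gini_cov(x):
    """Gini via the covariance formula (32): G = 2 Cov(x, F(x)) / mean(x)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    F = np.arange(1, n + 1) / n                       # cumulative ranks F(x)
    cov = ((x - x.mean()) * (F - F.mean())).mean()    # population covariance
    return 2 * cov / x.mean()

def gini_pairs(x):
    """Gini via the mean absolute difference, formula (34)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.abs(x[:, None] - x[None, :]).sum() / (2 * n**2 * x.mean())

incomes = [150, 80, 125, 200, 90, 100, 80, 110, 130, 70]
print(round(gini_cov(incomes), 3), round(gini_pairs(incomes), 3))  # 0.178 0.178
```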
3.3 Entropy measures of inequality

The use of complex inequality measures such as the entropy class is associated with a descriptive approach to measuring inequality. In particular, the use of these indexes does not involve welfare judgements; there is, however, a relationship between the entropy measures of inequality and the Atkinson index. The entropy class of inequality indexes gives a different description of inequality from simple statistical indexes. In particular, these measures give no information about the shape of the distribution, i.e. whether it is positively or negatively skewed. In fact, most of these indices are translation invariant and say nothing about the position of the income distribution (Theil [1967]; Cowell [2003]). Unlike the Gini, the entropy measures satisfy the decomposability property and are distribution sensitive. The mean log deviation (MLD), the Theil index (GE(1)) and the coefficient of variation belong to the entropy class of inequality measures. The generalized form of the entropy class is given by equation (35):

GE(α) = [1/(n(α² − α))] ∑ [(X_i/X̄)^α − 1]   (35)

where X̄ is the mean income. The values of the GE measures vary between 0 and ∞, with zero representing an equal distribution and higher values representing higher levels of inequality. The parameter α in equation (35) is the sensitivity parameter capturing the distribution of income (or consumption): it represents the weight given to distances between incomes at different parts of the income distribution, and it can take any real value. For lower values of α, GE is more sensitive to changes in the lower tail of the distribution; for higher values, GE is more sensitive to changes that affect the upper tail. The general practice is to choose α = 0, 1 or 2. If we directly put α = 0 or 1, the function (35) cannot be evaluated, but for α = 2 the function becomes:

GE(2) = [1/(n(2² − 2))] ∑ [(X_i/X̄)² − 1] = (1/(2n)) ∑ (X_i² − X̄²)/X̄²
= (1/2)(1/X̄²)[(1/n) ∑ X_i² − X̄²]
= (1/2) Var(X)/X̄²
= (1/2) CoV²   (36)
Note that Var(X) = (1/n) ∑ X_i² − [(∑ X_i)/n]². In order to evaluate the limiting value of equation (35) for α = 0 and α = 1, we employ L'Hôpital's rule, and GE becomes:

GE(α) = lim_{α→0} Z₁(α)/Z₂(α) = lim_{α→0} Z₁′(α)/Z₂′(α)

where Z₁(α) = ∑ [(X_i/X̄)^α − 1] and Z₂(α) = n(α² − α). Thus:

GE(α)|_{α=0} = lim_{α→0} [∑ (X_i/X̄)^α ln(X_i/X̄)] / [n(2α − 1)] = −∑ ln(X_i/X̄)/n = ∑ ln(X̄/X_i)/n   (37)

Equation (37) can further be expressed as:

GE(α)|_{α=0} = [(ln X̄ − ln X₁) + (ln X̄ − ln X₂) + . . . + (ln X̄ − ln X_n)]/n   (38)

This is popularly known as the mean log deviation, or GE(0). Similarly, as α tends to 1, GE(α) becomes:

GE(α)|_{α=1} = ∑ (X_i/X̄) ln(X_i/X̄) / n   (39)
GE(1) is called the Theil index, after the author who first proposed it in 1967. Both indexes, however, share an undesirable feature: they are not defined if there are zero incomes. Therefore, in a distribution with all zero incomes except the last, their maximum value cannot be calculated directly. Rather, it can only
be calculated by replacing the zero incomes with very small incomes. However, if we did so, while GE(1) would approach its maximum value of ln(n), the maximum value of GE(0) would depend on how small these incomes are taken to be. In other words, GE(0) is not bounded. Hypothetically, assume an income distribution that is exceptionally unequal, such that the incomes of (n − 1) individuals are extremely small compared to that of the n-th person, whose income is very high. Under these circumstances, GE(1) may be approximated as:

GE(1) = (1/n) ∑ (X_i/X̄) ln(X_i/X̄) ≈ (1/n)[0 + 0 + 0 + . . . + (nX_n/∑X_i) ln(nX_n/∑X_i)]
= (1/n)(nX_n/X_n) ln(nX_n/X_n) = ln(n)   (40)
Thus we face a problem in finding the limiting value of GE(1) for a highly unequal income distribution: as the number of individuals increases, the maximum of GE(1) also increases. Therefore, for the purposes of an operational approach, it is worth defining a class of relative entropy (RE) inequality indexes, defined as the ratio between the value of the original entropy index and the maximum value each member of that class assumes for any given positive α. This excludes GE(0), as it has no upper limit. RE(1) is therefore defined as:

RE(1) = GE(1)/Max GE(1) = [(1/n) ∑ (X_i/X̄) ln(X_i/X̄)] / ln(n)   (41)

For α > 1, the maximum GE(α) for a maximally unequal income distribution is evaluated as:

Max GE(α) = [1/(n(α² − α))] ∑ [(X_i/X̄)^α − 1] = [1/(n(α² − α))][0 + 0 + . . . + n^α − n] = (n^α − n)/(n(α² − α))

Therefore, RE(α) is defined as:

RE(α) = GE(α)/Max GE(α) = {[1/(n(α² − α))] ∑ [(X_i/X̄)^α − 1]} / {(n^α − n)/(n(α² − α))} = ∑ [(X_i/X̄)^α − 1] / (n^α − n)   (42)
GE(α) and RE(α) satisfy the following properties:

(a) All members of both the GE and RE classes have zero as their lower limit. GE(1) has ln(n) as its upper limit, whereas GE(α), for α > 1, has (n^α − n)/(n(α² − α)) as its upper limit.
(b) All members of the GE and RE classes are scale invariant but not translation invariant.
(c) All members of the GE and RE classes satisfy the principle of transfers.
(d) The GE measures of inequality are sub-group decomposable.
Numerical example: estimation of the entropy indexes from hypothetical data

We consider the same income distribution used for the Atkinson and Gini inequality indexes: 150, 80, 125, 200, 90, 100, 80, 110, 130, 70. We want to estimate numerically the values of the MLD or GE(0), the Theil index or GE(1), GE(2) and RE(1). The mean income X̄ of the 10 individuals is 113.5. Expanding the RHS of equation (38), we have:

GE(0) = MLD = [(ln 113.5 − ln 150) + (ln 113.5 − ln 80) + (ln 113.5 − ln 125) + . . . + (ln 113.5 − ln 70)]/10
= [(−0.278) + (0.349) + (−0.096) + . . . + (0.483)]/10
= (1.5693 − 1.075)/10 = 0.4953/10 = 0.049

Now, expanding the RHS of equation (39), we have:

GE(1) = Theil index = [(X₁/X̄) ln(X₁/X̄) + (X₂/X̄) ln(X₂/X̄) + . . . + (X_n/X̄) ln(X_n/X̄)]/n
= [(150/113.5) ln(150/113.5) + (80/113.5) ln(80/113.5) + . . . + (70/113.5) ln(70/113.5)]/10
= [1.321 × (0.278) + 0.704 × (−0.349) + . . . + 0.616 × (−0.483)]/10
= (1.6246 − 1.1176)/10 = 0.0507
GE(2) for the income distribution data given earlier is estimated as:

GE(2) = 0.5 × (SD/Mean)² = 0.5 × (39.58/113.5)² = 0.5 × 0.1216 = 0.0608

RE(1) is evaluated as:

RE(1) = GE(1)/Max GE(1) = 0.0507/ln 10 = 0.0507/2.302 = 0.022
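A minimal Python sketch of the entropy measures (numpy assumed). The results match the worked values up to rounding, except GE(2), where the chapter's 0.0608 uses the sample rather than the population standard deviation:

```python
import numpy as np

def ge(x, alpha):
    """Generalized entropy GE(alpha); alpha=0 is the MLD, alpha=1 the Theil index."""
    x = np.asarray(x, dtype=float)
    r = x / x.mean()
    if alpha == 0:
        return np.mean(-np.log(r))                      # equation (38)
    if alpha == 1:
        return np.mean(r * np.log(r))                   # equation (39)
    return np.mean(r**alpha - 1) / (alpha**2 - alpha)   # equation (35)

incomes = [150, 80, 125, 200, 90, 100, 80, 110, 130, 70]
print(round(ge(incomes, 0), 3))   # ≈ 0.050 (MLD)
print(round(ge(incomes, 1), 3))   # ≈ 0.051 (Theil)
print(round(ge(incomes, 2), 3))   # ≈ 0.055 (= CV^2/2 with the population SD)
print(round(ge(incomes, 1) / np.log(len(incomes)), 3))  # RE(1) ≈ 0.022
```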
3.4 Concentration index

Health inequality is much more serious than income inequality. Income inequality can be reduced by imposing taxes or by other measures, but health inequality is structural in nature: there is no automatic mechanism to eliminate it. It takes longer to bring down health inequality than income inequality, since the latter can be reduced even in the short term by transferring income from richer to poorer. Health inequality may also create negative externalities; for instance, a domestic helper with TB has a higher chance of infecting others, especially children and the elderly. All the inequality measures cited above, with their own merits and demerits, are used mainly to estimate income, education and health inequality. The concentration curve (CC) and concentration index (CI), by contrast, are explicitly used to identify socioeconomic inequality, i.e. whether it is more pronounced at one point in time than another, or in one country than another. A concentration curve does not give a measure of the magnitude of inequality, whereas the concentration index (CI) does, by quantifying the degree of socioeconomic inequality with a particular focus on a health variable. It is used to measure and compare the degree of socioeconomic inequality in child mortality, child immunization, child malnutrition, adult health, health subsidies, healthcare utilization etc. In this section we first define the CI and its properties; we then describe its estimation procedure for individual and grouped data.

The CC plots the cumulative percentage of the health variable against the cumulative percentage of the population ranked by socioeconomic status or standard of living. For instance, in Figure 7.2 the horizontal axis measures the cumulative percentage of the population ranked by income and the vertical axis measures the cumulative percentage of ill health. The figure depicts ill (bad) health as disproportionately concentrated amongst the poor, since the CC lies above the line of equality. In other words, it plots shares of the health variable against the variable for living standards.

FIGURE 7.2 Concentration curve: the cumulative percentage of ill health plotted against the cumulative percentage of persons ranked by income, with the 45-degree line as the line of equality.

The CI is defined as twice the area between the CC and the line of equality (i.e. the 45-degree line). Thus, when there is no socioeconomic inequality, the CI is zero. The convention is to assign a negative value to the CI when the curve lies above the line of equality, indicating a disproportionate concentration of ill health amongst the poor, and a positive value when it lies below the line of equality. In particular, if the health variable is a 'bad' like ill health (e.g. infant mortality rate (IMR), malnutrition, anaemia), a negative value of the CI means ill health is higher amongst the poor.
Mathematical form of the CI

If the health variable is a 'bad' and the distribution (the cumulative proportion of persons ranked by income) is continuous:

CI = 1 − 2 ∫₀¹ L_h(p) dp   (43)

where ∫₀¹ L_h(p) dp is the area under the concentration curve.
For a discrete living standards variable, it is

CI = [2/(Nμ)] ∑_{i=1}^{N} h_i r_i − 1 − 1/N   (44)

where h_i is the health sector variable, μ is its mean and r_i = i/N is the fractional rank of individual i in the living standards distribution, with i = 1 for the poorest and i = N for the richest. For computational purposes, a more convenient formula for the CI,
defined in terms of the covariance between the health variable and the fractional rank in the living standards distribution, is given as:

CI = (2/μ) Cov(h, r)   (45)

It is to be noted that expanding (45) yields (44); hence there is no difference between the two equations (see appendix). In the case of grouped data, the CI is estimated using the following formula (Fuller and Lury [1977]):

CI = (P₁L₂ − P₂L₁) + (P₂L₃ − P₃L₂) + . . . + (P_{T−1}L_T − P_T L_{T−1})   (46)

where P_t is the cumulative proportion of the sample ranked by economic status in group t (t = 1, 2, 3, . . ., T), and L_t is the corresponding concentration curve ordinate.
Properties of the concentration index (CI)

The properties of the CI depend on the measurement characteristics of the variable of interest. It is an appropriate measure of socioeconomic inequality in healthcare when healthcare is measured on a ratio scale with non-negative values. The main properties of the CI are: (a) the CI is invariant to multiplication of the health sector variable of interest by any scalar; (b) the CI is not invariant to any linear transformation of the variable of interest: even adding a constant to the variable will change the value of the concentration index. Measurement of health inequality often relies on self-reported indicators of health; a concentration index cannot be computed directly from such categorical data, although the ordinal data can be transformed into some cardinal measure from which a concentration index can be computed. Finally, unlike the Gini coefficient, the CI lies between −1 and +1 (Kakwani et al. [1997]).
Empirical illustration of the CI

Hypothetically, the self-assessed health status (SAHS) and income of ten individuals are recorded as in Table 7.7. We calculate the CI for this individual sample data. Since the incomes in Table 7.7 are given in a haphazard order, we must first rearrange the data in ascending order of income and then proceed. The necessary calculations for the estimation of the CI are shown in Table 7.8.
TABLE 7.7 SAHS data by income

| Individual | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SAHS | 38 | 25 | 30 | 40 | 24 | 23 | 31 | 30 | 32 | 28 |
| Income ('00) | 150 | 80.5 | 125 | 200 | 90 | 100 | 79.5 | 110 | 130 | 70 |

Source: Hypothetical data generated by author.
Note: We use the same income data already used in the estimation of the different inequality indices.
TABLE 7.8 Computations for estimation of CI

| Income ('00) | SAHS (h_i) | Income Rank | Fractional Rank (r_i) | h_i r_i |
| --- | --- | --- | --- | --- |
| 70 | 28 | 1 | 0.1 | 2.8 |
| 79.5 | 31 | 2 | 0.2 | 6.2 |
| 80.5 | 25 | 3 | 0.3 | 7.5 |
| 90 | 24 | 4 | 0.4 | 9.6 |
| 100 | 23 | 5 | 0.5 | 11.5 |
| 110 | 30 | 6 | 0.6 | 18 |
| 125 | 30 | 7 | 0.7 | 21 |
| 130 | 32 | 8 | 0.8 | 25.6 |
| 150 | 38 | 9 | 0.9 | 34.2 |
| 200 | 40 | 10 | 1 | 40 |
| Mean (μ) = 30.1 | | | | ∑ h_i r_i = 176.4 |

Source: Author's estimation.
Note: Income is arranged in ascending order.
Inserting the computed values into equation (44), we get:

CI = [2/(Nμ)] ∑_{i=1}^{N} h_i r_i − 1 − 1/N = [2/(10 × 30.1)] × 176.4 − 1 − 1/10 = 0.072
This is positive, which means that good health (SAHS) is higher amongst the richer, and the CC will lie below the line of equality. Instead of individual data, we may encounter large-scale survey data, in which case group-specific income quintiles are estimated and the corresponding health variables are sorted accordingly. The concentration index for grouped data is estimated from hypothetical data given in Table 7.9.
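Returning to the individual-data example, a minimal Python sketch of the computation (numpy assumed):

```python
import numpy as np

# SAHS and income of the ten individuals in Table 7.7
income = np.array([150, 80.5, 125, 200, 90, 100, 79.5, 110, 130, 70])
sahs = np.array([38, 25, 30, 40, 24, 23, 31, 30, 32, 28], dtype=float)

h = sahs[np.argsort(income)]     # health variable ranked from poorest to richest
N = len(h)
r = np.arange(1, N + 1) / N      # fractional rank r_i = i/N
mu = h.mean()

ci = (2 / (N * mu)) * (h * r).sum() - 1 - 1 / N   # equation (44)
print(round(ci, 3))                                # 0.072
```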
4 Weighted index

Weighted indices are widely used in social research; in most cases the weights are arbitrarily chosen. Here I would like to highlight the method of principal component analysis (PCA), through which the weights can be determined endogenously from the data matrix. Before estimating the weighted index, I would first like to calculate a human deprivation index using Sen and Anand's HPI methodology:
TABLE 7.9 Estimation of CI for grouped data

| WG | NB | RPB | CPB (P_t) | U5MR | ND | RPD | CPD (L_t) | CI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Poorest | 29,939 | 23 | 23 | 154.7 | 4,632 | 30 | 30 | +0.0007 |
| Second | 28,776 | 22 | 45 | 152.9 | 4,400 | 29 | 59 | –0.0339 |
| Middle | 26,528 | 20 | 66 | 119.5 | 3,170 | 21 | 79 | –0.0577 |
| Fourth | 24,689 | 19 | 85 | 86.9 | 2,145 | 14 | 93 | –0.08 |
| Richest | 19,739 | 15 | 100 | 54.3 | 1,072 | 7 | 100 | 0.0000 |
| Total/Mean | 129,671 | | | 118.8 | 15,419 | | | –0.1709 |

Source: World Bank, siteresources.worldbank.org/INTPAH/Resources/Publications/ . . . /HealthEquityCh8.pdf, p. 98.
Note: WG = wealth group, NB = number of births, RPB = relative % of births, CPB = cumulative percentage of births, U5MR = under-5 mortality rate per thousand, ND = no. of deaths, RPD = relative percentage of deaths, CPD = cumulative percentage of deaths, CI = concentration index.
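The grouped-data formula (46) applied to Table 7.9 can be verified in a few lines of Python:

```python
# Cumulative percentages of births (P_t) and deaths (L_t) from Table 7.9
P = [23, 45, 66, 85, 100]
L = [30, 59, 79, 93, 100]

# Formula (46); dividing by 100*100 converts percentage products to proportions
ci = sum(P[t] * L[t + 1] - P[t + 1] * L[t] for t in range(len(P) - 1)) / 10000
print(ci)   # -0.1709
```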
Recall P(α) from (21):

P(α) = [(W1·P1^α + W2·P2^α + W3·P3^α)/(W1 + W2 + W3)]^(1/α)

where P1 = (P11 + P12)/2, P2 = (P21 + P22 + P23)/3 and P3 = (P31 + P32)/2. Here P1 = standard of living deprivation, with P11 = percentage of households having no improved sanitation facilities and P12 = percentage of non-institutional deliveries, capturing basic healthcare deficiencies; P2 = health deprivation, with P21 = percentage of children under 5 who are underweight, P22 = percentage of women aged 15–49 who are anaemic and P23 = percentage of women aged 15–49 whose BMI is below normal; P3 = knowledge deprivation, with P31 = percentage of females above 6 years of age who never attended school and P32 = percentage of adult females who are illiterate. The indicators used in the present study do differ from the original indicators suggested by Anand and Sen (1997); however, we believe that all these indicators sufficiently capture human deprivation from a multidimensional perspective. UNDP (1997) has suggested equal weights for P1, P2 and P3, i.e. W1 = W2 = W3 = 1. Therefore, we can calculate two indices: one with equal weights and the other with weights based on principal component analysis (PCA). Using the NFHS-4 data set, we estimate both indices across sixteen major states and compare the ranks of the states on the two deprivation indices. The unweighted human poverty index is straightforward, but one should keep in mind the signs of the factor loadings before estimating the weighted index, which involves the application of PCA. First, we do not take the average values of the parameters; rather, all seven deprivation parameters are included simultaneously in the PCA. Second, we calculate the correlation matrix of the seven deprivation parameters.
A principal component (PC) is an artificial variable derived from a set of variables, say X1, X2, X3, X4 and X5. Theoretically, we can have five PCs for five variables, derived from the correlation matrix of order (5 × 5). The first PC extracts the maximum possible proportion of the variability of the data on X (X1, X2, . . ., X5); the second PC absorbs the residual variance, and so on; the fifth PC absorbs the least variance of the data. The eigenvalue (λ) of a PC, say the first PC, represents the amount of variance in all of the variables (X1, X2, . . ., X5) that is explained by that component; since we have five factors/variables, the sum of the eigenvalues, ∑_{i=1}^{5} λ_i, should be 5. One can easily prove that the sum of the squares of the factor loadings of a PC equals its eigenvalue (λ). This is why the eigenvalue of the first PC is taken as its weight. Only the first PC is taken into account, because it absorbs the maximum variance of the data; it combines all seven original variables, though with differential weights. There is no need to consider the second PC, since it is orthogonal to the first one, and because all the deprivation parameters are highly correlated, the second PC becomes redundant. Since the second PC is orthogonal to the first, the artificial variables are orthogonal in nature. Theoretically, we can have seven artificial variables (PCs), since the total number of deprivation parameters is seven; but only the first few PCs are retained for analysis, because these capture the maximum possible variance of the factors across observations (Nunnally and Bernstein [1994]). The first principal component (or artificial variable) captures the maximum variability of the data and is written as:

PC1 = a11·P11 + a12·P12 + a21·P21 + a22·P22 + a23·P23 + a31·P31 + a32·P32   (47)

The second principal component captures the remaining variance (after the first PC) and is not included in our analysis. The raw data pertaining to all the deprivation parameters, drawn from the NFHS-4 state fact sheets, are given in Table 7.10 for sixteen major states in India. UNDP (1997) has suggested equal weights for P1, P2 and P3 (W1 = W2 = W3 = 1); therefore, we first calculate the HPI. We estimate P1, P2 and P3 across the sixteen major states from Table 7.10 and estimate the HPI using the Anand and Sen (1997) method. The result is given in Table 7.11. The construction of the weighted HPI using PCA, on the other hand, is reported in Table 7.12 (see appendix for details), assuming α = 3; the result is given in the second column of that table. We also examine the rank correlation coefficient between the weighted HPI and the HPI, and the value of the rank correlation is extremely high, 0.976.
TABLE 7.10 Parameters of human poverty across states, 2015–16

| States | P11 | P12 | P21 | P22 | P23 | P31 | P32 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Assam | 52.3 | 29.4 | 29.8 | 46 | 25.7 | 25 | 28.2 |
| Arun. Pra | 38.7 | 47.7 | 19.5 | 40.3 | 56.2 | 32.9 | 34.4 |
| Bihar | 74.8 | 36.2 | 43.9 | 60.3 | 30.4 | 43.1 | 50.4 |
| Gujarat | 35.7 | 11.3 | 39.3 | 54.9 | 27.2 | 28 | 27.1 |
| Haryana | 20.8 | 19.5 | 29.4 | 62.7 | 15.8 | 29.7 | 24.6 |
| HP | 29.3 | 23.6 | 21.2 | 53.5 | 16.2 | 21 | 11.8 |
| Karnataka | 42.2 | 5.7 | 35.2 | 44.8 | 20.7 | 29.3 | 28.3 |
| Kerala | 1.9 | 0.1 | 16.1 | 34.2 | 9.7 | 4.6 | 2.1 |
| MP | 66.3 | 19.2 | 42.8 | 52.5 | 28.4 | 36 | 40.6 |
| Maharashtra | 48.1 | 9.7 | 36 | 48 | 23.5 | 22.6 | 19.7 |
| Odisha | 70.6 | 14.6 | 34.4 | 51 | 26.4 | 32.2 | 32.6 |
| Punjab | 18.5 | 9.5 | 21.6 | 53.5 | 11.7 | 24 | 18.6 |
| Rajasthan | 55 | 16 | 36.7 | 46.8 | 27 | 42.8 | 43.5 |
| TN | 47.8 | 1 | 23.8 | 54.8 | 14.6 | 22.8 | 20.6 |
| UP | 65 | 32.2 | 39.5 | 52.4 | 25.3 | 37 | 39 |
| WB | 49.1 | 24.8 | 31.6 | 62.4 | 21.3 | 26 | 29.1 |

Source: Data are drawn from State Fact Sheet of NFHS-4 (2015–16), IIPS, Mumbai.
TABLE 7.11 HPI across sixteen major states using Anand and Sen (1997) methodology

| States | P1 | P2 | P3 | HPI (α = 3) |
| --- | --- | --- | --- | --- |
| Assam | 40.85 | 33.83 | 26.60 | 34.61 |
| Arun. Pra | 43.20 | 38.67 | 33.65 | 38.75 |
| Bihar | 55.50 | 44.87 | 46.75 | 49.29 |
| Gujarat | 23.50 | 40.47 | 27.55 | 32.09 |
| Haryana | 20.15 | 35.97 | 27.15 | 29.11 |
| HP | 26.45 | 30.30 | 16.40 | 25.58 |
| Karnataka | 23.95 | 33.57 | 28.80 | 29.20 |
| Kerala | 1.00 | 20.00 | 3.35 | 13.85 |
| MP | 42.75 | 41.23 | 38.30 | 40.69 |
| Maharashtra | 28.90 | 35.83 | 21.15 | 29.73 |
| Odisha | 42.60 | 37.27 | 32.40 | 37.74 |
| Punjab | 14.00 | 28.93 | 21.30 | 22.95 |
| Rajasthan | 35.50 | 36.83 | 43.15 | 38.65 |
| TN | 24.40 | 31.07 | 21.70 | 26.24 |
| UP | 48.60 | 39.07 | 38.00 | 42.28 |
| WB | 36.95 | 38.43 | 27.55 | 34.83 |

Source: Author's calculations.
Therefore, a basic question emerges relating to the merits and applicability of the two indices: which is the better index? If we compare the two, we notice that (a) the weighted index is higher than the un-weighted (unitary-weight) index, and (b) the rankings of the states are more or less the same under the two indices; the ranks of seven states (namely Arunachal Pradesh, Haryana, Karnataka, MP, Odisha, TN and UP) change between them. I must say that the weighted index is better than the un-weighted (unitary) index. PCA helps us to determine
TABLE 7.12 HPI (weighted and un-weighted) and the ranks of states

| States | HPI (Weighted) | HPI (Sen & Anand) | Rank, Weighted HPI | Rank, HPI (Sen & Anand) |
| --- | --- | --- | --- | --- |
| Assam | 36.07 | 34.611 | 8 | 8 |
| Arun. Pra | 40.18 | 38.753 | 6 | 4 |
| Bihar | 52.58 | 49.288 | 1 | 1 |
| Gujarat | 33.88 | 32.087 | 9 | 9 |
| Haryana | 31.78 | 29.105 | 13 | 12 |
| HP | 27.53 | 25.584 | 14 | 14 |
| Karnataka | 32.81 | 29.199 | 12 | 11 |
| Kerala | 15.42 | 13.852 | 16 | 16 |
| MP | 45.63 | 40.692 | 2 | 3 |
| Maharashtra | 33.80 | 29.726 | 10 | 10 |
| Odisha | 44.67 | 37.743 | 4 | 6 |
| Punjab | 26.06 | 22.954 | 15 | 15 |
| Rajasthan | 42.12 | 38.645 | 5 | 5 |
| TN | 33.09 | 26.238 | 11 | 13 |
| UP | 45.04 | 42.279 | 3 | 2 |
| WB | 37.52 | 34.830 | 7 | 7 |

Source: Author's calculations.
the weights of the parameters and hence reveals the relative merits and importance of the parameters in a multidimensional poverty index. Of the seven parameters, the policy planner should give more priority to parameters like P32 (percentage of adult females who are illiterate), P31 (percentage of females above 6 years of age who never attended school) and P11 (percentage of households having no improved sanitation facilities), since their weights are very high compared to the other deprivation parameters in the HPI. The weighted HPI is more scientific than the un-weighted because it takes into account the variability of the factors (namely the parameters) across states.
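A minimal sketch of the weighted HPI computation (numpy assumed), using the first-PC factor loadings derived in the appendix and checked against the Assam row of Tables 7.10 and 7.12:

```python
import numpy as np

def weighted_hpi(p, w, alpha=3):
    """Weighted HPI: generalized mean of the deprivation parameters p,
    with the first-PC factor loadings w as weights (appendix formula)."""
    p, w = np.asarray(p, dtype=float), np.asarray(w, dtype=float)
    return (np.sum(w * p**alpha) / w.sum()) ** (1 / alpha)

# First-PC factor loadings (Table A.7.2) and Assam's parameters (Table 7.10)
loadings = [0.869, 0.649, 0.776, 0.382, 0.649, 0.944, 0.973]
assam = [52.3, 29.4, 29.8, 46, 25.7, 25, 28.2]
print(round(weighted_hpi(assam, loadings), 2))  # ≈ 36.08 (cf. 36.07, Table 7.12)
```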
APPENDIX
A.1

FGT(α) can be expressed as:

FGT(α) = ∑_{x* > x_i} (x* − x_i)^α / [n(x*)^α] = (1/n) ∑_{x* > x_i} [(x* − x_i)/x*]^α = (1/n) ∑_{i=1}^{n*} (g_i/x*)^α   (A.3.1)

where g_i = x* − x_i, and x_i is the income of the i-th poor. Equation (3.7) can be expressed as:

FGT(α) = (1/n) ∑_{i=1}^{n*} (g_i/x*)^α = (1/n) ∑_{i=1}^{n*} K_i^α   (A.3.2)

where K_i = g_i/x*. If the incomes of the poor are arranged as x₁ < x₂ < . . . < x_{n*−1} < x_{n*}, then FGT(α)₁ is defined as:

FGT(α)₁ = (1/n)(g₁/x*)^α = (1/n) K₁^α   (A.3.3)

thus:

FGT(α)₁/FGT(α) = [(1/n) K₁^α] / [(1/n) ∑_{i=1}^{n*} K_i^α] = K₁^α/(K₁^α + K₂^α + . . . + K_{n*}^α)   (A.3.4)

Equation (A.3.4) can be simplified as:

FGT(α)₁/FGT(α) = [1 + (K₂/K₁)^α + (K₃/K₁)^α + . . . + (K_{n*}/K₁)^α]^(−1)   (A.3.5)

Since x₁ is the lowest income, K₁ is the largest of the K_i, so each ratio K_j/K₁ is less than one. Expanding (A.3.5) and taking the limit α → ∞, FGT(α)₁/FGT(α) = 1; this proves that FGT(α)₁ = FGT(α). Here, FGT(α)₁ stands for the poorest of the poor.
A.2

Expanding (45), we get (44). This is illustrated as follows:

CI = (2/μ)(1/n) ∑_{i=1}^{n} (h_i − h̄)(r_i − r̄), with μ = h̄

r̄ = (1/n) ∑ r_i = (1/n)(1/n + 2/n + . . . + n/n) = (1/n²)(1 + 2 + . . . + n) = n(n + 1)/(2n²) = (n + 1)/(2n)

Therefore:

CI = (2/(μn)) ∑ (h_i − μ)[r_i − (n + 1)/(2n)]
= (2/(μn))[∑ h_i r_i − ((n + 1)/(2n)) ∑ h_i − μ ∑ r_i + nμ(n + 1)/(2n)]
= (2/(μn))[∑ h_i r_i − ((n + 1)/(2n))·nμ − μ(n + 1)/2 + μ(n + 1)/2]
= (2/(nμ)) ∑ h_i r_i − 1 − 1/n

using ∑ h_i = nμ and ∑ r_i = n·r̄ = (n + 1)/2.
A.3 Construction of weighted HPI using PCA

Following Nunnally and Bernstein (1994) and Pett et al. (2003), we first calculate the zero-order correlation matrix R of the seven deprivation parameters (from the data in Table 7.10).
R =

| | P11 | P12 | P21 | P22 | P23 | P31 | P32 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| P11 | 1 | 0.395 | 0.765 | 0.297 | 0.437 | 0.757 | 0.826 |
| P12 | 0.395 | 1 | 0.149 | 0.147 | 0.748 | 0.539 | 0.572 |
| P21 | 0.765 | 0.149 | 1 | 0.409 | 0.211 | 0.719 | 0.758 |
| P22 | 0.297 | 0.147 | 0.409 | 1 | –0.191 | 0.385 | 0.314 |
| P23 | 0.437 | 0.748 | 0.211 | –0.191 | 1 | 0.553 | 0.604 |
| P31 | 0.757 | 0.539 | 0.719 | 0.385 | 0.553 | 1 | 0.960 |
| P32 | 0.826 | 0.572 | 0.758 | 0.314 | 0.604 | 0.960 | 1 |

Column sums of R: 4.477, 3.550, 4.011, 2.361, 3.362, 4.913, 5.034.
The first trial vector T1A of the correlation matrix R is T1A = [4.477 3.550 4.011 2.361 3.362 4.913 5.034]; the elements of T1A are the column sums of R. The length of vector T1A = sum of squares of the elements of T1A = (4.477)² + (3.55)² + . . . + (5.034)² = 115.09. The square root of the length is (115.09)^0.5 = 10.72. Now we calculate the first normalized trial vector, V1A:

V1A = [4.477/10.72, 3.55/10.72, . . ., 5.034/10.72] = [0.417 0.331 0.374 0.220 0.313 0.458 0.469]

The second estimate of the trial vector, T1B, is:

T1B = V1A·R = [{(1 × 0.417) + (0.395 × 0.331) + (0.765 × 0.374) + (0.297 × 0.22) + (0.437 × 0.313) + (0.757 × 0.458) + (0.826 × 0.469)}, . . ., {(0.826 × 0.417) + (0.572 × 0.331) + . . . + (0.960 × 0.458) + (1 × 0.469)}]
= [1.77 1.33 1.58 0.80 1.31 1.92 1.98]

The length of T1B = (1.77)² + (1.33)² + . . . + (1.98)² = 17.46, and its square root is (17.46)^0.5 = 4.17. The second normalized trial vector, V1B, is:

V1B = [1.77/4.17, 1.33/4.17, . . ., 1.98/4.17] = [0.426 0.319 0.379 0.194 0.315 0.462 0.475]

This input-output process continues until the weights in the normalized output vector no longer differ substantially from the previously obtained normalized input vector. At that point, it is said that the solution has converged, and the newly obtained
normalized vector becomes the first eigenvector, V_PC1, for the correlation matrix R. How do we determine when convergence has been achieved? Kline (1994) indicates that, for statistical computer programs, the general criterion for convergence is that 'the sum of the squared differences between the pairs of elements in the two vectors is less than 0.00001' (p. 32). For our two normalized vectors, V1A and V1B, the differences between the elements are as follows:

V1A − V1B = [0.417 0.331 0.374 0.220 0.313 0.458 0.469] − [0.426 0.319 0.379 0.194 0.315 0.462 0.475]
= [−0.009 0.012 −0.005 0.026 −0.002 −0.004 −0.006]

By squaring and summing the differences, we arrive at the sum of the squared differences:

(−0.009)² + (0.012)² + . . . + (−0.006)² = 0.000982

Because this value, 0.000982, though small, is still not less than the general criterion for convergence, 0.00001, we need to continue with the input-output process until that criterion is reached. The third estimate of the trial vector is:

T1C = V1B·R = [1.779 1.330 1.591 0.789 1.326 1.935 1.992]

The length of T1C = (1.779)² + (1.330)² + . . . + (1.992)² = 17.568, and the square root of the length is (17.568)^0.5 = 4.191. The third normalized vector is:

V1C = [1.779/4.191, 1.330/4.191, . . ., 1.992/4.191] = [0.424 0.317 0.379 0.188 0.316 0.461 0.475]

Let us now check for convergence once more:

V1B − V1C = [0.426 0.319 0.379 0.194 0.315 0.462 0.475] − [0.424 0.317 0.379 0.188 0.316 0.461 0.475]
= [0.0012 0.0016 0.0006 0.005 −0.0015 0.0002 0.0004]

The sum of the squared differences is (0.0012)² + (0.0016)² + (0.0006)² + . . . + (0.0004)² = 3.876 × 10⁻⁵. This value is very close to 0.00001, and therefore we can say that convergence is roughly achieved. After convergence has been obtained, we can identify the eigenvector, eigenvalue and factor loadings associated with the first principal component, PC1.
Because convergence has been obtained with the normalized vector V1C, it becomes the first eigenvector, V_PC1, for the correlation matrix R. The square root of the sum of squares of T1C, 4.191, is also known as the square root of the vector V_PC1's length; this value becomes the first eigenvalue, λ1, for the correlation matrix (i.e. λ1 = 4.191). Remember that we used this eigenvalue, 4.191, to divide the elements of T1C when generating the normalized vector V1C. The factor loadings for PC1 are generated by multiplying the elements of the eigenvector V_PC1 by the square root of its eigenvalue, √λ1 = √4.191 = 2.047:

Factor loadings of first PC = √λ1 · V1C = 2.047 × [0.424 0.317 0.379 0.188 0.316 0.461 0.475] = [0.869 0.649 0.777 0.385 0.647 0.945 0.973]

This process of estimating the PCs and the corresponding factor loadings is time consuming; nowadays most statistical software, such as SPSS and STATA, does this routinely and easily. We reproduce below the SPSS output relating to the PCs, eigenvalues and factor loadings of the seven deprivation parameters. Two components are extracted based on the Kaiser normalization criterion, but the first component sufficiently captures the variance (about 60 per cent) and we retain only one component; the factor loadings are shown in Table A.7.2.
This process of estimation of PCs and the corresponding factor loadings is time consuming; now a day’s most of the statistical software, such as SPSS and STATA, routinely do this very easily. We are now producing the SPSS output relating to PCs, eigenvalues and factor loadings of seven deprivation parameters. Two components are extracted based on Kaizer normalization criterion but the first component sufficiently capture the variance (about 60 per cent) and we retain only one Component; the factor loadings are shown in Table A.7.2 TABLE A.7.1 SPSS output of PCA
Component
Initial Eigenvalue
% of Variance
Cumulative % of Variance
1 2 3 4 5 6 7
4.183 1.472 0.759 0.245 0.174 0.141 0.026
59.758 21.026 10.849 3.500 2.487 2.011 0.368
59.758 80.784 91.633 95.133 97.62 99.63 100
Source: Author’s estimation. TABLE A.7.2 Factor loadings
Parameter
Factor Loadings of 1st PC
Factor Loadings of 2nd PC
P11 P12 P21 P22 P23
0.869 0.649 0.776 0.382 0.649
0.160 –0.549 0.454 0.664 –0.703
P31 P32
0.944 0.973
0.061 0.010
Source: Author’s estimation.
Therefore, the software-generated output pertaining to the extraction of the first PC and the corresponding factor loadings is identical to the earlier results. Once the weights (namely the factor loadings) are determined, we insert the values of the weights of the deprivation parameters into the HPI (ignoring the averaging):

P(α) = [(W11·P11^α + W12·P12^α + W21·P21^α + W22·P22^α + W23·P23^α + W31·P31^α + W32·P32^α)/(W11 + W12 + W21 + W22 + W23 + W31 + W32)]^(1/α)

= [(0.869·P11^α + 0.649·P12^α + 0.776·P21^α + 0.382·P22^α + 0.649·P23^α + 0.944·P31^α + 0.973·P32^α)/(0.869 + 0.649 + 0.776 + 0.382 + 0.649 + 0.944 + 0.973)]^(1/α)
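The hand-run power iteration of appendix A.3 can be reproduced in a few lines of Python (numpy assumed); this is a sketch of the same procedure, not part of the original text:

```python
import numpy as np

# Correlation matrix R of the seven deprivation parameters (appendix A.3)
R = np.array([
    [1.000, 0.395, 0.765, 0.297, 0.437, 0.757, 0.826],
    [0.395, 1.000, 0.149, 0.147, 0.748, 0.539, 0.572],
    [0.765, 0.149, 1.000, 0.409, 0.211, 0.719, 0.758],
    [0.297, 0.147, 0.409, 1.000, -0.191, 0.385, 0.314],
    [0.437, 0.748, 0.211, -0.191, 1.000, 0.553, 0.604],
    [0.757, 0.539, 0.719, 0.385, 0.553, 1.000, 0.960],
    [0.826, 0.572, 0.758, 0.314, 0.604, 0.960, 1.000],
])

v = R.sum(axis=0)                        # first trial vector: column sums of R
v /= np.linalg.norm(v)                   # first normalized trial vector
while True:                              # power iteration
    t = R @ v
    lam = np.linalg.norm(t)              # converges to the first eigenvalue
    t /= lam
    done = np.sum((v - t) ** 2) < 1e-5   # Kline's (1994) convergence criterion
    v = t
    if done:
        break

loadings = np.sqrt(lam) * v              # factor loadings of the first PC
print(round(lam, 3), loadings.round(3))  # lambda_1 ≈ 4.19; loadings ≈ Table A.7.2
```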
References

Alkire, S. 2002. Valuing Freedoms: Sen's Capability Approach and Poverty Reduction, Indian Edition. Oxford University Press: New Delhi.
Alkire, S. and J. Foster. 2009. "Counting and Multidimensional Poverty Measurement". Oxford Poverty & Human Development Initiative, OPHI Working Paper No. 32.
Alkire, S. and M. E. Santos. 2010. "Acute Multidimensional Poverty: A New Index for Developing Countries". United Nations Development Programme Human Development Reports Research Paper, July, Working Paper No. 38.
Anand, Sudhir and A. K. Sen. 1997. "Concepts of Human Development and Poverty: A Multidimensional Perspective". Background Papers for Human Development Report.
Atkinson, Anthony B. 1970. "On the Measurement of Inequality". Journal of Economic Theory, Vol. 2(3), 244–263.
Blackorby, C. and D. Donaldson. 1984. "Ethically Significant Ordinal Indexes of Relative Inequality". In: R. Bassman and G. Rhodes, eds., Advances in Econometrics, Volume 3. JAI Press: Greenwich, pp. 131–147.
Cowell, F. 1977. Measuring Inequality. Phillip Allan: Oxford, UK.
Cowell, F. 2003. "Theil Inequality and the Structure of Income Distribution". Discussion Paper No. DARP 67, May. Distributional Analysis Research Programme, STICERD, LSE, The Toyota Centre.
Foster, J., J. Greer, and E. Thorbecke. 1984. "A Class of Decomposable Poverty Measures". Econometrica, Vol. 52, 761–776.
Fuller, M. and D. Lury. 1977. Statistics Workbook for Social Science Students. Phillip Allan: Oxford, UK.
Haldar, S. K. and A. Raychaudhuri. 2015. "Principles of Allocating Development Grant at Sub-National Level in India: A Suggested Methodology". In: S. Banerjee, V. Mukherjee, and S. K. Haldar, eds., Understanding Development: An Indian Perspective on Legal and Economic Policy. Springer: India.
Human Development Resource Centre. 2004. Planning Commission, Government of India. Oxford University Press: New Delhi.
Johnson, R. A. and D. W. Wichern. 2006. Applied Multivariate Statistical Analysis. Pearson Education: New Delhi.
Kakwani, N. C., A. Wagstaff, and E. van Doorslaer. 1997. "Socioeconomic Inequalities in Health: Measurement, Computation and Statistical Inference". Journal of Econometrics, Vol. 77(1), 87–104.
Kline, P. 1994. An Easy Guide to Factor Analysis. Routledge: New York.
Krishnaji, N. 1997. "Human Development Index: A Critique". Economic and Political Weekly, Vol. 32(35), August 30, 2202–2205.
Morris, M. D. 1980. "The Physical Quality of Life Index (PQLI)". Development Digest, Vol. 18(1), 95–109.
Nunnally, J. C. and I. H. Bernstein. 1994. Psychometric Theory, 3rd Edition. McGraw-Hill: New York.
Pett, M. A., R. L. Nancy, and J. J. Sullivan. 2003. Making Sense of Factor Analysis. Sage Publications: New Delhi.
Pillai, V. N. 2004. "CES Function, Generalised Mean and Human Poverty Index: Exploring Some Links". Working Paper No. 360, Centre for Development Studies, India.
Raychaudhuri, A. and S. K. Haldar. 2009. "An Investigation into the Inter-District Disparity in West Bengal, 1991–2005". Economic and Political Weekly, Vol. XLIV(26–27), June 27.
Rippin, N. 2009. "The Concept of Multidimensional Poverty: Accounting for Dimensional Poverty". Discussion Paper No. 179, Ibero-America Institute for Economic Research.
Roy, K. and S. K. Haldar. 2010. "Measuring Poverty and Socio-Economic Deprivation Inequalities in India at Sub-National Level". Asia Pacific Social Science Review, Vol. 10(1).
Sen, A. K. 1976. "Poverty: An Ordinal Approach to Measurement". Econometrica, Vol. 44(2), March, 219–231.
Sen, A. K. 1990. "Development as Capability Expansion". In: K. Griffin and J. Knight, eds., Human Development and the International Development Strategy for the 1990s. Palgrave Macmillan: London.
Sen, A. K. 1993. "Life Expectancy and Inequality: Some Conceptual Issues". In: P. Bardhan, M. Datta Chaudhuri, and T. N. Krishnan, eds., Development and Change. Oxford University Press: Bombay.
Theil, H. 1967. Economics and Information Theory. North-Holland: Amsterdam, The Netherlands.
UNDP (United Nations Development Programme). 1990. Human Development Report 1990. Oxford University Press: New York.
UNDP. 1995. Human Development Report 1995. Oxford University Press: New York.
UNDP. 1996. Human Development Report 1996. Oxford University Press: New York.
UNDP. 1997. Human Development Report 1997: Technical Appendix. Oxford University Press: New York.
UNDP. 2010. Human Development Report 2010: Technical Appendix. Oxford University Press: New York.
UNDP. 2013. Human Development Report 2013: Technical Note on Multidimensional Poverty. Oxford University Press: New York.
UNDP. 2015. Human Development Report 2015: Technical Appendix. Oxford University Press: New York.
Yitzhaki, S. 1983. "On the Extension of the Gini Index". International Economic Review, Vol. 24, 617–628.
PART IV
Quantitative research methods and predictive analysis
8 DESIGNING A PRIMARY SURVEY-BASED RESEARCH

Tanmoyee Banerjee (Chatterjee)
1 Introduction

The present chapter discusses the tools and techniques of designing primary survey-based research, a quantitative research method used widely across the disciplines of social science. A primary survey enables a researcher to gather information or responses from a group of individuals regarding a specific research question on which secondary sources shed no light or for which the available information is not reliable. According to Blackstone (2014), survey research is often used by researchers who wish to explain trends or features of large groups; it enables the researcher to learn what a group of individuals feels or thinks about the research question. Primary survey-based research must be based on the principles of the scientific method to minimize the researcher's bias. First, a researcher must develop a feasible research question and form hypotheses. For primary survey-based empirical research, the researcher must identify the target population from whom the necessary data can be collected. Given the target population, the researcher must identify the sample from which data on events, objects or people can be collected that are measurable, observable and replicable. The goal of primary research is to learn something new that can be confirmed by others and to eliminate our own biases in the process. Empirical research is also necessary to validate a theoretical model. As Eichner (1985) has pointed out, theoretical research and empirical research are not two separate activities: a theoretical model must always have empirical validity, and an empirical research question must likewise be based on a theoretical framework. Given this backdrop, the rest of the chapter is organized as follows: section 2 discusses the ethical issues related to primary research; section 3 discusses sampling methods; section 4 discusses the methods of data capture and questionnaire design; and section 5 briefly presents the methods of data cleaning and concludes the chapter.
2 Ethical principles

In this section, we discuss the ethical issues related to primary survey-based research. Ethical principles are defined as the practices that must be followed to protect the privacy and confidentiality of the human subjects participating in the survey. According to Valerio and Mainieri (2008), ethical principles in survey research are in place to protect individual participants from the start of study recruitment, through participation and data collection, to the dissemination of research findings in a manner that is confidential, private and respectful. There are several instances of ethical misconduct in the past, such as the Tuskegee syphilis experiment (1932–72) or the way German scientists used captive human subjects as guinea pigs in scientific experiments.1 The Nuremberg Code (1947) formalized rules such as informed consent and absence of coercion, properly formulated scientific experimentation, and beneficence towards experiment participants.2 The Belmont Report (National Institutes of Health [1979]) summarized ethical principles and guidelines for research involving human subjects in biomedical and behavioural research. The report identified three core principles: respect for persons, beneficence and justice. It also stated three primary areas of application: informed consent, assessment of risks and benefits, and selection of subjects. Informed consent implies that a participant in a survey must be informed about all aspects of the research, and the researcher must obtain consent from the participants on aspects such as answering questions and videotaping the interview process. This in turn implies the following:

Voluntary participation: The participants must voluntarily participate in the survey and have the right to withdraw at any point of time. Thus, a researcher must get permission from people before involving them in any primary research (Hogan [2008]).

Confidentiality and anonymity: The researcher should keep the participant's identity, as well as any sensitive information revealed by the subject during the survey, confidential. The field investigators employed in the primary survey should be trained to maintain the privacy of the collected information. Valerio and Mainieri (2008) have pointed out that researchers sometimes give incentives to survey participants as a note of thanks; however, incentives may pose an ethical conflict and may bias the participants' decision to participate. Respect for the dignity of research participants should be prioritized, and participants should not be subjected to harm in any way whatsoever.

In addition, as Blackstone (2014) has advised, a researcher must submit an outline of the research proposal to the Institutional Review Board or Ethical Review Committee of the concerned institution for review and approval prior to beginning the research. This will ensure the rights and welfare of the human subjects involved in the study. Further, the research process should be transparent at every stage. The researcher must be careful about the documentation and presentation of the collected data, and
deceitful means should not be applied, nor should data be fabricated to exaggerate the aims and objectives of the research. The researcher must be careful to present and communicate the research results in an unbiased and transparent way. Finally, affiliations of any form, sources of funding and any possible conflicts of interest have to be declared.
3 Sampling methods

Sampling is the selection of a given number of units of analysis (people, households, firms, etc.), called cases, from a population of interest. Generally, the sample size (n) is chosen in order to reproduce, on a small scale, some characteristics of the whole population (N) (Trobia [2008]). The method of collecting data from the entire population is called the census method. The census method has some advantages as well as some disadvantages. It is applied by national governments for population censuses, agricultural censuses and livestock censuses to get an accurate view at the country level. But this method is very time- and labour-consuming, as well as costly. The census method of data collection is applicable only when the population (or universe) is not vast, the researcher faces no time or financial constraints, and a high degree of precision in data collection is required. Researchers usually apply a sampling method while collecting data from a population. A sample must be a true representative of the population and must have all the characteristics of the population. A researcher draws samples from a list called the 'sampling frame', which contains the information related to the population of interest. A sampling frame may be obtained from secondary sources, such as voter lists or telephone directories, depending on the target group of interest to be included in the survey research. Alternatively, it can be generated from a complete enumeration of the target group. The effectiveness of primary survey research crucially depends on the representativeness of the sample, which in turn depends on the sampling procedure, the sample size and the response rate (the participation rate of the subjects). If a population is homogeneous, then a small number of sample units can represent the population characteristics; for a heterogeneous population, however, the sample size may need to be large for proper representation. The sampling procedure can be of two types: probability sampling and non-probability sampling. Though the different sampling methods are discussed in the next chapter in more detail, it is instructive here to have a brief understanding of the different types of sampling. Probability sampling refers to methods where each member of the population has some positive probability of being selected in the sample. Probability sampling helps us to measure the sampling error (Trobia [2008]). We generally concentrate on equal probability sampling methods, or EPSEM, which refers to the application of sampling techniques that give the population elements an equal chance of being included in the sample (Battaglia [2008a]). There are mainly four types of probability sampling: namely simple random sampling, systematic random
sampling, stratified random sampling and cluster sampling.

Simple random sampling: In this method each unit in the sampling frame is assigned a number. To choose the samples, the researcher uses a table of random numbers or a lottery system. As a result, all the elements in the sampling frame have the same probability of selection. The sampling scheme may be without replacement ('WOR': no element can be selected more than once in the same sample) or with replacement ('WR': an element may appear multiple times in one sample). Simple random sampling is easy to apply to a small, homogeneous population. However, for a large sampling frame this method is not appropriate. Further, if the population contains many minority subgroups of interest, they may not be reflected in the sample in sufficient numbers for study (Fink [1995]).

Systematic random sampling: In systematic random sampling, sampling units are selected using a sampling interval with a random start from the sampling frame (Battaglia [2008a]). Here too the units of the sampling frame are ordered or assigned numbers, and then the elements are selected at regular intervals beginning from a random start. This interval is called the skip interval. A simple example would be to select every k-th name from a complete list of the target population in alphabetical order, where the first sample unit is chosen randomly. Given a suitable sampling frame, this is the easiest method of drawing a sample in which the sample units are evenly distributed over the target population. A disadvantage of this method, however, is that the sample may be biased if there exists a hidden periodicity in the population that coincides with the skip interval (Cox [2008]).

Stratified random sampling: When a population is subdivided into distinct categories or subpopulations, the sample frame can be organized into separate subgroups known as 'strata' on the basis of one or more stratification variables. Samples can then be drawn randomly from each stratum, treated as an independent subpopulation in which every unit has the same chance of being selected. While drawing the samples, a prudent method is to follow the population proportion of each subgroup, so that members of each subpopulation or stratum are represented in the sample according to their population proportion. As a result, this type of sampling leads to greater precision (Kalsbeek [2008]).

Cluster sampling: Cluster sampling is an example of 'two-stage sampling', where a researcher begins by sampling groups (or clusters) of population elements and then selects elements from within those groups (Blackstone [2014]). In the first stage, the researcher chooses a sample of areas or clusters under the assumption that the population is divided into clusters of homogeneous units. In the second stage, a sample of respondents within those areas is selected. Note that under stratified sampling the population is divided across non-overlapping strata and all the strata are represented in the sample, whereas under cluster sampling only a randomly selected subset of clusters is included. Fahimi (2008) noted that this method is applied for reasons of feasibility and economy, but the sample estimates may be less precise compared to samples drawn through simple random sampling or stratified sampling. A short code sketch of these designs follows below.
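A minimal Python sketch of the first three designs on a toy frame (numpy assumed; the frame, sample size and stratification variable are illustrative, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(42)
frame = np.arange(1000)          # toy sampling frame of 1,000 numbered units
strata = frame % 5               # hypothetical stratification variable (5 strata)
n = 50                           # desired sample size

# Simple random sampling without replacement (an EPSEM design)
srs = rng.choice(frame, size=n, replace=False)

# Systematic random sampling: skip interval k = N/n with a random start
k = len(frame) // n
systematic = frame[rng.integers(k)::k]

# Proportionate stratified random sampling: here the five strata are of
# equal size, so each contributes n/5 units
stratified = np.concatenate([
    rng.choice(frame[strata == s], size=n // 5, replace=False)
    for s in np.unique(strata)
])
print(len(srs), len(systematic), len(stratified))   # 50 50 50
```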
Apart from these methods, there are two more methods of sampling: multi-stage sampling and multi-phase sampling.

Multi-stage sampling: Under multi-stage sampling, the sampling is done in various stages. It is basically a complex form of cluster sampling where the various stages of sampling follow the hierarchical ordering of the population. An example is this: suppose at the first stage a random number of states are chosen; then from each state a random number of districts are chosen, followed by a random number of villages; in the final stage, a random number of households are selected for surveying. (A toy sketch of such nested draws is given at the end of this subsection.)

Multiphase sampling: In this sampling procedure some information is collected from the entire sample, whereas additional information may be collected from subsamples. According to Daniel (2011), under multiphase sampling the same sampling units may be sampled multiple times. Basically, when the available sampling frame does not provide sufficient information for stratification, first-stage sampling is carried out to extract information for further sampling and stratification. From the collected information the initial sample may be stratified, and all elements of the target strata may be interviewed, or a subsample from this group may be drawn for collecting additional information (Daniel [2011]).

Next we discuss non-probability sampling. Non-probability sampling refers to sampling techniques where the probability or likelihood that each element of the population will be selected in the sample is unknown. In this case, a section of the population may remain uncovered during the process of sampling, since the selection of sample units from the population depends on subjective methods. The methods of non-probability sampling are convenience sampling, purposive sampling and quota sampling.

Convenience sampling method: This is a method of non-probability sampling where the samples are drawn from the part of the population which is conveniently available at hand. This process of sampling is also known as grab or opportunity sampling, or accidental or haphazard sampling. The basic problem with this sampling is that the sample does not represent the entire population; as a result, any statistical analysis carried out on the basis of this sample will give biased results. Under convenience sampling, a popular method is snowball sampling, where the researcher initially includes some subjects in the study and subsequently the initial sample units are used to recruit more participants for the survey. This method is also called chain referral sampling. Snowball sampling is an especially useful strategy when a researcher wishes to study some stigmatized group or behaviour (Blackstone [2014]).

Purposive sampling: This sampling method is also called judgemental sampling, as in this method the researcher includes those subjects from the population whom he or she considers representative of the population and appropriate for the study. Purposive sampling may also be used when a researcher wishes to include only people who meet very narrow or specific criteria (Blackstone [2014]).

Quota sampling: In quota sampling the population is first divided into mutually exclusive segments or subgroups, as in the stratified sampling method. The sampling procedure then proceeds using a nonrandom selection mechanism until the desired number or quota of completed interviews is obtained for each subgroup (Battaglia [2008b]); that is, in the second stage samples are drawn by a nonrandom method.
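The multi-stage scheme mentioned above can be sketched as nested random draws. The following minimal Python sketch assumes an invented frame of states, districts, villages and households; all names and counts are illustrative only.

```python
import random

# A hypothetical nested frame: state -> district -> village -> household IDs.
frame = {
    f"state_{s}": {
        f"district_{s}_{d}": {
            f"village_{s}_{d}_{v}": [f"hh_{s}_{d}_{v}_{h}" for h in range(50)]
            for v in range(8)
        }
        for d in range(5)
    }
    for s in range(10)
}

# Stage 1: sample states; stage 2: districts within each sampled state;
# stage 3: villages; stage 4: households to be surveyed.
households = []
for state in random.sample(list(frame), 3):
    for district in random.sample(list(frame[state]), 2):
        for village in random.sample(list(frame[state][district]), 2):
            households.extend(random.sample(frame[state][district][village], 10))

print(len(households), "households selected, e.g.:", households[:3])
```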
Sampling bias and sampling error

A bias in a sample is caused when the sample does not represent the population truly – for example, when one part of the population has higher representation than others while another part remains uncovered in the sample. This is called sampling bias. Sampling bias is mainly caused by a defective sampling procedure, the use of a non-probabilistic sampling procedure, or the use of a defective sampling frame leading to non-representation of a part of the population. Examples of this bias may be as follows. Suppose that in a study the target population is all eligible voters, while the sample population is the population reachable by telephone. This will cause bias due to undercoverage and non-representation of a part of the population (that is, voters with no telephones) in the sample. Or suppose that in estimating students' scores at Jadavpur University one chooses, say, students from the Arts Faculty only; again this sample will be biased.

Sampling error is measured by the deviation of an estimated sample statistic from the true population value. Suppose that in a population the average family income is Rs. 25,000 per month; however, the average family income in a sample drawn from that population turns out to be Rs. 40,000. This deviation is called the sampling error. Unlike sampling bias, the sampling error is measurable.
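This deviation can be illustrated with a small simulation; the population below is randomly generated and purely illustrative.

```python
import random
import statistics

random.seed(1)
# A hypothetical population of 10,000 monthly family incomes (Rs.).
population = [random.gauss(25_000, 8_000) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Draw one simple random sample and compare its mean with the true mean.
sample_mean = statistics.mean(random.sample(population, 100))

print(f"population mean: {true_mean:,.0f}")
print(f"sample mean:     {sample_mean:,.0f}")
print(f"sampling error:  {sample_mean - true_mean:,.0f}")
```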
4 Methods of data capture and questionnaire design

The last section discussed the sampling procedures. Once the sampling is done, the researcher knows the group of sampling units from whom the data is to be collected. Next, the major tasks are to determine what data must be collected to answer the research question and how the questions are to be asked; finally, the best instrument must be identified for collecting the survey data. The different methods of data collection under primary survey research include face-to-face interviews, telephone interviews, mail surveys, email surveys and measurement surveys.
4.1 Methods of data collection

Face-to-face interview: According to Blackstone (2014), interviews are a method of data collection that involves two or more people exchanging information through a series of questions and answers. The questions are designed by a researcher to elicit information from the interview participant(s) on a specific topic or set of topics. Interviews can be conducted with a structured questionnaire, where the interviewer asks the interviewee a predetermined set of questions and the answers are recorded using standardized techniques (Kothari [1990]). Structured interviews are mostly used for data collection under quantitative surveys. However, under qualitative studies interviews can be unstructured, and the respondents can freely express their views in their own words. This is a much more flexible way of data collection compared to the structured interview process. There are certain advantages and disadvantages of the face-to-face interview method. The first advantage is that people usually give their responses when confronted directly, and thus it raises the rate of response. Second, it reduces confusion, as the interviewer can explain and clarify the questions when required. Third, in this method the interviewer can capture the reactions of the respondents while asking different questions. At the same time, there are certain drawbacks of this method. Firstly, it is a very costly method of data collection. Secondly, it is also very time consuming when the sample size is large. Thirdly, this method requires intensive training of the field investigators, who play an important role in the data collection process. The field investigators must be honest, sincere and impartial. The views of the field investigators must not be reflected during the interview process; otherwise, this will introduce bias into the collected information.

Telephone interviews: This is a fast method of data collection where the respondents are contacted through their telephones. The telephone numbers can be randomly selected from the telephone directory or randomly generated. This method of data collection is more appropriate when the questionnaire is brief. The advantages of this method are that it is a fast and economical method of data collection, and sampling can be done randomly. On the other hand, the disadvantages are that people may be unwilling to answer phone interviews, and phone calls can usually only be made during a stipulated time. Phone surveys also normally need to be shorter in length than personal interviews.

Mail survey/email survey: In the case of a mail survey, a structured questionnaire is posted to the respondents with a request that they return the filled-in questionnaire. In the case of email surveys, a structured questionnaire or a Google form is emailed; the respondent has to fill in the questionnaire online and submit it accordingly. The advantage of this method is that it is a fast and cheap way of data collection. The respondent can answer at his/her convenience. In this case, since the respondent himself/herself fills in the questionnaire, it is free from the interviewer's bias, which may be present in the interview process. Finally, in the case of email surveys using online forms like Google forms, responses are automatically tabulated in a spreadsheet, which saves time. The most important drawback of this method of data collection is that the response rate may be very low, as it is entirely dependent on the motivation of the respondents to fill in the questionnaire voluntarily. Respondents can simply ignore the mails/emails. Moreover, since respondents fill in the forms themselves, they may simply omit questions or leave the questionnaire incomplete. This is a serious problem for mail surveys; for email surveys, however, important questions can be marked as compulsory fields. Sometimes this method may be very time consuming, as the respondents may take time to give their responses. Most problematically, people who have email may not be representative of the population as a whole, especially in developing countries.
Measurement surveys: This is a unique type of survey that measures some features of the population, like height, weight, blood pressure, other health parameters, nutritional intake or indoor air pollution, using scientific methods and techniques. This method is free of bias, as the data is collected using scientific instruments, and thus it provides the most accurate data. However, it is the most expensive method of data collection. Ethical considerations must be taken into account while measuring, say, the impact of drugs on human beings.
4.2 Instrument of data collection: questionnaire design

Krosnick and Presser (2010) pointed out that there are four steps in the cognitive process when a respondent answers a particular question. First, they must interpret the question and deduce its intent. Next, they must search their memories for relevant information and then integrate whatever information comes to mind into a single judgement. Finally, they must translate the judgement into a response by selecting one of the alternatives offered by the question. During the process of answering a question, some bias can occur. Non-response bias occurs when the respondent does not answer all the questions. It can occur when the questions are not very clearly written, or when the questionnaire is very long and boring. A lack of incentive on the part of the respondent also causes non-response. Apart from this, measurement error occurs when wrong or inaccurate answers are recorded. This error can occur for various reasons. Firstly, a lack of clarity in the wording of questions, the ordering of questions or poor quality printing of the questionnaire leads to inaccurate answers. Secondly, sometimes the respondents need to recall past events to answer questions, and a long recall period may cause inaccurate answers. Thirdly, the attitude of the interviewers may cause inaccurate answers. The interviewers must be polite, sincere and honest. They should read and explain the questions to respondents and record the answers correctly. Measurement error can occur if the interviewers misinterpret the questions or record the answers incorrectly. Sometimes the sex of the interviewer may cause inaccurate responses: in a survey where female respondents from rural areas of India need to answer questions related to the use of contraceptives or other birth control measures, they will give answers more comfortably if the interviewer is also a female. Finally, as Kothari (1990) pointed out, respondents are sometimes reluctant to admit their inability to answer a particular question because of lack of knowledge or ignorance; as a result, they provide inaccurate information based on guesses. Thus, an effective primary survey requires a good survey questionnaire that is likely to reduce non-response bias and measurement error. The interviewers must also be trained in how to ask the questions and how to explain the importance of the survey to the respondents. They should also be trained in the accurate recording of the responses. Sometimes tangible rewards may be offered to the respondents after completion of the interview as a token of gratitude. For a mail survey or email survey, a lucid and concise questionnaire can reduce non-response and measurement error. Next, we discuss the qualities of an effective survey instrument and the types of questions that can be included in the questionnaire.
Features of a good questionnaire

• The questions can be divided into three broad categories: 'must-know' questions (the questions most important and relevant for the study), 'useful-to-know' questions (questions that help to understand different features related to the research question) and 'nice-to-know' questions (questions that may not be directly related to the research question). A good questionnaire must be short, so a researcher should try to avoid 'nice-to-know' questions to keep the length of the questionnaire down.
• Questions must be asked in lucid language, and they must be clear and concise so that a respondent can comprehend them easily. For example, in a household survey a question such as 'Do you own a house?' may have a YES or NO answer depending on what the respondent understands by it.
• The researcher must avoid double-barrelled questions, as they may confuse the respondent. For example, the researcher should not ask: 'Do you think economics as a subject is very mathematical and relies on intuitive logic?' These are actually two questions: 'Do you think economics is very mathematical?' and 'Do you think economics relies on intuitive logic?' Such a question cannot be answered with a single response and may create confusion.
• Questions related to a particular topic should be grouped together.
• The ordering of questions is also very important. The easy questions must be asked at the beginning, followed by the 'must-know' questions. Sensitive questions should be asked at the end of the survey, to avoid respondents ending the interview before relevant information is collected.
• The researcher must be aware of social factors that may influence responses while conducting the survey.
• The researcher must not ask questions on topics that are socially or culturally prohibited.
• Questionnaires must be clearly printed. It is better to have the questionnaire in booklet form if it contains more than two pages, to avoid loss of pages.
Closed- and open-ended questions

Generally, in a survey questions can be either open-ended or closed-ended. In the case of open-ended questions, the respondent can give more than just a one- or two-word response. To answer these questions the respondent can express his or her own views. As a result, a new dimension of the research problem may be revealed which was not initially thought of at the time of survey design. Open-ended questions are necessary when the researcher wants to know certain information like the asset level of the family in nominal terms, the total size of family land holdings in acres or annual family disposable income in cash, because there can be a very large number of possible responses and the researcher may be unaware of all of them. However, sometimes the open-ended answers are qualitative in nature; in these cases, the researcher may face difficulties in coding or measuring the responses.

Under closed-ended questions, the respondent has to choose from a set of alternatives. These are also called forced-choice questions. In this case, the respondents do not have to articulate the answer; they only choose their preferred alternative from the list of possible answers. From the point of view of data tabulation, this method is very helpful for coding the data. However, there are certain drawbacks of this method. Firstly, the researcher has to think of all possible responses to a question. Secondly, this method does not allow the respondent to give some answer other than the suggested ones. As a result, he/she may indicate a choice that is not his/her true answer. However, this problem can be avoided by using the following form of question, which includes some choices as well as allowing respondents to express their own views if these differ from the indicated answers. For example, consider the following question:

What is the basic fuel used by the household for cooking purposes?
a) LPG
b) kerosene
c) coal
d) wood
e) electricity
f) other: ______________________

In this case, the respondents are allowed to give their own answer if it is different from the indicated answers.
Different types of closed-ended questions

Binary-dichotomous questions: These are generally yes-no type questions. For example:

Do you have a smart phone?
a) yes
b) no

Response over a continuum: Here the respondents have to select from a continuum of choices. Example:

How important do you feel the type of fuel is to indoor air pollution?
a) very important
b) important
c) somewhat important
d) not important
e) not sure
Branching: In this case, not all questions are asked of everybody. There are branches in the questions that will be answered only by the relevant groups. Example:

1 Do you operate a bank account at least once a month?
  a) no (go to question 3)
  b) yes

2 Do you operate the bank account by _____
  a) visiting the branch
  b) through ATM
  c) Internet banking
  d) mobile banking
  e) through banking correspondents
  f) any other response: _________________________

Rank-order: In this case the respondent has to rank the alternatives in order of importance. For example: The following are characteristics of a teacher. Please order them in terms of importance from 1 (most important) to 5 (least important).

_____ honest
_____ the ability to develop relationships with students
_____ patient and caring
_____ dedication to teaching
_____ encouraging students in learning

Likert scales: Respondents are asked to rate items on a response scale to assess agreement. For example: Do you think Kolkata police acted efficiently to control crimes during puja?

• strongly agree (SA)
• agree (A)
• neutral (N)
• disagree (D)
• strongly disagree (SD)
4.3 Conducting a pilot survey

Before conducting a full-scale survey, a researcher must conduct a pilot survey over a small number of representative respondents for pre-testing of the draft questionnaire. This will help the researcher to identify the questions that need revision to increase the rate of response. It also helps in understanding whether the questions are in the proper order, whether they are comprehensible to the respondents and whether there is a need to add or eliminate questions. The pilot survey is also a way to train the field investigators. The final questionnaire can only be designed after completion of the pilot survey.
5 Concluding remarks: data cleaning

After the data collection, the researcher needs to tabulate the data using software such as MS Excel, Stata or SPSS. This is followed by data cleaning. Data cleaning helps to reduce responses with errors, missing values and other anomalies, and thus reduces the possibility of model misspecification and errors in the estimation process. Any data analysis or testing of the research hypothesis should be undertaken only after the data cleaning process. Errors in data can arise for various reasons. Firstly, suppose that while coding the 'sex of the respondent' the enumerator should enter 0 if female or 1 if male; because of a data entry error, the recorded value may become 2. Secondly, respondents may choose wrong answer codes that are not present among the indicated answers. Thirdly, there may be some missing data, but not all of it may be of the same type. Suppose a respondent is asked the following question:

How many hours do you spend watching television in a day?
a) do not watch at all
b) less than 1 hour
c) 1–2 hours
d) 2–3 hours
e) 3–4 hours
f) more than 4 hours
g) do not know
h) undecided
i) refuse to answer

Missing data can occur if the respondent opts for answer (a), (g), (h) or (i). However, these cases represent different situations, so during enumeration they must be coded differently for better results. Finally, there can be problems of extreme values or outliers in the data. Elimination of outliers improves the quality of the data. There are two types of outlier detection methods: the univariate and the multivariate outlier detection methods. Univariate outliers are data points with extremely high values for a single variable. The boxplot method is a simple way of eliminating extreme values from the data. It is based on the distribution of data around the median and between the first and third quartiles, as depicted in Figure 8.1, where the shaded region depicts the data points lying between the first quartile (Q1) and the third quartile (Q3) and the line in the box gives the median value.
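As a sketch of the checks described above, the following Python snippet (using the pandas library; the column names and answer codes are invented for illustration) flags impossible codes and keeps the different kinds of missing answers distinct.

```python
import pandas as pd

df = pd.DataFrame({
    "sex": [0, 1, 2, 1, 0],       # 0 = female, 1 = male; 2 is an entry error
    "tv_hours": [1, 5, 2, 9, 4],  # valid answer codes are 1..9 (a..i)
})

# Flag codes outside the permitted range instead of silently dropping rows.
bad_sex = df[~df["sex"].isin([0, 1])]
print("invalid sex codes:\n", bad_sex)

# Keep the different kinds of 'missing' answers (does not watch, do not
# know, undecided, refused) distinct rather than collapsing them into one.
missing_codes = {1: "does not watch", 7: "do not know",
                 8: "undecided", 9: "refused"}
df["tv_missing_type"] = df["tv_hours"].map(missing_codes)
print(df)
```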
FIGURE 8.1 Box plot diagram. The shaded box runs from the first quartile (Q1) to the third quartile (Q3), with the line inside the box marking the median; points beyond the fences Q3 + 1.5(Q3 – Q1) and Q1 – 1.5(Q3 – Q1) are extreme outliers.
The upper fence is defined as Q3 + 1.5(Q3 – Q1) and the lower fence as Q1 – 1.5(Q3 – Q1). The data points lying beyond the upper and lower fences are considered extreme outliers and should be eliminated to improve the quality of the data. Multivariate outliers are data points that have an unusual or extreme combination of values for a number of variables. The value of any single variable may not be a univariate outlier, but in combination with other variables the data point behaves abnormally. Multivariate outliers can be detected using the Mahalanobis distance or Cook's distance, the discussion of which is, however, beyond the scope of this chapter. Interested readers may refer to Aggarwal (2013), Cook (1977) and Cook and Weisberg (1982).
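A minimal sketch of this fence rule, using only the Python standard library on an invented toy series:

```python
import statistics

x = [12, 14, 15, 15, 16, 17, 18, 19, 21, 95]   # toy data with one extreme value

q1, _, q3 = statistics.quantiles(x, n=4)        # first and third quartiles
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points beyond the fences are treated as extreme outliers.
outliers = [v for v in x if v < lower_fence or v > upper_fence]
cleaned = [v for v in x if lower_fence <= v <= upper_fence]
print("fences:", lower_fence, upper_fence, "outliers:", outliers)
```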
Notes

1 This was revealed during the Nuremberg War Crimes Trial following World War II.
2 https://en.wikipedia.org/wiki/Nuremberg_Code
References

Aggarwal, C. 2013. Outlier Analysis. Springer. ISBN 978-1-4614-6396-2.
Battaglia, M. P. 2008a. “EPSEM Sample”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications: Thousand Oaks, CA. doi: http://dx.doi.org/10.4135/9781412963947.n161
Battaglia, M. P. 2008b. “Nonprobability Sampling”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications: Thousand Oaks, CA.
Blackstone, A. 2014. Principles of Sociological Inquiry: Qualitative and Quantitative Methods. Saylor.org: Washington, DC.
Cook, R. D. 1977. “Detection of Influential Observations in Linear Regression”. Technometrics, American Statistical Association, Vol. 19 (1), 15–18.
Cook, R. D. and S. Weisberg. 1982. Residuals and Influence in Regression. Chapman & Hall: New York, NY.
Cox, B. G. 2008. “Systematic Sampling”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications: Thousand Oaks, CA. doi: http://dx.doi.org/10.4135/9781412963947.n167
Daniel, J. 2011. Sampling Essentials: Practical Guidelines for Making Sampling Choices. Sage Publications: Thousand Oaks, CA.
Eichner, A. S. 1985. “Towards an Empirically Valid Economics”. Eastern Economic Journal, Vol. 11 (4), 437–449.
Fahimi, M. 2008. “Cluster Sampling”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications: Thousand Oaks, CA. doi: http://dx.doi.org/10.4135/9781412963947.n167
Fink, A. 1995. How to Sample in Surveys, Volume 6. Sage Publications: London.
Hogan, S. O. 2008. “Voluntary Participation”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications: Thousand Oaks, CA.
Kalsbeek, W. D. 2008. “Stratified Sampling”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications: Thousand Oaks, CA. doi: http://dx.doi.org/10.4135/9781412963947.n167
Kothari, C. R. 1990. Research Methodology: Methods and Techniques. New Age International Publishers: New Delhi.
Krosnick, J. A. and S. Presser. 2010. “Question and Questionnaire Design”. In: P. V. Marsden and J. D. Wright, eds., Handbook of Survey Research, 2nd Edition. Emerald: Bingley, UK.
National Institutes of Health. 1979. “National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research”. The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research.
Trobia, A. 2008. “Sampling”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications. doi: http://dx.doi.org/10.4135/9781412963947.n508
Valerio, M. A. and T. Mainieri. 2008. “Ethical Principles”. In: P. J. Lavrakas, ed., Encyclopedia of Survey Research Methods. Sage Publications. doi: http://dx.doi.org/10.4135/9781412963947.n167
9
SAMPLING METHODS
A survey

Malabika Roy
Introduction

Since there are plenty of books and surveys on methods of sampling, a new survey needs some justification. The present survey is intended for an interdisciplinary audience with different levels of exposure to statistical and mathematical techniques. So my aim here is to keep the mathematics to a minimum and relegate the derivations to the appendix whenever possible, and even there at a very basic level. The idea is to explain the basic sampling techniques in simple language and point out when and where to use them. In other words, the main purpose of the present survey is to focus on the applications of the different sampling techniques rather than on their mathematical and statistical properties, though these are not totally ignored. However, a basic knowledge of probability theory will come in useful in understanding the ideas.

There are certain research questions which need an empirical answer. Also, theoretical hypotheses sometimes need to be tested against real world observations. In all such situations data needs to be generated and/or collected to find an answer to the research question. Put simply, the entire literature on sampling methods is a search for the best way to collect primary data from the field to address such questions. In explaining the applicability of different sampling techniques, I have also borrowed from surveys conducted by us on various occasions. If the researcher understands the basic techniques of sampling, she can explore the field using the more advanced literature, or she can take help from statisticians and explain her requirements more coherently. The purpose of the present chapter is to act as a stepping stone in these directions.

Another point needs to be noted. Sampling techniques are widely used methodologies. They are used in disciplines other than the social sciences, such as biology, physiology, botany, zoology, medical science etc. We confine our discussion to the social sciences, where the units of interest are either humans or institutions related to humans, so that the population is finite and each unit is distinct and identifiable; only those sampling techniques are discussed which are relevant for such samples.

Writing a literature review of the sampling methods is a daunting task, as there is a huge literature with different levels of rigour. I will attempt to give a snapshot view of a part of the literature. Most of the standard textbooks on statistics cover at least probability sampling methods, though at different depths. For instance, Goon et al. (2016) provide a very comprehensive discussion on different aspects of sampling. So in a brief literature survey, I will concentrate on books specializing in sampling techniques. Cochran (1953) is a classic text which is still relevant, as is Kish (1965). The other books can be classified into two categories: books that give a lucid exposition of different sampling techniques without delving too much into the mathematical details, and books with a detailed discussion of the mathematics involved in developing and assessing the sampling techniques. Sudman (1976), Thompson (2002) and Lohr (2009) fall in the first category. Lohr (2009) provides an extremely accessible and lucid discussion of the various issues related to probability sampling. Chaudhuri and Stenger (2005) and Mukhopadhyay (2008) belong to the second category; if the reader wants to understand the mathematical intricacies involved in probability sampling, then these two books will be very useful. Groves et al. (2008) provide a very comprehensive discussion of sampling methods, putting it within the broader context of survey research. The books mentioned mostly concentrate on probability sampling methods. While probability sampling is the most adopted method in the physical sciences, and in the discipline of economics amongst the social sciences, other social science streams like sociology find non-random sampling methods to be more useful in addressing their research questions. Unfortunately, these techniques are not covered adequately in standard textbooks. Hibberts et al. (2012) cover non-probability sampling as well as probability sampling methods in their survey, and Sudman (1976) covers the snowball sampling technique. The present chapter is also an attempt to include a discussion of non-probability sampling techniques in a survey of sampling techniques.

The chapter is organized as follows. Section 1 discusses the necessity of sampling and explains some concepts relating to sampling. Section 2 explains various methods of probability sampling. Section 3 explains multi-stage sampling with an example. Section 4 discusses the various methods of non-probability sampling. Section 5 concludes.
1 Sampling basics

The first question that needs to be addressed is: why do we need to know about sampling at all? In other words, when and why do we go for sampling? The issue of sampling comes in when we want to describe certain characteristics of a population statistically, or when we try to explain some such characteristics on the basis of other characteristics.
For instance, we may want to know the average income of a population and the dispersion in that income. In another instance we may want to know the biometric characteristics of a population, like average height and average weight. Or the research question may involve establishing relationships between two or more variables: we may want to know how nutrition, measured in calorie intake, contributes to the growth of a child, measured in height and weight; or we may want to know how the mother's education contributes to the child's education. All such questions can only be answered by analysing data collected from the relevant population through surveys, which have already been discussed in the earlier chapter.

Now the data can be collected in two ways: complete enumeration and sampling. The data analysis would of course be most accurate if we could collect data from each individual unit of the population, or in other words if we could cover the entire population in our survey. This method of data collection is known as 'complete enumeration' or 'census'. However, data collection is a costly process. You need field investigators to collect the data and supervisors to check the authenticity of the data. There are costs involved in data entry and data cleaning, all of which require either time or money, and most often both. So the cost of data collection is in terms of expenditure as well as in terms of time spent. For a large population this cost may not be negligible: the larger the population, the larger is the cost of data collection through complete enumeration. Sometimes the cost can be so large that complete enumeration or census becomes infeasible. Think of a survey that involves the entire population of India or China. Such a survey would involve such a humongous cost in terms of money and time that the process of complete enumeration becomes infeasible or impossible. In such a situation we take recourse to the second method of data collection, namely the sample survey.

Sampling methods involve ways of selecting a part of the population and using this part as a representative of the entire population. In other words, inferences are made about the entire population based on the estimates obtained from the samples. Suppose the research question involves comparing the economic prosperity of Gujarat with that of West Bengal, and for that purpose the average income of Gujarat needs to be compared to the average income of West Bengal. Now a complete enumeration of the two states is very costly. So we select a 'representative sample' for Gujarat and for West Bengal using an appropriate sampling technique. Then we compare the average incomes of these two representative samples and make inferences about the two populations' incomes based on the samples. In other words, if we think of the population as a set, a sampling technique is a methodology to choose an appropriate subset from this set; population (set) attributes are then calculated/estimated using this subset. So our target is to select a representative sample, such that the sample is a miniature version of our population. The better the representativeness of the sample, the more accurately the estimates based on the sample will represent the population values. In the next subsection, certain basic concepts related to sampling are introduced.
1.1 Basic definitions

The research question that we want to address defines our 'target population'. The target population is an aggregation of all units whose characteristics we want to study. To continue with the earlier example, if we want to compare the average income of Gujarat with the average income of West Bengal, then our 'observation unit' is each individual who earns some income in the two states, and the target population is the collection of all such individuals. The constitution of the target population will depend on the research question. The target population needs to be clearly defined in several dimensions.

First, the researcher needs to be very clear about the unit of analysis. An 'observation unit' or 'unit of analysis' is any unit from which the data/information can potentially be collected. Depending on the research question, the unit can be an individual, a household or a portion of the household, a business unit, a school or a hospital. In a consumer survey, a household may be our unit of analysis, and the 'target population' will be the collection of all observation units. If the research question is the impact of mother's education on children's education, our unit of analysis is each adult individual and his/her mother, and our target population will consist of all such pairs.

Second, the target population needs to be identified in terms of space and time as well. If the researcher wants to restrict her study within the districts of West Bengal, then the target population is confined within West Bengal, whereas if she wants to extend her study over the whole of India, her target population will immediately change. In a similar vein, each empirical study is confined to a specific time frame. In the example used, a cross section study1 examining the relation between mother's education and children's education may be carried out in West Bengal (space) for the year 2017 (time frame).

Identifying the target population can get difficult. Think about this particular example: in a political poll, what should be your target population: all adults? Registered voters? Voters who actually voted in the last election?2 Here is another example. Suppose the research question is: what factors determine the marriageable age of a girl child in a family? To address this question your target population can be any of the following: families with daughters in the 'marriageable' age group, families with married women, families with married women within a certain age group or families where a daughter has recently been married.

A 'sampling unit' is a unit of observation from which the data/information is actually collected. When data is collected through interviews, the sampling unit is called the 'respondent'. However, in many cases a respondent and a sampling unit may be a little different. For example, in a household survey the respondent may be the household head, though the sampling unit is the household. Again, in a survey involving very young children, the sampling units may be the children but the respondents can be the guardians or school teachers. A 'sample' is the collection of all sampling units, or that subset of the target population on which the data analysis will finally be based. The 'sampling population' is the population from which the sample is finally chosen. The sampling population can be the target population itself, or a subset thereof if appropriate information is not available on all observation units or if some observation units need to be discarded for one reason or another.

To carry out proper sampling, creating a 'sampling frame' is very important. A sampling frame is a complete enumeration of the sampling population/target population from which the sample will finally be chosen. It can be a directory, a Panchayat or municipality list, a voters' list, a street directory or a map – anything that contains a listing of the entire sampling population. A sampling frame should have the following desirable properties. First, it should be a complete list of the sampling population in which every unit is present. For example, in a consumer survey, if the telephone directory is used as a sampling frame, then we are leaving out all those households (observation units) who do not own a telephone. Second, all units should have a logical, numerical and distinct identifier in the list. For example, in a voters' list each household has a household number. Third, contact information for each unit should be present in the list. Fourth, the sampling frame should be organized in a logical and systematic fashion so that no unit appears more than once in the list; in other words, no unit is repeated. Fifth, no unit outside the sampling population should be present in the sampling frame. For example, a sampling frame for a voters' opinion poll should not include any minors. Last but not least, the sampling frame should ideally be a list that is up to date.

Sampling is of course a less costly procedure, but the sample collected is always, at best, an approximation of the population. As a result, different types of biases or errors creep into the sampling process, and that is what we are going to discuss in the next section.
1.2 Sampling and non-sampling errors

Surveys and sampling are subject to errors. The errors can be divided into two parts: non-sampling errors and sampling errors. To quote Wikipedia:

In statistics, non-sampling error is a catch-all term for the deviations of estimates from their true values that are not a function of the sample chosen, including various systematic errors and random errors that are not due to sampling.

In other words, even if we went for complete enumeration, non-sampling errors would still be there. Tabulation biases, which occur because of mistakes in tabulating the data, can also be a source of non-sampling error, as the process is tedious and exacting. There are two broad sources of sampling errors: selection bias and measurement bias.3 Selection bias occurs mainly because of misspecification of the target population or the sampling frame. It can be due to undercoverage, that is, missing out part of the target population, or overcoverage, that is, including observational units in the survey that are not part of the sampling frame and should have been screened out.
Selection bias can also occur if different sections of the sampling frame are sampled at different rates. Measurement errors occur basically because of a faulty process of data collection. One of the major sources of measurement error is the respondent not revealing true information; this happens with sensitive questions about income, political affiliation etc. The other sources are the respondent not understanding or misunderstanding the question; the respondent having poor memory recall; the interviewer's own opinions or prejudices eliciting a biased response; and the respondent giving only socially acceptable responses.4 Certain words may also convey different things to different people.

Another important issue is the choice of sample size: what should the optimal sample size be? There are two considerations involved here: the larger the sample size, the more efficient and less biased are the statistical estimates of the population parameters, and the more representative is the sample; but collecting larger samples also involves larger operational costs. So the optimal sample size is obtained through a tradeoff between the costs of collecting a larger sample and the benefits from a larger sample. Without going into the technicalities, the researcher should go for a larger sample in the following situations: when higher precision and higher accuracy are required; when the population is heterogeneous; and when the researcher is planning to conduct complex statistical analysis. The sample size should also be larger in the case of cluster sampling, but it can be smaller for stratified sampling, as we will discuss later.

Generally speaking, sampling has become the preferred method of collecting primary data for most researchers. It is important that the sample be representative of the population. It is also important that statistical analysis of the sample should provide answers to the research questions for which the sample survey was carried out in the first place. These two requirements have given rise to various methods of sampling, which can broadly be divided into two categories: probability sampling methods and non-probability sampling methods.
2 Probability sampling

Probability sampling covers a broad category of sampling techniques where the common thread is that each unit in the sampling frame has a given predetermined probability of being selected. That is, to avoid systematic biases in the selection process, we introduce random probabilistic choice into sample selection. Let us take a hypothetical example.5 Let the sampling frame consist of 5 units, U = (1, 2, 3, 4, 5). Samples of size 4, i.e. consisting of four units, are chosen from the population in such a way that each sample is distinct, that is, each sample has at least one element different from the other samples. There are five such possible samples (5C4). Table 9.1 details all possible samples of size 4. We pre-assign a probability to each sample being chosen; the third column of the table gives this probability. The samples can be chosen with the pre-specified probabilities by the following sample selection process. Suppose we put five cards in a box, two cards marked with 1 and one card each marked with 2, 3 and 5.
TABLE 9.1 Example of probability sampling

Sample   Sample Composition   Probability (P(Si))
S1       (1,2,3,4)            2/5
S2       (1,2,3,5)            1/5
S3       (1,3,4,5)            1/5
S4       (2,3,4,5)            0
S5       (1,2,4,5)            1/5

Source: Adopted from Lohr (2009).
Then if we randomly choose a card from the box, each of the samples will have a probability of being chosen which is exactly equal to the one given in column 3 of Table 9.1. Now, applying the additive law of probability, we can calculate the probability of each individual unit being chosen:

P(1) = P(S1) + P(S2) + P(S3) + P(S5) = 2/5 + 1/5 + 1/5 + 1/5 = 1
P(2) = P(S1) + P(S2) + P(S4) + P(S5) = 2/5 + 1/5 + 0 + 1/5 = 4/5
P(3) = P(S1) + P(S2) + P(S3) + P(S4) = 2/5 + 1/5 + 1/5 + 0 = 4/5
P(4) = P(S1) + P(S3) + P(S4) + P(S5) = 2/5 + 1/5 + 0 + 1/5 = 4/5
P(5) = P(S2) + P(S3) + P(S4) + P(S5) = 1/5 + 1/5 + 0 + 1/5 = 3/5

Thus each unit here has a predetermined probability of being included in the sample. One of the main reasons for sampling is to statistically characterize certain features of the population which are statistically measurable. To this end, we estimate certain statistical features of the population called population parameters. The most important population parameters are the measures of central tendency (mean, median, mode), proportions and the measures of dispersion (range, standard deviation and variance). Sometimes skewness and kurtosis are also important for understanding the nature (frequency distribution) of the population.6 The population parameters are given constants, since they are based on all units of the target population. However, when we estimate the same parameters from samples (the estimates are called statistics or estimators), they obviously become variable across samples. So one important test of a good sampling method is how well the sample values can capture or approximate the population values; more on this later. Figure 9.1 gives a pictorial classification of different sampling methods. The left-hand part of it describes the different methods of probability sampling. Keeping in mind the established background, we will now discuss them one by one.
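These inclusion probabilities can be checked by simulating the card-drawing scheme; the sketch below uses only the Python standard library, and the loop count is arbitrary.

```python
import random
from collections import Counter

# Cards marked 1, 1, 2, 3, 5 select samples S1, S2, S3, S5 with
# probabilities 2/5, 1/5, 1/5, 1/5; S4 has probability 0.
samples = {1: (1, 2, 3, 4), 2: (1, 2, 3, 5), 3: (1, 3, 4, 5), 5: (1, 2, 4, 5)}
cards = [1, 1, 2, 3, 5]

counts = Counter()
draws = 100_000
for _ in range(draws):
    for unit in samples[random.choice(cards)]:
        counts[unit] += 1

# Relative frequencies should approach P(1)=1, P(2)=P(3)=P(4)=4/5, P(5)=3/5.
for unit in sorted(counts):
    print(unit, round(counts[unit] / draws, 3))
```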
FIGURE 9.1 Different methods of sampling. The figure shows a classification tree: probability sampling branches into SRS (with replacement and without replacement), stratified sampling, systematic sampling and cluster sampling (single stage and multistage); non-probability sampling branches into purposive sampling, quota sampling, convenience sampling and snowball sampling.

2.1 Simple random sampling (SRS)

Simple random sampling is a very important sampling technique, which forms the basis of other, more complicated sampling techniques. The basic feature of simple random sampling is that each sample has an equal probability of being chosen. Simple random sampling can be conducted in two ways: with replacement and without replacement.
2.1.1 Simple random sampling with replacement (SRSR)

Suppose we are to choose a sample (1, 2, 3, . . ., n) of size n from a population (1, 2, 3, . . ., N) of size N > n, where the measured values are (X1, . . ., Xn). That is, our sampling frame consists of N units, from which we have to choose a random sample of n units. Under simple random sampling with replacement, we 'randomly' choose one unit from the sampling frame, each time replacing the unit in the sampling pool, until we have chosen n units. The simplest way to understand the process of 'random' choice is the following: you are given a bowl with N cards, and on each card one number from 1 to N is written. The cards are well mixed up by shaking the bowl. You choose a single card from the bowl, note the number and then replace the card in the bowl. This process continues until n numbers are selected. In SRSR each observation unit has a probability of 1/N of being chosen in each specific draw, and the probability of a unit being chosen in a particular draw, say the i-th draw, is independent of it being chosen in any earlier draw. That is, each trial is independent of the other trials. Let P(i) be the probability of the unit being chosen in the i-th draw. Then the expected number of times it is chosen in n draws is P(1) + P(2) + P(3) + . . . + P(n) = 1/N + 1/N + . . . + 1/N = n/N. (Note that the draws are not mutually exclusive events, so the probability of the unit being chosen at least once is 1 – (1 – 1/N)^n, not the simple sum.) Since the first element can be chosen in N independent ways, and given the first element, the second element can again be chosen in N independent ways, and so on up to the nth element, a sample of size n can be chosen in N^n ways. Hence, the probability of a particular sample being chosen is 1/N^n. Let us take an example. Suppose Table 9.2 describes the population (N = 4):

TABLE 9.2 Example of SRSR

Serial   Population Values
1        20
2        12
3        25
4        16

Population mean: 18.25
Source: Author's construction.
From this population we choose all possible samples of size 2 (n = 2) using SRSR. Table 9.3 describes all possible samples.
TABLE 9.3 All possible samples

Sample   Values    Sample Mean   Probability
(1,1)    (20,20)   20            1/16
(1,2)    (20,12)   16            1/16
(1,3)    (20,25)   22.5          1/16
(1,4)    (20,16)   18            1/16
(2,1)    (12,20)   16            1/16
(2,2)    (12,12)   12            1/16
(2,3)    (12,25)   18.5          1/16
(2,4)    (12,16)   14            1/16
(3,1)    (25,20)   22.5          1/16
(3,2)    (25,12)   18.5          1/16
(3,3)    (25,25)   25            1/16
(3,4)    (25,16)   20.5          1/16
(4,1)    (16,20)   18            1/16
(4,2)    (16,12)   14            1/16
(4,3)    (16,25)   20.5          1/16
(4,4)    (16,16)   16            1/16

Source: Author's construction.

2.1.2 Simple random sampling without replacement (SRSWR)
The purpose of sampling is to get as much information about the target population as possible. Repetitive samples do not yield additional information, so a preferred method of sampling, especially in social science, is simple random sampling without replacement (SRSWR). To continue with our earlier example, we are again randomly drawing n cards from a bowl consisting of N cards, but each time a particular card is drawn, it is not replaced in the bowl – this is simple random sampling without replacement. In other words, in simple random sampling without replacement the units are chosen in such a way that a particular unit is included in a sample only once. So under SRSWR, every possible subset of n distinct units has the same probability of being selected. In SRSWR also, the probability of a particular unit i being chosen in any particular draw is again 1/N. The logic works in the following way. The probability that a particular unit i will be chosen in the first draw is 1/N. The probability that it will be chosen in the second draw is ((N – 1)/N) × (1/(N – 1)) = 1/N, that is, the probability that it is not chosen in the first draw multiplied by the probability that it is chosen in the second draw, when N – 1 units are left. In the same way, the probability that the i-th unit will be chosen in the third draw is ((N – 1)/N) × ((N – 2)/(N – 1)) × (1/(N – 2)) = 1/N, and so on, ending with the probability that the unit will be chosen in the nth draw as ((N – 1)/N) × ((N – 2)/(N – 1)) × . . . × ((N – n + 1)/(N – n + 2)) × (1/(N – n + 1)) = 1/N. Since n distinct units can be chosen in NCn ways from N units, the probability that a specific sample of size n will be chosen is 1/NCn.7 Again let us take an example. If a sample of size 2 is drawn from the hypothetical population given in Table 9.2 in such a way that each sample is distinct, then 4C2 such samples can be selected, each with probability 1/4C2. Table 9.4 describes all the possible samples, the sample means and the probability with which each sample occurs.

TABLE 9.4 All possible samples with probability and mean

Sample   Sample Values   Sample Mean   Probability
(1,2)    (20,12)         16            1/6
(1,3)    (20,25)         22.5          1/6
(1,4)    (20,16)         18            1/6
(2,3)    (12,25)         18.5          1/6
(2,4)    (12,16)         14            1/6
(3,4)    (25,16)         20.5          1/6

Source: Author's construction.

In practice, random samples used to be generated using random number tables or random number generators. Thompson (2002) explains in detail how random number tables are used to draw random samples from a population. Nowadays most statistical or econometric software packages can generate a random sample from a sampling frame.

Simple random sampling offers several crucial advantages. First, since each unit has an equal probability of being selected, the resulting sample is representative of the population in the sense that no particular unit has a higher chance of being chosen over another unit. Secondly, the samples have some very good statistical properties, so that parameter estimators can satisfy some desirable statistical properties. One such desirable property is unbiasedness of the estimators. This property is discussed in greater detail in Appendix 1. The most important parameters that are estimated from samples are the population mean, population variance and population proportion. These estimators under SRS are also discussed in some detail in Appendix 1. However, like all good things, the SRS method also suffers from some drawbacks. Since the samples are a random selection from the sampling frame, the sampling frame has to be very well defined; otherwise, simple random sampling will not yield the desired statistical properties for the estimators and the advantage will be gone. In a similar vein, if the sampling frame is inordinately large, selecting a random sample can again be a problem. Finally, if a minority group represents a very small section of the population, it can be underrepresented in a simple random sample. So simple random sampling is usually combined with more complicated forms of sampling, to which we now turn in the next subsection.
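The unbiasedness property can be illustrated with the toy population of Table 9.2: averaging the sample means of all six without-replacement samples in Table 9.4 recovers the population mean exactly. A minimal check in Python:

```python
from itertools import combinations
from statistics import mean

population = [20, 12, 25, 16]            # the toy population of Table 9.2

# All 4C2 = 6 equally likely samples of size 2 and their means.
sample_means = [mean(s) for s in combinations(population, 2)]
print(sample_means)                       # 16, 22.5, 18, 18.5, 14, 20.5

# The average of the sample means equals the population mean (18.25),
# illustrating that the sample mean is an unbiased estimator under SRS.
print(mean(sample_means), mean(population))
```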
2.2 Stratified sampling

Suppose we are conducting a sample survey in a village on the determinants of the marriageable age of girl children. You have prior knowledge that this is very likely to depend on the mother's education level. However, if you conduct simple random sampling, the mothers who have higher education (say, those who have completed graduation) may be left out, because they are a minority in the village. So to make sure that all education categories are adequately represented in the sample, a better strategy is to divide the sampling frame into different non-overlapping classes (strata) according to mothers' education level and then carry out random sampling within each class. This, in essence, is 'stratified sampling'. More generally, suppose again we want to draw a sample of size n from a population of size N. But this time the population is divided into k distinct and disjoint groups/sub-populations (N1 + N2 + . . . + Nk = N), such that the population parameter values are distinctly different across the k groups. In other words, the k groups are homogeneous within, but there is considerable heterogeneity across groups. In such a scenario, instead of conducting a simple random sample survey, a better approach is to divide the sampling frame into k mutually exclusive strata and draw a sample of size ni from the i-th stratum using an appropriate sampling technique (simple random sampling in most cases). The sample size ni from each stratum should be such that n1 + n2 + . . . + nk = n. This method of sampling is called stratified sampling.

Stratified sampling should be the preferred method of sampling in the following situations. First, stratified sampling may be used as a protection against bad samples when certain sub-populations (minority groups) have a lesser presence in the population than others. In such a situation, there is a chance that the minorities may have inadequate presence in the sample under simple random sampling. The example we started with is a case in point. Second, each stratum may be a separate domain of study. For example, if we are studying the job profiles of students with technical education, we may want to study this aspect for the female students and male students separately. In that case it is better to divide the students into two strata: male and female. There is another advantage to this stratification: samples can be drawn at different rates from different strata for better precision, instead of applying the population proportion. Third, stratified sampling may be more convenient to execute and may lower the cost of the sample survey, because we can use different methods of information collection for different strata. For example, in an opinion poll, the sampling frame can be divided into two strata: individuals with an email address and those without. Then those with email addresses can be sent a questionnaire through email, and those without email access can be interviewed personally. Different sampling methods may be required for different strata for the following reasons as well: when the physical distribution of the population varies across different strata; when different lists are available or preferred for different parts of the population, such as in the context of a survey that covers both rural and urban populations, requiring municipality lists for urban areas and Panchayat lists for rural areas; or when the nature of the population is diverse across different groups, such as in a sample survey of workers, where people with white collar jobs may respond better to a written questionnaire, while those with blue collar jobs may give better information in personal interviews. Finally, stratified sampling may give better estimation of population parameters by lowering the standard errors of the estimators,8 as the population variance within a group is lower than the population variance in the entire population.

There are some problems with stratified sampling as well. For instance, even when we know that the population is divided into strata, we may not be able to clearly identify each stratum. A second issue is how to determine the sample size from each stratum. Several rules are followed here. Usually, the sample size for each stratum is determined by the population proportion, that is, ni = n(Ni/N). For example, in a population of 2,500, if there are 500 females and 2,000 males, then a sample of size 1,000 stratified into male and female would require 200 females and 800 males to be surveyed. This method of allocation is best when we want the sample distribution across strata to mimic or mirror the population distribution across strata. But this rule can be broken for better precision in special circumstances, as discussed earlier. In some situations, the sample size from each stratum is chosen according to the variability of the parameter that we want to estimate (for examples see Lohr [2009]). Finally, in certain situations samples of equal size are chosen, when the requirement is that all strata should be equally represented in the sample irrespective of the population proportions. The estimators under stratified sampling are discussed in Appendix 2.
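A minimal sketch of proportional allocation, assuming an invented frame keyed by mother's education level (the stratum names and sizes are illustrative only):

```python
import random

# Hypothetical frame: unit IDs grouped into strata by mother's education.
frame = {
    "no schooling": list(range(1, 1501)),      # N_1 = 1,500 units
    "secondary":    list(range(1501, 2401)),   # N_2 =   900 units
    "graduate":     list(range(2401, 2501)),   # N_3 =   100 units
}
N = sum(len(units) for units in frame.values())
n = 250                                        # total sample size

sample = {}
for stratum, units in frame.items():
    n_i = round(n * len(units) / N)            # n_i = n * (N_i / N)
    sample[stratum] = random.sample(units, n_i)

# Expected allocation: 150 'no schooling', 90 'secondary', 10 'graduate'.
print({s: len(v) for s, v in sample.items()})
```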
2.3 Cluster sampling

Often we find that the target population is divided into unique non-overlapping groups or clusters. Often such clusters are groups that occur naturally: Panchayats, villages, municipalities, house blocks, schools, geographical units, plots of land etc. The method of cluster sampling is sometimes adopted in such situations. In cluster sampling, the clusters are the 'primary sampling units', in the sense that the clusters are chosen using some appropriate sampling technique, say simple random sampling without replacement. Then information/data is collected from all the units within each chosen cluster, called 'the secondary units'.

The sample selection process will become clearer with an example. Suppose you want to study the provision of toilet facilities under the Swachh Bharat Mission, and your target population consists of 200 villages with 100 households in each village. Thus the target population is of size 20,000 households. Suppose we want to draw a sample consisting of 5,000 households. We can select the sample using simple random sampling from the 20,000 households across the 200 villages. Or we can choose 50 villages from the 200 villages (primary units) using simple random sampling, include all the households (secondary units) from these 50 villages in the sample and collect information/data from these secondary units. This is cluster sampling.

More generally, let there be a target population of size N, divided into k clusters (C1, C2, . . ., Ck) with (p1, p2, . . ., pk) secondary units in each cluster, such that

$$\sum_{i=1}^{k} p_i = N$$

A sample of size n is to be chosen from the target population. Under cluster sampling, a subset S of (C1, C2, . . ., Ck) is chosen in such a way that

$$\sum_{i \in S} p_i = n$$

where pi is the number of secondary units in Ci. Cluster sampling can further be divided into two subgroups: one stage cluster sampling and two stage cluster sampling. In one stage cluster sampling, the primary units are chosen using an appropriate sampling technique like SRS, and then all the secondary units belonging to the chosen primary units are included in the sample. The examples we have discussed so far are all examples of one stage cluster sampling. In two stage cluster sampling, in the first stage the primary units are chosen using SRS; then, in the second stage, from each chosen primary unit a sample of secondary units is again chosen using SRS, such that the required sample size is obtained. To continue with our example, instead of including all the pi secondary units in the sample, a sample of size zi is chosen from each selected cluster such that

$$\sum_{i \in S} z_i = n$$

The clusters can be of the same size or of different sizes. When the clusters are of equal size, they can be selected with equal probabilities. More often, however, the clusters are of unequal size. Then it is better to choose the clusters with unequal probabilities: a cluster of size pi will be chosen with probability pi/N, where

$$N = \sum_{i=1}^{k} p_i$$
Cluster sampling reduces the cost of sampling substantially, as the clusters are usually close-knit units. But the units within a cluster are likely to be more homogeneous, and hence the precision or information content of the estimators under cluster sampling can be less compared to simple random sampling or stratified sampling. For the same reason it may not give a representative sample. Since every secondary unit within a primary unit is included in the sample, to make samples under cluster sampling representative of the population, each cluster ideally should have the same degree of heterogeneity as the population, that is, the clusters should be representative of the population. In other words, the clusters should be heterogeneous within, with respect to the variables of interest, and there should be homogeneity across clusters, so that the estimators have lower variance and the information content of the sample is maximised.
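Both variants are easy to sketch in code. The following is an illustrative Python sketch, with made-up data mirroring the village example; cluster contents and sizes are assumptions, not the chapter's own data:

```python
import random

def one_stage_cluster_sample(clusters, n_clusters):
    """One-stage cluster sampling: SRS of clusters (primary units),
    then every secondary unit in each chosen cluster enters the sample."""
    chosen = random.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster):
    """Two-stage: SRS of clusters, then SRS of units within each cluster."""
    chosen = random.sample(clusters, n_clusters)
    return [u for c in chosen for u in random.sample(c, n_per_cluster)]

# 200 villages with 100 households each; target sample of 5,000 households.
villages = [[(v, h) for h in range(100)] for v in range(200)]
sample = one_stage_cluster_sample(villages, n_clusters=50)
print(len(sample))   # 50 villages x 100 households = 5000
```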
2.4 Systematic sampling

Suppose again we want to choose a sample of size n from a target population of size N. Under systematic sampling the target population is divided into n subsets with k = N/n elements in each group. An element Y1 is chosen using simple random sampling from the sequential list 1, 2, . . ., k (= N/n). Then every k-th element is chosen from the target population, so that the sample of size n is (Y1, Y1+k, Y1+2k, Y1+3k, . . ., Y1+(n−1)k).

Let us take an example. Let N = 20 and the target population units be (Y1, Y2, . . ., Y20). We want to draw a sample of size n = 4 from this target population using systematic sampling. So the population needs to be divided into four subsets with 5 (= 20/4) elements in each. Table 9.5 shows these arrangements in five arrays. A unit is then chosen randomly from the first array Y1, . . ., Y5. Then, starting from that element, every fifth element is chosen. For instance, if the first randomly chosen element is Y3, then the sample chosen will be (Y3, Y8, Y13, Y18). There are five such possible samples of size 4, as shown in Table 9.5.

A problem arises when N/n is not an integer. There are several ways of dealing with the situation. One is to choose a k such that nk ≤ N.
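The 1-in-k selection itself is a few lines of code. A minimal illustrative sketch, assuming, as in the Table 9.5 example, that N/n is an integer:

```python
import random

def systematic_sample(population, n):
    """1-in-k systematic sampling: pick a random start in the first
    k = N // n positions, then take every k-th unit thereafter."""
    k = len(population) // n           # assumes N/n is an integer
    start = random.randrange(k)        # random unit from the first array
    return population[start::k][:n]

units = [f"Y{i}" for i in range(1, 21)]   # N = 20
print(systematic_sample(units, n=4))      # e.g. ['Y3', 'Y8', 'Y13', 'Y18']
```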
Suppose, on the basis of a random sample of size n from a normal population with known variance σ², we test H0: μ = μ0 against H1: μ > μ0. Since large values of the sample mean favour H1, H0 is rejected if X̄ exceeds some critical value c.

But what is c? Observing that, under H0, i.e. μ = μ0,

$$z = \frac{\sqrt{n}\,(\bar{X} - \mu_0)}{\sigma}$$

has the standard normal distribution, i.e. z ∼ N(0, 1),

$$P\left[\frac{\sqrt{n}\,(\bar{X} - \mu_0)}{\sigma} > c \,\middle|\, \mu = \mu_0\right] = P[z > c] = \alpha$$

which, from standard results, implies that c = zα, where zα is the upper α-point of the z-distribution. The critical region, after simplification, is thus

$$W : \bar{X} > \mu_0 + \frac{\sigma}{\sqrt{n}}\, z_\alpha$$
Since under H1: μ > μ0, z = √n(X̄ − μ)/σ ∼ N(0, 1), the power of the test is given by

$$P\left[\bar{X} > \mu_0 + \frac{\sigma}{\sqrt{n}} z_\alpha \,\middle|\, \mu > \mu_0\right] = P\left[\frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma} > \frac{\sqrt{n}}{\sigma}(\mu_0 - \mu) + z_\alpha\right] = P\left[z > \frac{\sqrt{n}}{\sigma}(\mu_0 - \mu) + z_\alpha\right]$$

The probability can then be obtained from standard normal tables.

Numerical example: Suppose, given σ² = 256, we want to test H0: μ = 200 against H1: μ > 200 on the basis of a sample of size n = 100. Then the critical region for level α = 0.05 is given by

$$W : \bar{X} > 200 + \frac{16}{10} \times 1.64, \text{ i.e. } \bar{X} > 202.624$$
where, from standard normal tables, z0.05 = 1.64. Thus, if the sample mean turns out to be more than 202.624, we reject the assertion that the population mean is 200, and if less, we do not. The power of the test can be calculated for all alternative values μ > 200. For example, for the alternative value μ = 204, the power of the test is

$$\text{Power}(\mu = 204) = P\left[z > \frac{10}{16}(200 - 204) + 1.64\right] = P[z > -0.86] = 0.8051$$
Note that the power increases as μ moves further away from μ0.
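These critical value and power computations are easy to reproduce. The sketch below uses only the Python standard library and follows the text's numerical example; it is an illustration, not part of the chapter:

```python
from statistics import NormalDist

z = NormalDist()                      # standard normal distribution

def critical_value(mu0, sigma, n, alpha):
    """Critical region W: xbar > mu0 + (sigma/sqrt(n)) * z_alpha."""
    z_alpha = z.inv_cdf(1 - alpha)    # upper alpha-point
    return mu0 + sigma / n ** 0.5 * z_alpha

def power(mu, mu0, sigma, n, alpha):
    """P[ z > (sqrt(n)/sigma)(mu0 - mu) + z_alpha ] under alternative mu."""
    z_alpha = z.inv_cdf(1 - alpha)
    return 1 - z.cdf(n ** 0.5 / sigma * (mu0 - mu) + z_alpha)

# The text's example: sigma^2 = 256, n = 100, alpha = 0.05
print(critical_value(200, 16, 100, 0.05))  # ~202.63 (202.624 with z = 1.64)
print(power(204, 200, 16, 100, 0.05))      # ~0.80 (0.8051 with z = 1.64)
```

The small discrepancies from the text arise only because the text rounds z0.05 to 1.64 while inv_cdf returns 1.6449.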
Remark 7: To test H0: μ = μ0 against H1: μ < μ0, the same arguments lead to the critical region

$$W : \bar{X} < \mu_0 - \frac{\sigma}{\sqrt{n}}\, z_\alpha$$
Remark 8: To test H0: μ = μ0 against the both-sided alternative H1: μ ≠ μ0, a two-sided critical region is obtained as

$$W : \bar{X} < \mu_0 - \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2} \quad \text{or} \quad \bar{X} > \mu_0 + \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}$$
In many situations, particularly for non-continuous variables, it may not be possible to construct a test of exact level α. Here we take recourse to what is known as the p-value.

Definition 4: Let tobs be the observed value of the test statistic T. If the distribution of T under H0 is fθ0(t), the p-value for a test of H0: θ = θ0 against an alternative H1: θ > θ0 is

$$p = \int_{t_{obs}}^{\infty} f_{\theta_0}(t)\,dt \quad \text{or} \quad p = \sum_{t = t_{obs}}^{\infty} f_{\theta_0}(t)$$

depending on whether T is continuous or discrete. H0 is rejected if p < α.

As an example, consider n Bernoulli trials X1, . . ., Xn with success probability π, and suppose we test H0: π = π0 against H1: π > π0.
Here again, if, compared to π0, the number of successes $S = \sum_{i=1}^{n} X_i$ is too large, then we will reject H0, i.e. for some critical value c, we reject H0 if the observed Sobs > c. The value of c again is determined from the level condition P[S > c] = α. However, notice that S follows the Bin(n, π) distribution and can only take values 0, 1, . . ., n. Hence, for some value s0 of S it is likely that P[S > s0] > α > P[S > s0 + 1], i.e. no c exists such that the condition P[S > c] is exactly equal to α.
In such situations the critical point approach is difficult to take. Instead, we take recourse to the p-value: given the observed number of successes sobs, we reject H0 if, under H0,

$$p = P[S \ge s_{obs}] = \sum_{s = s_{obs}}^{n} \binom{n}{s}\, \pi_0^s (1 - \pi_0)^{n - s} < \alpha$$
Note that this implicitly implies that sobs > c.

Remark 9: To test H0: π = π0 against H1: π < π0, H0 is instead rejected if p = P[S ≤ sobs] < α.

The same approach works for other discrete distributions. For instance, let X1, . . ., Xn be observations on a Poisson(λ) variable, and suppose we test H0: λ = λ0 against H1: λ > λ0. Following arguments similar to those for the binomial, and observing that $S = \sum_{i=1}^{n} X_i$ follows P(nλ), H0 is rejected if

$$p = P[S \ge s_{obs}] = \sum_{s = s_{obs}}^{\infty} \frac{e^{-n\lambda_0} (n\lambda_0)^s}{s!} < \alpha$$

Similarly, to test H0: λ = λ0 (λ0 known) against H1: λ < λ0, H0 is rejected if p = P[S ≤ sobs] < α.
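These exact p-values are direct finite (or complemented) sums. A minimal Python sketch, with made-up example numbers, not the chapter's own:

```python
from math import comb, exp, factorial

def binom_upper_p(s_obs, n, pi0):
    """p = P[S >= s_obs] under S ~ Bin(n, pi0); reject H0 if p < alpha."""
    return sum(comb(n, s) * pi0**s * (1 - pi0)**(n - s)
               for s in range(s_obs, n + 1))

def poisson_upper_p(s_obs, n, lam0):
    """p = P[S >= s_obs] under S ~ Poisson(n*lam0), computed as the
    complement of the finite lower tail (the upper tail is infinite)."""
    m = n * lam0
    return 1 - sum(exp(-m) * m**s / factorial(s) for s in range(s_obs))

# Testing H0: pi = 0.5 vs H1: pi > 0.5 with 16 successes in 20 trials
print(binom_upper_p(16, 20, 0.5))    # ~0.0059, so reject at alpha = 0.05
```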
Rule (2) says that if K** > GΔ – 1, the order condition holds, i.e. the number of equations is greater than the number of variables to solve, and all the equations are independent as rank(Δ) = G – 1, i.e. the rank condition holds; thus there exist multiple solutions and the equation is over identified. Rule (3) is interesting. It says that even if K** > GΔ – 1, so that the order condition holds, i.e. the number of equations is greater than the number of variables to be solved, there is not a sufficient number of independent equations to solve for the variables, as rank(Δ) < G – 1, implying that the rank condition does not hold. One equation can be expressed as a linear combination of the others, so that one may end up with fewer equations than required; hence all the parameters cannot be solved and the equation is under identified. Rule (4) says that if K** < GΔ – 1, there are not enough equations to solve for the number
of variables and hence the equation is under identified. Here testing of the rank condition is not required.
3.3.3 Test for identification

Example 1

Consider the demand supply model as represented without the rainfall variable. Here the number of endogenous variables G = 2 and the number of exogenous variables K = 0. For identification of the system, the order condition, K** ≥ GΔ – 1, and the rank condition, rank(Δ) = G – 1, must be satisfied for both equations.

Identification of the first equation: for the first equation, K** = 0, GΔ = 2. Hence, 0 = K** < GΔ – 1 = 1. Thus, the order condition is not satisfied and the first equation is under identified.

Identification of the second equation: for the second equation, K** = 0, GΔ = 2. Thus, 0 = K** < GΔ – 1 = 1, implying the order condition is not satisfied and the second equation is under identified.

Example 2

Consider the system (7), where the number of endogenous variables G = 2 and the number of exogenous/predetermined variables K = 1.

Identification of the first equation: for the first equation, K** = 1, GΔ = 2, implying K** = GΔ – 1 = 1. Thus, the order condition is satisfied. Checking the rank condition needs the construction of Δ, for which consider the coefficient matrix (rows: demand, supply):

            Qt     Pt     Rt
Demand:      1    –β1      0
Supply:      1    –α1     α2
Thus, Δ for the first equation can be represented as Δ = [α2] and rank(Δ) = 1 = G – 1. Hence, for the first equation K** = GΔ – 1 and rank(Δ) = G – 1, and the equation is just identified.

Identification of the second equation: for the second equation K** = 0, GΔ = 2, so GΔ – 1 = 1. Hence K** < GΔ – 1 and the second equation is under identified. Therefore, in this example the demand equation is identified but not the supply equation, as already discussed earlier.

Example 3

Consider the model (11), where the number of endogenous variables is G = 2 and the number of predetermined variables is K = 2.

Identification of the first equation: for the first equation K** = 1, GΔ = 2. Therefore, K** = GΔ – 1 = 1, and the order condition is satisfied. Checking the rank condition needs the construction of Δ, for which consider the coefficient matrix (rows: demand, supply):

            Qt     Pt     Yt     Rt
Demand:      1    –β1    –β2      0
Supply:      1    –α1      0    –α2
For the first equation, Δ = (–α2) and rank(Δ) = 1 = G – 1. Hence, the rank condition is satisfied. As K** = GΔ – 1 and rank(Δ) = G – 1, the first equation is just identified.

Identification of the second equation: here K** = 1, GΔ = 2. Hence K** = 1 = GΔ – 1 and the order condition holds. Also, Δ for the second equation can be represented as Δ = (–β2), with rank(Δ) = 1 = G – 1. Hence, the rank condition is satisfied. As K** = GΔ – 1 and rank(Δ) = G – 1, the second equation is just identified. Thus, in this example both the demand and the supply equation are identified, as discussed earlier.
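These order and rank checks can be mechanized. The sketch below is one illustrative reading of the rules in numpy, with arbitrary nonzero parameter values; here the chapter's GΔ coincides with the total number of equations G, and the column indices are assumptions tied to the Example 3 layout:

```python
import numpy as np

def check_identification(A, excluded_cols, eq):
    """Order/rank check for one equation of a G-equation system.
    A: G x (G+K) structural coefficient matrix; excluded_cols: indices of
    the variables excluded from equation `eq` (the K** exclusions)."""
    G = A.shape[0]
    k_star = len(excluded_cols)
    # Delta: coefficients of the excluded variables in the OTHER equations.
    delta = np.delete(A, eq, axis=0)[:, excluded_cols]
    rank = np.linalg.matrix_rank(delta)
    if k_star < G - 1:
        return "under identified (order condition fails)"
    if rank < G - 1:
        return "under identified (rank condition fails)"
    return "just identified" if k_star == G - 1 else "over identified"

# Example 3: rows = (demand, supply), columns = (Qt, Pt, Yt, Rt)
b1, a1, b2, a2 = 0.8, 0.5, 0.3, 0.4           # arbitrary nonzero values
A = np.array([[1, -b1, -b2,  0 ],             # demand excludes Rt (col 3)
              [1, -a1,  0,  -a2]])            # supply excludes Yt (col 2)
print(check_identification(A, [3], eq=0))     # just identified
print(check_identification(A, [2], eq=1))     # just identified
```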
3.3.4 The condition of identification under general homogeneous linear restrictions

Our discussion of the restrictions on structural parameters has so far been in terms of exclusion restrictions, i.e. some variables are included in and some variables are excluded from any particular equation. But in practical situations one can have general homogeneous linear restrictions. The condition of identification under general homogeneous linear restrictions also consists of two parts: (1) Order condition: as earlier, the order condition represents the algebraic consistency condition; it is necessary but not sufficient. (2) Rank condition: it says that the number of independent equations must be at least as great as the number of variables to be solved. The statement of the order and rank conditions of identification under general homogeneous linear restrictions, as well as examples of testing the identification of a model using such restrictions, are presented in the Appendix.
3.3.5 Identification of the system by using variance covariance restrictions

It has been discussed that in the model (7) the demand function is identified but the supply function is not. The supply function can be made identified by introducing the restriction Cov(u1t, u2t) = 0. Under such a restriction, it is therefore possible to estimate the structural parameters of both the demand and the supply function. For the proof and explanations see the Appendix. Identification is prior to estimation: only once an equation is identified can one go on to estimate its structural parameters. If the equation is not identified, then appropriate restrictions must be incorporated to make the equation(s) identified.
4 Method of estimation for simultaneous equation systems

OLS is not an appropriate method of estimation for a simultaneous equation system (SES), as applying OLS to an SES will lead to biased and inconsistent
estimates of the structural parameters. To handle the endogeneity problem, the methods of estimation used for SES can be divided into two broad groups:

Single-equation methods. Here each equation is estimated separately. First, the identifiability of the i-th equation is tested, and if it is identified, the equation is estimated. One then moves to the j-th equation and follows the same procedure, and so on. Under the single equation approach one can apply the following: (1) the indirect least squares method (ILS); (2) the instrumental variable (IV) method; (3) the two stage least squares method (2SLS); (4) generalized least squares (GLS); (5) the k-class estimator; (6) the limited information maximum likelihood method (LIML); and (7) the least variance ratio (LVR) method. Amongst these, the most widely used are ILS, 2SLS and IV. It will be argued in the next section that the ILS method can be applied if the equation is just identified, whereas the other methods are applicable in the over identified case as well. Since each equation is estimated separately in the single equation method, the identification conditions of the j-th equation will not affect the estimation of the i-th equation; thus, the i-th equation can be estimated even if the j-th equation is not identified. Also, contemporaneous correlations amongst the different equations will not affect the estimated values of the parameters.

System methods. The system method of estimation overcomes this limitation of the single equation method by estimating all the equations simultaneously. Under a system method, as all the equations are estimated simultaneously, the estimated values of the parameters will not be consistent unless all the equations of the system are identified, and one needs to take into account cross covariance restrictions on the error terms across different equations. Under this approach we have (1) the three stage least squares method (3SLS) and (2) the full information maximum likelihood method (FIML). Thus, the estimated values of the structural parameters will be affected by the existence of contemporaneous correlation amongst the different equations. The next subsections discuss the single equation and system methods of estimation respectively.
4.1 Single equation method

4.1.1 Indirect least squares method (ILS)

The ILS method involves the following steps:

Step 1: The reduced form of the structural model is obtained, in which the joint determination (two-way causation) of the variables appearing in an equation is avoided.
Step 2: Provided the usual assumptions about the disturbance terms of the reduced form equations are satisfied, OLS is applied to each equation of the reduced form system and estimates of the reduced form coefficients are obtained.
Step 3: The unknown structural parameters are solved from the known values of the reduced form parameters obtained in Step 2, using the equations showing the relationship between the structural and reduced form parameters.
These steps suggest that it is meaningful to apply ILS if there is a one to one correspondence between the reduced and structural form parameters, i.e. there is a unique estimate of the structural form parameters given the estimated values of the reduced form parameters. Thus, ILS can be applied if the equation is just identified. Further, the following assumptions are needed for the application of ILS. First, the error terms of the reduced form of the model must have zero mean and constant variance, must be serially independent and must not be correlated with the exogenous/predetermined variables. Second, the exogenous/predetermined variables of the model must not be perfectly correlated (perfect multi-collinearity), in which case the OLS estimator would not exist. Since the OLS estimators of the reduced form parameters are unbiased and consistent, the ILS estimators of the structural parameters are consistent (though, being non-linear functions of the reduced form estimates, they are in general biased in small samples). For an illustration of the ILS method consider the demand supply model (15) as cited. It is shown in the Appendix that, given the estimated values of the reduced form parameters, one can successfully estimate the structural form parameters of both equations.
4.1.2 Two stage least squares method (2SLS)

2SLS was developed by Theil (1953a, 1953b, 1961), Basmann (1957) and Sargan (1958). It can be applied even if the equation in question is over identified. The 2SLS method consists of replacing the endogenous variables that appear as explanatory variables in a structural equation by their estimated values obtained from the reduced form system, and then applying OLS to the equation that contains the estimated values of the endogenous explanatory variables. It is outlined as follows:

Step 1: Obtain the reduced form of the model.
Step 2: Apply OLS to the reduced form model to obtain estimates of the reduced form parameters, and hence obtain the estimated values of the endogenous variables appearing as explanatory variables in the structural equations.
Step 3: Replace the endogenous explanatory variables by their estimated values as obtained in Step 2 in the original structural equation, and then apply OLS to the equation which contains these estimated values.

As an example, consider the macroeconomic model (7). Here OLS cannot be applied to either the Ct or the It equation, as both contain the endogenous explanatory variable Yt. But one can get the estimated value of Yt, i.e. Ŷt, from the reduced form of the model, replace Yt in the Ct or It equation by Ŷt, and then apply OLS to the equation that contains Ŷt as an explanatory variable. Since Ŷt is not correlated with the error term, the correlation between the explanatory variable and the structural error term is avoided. For the detailed derivation of the 2SLS method see Judge et al. (1982).
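A compact numerical sketch of the two stages, with simulated data and made-up parameter values rather than the chapter's own example, may help fix ideas:

```python
import numpy as np

rng = np.random.default_rng(0)

def two_sls(y, X, Z):
    """Two stage least squares.
    Stage 1: regress each column of X on the instruments Z -> fitted X_hat.
    Stage 2: OLS of y on X_hat."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Illustrative just-identified demand equation q = b0 + b1*p + u1,
# with rainfall r shifting supply and hence instrumenting price p.
n = 10_000
r = rng.normal(size=n)                 # exogenous supply shifter
u1, u2 = rng.normal(size=n), rng.normal(size=n)
p = 1.0 + 0.8 * r + u2 - u1            # reduced-form price (made up)
q = 2.0 - 0.5 * p + u1                 # structural demand, b1 = -0.5
X = np.column_stack([np.ones(n), p])   # endogenous regressor p
Z = np.column_stack([np.ones(n), r])   # instrument set
print(two_sls(q, X, Z))                # ~[2.0, -0.5]; plain OLS is biased
```

Plain OLS of q on p would be biased here because p contains −u1, the structural error of the demand equation; the fitted p from stage 1 purges that correlation.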
4.1.3 Instrumental variable (IV) method

The IV method can be applied even if the equation in question is over identified. The idea of the IV method is that if one can find a variable Z, known as an instrumental variable,
such that (1) Z is not correlated with the structural error term, (2) Z is correlated with the endogenous explanatory variable, (3) Z is correlated with the dependent variable, and (4) the variance covariance matrix of Z is finite, symmetric and positive definite, then Z can be used as an instrument for the endogenous explanatory variable. The resulting estimator is known as the instrumental variable estimator. The IV estimator is consistent; in finite samples, however, it is biased, and Hahn and Hausman (2002) provide an approximation of the bias in the IV estimator. For detailed discussions of the IV method see Judge et al. (1982), and for other single equation methods see Kmenta (1991), Judge et al. (1982) and Johnston (1984).
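In the just identified case the IV estimator has the closed form β̂IV = (Z′X)⁻¹Z′y. A small simulated sketch, with all numbers made up for illustration:

```python
import numpy as np

def iv_estimator(y, X, Z):
    """Just-identified IV: beta_hat = (Z'X)^(-1) Z'y, where Z collects one
    instrument per column of X (exogenous columns instrument themselves)."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)
e = rng.normal(size=n)
x = 0.9 * z + e                        # endogenous: x contains the error e
y = 1.0 + 2.0 * x - 1.5 * e            # structural error correlated with x
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
print(iv_estimator(y, X, Z))           # ~[1.0, 2.0]; OLS would be biased
```

With more instruments than regressors the same idea generalizes to 2SLS, as in the previous sketch.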
4.2 System method of estimation The system method which estimates the parameters of all the equations jointly can be of two types: (1) three stage least square method and (2) full information maximum likelihood method.
4.2.1 Three stage least squares (3SLS) method

The 3SLS method involves the following stages:

Stage 1: Obtain the reduced form of the model and the estimated values of the reduced form parameters by applying OLS to the reduced form, and hence obtain the estimated values of the endogenous variables.
Stage 2: Obtain the 2SLS estimators for each equation i = 1, . . ., G, using the estimated values of the endogenous variables from Stage 1.
Stage 3: Using the estimated values of the endogenous variables based on the 2SLS estimators obtained in Stage 2, compute the 2SLS residuals and hence obtain an estimate of the variance covariance matrix of the error terms. Using this estimated variance covariance matrix, estimate all the structural parameters of the model, taking into account the joint dependence of all the equations.

The other system method of estimation, the full information maximum likelihood method (FIML), is complicated and needs specialized discussion.
5 Omitted variables

It has already been discussed in the introductory part that the problem of endogeneity can arise because of omitted explanatory variables: the included regressors will be associated with the error term because of the association between the omitted variables and the other regressors (Bascle [2008]). To clarify the endogeneity arising out of omitted variables, assume that the structural equation is given by

Y = β0 + β1X + β2Z + u. (16)
Suppose instead of estimating (16) one estimates Y = β0 + β1X + u.
The assumption here is that the omitted variable Z is not associated with Y. But in reality Z is associated with Y: as Z is not included, its effect is captured by the error term, which affects Y. If Z is also associated with X, then X will not be independent of the error term. This violates a basic assumption of the OLS regression model, i.e. Cov(X, u) = 0. Using the OLS method, the expected value of the estimator β̂1 can be written as (see Wooldridge [2000]):

$$E(\hat\beta_1) = \beta_1 + \beta_2\,\frac{Cov(X, Z)}{Var(X)} \qquad (17)$$

The expression (17) shows that the estimator β̂1 is biased, and the bias is equal to

$$E(\hat\beta_1) - \beta_1 = \beta_2\,\frac{Cov(X, Z)}{Var(X)}$$
In this equation, β2 reflects the sign and magnitude of the impact of Z on Y, and Cov(X, Z) reflects the strength of the relationship between X and Z. The bigger the impact of Z on Y, the larger is the bias. If β2 and/or Cov(X, Z) are 'sufficiently' small, the endogeneity is not a serious problem, because the bias is small. More importantly, if either β2 or Cov(X, Z) equals zero, then there is no endogeneity. In other words, if the purpose is to test whether a theoretically proposed relationship between Y and X exists, then one is not concerned about omitted variables that are not correlated with X. Similarly, one need not consider omitted variables that are correlated with X but do not affect Y. One is concerned only about omitted variables that are correlated with both the explained and the explanatory variables. This observation is important given that omitted variables arise mainly from the necessity to limit the research model to a theoretically manageable number of the most important variables on which data can easily be collected. The key is thus to identify those variables that are likely to have a major impact on both the explained and explanatory variables and are thus most likely to affect the results if not included in the analysis.

Researchers from related disciplines have extensively discussed endogeneity bias emanating from omitted variables, particularly in relation to child development studies (Duncan [2008a, 2008b]; Duncan et al. [2004]). The literature also provides solutions to the problem of omitted variables. There are basically two standard ways to address it. First, if one believes that there is an omitted variable but is unable to include it because of data unavailability, one can include a proxy variable in the analysis. A proxy variable is a variable that is related to the unobserved omitted variable. For example, suppose our explained variable is the usefulness
of budget and the explanatory variable is timeliness of information. It is true that 'knowledge of accounting' would also be an important explanatory variable, but one is unable to obtain data on this omitted variable. If data on tertiary education in accounting are available and one believes that this is related to 'knowledge of accounting', it can be used as a proxy for 'knowledge of accounting'. By including this variable, one can control for its effect on the explained variable and thus make the explanatory variable exogenous. This solution is called the 'plug-in solution' to the omitted variable problem.

Second, if a proxy variable is not available, one can use the method of instrumental variables. The conditions a variable must satisfy to serve as an instrument were explained earlier. In the example of modelling the usefulness of budget as cited, the 'digitization of the IT system' could be used as an instrument for 'timeliness of information'. Intuitively, one uses the instrumental variable to predict the explanatory variable, and uses this prediction to estimate the impact of the explanatory variable on the explained variable. Under the instrumental variable assumptions previously stated, the prediction is uncorrelated with the omitted variable, and the endogeneity problem is thus solved. That is, in our example, digitization of the IT system could be used as the instrumental variable as it is correlated with 'timeliness of information', provided the assumptions required for a valid instrument are satisfied.

Another common example of endogeneity caused by an omitted variable is self-selection (Heckman [1979]). Although the econometric solution to the self-selection problem is specific, the essence of the problem is identical to the omitted variable problem as previously discussed, so the self-selection problem is not discussed in great detail here; see Heckman (1979). Endogeneity occurs when there is a selection of sample members into certain strata of the independent variables that might be related to the outcome. In other words, members are not selected out of or into the sample; rather, they are selected out of or into certain values of key regressors in the multiple regression (see Duncan [2008a, 2008b]; Duncan et al. [2004]). A detailed discussion of sample selection bias can be found in Heckman (1979) and Cuddeback et al. (2004).
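The omitted variable bias formula (17) is easy to verify by simulation. In the sketch below all parameter values are made up; the short-regression OLS slope matches β1 + β2 Cov(X, Z)/Var(X):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
beta0, beta1, beta2 = 1.0, 2.0, 3.0

z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)          # X and Z are correlated
y = beta0 + beta1 * x + beta2 * z + rng.normal(size=n)

# Short regression of y on x alone (Z omitted)
X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Omitted variable bias from (17): beta2 * Cov(X, Z) / Var(X)
bias = beta2 * np.cov(x, z)[0, 1] / x.var()
print(b[1], beta1 + bias)     # both ~3.2: OLS slope equals beta1 + bias
```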
6 Measurement errors

In the case of measurement error, the true value of the variable, X1*, is not observed; rather, one observes the indicator variable X1. The difference between the indicator variable and the true value represents the measurement error, e1 = X1 − X1*. For example, X1* is actual income and X1 is reported income. Measurement error can occur either in the dependent variable or in an explanatory variable. Classical measurement error in the dependent variable does not bias the OLS estimator but inflates the error variance (see Wooldridge [2009]). Measurement error in an explanatory variable, however, is considered a much more important problem. For clarity, consider the regression model
Y = β0 + β1X1* + u (18)
and it is assumed that u satisfies the properties needed for the Gauss–Markov theorem, so that estimation of equation (18) would lead to unbiased and consistent estimators of β0 and β1. It is assumed that the average measurement error in the population is zero, i.e. E(e1) = 0, and that u is uncorrelated with both X1 and X1*. The problem is that when the linear relationship is estimated using the indicator X1, i.e. regressing Y on X1, the measurement error is taken up by the error term of the regression, making X1 correlated with the error term of the equation, which violates a basic assumption of regression. It is important to find out the properties of OLS if we simply replace X1* by X1 and run the regression of Y on X1. These properties depend on the assumptions made about the measurement error.

The first assumption is that e1 is uncorrelated with X1, i.e. Cov(X1, e1) = 0. Since e1 = X1 − X1*, this implies that e1 must then be correlated with the unobserved true variable X1*. To determine the properties of OLS, replace X1* = X1 − e1 in (18), yielding

Y = β0 + β1X1 + (u − β1e1) (19)

Because both u and e1 have zero mean and are uncorrelated with X1, (u − β1e1) also has zero mean and is uncorrelated with X1. It follows that OLS estimation with X1 in place of X1* produces consistent estimators of β1 (and also β0). Since u is uncorrelated with e1, the variance of the error is Var(u − β1e1) = σu² + β1²σe1², where σu² and σe1² stand for the variances of u and e1 respectively. Thus, except when β1 = 0, measurement error increases the error variance. But this does not affect any of the OLS properties (except that the variance of β̂1 will be larger than if X1* could be observed directly).

Next assume the classical errors-in-variables assumption: the measurement error is uncorrelated with the unobserved explanatory variable, i.e. Cov(X1*, e1) = 0. Under this assumption it can be shown that Cov(X1, e1) = σe1² (Wooldridge [2000]). Thus, the covariance between X1 and e1 equals the variance of the measurement error and hence is not zero. This nonzero covariance creates endogeneity bias. To see the problem, consider (19) and the covariance between X1 and the error term (u − β1e1). The error term of (19) is a linear function of e1 and hence is correlated with e1. Further, since e1 is itself correlated with X1, the error term of (19) is also correlated with X1. Thus, regressing Y on X1 by OLS using (19) yields biased and inconsistent estimates of the parameters. Considering the asymptotic properties, it can be shown (Wooldridge [2000]) that

$$\text{plim}\,\hat\beta_1 = \beta_1\,\frac{\sigma^2_{x_1^*}}{\sigma^2_{x_1^*} + \sigma^2_{e_1}} \qquad (20)$$

where σ²x1* = Var(X1*) and σ²e1 = Var(e1).
The term multiplying β1 in (20) is the ratio Var(X1*)/Var(X1), which is always less than one. Thus, plim β̂1 is always closer to zero than β1. This is called the attenuation bias in OLS due to classical errors in variables: on average (or in large samples), the estimated OLS effect is attenuated. If β1 is positive, β̂1 will tend to underestimate β1. If Var(X1*) is large relative to the variance of the measurement error, the inconsistency of OLS will be small. This is due to the fact that Var(X1*)/Var(X1) will be close to unity when σ²x1*/σ²e1 is large. Therefore, depending on how much variation there is in X1* relative to e1, the measurement error need not cause large biases.

Things are more complicated if more explanatory variables are added. For illustration, consider

Y = β0 + β1X1* + β2X2 + β3X3 + u.
The first of the three explanatory variables is measured with error and is replaced by the indicator variable X1. It is assumed that u is uncorrelated with X1*, X2, X3 and X1, and that e1 is uncorrelated with X2 and X3, the variables measured without error. The key issue is whether e1 is uncorrelated with X1. If so, then the regression of Y on X1, X2 and X3 provides consistent estimators. As with a single explanatory variable, this regression equation can be written as

Y = β0 + β1X1 + β2X2 + β3X3 + (u − β1e1)

u and e1 in the composite error term are both uncorrelated with all the explanatory variables. But under the assumption Cov(X1*, e1) = 0 it has already been shown that Cov(X1, e1) ≠ 0. Thus, the application of OLS will produce biased and inconsistent estimates of the parameters. In general all the estimators will be biased, not just β̂1. In the special case where X1* is uncorrelated with X2 and X3, β̂2 and β̂3 are consistent; however, this situation is very rare. In general, measurement error in a single explanatory variable causes inconsistency in all the estimators, but the sizes, and even the directions, of the biases are not easily obtained.

In the field of social science research, as a general practice, attention to measurement-related processes that minimize such bias best occurs before the implementation of an evaluation. Such attention is given through strong conceptualization and operationalization (Nugent et al. [2000]), including the selection of 'high specificity' outcome measures, i.e. standardized instruments (Rosen et al. [1999]), as well as the deliberate use of multiple measurement methods (Podaskoff et al. [2003]). In addition, a body of post-evaluation strategies includes efforts to provide evidence that the instruments are psychometrically sound (Nugent et al. [2000]), including empirical assessment of potential biases in known instruments (Unick and Stone [2010]) and strategies to actively model measurement error using latent variables and
related approaches (see Gehhert [1994]; Harrington [2008]; Nugent and Hankins [1992]). The problem of measurement error has been extensively discussed in relation to the social work knowledge base in general (Nugent and Hankins [1992]; Unick and Stone [2010]) as well as in terms of the specific relationship of measurement error to endogeneity bias (Nugent et al. [2000]).
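The attenuation result (20) can likewise be checked by simulation. In this illustrative sketch, with made-up variances, the OLS slope shrinks by the factor Var(X1*)/(Var(X1*) + Var(e1)):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
beta0, beta1 = 1.0, 2.0

x_star = rng.normal(size=n) * 2.0     # true regressor, Var = 4
e1 = rng.normal(size=n)               # classical measurement error, Var = 1
x1 = x_star + e1                      # observed indicator variable
y = beta0 + beta1 * x_star + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Attenuation factor from (20): Var(x*) / (Var(x*) + Var(e1)) = 4/5
print(b[1], beta1 * 4 / (4 + 1))      # both ~1.6, below beta1 = 2
```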
7 Conclusion

This chapter provides an overview and discussion of the endogeneity problem. Although the discussion refers mainly to studies relating to social research, it can be generalized to other fields of research as well. Different causes of endogeneity, such as the existence of simultaneity, omitted variables and measurement errors, and the possible solutions to the problem, are discussed. At a practical level, it is unlikely that any single study is completely free of endogeneity issues, and therefore the initial safeguard should be sought in careful theory construction.
References

Ashenfelter, O. and D. Card, eds. 2001. Handbook of Labour Economics. North-Holland: Amsterdam.
Bascle, G. 2008. "Controlling for Endogeneity with Instrumental Variables in Strategic Management Research". Strategic Organization, Vol. 6(3), 285–328.
Basmann, R. L. 1957. "A Generalised Classical Method of Linear Estimation of Coefficients in Structural Equations". Econometrica, Vol. 25, 77–83.
Chenhall, R. H. and F. Moers. 2007. "The Issue of Endogeneity in Theory-Based, Quantitative Management Accounting Research". European Accounting Review, Vol. 16(1), 173–195.
Core, J. W. 2001. "A Review of the Empirical Disclosure Literature: Discussion". Journal of Accounting and Economics, Vol. 31, 441–456.
Cuddeback, G., E. Wilson, J. G. Combs-Orme, and T. Combs-Orme. 2004. "Detecting and Statistically Correcting Sample Selection Bias". Journal of Social Science Research, Vol. 30, 19–33.
Davis, J. A. 1985. The Logic of Causal Order. Sage: Beverly Hills, CA.
Duncan, G. J. 2008a. "When to Promote and When to Avoid, a Population Perspective". Demography, Vol. 45, 763–784.
Duncan, G. J. 2008b. "Unexpected Pathways". Journal of Social Issues, Vol. 64, 213–218.
Duncan, G. J., K. Magnuson, and J. Ludwig. 2004. "The Endogeneity Problem in Developmental Studies". Research in Human Development, Vol. 1, 59–80.
Eisenberg, T. et al. 1998. "Larger Board Size and Decreasing Firm Value in Small Firms". Journal of Financial Economics, Vol. 48, 35–54.
Fields, T. D. et al. 2001. "Empirical Research on Accounting Choice". Journal of Accounting and Economics, Vol. 31, 255–307.
Gehhert, S. 1994. "The Applicability of Generalizability Theory to Social Work Research and Practice". Journal of Social Service Research, Vol. 18, 73–88.
Hahn, J. and J. Hausman. 2002. "Notes on Bias in Estimators for Simultaneous Equation Models". Economics Letters, Vol. 75, 237–241.
Hamilton, B. H. and J. A. Nickerson. 2003. "Correcting for Endogeneity in Strategic Management Research". In: Strategic Organization. Sage: London.
Harrington, D. 2008. Confirmatory Factor Analysis. Oxford University Press: New York, NY.
Heckman, J. 1979. "Sample Selection Bias as a Specification Error". Econometrica, Vol. 47(1), 153–161.
Jensen, M. 1993. "The Modern Industrial Revolution, Exit, and the Failure of Internal Control Systems". Journal of Finance, Vol. 48, 831–880.
Johnston, J. 1984. Econometric Methods, 3rd Edition. McGraw-Hill International Book Company Ltd.: Japan.
Judge, G. G., R. Carter Hill, W. Griffiths, H. Lutkepohl, and T. C. Lee. 1982. Introduction to the Theory and Practice of Econometrics. John Wiley & Sons: New York.
Kmenta, J. 1991. Elements of Econometrics, 2nd Edition. Macmillan Publishing Company.
Lipton, M. and J. Lorsch. 1992. "A Modest Proposal for Improved Corporate Governance". Business Lawyer, Vol. 48, 59–77.
Moffitt, R. 2005. "Remarks on the Analysis of Causal Relationships in Population Research". Demography, Vol. 42, 31–108.
Nugent, W. and J. Hankins. 1992. "A Comparison of Classical, Item Response and Generalizability Theories of Measurement". In: D. Gillespie and C. Glisson, eds., Quantitative Methods in Social Work: State of the Art. Haworth Press: New York, NY.
Nugent, W. R., L. White, and R. Bashman. 2000. "A 'Devil' Hidden in the Details: The Effects of Measurement Error in Regression Analysis". Journal of Social Service Research, Vol. 27(1), 53–74.
Podaskoff, P. M., S. B. Mackenzie, J. Y. Lee, and N. P. Podaskoff. 2003. "Common Method Biases in Behavioral Research: A Critical Review of the Literature and Recommended Remedies". Journal of Applied Psychology, Vol. 88, 879–903.
Rosen, A., E. K. Proctor, and M. M. Staudt. 1999. "Social Work Research and the Quest for Effective Practice". Social Work Research, Vol. 23, 4–14.
Sargan, J. 1958. "Estimation of Economic Relationships Using Instrumental Variables". Econometrica, Vol. 26, 393–415.
Shugan, S. M. 2004. "Endogeneity in Marketing Decision Models". Marketing Science, Vol. 23, 1–3.
Stone, S. I. and R. A. Rose. 2011. "Social Work Research and Endogeneity Bias". Journal of the Society for Social Work Research, Vol. 2(2), 45–75.
Svare, G. M., S. Jay, E. J. Bruce, B. Freisthler, and L. S. Miller. 2007. "Introductory Simultaneous Equation Model for Social Work Research". Journal of Social Science Research, Vol. 34, 15–24.
Theil, H. 1953a. Repeated Least-Squares Applied to a Complete Equation Systems. Mimeo, Central Planning Bureau: The Hague.
Theil, H. 1953b. Estimation and Simultaneous Correlation in Complete Equation Systems. Mimeo, Central Planning Bureau: The Hague.
Theil, H. 1961. Economic Forecasts and Policy, 2nd Edition. North-Holland: Amsterdam, The Netherlands.
Unick, J. and S. Stone. 2010. "State of Modern Measurement in Social Work Research". Social Work Research, Vol. 34(2), 94–101.
Wooldridge, J. M. 2000. Introductory Econometrics. Thomson South-Western: Cincinnati, OH.
Wooldridge, J. M. 2009. "On Estimating Firm-level Production Functions Using Proxy Variables to Control for Unobservables". Economics Letters, Vol. 104, 112–114.
Yermack, D. 1996. "Higher Market Valuation of Companies with a Small Board of Directors". Journal of Financial Economics, Vol. 40, 185–211.
APPENDIX
Identification problem: solution of structural parameters from reduced form parameters

Example 1: Identification of (7)

Consider the identification of (7). Mathematically it can be proved that for the system (7), given the estimated values of the reduced form parameters, one can determine the estimates of the structural parameters of the demand equation, but not those of the supply equation. Equating demand with supply and solving for price:

$$P_t = \pi_{10} + \pi_{11} R_t + w_{1t} \qquad (A.1)$$

where

$$\pi_{10} = \frac{\alpha_0 - \beta_0}{\beta_1 - \alpha_1}, \quad \pi_{11} = \frac{\alpha_2}{\beta_1 - \alpha_1}, \quad w_{1t} = \frac{u_{2t} - u_{1t}}{\beta_1 - \alpha_1}$$

Since (A.1) expresses price as a function of the exogenous variable rainfall only, it is the reduced form of the price equation. Replacing (A.1) into the supply equation we get

$$Q_t = \pi_{20} + \pi_{21} R_t + w_{2t} \qquad (A.2)$$

where π20 = (α0 + α1π10), π21 = (α1π11 + α2), w2t = (α1w1t + u2t). (A.2) expresses Qt in terms of the exogenous variable Rt only and hence is the reduced form of Qt. Replacing the expression π10 = (α0 − β0)/(β1 − α1) in π20 and simplifying, we get

$$\pi_{20} = \frac{\alpha_0\beta_1 - \alpha_1\beta_0}{\beta_1 - \alpha_1}$$
Similarly, π21 = α1π11 + α2, and after simplification,

$$\pi_{21} = \frac{\alpha_2\beta_1}{\beta_1 - \alpha_1}$$
OLS can be applied to the reduced form equations of Pt and Qt, and hence the numerical values of the estimated parameters π̂10, π̂11, π̂20 and π̂21 are known. In the next step perform the operation:

$$\frac{\hat\pi_{21}}{\hat\pi_{11}} = \frac{\alpha_2\beta_1}{\beta_1 - \alpha_1} \cdot \frac{\beta_1 - \alpha_1}{\alpha_2} = \beta_1 \qquad (A.3)$$

Since the left-hand side of (A.3) is determined from the OLS estimates of the reduced form parameters, β1 can be determined from (A.3). β1 is the coefficient of price in the structural demand equation. Similarly, after simplification we obtain

$$\hat\pi_{20} - \beta_1\hat\pi_{10} = \frac{\beta_0(\beta_1 - \alpha_1)}{\beta_1 - \alpha_1} = \beta_0 \qquad (A.4)$$

The left-hand side of (A.4) is determined from the estimated values of the reduced form coefficients and from the estimated value of β1 given by (A.3). β0 is the constant term of the structural demand equation and can be determined from (A.4). Hence, using the estimated values of the reduced form coefficients π̂10, π̂11, π̂20, π̂21, one can solve for the structural parameters of the demand equation, and the demand equation is identified. Actually, one needs to solve for five structural parameters, β0, β1, α0, α1, α2, but estimation of the reduced form yields only four known numbers π̂10, π̂11, π̂20, π̂21. Thus, out of the five structural parameters, only four can be solved. It is already shown that β0, β1 (the parameters of the demand equation) can be estimated. Hence all three parameters of the supply equation cannot be solved, and the supply function is not identified.
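A simulation sketch of the ILS recovery just described; the parameter values are arbitrary, and the reduced form follows (A.1):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
b0, b1 = 10.0, -1.0          # demand: Q = b0 + b1*P + u1
a0, a1, a2 = 2.0, 1.5, 0.8   # supply: Q = a0 + a1*P + a2*R + u2

R = rng.normal(size=n)
u1, u2 = rng.normal(size=n), rng.normal(size=n)
P = ((a0 - b0) + a2 * R + (u2 - u1)) / (b1 - a1)    # reduced form (A.1)
Q = b0 + b1 * P + u1                                 # market outcome

Z = np.column_stack([np.ones(n), R])
pi_P = np.linalg.lstsq(Z, P, rcond=None)[0]          # (pi10, pi11)
pi_Q = np.linalg.lstsq(Z, Q, rcond=None)[0]          # (pi20, pi21)

b1_hat = pi_Q[1] / pi_P[1]            # (A.3): pi21/pi11 = beta1
b0_hat = pi_Q[0] - b1_hat * pi_P[0]   # (A.4): pi20 - beta1*pi10 = beta0
print(b1_hat, b0_hat)                 # ~(-1.0, 10.0)
```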
Example 2: Identification of (12)

Consider the identification of (12). Mathematically it can be proved that for (12), given the estimated values of the reduced form parameters, the estimates of the structural parameters of both the demand and the supply equation can be determined. Equating demand with supply and solving for price we get

$$P_t = \Pi_{10} + \Pi_{11} R_t + \Pi_{12} Y_t + w_{1t} \qquad (A.5)$$

where

$$\Pi_{10} = \frac{\alpha_0 - \beta_0}{\beta_1 - \alpha_1}, \quad \Pi_{11} = \frac{\alpha_2}{\beta_1 - \alpha_1}, \quad \Pi_{12} = \frac{-\beta_2}{\beta_1 - \alpha_1}, \quad w_{1t} = \frac{u_{2t} - u_{1t}}{\beta_1 - \alpha_1}$$
In (A.5) price is expressed as a function of rainfall and income, which are exogenous variables, and hence it is the reduced form of the price equation. Replacing (A.5) in the supply equation,

$$Q_t = \alpha_0 + \alpha_1[\Pi_{10} + \Pi_{11}R_t + \Pi_{12}Y_t + w_{1t}] + \alpha_2 R_t + u_{2t} = \Pi_{20} + \Pi_{21}R_t + \Pi_{22}Y_t + w_{2t}$$

where Π20 = (α0 + α1Π10), Π21 = (α1Π11 + α2), Π22 = α1Π12 and w2t = α1w1t + u2t. Now perform the operation

$$\frac{\Pi_{12}}{\Pi_{22}} = \frac{-\beta_2}{\beta_1 - \alpha_1} \times \frac{\beta_1 - \alpha_1}{-\alpha_1\beta_2} = \frac{1}{\alpha_1} \qquad (A.6)$$

(A.6) shows that, given the numerical estimates of Π12 and Π22, α1 can be solved; this is the coefficient of price in the supply equation. Next, β1 can be solved without using the income variable, exactly as in the earlier model (7), i.e. from the ratio Π21/Π11. Given β1 and the value of α1 from (A.6), the estimated coefficient Π12 = −β2/(β1 − α1) then yields β2, the coefficient of income in the demand equation. Next consider the operation

$$\Pi_{20} = \alpha_0 + \alpha_1\,\frac{\alpha_0 - \beta_0}{\beta_1 - \alpha_1} = \frac{\alpha_0\beta_1 - \alpha_1\beta_0}{\beta_1 - \alpha_1} \qquad (A.7)$$

β1, β0 and α1 are already determined on the RHS of (A.7). Thus (A.7) suggests that, given the numerical estimate of Π20, the intercept term of the supply equation, α0, can be estimated from (A.7). Finally, replacing the expression for Π11,

$$\Pi_{21} = \alpha_1\Pi_{11} + \alpha_2 = \frac{\alpha_1\alpha_2}{\beta_1 - \alpha_1} + \alpha_2 = \frac{\alpha_2\beta_1}{\beta_1 - \alpha_1} \qquad (A.8)$$

Since β1 and α1 are already determined, knowing the estimated value of Π21 one can solve from (A.8) for the slope of rainfall in the supply equation, α2. Thus, all the structural parameters of the supply equation can be estimated. The demand function is already identified. Hence both the supply and the demand function are identified, as discussed earlier.
Identification by variance covariance restrictions

In the case of the identification of (7), it is seen that if the variable rainfall is included in the supply relation, it is possible to identify the parameters of the demand equation only, but not the supply relation. It can be shown that if it is further assumed that the covariance between the error terms of the demand and supply equations is zero, i.e. Cov(u1t, u2t) = 0, then it is possible to estimate the parameters of the supply equation as well. In the earlier case, without the restriction Cov(u1t, u2t) = 0, the reason why the supply function is not identified is that the mongrel equation (i.e. [11]) looks like the supply equation and there is no way one can distinguish between the original supply equation and the mongrel equation. But with the restriction Cov(u1t, u2t) = 0, in order that the mongrel equation belongs to the family of supply equations, the error term of the mongrel equation must not be correlated with that of the original demand equation, i.e. Cov(u1t, Vt*) = 0. This requires:

$$E[\{u_{1t} - E(u_{1t})\}\{V_t^* - E(V_t^*)\}] = 0$$

Replacing the expression for Vt* and simplifying, the required condition reduces to

$$E\left[\frac{\lambda u_{1t}^2}{\lambda + \mu}\right] = 0, \quad \text{as } E(u_{1t}u_{2t}) = 0 \text{ by assumption}$$

This condition cannot hold, as E(u1t²) = σ1², the variance of u1t, is not zero. Thus, the mongrel equation does not belong to the family of supply equations and the supply equation is identified.

Mathematically also, one can show that under this restriction, using the estimated values of the reduced form parameters, all the structural parameters can be identified. Consider the variances and covariance of the error terms of the reduced form:

$$Var(w_{1t}) = Var\left(\frac{u_{2t} - u_{1t}}{\beta_1 - \alpha_1}\right) = \frac{Var(u_{2t}) + Var(u_{1t}) - 2\,Cov(u_{1t}, u_{2t})}{(\beta_1 - \alpha_1)^2} = \sigma_{11} \text{ (say)}$$

where Var(u1t) = σ1² and Var(u2t) = σ2². Similarly,

$$Var(w_{2t}) = Var\left(\frac{\alpha_1(u_{2t} - u_{1t})}{\beta_1 - \alpha_1} + u_{2t}\right) = \frac{\beta_1^2\sigma_2^2 + \alpha_1^2\sigma_1^2}{(\beta_1 - \alpha_1)^2} = \sigma_{22} \text{ (say)}$$

(after simplification, using Cov(u1t, u2t) = 0). Finally,

$$Cov(w_{1t}, w_{2t}) = E[\{w_{1t} - E(w_{1t})\}\{w_{2t} - E(w_{2t})\}]$$

Substituting w1t and w2t and simplifying,

$$Cov(w_{1t}, w_{2t}) = E\left[\frac{\beta_1 u_{2t}^2 + \alpha_1 u_{1t}^2 - (\alpha_1 + \beta_1)\,u_{1t}u_{2t}}{(\beta_1 - \alpha_1)^2}\right] = \frac{\beta_1\sigma_2^2}{(\beta_1 - \alpha_1)^2} + \frac{\alpha_1\sigma_1^2}{(\beta_1 - \alpha_1)^2} = \sigma_{12} \text{ (say)}$$
From the reduced form model one can estimate σ11, σ22 and σ12. Now consider

$$\frac{\sigma_{11}}{\sigma_{22}} = \frac{\sigma_2^2 + \sigma_1^2}{\beta_1^2\sigma_2^2 + \alpha_1^2\sigma_1^2} = \frac{1 + \lambda}{\beta_1^2 + \alpha_1^2\lambda}, \quad \text{where } \lambda = \frac{\sigma_1^2}{\sigma_2^2} \qquad (A.9)$$

and

$$\frac{\sigma_{12}}{\sigma_{11}} = \frac{\beta_1\sigma_2^2 + \alpha_1\sigma_1^2}{\sigma_2^2 + \sigma_1^2} = \frac{\beta_1 + \alpha_1\lambda}{1 + \lambda} \qquad (A.10)$$

Since β1 has already been estimated without the variance covariance restriction, (A.9) and (A.10) can be used to solve for α1 and λ. α1 is the slope of price in the supply equation. Next consider the expression

$$\pi_{20} = \frac{\alpha_1(\alpha_0 - \beta_0)}{\beta_1 - \alpha_1} + \alpha_0 \qquad (A.11)$$
The estimate of π20 is obtained from the estimated reduced form. The parameters α1, β0 and β1 are already estimated. Hence the intercept term of the supply equation, α0, can be obtained using (A.11). Finally consider

$$\pi_{21} = \frac{\alpha_1\alpha_2}{\beta_1 - \alpha_1} + \alpha_2 = \frac{\alpha_2\beta_1}{\beta_1 - \alpha_1} \qquad (A.12)$$
The estimate of π21 is obtained from the estimated reduced form. The parameters β1 and α1 are already estimated. Thus α2, the coefficient of rainfall in the supply equation, can be solved using (A.12). Hence, using the covariance restriction Cov(u1t, u2t) = 0, all the parameters of the supply equation are solved and the supply equation is identified.
Condition of identification of the model under general homogeneous linear restrictions

Before discussing the condition of identification in the presence of general homogeneous linear restrictions, let us first discuss how one can represent a homogeneous linear restriction. Consider the structural form of the model at any time point t,

$$BY_t + \Gamma X_t = e_t$$

or

$$(B\ \Gamma)\begin{pmatrix} Y_t \\ X_t \end{pmatrix} = e_t$$

or

$$Az_t = e_t$$

where A = (B Γ) is the G × (G + K) matrix of structural coefficients, and zt = (Yt′, Xt′)′ is the (G + K) × 1 vector of all observations on the endogenous and exogenous variables at time t. Thus the system of equations can be represented as

$$\alpha_i z_t = e_{it}, \quad i = 1, \ldots, G$$

where αi is the i-th row of A and et = (e1t, e2t, . . ., eGt)′.

Example 1: Let Y3 not appear in the first equation; the restriction is represented as β13 = 0. This restriction is on the first row of A, α1 = (β11 β12 β13 . . . γ11 . . . γ1K), and hence can be represented as α1Ø = 0, where Ø = (0, 0, 1, 0, . . ., 0)′ is a (G + K) × 1 vector with a one in the third position.

Example 2: Suppose the coefficients of Y1 and Y2 are equal, i.e. β11 = β12, or β11 − β12 = 0. This restriction is also on the elements of the first row of A and can be represented as α1Ø = 0, where Ø = (1, −1, 0, . . ., 0)′, again a (G + K) × 1 vector.

Example 3: If both the restrictions, [i] β13 = 0 and [ii] β11 − β12 = 0, are to be satisfied, they can be represented as α1Ø = 0, where Ø is now the (G + K) × 2 matrix whose first column is (0, 0, 1, 0, . . ., 0)′ and whose second column is (1, −1, 0, . . ., 0)′.
Note that the number of columns in Ø is equal to the number of prevailing restrictions, and the number of rows in Ø is equal to the total number of variables (endogenous and exogenous/predetermined).

Order condition: This is the algebraic consistency condition; it is necessary but not sufficient:

R ≥ G − 1

It says that the number of equations must be at least as great as the number of variables to solve.
Rank condition:

rank(AØ) = G − 1

This condition says that the number of independent equations must be equal to the number of variables to solve. For a proof of the rank and order conditions of identification see Johnston (1984) and Judge et al. (1982).
Rules of identification using general homogeneous linear restrictions

[1] R = G − 1, rank(AØ) = G − 1: the equation is just identified.
[2] R > G − 1, rank(AØ) = G − 1: the equation is over identified.
[3] R ≥ G − 1, rank(AØ) < G − 1: the equation is under identified.
[4] R < G − 1, rank(AØ) < G − 1: the equation is under identified.

Rule (1) says that the order condition holds, i.e. since R = G − 1 the number of equations is exactly equal to the number of variables to solve, and the rank condition is also satisfied, i.e. rank(AØ) = G − 1, so all the equations are independent and the equation is just identified. Rule (2) says that since R > G − 1 the order condition holds, i.e. the number of equations is greater than the number of variables to solve, and since the rank condition holds, i.e. rank(AØ) = G − 1, all the equations are independent, and the equation is over identified. Rule (3) is interesting. It says that even if R > G − 1, so that the order condition holds, i.e. the number of equations is greater than the number of variables to solve, there is not a sufficient number of independent equations to solve for the variables, as the rank condition is not satisfied, i.e. rank(AØ) < G − 1. Hence the equation is under identified. Rule (4) says that as R < G − 1 there are not enough equations to solve for the number of variables, and hence the equation is not identified.

Example: To test the identifiability of the model

$$\beta_{11}y_{1t} + \beta_{12}y_{2t} + \gamma_{11}x_{1t} + \gamma_{12}x_{2t} = u_{1t}$$
$$\beta_{21}y_{1t} + \beta_{22}y_{2t} + \gamma_{21}x_{1t} + \gamma_{22}x_{2t} = u_{2t}$$
under the following restrictions: γ11 = 0, γ12 = 0 and γ22 = 0. In matrix notation the system can be expressed as

$$By_t + \Gamma x_t = u_t, \quad \text{where } B = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix}, \quad \Gamma = \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix}, \quad u_t = \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

or

$$Az_t = u_t, \quad A = (B\ \Gamma) = \begin{pmatrix} \beta_{11} & \beta_{12} & \gamma_{11} & \gamma_{12} \\ \beta_{21} & \beta_{22} & \gamma_{21} & \gamma_{22} \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}, \quad z_t = \begin{pmatrix} y_t \\ x_t \end{pmatrix}$$

where α1 = (β11 β12 γ11 γ12) and α2 = (β21 β22 γ21 γ22).
To check the identification condition of the two equations: the restrictions on the first structural equation are γ11 = 0 and γ12 = 0, which can be expressed as α1Ø = 0, where

$$\emptyset = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$$

The restriction on the second equation, γ22 = 0, can be expressed as α2φ = 0, where

$$\varphi = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$

Order condition: we need to show R ≥ G − 1. Rank condition: we need to show rank(AØ) = G − 1.

Identification of the first equation: for the first equation R = 2, G = 2, G − 1 = 1. Hence R > G − 1 and the order condition is satisfied. For the first equation, AØ can be expressed as
$$A\emptyset = \begin{pmatrix} \beta_{11} & \beta_{12} & \gamma_{11} & \gamma_{12} \\ \beta_{21} & \beta_{22} & \gamma_{21} & \gamma_{22} \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ \gamma_{21} & 0 \end{pmatrix}$$

(under the full restrictions of the model).
Therefore rank(AØ) = 1 = G − 1, implying the rank condition is satisfied. Thus, for the first equation R > G − 1 and rank(AØ) = G − 1, and the equation is over identified.

Identification of the second equation: here R = 1 = G − 1, so the order condition holds. For the second equation,

$$A\varphi = \begin{pmatrix} \beta_{11} & \beta_{12} & \gamma_{11} & \gamma_{12} \\ \beta_{21} & \beta_{22} & \gamma_{21} & \gamma_{22} \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \gamma_{12} \\ \gamma_{22} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

(under the full restrictions of the model). Hence rank(Aφ) = 0 < G − 1, showing that the rank condition is not satisfied. Thus, for the second equation R = G − 1 but rank(Aφ) < G − 1, and the equation is under identified.

How to make the second equation identified: introduce the additional restriction β21 + γ22 = 0 in addition to γ22 = 0. These restrictions can be represented as α2Ø = 0, where

$$\emptyset = \begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{pmatrix}$$

With this new restriction, R = 2 > G − 1 = 1, hence the order condition holds, and

$$A\emptyset = \begin{pmatrix} \beta_{11} & \beta_{12} & \gamma_{11} & \gamma_{12} \\ \beta_{21} & \beta_{22} & \gamma_{21} & \gamma_{22} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} \gamma_{12} & \beta_{11} + \gamma_{12} \\ \gamma_{22} & \beta_{21} + \gamma_{22} \end{pmatrix} = \begin{pmatrix} 0 & \beta_{11} \\ 0 & 0 \end{pmatrix}$$

(under the full restrictions of the model). Rank(AØ) = 1. Thus R > G − 1 and rank(AØ) = G − 1, showing the equation is identified. Thus, with appropriate restrictions, an under identified equation can be made identified.
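The rank condition rank(AØ) = G − 1 can be checked numerically. Below is a hedged numpy sketch of the appendix example; the coefficient values are arbitrary nonzero numbers, not from the text:

```python
import numpy as np

def identification_status(A, phi, R, G):
    """Order (R >= G-1) and rank (rank(A @ phi) = G-1) conditions under
    general homogeneous linear restrictions alpha_i @ phi = 0."""
    rank = np.linalg.matrix_rank(A @ phi)
    if R < G - 1 or rank < G - 1:
        return "under identified"
    return "just identified" if R == G - 1 else "over identified"

# The appendix example: A = (B Gamma) with gamma11 = gamma12 = gamma22 = 0
b11, b12, b21, b22, g21 = 1.0, -0.5, 1.0, -0.8, 0.3
A = np.array([[b11, b12, 0.0, 0.0],
              [b21, b22, g21, 0.0]])
phi1 = np.array([[0, 0], [0, 0], [1, 0], [0, 1]])   # gamma11 = gamma12 = 0
phi2 = np.array([[0], [0], [0], [1]])               # gamma22 = 0
print(identification_status(A, phi1, R=2, G=2))     # over identified
print(identification_status(A, phi2, R=1, G=2))     # under identified
```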
Notes
1 For a discussion of the OLS method and the necessary assumptions see Wooldridge (2009).
2 See Kmenta (1991) and Johnston (1984) for the structural, reduced and final forms of a general simultaneous equation system consisting of G endogenous variables.
3 See the appendix for a mathematical demonstration of the identification of the demand equation.
12 QUANTITATIVE METHODS FOR QUALITATIVE VARIABLES IN SOCIAL SCIENCE An introduction Ajitava Raychaudhuri
1 Introduction

Social science, unlike physical science, deals with subjects which in many cases are not easily quantifiable. It includes human behaviour, which itself is an integration of psychology, philosophy, rationality, social existence etc. Similarly, a subject like political science deals with certain stylized facts like strategy, security, political culture etc. Another subject, history, is not only a repository of important events of the past; it shapes the present as well through its underlying explanations based on materialistic and ideological conflicts, amongst others. However, many of these events or actions are qualitative in nature and do not show up in terms of quantitative variables. In the course of this chapter, some such examples will be highlighted to establish the fact that quantification of apparently qualitative variables can throw important light on the possible trajectories these variables may take in future.
2 Types of qualitative variables

Many qualitative variables appear as binary variables, to use a technical term. In other words, the event or the aspect one wants to study does not have a quantification property, except that the event can only be categorized as yes or no. Examples of this abound from almost every field of social as well as behavioural science. One may be interested to know what factors prompt people to be smokers. Thus the variable which matters in this case is whether a person is a smoker or not, again a binary occurrence, so that a smoker may respond yes (having an arbitrary quantitative value of 1) and a non-smoker responds no (having an arbitrary quantitative value of 0). A researcher can then subject these responses to some rigorous techniques and try to answer what factors may cause an individual to be more
inclined to smoke; some such factors may well be age, education, family background, health issues, social factors like neighbourhood, type of job etc. Similarly, a political scientist may be interested to know voting behaviour after an election but not sure how to quantify the factors responsible for the voting pattern. Here also the response for a voter may be yes or no regarding the vote cast for either the incumbent candidate or its opposition. Once this is collected, the binary responses may be subjected to rigorous empirical techniques to judge the importance of age, education, income, gender, religion, caste, location, language etc. in shaping the decisions in the voters' minds.

Sometimes the responses may not be binary. Suppose a television channel wants to elicit responses from its viewers about the attractiveness of a certain newly launched TV serial. The problem is that the viewers in this case may not have a clear yes or no response. Instead, they might be able to categorize their responses in an ordered fashion, like excellent, very good, good, average and bad. The researcher will then be interested to know what prompts a viewer to go for very good instead of good or excellent or the other lower order responses. Since these are qualitative responses, it may apparently seem impossible to do a quantitative assessment of what factors may shape such ranking in the minds of the viewers. However, techniques have been devised whereby researchers are able to transform such qualitative rankings into quantitative variables and then try to decipher the underlying factors which might be responsible for such categorizations.
3 Binary dependent variables

Binary dependent variables pose a major challenge for the standard quantitative analysis pursued in social science disciplines like economics. In fact, many qualitative variables have a binary characterization. To take the example cited earlier, if one wants to know the major factors likely to drive a person to smoke, one faces this problem of binary dependent variables. One can undertake a survey in which the respondents identify themselves as smokers or non-smokers; this is an attribute, or qualitative characteristic, of the respondent. The response is binary: if the person is a smoker she is given the number 1, and 0 otherwise. Similarly, if one wants to infer the factors that lead some students to get grade A and others not, one needs to quantify some qualitative attributes of the students who do well. Here the variable to be explained, namely getting grade A, is given the number 1, while grades below are given the number 0. This looks innocuous, but on closer scrutiny one finds several trouble spots in this approach of using binary variables as the variable to be explained. The most important is that the variable to be explained no longer remains a continuous variable. This creates a problem, since the most common technique for finding the impact of explanatory factors on an event is regression. Regression is a very common statistical tool, but it depends on the assumption that the variable to be explained is continuous and can assume all possible values in a range predefined by the researcher. In economics, this is better explained by
regression techniques used in econometrics (the branch of economics concerned with measuring and testing economic hypotheses that stem from underlying theories).
3.1 Linear probability models (LPM)

This technique assumes that a binary response can be explained by a linear relationship between the binary response (the variable to be explained) and some explanatory variables. To take the example mentioned earlier, suppose the dependent variable Yi records whether a student gets grade A or not: a student who achieves grade A is assigned the value 1, and 0 otherwise. Let us assume performance is explained by the number of hours the student studies (X). Then the LPM uses a standard linear regression model of the form

\[ Y_i = \beta_1 + \beta_2 X_i + u_i. \]

This relationship tries to explain Y with the help of X. It is called linear because, ignoring the term ui, it is a straight line with an intercept, and because the coefficients are constant parameters that do not change with variation in X. The subscript i refers to the i-th individual student. The term ui (the error term) is standard in econometrics: however good the explanatory power of X on Y may be, there always remains some unexplained part that X alone cannot capture. This may be due to omitted variables, variables measured incorrectly, or reasons simply unknown. The real trouble starts with this error term. In statistical terminology it is a random variable, with a distinct probability attached to each value it may take, for the simple reason that one cannot be sure what value ui will assume given the components within it. In the standard estimation procedure, ui is taken to be a normal variable, meaning it can take any value, positive or negative, in the range from –∞ to +∞, so that almost all plausible real numbers can appear. Each value occurs with a distinct probability, and if the occurrence of values follows a normal probability distribution, then identical positive and negative values of ui have an equal chance of occurring: the normal distribution is symmetric around 0. A negative error term for an individual student really means that the number of hours of study has less explanatory power for his grade than for an average student; a positive value means the opposite. This assumption of normality cannot be true for the error attached to a binary response variable. The reason is somewhat obvious: ui takes only two values, namely (1 – β1 – β2Xi) or (–β1 – β2Xi), the first when Yi takes the value 1 and the second when Yi takes the value 0 (since ui = Yi – β1 – β2Xi). Thus if p is the probability that Yi takes the value 1 and (1 – p) the probability that it takes the value 0, the probability distribution of ui is confined to these two values with these two probabilities. In statistical terminology this is the two-point distribution generated by a Bernoulli trial, not a normal distribution.
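To see the model in action, here is a minimal sketch of an LPM fitted by ordinary least squares on synthetic data; all variable names and numbers are invented for illustration, and statsmodels is assumed as the estimation library:

# A minimal sketch of a linear probability model (LPM) on synthetic data.
# All variable names and numbers here are illustrative, not from the chapter.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
hours = rng.uniform(0, 12, size=200)          # X: hours of study
p_true = np.clip(0.1 + 0.08 * hours, 0, 1)    # a "true" linear probability
y = rng.binomial(1, p_true)                   # Y: 1 = grade A, 0 = otherwise

X = sm.add_constant(hours)                    # adds the intercept beta_1
lpm = sm.OLS(y, X).fit()
fitted = lpm.predict(X)

print(lpm.params)                             # beta_1 (intercept), beta_2 (slope)
# The LPM's well-known defect: fitted "probabilities" can leave [0, 1].
print("fitted values outside [0, 1]:", ((fitted < 0) | (fitted > 1)).sum())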
Further, standard regression models can be used only if the variance of the error term (which measures the variation of the ui values around their mean) is identical across observations. Unfortunately, given the peculiar nature of the LPM, the variance of ui is not identical: it varies with the values of the explanatory variable. In addition, another problem arises: the estimated values of Yi may well lie outside the interval between the two possible values 0 and 1, which defeats the whole purpose of a binary response model. As a result, researchers resort to other estimation methods to identify the most important determining factors behind such binary responses. One important such model is the logit model; a similar and equally popular method is the probit model. The two models essentially differ in the underlying distribution of the probability attached to the values: the logit model uses a logistic distribution, while the probit model assumes a normal distribution.
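To make the claim about non-constant variance concrete, the standard two-step calculation in the chapter's notation runs as follows:

% Why the LPM error is heteroskedastic: u_i is a two-point random variable.
\[
u_i =
\begin{cases}
1 - \beta_1 - \beta_2 X_i & \text{with probability } p_i,\\
-\beta_1 - \beta_2 X_i    & \text{with probability } 1 - p_i.
\end{cases}
\]
% Imposing E(u_i) = 0 gives p_i = beta_1 + beta_2 X_i, and hence
\[
\operatorname{Var}(u_i) = p_i (1 - p_i) = (\beta_1 + \beta_2 X_i)\,(1 - \beta_1 - \beta_2 X_i),
\]
% which changes with X_i: the error variance cannot be constant.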
3.2 Logit and probit model

The main advantage of the logit model is that it avoids the problems raised in the context of the linear probability model (LPM). The first is the problem of handling binary responses, whose values are dichotomous, namely 1 or 0: one cannot then use a probability distribution for variables that range continuously from –∞ to +∞. Further, standard regression must impose restrictions on the error term in order to obtain consistent estimates; this too poses a problem, and logit avoids it as well. In fact, logit dispenses with the standard regression tool used in the LPM, namely ordinary least squares (OLS). Instead it uses an iterative technique, the maximum likelihood (ML) method, to estimate the relevant parameters: ML seeks the coefficient values (the coefficients measure the degree of influence of the determining variables on the response) that maximize the probability of observing the actual pattern of 1 and 0 responses, given the values of the determining variables. In the logit model (as in the probit model), the variable under study is not only dichotomous but actually reflects some latent variable. For example, one may try to analyse the factors that determine which students score well in examinations. The observed variable may be dichotomous, with those getting grade A or above responding yes and those below responding no, but the latent variable that really drives the response may be the ability of the student, proxied by past scores. Similarly, whether a person will buy a car may again be captured by a binary yes or no response, while the underlying latent variable may be some cut-off utility (or satisfaction) level. The connection between the actual variable and the latent variable may be represented as follows:
\[ y_i^* = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + u_i \]
where yi* is the latent variable (not observed), xij stands for the j-th explanatory factor and i stands for the i-th individual. The actual observed variable is represented by

\[ y_i = \begin{cases} 1 & \text{if } y_i^* > 0,\\ 0 & \text{otherwise.} \end{cases} \]

The difference between the logit and probit models depends entirely on the distribution of the error term ui. If the cumulative distribution of ui is logistic in nature, the resulting model is called logit; if ui has a normal distribution, the resulting model is called probit. The choice between the two rests with the researcher, but generally the results do not differ much between logit and probit models. One must note that in linear probability models the variable considered is the observed one, and no latent variable is involved. To get a better idea of the technique used in the logit model, some technical details are given here. First, one notes that if one observes yi = 1, the latent variable must have a value greater than zero. Following the earlier examples, students must have abilities (whatever those may be) above an average cut-off (to get a good grade), or individuals must have desire above some minimum (to buy a car), so the latent variable takes a positive value once these cut-off values are satisfied. To reiterate, latent variables are mostly qualitative in nature and cannot be easily observed or quantified. In such a case, the following holds:

\[ Y_i = 1 \;\Rightarrow\; Y_i^* > 0 \;\Leftrightarrow\; \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + u_i > 0 \;\Leftrightarrow\; u_i > -\Bigl(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}\Bigr) \]
Therefore, the probability that Yi = 1 is given by

\[ \Pr(Y_i = 1) = \Pr\left[u_i > -\Bigl(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}\Bigr)\right] = 1 - F\left[-\Bigl(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}\Bigr)\right] \]
where F[·] is the cumulative distribution function of the random error term ui.
If we denote

\[ Z_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}, \]

then, provided the distribution of ui is symmetric (as the normal distribution is), we get 1 – F(–Zi) = F(Zi), and as a result the following holds:

\[ P_i = F\Bigl(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}\Bigr) \]
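Since the entire difference between logit and probit lies in the choice of F, a small sketch comparing the two candidate CDFs may help; scipy is assumed for the normal CDF, and the grid of Z values is arbitrary:

# Sketch: the two candidate choices for F(Z) in P_i = F(Z_i).
# Logit uses the logistic CDF, probit the standard normal CDF.
import numpy as np
from scipy.stats import norm
from scipy.special import expit   # expit(z) = e^z / (1 + e^z)

z = np.linspace(-4, 4, 9)
logistic_cdf = expit(z)
normal_cdf = norm.cdf(z)

for zi, l, n in zip(z, logistic_cdf, normal_cdf):
    print(f"Z = {zi:+.1f}   logistic F(Z) = {l:.3f}   normal F(Z) = {n:.3f}")
# Both are symmetric around 0, so 1 - F(-Z) = F(Z), as used in the text;
# the logistic has slightly fatter tails, which is why logit and probit
# coefficients differ in scale but rarely in sign or substance.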
Noting that yi is the realization of a Bernoulli-type process (it takes the value 1 with probability Pi and 0 with probability 1 – Pi), and that Pi varies with the values of the determining explanatory variables xij, the likelihood function is given by the following (Maddala 2002, pp. 322–323):

\[ L = \prod_{y_i = 1} P_i \;\prod_{y_i = 0} (1 - P_i) \]
If ui is distributed cumulatively as logistic, then F(Zi) satisfies

\[ F(Z_i) = \frac{e^{Z_i}}{1 + e^{Z_i}}, \]

which can be inverted to yield the following expression for Zi:

\[ Z_i = \log \frac{F(Z_i)}{1 - F(Z_i)}. \]

Thus the logit model (which can also be derived by assuming that the underlying errors follow the extreme-value, or log-Weibull, distribution) is represented by the following equation:

\[ \log \frac{P_i}{1 - P_i} = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}. \]

This is equivalent to the expression

\[ \frac{P_i}{1 - P_i} = e^{Z_i}, \qquad \text{where } Z_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}. \]
The ratio Pi/(1 – Pi) is known as the odds for the i-th observation. The odds ratio, in turn, compares the odds of the correct (or desirable) response, Pr(Yi = 1), relative to the incorrect (or undesirable) response, Pr(Yi = 0), at two alternative values of an explanatory variable. To make this clearer, ignore the i subscript for the moment (the construction is the same for every i) and suppose an explanatory variable X1 takes the values c and c + k. The odds of Y = 1 at X1 = c are

\[ \mathrm{odds}_1 = \frac{P(Y = 1 \mid X_1 = c)}{1 - P(Y = 1 \mid X_1 = c)}, \]

and similarly, at the other value c + k,

\[ \mathrm{odds}_2 = \frac{P(Y = 1 \mid X_1 = c + k)}{1 - P(Y = 1 \mid X_1 = c + k)}. \]

The odds ratio, OR = odds2/odds1, then captures the change in the odds of a yes response when the single explanatory variable X1 moves from c to c + k.
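In the logit model specifically, the odds ratio takes a closed form, which makes the definition above easy to work with; the numerical values below are purely illustrative:

% From log(P/(1-P)) = beta_0 + beta_1 X_1 (other terms held fixed),
% the odds at X_1 = c and X_1 = c + k differ by the factor e^{beta_1 k}:
\[
\mathrm{OR} \;=\; \frac{\mathrm{odds}_2}{\mathrm{odds}_1}
\;=\; \frac{e^{\beta_0 + \beta_1 (c + k)}}{e^{\beta_0 + \beta_1 c}}
\;=\; e^{\beta_1 k}.
\]
% Illustration with hypothetical values: if beta_1 = 0.5 and k = 1, then
% OR = e^{0.5} is about 1.65, i.e. a one-unit rise in X_1 multiplies the
% odds of a "yes" response by about 1.65, whatever the starting value c.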
3.3 Estimation of logit or probit models

As mentioned earlier, logit or probit models are estimated by maximum likelihood (ML) methods instead of ordinary least squares. Estimation by ML has become quite easy with standard econometric software such as STATA, Eviews and SPSS. One important point, however, concerns the meaning of the coefficient βj, which measures the impact of a change in Xij on Pr(Yi = 1) in the case of probit, or on the log of the odds in the case of logit; the estimated value has the same effect on the latent variable Yi* as well. This creates a problem of interpretation. The reason is simple: as Wooldridge (2006, p. 585) points out, Yi* in general has no well-defined unit of measurement. In some cases it measures the difference in psychic values of two different events or actions, or some other attribute that is qualitative in character. In such cases there is little point in putting too much emphasis on the estimated values of the coefficients βj. To overcome such ambiguities, researchers use the concept of the marginal effect of a variable, which is simply the partial effect of a change in an explanatory variable on Pr(Yi = 1), for both logit and probit regressions. The usefulness of calculating marginal effects is to quantify the impact of a change in one explanatory variable Xij on the probability of a success (a yes response). Suppose yi denotes some employment indicator for the i-th individual surveyed, with 1 meaning 'having a job' and 0 'not having it'; yi then depends on the variables Xij (the value of the j-th explanatory variable for the i-th individual). An explanatory variable can itself be binary: for example, it can take the value 1 if the individual has gone through some job training programme and 0 if not. Such binary explanatory variables are also known as dummy variables and are used to represent qualitative explanatory factors. The marginal effect is then the change in the cumulative probability of having a job from not having the training to having it. Suppose X1 (the j = 1 variable) is this training dummy. The effect can be written as follows:

\[ F(\beta_0 + \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \cdots + \beta_k X_{ik}) - F(\beta_0 + \beta_2 X_{i2} + \beta_3 X_{i3} + \cdots + \beta_k X_{ik}) \]

This expression is the difference in the cumulative probability (see the definition stated earlier) of obtaining a job for a person with the training over one without it. The first term is the cumulative probability given the training and given the values of all other explanatory variables (say age, education, income, past experience etc.); the second is the same probability for an individual with the same characteristics but without the training. If the X variable is not a dummy, the marginal effect instead measures the change in cumulative probability from changing the value of the j-th explanatory variable by one unit, starting from whatever its existing value happens to be (not necessarily 0).
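A minimal sketch of this discrete-change calculation, with hypothetical coefficients and a logistic F (replace expit with the normal CDF for probit):

# Sketch: marginal effect of a dummy (e.g. job training) as the discrete
# change F(xb with dummy = 1) - F(xb with dummy = 0).
# The coefficient values below are hypothetical, purely for illustration.
from scipy.special import expit   # logistic CDF; use scipy.stats.norm.cdf for probit

beta = {"const": -2.0, "training": 0.9, "age": 0.03, "education": 0.15}
person = {"age": 35, "education": 12}      # other characteristics held fixed

xb_base = (beta["const"] + beta["age"] * person["age"]
           + beta["education"] * person["education"])
effect = expit(xb_base + beta["training"]) - expit(xb_base)

print(f"P(job | training) - P(job | no training) = {effect:.3f}")
# Because F is nonlinear, this effect depends on the person's other
# characteristics -- unlike the constant slope of the linear probability model.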
Thus, the slope coefficients in both logit and probit indicate the direction in which the odds ratio or the probability of success moves when an explanatory variable changes; the exact quantitative magnitude of the effect on the probability of success of a change in one variable (keeping the other explanatory variables fixed) is given by the marginal effect. In terms of notation, the marginal effects are as follows.

For logit:

\[ \frac{\partial p_i}{\partial x_{ij}} = \beta_j\, p_i\,(1 - p_i) \]

For probit:

\[ \frac{\partial p_i}{\partial x_{ij}} = f\Bigl(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}\Bigr)\, \beta_j \]

where f(·) denotes the density function corresponding to F.
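These are the quantities that packaged routines evaluate. The following sketch, on synthetic stand-in data with statsmodels, fits a logit by ML and reports average marginal effects in elasticity form, mirroring the kind of output shown in the next subsection; all names and coefficient values are invented:

# Sketch of logit estimation by maximum likelihood plus marginal effects,
# using statsmodels; the data here are synthetic stand-ins.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "age": rng.uniform(30, 70, n),
    "drug": rng.integers(0, 2, n),          # dummy regressor
    "studytime": rng.uniform(1, 40, n),
})
xb = -1.0 + 0.05 * df["age"] - 1.2 * df["drug"] - 0.02 * df["studytime"]
df["died"] = rng.binomial(1, 1 / (1 + np.exp(-xb)))

X = sm.add_constant(df[["studytime", "drug", "age"]])
logit_res = sm.Logit(df["died"], X).fit(disp=False)   # ML, not OLS
print(logit_res.params)                               # coefficients
print(np.exp(logit_res.params))                       # implied odds ratios

# Average marginal effects in elasticity form (ey/ex); dummy regressors
# can instead be handled as discrete changes via the dummy=True option.
print(logit_res.get_margeff(at="overall", method="eyex").summary())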
3.3.1 Example of logit regression

The following logit regression was run in STATA (ver. 13) to study the probability of surviving cancer. The dependent variable died takes the value 1 if the patient dies and 0 if the patient survives. The explanatory variables (sometimes also known as predictors, since they are used for prediction) are as follows: studytime = period available to treat the patient before his/her death; drug = use of the medicine, a dummy variable (1 = drug administered, 0 = placebo); age = age of the patient. The equations estimated are the same as previously discussed. The output (reproduced from the STATA software) is as follows:

A. Logistic regression                         No. of obs = 48
                                               LR chi2(3) = 13.67
                                               Prob > chi2 = 0.0034
Log likelihood = –24.364293                    Pseudo R2 = 0.2191

-------------------------------------------------------------------------
       died |      Coef.  Std. err.      z   P>|z|   [95% conf. interval]
------------+------------------------------------------------------------
  studytime | –0.0236468  0.0457671  –0.52   0.605   –0.1133487  0.0660551
       drug | –1.150009   0.5549529  –2.07   0.038   –2.237697  –0.0623212
        age |  0.0793438  0.0699391   1.13   0.257   –0.0577344  0.2164219
      _cons | –1.113136   3.945369   –0.28   0.778   –8.845918   6.619646
-------------------------------------------------------------------------

margins, grand eyex(drug age studytime)

Average marginal effects                       No. of obs = 48
Model VCE: OIM
Expression: Pr(died), predict()
ey/ex w.r.t.: studytime drug age

-------------------------------------------------------------------------
            |            Delta-method
            |      ey/ex  Std. err.      z   P>|z|   [95% conf. interval]
------------+------------------------------------------------------------
  studytime | –0.1768585  0.35401    –0.50   0.617   –0.8707054  0.5169884
       drug | –0.9822992  0.5370982  –1.83   0.067   –2.034992   0.0703938
        age |  1.522409   1.343293    1.13   0.257   –1.110396   4.155214
-------------------------------------------------------------------------

B. Logistic regression (odds ratios)           No. of obs = 48
                                               LR chi2(3) = 13.67
                                               Prob > chi2 = 0.0034
Log likelihood = –24.364293                    Pseudo R2 = 0.2191

-------------------------------------------------------------------------
       died | Odds ratio  Std. err.      z   P>|z|   [95% conf. interval]
------------+------------------------------------------------------------
  studytime |  0.9766306  0.0446976  –0.52   0.605    0.8928393  1.068286
       drug |  0.316634   0.1757169  –2.07   0.038    0.106704   0.939581
        age |  1.082576   0.0757144   1.13   0.257    0.9439007  1.241626
      _cons |  0.328527   1.296161   –0.28   0.778    0.000144   749.6794
-------------------------------------------------------------------------

The accompanying average marginal effects from margins, grand eyex(drug age studytime) are identical to those reported under case A, since cases A and B are the same fitted model reported in different forms.

C. Probit regression                           No. of obs = 48
                                               LR chi2(3) = 13.87
                                               Prob > chi2 = 0.0031
Log likelihood = –24.263142                    Pseudo R2 = 0.2223

-------------------------------------------------------------------------
       died |      Coef.  Std. err.      z   P>|z|   [95% conf. interval]
------------+------------------------------------------------------------
  studytime | –0.0135064  0.0274798  –0.49   0.623   –0.0673658  0.040353
       drug | –0.7166321  0.3290735  –2.18   0.029   –1.361604  –0.0716598
        age |  0.0458223  0.0410394   1.12   0.264   –0.0346134  0.1262581
      _cons | –0.5295926  2.35364    –0.23   0.822   –5.142642   4.083456
-------------------------------------------------------------------------

margins, grand eyex(studytime drug age)

Average marginal effects                       No. of obs = 48
Model VCE: OIM
Expression: Pr(died), predict()
ey/ex w.r.t.: studytime drug age

-------------------------------------------------------------------------
            |            Delta-method
            |      ey/ex  Std. err.      z   P>|z|   [95% conf. interval]
------------+------------------------------------------------------------
  studytime | –0.1698999  0.360357   –0.47   0.637   –0.8761866  0.5363867
       drug | –1.029713   0.5578239  –1.85   0.065   –2.123028   0.0636013
        age |  1.484033   1.355774    1.09   0.274   –1.173235   4.141301
-------------------------------------------------------------------------

These estimations take three alternative forms: case A is a logit regression reporting coefficient values; case B is the same logit regression reporting odds ratios; and case C is a probit regression, again reporting coefficients. Let us concentrate on cases A and B, since they are basically the same logit regression. In case A, the coefficients estimated are those of the model mentioned earlier, namely

\[ \log \frac{P_i}{1 - P_i} = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}. \]

Since this refers to the i-th individual or observation, the estimation depends on the number of observations included. To make the interpretation clearer, let us drop the i subscript and consider this particular application. Here the logit model looks as follows:

\[ \log \frac{p}{1 - p} = \beta_0 + \beta_1\,\mathrm{studytime} + \beta_2\,\mathrm{drug} + \beta_3\,\mathrm{age} + e \]
where e is the error term, capturing omitted variables and any measurement error. The first term is the constant, and β1, β2 and β3 are the coefficients of the logit regression as in case A. First, consider the meaning of the coefficients. Here studytime and age are continuous variables which may take any non-negative value, while drug is a binary dummy variable (1 = drug administered, 0 = placebo). The main purpose of the study is to predict the survival chance of a cancer patient as the explanatory variables vary, so the final target for the researcher is to find out whether the probability increases or decreases.
For the variable drug, one asks whether receiving the drug influences the probability of survival. If the drug variable changes from 0 to 1, keeping the other explanatory variables studytime and age constant, the probability of death (denoted pi here) relative to that of survival (denoted 1 – pi) declines. The odds pi/(1 – pi) are informative here: a value below 1 means survival is more probable than death. The odds ratio (OR) is the more focused summary in this context. Without delving too much into mathematical niceties, it is sufficient to say that the OR for drug, keeping all other variables constant, is what is given in case B, approximately 0.32. In other words,

\[ \mathrm{OR} = \frac{\text{odds when drug} = 1}{\text{odds when drug} = 0} = 0.32, \]

which implies that use of the drug reduces the odds of death to roughly one-third of the odds without it. The relation between cases A and B is given by e^{β2} = OR, where β2 is the coefficient of drug estimated in case A. It should be noted that OR < 1 implies the event of death has lower odds than under the alternative. Age has the opposite impact: if age increases by one year, the odds ratio turns out to be 1.08, and OR > 1 implies the odds of death increase by about 8 per cent per additional year.

The marginal effects mentioned earlier show how the probability of the response 1 (which in this case means the patient died) changes with the explanatory variables. They are calculated here in elasticity form, so they show what percentage change in the probability of a 1 response follows from a one per cent change in an explanatory variable. For dummy variables, however, the change has to be one unit, since the value can only move from 0 to 1. The sign of a coefficient determines the sign of the corresponding marginal effect, and in either of cases A or B the marginal-effect values are the same. The results are intuitive once the coefficients and odds ratios are estimated: the probability of death decreases with studytime and drug but increases with age, and the signs of the coefficients in case A match the signs of the marginal effects, as the expressions for marginal effects given earlier imply.
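The link between cases A and B can be verified directly from the reported coefficients; the values below are taken from the output above:

# Checking the link between case A (coefficients) and case B (odds ratios)
# using the values reported in the output above.
import numpy as np

beta = {"studytime": -0.0236468, "drug": -1.150009, "age": 0.0793438}
for name, b in beta.items():
    print(f"{name:10s}  coef = {b:+.4f}   odds ratio = exp(coef) = {np.exp(b):.4f}")
# drug: exp(-1.1500) is about 0.3166, matching the 0.316634 reported in
# case B -- the odds of death with the drug are about one-third of the
# odds without it.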
4 Ordered logit and probit

So far, the discussion has centred on explaining the causal factors behind the probability of occurrence of binary or dichotomous variables with only two possible responses, yes or no. In many cases, however, especially in surveys based on perception, respondents cannot give so exact an answer. Examples of this kind are plentiful (the TV viewership survey already mentioned in the introduction is one), and electoral politics offers another. In a study of why a certain candidate is chosen for voting, respondents may answer on several traits of the candidate or her supporting political party (if any). Thus one voter may value the candidate's honesty most of all, another her ideology, and yet another her support for charity for the poor. Now suppose voter 1 values honesty the most, while voter 2
also values honesty, but not as strongly; similarly, voter 1 may also appreciate the support for charity, but with much lower preference. In such cases, respondents are asked to rank or order their preferences over certain characteristics of a candidate on a 5-point scale (also known as a Likert scale): 5 stands for most important, 4 for important, 3 for neutral, 2 for not important and 1 for least important. As a result, the variable to be explained becomes an ordered variable, with a decreasing rank from 5 to 1, and the researcher wants to know the factors that may influence these responses. Here again the latent variable is a very important concept. The voter who gives a rank of 5 may have some unobservable psychological factor that pushes her to choose 5 instead of, say, 3: the latent variable must lie within some unobservable range of values that determines the observed rank revealed by the voter. Interestingly, the voter herself may be unable to explain in clear terms why she has given such ranks. In these cases, ordered logit and ordered probit are the measurement techniques that estimate the influence of explanatory factors on the ordering behaviour of individuals. In a study by Raychaudhuri and De (2012), the authors reported on a survey of patients who came to India during 2008 and 2009 to avail of medical facilities there. The survey questionnaire asked the respondents about their perception of some specific barriers, described as follows:

1 visa problem;
2 language problem;
3 problem of getting the right information;
4 problem of getting rail tickets for travelling within India;
5 problem arising from the non-existence of an integrated agency.
The respondents were asked to rate each category of problem mentioned earlier on a scale of 1 to 5, where 1 denotes least problematic and 5 denotes extremely problematic. The levels 1 to 5, which express the degree of the problem, are called the orders of the categorical variable; in other words, it is not only a categorical variable but an ordered categorical variable. Using this information, the authors estimated an ordered probit regression to answer the following questions: (a) what are the most important determinants of each problem category mentioned earlier, and (b) what are the marginal effects of each determinant for each response order within any one problem category. To clarify further with an example: for the problem category visa, the authors regressed visa as a dependent variable on the following explanatory variables: age, level of education, annual household income, amount of own resources available to the patient for travel, and waiting time in India to get the treatment. Here only the marginal effects, expressed in elasticity form, for two probability values are reported: the lowest order value (that is, 1) and the highest order value (5). The ordered probit model is shown here:
Let yi be the dependent variable (any of the five categorical variables mentioned) and Xm the m-th independent or explanatory variable, where m = 1 is age, m = 2 education, m = 3 income, m = 4 own resources and m = 5 waiting time; i stands for the i-th individual, and xi is the vector of independent variables. Following Greene (2003), the regression looks like

\[ y_i^* = \beta' x_i + \varepsilon_i, \]

where yi* is a latent variable whose observed outcome yi follows the ordering

\[ y_i = \begin{cases} 1 & \text{if } y_i^* \le \gamma_1,\\ 2 & \text{if } \gamma_1 < y_i^* \le \gamma_2,\\ \;\vdots \\ 5 & \text{if } y_i^* > \gamma_4. \end{cases} \]

One must note that the latent variable drives the response that is recorded: the psychological latent variable lies within a specific range (bounded by cut-off values) that pushes the individual respondent to choose a specific order for the categorical variable. Using this, the probability of each ordering and the log-likelihood function are constructed. Maximizing the log-likelihood, one can estimate the marginal effects in elasticity form for the various probability orders with respect to each of the independent explanatory variables (also called regressors). We do not report the log-likelihood estimates; instead, we report the marginal effects in elasticity form for the lowest and highest probability orders for each categorical variable, rather than all five probability orderings. All elasticities are calculated at the mean values of the explanatory variables concerned. Table 12.1 summarizes the result of our calculations. The sign of a coefficient shows whether the relative frequencies in the two tails of the underlying normal distribution are increasing or decreasing; in other words, the distribution of responses shifts when the value of any one explanatory variable changes. Usually, a positive sign in the left tail is matched with a negative sign in the right tail. A negative sign in either tail implies fewer predicted observations at the highest or lowest rank in a category, meaning fewer people consider the category a barrier at that rank; the opposite holds for a positive sign. Opposite movements in the two tails thus imply a change in perception towards increased or decreased difficulty in cross-border trade in health services. The magnitude of a coefficient shows the degree of responsiveness of the prediction to a change in the explanatory variable.
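As a rough guide to how such a model is estimated in practice, here is a minimal sketch using the OrderedModel class from statsmodels on synthetic stand-in data; it is not the authors' code, and the coefficients and cut-offs are invented:

# Sketch of an ordered probit in the spirit of the study described above,
# using statsmodels' OrderedModel on synthetic data (not the authors' data).
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(2)
n = 300
X = pd.DataFrame({
    "age": rng.uniform(20, 70, n),
    "education": rng.integers(0, 18, n),
    "log_income": rng.normal(10, 0.5, n),
})
# An invented latent scale plus standard normal noise (the probit assumption).
latent = (0.02 * (X["age"] - 45) - 0.10 * (X["education"] - 9)
          - 0.5 * (X["log_income"] - 10) + rng.standard_normal(n))
# Observed 1-5 rating produced by arbitrary cut-off values (the gammas).
cuts = [-1.5, -0.5, 0.5, 1.5]
rating = pd.Series(np.digitize(latent, cuts) + 1).astype(int)

model = OrderedModel(rating, X, distr="probit")   # distr="logit" for ordered logit
res = model.fit(method="bfgs", disp=False)
print(res.summary())                              # slopes plus estimated cut-offs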
TABLE 12.1 Elasticities of change in lowest and highest probabilities (the marginal effects)

               Visa                             Language                          Information
Regressor      Pr(1)            Pr(5)           Pr(1)             Pr(5)           Pr(1)            Pr(5)
Age            –0.66 (–1.45)    0.29 (1.63)     0.41 (1.90)       –0.68 (–1.48)   0.05 (0.24)      0.53 (0.78)
Education      0.37* (1.79)     –0.17** (–2.08) –0.06 (–0.60)     0.34 (1.56)     0.02 (0.21)      0.13 (0.42)
Income         0.02 (0.22)      –0.01 (–0.20)   0.08 (1.26)       –0.22 (–1.36)   0.15** (1.99)    –0.40 (–1.41)
Own Resource   0.14* (1.72)     –0.03 (–0.87)   –0.002 (–0.05)    0.09 (1.09)     0.02 (0.40)      0.10 (0.94)
Waiting Time   –0.23 (–1.58)    0.10 (1.75)     –0.24*** (–3.62)  0.40*** (2.74)  –0.08 (–1.17)    0.20 (1.01)

               Rail Ticket                        Agency
Regressor      Pr(1)             Pr(5)            Pr(1)            Pr(5)
Age            –1.34 (–1.08)     0.04 (0.82)      0.23 (0.55)      0.10 (0.33)
Education      0.93** (1.98)     –0.03*** (–1.71) –0.31 (–1.39)    0.23 (1.58)
Income         –0.14 (–0.45)     0.004 (0.36)     0.36*** (3.18)   –0.35*** (–2.77)
Own Resource   0.35** (2.47)     –0.01 (–1.59)    0.17** (1.95)    0.05 (0.66)
Waiting Time   –0.93*** (–1.76)  0.03** (2.07)    0.03 (0.25)      0.03 (0.37)

Source: Raychaudhuri and De (2012).
Notes: a) Pr(1) and Pr(5) are elasticities of change in the probabilities of the variable being ordered at rank 1 (lowest) and 5 (highest) respectively, all calculated at the mean values of the respective regressors. b) Values in brackets are z values. c) ***, ** and * denote significance at the 1, 5 and 10 per cent levels respectively.
The findings from the ordered probit regression can be summarized as follows. First, the age of patients is insignificant as a predictor of the formation of perceptions regarding every category of barrier mentioned earlier. Second, a better level of education significantly reduces patients' perceived barriers regarding obtaining a visa and procuring rail tickets to travel within India. One should remember that rail travel is crucial for patients coming from across the border, since they need to travel to distant hospitals within India to avail of better and more appropriate facilities. Third, higher income reduces the perception of both information and the absence of an integrated agency as barriers to India's health services exports. This is important because relatively poor patients feel a greater need for more information and for an agency that takes care of cross-border issues in availing health services; both are understood to be responsibilities of the governments of the countries involved in trade in health services. Thus less wealthy people express more dependence on government help than their relatively well-off counterparts. Fourth, patients who command more resources report fewer problems regarding visas, rail tickets and the absence of an integrated facilitating agency; this is expected and is consistent with the other findings. Fifth, longer waiting time for hospital services creates an adverse perception in the minds of patients who do not have command over local languages (and do not enjoy the services of local interpreters) or who could not manage rail tickets easily. Thus language is a big problem for cross-border health service exports in the case of India, and the problem with rail tickets points to the need for either better rail connectivity or alternative modes of affordable transport that reduce waiting time (a type of transaction cost) for the patients.
5 Conclusions

The discussion in this chapter gives an introductory idea of the methodology used to quantify mainly qualitative variables. This is particularly relevant in the social and behavioural sciences, where a researcher may want to predict some future outcome of a qualitative variable on the basis of other determining factors. These may be health-related issues, such as survival chances in a critical illness or the prevalence of smoking in a particular region; one may be interested in predicting electoral politics in terms of voters' behaviour, or in understanding the benefit of membership in a trade union. In this chapter, the results of a study of barriers to health tourism to India from neighbouring countries, based on perception surveys, were elaborated. The major advantage of techniques like logit and probit (or, in the case of ordered data, ordered probit) is their predictive power. In much social and behavioural science research, descriptive analysis gets primacy, since it is always difficult to quantify apparently qualitative responses; the analysis here gives an idea of how that limitation can be overcome to strengthen the analysis. Although the methods discussed here have some underlying technical
details, only a synoptic idea of them is needed to read and understand the output generated by user-friendly statistical software.
References

Greene, W. H. 2003. Econometric Analysis, 5th Edition. Pearson Education: New Delhi.
Maddala, G. S. 2002. Introduction to Econometrics, 3rd Edition. Wiley India: New Delhi.
Raychaudhuri, A. and P. De. 2012. International Trade in Services in India. Oxford University Press: New Delhi.
Wooldridge, J. M. 2006. Introductory Econometrics: A Modern Approach, India Edition. Thomson South-Western: New Delhi.