Genomic Diversity in People of India: Focus on mtDNA and Y-Chromosome polymorphism 9811601623, 9789811601620

This book is the output of Anthropological Survey of India's National Project "DNA Polymorphism of Contemporar


English Pages 468 [460] Year 2021


Table of contents:
Foreword
Contributors
Acknowledgement
Contents
Abbreviations
List of Figures
List of Tables
Chapter 1: Introduction
People of India
Ethnic and Linguistic Diversity
Tribal Populations
The Anthropological Survey of India's Initiative
References
Chapter 2: Mitochondrial DNA Phylogeny in Indian Population
MtDNA M-Macrohaplogroup
M-Haplogroups in North East India
East Asian M-Haplogroups in India
Age Estimates
Detailed mtDNA Genotypes and Peopling of Andaman Island
Link Between India to Australia
References
Chapter 3: Mitochondrial DNA Phylogeny of N-Haplogroup in Indian Population
Haplogroup U
References
Chapter 4: mtDNA 9-bp INDEL Polymorphism (np 8272-8280) Among Indian Population
References
Chapter 5: Population Diversity and Molecular Diversity Indices Based on mtDNA Among Indian Population
Macrohaplogroup M Lineages in North-East India
Origin of Macrohaplogroup M (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2009)
Migration Routes of Modern Human (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2009)
Evolutionary Antiquity of mtDNA Lineage, M2 Among Indian Populations (Part of the Data Has Been Published in the Article from Elsewhere: Kumar et al. 2008)
Genetic Link Between Indians and Australian Aborigines
Genetic Link Between Indians and Andaman Islands Aborigines (Part of the Data Has Been Published in the Article from Elsewhere: Barik et al. 2008)
Molecular Diversity Indices
References
Chapter 6: Y-Chromosome Phylogeny in Indian Population
Multi Dimension Analysis for Y-Chromosome SNP
YAP (Y-Chromosome Alu Polymorphism) Insertion in Indian Samples (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2007)
Evolutionary Implications
References
Chapter 7: Genomic Diversity of 75 Communities in India
Alu Kurumba
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Andh
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Angami Naga
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Betta Kuruba
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Bharia
Paternal Lineage (Y Chromosomal Haplogroups)
Bhoi Khasi
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Bhoksa
Paternal Lineage (Y Chromosomal Haplogroups)
Bhotia
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Bondo
Paternal Lineage (Y Chromosomal Haplogroups)
Chenchu
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Damor
Paternal Lineage (Y Chromosomal Haplogroups)
Dhodia
Paternal Lineage (Y Chromosomal Haplogroups)
Dirang Monpa
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Dungri Bhil
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Gadia Lohar
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Galong
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Garhwali Brahmin
Paternal Lineage (Y Chromosomal Haplogroups)
Garhwali Rajput
Paternal Lineage (Y Chromosomal Haplogroups)
Ghorkha
Paternal Lineage (Y Chromosomal Haplogroups)
Kolam
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Hmar
Paternal Lineage (Y Chromosomal Haplogroups)
Irular
Paternal Lineage (Y Chromosome Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Jarawa
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Jaunsari
Paternal Lineage (Y Chromosomal Haplogroups)
Jenu Kuruba
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Ka Thakur
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kamar
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kanikkar
Paternal Lineage (Y Chromosomal Haplogroups)
Karen
Paternal Lineage (Y Chromosomal Haplogroups)
Kathodi
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Katkari
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kattunayakan
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kutia Kondh
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Konda Reddis
Paternal Lineage (Y Chromosomal Haplogroups)
Koraga
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Korku
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kota
Paternal Lineage (Y Chromosomal Haplogroups)
Koya
Paternal Lineage (Y Chromosomal Haplogroups)
Lachungpa
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Lepcha
Paternal Lineage (Y Chromosomal)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Ma Thakur
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Madia
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Mal Paharia
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Mara
Paternal Lineage (Y Chromosomal Haplogroups)
Mathur
Paternal Lineage (Y Chromosomal Haplogroups)
Mina
Paternal Lineage (Y Chromosomal Haplogroups)
Melacheri
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Mizo
Paternal Lineage (Y Chromosomal Haplogroups)
Mullu Kurumba
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Munda
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Nayaka
Paternal Lineage (Y Chromosomal Haplogroups)
Nicobarese
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Nihal
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Nishi
Paternal Lineage (Y Chromosomal Haplogroups)
Padhar
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Paite
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Paniyan
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Pauri Bhuinya
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Porja
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Rabha
Paternal Lineage (Y Chromosomal Haplogroups)
Raji
Paternal Lineage (Y Chromosomal Haplogroups)
Saharia
Paternal Lineage (Y Chromosomal Haplogroups)
Savaras
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Sherdukpen
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Soliga
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Sonowal Kachari
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Tai Ahom
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Tai Khampti
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Tharu
Paternal Lineage (Y Chromosomal Haplogroups)
Toda
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Toto
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Urali Kuruman
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Wancho
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Yanadi
Paternal Lineage (Y Chromosomal Haplogroups)
Yerukulas
Paternal Lineage (Y Chromosomal Haplogroups)
References
Index

Anthropological Survey of India

Genomic Diversity in People of India: Focus on mtDNA and Y-Chromosome Polymorphism


Anthropological Survey of India, Ministry of Culture, Government of India, Kolkata, West Bengal, India

ISBN 978-981-16-0162-0
ISBN 978-981-16-0163-7 (eBook)
https://doi.org/10.1007/978-981-16-0163-7

© Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Foreword

It is a great privilege for me to write this foreword for the book entitled Genomic Diversity in People of India: Focus on mtDNA and Y-Chromosome Polymorphism, which is an outcome of the Tenth and Eleventh Five Year Plan project known as ‘DNA Polymorphism of Contemporary Indian Populations’, undertaken by the Anthropological Survey of India (AnSI). Soon after the completion of the mammoth project, ‘People of India’, the AnSI focused on the genetic diversity of Indian populations. The People of India project generated enormous data on the language, society, culture and economy of the communities of India. It unearthed regional identities and variations of 4635 communities of India, which show themselves in language, religion, lifestyle and material culture. The work also highlighted internal migrations in India. This led AnSI to take up the project ‘DNA Polymorphism of Contemporary Indian Populations’ to trace the ancient successive human migrations and the peopling of India, and to understand variations in the Indian gene pool.

Soon after the expansion of molecular biology in the 1950s, it became evident that by comparing the proteins and nucleic acids of one species with those of another, one could expect to obtain a quantitative and objective estimate of the ‘genetic distance’ between species. Until then, there was no common yardstick for measuring the degree of genetic difference among species. Serological comparisons revealed the phylogenetic relations of man, chimpanzee, gorilla, orang-utan and gibbon. In 1981, Anderson et al. published the complete sequence of the 16,569-base-pair human mitochondrial genome. The genes for the 12S and 16S rRNAs, 22 tRNAs, cytochrome c oxidase subunits I, II and III, ATPase subunit 6, cytochrome b and 8 other predicted protein-coding genes were located.
The sequence shows extreme economy in that the genes have no or only a few noncoding bases between them, and in many cases the termination codons are not coded in the DNA but are created post-transcriptionally by polyadenylation of the mRNAs. The high rate of evolution of mitochondrial DNA makes this molecule suitable for genealogical research on such closely related species as humans and apes. Genealogical analysis of the sequence differences supports the view that the human lineage branched off only slightly before the gorilla and chimpanzee lineages diverged and strengthens the hypothesis that humans are more closely related to gorillas and chimpanzees than to the orang-utan.

This is one of two paths: that of those who worked far above the species level and were concerned with genealogical trees, timescales and the accumulation of new mutations on surviving molecular lineages. The other path is that of those who worked at and below the species level and were concerned mainly with population structure, migration and the frequencies of alleles that existed in an ancestral population. The fusion of these paths is made possible by the high rate at which mutations accumulate on mtDNA lineages and by this molecule’s uniparental and apparently haploid mode of inheritance. These properties make mtDNA a superb tool for building trees and timescales relating molecular lineages at and below the species level. In addition, owing to its mode of inheritance, mtDNA is more sensitive than nuclear genes to bottlenecks in population size and to population subdivision. Joint comparative studies of variability in both mtDNA and the non-recombining region of the Y chromosome (NRY) give us valuable insights into how effective population size has varied through time.

In 1987, Rebecca L. Cann, Mark Stoneking and Allan C. Wilson analysed mitochondrial DNAs from 147 people drawn from five geographic populations. All these mitochondrial DNAs stem from one woman who is postulated to have lived about 200,000 years ago, probably in Africa. All the populations examined except the African population have multiple origins, implying that each area was colonized repeatedly. Thus, studies on mtDNA, the Y chromosome and many autosomal regions support the view that modern humans originated in Africa between 166,000 and 249,000 years ago and then expanded throughout Africa and into the rest of the world, with little or no interbreeding between modern humans and the archaic populations that lived elsewhere in the Old World, including the Neanderthals in Europe and Homo erectus in Asia.
The initial dispersal of modern humans out of East Africa, by way of North and East Africa, has now been documented by following the African mtDNA haplogroups into Saudi Arabia and then into western India. The Indian subcontinent, owing to its geographical location and ecological conditions, played a pivotal role in the initial dispersal of modern humans. In light of the above, and in the process of adopting new methods and techniques, the Anthropological Survey of India, a premier Government of India institute for anthropological research, acquired world-class DNA sequencing technology during 2004–2005. To elucidate the ‘southern route’ hypothesis of the migration of anatomically modern humans, to construct maternal and paternal phylogenies and lineages, and to trace the prehistoric dispersals of modern humans in the Indian subcontinent and the role of India in peopling the world, a diverse set of 2124 Y-chromosome and mitochondrial genomes was sampled from 75 tribal and nontribal populations, analysed and presented in this book. The book is organized in seven chapters. It is incredible to learn that the population of India, more than 130 crore, is organized into 4635 communities and comprises 102 maternal lineages and 35 paternal lineages. I believe this book is one of the best endeavours of the physical anthropology division of the Anthropological Survey of India, and that it is of immense importance and interest not only to anthropologists but also to all those concerned with questions like ‘who are we and where have we come from’.

Anthropological Survey of India
Kolkata, India

Vinay Kumar Srivastava

Contributors

The contributors of the DNA Polymorphism Consortium are present and former employees of the Anthropological Survey of India (AnSI), Ministry of Culture, Government of India, posted in the different Regional Centres of AnSI, viz. Head Office, Kolkata; Andaman and Nicobar Regional Centre, Port Blair; Southern Regional Centre, Mysore; Central Regional Centre, Nagpur; Eastern Regional Centre, Kolkata; North-East Regional Centre, Shillong; North-West Regional Centre, Dehradun; Western Regional Centre, Udaipur; Sub-Regional Centre, Jagdalpur; Ranchi Field Station; Sagar Field Station; and Vizag Field Station.

Index for Contribution
Δ = Collection of Blood Samples and Conducted Field Investigation
£ = Laboratory Analysis
β = Conducted Bioinformatics Research
Φ = Report Writing
★ = Final Manuscript Preparation

The names of the contributors are placed in alphabetical order. The index is placed after the contributor’s name.

A. Chandrasekar Δ£βΦ
A. K. Sahani Δ
Bandana Das Δ£
B. N. Sarkar Δ£Φ
B. P. Urade Δ£βΦ
B. R. Bhatnagar Δ£
B. V. Ravi Prasad Δ£βΦ
Chitta Ranjan Mandal Δ£
Charles Sylvester Δ£
Deimaphishisha Sun Δ£
D. Xaviour Δ£
Dipak Kumar Adak Δ£Φ
Debasish Basu Δ
Dipesh Chowdhury Δ
G. R. Lakshmi Δ
Harashawaradhana Δ£Φ
I. Arjun Rao Δ£β
J. S. Jaya Shankar Rao Δ£βΦ
J. Sreenath Δ£
Kiran Uttaravali Δ£
Koel Mukherjee Δ£βΦ
Mithun Sikdar Δ£βΦ
Murali Kotal Δ£
N. K. Das Δ
Nandini Bhattacharya Φ★
P. Aditi Mukherjee Δ
Partha Dhar Δ£
P. Mangalakshmi Δ£
P. B. S. V. Padmanabham Δ£
Pinuma Barua Δ£
Pradyut Gangopadhyay Δ£
R. Th. Varte Δ
Reddy N. Naidu £
Ramesh Sahani Δ£
S. Yaseen Saheb Δ£
S. S. Bandopadhyay Δ
S. S. Barik Δ£
S. S. Gajbhiye Δ£
Shampa Gangopadhyay Δ£
Satish Kumar Δ£
Saumitra Barua Δ£
Shiv Kumar Patel Δ£β
Sikha Chatterjee Δ
Subhra Bhattacharya Δ£Φ★
Sujit Mallick Δ£
Sujitlal Bhakta ★
Venugopal P.N Δ£βΦ★
Wanpli C. Synnah ★
Yumnam Momo Singh Δ£

Acknowledgement

We, the contributors to this book, express our utmost gratitude to the Anthropological Survey of India, Department of Culture, Ministry of Culture, Government of India, for providing the opportunity to study the “Genomic Diversity in People of India”. The project was initiated by the then Director of the Anthropological Survey of India, Prof. V.R. Rao, and was spearheaded by other former Directors, namely Prof. K. K. Basa, Prof. K. K. Misra, Shri. G. S. Rautela, Prof. J. Sengupta and former Director-in-charge Dr. M. Sasikumar. We are thankful to Prof. Vinay Kumar Srivastava, Director, Dr. M. Sasikumar, Joint Director, and Dr. Umesh Kumar, Senior Ecologist, Anthropological Survey of India, for providing us with continued motivation, inspiration and logistic support for the smooth initiation and completion of this book.

Mere words fail to express our gratitude to all the former and present senior officials of the Anthropological Survey of India for extending their cooperation and help in carrying out the investigative study, particularly Dr. Madhu Bala Sharma, Dr. Suresh Patil, Dr. B. Francis Kulirani, Dr. C. R. Sathyanarayanan, Dr. Kakali Chakrabarty and Dr. Vinod Kaul. We also gratefully acknowledge the immense help and cooperation received from the various State Governments, District Administrations, Department of Health and Family Welfare, and Block Development Officers of the areas in which the fieldwork was conducted. Further, we would be failing in our duty if we did not acknowledge the cooperation extended by all the administrative staff of the Anthropological Survey of India.

Last but not least, the Anthropological Survey of India acknowledges and dedicates this book to the People of India who voluntarily participated in the study, helping the Ministry of Culture create a repository of knowledge and data that is an asset to the country and will fuel further research and investigative studies into the genomic diversity of India.


Contents

1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . People of India . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ethnic and Linguistic Diversity . . . . . . . . . . . . . . . . . . . . . . . . . Tribal Populations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Anthropological Survey of India’s Initiative . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . .

1 1 1 1 2 8

2

Mitochondrial DNA Phylogeny in Indian Population . . . . . . . MtDNA M-Macrohaplogroup . . . . . . . . . . . . . . . . . . . . . . . . . . M-Haplogroups in North East India . . . . . . . . . . . . . . . . . . . . . East Asian M-Haplogroups in India . . . . . . . . . . . . . . . . . . . . . Age Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Detailed mtDNA Genotypes and Peopling of Andaman Island . . Link Between India to Australia . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . .

11 11 29 29 35 35 38 80

3

Mitochondrial DNA Phylogeny of N-Haplogroup in Indian Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 Haplogroup U . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

4

mtDNA 9-bp INDEL Polymorphism (np 8272–8280) Among Indian Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

5

Population Diversity and Molecular Diversity Indices Based on mtDNA Among Indian Population . . . . . . . . . . . . . Macrohaplogroup M Lineages in North-East India . . . . . . . . . . . Origin of Macrohaplogroup M (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2009) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Migration Routes of Modern Human (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2009) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evolutionary Antiquity of mtDNA Lineage, M2 Among Indian Populations (Part of the Data Has Been Published in the Article from Elsewhere: Kumar et al. 2008) . . . . . . . . . . . . . . . . . . . . . Genetic Link Between Indians and Australian Aborigines . . . . . .

. 121 . 122

. 122

. 123

. 123 . 124 xiii

xiv

Contents

Genetic Link Between Indians and Andaman Islands Aborigines (Part of the Data Has Been Published in the Article from Elsewhere: Barik et al. 2008) . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 Molecular Diversity Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 6

7

Y-Chromosome Phylogeny in Indian Population . . . . . . . . . . Multi Dimension Analysis for Y-Chromosome SNP . . . . . . . . . . YAP (Y-Chromosome Alu Polymorphism) Insertion in Indian Samples (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2007) . . . . . . . . . . . Evolutionary Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. 145 . 151

Genomic Diversity of 75 Communities in India . . . . . . . . . . . Alu Kurumba . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Andh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Angami Naga . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Betta Kuruba . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bharia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Bhoi Khasi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bhoksa . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Bhotia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . 
Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bondo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Chenchu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. 162 . 164 . 168 171 171 172 172 173 175 175 177 179 181 181 183 183 186 187 187 190 191 191 192 193 194 195 197 198 198 199 199 200 203 204 204 205 205

Contents

xv

Damor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Dhodia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Dirang Monpa . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dungri Bhil . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gadia Lohar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Galong . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Garwali Brahmin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Garhwali Rajput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Ghorkha . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Kolam . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hmar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Irular . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosome Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jarawa . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jaunsari . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Jenu Kuruba . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Paternal Lineage (Y Chromosomal Haplogroups) . . . . . . . . . . Maternal Lineage (mtDNA Haplogroups) . . . . . . . . . . . . . . . Molecular Diversity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ka Thakur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

205 206 209 209 210 210 210 211 213 215 215 216 218 220 221 221 221 225 225 226 227 227 228 231 231 232 232 233 233 235 237 237 238 239 239 240 241 243 243 243 244 245 245 248 248 250 252

xvi

Contents

Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kamar
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kanikkar
Paternal Lineage (Y Chromosomal Haplogroups)
Karen
Paternal Lineage (Y Chromosomal Haplogroups)
Kathodi
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Katkari
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kattunayakan
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kutia Kondh
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Konda Reddis
Paternal Lineage (Y Chromosomal Haplogroups)
Koraga
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Korku
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Kota
Paternal Lineage (Y Chromosomal Haplogroups)
Koya
Paternal Lineage (Y Chromosomal Haplogroups)
Lachungpa
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Lepcha
Paternal Lineage (Y Chromosomal)

Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Ma Thakur
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Madia
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Mal Paharia
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Mara
Paternal Lineage (Y Chromosomal Haplogroups)
Mathur
Paternal Lineage (Y Chromosomal Haplogroups)
Mina
Paternal Lineage (Y Chromosomal Haplogroups)
Melacheri
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Mizo
Paternal Lineage (Y Chromosomal Haplogroups)
Mullu Kurumba
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Munda
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Nayaka
Paternal Lineage (Y Chromosomal Haplogroups)
Nicobarese
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Nihal
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Nishi
Paternal Lineage (Y Chromosomal Haplogroups)
Padhar
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)

Molecular Diversity
Paite
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Paniyan
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Pauri Bhuinya
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Porja
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Rabha
Paternal Lineage (Y Chromosomal Haplogroups)
Raji
Paternal Lineage (Y Chromosomal Haplogroups)
Saharia
Paternal Lineage (Y Chromosomal Haplogroups)
Savaras
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Sherdukpen
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Soliga
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Sonowal Kachari
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Tai Ahom
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Tai Khampti
Paternal Lineage (Y Chromosomal Haplogroups)

Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Tharu
Paternal Lineage (Y Chromosomal Haplogroups)
Toda
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Toto
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Urali Kuruman
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Wancho
Paternal Lineage (Y Chromosomal Haplogroups)
Maternal Lineage (mtDNA Haplogroups)
Molecular Diversity
Yanadi
Paternal Lineage (Y Chromosomal Haplogroups)
Yerukulas
Paternal Lineage (Y Chromosomal Haplogroups)
References

Index

Abbreviations

AA: Austro-Asiatic
AIAS: All India Anthropometric Survey
AIBAS: All India Bio Anthropological Survey
AMOVA: Analysis of Molecular Variance
AnSI: Anthropological Survey of India
bp: Base pair
DNA: Deoxyribonucleic acid
DR: Dravidian
FST: F statistics (AMOVA)
GSIP: Genetic structure of Indian population
HP: Haptoglobin
HVR: Hypervariable region
IE: Indo-European
INDEL: Insertion/deletion
ISEA: Islands South East Asia
ky: Thousand years
kya: Thousand years ago
kyBP: Thousand years before present
LGM: Last glacial maximum
MCMC: Markov Chain Monte Carlo
MPD: Mean pairwise differences
mtDNA: Mitochondrial DNA
np: Nucleotide position
NSIP: Nutritional Status of Indian Population
p: Probability
PCR: Polymerase chain reaction
PGM: Phosphoglucomutase
POI: People of India
PVTG: Particularly Vulnerable tribal groups
rCRS: Revised Cambridge Reference Sequence
SD: Standard deviation
SNP: Single nucleotide polymorphism
TB: Tibeto-Burman
TMRCA: The most recent common ancestor
tRNA: Transfer ribonucleic acid
yBP: Years before present

List of Figures

Fig. 1.1 Map showing the geographical distribution of the 75 studied populations
Fig. 2.1 Population-wise distribution of M- and N-haplogroups
Fig. 2.2 Distribution of mtDNA macro-haplogroups M and N in India
Fig. 2.3 M2a and M2b haplogroup phylogenetic tree along with communities
Fig. 2.4 M2 (M2a) haplogroup detailed phylogenetic tree
Fig. 2.5 M2 (M2b) haplogroup detailed phylogenetic tree
Fig. 2.6 M3 haplogroup phylogenetic tree along with communities
Fig. 2.7 M3 haplogroup detailed phylogenetic tree
Fig. 2.8 M4'65'67 haplogroup phylogenetic tree along with the communities
Fig. 2.9 M4'65'67 haplogroup detailed phylogenetic tree
Fig. 2.10 Percentage distribution of mtDNA haplogroup M5 in Indian population
Fig. 2.11 M5 haplogroup phylogenetic tree along with the communities
Fig. 2.12 M5 (M5a) haplogroup detailed phylogenetic tree
Fig. 2.13 M5 (M5b'c) haplogroup detailed phylogenetic tree
Fig. 2.14 M6 haplogroup phylogenetic tree along with the communities
Fig. 2.15 M6 haplogroup detailed phylogenetic tree
Fig. 2.16 M30'37'18'38'45'63'64 haplogroups phylogenetic tree along with the communities
Fig. 2.17 M30 haplogroup detailed phylogenetic tree
Fig. 2.18 M37 haplogroup detailed phylogenetic tree
Fig. 2.19 M38 and M18 haplogroup detailed phylogenetic tree
Fig. 2.20 M45, M63 and M64 haplogroups detailed phylogenetic tree
Fig. 2.21 M33 haplogroup phylogenetic tree along with the communities
Fig. 2.22 M33 haplogroup detailed phylogenetic tree

Fig. 2.23 M34 haplogroup phylogenetic tree along with the communities
Fig. 2.24 M34 haplogroup detailed phylogenetic tree
Fig. 2.25 M35 haplogroup phylogenetic tree along with the communities
Fig. 2.26 M35 haplogroup detailed phylogenetic tree
Fig. 2.27 M36 haplogroup phylogenetic tree along with the communities
Fig. 2.28 M36 haplogroup detailed phylogenetic tree
Fig. 2.29 M39 haplogroup phylogenetic tree along with the communities
Fig. 2.30 M39 haplogroup detailed phylogenetic tree
Fig. 2.31 M40 and M41 haplogroups phylogenetic tree along with the communities
Fig. 2.32 M40 haplogroup detailed phylogenetic tree
Fig. 2.33 M41 haplogroup detailed phylogenetic tree
Fig. 2.34 M44 haplogroup phylogenetic tree along with the communities
Fig. 2.35 M44 haplogroup detailed phylogenetic tree
Fig. 2.36 M53 haplogroup phylogenetic tree along with the communities
Fig. 2.37 M53 haplogroup detailed phylogenetic tree
Fig. 2.38 M32 M56 haplogroup phylogenetic tree along with the communities
Fig. 2.39 M56 haplogroup detailed phylogenetic tree
Fig. 2.40 M57 and M58 haplogroups phylogenetic tree along with the communities
Fig. 2.41 M57 haplogroup detailed phylogenetic tree
Fig. 2.42 M58 haplogroup detailed phylogenetic tree
Fig. 2.43 M43, M48, M50, M60, M61, M62 and M71 haplogroups phylogenetic tree along with the communities
Fig. 2.44 M43, M60, M13, M61, M62 haplogroup detailed phylogenetic tree
Fig. 2.45 M8'C'Z, M9, M10, M11 haplogroups phylogenetic tree along with the communities
Fig. 2.46 M8'C'Z (C4) haplogroup detailed phylogenetic tree
Fig. 2.47 M8'C'Z (C7 and Z) haplogroup detailed phylogenetic tree
Fig. 2.48 M9 haplogroup detailed phylogenetic tree
Fig. 2.49 M10 and M11 haplogroups detailed phylogenetic tree
Fig. 2.50 M12'G haplogroup phylogenetic tree along with the communities
Fig. 2.51 M12'G haplogroups detailed phylogenetic tree
Fig. 2.52 D haplogroup phylogenetic tree along with the communities
Fig. 2.53 D (D4) haplogroups detailed phylogenetic tree

Fig. 2.54 D (D5) haplogroups detailed phylogenetic tree
Fig. 2.55 Percentage distribution of Andaman-specific mtDNA haplogroup M31
Fig. 2.56 M31'32 haplogroups detailed phylogenetic tree. AND Andaman Island, AA Austro-Asiatic, TB Tibeto-Burman
Fig. 2.57 M42'74 haplogroups detailed phylogenetic tree. AA Austro-Asiatic, TB Tibeto-Burman
Fig. 2.58 Percentage distribution of M2 haplogroup among Indian tribal communities
Fig. 3.1 Phylogenetic tree of mtDNA N-haplogroup: U1
Fig. 3.2 Phylogenetic tree of mtDNA N-haplogroup: U2a
Fig. 3.3 Phylogenetic tree of mtDNA N-haplogroup: U2b
Fig. 3.4 Phylogenetic tree of mtDNA N-haplogroup: U2c
Fig. 3.5 Phylogenetic tree of mtDNA N-haplogroup: U2e–U4
Fig. 3.6 Phylogenetic tree of mtDNA N-haplogroup: U5
Fig. 3.7 Phylogenetic tree of mtDNA N-haplogroup: U7
Fig. 3.8 Phylogenetic tree of mtDNA N-haplogroup: U8
Fig. 3.9 Phylogenetic tree of mtDNA N-haplogroup: W
Fig. 3.10 Phylogenetic tree of mtDNA N-haplogroup: A
Fig. 3.11 Phylogenetic tree of mtDNA N-haplogroup: N1–N5
Fig. 3.12 Phylogenetic tree of mtDNA N-haplogroup: HV
Fig. 3.13 Phylogenetic tree of mtDNA N-haplogroup: R1
Fig. 3.14 Phylogenetic tree of mtDNA N-haplogroup: R2'JT
Fig. 3.15 Phylogenetic tree of mtDNA N-haplogroup: B4–B5
Fig. 3.16 Phylogenetic tree of mtDNA N-haplogroup: R7
Fig. 3.17 Phylogenetic tree of mtDNA N-haplogroup: R6
Fig. 3.18 Phylogenetic tree of mtDNA N-haplogroup: R5
Fig. 3.19 Phylogenetic tree of mtDNA N-haplogroup: R8
Fig. 3.20 Phylogenetic tree of mtDNA N-haplogroup: R30
Fig. 3.21 Phylogenetic tree of mtDNA N-haplogroup: R32
Fig. 5.1 Migratory history of the modern human by genomic studies
Fig. 5.2 Plausible route of human migration within India
Fig. 5.3 Multidimensional plot of mtDNA SNPs for the linguistic families in the Indian population
Fig. 5.4 Multidimensional plot of mtDNA SNPs for different linguistic groups in the Indian population. (a) Multidimensional (MDS) plot of Indo-European linguistic communities. (b) Multidimensional (MDS) plot of Tibeto-Burman linguistic communities. (c) Multidimensional (MDS) plot of Austro-Asiatic linguistic communities. (d) Multidimensional (MDS) plot of Dravidian linguistic communities
Fig. 5.5 Multidimensional plot for mtDNA SNPs of the geographic categories in the Indian population

Fig. 5.6 Multidimensional plot of mtDNA SNPs for the geographic categories in the Indian population. (a) Multidimensional (MDS) plot of eastern Indian communities. (b) Multidimensional (MDS) plot of central Indian communities. (c) Multidimensional (MDS) plot of north-east Indian communities. (d) Multidimensional (MDS) plot of western Indian communities. (e) Multidimensional (MDS) plot of South Indian communities. (f) Multidimensional (MDS) plot of island-residing communities
Fig. 5.7 Multidimensional plot of mtDNA SNPs for the Ethnic Groups in the Indian population
Fig. 5.8 Multidimensional plot of mtDNA SNPs for the Ethnic Groups in the Indian population. (a) Multidimensional (MDS) plot of Mongoloid communities. (b) Multidimensional (MDS) plot of Australoid communities. (c) Multidimensional (MDS) plot of Europoid communities
Fig. 6.1 Distribution of Y chromosome haplogroup percentages among Indian population
Fig. 6.2 Multi-dimension plot of Y-chromosome SNPs for 71 Indian populations
Fig. 6.3 Multi-dimension plot of Y-chromosome SNPs for the linguistic families in the Indian population
Fig. 6.4 Multi-dimension plot showing comparison between Austro-Asiatic and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Austro-Asiatic communities. (b) Multi-dimension plot between Austro-Asiatic vs Indo-European. (c) Multi-dimension plot between Austro-Asiatic vs Tibeto-Burman. (d) Multi-dimension plot between Austro-Asiatic vs Tai-Kadai. (e) Multi-dimension plot between Austro-Asiatic vs Dravidian. (f) Austro-Asiatic vs Andamanese
Fig. 6.5 Multi-dimension plot showing comparison between Dravidian and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Dravidian communities. (b) Multi-dimension plot between Dravidian vs Indo-European. (c) Multi-dimension plot between Dravidian vs Tibeto-Burman. (d) Multi-dimension plot between Dravidian vs Tai-Kadai. (e) Multi-dimension plot between Dravidian vs Andamanese
Fig. 6.6 Multi-dimension plot showing comparison between Indo-European and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Indo-European communities. (b) Multi-dimension plot between Indo-European vs Tibeto-Burman. (c) Multi-dimension plot between Indo-European vs Tai-Kadai. (d) Indo-European vs Andamanese
Fig. 6.7 Multi-dimension plot showing comparison between Tibeto-Burman and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Tibeto-Burman communities. (b) Multi-dimension plot between Tibeto-Burman vs Tai-Kadai. (c) Multi-dimension plot between Tibeto-Burman vs Andamanese
Fig. 6.8 Y-chromosome phylogeny tree based on 71 communities (S Southern India, E Eastern India, N Northern India, NE North East India, W Western India, IA Andaman and Nicobar Island)
Fig. 6.9 Principal component analysis for Y-chromosome SNPs among 71 Indian populations
Fig. 7.1 Y chromosome haplogroups of Alu Kurumba
Fig. 7.2 mtDNA phylogenetic tree of M-haplogroup among the Alu Kurumba
Fig. 7.3 mtDNA phylogenetic tree of N-haplogroup among the Alu Kurumba
Fig. 7.4 Mismatch distribution of nucleotide differences of Alu Kurumba population
Fig. 7.5 Y chromosome haplogroups of Andh
Fig. 7.6 mtDNA phylogenetic tree of M-haplogroup among the Andh
Fig. 7.7 mtDNA phylogenetic tree of N-haplogroup among the Andh
Fig. 7.8 Mismatch distributions of nucleotide differences of Andh population
Fig. 7.9 mtDNA phylogenetic tree of M-haplogroup among the Angami Naga
Fig. 7.10 mtDNA phylogenetic tree of N-haplogroup among the Angami Naga
Fig. 7.11 Mismatch distributions of nucleotide differences of Angami Naga population
Fig. 7.12 Y chromosome haplogroups of Betta Kuruba
Fig. 7.13 mtDNA phylogenetic tree of M-haplogroup among the Betta Kuruba
Fig. 7.14 mtDNA phylogenetic tree of N-haplogroup among the Betta Kuruba
Fig. 7.15 Mismatch distributions of nucleotide differences of Betta Kuruba community
Fig. 7.16 Y chromosome haplogroups of Bharia
Fig. 7.17 Y chromosome haplogroups of Bhoi Khasi
Fig. 7.18 mtDNA phylogenetic tree of M-haplogroup among the Bhoi Khasi

Fig. 7.19 mtDNA phylogenetic tree of N-haplogroup among the Bhoi Khasi
Fig. 7.20 Mismatch distributions of nucleotide differences of Bhoi Khasi population
Fig. 7.21 Y chromosome haplogroups of Bhoksa
Fig. 7.22 Y chromosome haplogroups of Bhotia
Fig. 7.23 mtDNA phylogenetic tree of M-haplogroup among the Bhotia
Fig. 7.24 mtDNA phylogenetic tree of N-haplogroup among the Bhotia
Fig. 7.25 Mismatch distributions of nucleotide differences of Bhotia population
Fig. 7.26 Y chromosome haplogroups of Bondo
Fig. 7.27 Y chromosome haplogroups of Chenchu
Fig. 7.28 mtDNA phylogenetic tree of M-haplogroup among the Chenchu
Fig. 7.29 Mismatch distribution of nucleotide differences of Chenchu population
Fig. 7.30 Y chromosome haplogroups of Damor
Fig. 7.31 Y chromosome haplogroups of Dhodia
Fig. 7.32 Y chromosome haplogroups of Dirang Monpa
Fig. 7.33 mtDNA phylogenetic tree of M-haplogroup among the Dirang Monpa
Fig. 7.34 mtDNA phylogenetic tree of N-haplogroup among the Dirang Monpa
Fig. 7.35 Mismatch distribution of nucleotide differences of Dirang Monpa population
Fig. 7.36 Y chromosome haplogroups of Dungri Bhil
Fig. 7.37 mtDNA phylogenetic tree of M-haplogroup among the Dungri Bhil
Fig. 7.38 mtDNA phylogenetic tree of N-haplogroup among the Dungri Bhil
Fig. 7.39 Mismatch distribution of nucleotide differences of Dungri Bhil population
Fig. 7.40 Y chromosome haplogroups of Gadia Lohar
Fig. 7.41 mtDNA phylogenetic tree of M-haplogroup among the Gadia Lohar
Fig. 7.42 mtDNA phylogenetic tree of N-haplogroup among the Gadia Lohar
Fig. 7.43 Mismatch distribution of nucleotide differences of Gadia Lohar population
Fig. 7.44 Y chromosome haplogroups of Galong
Fig. 7.45 mtDNA phylogenetic tree of M-haplogroup among the Galong
Fig. 7.46 mtDNA phylogenetic tree of N-haplogroup among the Galong
Fig. 7.47 Mismatch distribution of nucleotide differences of Galong population

Fig. 7.48 Y chromosome haplogroups of Garhwali Brahmin . . . 230
Fig. 7.49 Y chromosome haplogroups of Garhwali Rajput . . . 231
Fig. 7.50 Y chromosome haplogroups of Ghorkha . . . 232
Fig. 7.51 Y chromosome haplogroups of Kolam . . . 233
Fig. 7.52 mtDNA phylogenetic tree of M-haplogroup among the Kolam . . . 234
Fig. 7.53 mtDNA phylogenetic tree of N-haplogroup among the Kolam . . . 235
Fig. 7.54 Mismatch distribution of nucleotide differences of Kolam Population . . . 237
Fig. 7.55 Y chromosome haplogroups of Hmar . . . 238
Fig. 7.56 Y chromosome haplogroups of Irula . . . 239
Fig. 7.57 mtDNA phylogenetic tree of M-haplogroup among the Irular . . . 240
Fig. 7.58 mtDNA phylogenetic tree of N-haplogroup among the Irular . . . 241
Fig. 7.59 Mismatch distribution of nucleotide differences of Irular population . . . 243
Fig. 7.60 Y chromosome haplogroups of Jarawa . . . 244
Fig. 7.61 mtDNA phylogenetic tree of M-haplogroup among the Jarawa . . . 245
Fig. 7.62 Mismatch distribution of nucleotide differences of Jarawa population . . . 247
Fig. 7.63 Y chromosome haplogroups of Jaunsari . . . 247
Fig. 7.64 Y chromosome haplogroups of Jenu Kuruba . . . 248
Fig. 7.65 mtDNA phylogenetic tree of M-haplogroup among the Jenu Kuruba . . . 249
Fig. 7.66 mtDNA phylogenetic tree of N-haplogroup among the Jenu Kuruba . . . 250
Fig. 7.67 Mismatch distribution of nucleotide differences of Jenu Kuruba . . . 252
Fig. 7.68 Y chromosome haplogroups of Ka Thakur . . . 253
Fig. 7.69 mtDNA phylogenetic tree of M-haplogroup among the Ka Thakur . . . 254
Fig. 7.70 mtDNA phylogenetic tree of N-haplogroup among the Ka Thakur . . . 255
Fig. 7.71 Mismatch distribution of nucleotide differences of Ka Thakur Population . . . 257
Fig. 7.72 Y chromosome haplogroups of Kamar . . . 258
Fig. 7.73 mtDNA phylogenetic tree of M-haplogroup among the Kamar . . . 259
Fig. 7.74 mtDNA phylogenetic tree of N-haplogroup among the Kamar . . . 260
Fig. 7.75 Mismatch distribution of nucleotide differences of Kamar population . . . 262
Fig. 7.76 Y chromosome haplogroups of Kanikkar . . . 262
Fig. 7.77 Y chromosome haplogroups of Karen . . . 263

Fig. 7.78 Y chromosome haplogroups of Kathodi . . . 264
Fig. 7.79 mtDNA phylogenetic tree of M-haplogroup among the Kathodi . . . 265
Fig. 7.80 mtDNA phylogenetic tree of N-haplogroup among the Kathodi . . . 266
Fig. 7.81 Mismatch distribution of nucleotide differences of Kathodi population . . . 268
Fig. 7.82 Y chromosome haplogroups of Katkari . . . 269
Fig. 7.83 mtDNA phylogenetic tree of M-haplogroup among the Katkari . . . 270
Fig. 7.84 mtDNA phylogenetic tree of N-haplogroup among the Katkari . . . 271
Fig. 7.85 Mismatch distribution of nucleotide differences of Katkari population . . . 273
Fig. 7.86 Y chromosome haplogroups of Kattunayakan . . . 274
Fig. 7.87 mtDNA phylogenetic tree of M-haplogroup among the Kattunayakan . . . 275
Fig. 7.88 Mismatch distribution of nucleotide differences of Kattunayakan . . . 277
Fig. 7.89 Y chromosome haplogroups of Kutia Kondh . . . 277
Fig. 7.90 mtDNA phylogenetic tree of M-haplogroup among the Kutia Kondh . . . 278
Fig. 7.91 mtDNA phylogenetic tree of N-haplogroup among the Kutia Kondh . . . 279
Fig. 7.92 Mismatch distribution of nucleotide differences of Kutia Kondh population . . . 282
Fig. 7.93 Y chromosome haplogroups of Konda Reddis . . . 282
Fig. 7.94 Y chromosome haplogroups of Koraga . . . 283
Fig. 7.95 mtDNA phylogenetic tree of M-haplogroup among the Koraga . . . 284
Fig. 7.96 mtDNA phylogenetic tree of N-haplogroup among the Koraga . . . 285
Fig. 7.97 Mismatch distribution of nucleotide differences of Koraga population . . . 287
Fig. 7.98 Y chromosome haplogroups of Korku . . . 288
Fig. 7.99 mtDNA phylogenetic tree of M-haplogroup among the Korku . . . 289
Fig. 7.100 mtDNA phylogenetic tree of N-haplogroup among the Korku . . . 290
Fig. 7.101 Mismatch distribution of nucleotide differences of Korku population . . . 292
Fig. 7.102 Y chromosome haplogroups of Kota . . . 293
Fig. 7.103 Y chromosome haplogroups of Koya . . . 294
Fig. 7.104 Y chromosome haplogroups of Lachungpa . . . 295
Fig. 7.105 mtDNA phylogenetic tree of M-haplogroup among the Lachungpa . . . 296

Fig. 7.106 mtDNA phylogenetic tree of N-haplogroup among the Lachungpa . . . 297
Fig. 7.107 Mismatch distribution of nucleotide differences of Lachungpa population . . . 299
Fig. 7.108 Y chromosome haplogroups of Lepcha . . . 300
Fig. 7.109 mtDNA phylogenetic tree of M-haplogroup among the Lepcha . . . 301
Fig. 7.110 mtDNA phylogenetic tree of N-haplogroup among the Lepcha . . . 302
Fig. 7.111 Mismatch distribution of nucleotide differences of Lepcha population . . . 304
Fig. 7.112 Y chromosome haplogroups of Ma Thakur . . . 304
Fig. 7.113 mtDNA phylogenetic tree of M-haplogroup among the Ma Thakur . . . 305
Fig. 7.114 mtDNA phylogenetic tree of N-haplogroup among the Ma Thakur . . . 306
Fig. 7.115 Mismatch distribution of nucleotide differences of Ma Thakur population . . . 309
Fig. 7.116 Y chromosome haplogroups of Madia . . . 310
Fig. 7.117 mtDNA phylogenetic tree of M-haplogroup among the Madia . . . 311
Fig. 7.118 mtDNA phylogenetic tree of N-haplogroup among the Madia . . . 312
Fig. 7.119 Mismatch distribution of nucleotide differences of Madia Population . . . 314
Fig. 7.120 Y chromosome haplogroups of Mal Paharia . . . 314
Fig. 7.121 mtDNA phylogenetic tree of M-haplogroup among the Mal Paharia . . . 315
Fig. 7.122 mtDNA phylogenetic tree of N-haplogroup among the Mal Paharia . . . 316
Fig. 7.123 Mismatch distribution of nucleotide differences of Mal Paharia population . . . 318
Fig. 7.124 Y chromosome haplogroups of Mara . . . 319
Fig. 7.125 Y chromosome haplogroups of Mathur . . . 319
Fig. 7.126 Y chromosome haplogroups of Mina . . . 320
Fig. 7.127 mtDNA phylogenetic tree of M-haplogroup among the Melacheri . . . 322
Fig. 7.128 mtDNA phylogenetic tree of N-haplogroup among the Melacheri . . . 323
Fig. 7.129 Mismatch distribution of nucleotide differences of Melacheri population . . . 325
Fig. 7.130 Y chromosome haplogroups of Mizo . . . 325
Fig. 7.131 Y chromosome haplogroups of Mullu Kurumba . . . 326
Fig. 7.132 mtDNA phylogenetic tree of M-haplogroup among the Mullu Kurumba . . . 327
Fig. 7.133 mtDNA phylogenetic tree of N-haplogroup among the Mullu Kurumba . . . 328

Fig. 7.134 Mismatch distribution of nucleotide differences of Mullu Kurumba population . . . 330
Fig. 7.135 Y chromosome haplogroups of Munda . . . 331
Fig. 7.136 mtDNA phylogenetic tree of M-haplogroup among the Munda . . . 332
Fig. 7.137 mtDNA phylogenetic tree of N-haplogroup among the Munda . . . 333
Fig. 7.138 Mismatch distribution of nucleotide differences of Munda population . . . 335
Fig. 7.139 Y chromosome haplogroups of Nayaka . . . 335
Fig. 7.140 Y chromosome haplogroups of Nicobarese . . . 336
Fig. 7.141 mtDNA phylogenetic tree of N-haplogroup among the Nicobarese . . . 337
Fig. 7.142 Mismatch distribution of nucleotide differences of Nicobarese population . . . 340
Fig. 7.143 Y chromosome haplogroups of Nihal . . . 340
Fig. 7.144 mtDNA phylogenetic tree of M-haplogroup among the Nihal . . . 341
Fig. 7.145 mtDNA phylogenetic tree of N-haplogroup among the Nihal . . . 342
Fig. 7.146 Mismatch distributions of nucleotide differences of Nihal population . . . 344
Fig. 7.147 Y chromosome haplogroups of Nishi . . . 345
Fig. 7.148 Y chromosome haplogroups of Padhar . . . 346
Fig. 7.149 mtDNA phylogenetic tree of M-haplogroup among the Padhar . . . 347
Fig. 7.150 mtDNA phylogenetic tree of N-haplogroup among the Padhar . . . 348
Fig. 7.151 Mismatch distribution of nucleotide differences of Padhar population . . . 350
Fig. 7.152 Y chromosome haplogroups of Paite . . . 351
Fig. 7.153 mtDNA phylogenetic tree of M-haplogroup among the Paite . . . 352
Fig. 7.154 mtDNA phylogenetic tree of N-haplogroup among the Paite . . . 353
Fig. 7.155 Mismatch distribution of nucleotide differences of Paite population . . . 355
Fig. 7.156 Y chromosome haplogroups of Paniyan . . . 355
Fig. 7.157 mtDNA phylogenetic tree of M-haplogroup among the Paniyan . . . 357
Fig. 7.158 mtDNA phylogenetic tree of N-haplogroup among the Paniyan . . . 358
Fig. 7.159 Mismatch distribution of nucleotide differences of Paniyan population . . . 360
Fig. 7.160 Y chromosome haplogroups of Pauri Bhuinya . . . 360
Fig. 7.161 mtDNA phylogenetic tree of M-haplogroup among the Pauri Bhuinya . . . 361

Fig. 7.162 mtDNA phylogenetic tree of N-haplogroup among the Pauri Bhuinya . . . 362
Fig. 7.163 Mismatch distribution of nucleotide differences of Pauri Bhuinya population . . . 365
Fig. 7.164 Y chromosome haplogroups of Porja . . . 365
Fig. 7.165 mtDNA phylogenetic tree of M-haplogroup among the Porja . . . 366
Fig. 7.166 mtDNA phylogenetic tree of N-haplogroup among the Porja . . . 367
Fig. 7.167 Mismatch distribution of nucleotide differences of Porja population . . . 370
Fig. 7.168 Y chromosome haplogroups of Rabha . . . 370
Fig. 7.169 Y chromosome haplogroups of Raji . . . 371
Fig. 7.170 Y chromosome haplogroups of Saharia . . . 372
Fig. 7.171 Y chromosome haplogroups of Savaras . . . 374
Fig. 7.172 mtDNA phylogenetic tree of M-haplogroup among the Savaras . . . 375
Fig. 7.173 mtDNA phylogenetic tree of N-haplogroup among the Savaras . . . 376
Fig. 7.174 Mismatch distribution of nucleotide differences of Savaras population . . . 378
Fig. 7.175 Y chromosome haplogroups of Sherdukpen . . . 378
Fig. 7.176 mtDNA phylogenetic tree of M-haplogroup among the Sherdukpen . . . 379
Fig. 7.177 mtDNA phylogenetic tree of N-haplogroup among the Sherdukpen . . . 380
Fig. 7.178 Mismatch distribution of nucleotide differences of Sherdukpen population . . . 382
Fig. 7.179 Y chromosome haplogroups of Soliga . . . 383
Fig. 7.180 mtDNA phylogenetic tree of M-haplogroup among the Soliga . . . 385
Fig. 7.181 mtDNA phylogenetic tree of N-haplogroup among the Soliga . . . 386
Fig. 7.182 Mismatch distribution of nucleotide differences of Soliga population . . . 388
Fig. 7.183 Y chromosome haplogroups of Sonowal Kachari . . . 388
Fig. 7.184 mtDNA phylogenetic tree of M-haplogroup among the Sonowal Kachari . . . 389
Fig. 7.185 mtDNA phylogenetic tree of N-haplogroup among the Sonowal Kachari . . . 390
Fig. 7.186 Mismatch distribution of nucleotide differences of Sonowal Kachari population . . . 392
Fig. 7.187 Y chromosome haplogroups of Tai Ahom . . . 393
Fig. 7.188 mtDNA phylogenetic tree of M-haplogroup among the Tai Ahom . . . 394
Fig. 7.189 mtDNA phylogenetic tree of N-haplogroup among the Tai Ahom . . . 395

Fig. 7.190 Mismatch distribution of nucleotide differences of Tai Ahom population . . . 397
Fig. 7.191 Y chromosome haplogroups of Tai Khampti . . . 398
Fig. 7.192 mtDNA phylogenetic tree of M-haplogroup among the Tai Khampti . . . 399
Fig. 7.193 mtDNA phylogenetic tree of N-haplogroup among the Tai Khampti . . . 400
Fig. 7.194 Mismatch distribution of nucleotide differences of Tai Khampti population . . . 402
Fig. 7.195 Y chromosome haplogroups of Tharu . . . 403
Fig. 7.196 Y chromosome haplogroups of Toda . . . 404
Fig. 7.197 mtDNA phylogenetic tree of M-haplogroup among the Toda . . . 405
Fig. 7.198 mtDNA phylogenetic tree of N-haplogroup among the Toda . . . 406
Fig. 7.199 Mismatch distribution of nucleotide differences of Toda . . . 408
Fig. 7.200 Y chromosome haplogroups of Toto . . . 409
Fig. 7.201 mtDNA phylogenetic tree of M-haplogroup among the Toto . . . 410
Fig. 7.202 mtDNA phylogenetic tree of N-haplogroup among the Toto . . . 411
Fig. 7.203 Mismatch distribution of nucleotide differences of Toto population . . . 413
Fig. 7.204 mtDNA phylogenetic tree of M-haplogroup among the Urali Kuruman . . . 414
Fig. 7.205 mtDNA phylogenetic tree of N-haplogroup among the Urali Kuruman . . . 415
Fig. 7.206 Mismatch distribution of nucleotide differences of Urali Kuruman population . . . 417
Fig. 7.207 Y chromosome haplogroups of Wancho . . . 417
Fig. 7.208 mtDNA phylogenetic tree of M-haplogroup among the Wancho . . . 418
Fig. 7.209 mtDNA phylogenetic tree of N-haplogroup among the Wancho . . . 419
Fig. 7.210 Mismatch distribution of nucleotide differences of Wancho population . . . 421
Fig. 7.211 Y chromosome haplogroups of Yanadi . . . 422
Fig. 7.212 Y chromosome haplogroups of Yerukulas . . . 423

List of Tables

Table 1.1 Sample size, linguistic affinity, geographic location, ethnic groups of 75 Indian populations studied . . . 5
Table 2.1 Percentage distribution of M- and N-haplogroups of 61 communities . . . 12
Table 2.2 Average frequency distribution of mitochondrial haplogroups M and N by Geographical Region . . . 15
Table 2.3 Frequency and percentage of M-haplogroups by populations . . . 16
Table 2.4 Analysis of molecular variance of the tribes in India . . . 78
Table 2.5 Diversity and age estimates for M-haplogroups in India . . . 79
Table 3.1 Frequency and percentage of N-haplogroups by populations . . . 84
Table 4.1 Frequencies of 9-bp insertions/deletion among Indian populations . . . 112
Table 4.2 HVR-I haplotypes and haplogroup affiliation of the 9-bp deletion/insertion samples . . . 114
Table 5.1 Alternative scenarios for the settlement of the Andaman Islands . . . 125
Table 5.2 Analysis of Molecular Variance (AMOVA) of mtDNA among Indian population . . . 128
Table 5.3 Population-specific FST of mtDNA among Indian population . . . 128
Table 5.4 Community wise Molecular Diversity Indices . . . 130
Table 5.5 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population between the linguistic categories . . . 132
Table 5.6 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population as per linguistic categories . . . 132
Table 5.7 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population between the geographic categories . . . 134
Table 5.8 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population as per geographic categories . . . 135

Table 5.9 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population as per ethnic groups . . . 138
Table 5.10 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population between the ethnic groups . . . 138
Table 6.1 AMOVA based on Y chromosome haplogroup frequencies . . . 162
Table 6.2 Population-wise FST values and neutrality test for Y-chromosome SNP’s for 71 Indian Population . . . 165
Table 6.3 Y chromosome haplogroups in YAP Frequency . . . 168
Table 7.1 Molecular Diversity Indices among the Allu Kurumba . . . 176
Table 7.2 Molecular Diversity Indices among the Andh . . . 180
Table 7.3 Molecular Diversity Indices among the Angami Naga . . . 185
Table 7.4 Molecular Diversity Indices among the Betta Kuruba . . . 189
Table 7.5 Molecular Diversity Indices among the Bhoi Khasi . . . 196
Table 7.6 Molecular Diversity Indices among the Bhotia . . . 202
Table 7.7 Molecular Diversity Indices among the Chenchu . . . 207
Table 7.8 Molecular Diversity Indices among the Dirang Monpa . . . 214
Table 7.9 Molecular Diversity Indices among the Dongri Bhill . . . 219
Table 7.10 Molecular Diversity Indices among the Gadia Lohar . . . 224
Table 7.11 Molecular Diversity Indices among the Galong . . . 229
Table 7.12 Molecular Diversity Indices among the Kolam . . . 236
Table 7.13 Molecular Diversity Indices among the Irular . . . 242
Table 7.14 Molecular Diversity Indices among the Jarawa . . . 246
Table 7.15 Molecular Diversity Indices among the Jenu Kuruba . . . 251
Table 7.16 Molecular Diversity Indices among the Ka Thakur . . . 256
Table 7.17 Molecular Diversity Indices among the Kamar . . . 261
Table 7.18 Molecular Diversity Indices among the Kathodi . . . 267
Table 7.19 Molecular Diversity Indices among the Katkari . . . 272
Table 7.20 Molecular Diversity Indices among the Kattunayakan . . . 276
Table 7.21 Molecular Diversity Indices among the Kutia Kondh . . . 281
Table 7.22 Molecular Diversity Indices among the Koraga . . . 286
Table 7.23 Molecular Diversity Indices among the Korku . . . 291
Table 7.24 Molecular Diversity Indices among the Lachungpa . . . 298
Table 7.25 Molecular Diversity Indices among the Lepcha . . . 303
Table 7.26 Molecular Diversity Indices among the Ma Thakur . . . 308
Table 7.27 Molecular Diversity Indices among the Madia . . . 313
Table 7.28 Molecular Diversity Indices among the Mal Paharia . . . 317
Table 7.29 Molecular Diversity Indices among the Malacherri . . . 324
Table 7.30 Molecular Diversity Indices among the Mullu Kurumba . . . 329
Table 7.31 Molecular Diversity Indices among the Munda . . . 334
Table 7.32 Molecular Diversity Indices among the Nicobarese . . . 339
Table 7.33 Molecular Diversity Indices among the Nihal . . . 343
Table 7.34 Molecular Diversity Indices among the Padhar . . . 349
Table 7.35 Molecular Diversity Indices among the Paite . . . 354
Table 7.36 Molecular Diversity Indices among the Paniyan . . . 359

Table 7.37 Molecular Diversity Indices among the Pauri Bhuinya . . . 364
Table 7.38 Molecular Diversity Indices among the Porja . . . 369
Table 7.39 Molecular Diversity Indices among the Savaras . . . 377
Table 7.40 Molecular Diversity Indices among the Sherdukpen . . . 381
Table 7.41 Molecular Diversity Indices among the Soliga . . . 387
Table 7.42 Molecular Diversity Indices among the Sonowal Kachari . . . 391
Table 7.43 Molecular Diversity Indices among the Tai Ahom . . . 396
Table 7.44 Molecular Diversity Indices among the Tai Khampti . . . 401
Table 7.45 Molecular Diversity Indices among the Toda . . . 407
Table 7.46 Molecular Diversity Indices among the Toto . . . 412
Table 7.47 Molecular Diversity Indices among the Urali Kuruman . . . 416
Table 7.48 Molecular Diversity Indices among the Wanchoo . . . 420

Chapter 1: Introduction

People of India

Ethnic and Linguistic Diversity

India has a population of over one billion and is one of the most populous nations in the world. Through its ‘People of India’ project, the Anthropological Survey of India listed 4635 unique population groups, from small villages to mega metros, across the country (Singh 1994). India possesses a high degree of gene pool diversity because of its geographical location and its history of foreign invasions and migrations since ancient times from other countries of Asia, Europe and perhaps Africa. Because of structured social and cultural practices of matrimony, this diversity has been maintained through restricted gene flow over several thousand years, the genetic implications of which remain unexplored. Contemporary India manifests enormous cultural, linguistic, biological and genetic diversity, which can be attributed primarily to (a) its inimitable position at the tri-junction of the African, the northern Eurasian and the Oriental realms, at the crossroads of many historic and pre-historic human migrations (Béteille 1998; Gadgil et al. 1997; Majumder 1998), and (b) the presence of a large number (4635) of communities whose unique population structures have been maintained over several generations (Singh 1994).

The ethnic diversity of the Indian subcontinent is represented by four major ethnic groups, distinguishable by their physical features, viz. Caucasoid (European), Proto-Australoid (Aboriginal Australian), Mongoloid (East Asian) and Negroid (African). Caucasoids are distributed in most regions, Proto-Australoids in the western, central and southern regions, Mongoloids in the sub-Himalayan and Northeast regions, and the Negritos in the Andaman Islands (Malhotra 1978). This ethnic diversity corresponds to the four major linguistic families spoken by Indian populations, namely Austro-Asiatic, Dravidian, Indo-European and Tibeto-Burman.

Tribal Populations

Among the different population groups, the tribal populations, generally believed to be the earlier inhabitants, constitute about 8.6% of the total Indian population (Census of India 2011). The subcontinent has a large number of tribal communities (about 532) of diverse ethnic origins (Australian, East Asian, Eurasian and African), belonging to the four linguistic families and inhabiting different geographical regions of the country (Singh 1994). Genetic diversity among the population groups of India was initially reported on the basis of variation in allele frequencies at four genetic loci (ABO, MN, HP and PGM1) in different regions of Asia (Mourant et al. 1976; Bhasin and Walter 2001).

© Springer Nature Singapore Pte Ltd. 2021 Anthropological Survey of India, Genomic Diversity in People of India, https://doi.org/10.1007/978-981-16-0163-7_1

The Anthropological Survey of India’s Initiative

India’s rich genetic, ethnic, social, cultural, linguistic and ecological diversity, its central geographical location and its extended period of human occupation prompted the Anthropological Survey of India (AnSI) to take up the project ‘DNA Polymorphism in Contemporary Indian Populations’, to trace successive ancient human migrations and the peopling of India and to understand variation in the Indian gene pool. AnSI, an institution unique of its kind in the world, has focused on studying the bio-cultural attributes of the Indian population since its establishment in 1945. Over the decades, AnSI has acquired an unparalleled blend of holistic perspectives for studying the people of India. It employs trained personnel not only from cultural and biological anthropology but also from psychology, biochemistry, human ecology, folklore, linguistics and statistics, to arrive at the interdisciplinary approach that holism demands. In recent times, AnSI has modernized its infrastructure to bring in DNA technology for studying the phylogenies of the Indian population (based on the uniparentally inherited mitochondrial and Y-chromosome markers), the anthropological genetics of diseases and the frontier areas of anthropology. Today, AnSI is well acknowledged both for its coverage of all 4635 unique populations of the country and for its publications on the human genetics of these populations (Chap. 7).


In short, the human diversity of India harbours one of the most ancient gene pools anywhere in the world, hidden in its unique sets of ethnic populations. The diversity of a population can be brought to light only by a specialist in both cultural attributes and biological parameters, supported by the institutional expertise of AnSI. All populations, anywhere in the world, are in a constant state of flux under various socioeconomic changes. The regime of values, norms and practices that governs the mating structures of biological populations in the country can be discerned only by an anthropologist with institutional support; otherwise, the unique human diversity of the country could never be entirely understood. AnSI has expertise in studying the biological variation of the Indian population using various genetic markers, i.e. blood groups (ABO, Rh, MN, Duffy and other systems), red cell enzymes and serum proteins. It also maintains a database of the studied samples based on the ethno-history of the populations (Negi and Das 1963; Negi 1990; Negi et al. 1972, 1992; Singh et al. 1994; Srivastava 1975; Pawar et al. 2015). AnSI developed DNA laboratories at three of its centres, namely the Central Regional Centre, Nagpur, the Southern Regional Centre, Mysore, and its Head Office, Kolkata, with satellite DNA laboratories, equipped up to the PCR level, at its other regional centres. The initial infrastructure and workforce development was carried out through collaboration between AnSI, the Bhabha Atomic Research Centre (BARC), Trombay, and the Centre for Cellular and Molecular Biology (CCMB), Hyderabad, to facilitate the study of DNA polymorphism in the Indian population. Some of AnSI’s most extensive all-India anthropological research was carried out in anthropometry through AIAS and AIBAS, on the nutritional profile of the country through NSIP, and on the genetic structure of the Indian population (GSIP), besides the mega project on People of India (POI).
The advent of PCR techniques in the 1980s, along with the use of restriction enzymes and, later, direct sequencing for identifying polymorphisms at the DNA level, provided anthropologists with new and more powerful techniques and markers for testing different anthropological hypotheses. Traditional genetic markers are generally located in the functional regions of the genome and are therefore subject to natural selection, which restricts their level of variation. DNA polymorphisms, especially those in the non-functional regions of the genome, are ‘neutral’ polymorphisms and therefore exhibit much greater levels of variation. This property makes DNA studies extremely useful for investigating population variation. Further, at the DNA level one can detect a much greater number of polymorphisms, which makes it easy to select polymorphisms in any region of the genome. This has facilitated the reconstruction of haplotypes, and the study of haplotype variation has been informative for population genetic studies. In recent times, enormous advances in the analysis of such genetic variation have been made. If anthropological research in India, particularly physical or biological anthropology, is divided into past (pre-2003) and present (2003 to date), important changes in emphasis can be observed, both in India and around the world. Since 2003, anthropological research in India has changed both qualitatively and quantitatively. Its pace accelerated severalfold during this period, following the completion of the sequencing of the human genome, the opening up of the Indian economy, and the development of biotechnology and its application to anthropological research. These developments brought new concepts for understanding human variation. One of the most interesting features of anthropological research in India today is the increasing participation of scholars from across disciplines in solving societal problems.
Traditional anthropometry has been used extensively in research on nutrition, medical disorders, forensic medicine, community genetics, public health, etc. Anthropology is on the verge of becoming a purely applied science. Advances in scientific and technical knowledge from the Human Genome Project help us to understand the rich genetic resources that India possesses. India, with its diverse population, has incredible scope to move forward in the field of genetic epidemiology and to develop effective strategies to prevent both genetic disorders and non-communicable complex lifestyle diseases such as type 2 diabetes, obesity, hypertension, cardiac diseases, cancer and others. Mountain et al. (1995) attempted one of the first studies based on mtDNA variation among the Indian population. They studied the Havik Brahmins, the Mukri and the Kadar, and published their findings in the American Journal of Human Genetics. Later, diversity based on DNA studies was reported by Barnabas et al. (1996), Bamshad et al. (1996), Bhattachayya et al. (1999), Ramana et al. (2001), Sengupta et al. (2006), Cordaux et al. (2004a, b, c), Roychoudhury et al. (2000), Palanichamy et al. (2004), Basu et al. (2003), Thangaraj et al. (2005a, b), Kumar et al. (2006), Kayser et al. (2003), Kumar et al. (2007), Chandrasekar et al. (2007, 2009), Quintana-Murci et al. (2004), Krithika et al. (2009), Kumar et al. (2008, 2009), Barik et al. (2008), Rao et al. (2013), Khurana et al. (2014), Sylvester et al. (2018, 2019, 2020) and Isukapatla et al. (2019). To understand the complexity of Indian population structure, gene-language, gene-geography and gene-ethnicity relationships were studied. Those studies have shown an inverse correlation between genetic affinities and geographical distance (Malhotra and Vasulu 1993; Cordaux et al. 2003). Significant genetic differentiation between caste and tribal populations has been reported (Bamshad et al. 2001; Basu et al. 2003; Cordaux et al. 2004a, b, c), as against a model which suggests that there is considerable sharing of Pleistocene heritage among them with limited gene flow (Kivisild et al. 2003a, b). Likewise, a correspondence between language and genes has been proposed by various scholars (Majumder 1998; Cavalli-Sforza et al. 1992; Roychoudhury et al.
2001), along with a competing view that genetic affinities may not necessarily depend on linguistic similarities (Chaubey et al. 2008a, b; Kshatriya et al. 2011; Sharma et al. 2012). A recent study suggested that the present-day linguistic affiliation of any Indian population has to be considered with caution while reconstructing the demographic history of the country (Khurana et al. 2014). Further, another study revealed four dominant ancestries in populations within mainland India, in contrast to the two ancestries inferred earlier by Reich et al. (2009), and also showed a distinct ancestry of the Andaman and Nicobar Islands populations that is ancestral to the Oceanic populations (Basu et al. 2016). Although earlier studies have contributed significantly to understanding the role of language, culture and geography in relation to the genetic affiliation and demographic history of Indian populations, a major limitation is that they were restricted to a few tribal populations belonging to specific linguistic families or specific geographic areas. This is a critically important limitation, since the Indian population is organized into 4635 communities (Singh 1997), which include self-defined castes, tribes and religious groups. Therefore, the present study provides many opportunities for examining the influence of linguistic, geographic and ethnic assimilation on the genomic diversity of India, with the following objectives:
• To understand the genomic variation of the Indian population.
• To reconstruct the evolutionary history of humans in India using molecular evidence.
• To create a database of the genomic diversity of various populations of India.
• To trace the migration history of the Indian population through mtDNA and Y-chromosome haplotypes.


Keeping the above in view, AnSI studied 75 communities, with 7807 blood samples from different parts of the country, under the project ‘DNA Polymorphism in Contemporary Indian Population’ (Table 1.1). The Indian population is organized into 4635 communities (Singh 1997), which include self-defined castes, tribes and religious groups. About 450 tribes constitute about 8.6% of the total Indian population according to the 2011 census. They speak more than 750 dialects, which can broadly be classified into the Austro-Asiatic, Dravidian, Tibeto-Burman and Indo-European language families. The tribes are endogamous and socio-culturally distinct, and mostly inhabit forests and hilly terrain. The Government of India has notified 75 tribes as Particularly Vulnerable Tribal Groups (PVTGs) among the original inhabitants of India. Of these 75 PVTGs, we studied 25 communities (Fig. 1.1) inhabiting the northern, southern, western, eastern and central regions of India and representing the four major linguistic families, namely Austro-Asiatic, Dravidian, Indo-European and Tibeto-Burman, besides two Tai-speaking communities in the northeastern part of India. Complete sequences of 2124 mtDNA genomes and Y-SNP data for 1907 samples from 74 communities have been analysed in the present study. The Institutional Ethics Committee of the Anthropological Survey of India approved the project ‘DNA Polymorphism of the Contemporary Indian Population’. Written consent was obtained from the participants, and 5 ml of venous blood was drawn from healthy, unrelated individuals into K2 EDTA vacutainers (BD make) by trained and certified phlebotomists following standard procedure. The samples were transported from the field sites to the Survey’s laboratories.


Table 1.1 Sample size, linguistic affinity, geographic location, ethnic groups of 75 Indian populations studied

| Sl. No. | Name of the community | Community code | Sample size | Linguistic affinity | State | Geographic location | Ethnic group |
|---|---|---|---|---|---|---|---|
| 1 | Alu Kurumba | AKB | 93 | Dravidian | Tamil Nadu | South | Australoid |
| 2 | Andh | AD | 115 | Indo-European | Maharashtra | Central | Australoid |
| 3 | Angami Naga | AN | 114 | Tibeto-Burman | Nagaland | North East | Mongoloid |
| 4 | Betta Kuruba | BK | 115 | Dravidian | Karnataka | South | Australoid |
| 5 | Bharia | BHR | 88 | Indo-European | Madhya Pradesh | Central | Australoid |
| 6 | Bhoi Khasi | BHK | 120 | Austro-Asiatic | Meghalaya | North East | Mongoloid |
| 7 | Bhoksa | BOK | 98 | Tibeto-Burman | Uttarakhand | North | Mongoloid |
| 8 | Bhotia | BOH | 89 | Tibeto-Burman | Uttarakhand | North | Mongoloid |
| 9 | Bondo | BND | 129 | Austro-Asiatic | Odisha | East | Australoid |
| 10 | Chenchu | CNH | 96 | Dravidian | Andhra Pradesh | South | Australoid |
| 11 | Damor | DMR | 120 | Indo-European | Rajasthan | West | Australoid |
| 12 | Dhodia | DHD | 95 | Indo-European | Gujarat | West | Australoid |
| 13 | Dirang Monpa | DMP | 100 | Tibeto-Burman | Arunachal Pradesh | North East | Mongoloid |
| 14 | Dungri Bhil | DB | 118 | Indo-European | Gujarat | West | Europoid |
| 15 | Yerukulas | EKL | 94 | Dravidian | Andhra Pradesh | South | Australoid |
| 16 | Gadia Lohar | GLR | 118 | Indo-European | Rajasthan | West | Europoid |
| 17 | Adi Gallong | GL | 108 | Tibeto-Burman | Arunachal Pradesh | North East | Mongoloid |
| 18 | Garhwali Brahmin | GBM | 100 | Indo-European | Uttarakhand | North | Caucasoid |
| 19 | Garhwali Rajput | GRJ | 100 | Indo-European | Uttarakhand | North | Caucasoid |
| 20 | Ghorkha | GRK | 101 | Tibeto-Burman | Uttarakhand | North | Mongoloid |
| 21 | Kolam | HK | 123 | Dravidian | Maharashtra | Central | Australoid |
| 22 | Hmar | HMR | 115 | Tibeto-Burman | Manipur | North East | Australoid |
| 23 | Irular | IRL | 88 | Dravidian | Tamil Nadu | South | Australoid |
| 24 | Jarawa | ANI | 10 | Andamanese | Andaman Islands | Island | Negrito |
| 25 | Jaunsari | JAN | 94 | Tibeto-Burman | Uttarakhand | North | Caucasoid |
| 26 | Jenu Kuruba | JK | 114 | Dravidian | Karnataka | South | Australoid |
| 27 | Kamar | KM | 111 | Indo-European | Madhya Pradesh | Central | Australoid |
| 28 | Kanikkar | KNI | 94 | Dravidian | Kerala | South | Negroid |
| 29 | Karen | KRN | 92 | Tibeto-Burman | Andaman Islands | Island | Mongoloid |
| 30 | Ka Thakur | KU | 220 | Indo-European | Gujarat | West | Europoid |
| 31 | Kathodi | KD | 120 | Indo-European | Gujarat | West | Europoid |
| 32 | Katkari | KR | 50 | Indo-European | Maharashtra | West | Europoid |
| 33 | Kattunayakan | KTN | 37 | Dravidian | Tamil Nadu | South | Negroid |
| 34 | Kutia Kondh | KKD | 91 | Dravidian | Odisha | East | Australoid |
| 35 | Kondareddis | KRY | 94 | Dravidian | Andhra Pradesh | South | Australoid |
| 36 | Koraga | KRG | 69 | Dravidian | Karnataka | South | Australoid |


Table 1.1 (continued)

| Sl. No. | Name of the community | Community code | Sample size | Linguistic affinity | State | Geographic location | Ethnic group |
|---|---|---|---|---|---|---|---|
| 37 | Korku | KK | 110 | Austro-Asiatic | Maharashtra | Central | Australoid |
| 38 | Kota | KTA | 91 | Dravidian | Tamil Nadu | South | Australoid |
| 39 | Koya | KYD | 88 | Dravidian | Andhra Pradesh | South | Australoid |
| 40 | Lachungpa | LC | 104 | Tibeto-Burman | Sikkim | North East | Mongoloid |
| 41 | Lepcha | LP | 109 | Tibeto-Burman | Sikkim | North East | Mongoloid |
| 42 | Madia | MA | 140 | Dravidian | Maharashtra | Central | Australoid |
| 43 | Malpaharia | ML | 114 | Indo-European | Jharkhand | East | Australoid |
| 44 | Melacheri | MCH | 96 | Indo-European | Lakshadweep | Island | Europoid |
| 45 | Mara | MAR | 104 | Tibeto-Burman | Mizoram | North East | Mongoloid |
| 46 | Ma Thakur | MT | 121 | Indo-European | Maharashtra | Central | Europoid |
| 47 | Mathur | MTH | 51 | Indo-European | Rajasthan | West | Europoid |
| 48 | Mina | MEN | 122 | Indo-European | Rajasthan | West | Australoid |
| 49 | Mizo | MZO | 104 | Tibeto-Burman | Mizoram | North East | Mongoloid |
| 50 | Mullu Kurumba | MKB | 86 | Dravidian | Tamil Nadu | South | Australoid |
| 51 | Munda | MN | 102 | Austro-Asiatic | Jharkhand | East | Australoid |
| 52 | Nayaka/Naikda | NYK | 72 | Indo-European | Gujarat | West | Australoid |
| 53 | Nicobarese | NIC | 114 | Austro-Asiatic | Nicobar Island | Island | Australoid |
| 54 | Nihal | NI | 107 | Indo-European | Maharashtra | Central | Australoid |
| 55 | Nishi | NSH | 115 | Tibeto-Burman | Arunachal Pradesh | North East | Mongoloid |
| 56 | Padhar | PDR | 91 | Indo-European | Gujarat | West | Europoid |
| 57 | Paite | PTI | 105 | Tibeto-Burman | Manipur | North East | Mongoloid |
| 58 | Paniyan | PNN | 86 | Dravidian | Kerala | South | Australoid |
| 59 | Pauri Bhuinya | PB | 120 | Austro-Asiatic | Odisha | East | Australoid |
| 60 | Porja | PRJ | 147 | Dravidian | Andhra Pradesh | South | Australoid |
| 61 | Rabha | RBA | 120 | Indo-European | Assam | North East | Mongoloid |
| 62 | Raji | RAJ | 97 | Tibeto-Burman | Uttarakhand | North | Mongoloid |
| 63 | Saharia | SHR | 113 | Austro-Asiatic | Madhya Pradesh | Central | Australoid |
| 64 | Savaras | SAV | 148 | Austro-Asiatic | Andhra Pradesh | South | Australoid |
| 65 | Sherdukpen | ST | 103 | Tibeto-Burman | Arunachal Pradesh | North East | Mongoloid |
| 66 | Soliga | SLG | 96 | Dravidian | Karnataka | South | Australoid |
| 67 | Sonowal Kachari | SK | 112 | Tibeto-Burman | Assam | North East | Mongoloid |
| 68 | Tai Ahom | THM | 103 | Tai | Assam | North East | Mongoloid |
| 69 | Tai Khampti | TKM | 120 | Tai | Arunachal Pradesh | North East | Mongoloid |
| 70 | Tharu | THR | 144 | Tibeto-Burman | Uttarakhand | North | Caucasoid |
| 71 | Toda | TDA | 96 | Dravidian | Tamil Nadu | South | Mongoloid |
| 72 | Toto | TOT | 102 | Tibeto-Burman | West Bengal | East | Mongoloid |


Table 1.1 (continued)

| Sl. No. | Name of the community | Community code | Sample size | Linguistic affinity | State | Geographic location | Ethnic group |
|---|---|---|---|---|---|---|---|
| 73 | Urali Kuruman | UK | 100 | Dravidian | Kerala | South | Australoid |
| 74 | Wancho | WA | 125 | Tibeto-Burman | Arunachal Pradesh | North East | Mongoloid |
| 75 | Yanadi | YND | 96 | Dravidian | Andhra Pradesh | South | Australoid |

Fig. 1.1 Map showing the geographical distribution of the 75 populations studied
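The sample sizes in Table 1.1 can be cross-checked against the total of 7807 blood samples from 75 communities stated in the text. The following is a minimal sketch of that arithmetic, assuming the values are read from the 'Sample size' column in serial-number order; the variable names are illustrative and are not part of the project's own analysis.

```python
# Sample sizes from Table 1.1 in serial-number order (Sl. No. 1-75).
# Hand-transcribed for illustration; the names below are hypothetical.
sample_sizes = [
    93, 115, 114, 115, 88, 120, 98, 89, 129, 96,      # 1-10
    120, 95, 100, 118, 94, 118, 108, 100, 100, 101,   # 11-20
    123, 115, 88, 10, 94, 114, 111, 94, 92, 220,      # 21-30
    120, 50, 37, 91, 94, 69, 110, 91, 88, 104,        # 31-40
    109, 140, 114, 96, 104, 121, 51, 122, 104, 86,    # 41-50
    102, 72, 114, 107, 115, 91, 105, 86, 120, 147,    # 51-60
    120, 97, 113, 148, 103, 96, 112, 103, 120, 144,   # 61-70
    96, 102, 100, 125, 96,                            # 71-75
]

n_communities = len(sample_sizes)   # number of communities listed
total_samples = sum(sample_sizes)   # total blood samples across communities

print(n_communities, total_samples)  # prints: 75 7807
```

The column sums to 7807 across 75 rows, in agreement with the figures given in the text.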

References

Bamshad M, Fraley A, Crawford M, Cann R, Busi B, Naidu J, Jorde L (1996) mtDNA variation in caste populations of Andhra Pradesh, India. Hum Biol 68(1):1–28
Bamshad M, Kivisild T, Watkins WS, Dixon ME, Ricker CE et al (2001) Genetic evidence on the origins of Indian caste populations. Genome Res 11(6):994–1004
Barik SS, Sahani R, Prasad BVR, Endicot P, Metspalu M et al (2008) Detailed mtDNA genotype permit a reassessment of the settlement and population structure of the Andaman Islands. Am J Phys Anthropol 136:19–27
Barnabas S, Apte RV, Suresh G (1996) Ancestry and interrelationships of the Indians and their relationship with other world populations: a study based on mitochondrial DNA polymorphisms. Ann Hum Genet 60:409–422
Basu A, Mukherjee N, Roy S, Sengupta S, Banerjee S et al (2003) Ethnic India: a genomic view, with special reference to peopling and structure. Genome Res 13(10):2277–2290
Basu A, Roy NS, Majumder PP (2016) Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure. Proc Natl Acad Sci U S A 113:1594–1599
Béteille A (1998) The idea of indigenous people. Curr Anthropol 39(2):187–192
Bhasin MK, Walter H (2001) Genetics of castes and tribes of India. Kamla-Raj Enterprises, Delhi
Bhattachayya NP, Basu P, Das M, Pramanik S, Banerjee R et al (1999) Negligible male gene flow across ethnic boundaries in India, revealed by analysis of Y-chromosomal DNA polymorphisms. Genome Res 9:711–719
Cavalli-Sforza LL, Minch E, Mountain JL (1992) Coevolution of genes and languages revisited. Proc Natl Acad Sci U S A 89:5620–5624
Census of India (2011) Office of the Registrar General and Census Commission. Government of India, New Delhi
Chandrasekar A, Saheb SY, Gangopadyaya P, Gangopadyaya S, Mukherjee A et al (2007) YAP insertion signature in South Asia. Ann Hum Biol 34:582–586
Chandrasekar A, Kumar S, Sreenath J, Sarkar BN, Urade BP et al (2009) Updating phylogeny of mitochondrial DNA macrohaplogroup M in India: dispersal of modern human in south Asian corridor. PLoS One 4(10):e7447. https://doi.org/10.1371/journal.pone.0007447
Chaubey G, Karmin M, Metspalu E, Metspalu M, SelviRani D, Singh VK et al (2008a) Phylogeography of mtDNA haplogroup R7 in the Indian peninsula. BMC Evol Biol 8:227. https://doi.org/10.1186/1471-2148-8-227
Chaubey G, Metspalu M, Karmin M, Thangaraj K, Rootsi S et al (2008b) Language shift by indigenous population: a model genetic study in South Asia. Int J Hum Genet 8(1–2):41–50


Cordaux R, Saha N, Bentley GR, Aunger R, Sirajuddin SM et al (2003) Mitochondrial DNA analysis reveals diverse histories of tribal populations from India. Eur J Hum Genet 11:253–264
Cordaux R, Aunger R, Bentley GR, Nasidze I, Sirajuddin SM et al (2004a) Independent origins of Indian caste and tribal paternal lineages. Curr Biol 14(3):231–235
Cordaux R, Deepa R, Vishwanathan H, Stoneking M (2004b) Genetic evidence for the demic diffusion of agriculture to India. Science 304:1125
Cordaux R, Weiss G, Saha N, Stoneking M (2004c) The Northeast Indian passageway: a barrier or corridor for human migrations? Mol Biol Evol 21(8):1525–1533
Gadgil M, Joshi NV, Shambu Prasad UV, Manoharan S, Patil S (1997) Peopling of India. In: Balasubramanian D, Appaji NR (eds) The Indian human heritage. Universities Press, Hyderabad, India, pp 100–129
Isukapatla AR, Sinha M, Pulamagatta V, Chandrasekar A, Ahirwar B (2019) Genetic architecture of Southeast-coastal Indian tribal populations: a Y-chromosomal phylogenetic analysis. Egypt J Forensic Sci 9:30
Kayser M, Brauer S, Weiss G, Schiefenhövel W, Underhill P, Shen P, Oefner P, Tommaseo-Ponzetta M, Stoneking M (2003) Reduced Y-chromosome, but not mitochondrial DNA, diversity in human populations from West New Guinea. Am J Hum Genet 72(2):281–302
Khurana P, Aggarwal A, Mitra S, Italia YM, Saraswathy KN et al (2014) Y Chromosome haplogroup distribution in Indo-European speaking tribes of Gujarat, Western India. PLoS One 9(3):e90414. https://doi.org/10.1371/journal.pone.0090414
Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K et al (2003a) The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet 72:313–332
Kivisild T, Rootsi S, Metspalu M, Metspalu E, Parik J et al (2003b) Genetics of the language and farming spread in India. In: Renfrew C, Boyle K (eds) Examining the farming/language dispersal hypothesis, McDonald Institute monographs series. McDonald Institute for Archaeological Research, Cambridge, pp 215–222
Krithika S, Maji S, Vasulu T (2009) A microsatellite study to disentangle the ambiguity of linguistic, geographic, ethnic and genetic influences on tribes of India to get a better clarity of the antiquity and peopling of South Asia. Am J Phys Anthropol 139:533–546
Kshatriya GK, Aggarwal A, Khurana P, Italia YM (2011) Genomic congruence of Indo-European speaking tribes of Western India with Dravidian-speaking populations of southern India: a study of 20 autosomal DNA markers. Ann Hum Biol 38:583–591
Kumar V, Banrida TL, Biswas S et al (2006) Asian and non-Asian origins of Mon-khmer and Mundari speaking Austro-Asiatic population of India. Am J Hum Biol 18:461–469
Kumar V, Reddy ANS, Babu JP, Rao TN, Langstieh BT, Thangaraj K, Reddy AG, Singh L, Reddy BM (2007) Y-chromosome evidence suggests a common paternal heritage of Austro-Asiatic populations. BMC Evol Biol 7:47
Kumar S, Padmanabham PB, Ravuri RR, Uttaravalli K, Koneru P et al (2008) The earliest settlers’ antiquity and evolutionary history of Indian populations: evidence from M2 mtDNA lineage. BMC Evol Biol 8:230
Kumar S, Ravuri RR, Koneru P, Urade BP, Sarkar BN et al (2009) Reconstructing Indian-Australian phylogenetic link. BMC Evol Biol 9:173. https://doi.org/10.1186/1471-2148-9-173
Majumder PP (1998) People of India: biological diversity and affinities. Evol Anthropol 6:100–110
Malhotra KC (1978) Morphological composition of the people of India. J Hum Evol 7:45–63
Malhotra KC, Vasulu TS (1993) Structure of human populations in India. In: Majumdar PP (ed) Human population genetics: a centennial tribute to J.B.S. Haldane. Plenum, New York, pp 207–233
Mountain JL, Hebert JM, Bhattacharyya S, Underhill PA, Ottolenghi C, Gadgil M, Cavalli-Sforza LL (1995) Demographic history of India and mtDNA-sequence diversity. Am J Hum Genet 56(4):979–992
Mourant AE, Kipec AC, Domanjewska-Sobezak K (1976) The distribution of human blood groups and other polymorphisms. Oxford University Press, London
Negi RS (1990) ABO blood groups in the North-Western India; a regional round up. In: Human variations in India. Anthropological Survey of India, Kolkata, pp 71–81
Negi RS, Das A (1963) The blood groups (ABO, MN, Rh) ABH secretor in saliva and colour blindness in the Rajputs of Western U.P. and Dholpur. Bull Anthropol Surv India 16:221–231
Negi RS et al (1972) Distribution of ABO blood groups in Central and Western Himalayan populations. Bull Anthropol Surv India 21(3–4):57–76
Negi RS et al (1992) Distribution of ABO blood groups in Central and Western Himalayan population. In: Raha MK (ed) Himalayas and Himalayans. Anthropological Survey of India, Calcutta
Palanichamy M, Sun C, Agrawal S, Bandelt H-J, Kong Q-P et al (2004) Phylogeny of mitochondrial DNA macrohaplogroup N in India based on complete sequencing: implications for the peopling of South Asia. Am J Hum Genet 75:966–978
Pawar S, Mukherjee K, Venugopal PN et al (2015) Distribution of blood groups among Jad Bhotia of Uttarakhand – a transhuman community of Tibetan origin. Indian J Genet Mol Res 4(2):47–52
Quintana-Murci L, Chaix R, Wells S, Behar D, Sayar H et al (2004) Where West meets East: the complex mtDNA landscape of the Southwest and Central Asian corridor. Am J Hum Genet 74(5):827–845
Ramana GV, Su B, Jin L, Singh L, Wang N et al (2001) Y-chromosome SNP haplotypes suggest evidence of gene flow among caste, tribe, and the migrant Siddi populations of Andhra Pradesh, South India. Eur J Hum Genet 9:695–700
Rao AI, Venugopal PN, Chandrasekar A (2013) Genetic continuity of anatomically modern human between India and Island Southeast Asia ISEA: last glacial dispersal of mtDNA lineage N22. Int J Res Advent Technol 1(5):300–305
Reich D, Thangaraj K, Patterson N, Price AL, Singh L (2009) Reconstructing Indian population history. Nature 461(7263):489–494
Roychoudhury S, Roy S, Dey B, Chakraborty M, Roy M et al (2000) Fundamental genomic unity of ethnic India is revealed by analysis of mitochondrial DNA. Curr Sci 79(9):1182–1192
Roychoudhury S, Roy S, Basu A, Banerjee R, Vishwanathan H et al (2001) Genomic structures and population histories of linguistically distinct tribal groups of India. Hum Genet 109:339–350
Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA et al (2006) Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of central Asian pastoralists. Am J Hum Genet 78(2):202–221
Sharma G, Tamang R, Chaudhary R, Singh VK, Shah AM et al (2012) Genetic affinities of the Central Indian tribal populations. PLoS One 7(2):e32546
Singh KS (1994) People of India: the scheduled tribes, vol 3. Oxford University Press, Delhi
Singh KS (ed) (1997) The scheduled tribes. Oxford University Press, Oxford
Singh KS, Bhalla V, Kaul V (1994) The biological variation in Indian Populations. Oxford University Press, Delhi
Srivastava AC (1975) ABO, MNS and Rh blood groups among Sayyads and Pathans of Lucknow. In: Rakshit HK (ed) Bio-anthropological Research in India, Proceedings of Seminar on Physical anthropology and allied disciplines. Anthropological Survey of India, Calcutta
Sylvester C, Krishna MS, Rao JS, Chandrasekar A (2018) Neolithic phylogenetic continuity inferred from complete mitochondrial DNA sequences in a tribal population of Southern India. Genetica 146:383–389
Sylvester C, Krishna MS, Rao JS, Chandrasekar A (2019) Maternal genetic link of a south Dravidian tribe with native Iranians indicating bidirectional migration. Ann Hum Biol 46(2):175–180
Sylvester C, Krishna MS, Rao JS, Chandrasekar A (2020) Y-chromosome marker characterization of Epipaleolithic and Neolithic groups of Southern India. Proc Natl Acad Sci India Sect B Biol Sci 90:425–430
Thangaraj K, Chaubey G, Kivisild T, Reddy AG, Singh VK et al (2005a) Reconstructing the origin of Andaman Islanders. Science 308:996
Thangaraj K, Sridhar V, Kivisild T, Reddy AG, Chaubey G et al (2005b) Different population histories of the Mundari and Mon-Khmer speaking Austro-Asiatic tribes inferred from the mt-DNA 9-bp deletion/insertion polymorphism in Indian populations. Hum Genet 116:507–517

2

Mitochondrial DNA Phylogeny in Indian Population

To construct the maternal phylogeny and trace the dispersal of modern humans in the Indian subcontinent, a diversified subset of 2124 complete mtDNA sequences representing 48 tribal communities was chosen, on the basis of whole-genome mtDNA resolution, from 5497 samples from 61 tribal communities. The study characterized 61 haplogroups based on their specific coding-region mutations.

MtDNA M-Macrohaplogroup

Of the 5497 mtDNA genomes screened from 61 tribes of India, macrohaplogroups M and N accounted for 68.2% and 31.8%, respectively (Table 2.1 and Fig. 2.1). In mainland India the frequency of haplogroup M ranges from 40% to 100%, except in the Toda of the Nilgiri Hills, where it is about 20%. The highest average (77%) of haplogroup M was observed in southern Indian populations. The Nicobarese of Nicobar Island harbour 100% haplogroup N, whereas N is absent in the Jarawa tribe of the Andaman Islands, the Kattunayakan of Tamil Nadu and the Chenchu of Andhra Pradesh. The frequency distribution of macrohaplogroups M and N varies significantly (P < 0.02) among the studied tribes, with a cline towards the southern and eastern regions of India, as shown in Table 2.2 and Fig. 2.2. In the present study, 51 haplogroups have been identified, and the phylogenetic status of previously identified haplogroups has been clearly defined based on control-region and/or coding-region information from the 61 tribal population-based data set (Table 2.3). Out of 1500 complete mtDNA sequences, the phylogenetic tree of haplogroup M in India has been drawn from 789 complete mtDNA sequences (677 from our study and 112 from earlier studies: Sun et al. 2006; Tanaka et al. 2004; Kong et al. 2003, 2011; Thangaraj et al. 2005a, b, 2006, 2008, 2009; Kivisild et al. 2006; Ingman et al. 2000; Reddy et al. 2007; Herrnstadt et al. 2002; van Holst Pellekaan et al. 2006; Palanichamy et al. 2006; Fornarino et al. 2009; Nagle et al. 2017; Gunnarsdóttir et al. 2011; Jinam et al. 2012; Summerer et al. 2014; Schönberg et al. 2011), shown in Figs. 2.3–2.57. In these figures, the suffixes A, C, G and T indicate transversions, ‘d’ signifies a deletion and a plus sign (+) an insertion; recurrent mutations are underlined. The prefix ‘@’ indicates a back mutation. The percentage distribution of haplogroup M2 among Indian tribal communities in the present dataset is given in Fig. 2.58. The phylogenetic tree of sub-haplogroups M2a and M2b is shown in Fig. 2.4, and a detailed phylogenetic tree has been constructed for the samples of the Andh, Betta

© Springer Nature Singapore Pte Ltd. 2021 Anthropological Survey of India, Genomic Diversity in People of India, https://doi.org/10.1007/978-981-16-0163-7_2


Table 2.1 Percentage distribution of M- and N-haplogroups of 61 communities

| Sl. No | Region | Population | M | N | Total | M% | N% |
|---|---|---|---|---|---|---|---|
| 1 | Island (Nicobar) | Nicobarese | 0 | 124 | 124 | 0.00 | 100.00 |
| 2 | Island (Andaman) | Jarawa | 10 | 0 | 10 | 100.00 | 0.00 |
| 3 | Island (Nicobar) | Karen | 14 | 34 | 48 | 29.17 | 70.83 |
| 4 | Island (Lakshadweep) | Melacheri | 22 | 46 | 68 | 32.37 | 67.63 |
| 5 | S. India | Kani | 82 | 3 | 85 | 96.47 | 3.53 |
| 6 | S. India | Irula | 58 | 21 | 79 | 73.42 | 26.58 |
| 7 | S. India | Mullu Kurumba | 70 | 13 | 83 | 84.34 | 15.66 |
| 8 | S. India | Kattunayakan | 37 | 0 | 37 | 100.00 | 0.00 |
| 9 | S. India | Kota | 67 | 8 | 75 | 89.33 | 10.67 |
| 10 | S. India | Alu Kurumba | 75 | 13 | 88 | 85.23 | 14.77 |
| 11 | S. India | Urali | 62 | 38 | 100 | 62.00 | 38.00 |
| 12 | S. India | Paniyan | 69 | 23 | 92 | 75.00 | 25.00 |
| 13 | S. India | Toda | 20 | 76 | 96 | 20.83 | 79.17 |
| 14 | S. India | Soliga | 34 | 34 | 68 | 50.00 | 50.00 |
| 15 | S. India | Betta Kuruba | 87 | 17 | 104 | 83.65 | 16.35 |
| 16 | S. India | Jenu Kuruba | 112 | 3 | 115 | 97.39 | 2.61 |
| 17 | S. India | Koraga | 33 | 36 | 69 | 47.83 | 52.17 |
| 18 | S. India | Yanadi | 87 | 4 | 91 | 95.60 | 4.40 |
| 19 | S. India | Erakula | 75 | 18 | 93 | 80.65 | 19.35 |
| 20 | S. India | Konda Reddi | 69 | 21 | 90 | 76.67 | 23.33 |
| 21 | S. India | Koya Dora | 61 | 21 | 82 | 74.39 | 25.61 |
| 22 | S. India | Savara | 90 | 15 | 105 | 85.71 | 14.29 |
| 23 | S. India | Porja | 53 | 35 | 88 | 60.23 | 39.77 |
| 24 | S. India | Chenchu | 91 | 0 | 91 | 100.00 | 0.00 |
| 25 | C. India | Madia | 107 | 25 | 132 | 81.06 | 18.94 |
| 26 | C. India | Hill Kolam | 90 | 32 | 122 | 73.77 | 26.23 |
| 27 | C. India | Andh | 69 | 15 | 84 | 82.14 | 17.86 |
| 28 | C. India | Ma Thakur | 71 | 50 | 121 | 58.68 | 41.32 |
| 29 | C. India | Nihal | 33 | 41 | 74 | 44.59 | 55.41 |
| 30 | C. India | Korku | 22 | 8 | 30 | 73.33 | 26.67 |
| 44 | C. India | Kamar | 76 | 31 | 107 | 71.03 | 28.97 |
| 31 | W. India | Ka Thakur | 73 | 48 | 121 | 60.33 | 39.67 |
| 32 | W. India | Katkari | 23 | 23 | 46 | 50.00 | 50.00 |
| 33 | W. India | Padhar | 74 | 17 | 91 | 81.32 | 18.68 |
| 34 | W. India | Dhodia | 56 | 37 | 93 | 60.22 | 39.78 |
| 35 | W. India | Naika | 41 | 29 | 70 | 58.57 | 41.43 |
| 36 | W. India | Dungribhil | 78 | 30 | 108 | 72.22 | 27.78 |
| 37 | W. India | Kathodi | 62 | 62 | 124 | 50.00 | 50.00 |
| 38 | W. India | Gadialohar | 71 | 29 | 100 | 71.00 | 29.00 |
| 39 | W. India | Mathur | 16 | 14 | 30 | 53.33 | 46.67 |
| 40 | E. India | Kutia Khondh | 73 | 18 | 91 | 80.22 | 19.78 |
| 41 | E. India | Pauri Bhuinya | 86 | 40 | 126 | 68.25 | 31.75 |
| 42 | E. India | Munda | 78 | 22 | 100 | 78.00 | 22.00 |
| 43 | E. India | Mal Paharia | 85 | 26 | 111 | 76.58 | 23.42 |
| 45 | E. India | Bharia | 46 | 38 | 84 | 54.76 | 45.24 |
| 46 | N.E. India | Lepcha | 64 | 20 | 84 | 76.19 | 23.81 |
| 47 | N.E. India | Lachungpa | 83 | 21 | 104 | 79.81 | 20.19 |
| 48 | N.E. India | Toto | 82 | 6 | 88 | 93.18 | 6.82 |
| 49 | N.E. India | Bhoi Khasi | 71 | 24 | 95 | 74.70 | 25.20 |

MtDNA M-Macrohaplogroup


Table 2.1 (continued)

Sl. No  Region      Population
50      N.E. India  Tai Ahom
51      N.E. India  Sonowal Kachari
52      N.E. India  Tai Kampti
53      N.E. India  Shatukpen
54      N.E. India  Nishi
55      N.E. India  Wanchoo
56      N.E. India  Adi Gallong
57      N.E. India  Dirang Monpa
58      N.E. India  Paite
59      N.E. India  Angami Naga
60      N. India    Bhotia
61      N. India    Jaunsari
                    Grand Total

Kuruba, Dungri Bhil, Kolam, Jenu Kuruba, Kathodi, Korku, Kamar, Katkari, Ka Thakur, Madia, Mal Paharia, Munda, Ma Thakur, Nihal and Pauri Bhuinya communities (Figs. 2.3, 2.4, 2.5). The study samples indicate that M2 is completely absent among the eight tribes of northeast India. Excluding the northeast tribes, the frequency of haplogroup M2 is about 14.09% among the studied communities. Its frequency is ~10–50% in the tribes of western and central India and declines gradually farther north and east. Among the tribes of the southern region, the Urali Kuruman show the highest frequency (100.00%), whereas the frequency among the adjacent Toda, Alu Kurumba, Betta Kuruba, Jenu Kuruba and Irular tribal communities is 75%, 74%, 57%, 6% and 5%, respectively. The distribution of subclade M2b varies greatly, from complete absence among the Indo-European speakers of western and central India to as high as 35.65% among the Betta Kuruba. Irrespective of region, its frequency is high (>50% of total M2) in all Dravidian speakers except the Madia tribe of the central region, whose linguistic affiliation remains unclear. Similarly, the M2 frequency is high in the Korku, an Austro-Asiatic tribe of central India. In eastern region M2b frequency remains low (0.1 to 0 indicates sudden population contraction. In the context of human evolution, these differences in mtDNA sequences reveal multimodal mismatch distributions in some of the hunter-gatherers and unimodal distributions in post-Neolithic populations. The present study shows that, out of 48 populations, the majority have a Tajima’s D < 0, suggesting a recent population expansion after a bottleneck, while three communities, namely the Jarawa, the Betta Kuruba and the Konda Reddy, show values suggesting a recent population contraction or bottleneck effect (Table 5.4). The P values across the different linguistic groups are shown in Table 5.5; they are highly significant at the 1% level among all the groups


except between the Tai Kadai and the Austro-Asiatic language groups. The FST values of the different linguistic groups are given in Table 5.6. The Andamanese show the highest value (0.03658), followed by the Indo-European (0.03512), Tai Kadai (0.0349), Austro-Asiatic (0.03467), Tibeto-Burman (0.03458) and Dravidian (0.03381) groups. The FST values between the linguistic groups were calculated to examine the relationships among the groups. The Andamanese show the highest pairwise FST values with all the other linguistic groups, and the lowest value is seen between the Austro-Asiatic and Dravidian groups.

Based on language, the studied communities were assigned to the major linguistic groups, namely the Indo-European, the Dravidian, the Austro-Asiatic, the Tibeto-Burman, the Tai Kadai and the Andamanese, and the mitochondrial DNA variants of these groups were analysed by multidimensional scaling (MDS). The Dravidian and the Austro-Asiatic groups show a close affinity; these two groups are separated from the Indo-European group, which falls apart from them on the plot. The Tibeto-Burman and the Tai Kadai groups, which also share a common geographical area, are closely linked to each other. The Andamanese group falls quite apart, without links to any other linguistic group, which clearly indicates that it is a separate group (Fig. 5.3).

The MDS plot within the Indo-European linguistic group shows a close relationship between the Dungri Bhil and the Kamar, and between the Ka Thakur and the Katkari, which are slightly separated from the Mal Paharia, whereas the Ma Thakur, the Kathodi, the Nihal and the Andh lie along the same line in a cluster. The Gadia Lohar and the Padhar share a common zone but maintain a wide distance. The island population, i.e. the Melacheri of Lakshadweep, falls quite far from all these communities, which clearly indicates that they have a separate origin (Fig. 5.4a).
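The multidimensional scaling step described above can be illustrated with a minimal sketch. The chapter does not state which MDS variant or software was used, so the code below assumes classical (Torgerson) MDS and uses a plain power iteration instead of a linear-algebra library. The distance matrix holds the pairwise FST values of Table 5.5; the assignment of each value to a pair of groups follows one reading of the printed table and should be treated as an assumption, as should all function and variable names.

```python
import math

# Pairwise FST values transcribed from Table 5.5 (assumed cell assignment).
LABELS = ["Tibeto-Burman", "Indo-European", "Dravidian",
          "Austro-Asiatic", "Andamanese", "Tai Kadai"]
FST = [
    [0.0000, 0.0442, 0.0424, 0.0293, 0.1612, 0.0204],
    [0.0442, 0.0000, 0.0289, 0.0226, 0.1796, 0.0475],
    [0.0424, 0.0289, 0.0000, 0.0198, 0.1357, 0.0310],
    [0.0293, 0.0226, 0.0198, 0.0000, 0.1503, 0.0272],
    [0.1612, 0.1796, 0.1357, 0.1503, 0.0000, 0.1926],
    [0.0204, 0.0475, 0.0310, 0.0272, 0.1926, 0.0000],
]


def classical_mds(dist, dims=2, iters=500):
    """Classical MDS: double-centre the squared distances, then take the
    leading eigenvectors (found here by power iteration with deflation)."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # B = -1/2 * J * D^2 * J, with J the centring matrix
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [math.sin(i + 1.0 + d) for i in range(n)]  # deterministic start
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        if lam <= 0.0:  # no further positive eigenvalue: leave this axis at zero
            break
        for i in range(n):
            coords[i][d] = v[i] * math.sqrt(lam)
        for i in range(n):  # deflate so the next pass finds the next axis
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


coords = classical_mds(FST)
for name, (x, y) in zip(LABELS, coords):
    print(f"{name:15s} {x:+.4f} {y:+.4f}")
```

In such a configuration the group with the largest distances to all others ends up farthest from the centre of the plot, which is the pattern the text reports for the Andamanese.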
Figure 5.4b depicts the clustering of the Tibeto-Burman groups, of which the Sonowal Kachari, the Wanchoo and the Lepcha share a common zone. In terms of close affinity the

Table 5.4 Community-wise molecular diversity indices

Sl. No  Community        Nucleotide diversity  Pairwise differences  SSD      P     HRI      P     Pi                 Theta S           Tajima’s D  P      Fu’s Fs  P
1       Alu Kurumba      0.00157 ± 0.0008      21.510 ± 9.83         0.0185   0.53  0.0089   0.8   21.5108 ± 10.95    25.9754 ± 8.70    0.6814      0.285  9.728    0
2       Andh             0.001445 ± 0.0007     23.950 ± 10.89        0.00631  0.21  0.00853  0.51  23.9500 ± 12.13    38.6656 ± 12.66   1.5103      0.052  9.63172  0.004
3       Angami Naga      0.002216 ± 0.001      36.7126 ± 16.08       0.00039  0.87  0.00056  0.99  36.712 ± 17.81     67.6589 ± 16.27   1.5312      0.028  23.981   0.001
4       Betta Kuruba     0.00114 ± 0.0005      18.8873 ± 8.60        0.08035  0     0.09715  0     18.8873 ± 9.57     13.8830 ± 4.59    1.3495      0.939  16.0146  0
5       Bhoi Khasi       0.00213 ± 0.001       26.5731 ± 11.93       0.00163  0.89  0.00196  0.99  26.5731 ± 13.26    55.1210 ± 16.62   1.9623      0.004  16.3193  0
6       Bhotia           0.00117 ± 0.0007      16.8000 ± 8.74        0.0481   0.7   0.11555  0.62  16.800 ± 10.09     20.5839 ± 9.99    1.1765      0.108  0.2191   0.27
7       Chenchu          0.00166 ± 0.0008      20.76923 ± 9.76       0.01241  0.7   0.03526  0.27  20.7692 ± 10.96    26.4139 ± 10.04   0.9502      0.17   3.6297   0.048
8       Dirang Monpa     0.00158 ± 0.0007      26.25806 ± 11.82      0.02263  0     0.03533  0     26.2580 ± 13.14    40.9709 ± 12.69   1.3723      0.065  14.0261  0
9       Dungri Bhil      0.00113 ± 0.0005      18.7884 ± 8.48        0.00399  0.6   0.00764  0.33  18.7884 ± 9.42     29.8145 ± 8.78    1.3494      0.068  24.2027  0
10      Gadia Lohar      0.0019 ± 0.0009       25.2728 ± 11.24       0.00212  0.81  0.00207  0.91  25.27288 ± 12.47   40.530 ± 11.06    1.329       0.07   24.097   0
11      Gallong          0.001696 ± 0.0008     28.0971 ± 12.56       0.00271  0.66  0.00326  0.8   28.09716 ± 13.95   53.6909 ± 15.84   1.7793      0.014  18.963   0
12      Hill Kolam       0.0017 ± 0.0009       29.4545 ± 13.86       0.01717  0.44  0.03053  0.62  29.4545 ± 15.61    35.7630 ± 14.02   0.8231      0.195  1.8492   0.102
13      Irula            0.001449 ± 0.0007     23.19743 ± 10.42      0.00469  0.79  0.00385  0.9   23.1974 ± 11.57    30.79785 ± 9.19   0.9095      0.189  22.603   0
14      Jarawa           0.00096 ± 0.0005      16.0222 ± 7.81        0.20836  0.05  0.2222   0.36  16.0222 ± 8.84     10.2510 ± 4.47    2.69924     1      2.278    0.096
15      Jenu Kuruba      0.00142 ± 0.0007      23.64429 ± 10.50      0.01244  0.03  0.0077   0.02  23.64429 ± 11.63   24.98966 ± 6.55   0.18164     0.516  24       0.001
16      Ka Thakur        0.0014 ± 0.0007       23.92029 ± 10.89      0.01065  0.4   0.01697  0.2   23.92029 ± 12.14   28.38557 ± 9.47   0.62498     0.267  8.946    0.005
17      Kamar            0.001380 ± 0.0006     22.86666 ± 10.21      0.01937  0     0.02533  0     22.86666 ± 11.33   27.31983 ± 7.70   0.5762      0.302  24.1192  0
18      Kathodi          0.00139 ± 0.0007      23.04210 ± 10.58      0.01437  0.11  0.02146  0.15  23.04210 ± 11.82   32.4150 ± 11.23   1.1912      0.095  6.5796   0.011
19      Katkari          0.00147 ± 0.0007      24.35672 ± 11.20      0.00392  0.92  0.00721  0.92  24.35672 ± 12.51   30.3284 ± 10.66   0.8184      0.198  5.7086   0.015
20      Kattu Nayakan    0.000843 ± 0.0004     13.96396 ± 6.41       0.04062  0.69  0.04647  0.74  13.963964 ± 7.12   12.93549 ± 4.12   0.1675      0.617  24.3599  0
21      Kutia Khond      0.00229 ± 0.001       37.85413 ± 16.60      0.00045  0.95  0.00044  1     37.85413 ± 18.39   101.284 ± 24.98   2.139       0.001  24.072   0
22      Konda Reddy      0.00144 ± 0.0007      23.2930 ± 10.47       0.01183  0.39  0.00847  0.51  23.29303 ± 11.64   20.70646 ± 6.36   0.4587      0.742  20.782   0
23      Korku            0.00175 ± 0.0008      29.09523 ± 3.22       0.01357  0.17  0.01328  0.38  29.09523 ± 14.76   43.0684 ± 14.48   1.3188      0.088  6.488    0.006
24      Lachungpa        0.00174 ± 0.0008      28.97333 ± 13.11      0.00534  0.57  0.00523  0.93  28.97333 ± 14.61   37.8711 ± 12.41   0.932       0.187  8.264    0.005
25      Lepcha           0.00153 ± 0.0007      25.40000 ± 11.63      0.05857  0     0.10612  0     25.40000 ± 12.99   25.36826 ± 8.86   0.00513     0.557  6.072    0.009
26      Ma Thakur        0.00105 ± 0.0005      17.40952 ± 8.20       0.02169  0.56  0.0273   0.59  17.40952 ± 9.19    17.53003 ± 6.67   0.02973     0.536  4.819    0.016
27      Madia            0.00134 ± 0.0006      22.22000 ± 10.12      0.00601  0.52  0.00755  0.74  22.2200 ± 11.28    30.7206 ± 10.13   1.093       0.149  10.216   0.002
28      Mal Paharia      0.00181 ± 0.0009      30.04575 ± 13.77      0.0462   0.01  0.09457  0     30.04575 ± 15.40   37.5048 ± 13.26   0.8391      0.207  4.305    0.021
29      Melacheri        0.000751 ± 0.0004     9.63809 ± 4.60        0.02865  0.21  0.04544  0.1   9.63809 ± 5.13     15.28737 ± 5.42   1.477       0.037  13.8493  0
30      Mullu Kurumba    0.00092 ± 0.0004      12.6537 ± 5.78        0.11887  0     0.00682  1     12.6537 ± 6.41     19.88043 ± 5.54   1.2518      0.089  24.36    0
31      Munda            0.00190 ± 0.0009      31.4789 ± 14.07       0.00697  0.04  0.01047  0     31.47899 ± 15.64   61.9201 ± 18.63   1.865       0.012  14.322   0
32      Nicobarese       0.00130 ± 0.0006      21.6029 ± 10.02       0.05698  0     0.0426   0.11  21.60294 ± 11.22   20.7055 ± 7.58    0.1828      0.614  5.111    0.023
33      Nihal            0.001587 ± 0.0007     26.294 ± 11.84        0.01049  0.11  0.01688  0.02  26.294 ± 13.18     44.555 ± 13.86    1.575       0.037  13.232   0
34      Padhar           0.00166 ± 0.0008      27.57003 ± 12.18      0.00321  0.59  0.00175  0.77  27.57003 ± 13.49   55.2109 ± 13.86   1.70491     0.013  23.986   0.001
35      Paite            0.00236 ± 0.001       35.90909 ± 16.22      0.00454  0.57  0.00796  0.58  35.90909 ± 18.09   73.9674 ± 24.35   2.08756     0.004  5.9091   0.016
36      Paniyan          0.00136 ± 0.0006      22.5543 ± 10.02       0.01824  0.02  0.00957  0.04  22.5543 ± 11.11    32.2340 ± 8.31    1.0174      0.127  24.001   0
37      Paudi Bhuiya     0.00179 ± 0.0008      29.67867 ± 13.27      0.01185  0.02  0.00585  0.4   29.67867 ± 14.74   59.8865 ± 17.82   1.8985      0.013  16.566   0.001
38      Porja            0.00210 ± 0.001       34.89341 ± 15.33      0.00059  0.84  0.0005   1     34.8934 ± 16.99    87.1483 ± 21.76   2.0563      0.005  24.05    0
39      Savara           0.00189 ± 0.0009      31.40072 ± 13.81      0.00646  0.13  0.00112  1     31.40072 ± 15.30   72.5934 ± 17.68   1.915       0.007  23.962   0
40      Sherdukpen       0.00131 ± 0.0006      21.71428 ± 10.15      0.0758   0     0.13941  0     21.71428 ± 11.38   22.7582 ± 8.56    0.1997      0.473  3.999    0.036
41      Soliga           0.00184 ± 0.0009      30.58029 ± 13.64      0.00288  0.97  0.00223  0.98  30.58029 ± 15.16   41.8647 ± 12.43   1.0025      0.157  17.7895  0
42      Sonowal Kachari  0.00166 ± 0.0008      27.50877 ± 12.60      0.02105  0.02  0.05536  0.01  27.50877 ± 14.08   38.6254 ± 13.47   1.2013      0.106  5.1538   0.02
43      Tai Ahom         0.00215 ± 0.001       35.6567 ± 15.72       0.00198  0.51  0.00175  0.41  35.6567 ± 17.43    117.379 ± 30.96   2.4695      0      24.169   0
44      Tai Khamti       0.0007 ± 0.0003       11.1029 ± 5.30        0.01531  0.34  0.0459   0.09  11.10294 ± 5.94    12.71915 ± 4.79   0.5271      0.316  8.6291   0.003
45      Toda             0.00084 ± 0.0004      13.93982 ± 6.31       0.03905  0.6   0.05325  0.41  13.9398 ± 6.99     15.9717 ± 4.26    0.4196      0.398  24.1954  0
46      Toto             0.00099 ± 0.0005      16.51851 ± 7.58       0.01919  0.19  0.01807  0.31  16.51851 ± 8.44    16.18931 ± 5.37   0.07746     0.609  15.637   0
47      Urali Kuruba     0.00069 ± 0.0003      9.80303 ± 4.83        0.013    0.92  0.01767  0.99  9.80303 ± 5.43     15.8946 ± 6.43    1.756       0.02   5.0174   0.01
48      Wancho           0.001851 ± 0.0009     30.52173 ± 13.83      0.0053   0.71  0.00917  0.64  30.5217 ± 15.43    47.4150 ± 15.74   1.446       0.047  6.7772   0.013

SSD sum of squared deviation, HRI Harpending’s raggedness index, P the P value of the statistic to its left
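The relationship between the Pi, Theta S and Tajima’s D columns of Table 5.4 can be made concrete with a short sketch. It follows the standard definition of the statistic (Tajima 1989): D is the difference between the mean number of pairwise differences (pi) and the estimate S/a1 based on the number of segregating sites, scaled by its standard deviation, so D is negative whenever Theta S exceeds Pi. The function name and the toy alignments are illustrative assumptions and are not data from this study; the program actually used by the authors is not named in this passage.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Return (pi, theta_s, D) for a list of equal-length aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    s = sum(1 for i in range(length) if len({q[i] for q in seqs}) > 1)
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_s = s / a1
    if s == 0:
        # D is undefined without polymorphism; 0.0 is returned by convention here
        return pi, theta_s, 0.0
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (pi - theta_s) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_s, d


# Toy alignment with only singleton variants (excess of rare variants).
expansion_like = ["AAAAAA", "TAAAAA", "ATAAAA", "AATAAA", "AAATAA"]
# Toy alignment with two equally frequent haplotypes (excess of common variants).
structured = ["AAAA", "AAAA", "TTAA", "TTAA"]

pi1, theta1, d1 = tajimas_d(expansion_like)
pi2, theta2, d2 = tajimas_d(structured)
print(f"singletons only : pi={pi1:.3f} theta_S={theta1:.3f} D={d1:.3f}")
print(f"two haplotypes  : pi={pi2:.3f} theta_S={theta2:.3f} D={d2:.3f}")
```

The first toy alignment mimics the excess of rare variants expected after an expansion and gives a negative D; the second mimics the excess of intermediate-frequency variants expected after a contraction and gives a positive D.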

Molecular Diversity Indices

Population Diversity and Molecular Diversity Indices Based on mtDNA. . .

Table 5.5 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population between the linguistic categories (FST below the diagonal, P above the diagonal)

Linguistic categories  Tibeto-Burman  Indo-European  Dravidian  Austro-Asiatic  Andamanese  Tai Kadai
Tibeto-Burman          -              0.0000         0.0000     0.0000          0.0000      0.0000
Indo-European          0.0442         -              0.0000     0.0000          0.0000      0.0000
Dravidian              0.0424         0.0289         -          0.0000          0.0000      0.0000
Austro-Asiatic         0.0293         0.0226         0.0198     -               0.0000      0.02703
Andamanese             0.1612         0.1796         0.1357     0.1503          -           0.0000
Tai Kadai              0.0204         0.0475         0.0310     0.0272          0.1926      -

Table 5.6 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population as per linguistic categories

Sl. No  Linguistic      FST
1       Tibeto-Burman   0.03458
2       Indo-European   0.03512
3       Dravidian       0.03381
4       Austro-Asiatic  0.03467
5       Andamanese      0.03658
6       Tai Kadai       0.0349

Sonowal Kachari and the Wanchoo are closely related. The Dirang Monpa, the Lachungpa and the Angami Naga lie in one cluster, while the Sherdukpen falls apart. Similarly, the Toto, the Gallong and the Paite belong to a cluster but are widely separated from each other. Figure 5.4c shows the clustering of the Austro-Asiatic groups: the Bhoi Khasi, the Pauri Bhuiya, the Munda, the Nicobarese, the Kutia Khond, the Savara and the Korku. The Pauri Bhuiya and the Munda share a very close affinity with each other. The Bhoi Khasi is separated from these two communities. The Kutia Khond, the Savara and the Korku come under one cluster. The Nicobarese are distant from all these communities. The MDS plot for the Dravidian populations is shown in Fig. 5.4d. This figure indicates that the Jenu Kuruba, the Kattu Nayakan, the Irula and the Chenchu are in close association and maintain a wide distance from the Mullu Kurumba. Nevertheless, the Paniyan and the Madia fall in the same cluster. The Betta Kuruba and the Hill Kolam fall in the zone opposite to the above-mentioned communities. The Koraga, the Alu Kurumba and the Urali Kuruman lie in one category but are widely separated from each other. The Porja, the Soliga and the Toda, though they come under one cluster, are distant from the others. F-statistics give a measure of correlation between the geographical zones and the subdivided populations. The FST values of the different geographical areas are shown in Table 5.7. The P values across all the zones are found to be highly significant at the 1% level. The FST values are slightly higher in the North (0.04890), followed by the Central (0.04806), Island (0.04782), West (0.04775), Northeast (0.04715)

Fig. 5.4 Multidimensional plot of mtDNA SNPs for different linguistic groups in the Indian population. (a) Multidimensional (MDS) plot of Indo-European linguistic communities. (b) Multidimensional (MDS) plot of Tibeto-Burman linguistic communities. (c) Multidimensional (MDS) plot of Austro-Asiatic linguistic communities. (d) Multidimensional (MDS) plot of Dravidian linguistic communities. Axes in all panels: Dimension 1 (horizontal) and Dimension 2 (vertical)

and the East (0.04706). The FST values across the geographical zones are also given in Table 5.8.

Table 5.7 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population between the geographic categories (FST below the diagonal, P above the diagonal)

Geography  North   South   Central  Northeast  West    East    Island
North      -       0.0000  0.0000   0.0000     0.0000  0.0000  0.0000
South      0.0239  -       0.0000   0.0000     0.0000  0.0000  0.0000
Central    0.0685  0.0501  -        0.0000     0.0000  0.0000  0.0000
Northeast  0.0072  0.0384  0.0650   -          0.0000  0.0000  0.0000
West       0.0509  0.0312  0.0355   0.0414     -       0.0000  0.0000
East       0.0179  0.0314  0.0235   0.0408     0.0316  -       0.0000
Island     0.1917  0.1034  0.2371   0.1337     0.1769  0.1557  -

A description of the neighbouring distances among the various populations, based on mitochondrial variation in the different geographical areas and linguistic groups, follows. As described elsewhere, the Indian subcontinent has been divided into seven geographical zones: the North, the Northeast, the East, the West, the Central, the South and the Island zones. The communities residing in these zones were categorized accordingly, and

a multidimensional scaling graph was drawn to examine the affinities across the zones. Figure 5.5 shows the genetic distances of the communities across the different geographical zones. The Northeast and the North Indian communities are found to have a very close relationship in comparison with the other zones. Similarly, the populations of the West and the Central zones have a close affinity, while the Island populations are totally isolated and fall in a separate region, quite apart from all the other zones. The Southern and the Eastern populations fall in the median zone, which separates the north-east and the north from the west and the central regions.

Table 5.8 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population as per geographic categories

Sl. No  Geography  FST
1       North      0.04890
2       South      0.04626
3       Central    0.04806
4       Northeast  0.04715
5       West       0.04775
6       East       0.04706
7       Island     0.04782

Fig. 5.5 Multidimensional plot for mtDNA SNPs of the geographic categories in the Indian population. Axes: Dimension 1 (horizontal) and Dimension 2 (vertical)

Figure 5.6a shows the distribution of the communities residing in the eastern part of the country. As in the MDS plot of the linguistic categories, the populations are widely distributed, namely the Kutia Khond, the Pauri Bhuiya, the Munda and the Mal Paharia. As observed in Fig. 5.4c (Austro-Asiatic group), where the Pauri Bhuiya and the Munda share a close association within the same cluster, a similar association is seen in this plot. Figure 5.6b depicts the communities of central India: the Madia, the Kamar, the Nihal, the Andh, the Ma Thakur, the Korku and the Hill Kolam. The Nihal and the Andh, and the Korku and the Hill Kolam, show a close association, respectively. The Madia and the Kamar fall in a cluster but are slightly distant from each other. The Ma Thakur falls in a separate zone. Figure 5.6c depicts the distribution of the communities of North East India and the associations among them. The Wanchoo, the Dirang Monpa, the Lepcha and the Lachungpa form a cluster but are distant from each other. The Bhoi Khasi, the Tai Ahom, the Paite, the Angami Naga and the Sherdukpen form a cluster, showing their association, and the Bhoi Khasi and the Tai Ahom are closely linked. The Sonowal Kachari and the Tai Khamti form one more cluster but are distant from each other. Similarly, the Toto and the Gallong also form a group without being close. Figure 5.6d describes the distribution of the communities of the Western region of the country. Except for the Katkari and the Ka Thakur, all the other communities, namely the Dungri Bhil, the

Fig. 5.6 Multidimensional plot of mtDNA SNPs for the geographic categories in the Indian population. (a) Multidimensional (MDS) plot of eastern Indian communities. (b) Multidimensional (MDS) plot of central Indian communities. (c) Multidimensional (MDS) plot of north-east Indian communities. (d) Multidimensional (MDS) plot of western Indian communities. (e) Multidimensional (MDS) plot of South Indian communities. (f) Multidimensional (MDS) plot of Island communities. Axes in all panels: Dimension 1 (horizontal) and Dimension 2 (vertical)

Padhar, the Kathodi and the Gadia Lohar, are distant from each other. Figure 5.6e shows the distribution of the communities of the Southern region of India and the associations among them. The Kattu Nayakan, the Jenu Kuruba, the Irula, the Chenchu and the Paniyan form a cluster, within which the Jenu Kuruba and the Kattu Nayakan share a close association. The Porja, the Savara, the Soliga and the Toda form a group in which the Porja and the Savara are close. The Koraga, the Alu Kurumba and the Urali Kuruman fall in one zone but are distant from each other. The Betta Kuruba deviates from all the other communities and falls in a separate zone. Figure 5.6f shows the distribution of the Island populations, namely the Jarawa and the Nicobarese of the Andaman and Nicobar Islands and the Melacheri of Lakshadweep. All three communities clearly show distinct identities and are widely separated from each other, which suggests different stocks of origin.
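The pairwise FST comparisons used throughout this section can be illustrated with a small sketch. The tables in this chapter report AMOVA-based FST values; the code below does not reproduce that procedure but assumes a simpler sequence-based estimator, FST = 1 − Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb the mean number between populations. The function names and the toy haplotypes are illustrative assumptions, not data from this study.

```python
from itertools import combinations, product


def n_diff(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Mean pairwise difference among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Mean pairwise difference between sequences of two populations."""
    pairs = list(product(pop1, pop2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop1, pop2):
    """FST = 1 - Hw / Hb (simplified estimator; needs >= 2 sequences per population)."""
    hw = (mean_within(pop1) + mean_within(pop2)) / 2.0
    hb = mean_between(pop1, pop2)
    if hb == 0:
        return 0.0  # all sequences identical: no differentiation to measure
    return 1.0 - hw / hb


# Toy haplotypes: two populations that share no haplotype.
pop_a = ["AAAA", "AAAA", "AAAT"]
pop_b = ["TTAA", "TTAA", "TTAT"]
fst_ab = pairwise_fst(pop_a, pop_b)

# Two populations fixed for different haplotypes are completely differentiated.
fst_fixed = pairwise_fst(["AAAA", "AAAA"], ["TTTT", "TTTT"])
print(f"FST(a, b) = {fst_ab:.4f}; FST(fixed) = {fst_fixed:.1f}")
```

Under this definition a value near 0 means that sequences drawn from different populations are no more different than sequences drawn from the same population, and a value near 1 means that almost all variation lies between the populations.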

Table 5.9 presents the FST values of the different ethnic groups. The FST value is highest among the Negroid (0.03865), followed by the Europoid (0.03744), the Mongoloid (0.03721) and the Australoid (0.03688) groups. The FST values indicate the magnitude of the correlation between the groups. The P values across the four ethnic groups are highly significant at the 1% level (Table 5.10). Figure 5.7 shows the distribution and affinities of the ethnic groups. The groups fall widely apart in different clusters, with the exception of the Australoid and the Europoid, which form a cluster although they are distant from each other. Figure 5.8a shows the distribution of the communities of the Mongoloid group. The figure shows a close association between the Sonowal Kachari, the Dirang Monpa and the Tai Khamti; the Bhotia, the Wanchoo and the Lepcha also fall within the same cluster. Similarly, the Bhoi Khasi, the Paite and the Tai Ahom indicate their

Table 5.9 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population as per ethnic groups

Sl. No  Ethnic groups  FST
1       Australoid     0.03688
2       Mongoloid      0.03721
3       Europoid       0.03744
4       Negroid        0.03865

Table 5.10 Analysis of molecular variance (AMOVA) FST of mtDNA among Indian population between the ethnic groups (FST below the diagonal, P above the diagonal)

Ethnic      Australoid  Mongoloid  Europoid  Negroid
Australoid  -           0.00000    0.00000   0.00000
Mongoloid   0.03150     -          0.00000   0.00000
Europoid    0.03188     0.04997    -         0.00000
Negroid     0.12810     0.16248    0.20188   -

Fig. 5.7 Multidimensional plot of mtDNA SNPs for the ethnic groups in the Indian population. Axes: Dimension 1 (horizontal) and Dimension 2 (vertical)

commonness, showing a distant relationship with the Gallong, the Sherdukpen and the Toto, respectively. Notably, the Nicobarese of the Andaman and Nicobar Islands stand apart from all the other studied Mongoloid communities of mainland India, which suggests that the Nicobarese have a different origin. Figure 5.8b shows the distribution of the communities belonging to the Australoid group. The Savara, the Munda and the Pauri Bhuiya show a very close affinity and fall at the same point of the plot. The cluster is also shared by the Madia, the Kamar, the Nihal, the Andh, the Chenchu, the Mal Paharia, the Paniyan and the Mullu Kurumba. Another cluster comprises the Porja, the Irula, the Jenu Kuruba, the Kattu Nayakan and the Soliga. The Hill Kolam, the Kutia Khond, the Alu Kurumba and the Urali Kuruman share a cluster but are widely separated. The Betta Kuruba and the Korku form a group with a considerable distance between them. Figure 5.8c depicts the distribution of the communities belonging to the Europoid group. Except for the Ka Thakur and the Katkari, all the other communities, namely the Ma Thakur, the Dungri Bhil, the Kathodi, the Padhar and the Gadia Lohar, are distant from each other within the cluster, while the Melacheri community of the islands and the Toda population of the Nilgiri hills are widely separated from all the other Europoid groups. Furthermore, the plot indicates that there is a large distance even between the Toda and the Melacheri communities themselves.

Fig. 5.8 Multidimensional plot of mtDNA SNPs for the ethnic groups in the Indian population. (a) Multidimensional (MDS) plot of Mongoloid communities. (b) Multidimensional (MDS) plot of Australoid communities. (c) Multidimensional (MDS) plot of Europoid communities. Axes in all panels: Dimension 1 (horizontal) and Dimension 2 (vertical)

References

References Achilli A, Rengo C, Battaglia V, Pala M, Olivieri A et al (2005) Saami and Berbers-an unexpected mitochondrial DNA link. Am J Hum Genet 76:883–886 Al-Abri A, Podgorná E, Rose JI et al (2012) PleistoceneHolocene boundary in Southern Arabia from the perspective of human mtDNA variation. Am J Phys Anthropol 149(2):291–298 Atkinson QD, Gray RD, Drummond AJ (2008) mtDNA variation predicts population size in humans and reveals a major Southern Asian chapter in human prehistory. Mol Biol Evol 25:468–474 Bamshad MJ, Watkins WS, Dixon ME, Jorde LB, Rao BB et al (1998) Female gene flow stratifies Hindu castes. Nature 395:651–652 Bamshad M, Kivisild T, Watkins WS, Dixon ME, Ricker CE et al (2001) Genetic evidence on the origins of Indian caste populations. Genome Res 11(6):994–1004 Barik SS, Sahani R, Prasad BVR, Endicot P, Metspalu M et al (2008) Detailed mtDNA genotype permit a reassessment of the settlement and population structure of the Andaman Islands. Am J Phys Anthropol 136:19–27 Behar DM, Metspalu E, Kivisild T, Rosset S, Tzur S et al (2008a) Counting the founders: the matrilineal genetic ancestry of the Jewish Diaspora. PLoS One 3(4):2062 Behar DM, Villems R, Soodyall H, Blue-Smith J, Pereira L et al (2008b) The dawn of human matrilineal diversity. Am J Hum Genet 82(5):1130–1140 Cabrera VM, Marrero P, Abu-Amero KK et al (2018) Carriers of mitochondrial DNA macrohaplogroup L3 basal lineages migrated back to Africa from Asia around 70,000 years ago. BMC Evol Biol 18:98 Cann RL (2001) Genetic clues to dispersal in human populations: retracing the past from the present. Science 291:1742–1748 Chandrasekar A, Saheb SY, Gangopadyaya P, Gangopadyaya S, Mukherjee A et al (2007) YAP insertion signature in South Asia. Ann Hum Biol 34:582–586 Chandrasekar A, Kumar S, Sreenath J, Sarkar BN, Urade BP et al (2009) Updating phylogeny of mitochondrial DNA macrohaplogroup M in India: dispersal of modern human in south Asian corridor. PLoS One 4(10): e7447. 
https://doi.org/10.1371/journal.pone.0007447 Chaubey G, Karmin M, Metspalu E, Metspalu M, SelviRani D, Singh VK et al (2008a) Phylogeography of mtDNA haplogroup R7 in the Indian peninsula. BMC Evol Biol 8:227. https://doi.org/10.1186/1471-2148-8227 Chaubey G, Metspalu M, Karmin M, Thangaraj K, Rootsi S et al (2008b) Language shift by indigenous population: a model genetic study in South Asia. Int J Hum Genet 8(1–2):41–50 Clark VJ, Sivendren S, Saha N, Bentley GR, Aunger R et al (2000) The 9-bp deletion between the mitochondrial lysine tRNA and COII genes in tribal populations of India. Hum Biol 72:273–285

141 Coble MD, Just RS, O’Callaghan JE et al (2004) Single nucleotide polymorphisms over the entire mtDNA genome that increase the power of forensic testing in Caucasians. Int J Legal Med 118(3):137–146 Cordaux R, Saha N, Bentley GR, Aunger R, Sirajuddin SM et al (2003) Mitochondrial DNA analysis reveals diverse histories of tribal populations from India. Eur J Hum Genet 11:253–264 Derenko M, Malyarchuk B, Denisova G, Wozniak M, Grzybowski T, Dambueva I, Zakharov I (2007) Y-chromosome haplogroup N dispersals from south Siberia to Europe. J Hum Genet 52:763–770 Derenko M, Malyarchuk B, Denisova G et al (2012) Complete mitochondrial DNA analysis of eastern Eurasian haplogroups rarely found in populations of northern Asia and eastern Europe. PLoS One 7(2):e32179 Derenko M, Malyarchuk B, Bahmanimehr A, Denisova G, Perkova M et al (2013) Complete mitochondrial DNA diversity in Iranians. PLoS One 8(11):e80673 Edwin D, Vishwanathan H, Roy S, Usha Rani MV, Majumder PP (2002) Mitochondrial DNA diversity among five tribal populations of southern India. Curr Sci 83(2):25 Endicott P, Sanchez JJ, Metspalu E, Behar DM, Kivisild T (2007) The unresolved location of Otzi’s mtDNA within haplogroup K. Am J Phys Anthropol 132 (4):590–1; discussion 591–593 Ewan C (2017) Oldest Homo sapiens fossil claim rewrites our species’ history. Nature. https://doi.org/10.1038/ nature.2017.22114. Retrieved 11 June 2017 Fedorova SA, Reidla M, Metspalu E et al (2013) Autosomal and uniparental portraits of the native populations of Sakha (Yakutia): implications for the peopling of Northeast Eurasia. BMC Evol Biol 13:127 Fernandes V, Alshamali F, Alves M et al (2012) The Arabian cradle: mitochondrial relicts of the first steps along the southern route out of Africa. Am J Hum Genet 2:347–355 Finnila S, Lehtonen MS, Majamaa K (2001) Phylogenetic network for European mtDNA. 
Am J Hum Genet 68:1475–1484 Fornarino S, Pala M, Battaglia V, Maranta R, Achilli A et al (2009) Mitochondrial and Y-chromosome diversity of the Tharus (Nepal): a reservoir of genetic variation. BMC Evol Biol 2(9):154 Forster P (2004) Ice Ages and the mitochondrial DNA chronology of human dispersals: a review. Philos Trans R Soc Lond 359:255–264 Forster P, Matsumura S (2005) Enhanced: did early humans go north or south? Science 308:965–966 Fregel R, Seetah K, Betancor E et al (2014) Multiple ethnic origins of mitochondrial DNA lineages for the population of Mauritius. PLoS One 9(3):e93294 Fregel R, Cabrera V, Larruga JM, AbuAmero KK, González AM (2015) Carriers of mitochondrial DNA Macrohaplogroup N lineages reached Australia around 50,000 years ago following a Northern Asian route. PLoS One 10(6):e0129839


Fu YX (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925
Fucharoen G, Fucharoen S, Horai S (2001) Mitochondrial DNA polymorphism in Thailand. J Hum Genet 46:115–125
Govindaraj P, Khan NA, Gopalakrishna P et al (2011) Mitochondrial dysfunction and genetic heterogeneity in chronic periodontitis. Mitochondrion 11(3):504–512
Gunnarsdóttir E, Nandineni M, Li M et al (2011) Larger mitochondrial DNA than Y-chromosome differences between matrilocal and patrilocal groups from Sumatra. Nat Commun 2:228
Harpending HC, Batzer MA, Gurven M, Jorde LB, Rogers AR, Sherry ST (1998) Genetic traces of ancient demography. Proc Natl Acad Sci U S A 95:1961–1967
Hartmann A, Thieme M, Nanduri LK et al (2009) Validation of microarray-based resequencing of 93 worldwide mitochondrial genomes. Hum Mutat 30(1):115–122
Harvati K, Röding C, Bosman AM et al (2019) Apidima Cave fossils provide earliest evidence of Homo sapiens in Eurasia. Nature 571:500–504
Herrnstadt C, Elson JL, Fahy E, Preston G, Turnbull DM et al (2002) Reduced-median network analysis of complete mitochondrial DNA coding-region sequences for the major African, Asian, and European haplogroups. Am J Hum Genet 70:1152–1171
Hill C, Soares P, Mormina M, Macaulay V, Meehan W, Blackburn J, Clarke D et al (2006) Phylogeography and Ethnogenesis of Aboriginal Southeast Asians. Mol Biol Evol 23(12):2480–2491
Ingman M, Kaessmann H, Pääbo S, Gyllensten U (2000) Mitochondrial genome variation and the origin of modern humans. Nature 408:708–713
Ji F, Sharpley MS, Derbeneva O et al (2012) Mitochondrial DNA variant associated with Leber hereditary optic neuropathy and high-altitude Tibetans. Proc Natl Acad Sci U S A 109(19):7391–7396
Jiang C, Cui J, Liu F et al (2014) Mitochondrial DNA 10609T promotes hypoxia-induced increase of intracellular ROS and is a risk factor of high altitude polycythemia. PLoS One 9(1):e87775
Jinam TA, Hong LC, Phipps ME et al (2012) Evolutionary history of continental southeast Asians: “early train” hypothesis based on genetic analysis of mitochondrial and autosomal DNA data. Mol Biol Evol 29(11):3513–3527
Jorde LB, Bamshad M, Rogers AR (1998) Using mitochondrial and nuclear DNA markers to reconstruct human evolution. Bioessays 20(2):126–136
Kang L, Zheng HX, Chen F et al (2013) mtDNA lineage expansions in Sherpa population suggest adaptive evolution in Tibetan highlands. Mol Biol Evol 30(12):2579–2587
Kivisild T, Bamshad MJ, Kaldma K, Metspalu M, Metspalu E et al (1999a) Deep common ancestry of Indian and western-Eurasian mitochondrial DNA lineages. Curr Biol 9:1331–1334

Kivisild T, Kaldma K, Metspalu M, Parik J, Papiha SS, Villems R (1999b) The place of the Indian mitochondrial DNA variants in the global network of maternal lineages and the peopling of the Old World. In: Papiha SS, Deka R, Chakraborty R (eds) Genomic diversity. Kluwer Academic/Plenum, New York, pp 135–152
Kivisild T, Papiha SS, Rootsi S, Parik J, Kaldma K et al (2000) An Indian ancestry: a key for understanding human diversity in Europe and beyond. In: Renfrew C, Boyle K (eds) Archaeogenetics: DNA and the population prehistory of Europe, McDonald Institute monographs. McDonald Institute, Cambridge, pp 267–275
Kivisild T, Tolk H, Parik J, Wang Y, Papiha SS (2002) The emerging limbs and twigs of the East Asian mtDNA tree. Mol Biol Evol 19:1737–1751
Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K et al (2003a) The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet 72:313–332
Kivisild T, Rootsi S, Metspalu M, Metspalu E, Parik J et al (2003b) Genetics of the language and farming spread in India. In: Renfrew C, Boyle K (eds) Examining the farming/language dispersal hypothesis, McDonald Institute monographs series. McDonald Institute for Archaeological Research, Cambridge, pp 215–222
Kivisild T, Shen P, Wall DP, Do B, Sung R et al (2006) The role of selection in the evolution of human mitochondrial genomes. Genetics 172:373–387
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL et al (2003) Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences [Erratum 75 157]. Am J Hum Genet 73:671–676
Kong QP, Sun C, Wang HW et al (2011) Large-scale mtDNA screening reveals a surprising matrilineal complexity in east Asia and its implications to the peopling of the region. Mol Biol Evol 28(1):513–522
Kumar V, Banrida TL, Biswas S et al (2006) Asian and non-Asian origins of Mon-Khmer and Mundari speaking Austro-Asiatic populations of India. Am J Hum Biol 18:461–469
Kumar S, Padmanabham PB, Ravuri RR, Uttaravalli K, Koneru P et al (2008) The earliest settlers’ antiquity and evolutionary history of Indian populations: evidence from M2 mtDNA lineage. BMC Evol Biol 8:230
Kumar S, Ravuri RR, Koneru P, Urade BP, Sarkar BN et al (2009) Reconstructing Indian-Australian phylogenetic link. BMC Evol Biol 9:173. https://doi.org/10.1186/1471-2148-9-173
Maca-Meyer N, Gonzalez AM, Larruga JM, Flores C, Cabrera VM (2001) Major genomic mitochondrial lineages delineate early human expansions. BMC Genet 2:13
Macaulay V, Hill C, Achilli A, Rengo C, Clarke D et al (2005) Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes. Science 308:1034–1036

Majumder PP (2001) Ethnic populations of India as seen from an evolutionary perspective. J Biosci 26(4):533–545
Malyarchuk B, Grzybowski T, Derenko M et al (2008) Mitochondrial DNA phylogeny in Eastern and Western Slavs. Mol Biol Evol 25(8):1651–1658
Malyarchuk B, Derenko M, Denisova G, Kravtsova O (2010a) Mitogenomic diversity in Tatars from the Volga-Ural region of Russia. Mol Biol Evol 27(10):2220–2226
Malyarchuk B, Derenko M, Grzybowski T et al (2010b) The peopling of Europe from the mitochondrial haplogroup U5 perspective. PLoS One 5(4):e10285
Matisoff JA (1991) Sino-Tibetan linguistics: present state and future prospects. Annu Rev Anthropol 20:469–504
Mellars P (2006) Going east: new genetic and archaeological perspectives on the modern human colonization of Eurasia. Science 313:796–800
Metspalu M, Kivisild T, Metspalu E, Parik J, Hudjashov G et al (2004) Most of the extant mtDNA boundaries in south and southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans. BMC Genet 5:26
Nagle N, van Oven M, Wilcox S et al (2017) Aboriginal Australian mitochondrial genome variation – an increased understanding of population antiquity and diversity. Sci Rep 7:43041
Olivieri A, Achilli A, Pala M, Battaglia V, Fornarino S et al (2006) The mtDNA legacy of the Levantine early upper paleolithic in Africa. Science 314:1767
Olivieri A, Pala M, Gandini F et al (2013) Mitogenomes from two uncommon haplogroups mark late glacial/postglacial expansions from the near east and neolithic dispersals within Europe. PLoS One 8(7):e70492
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M (2001) Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet 29:20–21
Oota H, Pakendorf B, Weiss G, von Haeseler A, Pookajorn S, Settheetham-Ishida W et al (2005) Recent origin and cultural reversion of a hunter-gatherer group. PLoS Biol 3:e71. https://doi.org/10.1371/journal.pbio.0030071
Oppenheimer S (2003) The peopling of world. Constable, London
Palanichamy M, Sun C, Agrawal S, Bandelt H-J, Kong Q-P et al (2004) Phylogeny of mitochondrial DNA macrohaplogroup N in India based on complete sequencing: implications for the peopling of South Asia. Am J Hum Genet 75:966–978
Palanichamy MG, Agrawal S, Yao YG, Kong QP, Sun C et al (2006) Comment on “Reconstructing the origin of Andaman islanders”. Science 311:47
Passarino G, Semino O, Bernini LF, Santachiara-Benerecetti AS (1996) Pre-Caucasoid and Caucasoid genetic features of Indian population revealed by mtDNA polymorphisms. Am J Hum Genet 59:927–934

Qin Z, Yang Y, Kang L et al (2010) A mitochondrial revelation of early human migrations to the Tibetan Plateau before and after the last glacial maximum. Am J Phys Anthropol 143(4):555–569
Quintana-Murci L, Chaix R, Wells S, Behar D, Sayar H et al (2004) Where West meets East: the complex mtDNA landscape of the Southwest and Central Asian corridor. Am J Hum Genet 74(5):827–845
Rani DS, Dhandapany PS, Nallari P, Govindaraj P, Singh L, Thangaraj K (2010) Mitochondrial DNA haplogroup ‘R’ is associated with Noonan syndrome of south India. Mitochondrion 10(2):166–173
Rao AI, Venugopal PN, Chandrasekar A (2013) Genetic continuity of anatomically modern human between India and Island Southeast Asia ISEA: last glacial dispersal of mtDNA lineage N22. Int J Res Advent Technol 1(5):300–305
Reddy BM, Langstieh BT, Kumar V, Nagaraja T, Reddy ANS et al (2007) Austro-Asiatic tribes of northeast India provide hitherto missing genetic link between South and Southeast Asia. PLoS One 2(11):e1141
Rogers AR, Harpending H (1992) Population growth makes waves in the distribution of pairwise genetic differences. Mol Biol Evol 9(3):552–569
Roostalu U, Kutuev I, Loogväli EL et al (2007) Origin and expansion of haplogroup H, the dominant human mitochondrial DNA lineage in West Eurasia: the Near Eastern and Caucasian perspective. Mol Biol Evol 24(2):436–448
Roychoudhury S, Roy S, Dey B, Chakraborty M, Roy M et al (2000) Fundamental genomic unity of ethnic India is revealed by analysis of mitochondrial DNA. Curr Sci 79(9):1182–1192
Roychoudhury S, Roy S, Basu A, Banerjee R, Vishwanathan H et al (2001) Genomic structures and population histories of linguistically distinct tribal groups of India. Hum Genet 109:339–350
Santoro A, Balbi V, Balducci E et al (2010) Evidence for sub-haplogroup H5 of mitochondrial DNA as a risk factor for late onset Alzheimer’s disease. PLoS One 5(8):e12037
Schönberg A, Theunert C, Li M, Stoneking M, Nasidze I (2011) High-throughput sequencing of complete human mtDNA genomes from the Caucasus and West Asia: high diversity and demographic inferences. Eur J Hum Genet 19(9):988–994
Sharma G, Tamang R, Chaudhary R, Singh VK, Shah AM et al (2012) Genetic affinities of the Central Indian tribal populations. PLoS One 7(2):e32546
Slatkin M, Hudson RR (1991) Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129:555–562
Su B, Xiao C, Deka R, Seielstad MT, Kangwanpong D et al (2000) Y chromosome haplotypes reveal prehistorical migrations to the Himalayas. Hum Genet 107:582–590
Sukernik RI, Volodko NV, Mazunin IO, Eltsov NP, Dryomov SV, Starikovskaya EB (2012) Mitochondrial genome diversity in the Tubalar, Even, and Ulchi: contribution to prehistory of native Siberians and their affinities to Native Americans. Am J Phys Anthropol 148(1):123–138
Summerer M, Horst J, Erhart G et al (2014) Large-scale mitochondrial DNA analysis in Southeast Asia reveals evolutionary effects of cultural isolation in the multiethnic population of Myanmar. BMC Evol Biol 14:17
Sun C, Kong Q-P, Palanichamy MG, Agrawal S, Bandelt H-J et al (2006) The dazzling array of basal branches in the mtDNA macrohaplogroup M from India as inferred from complete genomes. Mol Biol Evol 23:683–690
Takahata N (1996) Neutral theory of molecular evolution. Curr Opin Genet Dev 6(6):767–772
Tanaka M, Cabrera VM, González AM, Larruga JM, Takeyasu T et al (2004) Mitochondrial genome variation in eastern Asia and the peopling of Japan. Genome Res 14(10A):1832–1850
Thangaraj K, Chaubey G, Kivisild T, Reddy AG, Singh VK et al (2005a) Reconstructing the origin of Andaman Islanders. Science 308:996
Thangaraj K, Sridhar V, Kivisild T, Reddy AG, Chaubey G et al (2005b) Different population histories of the Mundari and Mon-Khmer speaking Austro-Asiatic tribes inferred from the mt-DNA 9-bp deletion/insertion polymorphism in Indian populations. Hum Genet 116:507–517
Thangaraj K, Chaubey G, Singh VK, Vanniarajan A, Thanseem I et al (2006) In situ origin of deep rooting lineages of mitochondrial macrohaplogroup ‘M’ in India. BMC Genomics 7:151
Thangaraj K, Chaubey G, Kivisild T, Selvi-Rani D, Singh VK et al (2008) Maternal footprints of southeast Asians in North India. Hum Hered 66:1–9
Thangaraj K, Nandan A, Sharma V et al (2009) Deep rooting in-situ expansion of mtDNA haplogroup R8 in South Asia. PLoS One 4(8):e6545
Trejaut JA, Kivisild T, Loo JH, Lee CL, He CL et al (2005) Traces of archaic mitochondrial lineages persist in Austronesian-speaking Formosan populations. PLoS Biol 3:e247
van der Walt EM, Smuts I, Taylor RW et al (2012) Characterization of mtDNA variation in a cohort of South African paediatric patients with mitochondrial disease. Eur J Hum Genet 20(6):650–656
van Holst Pellekaan SM, Ingman M, Roberts-Thomson J, Harding RM (2006) Mitochondrial genomics identifies major haplogroups in Aboriginal Australians. Am J Phys Anthropol 131:282–294
Wang HW, Mitra B, Chaudhuri TK, Palanichamy MG, Kong QP, Zhang YP (2011) Mitochondrial DNA evidence supports northeast Indian origin of the aboriginal Andamanese in the Late Paleolithic. J Genet Genomics 38(3):117–122
Wang HW, Li YC, Sun F et al (2012) Revisiting the role of the Himalayas in peopling Nepal: insights from mitochondrial genomes. J Hum Genet 57(4):228–234
Wen B, Xie X, Gao S, Li H, Shi H et al (2004) Analyses of genetic structure of Tibeto-Burman populations reveals sex-biased admixture in southern Tibeto-Burmans. Am J Hum Genet 74:856–865
Wen B, Li H, Gao S, Mao X et al (2005) Genetic structure of Hmong-Mien speaking populations in East Asia as revealed by mtDNA lineages. Mol Biol Evol 22(3):725–734
Yao YG, Zhang YP (2002) Phylogeographic analysis of mtDNA variation in four ethnic populations from Yunnan Province: new data and a reappraisal. J Hum Genet 47:311–318
Yao Y-G, Kong Q-P, Bandelt H-J, Kivisild T, Zhang Y-P (2002) Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651
Zhao M, Kong QP, Wang HW et al (2009) Mitochondrial genome evidence reveals successful Late Paleolithic settlement on the Tibetan Plateau. Proc Natl Acad Sci U S A 106(50):21230–21235

6

Y-Chromosome Phylogeny in Indian Population

The present study has been carried out on 2802 samples for Y-chromosome analysis from 73 tribal communities belonging to six different linguistic groups (D, Dravidian; IE, Indo-European; AA, Austro-Asiatic; TB, Tibeto-Burman; A, Andaman; and F, Tai Kadai) living in different geographic regions of India. The predominant Y-chromosome haplogroups are O, F, H, R, C, K, L and P, covering the majority of the population, followed by low frequencies of J, N, Q, D and G (Fig. 6.1).

The parental lineages C or F have been proposed to be the earliest out-of-Africa founder types. Increased resolution of Y-chromosomal data reveals regionally differentiated types: C1 and C3 in East Asia, C2 in New Guinea, C4 in Australia and C5 in India. In the present study C5* reaches its highest frequency in a western tribe, the Damor (84.2%), and is widely distributed in the Dravidian population (ten populations). Among Austro-Asiatic tribes it is present only in the Pauri Bhuinya population (1.8%). C5 is absent among the Tibeto-Burman population. Lineage D* is found in the Jarawa of the Andaman Islands and in the Bhotia, a Tibeto-Burman population of north India. D1a* is present mostly among the Tharu (17%) and the Lachungpa (54.05%), populations of the north and north-east, respectively. Lineage D3* is observed only in the Dirang Monpa population (10%).

Another founder lineage of non-Africans, F*, has been observed in 48% of the studied population. Haplogroup F constitutes over 90% of paternal lineages outside of Africa and is found primarily throughout South Asia, Southeast Asia and parts of East Asia. Dravidian populations in the present study harbour high frequencies (82%) of the F* clade, which provides evidence of significant long-term isolation. The Middle Eastern haplogroup G1* is the least represented, occurring in three populations: Dungri Bhil (10%), Garhwali Rajput (6.3%) and Soliga (7.0%).

Haplogroup H, which probably emerged in South Asia, was found in 266 of the 2802 Y chromosomes studied. Haplogroup H was further segregated into three lineages: H*, H1 and H2. Haplogroup H has a very high frequency in the Indo-European (76%) and Dravidian (57.14%) linguistic groups. Its highest frequency has been found among the Kutia Kondh (81.6%). Among the Dravidian tribes, the Chenchu and Konda Reddis do not show haplogroup H. It remains necessary to scan these two populations from other geographic locations for the presence or absence of haplogroup H. Lineage H1a* of the H1 group is the most frequently occurring lineage across all the populations and is found among 62%. Its maximum frequency is observed in the Koraga (93%), followed by the Alu Kurumba (63.3%). It is absent among Indo-Europeans (eight communities), Dravidians (three communities) and Austro-Asiatics (one community). Haplogroup H is completely restricted to India, Sri Lanka and Pakistan.

© Springer Nature Singapore Pte Ltd. 2021. Anthropological Survey of India, Genomic Diversity in People of India, https://doi.org/10.1007/978-981-16-0163-7_6


Fig. 6.1 Distribution of Y-chromosome haplogroup percentages among Indian populations
Columns: Name of the community, C*, C5*, D*, D1a*, D3*, F*, G*, G1*, G2*, H*, H1*, H1a*, H1b, H2*, J*, J2a, J2b1, J2b2*, K*, L1, L*, M*, N*, N1*, O2a*, O3*, O3a3c, O3a3c1*, P*, Q1*, R*, R1, R1a1*, R2, S*
Communities: Alu Kurumba, Andh, Betta Kuruba, Bharia, Bhil, Bhoi Khasi, Bhoksa, Bhotia, Bondo, Chenchu, Damor, Dhodia, Dirang Monpa, Dungri Bhil, Gadia Lohar, Galong, Garhwali Brahmin, Garhwali Rajput, Ghorkha, Hmar, Irular, Jarawa, Jaunsari, Jenu Kuruba, Ka Thakur, Kamar, Kanikkar, Karen, Kathodi, Katkari, Kattunayakan, Kolam, Kondareddis, Koraga, Korku, Kota, Koya, Kutia Kondh, Lachungpa, Lepcha, Ma Thakur, Madia, Malpaharia, Mara, Mathur, Mina, Mizo, Mullu Kurumba, Munda, Nayaka, Nicobarese, Nihal, Nishi, Padhar, Paite, Paniyan, Pauri Bhuinya, Porja, Rabha, Raji, Saharia, Savaras, Sherdukpen, Soliga, Sonowal Kachari, Tai Ahom, Tai Khampti, Tharu, Toda, Toto, Wancho, Yanadi, Yerukulas; Total


Haplogroup J, believed to have evolved in western Asia, is represented by the J*, J2a, J2b1 and J2b2 clades in Indian populations. Its frequency ranges from 0% to 25% in different communities. Around 3.4% of the samples, in 25 out of 71 communities of the present study, fall in haplogroup J. J2 is absent in north-east India. J2b1 is also a frequent lineage in India and is typically found in East Asia and Central Asia at frequencies of 10–20% (Kivisild et al. 2003a, b). Hence, it is possible that the Indian J2 allele originated in the northern portion of the Fertile Crescent, from where it later spread throughout Central Asia, the Mediterranean and south into India. It is associated with Neolithic demic diffusion (Semino et al. 1996; Quintana-Murci et al. 2001). The J2 frequency in India indicates an unambiguous recent external contribution from west Asia (Sahoo et al. 2006).

Haplogroups N* and N1 are found across the Indo-European and Tibeto-Burman linguistic groups. The Karen population of the Nicobar Islands is also represented by N*. A total of 919 of 2776 Y chromosomes belong to haplogroup O. In India, haplogroup O is represented by the O2a*, O3*, O3a3c and O3a3c1* sub-haplogroups. The Austro-Asiatic groups of India show a high frequency of the O2a* haplogroup (40–83%). The Nicobarese carry the O2a lineage at 92%. Among the Tibeto-Burman populations it ranges from 0% to 90%, the maximum being in the Paite. Lineage O3a3c1* is found exclusively in Tibeto-Burman populations and in the Austro-Asiatic Pauri Bhuinya. The Karen of Nicobar Island (44%) and the Bhotia (14%) carry the O3a3c lineage.

Haplogroups P and Q are sporadically present at low frequency in India. Haplogroup R is represented by the R*, R1, R1a1* and R2 sub-haplogroups in India. Among these, R1a1 and R2 are present across the linguistic groups, with R1a1 reaching 33% in the Dravidian tribes and 88% in the Indo-European tribes, whereas it is 6% in the Austro-Asiatic tribes. Sub-haplogroup R2 reaches 40% in the Dravidian tribes, 25% in the Indo-European tribes and 15% in the Austro-Asiatic tribes. About 80% of the Ka Thakur population harbours the S* lineage.

The Dungri Bhil, Nayaka, Dhodia, Kathodi and Ma Thakur show the presence of the J2 and R1 alleles. The R1 allele is present in the Dhodia and Padhar at high frequencies. The R1 allele is believed to have originated in the Eurasian steppes, north of the Black and Caspian Seas. This lineage may have originated in a population of the Kurgan culture, known for the domestication of the horse (approximately 3000 B.C.E.). These people are also believed to have been the first speakers of the Indo-European language group. This lineage is currently found in Central and Western Asia, India and in Slavic populations of Eastern Europe.

Multi Dimension Analysis for Y-Chromosome SNP

The two-dimensional MDS plots based on haplogroup frequencies are shown in Figs. 6.2 and 6.3. The plots show the Austro-Asiatic populations clustering with the Tibeto-Burman constellation owing to their mixed genetic structure; genetically, they share the O2a allele of the Y chromosome. The Indo-Europeans cluster with the Dravidian constellation. The Tibeto-Burman constellation lies far from the Dravidian constellation because the two differ in the H haplogroup alleles of their Y chromosomes.

MDS of the Austro-Asiatic populations shows that the Bondo and Saharia are outliers from the Austro-Asiatic constellation. About 50% F* alleles are reflected in the MDS plot, indicating their isolation from the neighbouring populations. The Saharia are distinct from the other Austro-Asiatic populations in carrying J*, L*, P* and R* alleles in their Y chromosomes. The Bhoi Khasi, though linguistically Austro-Asiatic, harbour the alleles specific to the neighbouring Tibeto-Burman populations (O3a3c1*) (Fig. 6.4a–f).

MDS of the Dravidians indicates that Dravidians and Indo-Europeans share numerous similarities. Tibeto-Burman populations such as the Nicobarese, Hmar and Paite lie closer to the Dravidians than do the other contemporary Tibeto-Burman populations (Fig. 6.5a–e).

MDS of the Indo-Europeans suggests that the studied communities cluster into three groups, although the Ka Thakur and Damor stand apart from the other Indo-European communities. Tibeto-Burman communities such as the Karen of Andaman Island, the Bhoi Khasi of Meghalaya and the Bhotia of Uttarakhand show some genetic affinity with Indo-European communities. The Tai Khamti and Tai Ahom, who migrated recently from Thailand to North East India, show genetic affinity with the Gujjar of Himachal Pradesh and the Mina and Padhar of Gujarat (Fig. 6.6a–d).

Fig. 6.2 Multi-dimension plot of Y-chromosome SNPs for 71 Indian populations

Fig. 6.3 Multi-dimension plot of Y-chromosome SNPs for the linguistic families in the Indian population
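To make the clustering logic concrete: an MDS plot is computed from a matrix of pairwise dissimilarities between populations. The Python sketch below builds such a matrix from haplogroup frequency vectors using Euclidean distance. The frequencies and group labels are hypothetical, chosen only to mimic the pattern described above (a shared O2a component in one pair, shared H and R components in the other); the distance measure actually used for the published plots is not stated in the text and is an assumption here.

```python
from itertools import combinations
from math import sqrt

# Hypothetical haplogroup frequencies (each row sums to 1); NOT data from this study.
HAPLOGROUPS = ["H", "O2a", "R", "other"]
FREQS = {
    "Austro-Asiatic": [0.10, 0.60, 0.10, 0.20],
    "Tibeto-Burman":  [0.02, 0.50, 0.08, 0.40],
    "Dravidian":      [0.45, 0.05, 0.30, 0.20],
    "Indo-European":  [0.30, 0.05, 0.45, 0.20],
}

def euclidean(p, q):
    """Euclidean distance between two frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(freqs):
    """Symmetric dictionary of pairwise distances keyed by (name, name)."""
    names = sorted(freqs)
    dist = {(a, a): 0.0 for a in names}
    for a, b in combinations(names, 2):
        d = euclidean(freqs[a], freqs[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist

DIST = distance_matrix(FREQS)
for (a, b), d in sorted(DIST.items()):
    if a < b:
        print(f"{a:15s} {b:15s} {d:.3f}")
```

With these illustrative numbers, the two groups that share a high O2a frequency come out close together and both lie far from the H-rich group, which is the kind of structure an MDS projection would then display in two dimensions.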


Fig. 6.4 Multi-dimension plot showing comparison between Austro-Asiatic and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Austro-Asiatic communities. (b) Multi-dimension plot between Austro-Asiatic vs Indo-European. (c) Multi-dimension plot between Austro-Asiatic vs Tibeto-Burman. (d) Multi-dimension plot between Austro-Asiatic vs Tai-Kadai. (e) Multi-dimension plot between Austro-Asiatic vs Dravidian. (f) Austro-Asiatic vs Andamanese


Fig. 6.5 Multi-dimension plot showing comparison between Dravidian and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Dravidian communities. (b) Multi-dimension plot between Dravidian vs Indo-European. (c) Multi-dimension plot between Dravidian vs Tibeto-Burman. (d) Multi-dimension plot between Dravidian vs Tai-Kadai. (e) Multi-dimension plot between Dravidian vs Andamanese


MDS of the Tibeto-Burman populations shows more clusters in India, indicating different waves of migration and, in turn, admixture with local populations (Fig. 6.7a–c). The distribution of FST values with neutrality tests, and the dendrogram drawn from the FST values, are shown in Table 6.1. The FST values range from 0.37 to 0.38. The Watterson neutrality test is significant in eight populations, indicating directional selection. Slatkin’s exact test also indicates a deviation from balancing selection in these populations. A phylogenetic tree has been constructed from the FST values (Fig. 6.8); since the Korku and Bhil showed a 100% homogeneous structure, both communities were omitted from the phylogenetic tree and from the MDS plot. Principal component analysis of the Y-chromosome SNPs among the 71 Indian populations is shown in Fig. 6.9.

The Y-chromosome results indicate that the western Indian tribes have experienced more gene flow (Table 6.2). The central Indian groups have also experienced more gene flow than was predicted. This confirms that there were a number of prehistoric and historic human migrations through western India into the Indian subcontinent. The present results support the view that M168 chromosomes were well differentiated into major lineages in south Asia (Chandrasekar et al. 2007), as evidenced by the presence of C, YAP+, D*, F*, K*, P*, L and R2, a high frequency of H in Dravidian tribes and O* in Austro-Asiatic tribes (3–6, 8, 11–12). These lineages can be considered parental lineages of India, whereas the presence of the J2 and R1 haplogroups implies a recent influx that has had little impact on the Y genealogy in India. Thus, our analyses of genetic data from uniparentally inherited loci provide a range of estimates of gene flow across geographic and linguistic borders. The strongest signal of Southeast Asian genetic ancestry among Indian Austro-Asiatic speakers is maintained in their Y chromosomes, with approximately two-thirds falling into haplogroup O2a. Geographic patterns of genetic diversity of this haplogroup are consistent with its origin in Southeast Asia approximately 20 kya, followed by more recent dispersal(s) to India.
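The overall fixation index can be checked against the AMOVA variance components, since FST is the among-population share of the total variance, Va / (Va + Vb). The short Python check below uses the Va and Vb values printed in Table 6.1; the function and variable names are illustrative only.

```python
def fst_from_variance(va, vb):
    """Among-population variance as a fraction of the total variance."""
    return va / (va + vb)

VA = 0.17282  # variance component among populations (Table 6.1)
VB = 0.28432  # variance component within populations (Table 6.1)

fst = fst_from_variance(VA, VB)
pct_among = 100 * fst
pct_within = 100 * (1 - fst)
print(f"FST = {fst:.4f}; among = {pct_among:.1f}%, within = {pct_within:.1f}%")
```

To within rounding, the result agrees with the printed fixation index (0.37804) and with the reported percentages of variation (37.8 among populations, 62.2 within populations).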


Fig. 6.6 Multi-dimension plot showing comparison between Indo-European and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Indo-European communities. (b) Multi-dimension plot between Indo-European vs Tibeto-Burman. (c) Multi-dimension plot between Indo-European vs Tai-Kadai. (d) Indo-European vs Andamanese


Fig. 6.7 Multi-dimension plot showing comparison between Tibeto-Burman and other linguistic communities for Y-chromosome SNPs. (a) Multi-dimension plot of Tibeto-Burman communities. (b) Multi-dimension plot between Tibeto-Burman vs Tai-Kadai. (c) Multi-dimension plot between Tibeto-Burman vs Andamanese


Table 6.1 AMOVA based on Y chromosome haplogroup frequencies

Source of variation   d.f.   Sum of squares   Variance components   Percentage of variation
Among populations     70     1224.499         0.17282 Va            37.8
Within populations    6999   1989.941         0.28432 Vb            62.2
Total                 7069   3214.44          0.45714
Fixation index FST = 0.37804; P = 0.000

YAP (Y-Chromosome Alu Polymorphism) Insertion in Indian Samples (Part of the Data Has Been Published in the Article from Elsewhere: Chandrasekar et al. 2007)

YAP insertion was observed in 73 samples (Table 6.3), of which only one (1.4%), from the Dungri Bhil of Gujarat, belongs to haplogroup E. This can be attributed to contemporary admixture from an African or Middle Eastern population. The Shia Muslims of north India, who trace their origin to the Middle East, show the YAP element with the haplogroup E lineage at a frequency of 11% (Agrawal et al. 2005). Seventy-two YAP insertion chromosomes possessed the M174 mutation, classified under haplogroup D, in the Dirang Monpa (34%), Sherdukpen (8%), Lepcha (16%), Lachungpa (65%) and Jarawa (100%) tribes. Our results on the Jarawa agree with the findings of Thangaraj et al. (2003). It is well established that the Andaman Islanders (Onge, Great Andamanese and Jarawa) are the extant Negrito groups in south Asia who share a similar cultural backdrop, including language, in addition to mtDNA lineages and Y-chromosome haplotype D*, and have closer ties with Asians than with Africans (Thangaraj et al. 2003, 2005a, b; Palanichamy et al. 2006). Haplotype D* is also found in central Asia (Karafet et al. 2001). The phylogenetic order of the YAP lineages D* and E* is still uncertain. The presence of paragroup DE* among five Nigerians led Weale et al. (2003) to examine the internal branching order of the YAP lineages, and they opined that it is impossible to represent the origin of the YAP clade with certainty. Regardless of the branching order of DE*, D* and E*, the view that the male Andaman Islanders descended from Asian colonizers needs further scrutiny.

Fig. 6.8 Y-chromosome phylogeny tree based on 71 communities (S Southern India, E Eastern India, N Northern India, NE North East India, W Western India, IA Andaman and Nicobar Islands)

Fig. 6.9 Principal component analysis of Y-chromosome SNPs among 71 Indian populations

Evolutionary Implications

Y-chromosome lineages have been classified into 17 lineages named with the letters A to T. African and non-African lineages are separated by the M168 mutation: A and B are African lineages and the remainder are non-African. Y-chromosomes with the M168 mutation evolved into the YAP insertion (DE haplogroup), C (RPS4Y/M216) and F* (M89/M213) lineages. An African population carrying the M168 mutation dispersed from the Horn of Africa to southern Asia; the latest age estimate for this event is 101,000 years (Haber et al. 2019). The C lineage (RPS4Y/M216 mutations) probably originated in southern Asia. The YAP insertion also occurred on an Asian Y-chromosome around 55,000 years ago (Hammer et al. 1998). Our finding of the YAP insertion with haplogroup D among the tribes of the Andamans and north-east India indicates that some of the M168 chromosomes gave rise to the YAP insertion and the M174 mutation in south Asia. The presence of the C*, YAP insertion and F* lineages in India (Kivisild et al. 2003a, b; Cordaux et al. 2004a, b; Sengupta et al. 2006; Thangaraj et al. 2003) establishes that the Y chromosome evolved into major lineages in south Asia, which then migrated towards Southeast Asia.

Table 6.2 Population-wise FST values and neutrality tests for Y-chromosome SNPs for 71 Indian populations

Community         FST      Obs. F   Exp. F   Watterson P  Slatkin P  Obs. k  Obs. hom.  Obs. het.  Exp. k    P (k or more)
Bhoi Khasi        0.3788   0.50556  0.67819  0.242        0.291      3       0.50052    0.49948    4.24375   0.85473
Bondo             0.37777  0.4122   0.67263  0.064        0.09       3       0.40626    0.59374    5.53297   0.95558
Munda             0.37979  0.5944   0.48728  0.754        0.457      5       0.5903     0.4097     3.35349   0.20279
Nicobarese        0.38262  0.8496   0.66141  0.791        0.675      3       0.84808    0.15192    1.66589   0.14151
Pauri Bhuinya     0.37773  0.4086   0.41857  0.557        0.275      6       0.40263    0.59737    5.59234   0.49286
Saharia           0.3753   0.1893   0.3332   0.04         0.032      8       0.18094    0.81906    12.59267  0.96693
Savaras           0.38079  0.6846   0.56452  0.716        0.46       4       0.68141    0.31859    2.64308   0.22282
Jarawa            0.38429  N.A.     N.A.     N.A.         N.A.       1       1          0          0         N.A.
Alu Kurumba       0.37816  0.4474   0.38215  0.741        0.533      7       0.44176    0.55824    4.98735   0.19818
Betta Kuruba      0.37926  0.5468   0.48386  0.702        0.528      5       0.54222    0.45778    3.80409   0.30423
Chenchu           0.37681  0.32575  0.57606  0.04         0.092      4       0.31901    0.68099    7.28033   0.96573
Yerukulas         0.37609  0.26086  0.37647  0.169        0.081      7       0.25347    0.74653    9.22713   0.86208
Kolam             0.37499  0.16145  0.30331  0.014*       0.004      9       0.15307    0.84693    14.72734  0.98233
Irular            0.37511  0.17301  0.27472  0.088        0.006      10      0.16482    0.83518    13.84289  0.92836
Jenu Kuruba       0.37736  0.37536  0.49519  0.26         0.112      5       0.36911    0.63089    6.19815   0.78854
Kanikkar          0.38195  0.7888   0.80895  0.388        0.388      2       0.78667    0.21333    1.9886    0.63815
Kattunayakan      0.37726  0.3666   0.56201  0.116        0.075      4       0.3602     0.6398     6.35856   0.92448
Kutia Khond       0.38084  0.6888   0.6644   0.56         0.313      3       0.68566    0.31434    2.61369   0.48691
Konda Reddis      0.37696  0.3394   0.66595  0.004*       0.004      3       0.33273    0.66727    6.94163   0.98925
Koraga            0.38285  0.8698   0.8075   0.531        0.531      2       0.86848    0.13152    1.56643   0.43777
Katkari           0.3758   0.23458  0.42883  0.033        0.011      6       0.22693    0.77307    10.27994  0.97251
Koya              0.37651  0.29864  0.5623   0.013*       0.007      4       0.29149    0.70851    7.97025   0.98181
Madia             0.37599  0.2515   0.32592  0.281        0.126      8       0.24387    0.75613    9.53825   0.78175
Mullu Kurumba     0.37588  0.24204  0.43124  0.054        0.028      6       0.23446    0.76554    9.961     0.96468
Paniyan           0.37884  0.50862  0.5784   0.426        0.331      4       0.50361    0.49639    4.20861   0.63865
Porja             0.38004  0.6172   0.563    0.65         0.54       4       0.61333    0.38667    3.15807   0.36672
Soliga            0.37618  0.26885  0.30125  0.468        0.295      9       0.26139    0.73861    8.90513   0.54695
Toda              0.37492  0.15538  0.33285  0.001*       0          8       0.14693    0.85307    15.25646  0.99585
Yanadi            0.37515  0.17609  0.33366  0.016*       0.009      8       0.16793    0.83207    13.61554  0.9843
Andh              0.37484  0.14844  0.24446  0.05         0.001      11      0.13947    0.86053    15.70181  0.95342
Bharia            0.37665  0.31164  0.43227  0.206        0.086      6       0.30475    0.69525    7.64283   0.82244
Damor             0.38113  0.7154   0.57999  0.737        0.506      4       0.71253    0.28747    2.43399   0.16877
Dhodia            0.37514  0.17496  0.24136  0.214        0.074      11      0.16525    0.83475    13.1393   0.81345
Gadia Lohar       0.38029  0.6394   0.49322  0.804        0.613      5       0.63576    0.36424    2.97877   0.12949
Garhwali Brahmin  0.37506  0.16764  0.24442  0.132        0.015      11      0.15914    0.84086    14.15366  0.88571
Garhwali Rajput   0.37618  0.2688   0.37525  0.226        0.087      7       0.26141    0.73859    8.92587   0.83531
Kamar             0.37649  0.2966   0.37758  0.314        0.228      7       0.28942    0.71058    8.029     0.73179
Ka Thakur         0.38068  0.675    0.56922  0.722        0.573      4       0.67172    0.32828    2.71135   0.24127
Kathodi           0.37713  0.35497  0.50296  0.215        0.12       5       0.34851    0.65149    6.61103   0.83852
Korku             0.37705  0.3478   0.57357  0.065        0.038      4       0.34121    0.65879    6.75309   0.9458
Malpaharia        0.3752   0.18049  0.33387  0.023*       0.007      8       0.17213    0.82787    13.20685  0.97883
Ma Thakur         0.37634  0.28334  0.49123  0.041        0.052      5       0.27603    0.72397    8.42891   0.95846
Mathur            0.37667  0.3134   0.5634   0.022*       0.057      4       0.30646    0.69354    7.5808    0.9739
Mina              0.37586  0.24029  0.24917  0.56         0.354      11      0.23277    0.76723    10.05577  0.41722
Nayaka            0.37499  0.16181  0.27046  0.046        0.042      10      0.15317    0.84683    14.58847  0.95526
Nahal             0.37957  0.5742   0.56361  0.563        0.27       4       0.5699     0.4301     3.53718   0.47228
Padhar            0.37589  0.2428   0.34092  0.202        0.028      8       0.23515    0.76485    9.90761   0.82002
Rabha             0.38049  0.65788  0.56251  0.699        0.363      4       0.65446    0.34554    2.84072   0.27702
Tharu             0.37495  0.15812  0.30487  0.005*       0.008      9       0.1497     0.8503     15.0132   0.9857
Bhoksa            0.37592  0.24562  0.30057  0.359        0.118      9       0.23776    0.76224    9.72743   0.67008
Bhotia            0.37505  0.167    0.27371  0.062        0.028      10      0.15859    0.84141    14.23877  0.94393
Dirang Monpa      0.3774   0.37928  0.43081  0.433        0.158      6       0.37313    0.62687    6.13442   0.59935
Galong            0.38301  0.8848   0.57026  0.955        0.931      4       0.88364    0.11636    1.49482   0.01245
Ghorkha           0.37529  0.18877  0.26782  0.179        0.019      10      0.18074    0.81926    12.74592  0.86721
Hmar              0.38052  0.66     0.6618   0.513        0.287      3       0.65657    0.34343    2.82119   0.55388
Jaunsari          0.37638  0.2874   0.49026  0.064        0.067      5       0.2802     0.7198     8.32034   0.95457
Karen             0.37683  0.328    0.56689  0.038        0.029      4       0.32121    0.67879    7.21096   0.96362
Lachungpa         0.37687  0.3308   0.33195  0.606        0.125      8       0.32404    0.67596    7.1433    0.41595
Lepcha            0.37796  0.42983  0.4911   0.434        0.277      5       0.42419    0.57581    5.27292   0.63615
Mara              0.37882  0.5072   0.81287  0.05         0.05       2       0.50222    0.49778    4.23173   0.9696
Mizo              0.37901  0.5242   0.8084   0.098        0.098      2       0.51939    0.48061    4.04154   0.9622
Nishi             0.38285  0.8698   0.80716  0.515        0.515      2       0.86848    0.13152    1.56643   0.43777
Paite             0.38224  0.815    0.6763   0.699        0.542      3       0.81313    0.18687    1.84498   0.2056
Raji              0.38242  0.8314   0.56658  0.893        0.834      4       0.8297     0.1703     1.75866   0.03809
Sherdukpen        0.38203  0.79594  0.57189  0.859        0.77       4       0.79386    0.20614    1.9469    0.06585
Sonowal Kachari   0.37975  0.59065  0.48735  0.751        0.648      5       0.58648    0.41352    3.38187   0.2088
Tai Khampti       0.37776  0.41182  0.48968  0.38         0.409      5       0.40594    0.59406    5.54907   0.68762
Tai Ahom          0.37712  0.35359  0.37358  0.532        0.238      7       0.34713    0.65287    6.6403    0.50328
Toto              0.38155  0.7526   0.66925  0.63         0.481      3       0.7501     0.2499     2.19973   0.33884
Wancho            0.38224  0.815    0.66544  0.713        0.559      3       0.81313    0.18687    1.84498   0.2056

Neutrality tests: Obs. F observed F value, Exp. F expected F value, Watterson P Watterson F P-value, Slatkin P Slatkin’s exact P-value. Obs. k no. of observed haplotypes, Obs. hom. observed homozygosity, Obs. het. observed heterozygosity, Exp. k expected no. of haplotypes, P (k or more) P (k or more haplotypes)
*P < 0.05
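Several columns of Table 6.2 are linked by simple identities. Watterson's observed F is the sum of squared haplotype (here haplogroup) frequencies, and observed heterozygosity is one minus observed homozygosity. The Python sketch below illustrates both; the 90/5/5 frequency split is a hypothetical example rather than the study's data, while the homozygosity and heterozygosity pairs are copied from the table.

```python
def watterson_f(freqs):
    """Observed F: the sum of squared haplotype frequencies."""
    return sum(p * p for p in freqs)

# Hypothetical population dominated by one haplogroup (illustrative only).
f_example = watterson_f([0.90, 0.05, 0.05])

# (observed homozygosity, observed heterozygosity) pairs taken from Table 6.2.
TABLE_ROWS = {
    "Bhoi Khasi": (0.50052, 0.49948),
    "Nicobarese": (0.84808, 0.15192),
    "Toda": (0.14693, 0.85307),
    "Paite": (0.81313, 0.18687),
}

for name, (hom, het) in TABLE_ROWS.items():
    print(f"{name:12s} hom + het = {hom + het:.5f}")
print(f"F for the hypothetical 90/5/5 split = {f_example:.3f}")
```

A population in which a single haplogroup predominates therefore has a high observed F and low heterozygosity, which is the pattern seen in the rows with few observed haplotypes.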


Table 6.3 Y chromosome haplogroups in YAP

                                                                                                      Frequency
Region                     Population       Ethnic/linguistic category  Sample size  YAP insertion  Haplogroup D  Haplogroup E
North East India           Dirang Monpa     Tibeto-Burman               50           17             17 (34%)      0
                           Lachungpa        Tibeto-Burman               51           33             33 (65%)      0
                           Lepcha           Tibeto-Burman               50           8              8 (16%)       0
                           Sherdukpen       Tibeto-Burman               50           4              4 (8%)        0
Northern India             Bhotia           Tibeto-Burman               58           17             17 (29.3%)    0
                           Garhwali Rajput  Indo-European               32           1              1 (3.1%)      0
                           Gorkha           Indo-European               35           1              1 (2.9%)      0
                           Tharu            Indo-European               47           8              8 (17%)       0
Andaman & Nicobar Islands  Jarawa           Negrito/Andamanese          7            7              7 (100%)      0
Total                                                                   2169         73             72            1
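The haplogroup D percentages in Table 6.3 are the haplogroup D count divided by the sample size of each population. The Python sketch below recomputes them from the counts listed in the table; the variable and function names are illustrative.

```python
# (population, sample size, haplogroup D count) as listed in Table 6.3.
TABLE_6_3 = [
    ("Dirang Monpa", 50, 17),
    ("Lachungpa", 51, 33),
    ("Lepcha", 50, 8),
    ("Sherdukpen", 50, 4),
    ("Bhotia", 58, 17),
    ("Garhwali Rajput", 32, 1),
    ("Gorkha", 35, 1),
    ("Tharu", 47, 8),
    ("Jarawa", 7, 7),
]

def percent(count, total):
    """Frequency of a haplogroup as a percentage of the sample."""
    return 100.0 * count / total

D_PERCENT = {name: percent(d, n) for name, n, d in TABLE_6_3}
for name, value in D_PERCENT.items():
    print(f"{name:16s} {value:5.1f}%")
```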

A group of people with the YAP insertion migrated from south Asia to Central Asia and reached the Mediterranean. They gave rise to the E lineage (Hammer et al. 1998), with mutations at M40 and M96. The E lineage dispersed to Africa, the Middle East, and Southern and Eastern Europe. Lineage E1 is predominant across Africa and occurs at lower frequency in the Middle East and Europe. This back migration to Africa through the Levant is supported by Hammer et al. (1997), Altheide and Hammer (1997), Chandrasekar et al. (2007), Cruciani et al. (2004) and Cabrera et al. (2018). YAP+ E migrated back to Africa with other Eurasian haplogroups, such as lineages R1b1* (18–23 kya) and T (39–45 kya), which have been observed at high frequency in northern Cameroon but otherwise occur in Africa at lower frequencies (Cruciani et al. 2002; Luis et al. 2004) than the E lineage (50 kya), which is observed at frequencies of 80–92%. Thus, the major sub-sets of Y lineages that arose from the M168 lineage do not trace to an African origin. Likewise, the pre-L3, M, N and R haplogroups of mtDNA give no indication of an African origin. The age of the split of the Y-chromosome composite DE haplogroup is very similar to the age of mtDNA L3 (Cabrera et al. 2018). The predominantly north African clades of mtDNA, haplogroups M1 (coalescence time 38.6 ± 7.1 ky) and U6 (coalescence time 45.1 ± 6.9 ky), arose in southwestern Asia and differentiated into their major sub-clades while in the Mediterranean area; only later did some sub-sets of M1a (coalescence time 28.8 ± 4.9 ky), U6a2 (coalescence time 24.0 ± 7.3 ky) and U6d (coalescence time 20.6 ± 7.3 ky) diffuse to East and North Africa through the Levant (Olivieri et al. 2006). Thus, modern humans used a southern coastal route for their ‘Out of Africa’ exit and the Levantine route from Asia to Africa for ‘back migration’ (Chandrasekar et al. 2007).

References

Agrawal S, Khan F, Pandey A, Tripathi M, Herrera RJ (2005) YAP, signature of an African Middle Eastern migration into northern India. Curr Sci 88:174–179
Altheide TK, Hammer MF (1997) Evidence for a possible Asian origin of YAP+ Y chromosomes. Am J Hum Genet 61(2):462–466
Cabrera VM, Marrero P, Abu-Amero KK et al (2018) Carriers of mitochondrial DNA macrohaplogroup L3 basal lineages migrated back to Africa from Asia around 70,000 years ago. BMC Evol Biol 18:98
Chandrasekar A, Saheb SY, Gangopadyaya P, Gangopadyaya S, Mukherjee A et al (2007) YAP insertion signature in South Asia. Ann Hum Biol 34:582–586
Cordaux R, Aunger R, Bentley GR, Nasidze I, Sirajuddin SM et al (2004a) Independent origins of Indian caste and tribal paternal lineages. Curr Biol 14(3):231–235
Cordaux R, Deepa R, Vishwanathan H, Stoneking M (2004b) Genetic evidence for the demic diffusion of agriculture to India. Science 304:1125
Cruciani F, Santolamazza P, Shen P, Macaulay V, Moral P, Olckers A et al (2002) A back migration from Asia to Sub-Saharan Africa is supported by high-resolution analysis of human Y-chromosome haplotypes. Am J Hum Genet 70(5):1197–1214
Cruciani F, La Fratta R, Santolamazza P, Sellitto D, Pascone R et al (2004) Phylogeographic analysis of haplogroup E3b (E-M215) Y chromosomes reveals multiple migratory events within and out of Africa. Am J Hum Genet 74:1014–1022
Haber M, Jones AL, Connell BA, Asan, Arciero E, Yang H, Thomas MG, Xue Y, Tyler-Smith C (2019) A rare deep-rooting D0 African Y-chromosomal haplogroup and its implications for the expansion of modern humans out of Africa. Genetics 212(4):1421–1428
Hammer MF, Spurdle AB, Karafet T, Bonner MR, Wood ET et al (1997) The geographic distribution of human Y chromosome variation. Genetics 145(3):787–805
Hammer MF, Karafet T, Rasanayagam A, Wood ET, Altheide TK, Jenkins T, Griffiths RC et al (1998) Out of Africa and back again: nested cladistic analysis of human Y chromosome variation. Mol Biol Evol 15:427–441
Karafet T, Xu L, Du R, Wang W, Feng S et al (2001) Paternal population history of East Asia: sources, patterns, and microevolutionary processes. Am J Hum Genet 69(3):615–628
Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K et al (2003a) The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet 72:313–332
Kivisild T, Rootsi S, Metspalu M, Metspalu E, Parik J et al (2003b) Genetics of the language and farming spread in India. In: Renfrew C, Boyle K (eds) Examining the farming/language dispersal hypothesis, McDonald Institute monographs series. McDonald Institute for Archaeological Research, Cambridge, pp 215–222
Luis JR, Rowold DJ, Regueiro M, Caeiro B, Cinnioglu C et al (2004) The Levant versus the Horn of Africa: evidence for bidirectional corridors of human migrations. Am J Hum Genet 74:532–544
Olivieri A, Achilli A, Pala M, Battaglia V, Fornarino S et al (2006) The mtDNA legacy of the Levantine early upper paleolithic in Africa. Science 314:1767
Palanichamy MG, Agrawal S, Yao YG, Kong QP, Sun C et al (2006) Comment on “Reconstructing the origin of Andaman islanders”. Science 311:47
Quintana-Murci L, Krausz C, Zerjal T, Sayar SH, Hammer MF et al (2001) Y-chromosome lineages trace diffusion of people and languages in Southwestern Asia. Am J Hum Genet 68(2):537–542
Sahoo S, Singh A, Himabindu G, Banerjee J, Sitalaximi T et al (2006) A prehistory of Indian Y-chromosomes: evaluating demic diffusion scenarios. Proc Natl Acad Sci U S A 103(4):843–848
Semino O, Passarino G, Brega A, Fellous M, Santachiara-Benerecetti AS (1996) A view of the neolithic demic diffusion in Europe through two Y chromosome-specific markers. Am J Hum Genet 59(4):964–968
Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA et al (2006) Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of central Asian pastoralists. Am J Hum Genet 78(2):202–221
Thangaraj K, Singh L, Reddy A, Rao V, Sehgal S et al (2003) Genetic affinities of the Andaman islanders, a vanishing human population. Curr Biol 13:86–93
Thangaraj K, Chaubey G, Kivisild T, Reddy AG, Singh VK et al (2005a) Reconstructing the origin of Andaman Islanders. Science 308:996
Thangaraj K, Sridhar V, Kivisild T, Reddy AG, Chaubey G et al (2005b) Different population histories of the Mundari and Mon-Khmer speaking Austro-Asiatic tribes inferred from the mt-DNA 9-bp deletion/insertion polymorphism in Indian populations. Hum Genet 116:507–517
Weale ME, Shah T, Jones AL, Greenhalgh J, Wilson JF et al (2003) Rare deep-rooting Y chromosome lineages in humans: lessons for phylogeography. Genetics 165(1):229–234

Chapter 7: Genomic Diversity of 75 Communities in India

Alu Kurumba

The Alu Kurumba have an ancestor cult centred on their mythic ancestors, called Kurupade-Tayi (Mother of the multiple or Army of the Alu Kurumbas). According to the 2011 census, the Alu Kurumba numbered 6823, of whom 3380 were males and 3443 were females (sex ratio = 1019). They inhabit the Nilgiri district of Tamil Nadu. Among themselves they speak a dialect of the Dravidian language that closely resembles Kannada, but they communicate with others in Tamil. The Alu Kurumba accept water and cooked food from other groups of the Kurumba, Toda, Badaga and Irula as well as from other local communities, but not from the Kota. Some Kurumba are the officiating priests at Badaga hamlets and fields. Occupationally, the Alu Kurumba depend mainly on the forest for their livelihood. Most are plantation labourers, irrespective of sex; some have taken up sericulture, bee keeping and horticulture. Under their marriage rules, a man can marry his mother’s brother’s daughter or father’s sister’s daughter. The Alu Kurumba have several exogamous clans, namely Nagara, Kaigeru, Irapane, Gobeada, Bellega, Neeraga, Bellare, Masole, Macole and Ballaku. They normally practise monogamy. They are mainly Hindu by religion (Singh 1994).

Biological variation in blood groups, dermatoglyphics and other markers has been studied by various scholars. In finger dermatoglyphics, the distribution of finger patterns among the Alu Kurumba shows a preponderance of loops (60.91%) followed by whorls (36.84%) (Chakrabartti and Mukherjee 1964). The Pattern Intensity Index shows a higher value of 12.95 among them. Undevia et al. (1981) showed that some tribes of Tamil Nadu had a very low incidence (about 4–5%) of colour blindness. Buchi (1959) reported 70.3% tasters and 29.7% non-tasters, with a T gene frequency of 45.48 and a t gene frequency of 54.52. Saha et al. (1976) found the sickle cell trait (HbAS) in 20.9% of the Alu Kurumba; the S gene frequency was 10.46%. In the ABO blood group system, blood group O is the most frequent (59.00%), followed by A (28.1%) and B (12.9%). The Rh (d) gene is totally absent in this population (Saha 1973). Among the Kurumba of Tamil Nadu, the frequencies of the HP1 and HP2 alleles are 13.4 and 86.6, respectively. The Alu Kurumba show a 100.0% gene frequency of the TFC variant of transferrin (Saha et al. 1976) and exhibit the GC1 (64.40%) and GC2 (35.60%) variants. Among the Kurumba, 13.10% of persons are G6PD-deficient, and the frequencies of the Pa and Pb genes are 28.10% and 71.90%, respectively (Saha et al. 1976). The frequency of the PGM1 gene is 44.2% and that of the PGM2 gene is 55.80%. To ascertain the genomic diversity, the present study on the Alu Kurumba was carried out, and 93 blood samples were collected from the state of Tamil Nadu.

© Springer Nature Singapore Pte Ltd. 2021 Anthropological Survey of India, Genomic Diversity in People of India, https://doi.org/10.1007/978-981-16-0163-7_7

Fig. 7.1 Y Chromosomal haplogroups of Alu Kurumba
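The allele frequencies quoted for PTC tasting and the sickle cell trait follow from standard population-genetic arithmetic. Assuming Hardy–Weinberg proportions (an assumption that the text does not state), the recessive non-taster allele frequency is the square root of the non-taster phenotype frequency; and if every carrier of the sickle cell trait is a heterozygote, the S allele frequency is half the carrier frequency. A minimal Python sketch, with illustrative function names:

```python
from math import sqrt

def recessive_allele_freq(recessive_phenotype_freq):
    """q = sqrt(q^2), valid only under Hardy-Weinberg proportions (assumed)."""
    return sqrt(recessive_phenotype_freq)

def allele_freq_from_carriers(carrier_freq):
    """Allele frequency when all carriers are heterozygotes and no homozygotes occur."""
    return carrier_freq / 2

t_freq = recessive_allele_freq(0.297)      # non-tasters: 29.7%
T_freq = 1 - t_freq
s_freq = allele_freq_from_carriers(0.209)  # HbAS carriers: 20.9%

print(f"t = {100 * t_freq:.2f}%, T = {100 * T_freq:.2f}%, S = {100 * s_freq:.2f}%")
```

The values come out close to the reported 54.52, 45.48 and 10.46%; differences in the last digit presumably reflect rounding of the published phenotype percentages.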

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 49 Alu Kurumba individuals were assigned to seven haplogroups (Fig. 7.1) through screening of the Y-SNPs. Haplogroup H1a has the highest frequency (63%), followed by haplogroups F* (18%), C5* (8%), H* (4%), and L1, R1a1* and R2 (2% each). Haplogroup H1a is found at higher frequency among the Dravidian and Central Indian tribes and represents the major indigenous Indian haplogroup. Haplogroup F* is found mostly among Dravidian-speaking populations and among speakers of Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic languages throughout Eurasia. Haplogroup C5* is found at high frequency in Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, the Russian Far East, Polynesia and Australia, and occurs at moderate frequency in the Korean peninsula and Manchuria. It displays high frequency in modern Indian populations. Haplogroup H is found among the Dravidian speakers and central Indian communities, while haplogroup L1 is typically found among the Dravidian communities of India. R1a1*, on the other hand, is widely distributed in Eurasia.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 22 Alu Kurumba individuals were scanned and completely sequenced, and were assigned to M- and N-haplogroup maternal lineages on the basis of HVR1 motifs. Alu Kurumba maternal lineages comprise 86% Asian macro-haplogroup M and 14% European macro-haplogroup N. Three maternal lineages belonging to haplogroup M (Fig. 7.2) were found in the Alu Kurumba population. Haplogroup M2 has the highest frequency (74%), followed by M36 (21%) and M6 (5%). Haplogroup M2 is found at high frequency among the populations of Bangladesh and South East India. Haplogroup M6 is found in south Asia, with its highest concentration in mid-Eastern India and Kashmir. The founder age of the Alu Kurumba population was 64 ± 13 ky. The N-haplogroup lineages (Fig. 7.3) of the three Alu Kurumba individuals were assigned to two haplogroups. Haplogroup R5 has the highest frequency (67%), followed by haplogroup U1 (33%). Haplogroup R5 is distributed across groups of the Indian subcontinent and peaks in coastal South-West India; its coalescence time was estimated to be 66,100 ± 22,000 years. Indian U lineages differ substantially from those in Europe, and their coalescence to a common ancestor dates back about 50,000 years. Haplogroup U1 is found at very low frequency throughout Europe, more often in Eastern Europe, Anatolia and the Near East, and also at low frequencies in India. It is a very ancient haplogroup, with an estimated age of about 32,000 years.

Fig. 7.2 mtDNA phylogenetic tree of M-haplogroup among the Alu Kurumba

Fig. 7.3 mtDNA phylogenetic tree of N-haplogroup among the Alu Kurumba

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. In the Alu Kurumba population it is 0.001570 ± 0.000800. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. The statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships. Figure 7.4 shows the mismatch distribution of the Alu Kurumba population; the smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is slightly ragged and often multimodal, which indicates recent expansion. The small sum of squared deviations (0.0185) and Harpending’s raggedness index (r = 0.0089) also confirm that the population has undergone recent demographic expansion (Table 7.1). Tajima’s D is a statistic, widely used in population genetics, that compares the average number of pairwise differences with the number of segregating sites. An excess of rare mutations gives a negative Tajima’s D. Among the Alu Kurumba, the value of 0.68 indicates that the population is under expansion, that is, recovering from a bottleneck. A negative value of Fu’s Fs (−9.72872) is also evidence of recent population expansion.

Fig. 7.4 Mismatch distribution of nucleotide differences of Alu Kurumba population
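The statistics named in this section can be illustrated on a toy alignment. The Python sketch below computes nucleotide diversity (mean pairwise differences per site), the mismatch distribution, Harpending's raggedness index and Tajima's D using their standard textbook definitions. The four sequences are invented for illustration, and the study's own values were presumably obtained with dedicated population-genetics software, so this is a sketch of the definitions rather than the authors' procedure. Tajima's D as written here requires at least one segregating site.

```python
from itertools import combinations
from math import sqrt

def pairwise_diffs(seqs):
    """Number of differing sites for every pair of aligned sequences."""
    return [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    diffs = pairwise_diffs(seqs)
    return sum(diffs) / len(diffs) / len(seqs[0])

def mismatch_distribution(seqs):
    """Relative frequency of pairs showing 0, 1, 2, ... differences."""
    diffs = pairwise_diffs(seqs)
    return [diffs.count(i) / len(diffs) for i in range(max(diffs) + 1)]

def raggedness(dist):
    """Harpending's r: sum of squared steps between adjacent mismatch classes."""
    padded = list(dist) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))

def tajimas_d(seqs):
    """Tajima's D from mean pairwise differences and segregating sites."""
    n = len(seqs)
    diffs = pairwise_diffs(seqs)
    k_hat = sum(diffs) / len(diffs)
    s = sum(len(set(col)) > 1 for col in zip(*seqs))  # segregating sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k_hat - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Invented toy alignment (NOT study data).
TOY = ["AAAA", "AAAT", "AATT", "ATTT"]
PI = nucleotide_diversity(TOY)
MISMATCH = mismatch_distribution(TOY)
R = raggedness(MISMATCH)
D = tajimas_d(TOY)
print(PI, MISMATCH, R, D)
```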

Andh

The Andhs are a branch of the Gonds who became isolated from the parent stock during the Maratha conquest. After coming into contact with the Marathas, they adopted the customs, manners and language of the Maratha Kunbi and merged with them. They migrated to Maharashtra from Andhra Pradesh (Russell and Hiralal 1975). The ‘Andh’ identification is used only for people who, by the start of the twentieth century, had a long history of residence in central India. However, two divisions of Andh are reported in Andhra Pradesh, the Andhs and the Sadhu Andhs (illegitimate progeny of the Andhs), and they have been identified as cultivators and hunters. They are further subdivided into the Vertali and the Khaltali. The Vertali consider themselves superior and avoid marrying the Khaltali. The Andhs are distributed in the Nanded, Parbhani and Yeotmal districts of Maharashtra, and also in Andhra Pradesh and Madhya Pradesh. They speak Marathi, which belongs to the Indo-European family of languages and is written in the Devanagari script. The Andhs have commensal relations with the Vanjara, Maratha, Naikpod and Chanwar but not with the Waddar,


Chambar, Mang and Maha. They are part of the traditional patron–client relationship and avail themselves of the services of the Ghisadi, Varik and others by making an annual payment in grain. The traditional and primary occupation of the Andh is cultivation, while animal husbandry, hunting and gathering are secondary occupations. Some of them work as wage labourers, and some are in Government service as clerks, teachers and in other capacities. Marriage with one’s mother’s brother’s daughter or sister’s daughter is preferred. Junior sororal marriage is allowed. At present, they marry only after attaining adulthood. Marriages through negotiation, mutual consent and elopement are common. Monogamy is the common form, though sororal and non-sororal polygamy are accepted. Divorce and remarriage are permitted. The post-marital residence is patrilocal. Though the majority of them are Hindus, there are also Muslims, Buddhists and Christians (Singh 1994). In the 2011 census, their population was enumerated as 474,110 individuals, of which 243,300 were males and 230,810 were females (sex ratio = 949). Traditional medicine is still used, but they are also favourably disposed towards modern medicine and family welfare programmes. For the present study, Andh blood samples were collected from the state of Maharashtra. The Andhs are of below-medium height with a mesocephalic head shape and show medium nasal and broader facial profiles (Karve and Dandekar 1951). Regarding PTC tasting ability, Buchi (1959) reported 70.3% tasters and 29.7% non-tasters, with a ‘T’ gene frequency of 45.48% and a ‘t’ gene frequency of 54.52%.
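The two derived figures in this paragraph can be checked with simple arithmetic. The sketch below assumes that the sex ratio is expressed as females per 1000 males (which reproduces the reported value of 949) and that the PTC gene frequencies were obtained under Hardy-Weinberg proportions with non-tasting as the recessive phenotype; both assumptions are ours, and the function names are hypothetical.

```python
import math


def sex_ratio(males, females):
    """Females per 1000 males, rounded to the nearest integer (assumed convention)."""
    return round(1000 * females / males)


def recessive_allele_frequency(recessive_phenotype_fraction):
    """Hardy-Weinberg estimate: q = sqrt(frequency of the recessive phenotype)."""
    return math.sqrt(recessive_phenotype_fraction)


# Andh census figures quoted in the text
andh_sex_ratio = sex_ratio(243_300, 230_810)

# PTC tasting: 29.7% non-tasters, treated here as the recessive (tt) class
t_freq = recessive_allele_frequency(0.297)
T_freq = 1.0 - t_freq

print(andh_sex_ratio)                              # 949
print(round(100 * T_freq, 1), round(100 * t_freq, 1))  # 45.5 54.5
```

The allele frequencies obtained this way (about 45.5% and 54.5%) are close to, but not identical with, the 45.48% and 54.52% quoted from the original report, which may have been computed from raw counts.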

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 39 Andh individuals were assigned to 12 haplogroups (Fig. 7.5) through the screening of Y-SNPs. Haplogroup H1a has the highest frequency (28%), followed by haplogroups R1a1 and O2a (13% each). Haplogroup H1a

Table 7.1 Molecular diversity indices among the Alu Kurumba

Index                                  Value               P
Mean number of pairwise differences    21.510 ± 9.831
Nucleotide diversity                   0.00157 ± 0.0008
Sum of squared deviation (SSD)         0.0185              0.530
Harpending’s raggedness index (HRI)    0.0089              0.800
Theta Pi                               21.51087 ± 0.959
Theta S                                25.9754 ± 8.705
Tajima’s D                             −0.6814             0.285
Fu’s Fs                                −9.728              0.000


Fig. 7.5 Y chromosome haplogroups of Andh

was found at a higher frequency among Dravidian and Central Indian tribes and represents the major indigenous Indian haplogroup. Haplogroup R1a1 is widely distributed in Eurasia: it is mainly found in Eastern Europe, Central Asia, South Asia, Siberia and ancient Siberia, but it is rare in East Asia. It has also been suggested that R1a1 might have an independent origin in the Indian subcontinent (Kivisild et al. 2003a, b). Haplogroup R2, another signature of the Central Asian lineage, occurs at about 5.0% in the Andh population. This haplogroup is mainly found in Indian, Iranian and Central Asian populations and has been postulated to have a Central Asian origin. The presence of about 16% J2 and 5% L indicates episodes of Neolithic migration from Central Asia. Haplogroup O is distributed widely in Asia, from southern India to the Altai Mountains and Central Asia in the west, and from Indonesia to northern China and Japan in the east. The presence of 13% O2a in the Andh indicates admixture of Austro-Asian genes.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 84 Andh individuals were scanned for maternal lineages in the population. Of these 84 mtDNA genomes, 25 under the M-haplogroup and 14 under the N-haplogroup were selected for complete sequencing based on HVR I motifs. A total of eight maternal lineages belonging to haplogroup M (Fig. 7.6) were found in the Andh population. All the M lineages, M2 (28%), M3 (4%), M5 (8%), M6 (4%), M30 (12%), M35 (20%) and M39 (24%), are autochthonous to India. The Andh population comprises both M2a and M2b lineages, whose founder age was 64 ± 13 ky. The N-haplogroup mtDNAs of the 14 Andh individuals were assigned to seven haplogroups (Fig. 7.7). Haplogroups R5 and U8 have the highest frequency (29% each), followed by haplogroup U2 (14%). Haplogroups R7, R33, U1 and U5 each show a frequency of 7%, and a new haplogroup exhibited a frequency of 2%. R5 is distributed across groups of the Indian subcontinent and peaks in coastal South-West India. The coalescence time was estimated to be


Fig. 7.6 mtDNA phylogenetic tree of M-haplogroup among the Andh

66,100 ± 22,000 years. U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in South-West Arabia. Indian U lineages differ substantially from those in Europe, and their coalescence to a common ancestor also dates back to about 50,000 years. The subclade R7 is very frequent in the Indian subcontinent (Chaubey et al. 2008a, b). Indian and Western Eurasian U2 lineages were estimated to be 53,000 ± 4000 ybp. Haplogroup U1 is found at very low frequency throughout Europe; it is found more often in Eastern Europe, Anatolia and the Near East, and also at low frequencies in India. Haplogroup U1 is a very ancient haplogroup, with an estimated age of about 32,000 years. The genetic distance (FST) based on both Y chromosome and mtDNA haplogroups reveals that the Andh are genetically closer to Dravidian than to Indo-European populations (Fig. 6.5b).


Fig. 7.7 mtDNA phylogenetic tree of N-haplogroup among the Andh

Molecular Diversity

Molecular diversity indices are shown in Table 7.2. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Andh population is 0.001445 ± 0.0007333. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships.
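To make the phrase "calculated by examining the DNA sequences directly" concrete, the sketch below computes nucleotide diversity as the mean number of pairwise differences divided by the sequence length, for a set of aligned sequences of equal length. The toy sequences and function names are hypothetical and serve only to show the calculation. The Andh values in Table 7.2 are consistent with this relation if the sequence length is roughly that of the human mitochondrial genome (23.950 / 16,569 ≈ 0.001445).

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pairwise_differences(sequences):
    """Average number of differences over all distinct pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


# Toy alignment (hypothetical, four sites long)
toy = ["AAAA", "AAAT", "AATT"]
toy_mean = mean_pairwise_differences(toy)  # (1 + 2 + 1) / 3
toy_pi = nucleotide_diversity(toy)         # toy_mean / 4
print(round(toy_mean, 4), round(toy_pi, 4))  # 1.3333 0.3333
```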

Figure 7.8 shows the mismatch distribution of the Andh population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Andh population is mostly unimodal, which indicates a recent expansion. The small values of the sum of squared deviation (0.0063) and Harpending’s raggedness index (r) (0.0085) also support the conclusion that the Andh have undergone recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Andh population, the value of −1.5103 indicates that the population is expanding, that is, recovering from a bottleneck. A negative

Table 7.2 Molecular diversity indices among the Andh

Index                                  Value               P
Mean number of pairwise differences    23.950 ± 10.893
Nucleotide diversity                   0.001445 ± 0.0007
Sum of squared deviation (SSD)         0.00631             0.210
Harpending’s raggedness index (HRI)    0.00853             0.5100
Theta Pi                               23.9500 ± 12.139
Theta S                                38.6656 ± 12.664
Tajima’s D                             −1.5103             0.052
Fu’s Fs                                −9.63172            0.004


Fig. 7.8 Mismatch distributions of nucleotide differences of Andh Population

value of Fu’s Fs (−9.63172) is further evidence of a recent population expansion in the Andh community.

Angami Naga

The Angami Nagas trace their origin to the Kerge village of Kezakenoma in Nagaland. They presently inhabit the Kohima district of Nagaland. The Angami language belongs to the Tibeto-Burman family of languages. Marriages are usually monogamous, and fidelity to the spouse is considered a high virtue. Marriage within the same clan is not permitted, as it is considered incest. The family is the most important institution of social education and social control. Traditional Naga society is patriarchal. The Angami Naga depend primarily on wet terrace cultivation; fishing, shifting cultivation and animal husbandry are also sources of livelihood. The majority of the Angamis are Christians, and only a few are Hindus. For the present study, Angami Naga blood samples were collected from the state of Nagaland. The Angami are of below-medium stature. The serological characters indicate a low percentage of blood groups B (22%) and AB (7%) and a high incidence of A (45%) and of the O gene (57%) among them (Seth and Seth 1973). The incidence of G-6PD deficiency is very high among them (27.06%) (Seth and Seth 1971). Dermatoglyphic traits show a greater frequency of whorls (52.34%) than of loops (47.42%) and arches (0.25%). The pattern intensity index is 15.2 in palmar dermatoglyphics, which conforms to the Mongoloid pattern (Mitra 1936).

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 108 Angami Naga individuals were scanned for maternal lineages in the population. Angami Naga maternal lineages comprise 47% of the Asian macro-haplogroup M and 53% of the European macro-haplogroup N. For haplogroup M, 51 mtDNA genomes were selected for complete sequencing based on HVR I motifs. A total of eight maternal lineages belonging to haplogroup M were found in the Angami Naga population (Fig. 7.9). Haplogroups D and M54 have the highest frequency (23% each), followed by M8CZ (20%) and by M38 and M58 (16% each). Haplogroup D


Fig. 7.9 mtDNA phylogenetic tree of M-haplogroup among the Angami Naga


is found in Eastern Eurasia, among Native Americans, in Central Asia and occasionally also in West Asia and Northern Europe. Haplogroup M8CZ is prevalent in Eurasia. While haplogroup M13 is distributed in Tibet, Mongolia and Siberia, a lower frequency of M10 is found in East Asia, South East Asia, Bangladesh, Central Asia, Southern Siberia and Belarus. The founder age of the Angami Naga population was 59 ± 12 ky. The N-haplogroup mtDNAs of the 57 Angami Naga individuals were assigned to ten haplogroups (Fig. 7.10). Haplogroup F has the highest frequency (35%), followed by haplogroups R* (23%), R5 (9%), R0 (5%) and R22 (4%), while N10 and R30 each show a frequency of 2%. Haplogroup F is fairly common in East Asia and Southeast Asia. Haplogroup R and its descendants are distributed all over Europe, North Africa, the Near East, the Indian subcontinent, Oceania and the Americas. The basal R* clade is found among the Socotri (1.2%), as well as in Northeast Africa (1.5%), the Middle East (0.8%), the Near East (0.8%) and the Arabian Peninsula (0.3%). Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup R5 is distributed across groups of the Indian subcontinent and peaks in coastal South-West India. Its coalescence time was estimated to be 66,100 ± 22,000 years. The subclade R0 within haplogroup R occurs commonly in the Arabian Peninsula, with its highest frequency observed among the Socotri (Černý et al. 2009). Moderate frequencies are found in North Africa, the Horn of Africa and Central Asia. Haplogroup R22 is found mainly in south-central Indonesia. Haplogroup R30 is found in South East Asia and the Far East. Haplogroup N10 is found in China and Southeast Asia.

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Angami Naga population is 0.002216 ± 0.001075. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.3). Figure 7.11 shows the mismatch distribution of the Angami Naga population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Angami Naga population is slightly ragged and often unimodal, which indicates recent expansion. The small values of the sum of squared deviation (0.00039) and Harpending’s raggedness index (r) (0.000568) also support the conclusion that the population has undergone recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites and is widely used in population genetics. When there are many rare mutations, Tajima’s D is negative. In the Angami Naga population, the value of −1.531 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−23.9814) is further evidence of a recent population expansion.
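The mismatch distribution and the raggedness index discussed above can be sketched in a few lines. The code assumes the usual definition of Harpending’s raggedness index, r = Σ (x_i − x_{i−1})² summed over i = 1 … d + 1, where x_i is the relative frequency of sequence pairs differing at i sites and d is the largest observed number of differences. The toy sequences, the two example distributions and the function names are hypothetical; the published values in this chapter come from the authors’ own analysis, not from this code.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Relative frequency of sequence pairs differing at 0, 1, ..., d sites."""
    diffs = [
        sum(1 for a, b in zip(s, t) if a != b)
        for s, t in combinations(sequences, 2)
    ]
    if not diffs:
        raise ValueError("need at least two sequences")
    counts = Counter(diffs)
    return [counts.get(i, 0) / len(diffs) for i in range(max(diffs) + 1)]


def raggedness_index(freqs):
    """Sum of squared differences between neighbouring classes (assumed definition)."""
    padded = list(freqs) + [0.0]  # class d + 1 is empty by construction
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy alignment (hypothetical): pairwise differences are 1, 2 and 1
toy_freqs = mismatch_distribution(["AAAA", "AAAT", "AATT"])
toy_r = raggedness_index(toy_freqs)

# A smooth unimodal curve gives a lower index than a jagged multimodal one
smooth_r = raggedness_index([0.1, 0.2, 0.4, 0.2, 0.1])
ragged_r = raggedness_index([0.4, 0.0, 0.3, 0.0, 0.3])
print(round(toy_r, 4), round(smooth_r, 2), round(ragged_r, 2))  # 0.6667 0.11 0.52
```

Low values of the index therefore go with smooth, unimodal distributions of the kind expected after a demographic expansion, whereas jagged, multimodal distributions give higher values.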

Betta Kuruba

Thurston (1909) writes that the Kurubas or Kurumbas are modern representatives of the ancient Kurumbas or Pallavas, who were once very powerful in South India. Their strength and power gradually declined with the rise of the Kongu, Chola and Chalukya chiefs. About the seventh and eighth centuries A.D., with the advent of King Adondi of Chola, the Kurumbas’ sovereignty was affected and they were overthrown. This led to the dispersion of the Kurumbas far and wide. Many fled to the hills of Malabar, Nilgiris, Coorg, Wynad and Mysore. This in turn led to comparative isolation and the loss of the community’s culture. The Kurumbas may be regarded as very old inhabitants of the land, contesting with their Dravidian kinsmen the priority of occupation of the soil (Singh 1994). For the


Fig. 7.10 mtDNA phylogenetic tree of N-haplogroup among the Angami Naga

Table 7.3 Molecular diversity indices among the Angami Naga

Index                                  Value               P
Mean number of pairwise differences    36.7126 ± 16.089
Nucleotide diversity                   0.002216 ± 0.001
Sum of squared deviation (SSD)         0.00039             0.870
Harpending’s raggedness index (HRI)    0.00056             0.990
Theta Pi                               36.712 ± 17.816
Theta S                                67.6589 ± 16.272
Tajima’s D                             −1.531              0.028
Fu’s Fs                                −23.981             0.001


Fig. 7.11 Mismatch distributions of nucleotide differences of Angami Naga Population

present study, Betta Kuruba blood samples were collected from the state of Karnataka. According to the 2011 census, the Betta Kuruba population was 3111, of which 1547 were males and 1564 were females (sex ratio = 1011). They are mainly distributed in the northern part of the Nilgiri Hills district of Tamil Nadu. The Betta Kuruba speak a dialect of their own, which is similar to Kannada and belongs to the Dravidian family of languages. Their primary occupation is daily labour. With regard to marriage rules, they largely practise monogamy. Marriage with one’s father’s sister’s daughter, mother’s brother’s daughter or elder sister’s daughter is common among them. They are followers of Hinduism. Physically, the Betta Kurubas are mostly of short to very short stature, with a mean stature of 1548.1 mm (Sirajuddin 1993). They possess dolichocephalic heads (cephalic index: 75.84) and mesorrhine noses (nasal index: 76.42). Regarding the ABO blood groups, the prevalence of the ‘O’ blood group is the highest in this population (49.25%), followed by the ‘B’ blood group (25.37%), the ‘A’ blood group (16.42%) and, least frequent, the ‘AB’ blood group (8.95%). It has been observed that the Betta Kuruba show a high frequency of G6PD deficiency, i.e. 17.02%.

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 35 Betta Kuruba individuals for Y-SNPs revealed the presence of five haplogroups: C5*, F*, H1a*, H2* and L1. F* is the most frequent (71%), followed by H1a* (20%), L1 (3%), H2* (3%) and C5* (3%) (Fig. 7.12). The F* haplogroup is mainly distributed in North, Central, Western and South India, Sri Lanka, Nepal, Borneo, Java, Sulawesi and Lemdada. The haplogroup H1a*, in contrast, is predominant among the Dravidian-speaking communities and central Indian communities and is considered to be a major indigenous Indian haplogroup. The L1 haplogroup also indicates an episode of Neolithic migration from Central Asia.


Fig. 7.12 Y chromosome haplogroups of Betta Kuruba

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 104 Betta Kuruba individuals were scanned for maternal lineages in the population. Based on HVR I motifs, 30 samples under the M-haplogroup and 12 samples under the N-haplogroup were selected for mtDNA sequencing. A total of four maternal lineages belonging to haplogroup M were found among the Betta Kuruba population (Fig. 7.13). Of these, 57% belong to the M2 lineage, 27% to M35 and 17% to M39. The founder age of the Betta Kuruba was estimated to be 64 ± 13 ky based on the high preponderance of the M2 maternal lineage. Thus, out of 104 mtDNA genomes, 42 were selected for complete sequencing, of which 30 were under the M-haplogroup and 12 under the N-haplogroup (Table 7.4). The N-haplogroup mtDNAs of the 12 Betta Kuruba individuals were assigned to three haplogroups (Fig. 7.14). Haplogroups R30 and U1 have the highest frequency (42% each), followed by haplogroup U2 (17%). Haplogroup R is a very extended and diversified macro-haplogroup. Haplogroup R30 is found in Andhra Pradesh and Uttar Pradesh (India), in the Tharu people from Nepal and in the Sinhalese people from Sri Lanka. The coalescence time of haplogroup R was estimated to be 73,000 ± 20,900 years (Kivisild et al. 2003a, b). R30 was also found in South East Asia and the Far East. Haplogroup U1 is found at very low frequency throughout Europe; it is found more often in Eastern Europe, Anatolia and the Near East, and also at low frequencies in India. Haplogroup U1 is a very ancient haplogroup, with an estimated age of about 32,000 years. U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in South-West Arabia. Indian U lineages differ substantially from those in Europe, and their coalescence to a common ancestor also dates back to about 50,000 years.

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Betta Kuruba population is 0.001140 ± 0.000578. Nucleotide


Fig. 7.13 mtDNA phylogenetic tree of M-haplogroup among the Betta Kuruba

diversity is a measure of genetic variation. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships. Figure 7.15 shows the mismatch distribution of the Betta Kuruba population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Betta Kuruba population is mostly unimodal, which indicates a recent expansion. The values of the sum of squared deviation (0.08035654) and Harpending’s raggedness index (r) (0.09715418) are also taken to confirm that the Betta Kuruba have undergone a recent demographic bottleneck. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Betta Kuruba population, the value of 1.34954 indicates that the population may have suffered a recent bottleneck or may be decreasing. The negative value of Fu’s Fs (−16.01467) is also taken as evidence of a recent population bottleneck in the Betta Kuruba community.

Table 7.4 Molecular diversity indices among the Betta Kuruba

Index                                  Value               P
Mean number of pairwise differences    18.8873 ± 8.607
Nucleotide diversity                   0.00114 ± 0.0005
Sum of squared deviation (SSD)         0.08035             0.000
Harpending’s raggedness index (HRI)    0.09715             0.000
Theta Pi                               18.8873 ± 9.578
Theta S                                13.8830 ± 4.595
Tajima’s D                             1.3495              0.939
Fu’s Fs                                −16.0146            0.000


Fig. 7.14 mtDNA phylogenetic tree of N-haplogroup among the Betta Kuruba

Bharia

The Bharias are mostly distributed in the Seoni and Chhindwara districts of Madhya Pradesh. Their origin is traced back to the times of the Mahabharata: it is said that Arjun, one of the Pandavas, helped some men by pressing bharru grass, and these men are said to be the ancestors of the Bharia. They accept water and food from the Brahmin, Khatri, Kunohi, Ahir, Lohar and Gadina. They speak a local dialect which belongs to the Indo-European family of languages. Occupationally, they are mainly wage-earners and agricultural labourers. They are monogamous and observe clan exogamy, being divided into various clans which regulate marriage alliances. They follow Hinduism (Singh 1994). According to the 2011 census, the Bharia population was 193,230, of which 97,574 were males and 95,656 were females; the sex ratio of this population is 980. The Bharia are short to below medium in stature and have a lean body mass index. They are closely associated with the Gond. For the present study, Bharia blood samples were collected from the state of Madhya Pradesh.


Fig. 7.15 Mismatch distributions of nucleotide differences of Betta Kuruba community

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 56 Bharia individuals were assigned to six haplogroups (Fig. 7.16) through screening of the Y-SNPs. Haplogroup O2a* has the highest frequency (48%), followed by haplogroups M* (25%), H1 (13%), H* (7%), and R* and R1 (4% each). Haplogroup O2a shows admixture of Austro-Asian genes in this population. Haplogroup H1a was found at a higher frequency among Dravidian and Central Indian tribes and represents the major indigenous Indian haplogroup. Haplogroup M* has been found in Papua New Guinea, neighbouring Melanesia, Indonesia and among indigenous Australians. Haplogroup R1a is widely distributed in Eurasia: it is mainly found in Eastern Europe, Central Asia, South Asia, Siberia and ancient Siberia, but it is rare in East Asia. It has also been suggested that R1a might have an independent origin in the Indian subcontinent (Kivisild et al. 2003a, b).

Bhoi Khasi

According to Bareh (1967), ‘Khasi is a general name given to the various tribes and sub tribes that inhabit the Khasi and the Jaintia Hills’. The Bhoi Khasi are those groups of Khasis who inhabit the low-lying areas. The Bhoi region is an extensive plateau associated with flat lands and open valleys, and its northern part slopes towards the Brahmaputra valley (Bareh 1967, cf. Singh 1994). The Bhoi Khasis are distributed in four blocks, namely Umsning, Jirang, Umling and Bhoirymbong, of the Ri Bhoi district of Meghalaya. The Bhoi villages are surrounded by hills and dense forest and are characterized by moderate temperature and medium rainfall. The Bhoi-Khasi language is associated with the Mon-Khmer linguistic group (Natarajan 1977, cf. Singh 1994). The dialect of the Bhoi Khasi, which is known as Bhoi, is a form of the Khasi language. The Bhoi Khasi are divided into a number of exogamous clans, and there is no hierarchy among the clan groups. The main function of the clan is to


Fig. 7.16 Y chromosome haplogroups of Bharia

regulate marriage. This is an endogamous community, and inter-tribal marriage among them is not common. There is no preference for consanguineous marriage. However, marriage with the mother’s brother’s daughter is allowed after the death of the maternal uncle. Monogamy is the rule, but a few cases of polygamous marriage have also been reported. Sororate and both junior and senior levirate marriages are permissible. The Bhoi Khasi are matrilocal. A majority of the Bhoi Khasi have adopted Christianity; some are animists, and a very small percentage either worship Hindu gods or goddesses or have faith in Islam. Their primary occupation is agriculture, supplemented by sericulture, petty business, contract jobs, basketry, fishing, hunting and the collection of forest products such as honey and firewood (Singh 1994). For the present study, Bhoi Khasi blood samples were collected from the state of Meghalaya. The Bhoi are predominantly of short stature (64%), followed by below-medium stature (15%) and very short stature (13%), with mesocephalic, hypsicephalic and acrocephalic heads and mesorrhine and platyrrhine types of nose. As far as fingerprint patterns are concerned, the Bhoi males are characterized by a high frequency of whorls (55.94%) and a relatively high pattern intensity index (15.46), in contrast to the females, who have a high frequency of loops (62.06%) and a low pattern intensity index (13.64). In the ABO blood group system, the Bhoi show a high frequency of the O blood group (38.7%), followed by A (30.4%), B (23.0%) and AB (7.8%). The frequencies of tasters and non-tasters among them are 78.09% and 21.90%, respectively (Das 1978).

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 31 individuals were assigned to three haplogroups (Fig. 7.17) through screening of the Y-SNPs. Haplogroup O2a* has the highest frequency (61%), followed by haplogroup O3a3c1* (36%) and haplogroup R1a1* (3%). The presence of 61% O2a in the Bhoi Khasi indicates admixture of Austro-Asian genes. Haplogroup O was found at a higher frequency among Dravidian and Central Indian tribes and


Fig. 7.17 Y chromosome haplogroups of Bhoi Khasi

represents the major indigenous Indian haplogroup. Haplogroup R1a is widely distributed in Eurasia: it is mainly found in Eastern Europe, Central Asia, South Asia, Siberia, ancient Siberia, but rare in East Asia. It has also been suggested that R1a might have an independent origin in the Indian subcontinent (Kivisild et al. 2003a, b).

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 35 Bhoi Khasi individuals were scanned and selected for complete sequencing, of which 69% were under the M-haplogroup and 31% under the N-haplogroup. A total of 11 maternal lineages belonging to haplogroup M were found in the Bhoi Khasi population (Fig. 7.18). Haplogroup M39 has the highest frequency (33%), followed (12%), M38, M46, M48 (each 8%) and M4, M8CZ, M31, M45 and M50 (each 4%). A new haplogroup of the M lineage has to be assigned for two samples. Haplogroup M31 is distributed among the Onge in the Andaman Islands. Haplogroup D is found in Eastern Eurasia, among Native Americans, in Central Asia and occasionally also in West Asia and Northern Europe. Haplogroup M49 is found among ancient specimens from the Euphrates valley. Haplogroup M8CZ is prevalent in Eurasia, while M4 is found in South Asia and at low concentration in Eastern Saudi Arabia. The founder age of the Bhoi Khasi population was 20 ± 6 ky. The N-haplogroup mtDNAs of the 11 Bhoi Khasi individuals were assigned to four haplogroups (Fig. 7.19). Haplogroups A17, F1 and R0 have the highest frequency (27% each), followed by haplogroup B4'5 (18%). Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup F is fairly common in East Asia and Southeast Asia. Higher frequencies occur in some areas, such as Nicobar (50%) and Arunachal Pradesh (31%) in India and among the Shors people of Siberia (44%). The subclade R0 within haplogroup R occurs commonly in the Arabian Peninsula, with its highest frequency observed among the Socotri (Černý et al. 2009). Moderate frequencies are found in North Africa, the Horn of Africa and Central Asia. Haplogroup B is believed to have arisen in Asia some


Fig. 7.18 mtDNA phylogenetic tree of M-haplogroup among the Bhoi Khasi

50,000 years before present. Its ancestral haplogroup was haplogroup R. Its greatest variety is found in China, and haplogroup B may have had its earliest diversification in Southern China and/or Southeast Asia. Haplogroup B is found frequently in South-Eastern Asia.

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Bhoi Khasi population is 0.002134 ± 0.001065. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be


Fig. 7.19 mtDNA phylogenetic tree of N-haplogroup among the Bhoi Khasi

used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.5). Figure 7.20 shows the mismatch distribution of the Bhoi Khasi population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Bhoi Khasi population is mostly unimodal, which indicates a recent expansion. The small values of the sum of squared deviation (0.00163395) and Harpending’s raggedness index (r) (0.00196596) also support the conclusion that the Bhoi Khasi have undergone recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Bhoi Khasi population, the value of −1.9623 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−16.31913) is further evidence of a recent population expansion in the Bhoi Khasi community.

Bhoksa

The Bhoksa are a scheduled tribe population from Uttarakhand. A section of the Bhoksa claim descent from Raja Jajatdeo, a famous warrior of Rajasthan. According to the 2011 census, their

Table 7.5 Molecular diversity indices among the Bhoi Khasi

Index                                  Value               P
Mean number of pairwise differences    26.5731 ± 11.931
Nucleotide diversity                   0.00213 ± 0.001
Sum of squared deviation (SSD)         0.00163             0.890
Harpending’s raggedness index (HRI)    0.00196             0.990
Theta Pi                               26.57310 ± 13.263
Theta S                                55.1210 ± 16.629
Tajima’s D                             −1.9623             0.004
Fu’s Fs                                −16.3191            0.000


Fig. 7.20 Mismatch distributions of nucleotide differences of Bhoi Khasi Population

population size is 54,037, of which 28,836 were males and 26,201 were females (sex ratio = 909). A total of 52,899 individuals were found to live in rural areas. The Bhoksa inhabit the Terai areas of the Dehradun, Nainital, Bijnor and Pauri Garhwal districts of Uttarakhand. They speak Hindi, which belongs to the Indo-European family of languages. The traditional and primary occupations of the Bhoksa are agriculture and animal husbandry; the subsidiary occupation is wage labour. The Bhoksa are monogamous and follow clan exogamy. Marriages are usually arranged through negotiation or take place by elopement. A few cases of inter-community marriage with the Ranga and Hindu Gujjar have been reported, and such unions are recognized by their societies. The majority of them are Hindus, and very few follow Buddhism, Sikhism, Jainism or Islam (Singh 1994). For the present study, Bhoksa blood samples were collected from the state of Uttarakhand. As far as their anthropometric measurements are concerned, they are generally of medium or below-medium stature. With regard to the ABO blood groups, the frequency of the B blood group is the highest (38.2%), followed by the O blood group (32.4%), the A blood group (20.5%) and, least frequent, the AB blood group (8.8%) (Majumdar and Krishen 1947). The Bhoksa are also known as Meheri or Mehra.

Paternal Lineage (Y Chromosomal Haplogroups)
Screening of the Y-SNPs assigned the Y chromosomes of the 29 individuals to nine haplogroups (Fig. 7.21). Haplogroup F* has the highest frequency (41%), followed by haplogroup H1A* (17%), R1A1* (14%), H* (10%), and H1B, J2B1, L1, L* and N1* (4% each). Haplogroup F* is found mostly among Dravidian-speaking populations and among Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H1A* is widely distributed among the Dravidian and Central Indian tribes. Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup L* is distributed in Pakistan, whereas haplogroup L1 is typically distributed among Dravidian communities in India.

Fig. 7.21 Y chromosome haplogroups of Bhoksa

Bhotia
Historical accounts state that they are the descendants of the Bhil Kirat or the Mon-Khmer who entered India from the eastern direction. According to the 2011 census, the total population of the Bhotia is 39,106, of which 19,168 were males and 19,938 were females (sex ratio = 1040), while 28,230 Bhotia live in rural areas. They inhabit the ranges along the snowy peaks of the Himalayas on the Indo-Tibetan border of Uttarakhand. In Uttarakhand, the Bhotia are among the earliest inhabitants of the Kumaon and Garhwal hills. They speak various dialects of the Bhotia language, which belongs to the Tibeto-Burman family of languages. The Bhotia prefer to dine with all communities except the Sudras. Business is their traditional occupation. Apart from that, they are engaged in agriculture, weaving and animal husbandry, and work as daily-wage labourers. They are generally monogamous, but in the northern parts, a few cases of polyandry were found. They profess Buddhism, Hinduism and Christianity as their religion. The Bhotia are divided into eight sub-groups on the basis of religion, territory, occupation and dialect (Singh 1994). For the present study, Bhotia blood samples were collected from the state of Uttarakhand. The Bhotia are of below medium or short stature, round headed and show a medium profile with a long to medium nose form. About 75% of them are non-tasters for the PTC taste character. The incidence of colour blindness ranges from 3% to 6%. With regard to G6PD deficiency, 1.96% of the Bhotia were found to be deficient (Kapoor and Vaid 1977). In the ABO blood group system, blood group ‘B’ is the highest (50.7%), followed by blood group ‘O’ (18.1%) and equal proportions of the A and AB blood groups (15.3%) (Tiwari 1952).
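The sex ratios quoted with the census figures in this chapter follow the Indian census convention of females per 1000 males. A minimal sketch of that arithmetic, using the Bhotia counts given above (the function name `sex_ratio` is illustrative):

```python
def sex_ratio(males: int, females: int) -> int:
    """Females per 1000 males, rounded to the nearest whole number."""
    return round(females / males * 1000)


# Bhotia counts from the 2011 census as quoted in the text.
bhotia_ratio = sex_ratio(males=19_168, females=19_938)

# Share of the Bhotia population living in rural areas, in percent.
bhotia_rural_pct = round(28_230 / 39_106 * 100, 1)
```

The same function can be applied to the male and female counts of any other community in the chapter to check the ratio printed alongside them.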

Paternal Lineage (Y Chromosomal Haplogroups)
The screening of 58 individuals for Y-SNPs revealed the presence of ten haplogroups: D*, R2, M*, O3a3c, N*, H*, H1*, R*, G* and O3A3C1*. D* shows the highest preponderance (29%), followed by R2 (17%), M* and O3a3c (14% each), N* (9%), H* and H1* (5% each), R* (3%), and G* and O3A3C1* (2% each) (Fig. 7.22). Haplogroup D is an Asian haplogroup, predominant in South East Asia including Siberia and Central Asia.

Fig. 7.22 Y chromosome haplogroups of Bhotia (%): D* 29.3, G* 1.7, H* 5.2, H1* 5.2, M* 13.8, N* 8.6, O3a3c 13.8, O3A3C1* 1.7, R* 3.4, R2 17.2
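Haplogroup percentages of this kind are simple proportions of the screened sample. The sketch below shows the calculation for the 58 Bhotia individuals; the per-haplogroup counts are not given in the text and are assumed here, chosen only because they sum to 58 and reproduce the reported percentages.

```python
# Assumed counts (hypothetical): not reported in the text, inferred from n = 58.
bhotia_counts = {
    "D*": 17, "R2": 10, "M*": 8, "O3a3c": 8, "N*": 5,
    "H*": 3, "H1*": 3, "R*": 2, "G*": 1, "O3A3C1*": 1,
}


def haplogroup_percentages(counts):
    """Convert haplogroup counts to percentages of the total sample."""
    total = sum(counts.values())
    return {hg: round(c / total * 100, 1) for hg, c in counts.items()}


bhotia_pct = haplogroup_percentages(bhotia_counts)
```

Rounded to whole numbers, these values give the 29%, 17%, 14%, 9%, 5%, 3% and 2% figures quoted in the paragraph.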

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of nine Bhotia individuals were scanned for maternal lineages; based on the HVR1 motif, five fall under the M-haplogroup and four under the N-haplogroup. Five maternal lineages belonging to haplogroup M, i.e. M4, M8, G, M18 and M33, were found in equal frequency (20% each) among the Bhotia population (Fig. 7.23). Haplogroup G is distributed in Russia, Straikovskaya, Japan, Mongolia and Tibet, while haplogroup M4 is distributed in South Asia and, at low concentration, in eastern Saudi Arabia. Haplogroup M18 is predominant among the Tharus in southern Nepal and tribal populations in Andhra Pradesh. Haplogroup M33 is found in smaller amounts in South Asia, Belarus and Southern China. The founder age of the Bhotia population was 60 ± 15 ky. The N-haplogroup individuals of the Bhotia were assigned to four lineages (Fig. 7.24). Haplogroups A, A11, W and R6 have equal frequency, i.e. 25% each. Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup R6, a subclade of haplogroup R whose origin traces to 66.8 kya, is found at low frequencies in India and Pakistan.

Fig. 7.23 mtDNA phylogenetic tree of M-haplogroup among the Bhotia

Molecular Diversity
Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Bhotia population is 0.001175 ± 0.000706. As a measure of genetic variation, it is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships (Table 7.6). Figure 7.25 shows the mismatch distribution of the Bhotia population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Bhotia population, which indicates a recent expansion. The small values of the sum of squared deviation (0.04810018) and Harpending’s raggedness index (r = 0.11555556) also confirm that the Bhotia have undergone a recent demographic bottleneck. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. Among the Bhotia population, D (−1.17654) indicates that the population may have suffered a recent bottleneck and is now expanding. A negative value of Fu’s Fs (−0.21914) is also evidence of a recent population bottleneck in the Bhotia community.

Fig. 7.24 mtDNA phylogenetic tree of N-haplogroup among the Bhotia

Bondo
This community is also known as Bondo Poraja (Nayak 2010). They prefer to describe themselves as ‘Remo’, which means man or people, and are mainly distributed in the Khairput block of the Malkangiri subdivision in Koraput district of Odisha (Nandy 2016). They are categorized into three groups: (a) the upper Bondo, who live in the Bondo Hills, are hostile to outsiders and still retain their distinctive sociocultural features; (b) the lower Bondo, who live at the foot of the Bondo Hills and maintain a symbolic relationship with the surrounding peasant communities; and (c) the Gadaba Bondo, who practise settled agriculture and live in multi-ethnic settlements (Nandy 2016). The Bondo territory is full of mountain ranges and forests. According to the 2011 census, their population is 12,231, of which 5669 were males and 6562 were females (sex ratio = 1158) (Census of India 2011). They speak the Bondo language, which belongs to the Austro-Asiatic linguistic family (Page Jr 2020). A few of them are conversant with the Odia language and use the Odia script. The Bondo are an endogamous community divided into two totemic clans, namely Ontal (cobra) and Killo (tiger) (Page Jr 2020). These are further divided into a number of exogamous lineages. The members of a lineage are related through a common mythical descent and remain together under a common leader and a magico-religious head. In Bondo society, spouses are selected when a boy attains 10–12 years of age and a girl attains 15–20 years of age. The youth dormitories play an important role in the selection of spouses (Page Jr 2020). The Bondo are generally monogamous. The traditional and primary occupation of the Bondo is shifting cultivation; however, they are also engaged in the collection of forest produce, hunting, fishing, animal husbandry, settled cultivation, wage earning, trade and petty service (Nandy 2016). The Bondo religion is nowadays mixed with Hinduism (Singh 1994). The literacy rate among the Bondo is very low: according to the 1981 census, it is only 3.61%. In physical appearance, the Bondo exhibit features such as a short stature, a mesocephalic head shape, a broad nose and a round face (Karve 1954). For the present study, Bondo blood samples were collected from the state of Odisha.

Table 7.6 Molecular Diversity Indices among the Bhotia
Index                                   Value
Mean number of pairwise differences     16.8000 ± 8.742
Nucleotide diversity                    0.00117 ± 0.0007
Sum of squared deviation (SSD)          0.04810 (P = 0.700)
Harpending’s raggedness index (HRI)     0.11555 (P = 0.6200)
Theta Pi                                16.800 ± 10.0955
Theta S                                 20.5839 ± 9.995
Tajima’s D                              −1.1765 (P = 0.108)
Fu’s Fs                                 −0.2191 (P = 0.270)

Fig. 7.25 Mismatch distributions of nucleotide differences of Bhotia population (x-axis: No. of Nucleotide Difference; y-axis: No. of Samples; series: Observed and Simulated)

Fig. 7.26 Y chromosome haplogroups of Bondo

Paternal Lineage (Y Chromosomal Haplogroups)
Screening of the Y-SNPs assigned the Y chromosomes of the 35 individuals to three haplogroups (Fig. 7.26). Haplogroup F* has the highest frequency (49%), followed by haplogroup O2a (40%) and H2* (11%). Haplogroup F* is found mostly among Dravidian-speaking populations and among Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. The O2a haplogroup in the Bondo indicates admixture of Austro-Asiatic genes. Haplogroup H2* is distributed among the Dravidian and central Indian tribes.

Chenchu
The origin of the Chenchu is connected to Lord Mallikarjun of Srisailam temple, who was the personification of Lord Shiva. The Manusmriti mentions a tribe called the Chenchus and treats them on a par with the Andhra people. According to the 2011 census, their population strength is 64,227, of which 32,196 were males and 32,031 were females (sex ratio = 995). A total of 57,187 Chenchus live in rural areas. The Chenchus are distributed in three areas of Andhra Pradesh, namely Mahabub Nagar, Karnool and Guntur. Traditionally, they were hunter-gatherers, but at present, they work as day-labourers and in agriculture. They are a Dravidian tribe, speaking a language with Telugu accents. They mix freely with the people from the plain areas, but commensal norms are applied to the Mala, Madiga and some others. The majority of them practise agriculture or work as agricultural labourers. Marriages are arranged through negotiation and by elopement. Cross-cousin marriage is preferred: marriage with the mother’s brother’s daughter or the father’s sister’s daughter is followed in this type of preferential marriage. Mostly they follow Hinduism. Some anthropologists are of the view that there can be little doubt that the Chenchus and Yanadis are descended from the same stock. The Chenchu have 26 exogamous clans (Singh 1994). The Chenchu blood samples were collected from the state of Andhra Pradesh. The Chenchu are of medium stature, with a long and narrow head shape, a round or oval facial profile and a short nose of medium breadth. The distribution of finger patterns among the Chenchus generally shows a preponderance of loops (69.20%) over whorls (29.2%) (Rao et al. 1983). Most of the Dravidian-speaking tribes have a higher incidence of loops than whorls. The Pattern Intensity Index is 13.85.

A study by Goud and Rao (1979) among the Chenchu tribes of Andhra Pradesh showed a very low incidence of colour blindness (1.06%). Studies among the tribes of Tamil Nadu by Buchi in 1959 showed 57.8% PTC tasters and 42.2% PTC non-tasters, with a T gene frequency of 35.01 and a t gene frequency of 64.99. Ramesh et al. (1980) studied abnormal haemoglobin among the Chenchus and found 8% sickle cell trait (HbAS), with an S gene frequency of 6.0. The phenotype and allele frequencies for the ABO blood group system among the Chenchus were also studied by Ramesh et al. (1980). Blood group ‘O’ predominates with the highest frequency (33.11%), followed by ‘A’ (29.8%) and ‘B’ (29.8%). The frequency of the ‘AB’ blood group among the Chenchus is 7.14%. The gene frequency for the ‘d’ gene was 32.2. The same study revealed a gene frequency of 19.2 for HP1 and 80.8 for HP2 in Mahbubnagar district, Andhra Pradesh. Ramesh et al. (1980) also worked on the gene frequency of transferrin variants among the Chenchu of Mehbubnagar district, Andhra Pradesh, and reported a gene frequency of 99.6 for the Tfc variant and 0.4 for the Tfd variant. Papiha et al. (1987), working on the Group-Specific Component (GC) system, found a gene frequency of 76.80 for the GC1 variant, 22.70 for the GC2 variant and 0.50 for the GCR variant. In the red cell acid phosphatase (RCAP) system, the frequency of the Pa gene is 38.13 and that of the Pb gene is 61.87 among the Chenchus (Ramesh et al. 1980). The frequency of the PGM1 gene is 75.0% and that of the PGM2 gene is 25.00%.

Paternal Lineage (Y Chromosomal Haplogroups)
The screening of 40 individuals for Y-SNPs revealed the presence of four haplogroups. These are R2, R1A1*, J2B1 and L1. Here, R2* shows the highest preponderance (40%), followed by R1A1* (32%), J2B2 (25%) and L1 (2.5%) (Fig. 7.27). Haplogroup R2 is predominantly of Central Asian lineage.

Fig. 7.27 Y chromosome haplogroups of Chenchu

Maternal Lineage (mtDNA Haplogroups)
A total of 13 mtDNA genomes were selected for complete sequencing, and all of them (100%) fall under the M-haplogroup (Fig. 7.28). Five maternal lineages belonging to haplogroup M were found in the Chenchu population. Haplogroup M30 has the highest frequency (38%), followed by M2 (31%), M6 (15%), and M5 and M33 (8% each). Haplogroups M39 and M5 are distributed in South Asia. Haplogroup M2 is distributed in South East Asia. Haplogroup M6 is also distributed in South Asia, with its highest concentration in mid-eastern India and Kashmir. The founder age of the Chenchu population was 36 ± 10 ky.

Molecular Diversity
Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Chenchu population is 0.001660 ± 0.000876. As a measure of genetic variation, it is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships (Table 7.7). Figure 7.29 shows the mismatch distribution of the Chenchu population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Chenchu population, which indicates a recent expansion. The small values of the sum of squared deviation (0.01241353) and Harpending’s raggedness index (r = 0.03526144) also confirm that the Chenchu have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. Among the Chenchu population, a D of −0.95026 indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (−3.62978) is also evidence of a recent population expansion in the Chenchu community.
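The mismatch distribution and the raggedness index referred to above can be sketched as follows. The sketch assumes aligned, equal-length sequences and uses the commonly stated form of Harpending’s index, the sum of squared differences between neighbouring classes of the distribution; the function names and the example distributions are hypothetical and do not reproduce the software or data used for the tables.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)]
    counts = Counter(diffs)
    return [counts.get(k, 0) / len(diffs) for k in range(max(diffs) + 1)]


def raggedness(freqs):
    """Sum of squared differences between successive mismatch classes."""
    padded = list(freqs) + [0.0]  # a trailing empty class closes the distribution
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Hypothetical distributions: one smooth and unimodal, one ragged and multimodal.
smooth = [0.1, 0.2, 0.4, 0.2, 0.1]
ragged = [0.4, 0.0, 0.3, 0.0, 0.3]
r_smooth = raggedness(smooth)
r_ragged = raggedness(ragged)

# Mismatch distribution of a toy sample of four aligned sequences.
toy_dist = mismatch_distribution(["AAAA", "AAAT", "AATT", "ATTT"])
```

A smooth, unimodal distribution yields a smaller index than a ragged one, which is why low raggedness values are read as support for expansion.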

Fig. 7.28 mtDNA phylogenetic tree of M-haplogroup among the Chenchu

Damor
The Damor claim to be of Rajput origin and recall their migration from Gujarat. According to the 2011 census, their population strength was 91,463, of which 46,037 were males and 45,426 were females (sex ratio = 987). A total of 89,974 of them live in rural areas. They are mainly distributed in Dungarpur, especially in the border areas of Rajasthan and Gujarat. The Indo-European language Vagri is their mother tongue, and they are also conversant with Gujarati and Mewari. The Damor have commensal relations with the Jains, Rajputs, Brahmins, Gadia Lohar, Banjara, Dangi and others, which are reciprocated by some of them. Their traditional occupation is agriculture. Monogamy is the common form of marriage, but polygamy is also allowed. They practise child marriage, which is finalized through negotiation. The Damor profess the Hindu religion. The Damor are divided into two groups, namely Rajasthani Damor and Gujarati Damor. The Gujarati Damor are treated as socially higher than the Rajasthani Damor (Singh 1994). For the present study, Damor blood samples were collected from the state of Rajasthan. Sickle cell trait was found at 4.28%; the ‘A’ gene frequency was reported as 97.86 and the S gene frequency as 2.14 (Bhasin 1994).
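The gene frequencies quoted for the Damor follow from simple allele counting. The sketch below assumes that every carrier is a heterozygote (HbAS) and that no HbSS homozygotes occur in the sample, so that each carrier contributes one S allele out of two; the function name is illustrative.

```python
def allele_freqs_from_carriers(carrier_pct):
    """Return (A %, S %) assuming carriers are heterozygotes and no SS homozygotes."""
    s_pct = carrier_pct / 2   # one S allele per carrier, two alleles per person
    a_pct = 100 - s_pct
    return round(a_pct, 2), round(s_pct, 2)


# Damor: 4.28% sickle cell trait, as quoted from Bhasin (1994).
damor_a, damor_s = allele_freqs_from_carriers(4.28)
```

Under this assumption a trait frequency of 4.28% gives an S gene frequency of 2.14 and an A gene frequency of 97.86, matching the reported values.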

Table 7.7 Molecular Diversity Indices among the Chenchu
Index                                   Value
Mean number of pairwise differences     20.76923 ± 9.765774
Nucleotide diversity                    0.00166 ± 0.000876
Sum of squared deviation (SSD)          0.01241 (P = 0.700)
Harpending’s raggedness index (HRI)     0.03526 (P = 0.2700)
Theta Pi                                20.76923 ± 10.962
Theta S                                 26.4139 ± 10.049
Tajima’s D                              −0.95026 (P = 0.170)
Fu’s Fs                                 −3.6297 (P = 0.048)

Fig. 7.29 Mismatch distribution of nucleotide differences of Chenchu population

Paternal Lineage (Y Chromosomal Haplogroups)
Screening of the Y-SNPs assigned the Y chromosomes of the 38 individuals to four haplogroups (Fig. 7.30). Haplogroup C5* has the highest frequency (84%), followed by haplogroup R2* (8%), haplogroup H* (5%) and haplogroup F* (3%). Haplogroup C5* is found in high frequency among the Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, the Russian Far East, Polynesia and Australia, and occurs at moderate frequency in the Korean Peninsula and Manchuria. It also displays a high frequency in modern Indian populations. Haplogroup R2 is another signature of Indian tribes, and haplogroup F* is found mostly among Dravidian-speaking populations and among Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia.

Fig. 7.30 Y chromosome haplogroups of Damor

Dhodia
The term Dhodia is derived from ‘Dhulia’, a place in Maharashtra from where two Rajput princes came to this area. They met two beautiful Nayaka women and married them. Thus, a new community was formed, and their progeny are called Dhodia. According to the 2011 census, their population strength was 635,695, of which 318,087 were males and 317,608 were females (sex ratio = 998). A total of 89,974 Dhodias live in rural areas. The Dhodia are mainly distributed in the Surat and Valsad districts of Gujarat. Their mother tongue is Dhodia, which belongs to the Bhili group of the Indo-European family of languages. The Gujarati language is used for inter-group communication. The Nayaka maintain a social relationship with the Dhodia. They depend on agriculture and agricultural labour. In addition, they are also engaged in fishing. They are monogamous, and their marriages are generally fixed through negotiation. They follow Hinduism (Singh 1994). For the present study, Dhodia blood samples were collected from the state of Gujarat. Negi (1968) reported a sickle cell carrier frequency of 17.84%, an A gene frequency of 91.08% and an S gene frequency of 8.92%. The A blood group is dominant in this population (38.3%), followed by O (37.8%), B (16.4%) and AB (7.5%). The frequency of the p gene is 18.4, q = 23.04 and r = 58.3 (Negi 1990).

Fig. 7.31 Y chromosome haplogroups of Dhodia

Paternal Lineage (Y Chromosomal Haplogroups)
Screening of the Y-SNPs assigned the Y chromosomes of the 48 Dhodia individuals to 12 haplogroups (Fig. 7.31). Haplogroup R1A1* has the highest frequency (27%), followed by haplogroup H1A* (17%), haplogroup R2 (13%), H2* (10%), J2B1 and J2b2* (9% each), O2A* and P* (4% each), and H*, J*, L1 and Q1* (2% each). Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup H1A* is distributed among Dravidian and Central Indian communities. Haplogroup R2 is predominantly of Central Asian lineage. Haplogroup J is distributed in North Africa, the Caucasus, South East Europe, central Asia, Iran, Pakistan and western India. Haplogroup O2a indicates admixture of Austro-Asiatic genes, while L1 is typically distributed among the Dravidian communities of India.

Dirang Monpa
The Dirang Monpa are believed to be the original inhabitants of the area. According to the 2011 census, their population strength was 43,709, of which 21,150 were males and 22,559 were females (sex ratio = 1067). A total of 37,744 Dirang Monpas live in rural areas. The majority of them inhabit the Dirang circle of West Kameng district, Arunachal Pradesh. They speak the Sangla dialect of the Monpa language, which belongs to the Bhotia group of the Tibeto-Burman language family. Hindi and Assamese are used for inter-group communication. They have close religious links with the Sherdukpen, Tawang Monpa and other Lamaistic Buddhist groups. They are primarily agriculturists and are also engaged in forestry and hunting. Monogamy is common, but polyandrous unions are also permitted. They practise adult marriage. In the remote past, clan exogamy was practised. The majority of them are Buddhists; some are Hindus and Christians. Pre-marital and extra-marital relationships are tolerated by the community to some extent (Singh 1994). For the present study, Dirang Monpa blood samples were collected from the state of Arunachal Pradesh. Physically, the Dirang Monpa have a below medium stature and are metriocephalic. The distribution of finger patterns among the Dirang Monpa generally shows a preponderance of whorls (63.90%) over loops (35.49%), with 0.61% arches (Goswami and Das 1990). The Pattern Intensity Index is 16.21. The Furuhata’s index is 180.00 among the males and 167.00 among the females. In the ABO blood group system, the frequency of the O gene (r) is the highest (54.20), followed by the A gene (p, 24.80) and then the B gene (q, 21.00) (Goswami and Das 1990). With regard to phenylthiocarbamide (PTC) tasting ability, the frequency of non-tasters is the highest among the Dirang Monpa.

Paternal Lineage (Y Chromosomal Haplogroups)
The screening of 40 Dirang Monpa individuals for Y-SNPs revealed the presence of six haplogroups. These are O3a3c1*, D1a*, D3*, K*, H* and P*. Here O3a3c1* shows the highest preponderance (58%), followed by D1a* (20%), D3* (10%), K* (7%), H* (2%) and P* (2%) (Fig. 7.32).

Fig. 7.32 Y chromosome haplogroups of Dirang Monpa

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 100 individuals were scanned for maternal lineages in the population. Of these, 32 under the M-haplogroup and 12 under the N-haplogroup were selected on the basis of HVR1 motifs and completely sequenced. A total of nine maternal lineages belonging to haplogroup M were found among the Dirang Monpa population (Fig. 7.33). Of these, 34% belong to M49 and 31% to M8, while M67, M9, D and M62 account for 6% each and G, M13 and M35 for 3% each. The founder age of the Dirang Monpa was estimated to be 45 ± 15 ky based on the high preponderance of M2 maternal lineage. The N-haplogroup of the 12 individuals of the Dirang Monpa were all assigned to five haplogroups (Fig. 7.34). Haplogroups A17, B4′5, F and R2 have shown 25% frequency, followed by F1 (17%) and A1 (8%). Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup B is believed to have arisen in Asia some 50,000 years before present, and its ancestral haplogroup was haplogroup R. Haplogroup F is fairly common in East Asia and Southeast Asia. Higher frequencies occur in some areas, such as Nicobar (50%) and Arunachal Pradesh (31%) in India, and among the Shors people of Siberia (44%). Haplogroup R2 is found mainly in Balochistan in Pakistan. In India it is found in Rajasthan and Uttar Pradesh, and it also occurs in Iran, Georgia and Turkey. Sub-haplogroup R2 was rare among Indian samples, and the age of this sub-haplogroup was calculated to be 11,400 ± 9000 ybp.

Molecular Diversity
Molecular diversity indices are shown in Table 7.8. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Dirang Monpa population is 0.001585 ± 0.000794. As a measure of genetic variation, it is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.35 shows the mismatch distribution of the Dirang Monpa population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Dirang Monpa population, which indicates a recent expansion. The small values of the sum of squared deviation (0.02263933) and Harpending’s raggedness index (r = 0.03533104) also confirm that the Dirang Monpa have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. Among the Dirang Monpa population, the negative value of Tajima’s D (−1.37233) indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (−14.02616) is also evidence of a recent population expansion in the Dirang Monpa community.
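Nucleotide diversity can be understood as the mean number of pairwise differences divided by the length of the sequence compared. The sketch below assumes aligned, equal-length sequences without gaps; the function names and toy sequences are illustrative, and the sequence length of 16,569 bp (the length of the revised Cambridge Reference Sequence) is an assumption, since the text does not state the length analysed.

```python
from itertools import combinations

MTDNA_LENGTH = 16_569  # assumed sequence length in base pairs (rCRS)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Toy sample of four aligned sequences (hypothetical data).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
toy_k = mean_pairwise_differences(toy)
toy_pi = nucleotide_diversity(toy)

# Dirang Monpa: mean number of pairwise differences from Table 7.8.
dirang_pi = 26.25806 / MTDNA_LENGTH
```

Under the assumed length, the Dirang Monpa mean of 26.25806 pairwise differences corresponds to a per-site value of about 0.001585, in line with the nucleotide diversity reported above.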

Fig. 7.33 mtDNA phylogenetic tree of M-haplogroup among the Dirang Monpa

Fig. 7.34 mtDNA phylogenetic tree of N-haplogroup among the Dirang Monpa

Dungri Bhil
The forefathers of the Dungri Bhil migrated from Mewar in Rajasthan about 300,000 years ago. According to the 2011 census, their population strength is 4,215,603, of which 2,133,216 were males and 2,082,387 were females (sex ratio = 976). A total of 3,881,972 Dungri Bhils live in rural areas. Among themselves they speak Dungri, a dialect of Bhil, which belongs to the Indo-European family of languages. They communicate with others in Gujarati as well as in Hindi. The other Bhil groups do not consider the Dungri as equal to them. They are mainly settled agriculturists. There is no strict rule of clan exogamy. A man cannot marry his maternal uncle’s daughter, father’s sister’s daughter, father’s brother’s daughter or father’s maternal uncle’s daughter. Widow remarriage is not permitted, though a widower can remarry. They are followers of Hinduism (Singh 1994). For the present study, Dungri Bhil blood samples were collected from the state of Gujarat. Regarding the ABO blood group, the highest frequency is found for O (37.13%), followed by A (27.37%), B (26.56%) and AB (8.94%) (Majumdar and Krishen 1947).

Table 7.8 Molecular Diversity Indices among the Dirang Monpa
Index                                   Value
Mean number of pairwise differences     26.25806 ± 11.820
Nucleotide diversity                    0.00158 ± 0.0007
Sum of squared deviation (SSD)          0.02263 (P = 0.000)
Harpending’s raggedness index (HRI)     0.03533 (P = 0.000)
Theta Pi                                26.25806 ± 13.14857
Theta S                                 40.9709 ± 12.692
Tajima’s D                              −1.3723 (P = 0.065)
Fu’s Fs                                 −14.0261 (P = 0.000)

Fig. 7.35 Mismatch distribution of nucleotide differences of Dirang Monpa population

Paternal Lineage (Y Chromosomal Haplogroups)
The screening of 60 individuals for Y-SNPs revealed the presence of 13 haplogroups. These are C5*, H1a*, H2*, R2, J2A, G1*, O2A*, K*, R1A1, F*, J*, H* and L1. Here, C5* shows the highest preponderance (15%), followed by H2* (13%), R2 (12%), J2A and G1* (10% each), O2A* (8%), H1a*, K* and R1A1 (7% each), F* (5%), J* (3%), and H* and L1 (2% each) (Fig. 7.36). Haplogroup C5 is found in high frequency among the Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, the Russian Far East, Polynesia and Australia, and occurs at moderate frequency in the Korean Peninsula and Manchuria. It also displays a high frequency in modern Indian populations.

Fig. 7.36 Y chromosome haplogroups of Dungri Bhil

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 118 individuals were scanned for maternal lineages in the population. Of these, 43 under the M-haplogroup and 19 under the N-haplogroup were selected on the basis of the HVR1 motif and completely sequenced. A total of ten maternal lineages belonging to haplogroup M were found among the Dungri Bhil population (Fig. 7.37). Of these, 35% belong to M5, 16% each to the M30 and M33 lineages, 9% each to M2 and M37, 7% to the M57 lineage and 2% each to the M3 and D lineages. The founder age of the Dungri Bhil was estimated to be 44 ± 10 ky based on the high preponderance of the M5 maternal lineage. The N-haplogroup of the 19 individuals of the Dungri Bhil were all assigned to ten haplogroups (Fig. 7.38). Haplogroup U2 has the highest frequency (21%), followed by haplogroups R2 and W (16% each). Further, haplogroups B4′5 and A21 have a frequency of 11% each, and J1, R6, R30, U5 and U7 have the same frequency, i.e. 5% each. Haplogroup U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in SW Arabia. The coalescence age of Indian and Western-Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp. Haplogroup B is believed to have arisen in Asia some 50,000 years before present. Its ancestral haplogroup was haplogroup R. Its greatest variety is found in China, and haplogroup B may have had its earliest diversification in southern China and/or Southeast Asia. Haplogroup B is found frequently in southeastern Asia. Haplogroup W is found in Western Eurasia and South Asia. Haplogroup R30 is found in Andhra Pradesh and Uttar Pradesh (India), among the Tharu people of Nepal and among the Sinhalese people of Sri Lanka. Haplogroup J is found in highest frequency in the Near East (12%), 21% in Saudi Arabia and declines towards Europe at 11%, Caucasus 8%, North Africa 6% and becomes practically missing in East Asia. Haplogroup U5 is present in approximately 11% of all Europeans and 10% of European-Americans. Its highest frequency is among the Sami people. Haplogroup U7 is considered a West Eurasian-specific haplogroup, believed to have originated in the Black Sea area approximately 30,000 years ago.

Molecular Diversity
Molecular diversity indices are shown in Table 7.9. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Dungri Bhil population is 0.001134 ± 0.000569. As a measure of genetic variation, it is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.39 shows the mismatch distribution of the Dungri Bhil population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Dungri Bhil population, which indicates a recent expansion. The small values of the sum of squared deviation (0.00399225) and Harpending’s raggedness index (r = 0.00764279) also confirm that the Dungri Bhil have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. Among the Dungri Bhil population, the negative value of Tajima’s D (−1.34943) indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (−24.20270) is also evidence of a recent population expansion in the Dungri Bhil community.


Fig. 7.37 mtDNA phylogenetic tree of M-haplogroup among the Dungri Bhil


Fig. 7.38 mtDNA phylogenetic tree of N-haplogroup among the Dungri Bhil

Gadia Lohar

Gadia Lohars (also known as Gaduliya Lohars) are a nomadic community of Rajasthan, India (Mishra 1977). They are also found in the Malwa region of Madhya Pradesh. They are lohar (ironsmiths) by profession and move from place to place on bullock carts, which in Hindi are called gadi, hence the name ‘Gadia Lohar’ (Ruhela 1968; Singh 1998). These Lohars are different from the Lohar clan of Iran, Pakistan and India. They usually make and repair agricultural and household implements. Their origin is shrouded in legend. Their forefathers were blacksmiths in the army of Maharana Pratap of Mewar. When Mewar fell to the Mughals, they pledged never to return to their homeland and never to settle anywhere else until the Rana’s hegemony was restored (Tehrani 2015). The bullock carts are their homes. Despite the vagaries of weather and the uncertainties of their trade, they, the children of the desert, are a handsome and cheerful lot and remain buoyantly dignified, unmindful of their hard life (Singh 1994). For the present study, Gadia Lohar blood samples were collected from the state of Rajasthan.

Table 7.9 Molecular diversity indices among the Dungri Bhil

Index                                  Value                 P
Mean number of pairwise differences    18.7884 ± 8.489
Nucleotide diversity                   0.00113 ± 0.0005
Sum of squared deviation (SSD)         0.00399               0.600
Harpending’s raggedness index (HRI)    0.00764               0.3300
Theta (Pi)                             18.78848 ± 9.427
Theta (S)                              29.8145 ± 8.785
Tajima’s D                             −1.3494               0.068
Fu’s Fs                                −24.2027              0.000


Fig. 7.39 Mismatch distribution of nucleotide differences of Dungri Bhil population

Fig. 7.40 Y chromosome haplogroups of Gadia Lohar

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 53 individuals were assigned to five haplogroups (Fig. 7.40) by screening Y-SNPs. Haplogroup R1A1* has the highest frequency (79%), followed by haplogroup F* (9%), haplogroup H1a* (8%), and H* and L1 (2% each). Haplogroup R1A1 is widely distributed in Eurasia. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H1a* is distributed among Dravidian and Central Indian communities, while haplogroup L1 is typically distributed among Dravidian linguistic groups of India.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 60 Gadia Lohar individuals were scanned for maternal lineages. Gadia Lohar maternal lineages comprise 65% Asian macro-haplogroup M and 35% European macro-haplogroup N. A total of seven maternal lineages belonging to haplogroup M were found in the Gadia Lohar population (Fig. 7.41). Haplogroup M33 has the highest frequency (36%), followed by M5 (23%), M3 (13%), M2, M30 and M65 (8% each) and M57 (5%). Haplogroup M33 is distributed in South Asia, Belarus and Southern China. Haplogroups M3, M5 and M2 are all distributed in South Asia. Haplogroup M30 is distributed mainly in India, the Middle East and North Africa. The founder age of the Gadia Lohar population was 51 ± 9 ky. The N-haplogroup lineages of the 21 Gadia Lohar individuals were assigned to four haplogroups (Fig. 7.42). Haplogroup U2 has the highest frequency (52%), followed by haplogroups U7 and T (19% each) and haplogroup R30 (10%). U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in SW Arabia. The coalescence age of Indian and Western-Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp. Haplogroup U7 is considered West Eurasian-specific and is believed to have originated in the Black Sea area approximately 30,000 years ago. The coalescence time of haplogroup U7 was estimated to be 32,000 ± 5500 years. Haplogroup T has its highest frequency in the Caspian region (Caucasus, Northern Iran and Turkmenistan). It is important in Europe (almost 10%), the Middle East, Central Asia, Pakistan and North Africa, and is found at low frequency in the Horn of Africa and India. Haplogroup R30 is found in Andhra Pradesh, Uttar Pradesh (India), in the Tharu people from Nepal and Sinhalese people from Sri Lanka.

Molecular Diversity

Nucleotide diversity is a measure of genetic variation that quantifies the degree of polymorphism within a population, and it can be calculated directly from the DNA sequences. Nucleotide diversity in the Gadia Lohar population is 0.001977 ± 0.000976. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.10). Figure 7.43 shows the mismatch distribution of the Gadia Lohar population; the smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.0021) and Harpending’s raggedness index (r = 0.0020) also support a recent demographic expansion of the Gadia Lohar. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Gadia Lohar, the negative value of Tajima’s D (−1.329) suggests that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−24.09776) is further evidence of a recent population expansion of the Gadia Lohar community.
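The mismatch distribution and Harpending’s raggedness index referred to above can be illustrated with a short Python sketch. It assumes the usual definition of the raggedness index, the sum over i = 1 … d + 1 of (x_i − x_{i−1})², where x_i is the relative frequency of sequence pairs differing at i sites and d is the largest observed number of differences. The toy sequences and function names are hypothetical and are not the study data or the software used by the authors.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, ..., d sites."""
    counts = Counter(
        sum(1 for a, b in zip(s, t) if a != b) for s, t in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    return [counts.get(i, 0) / n_pairs for i in range(max(counts) + 1)]


def raggedness_index(freqs):
    """Sum of squared differences between successive mismatch classes."""
    padded = list(freqs) + [0.0]  # class d + 1 is empty by definition
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Hypothetical toy alignment: four sequences, ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
freqs = mismatch_distribution(toy)  # pairs with 0, 1 and 2 differences
r_toy = raggedness_index(freqs)
print(freqs, r_toy)
```

A smooth, unimodal distribution gives a small index, whereas a ragged, multimodal distribution gives a larger one, which is why small values are read as support for expansion.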

Fig. 7.41 mtDNA phylogenetic tree of M-haplogroup among the Gadia Lohar

Fig. 7.42 mtDNA phylogenetic tree of N-haplogroup among the Gadia Lohar

Table 7.10 Molecular diversity indices among the Gadia Lohar

Index                                  Value                 P
Mean number of pairwise differences    25.2728 ± 11.246
Nucleotide diversity                   0.0019 ± 0.0009
Sum of squared deviation (SSD)         0.00212               0.810
Harpending’s raggedness index (HRI)    0.00207               0.910
Theta (Pi)                             25.27288 ± 12.473
Theta (S)                              40.530 ± 11.068
Tajima’s D                             −1.329                0.070
Fu’s Fs                                −24.097               0.000

Fig. 7.43 Mismatch distribution of nucleotide differences of Gadia Lohar population

Galong

The mythology of the Galong recalls that their ancestor was Abo Tani. They trace their origin to a place called ‘NyimePui’ at the northern snow peak, from where they migrated towards the south. According to the 2011 census, their population strength is 79,327, of whom 38,901 were males and 40,426 were females (sex ratio = 1039). A total of 59,163 Galong live in rural areas. They are concentrated in the areas of Along and Basar of Arunachal Pradesh. They speak the Galong dialect, which belongs to the Adi group of the Tibeto-Burman family of languages, and converse with others in Adi, Assamese, Hindi and Nepali. Among the Adi, it is the Galong who have trade relations with the Bori and, through them, with the Tibetans and the Assamese. Agriculture is considered their primary occupation. Monogamy is the general practice, and polygamy also exists (Singh 1994). They have affinity with the Adi Minyong and are also known as Galo. For the present study, Galong blood samples were collected from the state of Arunachal Pradesh. Physically, the Galong are mostly of short stature (39%), followed by below-medium stature (26%), with a dolichocephalic head form (69%). Among the ABO blood groups, O is the most frequent (40.19%), followed by A (31.52%), B (20.86%) and AB (7.25%). In the ABO blood group system, the Galong and the Minyong have maintained genetic homogeneity. The prevalence of colour blindness was 1.10%. Among finger print patterns, the whorl was predominant (56.42%), followed by the loop (40.99%), while the arch occurred at negligible frequency (2.45%). The prevalence of the non-taster gene among the Galong was 24.60% (Goswami and Das 1990).

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 48 individuals for Y-SNPs revealed the presence of four haplogroups: O3A3C1*, K*, O2A* and R1A1. O3A3C1* is by far the most frequent (94%), followed by K*, O2A* and R1A1 (2% each) (Fig. 7.44). Haplogroup O3A is predominant among the Tibeto-Burman communities of South East Asia.

Fig. 7.44 Y chromosome haplogroups of Galong

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 107 individuals were scanned for maternal lineages. A total of 49 mtDNA genomes were selected for complete sequencing based on HVR I motifs. Eight maternal lineages belonging to haplogroup M were found among the Galong (Fig. 7.45). Of these, 49% belong to D, followed by 15% to M40, 13% to M8, and 8% each to M11 and M37; M10, M12 and G each have a frequency of 3%. The founder age of the Galong was estimated to be 59 ± 12 ky, based on the high preponderance of the D maternal lineage. The N-haplogroup lineages of the ten Galong individuals were assigned to four haplogroups (Fig. 7.46). Haplogroup F1 has the highest frequency (50%), followed by haplogroup A1 (20%), and A7, B4′5 and U2 (10% each). Haplogroup F is fairly common in East Asia and Southeast Asia. Higher frequencies occur in some areas, such as Nicobar (50%) and Arunachal Pradesh (31%) in India and among the Shors people of Siberia (44%). It also occurs at appreciable frequency among Taiwanese aborigines and in Guangdong (China), Maluku (Indonesia), Thailand and Vietnam. The coalescence age was estimated to be 16.7 ± 5.6 kya. Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup B is believed to have arisen in Asia some 50,000 years before present from its ancestral haplogroup, R. Its greatest diversity is in China, and it may have undergone its earliest diversification in southern China and/or Southeast Asia. Haplogroup B is found frequently in southeastern Asia. U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in SW Arabia. The coalescence age of Indian and Western-Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp.

Fig. 7.45 mtDNA phylogenetic tree of M-haplogroup among the Galong

Molecular Diversity

Nucleotide diversity is a measure of genetic variation that quantifies the degree of polymorphism within a population, and it can be calculated directly from the DNA sequences. Nucleotide diversity in the Galong population is 0.001696 ± 0.000843. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.11). Figure 7.47 shows the mismatch distribution of the Galong population; the smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.00271075) and Harpending’s raggedness index (r = 0.00326364) also support a recent demographic expansion of the Galong.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Galong, the negative value of Tajima’s D (−1.77933) suggests that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−18.96397) is further evidence of a recent population expansion of the Galong community.
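To make the comparison behind Tajima’s D concrete, the sketch below computes it from a sample size, the mean number of pairwise differences and the number of segregating sites. It assumes the standard textbook formulation of the statistic (the normalising constants a1, a2, b1, b2, c1, c2, e1 and e2); the toy alignment and function names are hypothetical, and this is not necessarily how the values reported here were produced.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(n, k, s):
    """Tajima's D from sample size n, mean pairwise differences k and
    number of segregating sites s."""
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: theta estimated from pairwise differences minus theta
    # estimated from segregating sites.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: four sequences, ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pairs = list(combinations(toy, 2))
k_toy = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
s_toy = sum(1 for column in zip(*toy) if len(set(column)) > 1)
d_toy = tajimas_d(len(toy), k_toy, s_toy)

# Sign check with the two theta estimates reported for the Galong (Table 7.11).
theta_pi_galong, theta_s_galong = 28.09716, 53.6909
numerator_sign_negative = theta_pi_galong - theta_s_galong < 0
print(round(d_toy, 3), numerator_sign_negative)
```

Because the theta estimate based on pairwise differences is smaller than the estimate based on segregating sites in Table 7.11, the numerator is negative, which agrees with the negative Tajima’s D described for the Galong.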

Fig. 7.46 mtDNA phylogenetic tree of N-haplogroup among the Galong

Garhwali Brahmin

For this study, Garhwali Brahmin blood samples were collected from the state of Uttarakhand.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 41 individuals were assigned to 11 haplogroups (Fig. 7.48) by screening Y-SNPs. Haplogroup R1A1* has the highest frequency (32%), followed by haplogroup C5* (17%), haplogroup L* (10%), F*, H* and H1a* (7% each), J2B1, N1* and R2 (5% each), and L1 and O3A3C1* (2% each). Haplogroup R1A1 is widely distributed in Eurasia. Haplogroup C5 is found at high frequency among the Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, the Russian Far East, Polynesia and Australia, and occurs at moderate frequency in the Korean peninsula and Manchuria. It is also found at high frequency in the modern Indian population. Haplogroup L is associated with Neolithic migration from Central Asia. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H1a* is distributed among the Dravidian and Central Indian communities. Haplogroup L1 is typically distributed among Dravidian linguistic groups of India.

Table 7.11 Molecular diversity indices among the Galong

Index                                  Value                 P
Mean number of pairwise differences    28.09716 ± 12.564
Nucleotide diversity                   0.001696 ± 0.0008
Sum of squared deviation (SSD)         0.00271               0.660
Harpending’s raggedness index (HRI)    0.00326               0.8000
Theta (Pi)                             28.09716 ± 13.959
Theta (S)                              53.6909 ± 15.844
Tajima’s D                             −1.7793               0.014
Fu’s Fs                                −18.963               0.0000

Fig. 7.47 Mismatch distribution of nucleotide differences of Galong population

Fig. 7.48 Y chromosome haplogroups of Garhwali Brahmin

Garhwali Rajput

The community enjoying non-Khas, or real Rajput, status is the Parmar, titled Thakur or Kunwar, who were the feudal rulers of Garhwal. The royal Parmar of Garhwal claim ancestry from the reputed Parmar kingship of Dharanagar, Malwa. The Garhwali Rajputs are distributed in almost all districts of the Garhwal division. They speak the Garhwali language and write it in the Devanagari script. There is no system of stratification in this community. The social divisions above the family are the mundit-dai-bhai (minimal patrilineage) and the varjit (maximal patrilineage). Both are exogamous, and the ‘varna’ system is known to them. The rule of marriage is lineage exogamy and Rajput endogamy, i.e. they choose their mates outside the Parmar and within the Rajput fold of Jammu, Himachal Pradesh, Garhwal, Kumaon and Nepal, and some also in the plains. In general, monogamy is the rule, but some also practise polygyny. The rule of residence after marriage is patrilocal. Agriculture is the mainstay of the Garhwali Rajput. They supplement their economy by herding, horticulture and poultry farming, and many hold jobs of high status in civil and military departments. They profess Hinduism. For the present study, Garhwali Rajput blood samples were collected from the state of Uttarakhand.

Fig. 7.49 Y chromosome haplogroups of Garhwali Rajput (bar chart of frequencies, %: R1A1* 43.8, H1a* 18.8, H* 15.6, J2B1 9.4, G1* 6.3, D1a* 3.1, R2 3.1)

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 32 individuals were assigned to seven haplogroups (Fig. 7.49) by screening Y-SNPs. Haplogroup R1A1* has the highest frequency (44%), followed by haplogroup H1a* (19%), haplogroup H* (16%), haplogroup J2B1 (9%), haplogroup G1 (6%), and haplogroups D1a* and R2 (3% each). Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup H1a* is distributed among the Dravidian and Central Indian communities. Haplogroup H* is mainly distributed in South Asia (India, Sri Lanka, Nepal and Pakistan), with lower frequency in Afghanistan (Sengupta et al. 2006). Haplogroup J2B1 is widely distributed in Italy, Czechoslovakia and Germany. Haplogroup R2 is a Central Asian lineage. Haplogroup G1 is distributed in Iran and the countries adjoining Iran to the west.
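The haplogroup frequencies quoted above are simple proportions of the 32 sampled individuals. The Python sketch below shows the arithmetic. The per-haplogroup counts are not given in the text; they are an assumption, inferred here as the only whole numbers summing to 32 that reproduce the one-decimal percentages shown with Fig. 7.49, and the function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def haplogroup_percentages(counts):
    """Percentage of individuals per haplogroup, rounded half up to 0.1."""
    total = sum(counts.values())
    result = {}
    for name, count in counts.items():
        pct = (Decimal(100 * count) / Decimal(total)).quantize(
            Decimal("0.1"), rounding=ROUND_HALF_UP
        )
        result[name] = float(pct)
    return result


# Assumed counts (inferred, not reported in the text); they sum to 32.
assumed_counts = {
    "R1A1*": 14,
    "H1a*": 6,
    "H*": 5,
    "J2B1": 3,
    "G1*": 2,
    "D1a*": 1,
    "R2": 1,
}
percentages = haplogroup_percentages(assumed_counts)
print(percentages)
```

With a sample of 32, each individual contributes about 3.1 percentage points, so the two lowest frequencies (3% each) would correspond to single individuals under these assumed counts.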

Ghorkha

The Ghorkha, alternatively spelled Gorkha, are soldiers from Nepal. Historically, the terms Gurkha and Gorkhali were synonymous with Nepali and derived from the hill town and district of Gorkha, from which the Kingdom of Nepal expanded. The name may be traced to the medieval Hindu warrior-saint Guru Gorakhnath, who has a historic shrine in Gorkha. The word itself is derived from Go-Raksha, raksha becoming rakha. Rakhawala means protector and is derived from raksha as well. The Gorkha were basically Hindus in Nepal who were called protectors of cows. As per Hindu traditions, cows denote wealth, prosperity and power. Gorkha is the place from which the campaign to form greater Nepal began, with troops of the Chhetri people under the leadership of King Prithvi Narayan Shah. There are Gurkha military units, enlisted in Nepal, in the Nepalese, British and Indian (Gorkha) armies. For the present study, Gorkha blood samples were collected from the state of Uttarakhand.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 21 individuals were assigned to ten haplogroups (Fig. 7.50) by screening Y-SNPs. Haplogroup O3A3C1* has the highest frequency (34%), followed by haplogroup R1A1* (23%), haplogroup J2B1 (9%), haplogroups C5*, H*, H1a*, N1* and R2 (6% each), and haplogroups D1a* and H1b (3% each). Haplogroup O3A3C1* is widely distributed in South East Asia. Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup J2B1 is widely distributed in Italy, Czechoslovakia and Germany. Haplogroup C5 is found at high frequency among the Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, the Russian Far East, Polynesia and Australia, and occurs at moderate frequency in the Korean peninsula and Manchuria. It is also found at high frequency in the modern Indian population. Haplogroup D1A is widely distributed in Japan, Central Asia and the Andaman Islands in the Bay of Bengal.

Fig. 7.50 Y chromosome haplogroups of Ghorkha

Kolam

The Kolam trace their descent from Bhima and Hirimba, two characters from the epic Mahabharata. According to the 2011 census, their population strength is 194,671, of whom 98,319 were males and 96,352 were females (sex ratio = 980). They are distributed in the Yeotmal, Osmanabad, Chandrapur, Gadchiroli and Nagpur districts of Maharashtra, the Sagar and Betul districts of Madhya Pradesh and the Adilabad district of Andhra Pradesh. They speak Kolami, a Dravidian language, among themselves and Marathi and Hindi with others. Their social structure is identical to that of the Gond. They are mostly cultivators or wage labourers. Monogamy is the common practice. Marriage with one’s mother’s brother’s daughter or father’s sister’s daughter is allowed. Levirate is prohibited, while junior sororate is permissible. They profess Hinduism. In their morphological profile, the Kolam differ significantly from the tribal communities of Andhra Pradesh (Singh 1994). For the present study, Kolam blood samples were collected from the state of Maharashtra. They are of below-medium stature, with a broad facial and nasal profile. Regarding PTC tasting ability, the Kolam population comprised 49.11% tasters and 50.89% non-tasters. The reported prevalence of colour blindness in the Kolam population was 40% (Ramesh et al. 1979).

Fig. 7.51 Y chromosome haplogroups of Kolam


Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 57 Hill Kolam individuals for Y-SNPs revealed the presence of nine haplogroups: F*, O2a*, H1a*, R1a1*, G2*, K*, R2, H* and P*. F* is the most frequent (28%), followed by O2a* (16%), H1a* and R1a1* (14% each), K* (7%), G2* (5%), R2 (4%), H* (2%) and P* (2%) (Fig. 7.51). The F* haplogroup is mainly distributed in North, Central, Western and South India, Sri Lanka, Nepal, Borneo, Java, Sulawesi and Lemdada. O2a* is considered to be the major Southeast Asian lineage.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 122 individuals were scanned for maternal lineages. A total of 25 mtDNA genomes were selected for complete sequencing based on HVR I motifs. Five maternal lineages belonging to haplogroup M were found among the Kolam (Fig. 7.52). Of these, 50% belong to M2, while the M3, M5, M6 and M35 lineages account for the remainder at 25% or 8% each. The founder age of the M2 haplogroup among the Kolam was estimated to be 64 ± 13 ky, based on the high preponderance of the M2 maternal lineage. The N-haplogroup lineages of the 13 Kolam individuals were assigned to four haplogroups (Fig. 7.53). Haplogroup U2 has the highest frequency (54%), followed by haplogroups R8 (23%), R6 (15%) and U7 (8%). U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in South West Arabia. The coalescence age of Indian and Western-Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp. Haplogroup R8 has its highest frequency towards East India, especially within Orissa (12%), and is found among the Austroasiatic tribes (Munda speakers). It is also present at low frequency among Dravidian- and Indo-European-speaking populations. Haplogroup R and its descendants are distributed all over Europe, North Africa, the Horn of Africa, the Near East, the Indian subcontinent, Oceania and the Americas. The origin of haplogroup R dates back to 66.8 kya (thousand years ago), with a 95% confidence interval of 52.6–81 kya. Haplogroup U7 is considered West Eurasian-specific and is believed to have originated in the Black Sea area. The coalescence time of haplogroup U7 was estimated to be 32,000 ± 5500 years.

Fig. 7.52 mtDNA phylogenetic tree of M-haplogroup among the Kolam

Fig. 7.53 mtDNA phylogenetic tree of N-haplogroup among the Kolam

Molecular Diversity

Nucleotide diversity is a measure of genetic variation that quantifies the degree of polymorphism within a population, and it can be calculated directly from the DNA sequences. Nucleotide diversity in the Kolam population is 0.001778 ± 0.000942. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.12). Figure 7.54 shows the mismatch distribution of the Kolam population; the smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Hill Kolam population, which indicates a recent expansion. The small sum of squared deviations (0.01717675) and Harpending’s raggedness index (r = 0.03053260) also support a recent demographic expansion of the Kolam.

Table 7.12 Molecular diversity indices among the Kolam

Index                                  Value                 P
Mean number of pairwise differences    29.4545 ± 13.868
Nucleotide diversity                   0.0017 ± 0.000942
Sum of squared deviation (SSD)         0.01717               0.440
Harpending’s raggedness index (HRI)    0.03053               0.6200
Theta (Pi)                             29.4545 ± 15.615
Theta (S)                              35.76304 ± 14.026
Tajima’s D                             −0.8231               0.195
Fu’s Fs                                −1.8492               0.102


Fig. 7.54 Mismatch distribution of nucleotide differences of Kolam Population

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Kolam, the negative value of Tajima’s D (−0.82318) suggests that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−1.84923) is further evidence of a recent population expansion of the Kolam community.

Hmar

The Hmar migrated to Assam, Manipur and Mizoram from Sinlung, a place believed to be situated somewhere in the north. According to the 2011 census, their population strength is 48,375, of whom 23,851 were males and 24,524 were females (sex ratio = 1028). A total of 40,501 Hmar live in rural areas. They live in the North Cachar district of Assam, in Manipur and in Mizoram. Their mother tongue is Hmar, which belongs to the Kuki-Chin group of the Tibeto-Burman languages. They interact with the Ralte, Lushai, Baite and a few other neighbours. Shifting cultivation (jhum) is their traditional occupation, but at present they are workers, cultivators and labourers. Monogamy is the norm, and marriage with one’s mother’s brother’s daughter is preferred. Negotiation is the accepted way of contracting a marriage. The majority of them are Christians. The Hmar have close socioeconomic interactions with their neighbours, such as the Zemi, Dimasa, Kuki and other Mizos (Singh 1994). For the present study, Hmar blood samples were collected from the state of Manipur. The Hmar show predominantly Mongoloid types of palmar mainline formula, 7.5.5 > 9.7.5 for both sexes, and exhibit a relatively high pattern intensity, with 42–49% whorls (Chakravartti and Mukherjee 1962). In the ABO blood group system, blood group A is found at a greater frequency than B (Bhattacharjee 1975).

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 23 Hmar individuals were assigned to three haplogroups (Fig. 7.55) by screening Y-SNPs. Haplogroup O2A* has the highest frequency (80%), followed by haplogroups F* and O3A3C1* (10% each). Haplogroup O2A is associated with Austro-Asiatic admixture. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup O3A3C1* is widely distributed in South East Asia, and haplogroup R1A1* is widely distributed in Eurasia.

Fig. 7.55 Y chromosome haplogroups of Hmar

Irular

The name Irular is derived from the Tamil word ‘irul’, meaning darkness or night. According to the 2011 census, their population strength is 189,661, of whom 94,521 were males and 95,140 were females (sex ratio = 1007). In Tamil Nadu they are settled in the Nilgiris, Coimbatore, South Arcot and North Arcot districts. Their mother tongue is Irular, and they use Tamil and Telugu with others. The Kota and the Kurumbas accept food and water from the Irular, while the Badaga and Toda do not. Until the beginning of this century, the Irular were a food-gathering tribe. Nowadays all of them depend mainly on wage labour and plantation labour. The Irular are divided into 14 exogamous clans. They prefer marriage alliances between maternal uncle and niece and between cross cousins. Monogamy is the common form, though polygamy is permitted. The Irular follow a tribal religion and observe most of the Hindu festivals (Singh 1994). For the present study, Irular blood samples were collected from the state of Tamil Nadu. They are on average of short or below-medium height and exhibit a long and narrow head shape. Sastry (1990) studied abnormal haemoglobin in 254 individuals and found 31.89% sickle cell carriers and 1.96% SS homozygotes, giving an A gene frequency of 82.09% and an S gene frequency of 17.91%. The prevalence of colour blindness is 4.48%. Sastry (1990) reported ABO blood group prevalences of A (21.9), B (31.1), AB (11.9) and O (31.8). Two studies were carried out among the Irula of the Nilgiri Hills, by Kirk et al. (1962) and Saha et al. (1976). Kirk et al. (1962) reported gene frequencies of 7.20% for HP1 and 92.80% for HP2, whereas Saha et al. (1976) reported 5.90% for HP1 and 94.10% for HP2. Saha et al. (1976) also reported a 100.0% gene frequency of the TfC variant of transferrin among the Irula of the Nilgiri Hills, Tamil Nadu.

Fig. 7.56 Y chromosome haplogroups of Irula
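The A and S gene frequencies quoted above for the Irular follow from the genotype frequencies by gene counting: every SS homozygote carries two S alleles and every carrier one. The short Python sketch below reproduces that arithmetic from the two percentages given in the text; the function name is illustrative, and it assumes that all remaining individuals are AA.

```python
def allele_frequencies(carrier_pct, homozygote_pct):
    """Return (A %, S %) gene frequencies from carrier (AS) and
    homozygote (SS) percentages, by gene counting."""
    s_freq = homozygote_pct + carrier_pct / 2
    return 100 - s_freq, s_freq


# Percentages reported by Sastry (1990) for 254 Irular individuals.
a_freq, s_freq = allele_frequencies(carrier_pct=31.89, homozygote_pct=1.96)
print(round(a_freq, 3), round(s_freq, 3))
```

The result agrees with the reported frequencies of 82.09% and 17.91% to within rounding of the published percentages.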

Paternal Lineage (Y Chromosome Haplogroups)

The screening of 21 Irula individuals for Y-SNPs revealed the presence of ten haplogroups: R2, F*, H*, H1a*, C5, J2B1, L, L1, O2A* and R1A1. R2 is the most frequent (33%), followed by F* (19%), H* and H1a* (10% each), and C5, J2B1, L, L1, O2A* and R1A1 (5% each) (Fig. 7.56). Haplogroup R2 is predominantly a Central Asian lineage.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 55 Irula individuals were scanned for maternal lineages. Irula maternal lineages comprise 73% Asian macro-haplogroup M and 27% European macro-haplogroup N. A total of seven maternal lineages belonging to haplogroup M were found in the Irula population (Fig. 7.57). Haplogroup M3 has the highest frequency (40%), followed by M8CZ (20%), M5 (17%), M35 (7%), and M2, M6 and M36 (5% each). Haplogroups M2, M3, M6, M35 and M5 are all distributed in South Asia. Haplogroup M8CZ is distributed among Eurasian populations. The founder age of the Irula population was 23 ± 6 ky. The N-haplogroup lineages of the 15 Irula individuals were assigned to four haplogroups (Fig. 7.58). Haplogroup U2 has the highest frequency (60%), followed by haplogroups R5 and R30 (20% each). U2 is sparsely distributed, especially in the northern half of the subcontinent. It is also found in South West Arabia. The coalescence age of Indian and Western-Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp. R5 is distributed across groups of the subcontinent, especially in Madhya Pradesh (India), where it reaches 17%. Its coalescence time was estimated to be 66,100 ± 22,000 years. Haplogroup R30 is found in Andhra Pradesh, Uttar Pradesh (India), in the Tharu people from Nepal and Sinhalese people from Sri Lanka. Haplogroup T has its highest frequency in the Caspian region (Caucasus, Northern Iran and Turkmenistan). It is important in Europe (almost 10%), the Middle East, Central Asia, Pakistan and North Africa, and is found at low frequency in the Horn of Africa and India.

Fig. 7.57 mtDNA phylogenetic tree of M-haplogroup among the Irular

Fig. 7.58 mtDNA phylogenetic tree of N-haplogroup among the Irular

Molecular Diversity

Molecular diversity indices are shown in Table 7.13. Nucleotide diversity is a measure of genetic variation that quantifies the degree of polymorphism within a population, and it can be calculated directly from the DNA sequences. Nucleotide diversity in the Irula population is 0.001449 ± 0.000723. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships. Figure 7.59 shows the mismatch distribution of the Irula population; the smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.00469719) and Harpending’s raggedness index (r = 0.00385930) also support a recent demographic expansion of the Irula.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Irula, the negative value of Tajima’s D (−0.90956) suggests that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−22.60318) is further evidence of a recent population expansion of the Irula community.

Table 7.13 Molecular Diversity Indices among the Irular
Index                                   Value                P
Mean number of pairwise differences     23.19743 ± 10.423
Nucleotide diversity                    0.001449 ± 0.0007
Sum of squared deviation (SSD)          0.00469              0.790
Harpending’s raggedness index (HRI)     0.003859             0.9000
Theta Pi                                23.1974 ± 11.579
Theta S                                 30.79785 ± 9.196
Tajima’s D                              −0.9095              0.189
Fu’s Fs                                 −22.603              0.000

Fig. 7.59 Mismatch distribution of nucleotide differences of Irular population

Jarawa

The Jarawa are the descendants of immigrants who at some time in the past made their way across from Little Andaman and thrust themselves upon the inhabitants of Rutland Island and South Andaman, maintaining their footing in the new country by force of arms. According to the 2011 census, their population is 380, of whom 194 were males and 186 were females (sex ratio = 959). All of them live in rural areas. The Jarawa inhabit the west coast reserved area of 765 sq. km in the South and Middle Andaman Islands. They are monolingual and do not know any language other than their mother tongue, Jarawa. Their occupations are hunting, fishing and collecting honey. They appear to be monogamous. They belong to Negrito stock (Singh 1994). They are of short stature, with dark skin and frizzy hair. The data on somatology and dermatoglyphics show a close resemblance to the Asiatic Negrito and not to the African one (Sarkar 1985). For the present study, Jarawa blood samples were collected from the Andaman and Nicobar Islands.
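The sex ratios quoted in these community profiles appear to follow the census convention of females per 1,000 males. A minimal sketch of that arithmetic, using the Jarawa and Jaunsari counts given in the text (the function name is illustrative):

```python
def sex_ratio(males, females):
    """Females per 1,000 males, rounded to the nearest whole number."""
    if males <= 0:
        raise ValueError("male count must be positive")
    return round(1000 * females / males)


jarawa_ratio = sex_ratio(194, 186)          # counts quoted for the Jarawa
jaunsari_ratio = sex_ratio(46020, 42644)    # counts quoted for the Jaunsari
```

Both results agree with the ratios printed in the text (959 and 927).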

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of six individuals for Y-SNPs revealed only one haplogroup, D* (100%). This is mainly an Asian haplogroup, found in North East Asia including Siberia and Central Asia (Fig. 7.60).

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of ten individuals were scanned and, based on HVR1 motifs, completely sequenced under the M- and N-haplogroups to identify the maternal lineages in the population. The maternal lineages are 100% Asian macrohaplogroup M, divided equally between M31 and M32 (50% each). The founder age of the M group was 66 ± 9 ky (Fig. 7.61).

Fig. 7.60 Y chromosome haplogroups of Jarawa

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated directly from the DNA sequences. Nucleotide diversity in the Jarawa population is 0.000967 ± 0.000534. It is a measure of genetic variation that is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.14).

Figure 7.62 shows the mismatch distribution of the Jarawa population. The smooth line is the distribution expected under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The sum of squared deviations (0.20836224) and Harpending’s raggedness index (r = 0.22222222) are also taken to support a recent demographic expansion of the Jarawa.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Jarawa population, the value of 2.69924 indicates that the population may have suffered a recent bottleneck. The negative value of Fu’s Fs (−2.27806) is also taken as evidence of a recent population expansion in the Jarawa community.
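The statement that nucleotide diversity can be calculated directly from the sequences can be illustrated as follows. This is a sketch under simple assumptions (aligned sequences of equal length, every site compared, no gaps), with a toy alignment that is not study data. As a rough consistency check, dividing the mean number of pairwise differences reported for the Jarawa (16.0222) by the length of the human mitochondrial reference genome (16,569 bp, an assumption about the number of sites compared) gives about 0.000967, the value quoted above.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def nucleotide_diversity(sequences):
    """Return (mean pairwise differences, nucleotide diversity per site)."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    mean_diff = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    return mean_diff, mean_diff / len(sequences[0])


# Toy alignment (hypothetical): four sequences of ten sites each.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
mean_diff, pi = nucleotide_diversity(toy)

# Consistency check with the reported Jarawa values, assuming 16,569 sites.
jarawa_pi = 16.0222 / 16569
```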

Fig. 7.61 mtDNA phylogenetic tree of M-haplogroup among the Jarawa

Jaunsari

The history of the Jaunsari community is tied to the Khasa, who are the dominant people of the Khasa region. Other communities, such as the Kolta and Bajgi, represent the non-Aryan stock of the hills. According to the 2011 census, their population is 88,664, of whom 46,020 were males and 42,644 were females (sex ratio = 927). Of these, 82,616 live in rural areas. The community is distributed in the Chakrata Tehsil of Dehradun district. They speak Jaunsari, which belongs to the Pahari group of Indo-European languages. Among the Jaunsari, Brahmins enjoy the highest position in the local socio-religious hierarchy and the Kolta occupy the lowest. The Jaunsari depend primarily on agriculture and partially on animal husbandry. The earlier practice of child marriage has been largely replaced by adult marriage. The traditional form of marriage was fraternal polyandry, which has declined in favour of monogamy. Cases of polygynandry are also common. They are Hindu. They are below medium height, with a long head and a long, narrow nose with a convex bridge. They derive their name from the Jaunsar Bawar region of Uttarakhand (Singh 1994). Regarding the ABO blood group, the O gene frequency is the highest (52.12%) among the Jaunsari, followed by B (29.03%) and A (18.85%) (Banerjee and Banerjee 1967). For the present study, Jaunsari blood samples were collected from the state of Uttarakhand.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 84 individuals were assigned to five haplogroups (Fig. 7.63) by screening the Y-SNPs. Haplogroup H* has the highest frequency (40%), followed by haplogroup M* (26%), haplogroup R2 (25%), haplogroup R* (6%) and haplogroup G* (4%). Haplogroup H* is widely distributed in South Asia, in India, Sri Lanka, Nepal and Pakistan, with lower frequency in Afghanistan. Haplogroup M is mainly distributed in Papua New Guinea, neighbouring Melanesia, Indonesia and among Australian aborigines. Haplogroup R2 is a Central Asian lineage.

Table 7.14 Molecular Diversity Indices among the Jarawa
Index                                   Value                P
Mean number of pairwise differences     16.0222 ± 7.816
Nucleotide diversity                    0.00096 ± 0.0005
Sum of squared deviation (SSD)          0.20836              0.050
Harpending’s raggedness index (HRI)     0.2222               0.3600
Theta Pi                                16.0222 ± 8.840
Theta S                                 10.2510 ± 4.471
Tajima’s D                              2.6992               1.000
Fu’s Fs                                 −2.2780              0.096

Fig. 7.62 Mismatch distribution of nucleotide differences of Jarawa population

Fig. 7.63 Y chromosome haplogroups of Jaunsari

Jenu Kuruba

The Jenu Kuruba derive their name from Jen, or honey, which they collect from the forest. According to the 2011 census, their population is 36,076, of whom 17,948 were males and 18,128 were females (sex ratio = 1010). They are concentrated in the Kodagu and Mysore districts of Karnataka. They speak the ‘JenuNudi’ dialect of Kannada, which belongs to the Dravidian family of languages. They collect honey, herbs, roots and fruits from the forest. Large sections of them are employed as daily-wage labourers. Marriage with the father’s sister’s daughter, the mother’s brother’s daughter and the elder sister’s daughter is practised among them (Singh 1994). They are mainly followers of Hinduism. They resemble the Betta or Kadu Kuruba in anthropometric characteristics. They are short-statured with a mesocephalic head shape and show broad facial and nasal profiles. Sastry (1990) reported a sickle cell carrier prevalence of 8%. Regarding the ABO blood group system, Banerjee et al. (1988) found the A blood group in 49%, the B blood group in 16%, the AB blood group in 8% and the O blood group in 27%. The Rh negative gene frequency among them is 14.14 (Banerjee et al. 1988). Banerjee et al. (1988) found gene frequencies of 22.50 for HP1 and 77.50 for HP2 among the Jenu Kuruba. For the present study, Jenu Kuruba blood samples were collected from the state of Karnataka.
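The ABO figures above are phenotype frequencies, whereas several other profiles in this chapter quote gene (allele) frequencies. One classical way to move from the former to the latter is Bernstein's square-root estimate, which assumes Hardy-Weinberg proportions. The sketch below applies it to the Jenu Kuruba phenotype percentages quoted in the text. It is offered only as an illustration; the source does not state which estimator the cited authors used, and the function name is hypothetical.

```python
from math import sqrt


def bernstein_abo(freq_a, freq_b, freq_ab, freq_o):
    """Uncorrected Bernstein estimates of the A, B and O allele frequencies."""
    total = freq_a + freq_b + freq_ab + freq_o
    a, b, o = freq_a / total, freq_b / total, freq_o / total
    p = 1.0 - sqrt(b + o)  # allele A: B and O phenotypes together are (q + r)^2
    q = 1.0 - sqrt(a + o)  # allele B: A and O phenotypes together are (p + r)^2
    r = sqrt(o)            # allele O: the O phenotype is r^2
    return p, q, r


# Phenotype percentages quoted in the text for the Jenu Kuruba.
p, q, r = bernstein_abo(49, 16, 8, 27)
deviation = 1.0 - (p + q + r)  # how far the raw estimates are from summing to 1
```

The three raw estimates need not sum exactly to one; the size of the deviation gives a rough indication of how well the Hardy-Weinberg assumption fits the phenotype counts.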

Fig. 7.64 Y chromosome haplogroups of Jenu Kuruba

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of 34 individuals for Y-SNPs revealed five haplogroups: H*, F*, H1a*, C5* and P*. H* is the most frequent (56%), followed by F* and H1a* (18% each), C5* (6%) and P* (3%). Haplogroup H1a* is predominant among the Dravidian and central Indian communities and is considered to be a major indigenous Indian haplogroup (Fig. 7.64).
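The percentages above sum to 101%, which is what happens when each haplogroup share is rounded independently. The sketch below shows the effect. The counts are hypothetical: they were chosen to be consistent with the reported percentages for 34 individuals, and the source does not list them.

```python
def haplogroup_percentages(counts):
    """Convert haplogroup counts to whole-number percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no individuals counted")
    return {name: round(100.0 * n / total) for name, n in counts.items()}


# Assumed counts (not given in the source) that add up to 34 individuals.
assumed_counts = {"H*": 19, "F*": 6, "H1a*": 6, "C5*": 2, "P*": 1}
percentages = haplogroup_percentages(assumed_counts)
rounded_total = sum(percentages.values())
```

With these assumed counts the rounded shares reproduce the published figures, and their total exceeds 100 only because of rounding.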

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 114 individuals were scanned for maternal lineages in the population. The maternal lineages comprise 97% Asian macrohaplogroup M and 3% European macrohaplogroup N (Fig. 7.65). Of these, 84 samples under the M-haplogroup and 3 samples under the N-haplogroup were selected for complete sequencing based on HVR I motifs. A total of five maternal lineages belonging to haplogroup M were found among the Jenu Kuruba population: 39% belong to M36, 38% to M3, 15% to M8, 6% to M2 and 1% to M30. The founder age of the Jenu Kuruba was estimated to be 45 ± 12 ky, based on the high preponderance of the M36 maternal lineage.

The three N-haplogroup individuals of the Jenu Kuruba were assigned to two haplogroups (Fig. 7.66). Haplogroup U4 has the highest frequency (67%), followed by haplogroup B (33%). Haplogroup U4 is an ancient mitochondrial haplogroup that originated in the Upper Palaeolithic, approximately 25,000 years ago, and is relatively rare in modern populations. U4 is found in Europe, with the highest concentrations in Scandinavia and the Baltic states, and in the Sami population of the Scandinavian Peninsula (Quintana-Murci et al. 2004; Olga et al. 2002; Clio et al. 2013). Haplogroup B is believed to have arisen in Asia some 50,000 years before present. Its ancestral haplogroup was haplogroup R. Its greatest variety is in China, and it may have undergone its earliest diversification in Southern China and/or Southeast Asia. Haplogroup B is found frequently in South East Asia.

Fig. 7.65 mtDNA phylogenetic tree of M-haplogroup among the Jenu Kuruba

Fig. 7.66 mtDNA phylogenetic tree of N-haplogroup among the Jenu Kuruba

Molecular Diversity

Molecular diversity indices are shown in Table 7.15. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated directly from the DNA sequences. Nucleotide diversity in the Jenu Kuruba population is 0.001427 ± 0.000702. It is a measure of genetic variation that is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships.

Table 7.15 Molecular Diversity Indices among the Jenu Kuruba
Index                                   Value                P
Mean number of pairwise differences     23.64429 ± 10.500
Nucleotide diversity                    0.00142 ± 0.0007
Sum of squared deviation (SSD)          0.01244              0.030
Harpending’s raggedness index (HRI)     0.007707             0.0200
Theta Pi                                23.64429 ± 11.634
Theta S                                 24.98966 ± 6.553
Tajima’s D                              −0.18164             0.516
Fu’s Fs                                 −24.00               0.001

Fig. 7.67 Mismatch distribution of nucleotide differences of Jenu Kuruba

Figure 7.67 shows the mismatch distribution of the Jenu Kuruba population. The smooth line is the distribution expected under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.01244787) and Harpending’s raggedness index (r = 0.00770700) also support a recent demographic expansion of the Jenu Kuruba.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Jenu Kuruba population, the value of −0.18164 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−24.00564) is further evidence of a recent population expansion in the Jenu Kuruba community.
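A mismatch distribution and its raggedness can be computed from an alignment in a few lines. The sketch below uses the usual definition of Harpending's raggedness index, the sum of squared differences between successive classes of the distribution, including the step down to zero after the largest observed class. The toy alignment and function names are hypothetical and the code is not the software used in the study.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(sequences, 2)
    )
    if not counts:
        raise ValueError("need at least two sequences")
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(max(counts) + 1)]


def raggedness_index(freqs):
    """Harpending's raggedness: sum of squared steps between adjacent classes."""
    padded = list(freqs) + [0.0]  # class just beyond the largest observed one
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy alignment (hypothetical, not study data).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
freqs = mismatch_distribution(toy)
raggedness = raggedness_index(freqs)
```

A smooth, unimodal distribution changes little from one class to the next and gives a small index, while a jagged, multimodal distribution gives a larger one.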

Ka Thakur

The Rajput elements in the Ka Thakur community are said to have been brought by fugitives from Gujarat, who took refuge among them during the regime of Mohammad Begade. They are the endogamous division of Ma-Thakur. According to the 2011 census, their population is 567,968, of whom 287,764 were males and 280,204 were females (sex ratio = 974). They are concentrated in Thane district of Maharashtra. They speak a dialectal form of Marathi, an Indo-European language. They interdine with the Kunbi and Koli, but not with the Katkari, Varli and some scheduled castes. Most community members subsist on agriculture and wage labour. Cross-cousin marriages are allowed among them, and marriage alliances are contracted mostly by negotiation. The Ka Thakur have their own methods of maintaining social control. Monogamy is the usual form of marriage. They are mainly followers of Hinduism (Singh 1994). They are short-statured people and show a tendency towards a broad head shape (Gupta 1907). For the present study, Ka Thakur blood samples were collected from the state of Maharashtra.

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of 47 Ka Thakur individuals for Y-SNPs revealed four haplogroups: R2, S*, H* and R*. Here, R2 shows the highest preponderance with a percentage of 13%, followed by S* (81%), H* (4%) and R* (2%) (Fig. 7.68). Haplogroup R2 is predominantly a Central Asian lineage.

Fig. 7.68 Y chromosome haplogroups of Ka Thakur

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 30 Ka Thakur individuals were scanned for maternal lineages in the population. The maternal lineages comprise 80% Asian macro-haplogroup M and 20% European macro-haplogroup N; samples were selected for complete sequencing based on HVR I motifs. A total of seven maternal lineages belonging to haplogroup M were found among the Ka Thakur population (Fig. 7.69): 42% belong to M2, 21% to M38, 12% to M65, 8% each to M4 and M30, and 4% each to M5 and M39. The founder age of the Ka Thakur was estimated to be 64 ± 13 ky, based on the high preponderance of the M2 maternal lineage.

The N-haplogroup individuals of the Ka Thakur were assigned to four haplogroups. Haplogroups R30 and R8 have the highest frequency (33% each), followed by haplogroups R1 and I (17% each). Haplogroup R30 is found in Andhra Pradesh and Uttar Pradesh (India), among the Tharu people of Nepal and the Sinhalese people of Sri Lanka. Haplogroup R8 has its highest frequency towards East India, especially within Orissa (12%), and is found among the Austroasiatic tribes (Munda speakers). It is also present at low frequency among the Dravidian and Indo-European speaking populations. Haplogroup R1 is found among Kurds from Turkmenistan, whereas haplogroup I is found in West Eurasia and South Asia (Fig. 7.70).

Fig. 7.69 mtDNA phylogenetic tree of M-haplogroup among the Ka Thakur

Fig. 7.70 mtDNA phylogenetic tree of N-haplogroup among the Ka Thakur

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated directly from the DNA sequences. Nucleotide diversity in the Ka Thakur population is 0.001444 ± 0.000733. It is a measure of genetic variation that is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.16).

Figure 7.71 shows the mismatch distribution of the Ka Thakur population. The smooth line is the distribution expected under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.01065137) and Harpending’s raggedness index (r = 0.01697385) also support a recent demographic expansion of the Ka Thakur.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Ka Thakur population, the value of −0.62498 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−8.94631) is further evidence of a recent population expansion in the Ka Thakur community.

Table 7.16 Molecular Diversity Indices among the Ka Thakur
Index                                   Value                P
Mean number of pairwise differences     23.92029 ± 10.896
Nucleotide diversity                    0.0014 ± 0.0007
Sum of squared deviation (SSD)          0.010651             0.400
Harpending’s raggedness index (HRI)     0.0169738            0.200
Theta Pi                                23.92029 ± 12.147
Theta S                                 28.38557 ± 9.479
Tajima’s D                              −0.62498             0.267
Fu’s Fs                                 −8.946               0.005

Fig. 7.71 Mismatch distribution of nucleotide differences of Ka Thakur Population

Kamar

The Kamar are an offshoot of the Gond. According to the 2011 census, their population was 26,530, of whom 13,070 were males and 13,460 were females (sex ratio = 1030). The Kamar are a native community of the south-eastern part of Raipur district, Chhattisgarh. Among themselves they speak Chhattisgarhi, which belongs to the Indo-European family of languages, and with others they speak Hindi. They accept food and water from the Gond, Sau and Patel, but not from the Ganda. Their main occupation is shifting cultivation. They practise child marriage, and cross-cousin marriage is common in the community. Junior levirate and junior sororate are permissible. They profess the Hindu religion (Singh 1994). Tiwari et al. (1980) found the sickle cell carrier frequency to be 5% among the Kamar, with an S gene frequency of 2.38. When they suffer from a disease, they consult a medicine man called a Baiga, who is also known as a soothsayer. For the present study, Kamar blood samples were collected from the state of Madhya Pradesh.

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of 45 individuals for Y-SNPs revealed seven haplogroups: O2a*, H1a*, L1, H*, K*, H1b and C5*. O2a* is the most frequent (44.4%), followed by H1a* (27%), L1 (13%), H* (7%), K* (4%), and H1b and C5* (2% each) (Fig. 7.72). O2a* is considered to be a major Southeast Asian lineage. H1a is considered to be a haplogroup specific to Dravidian communities.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 80 Kamar individuals were scanned for maternal lineages in the population. The maternal lineages comprise 71% Asian macro-haplogroup M and 29% European macro-haplogroup N; samples were selected for complete sequencing based on HVR I motifs (Fig. 7.73). A total of seven maternal lineages belonging to haplogroup M were found among the Kamar population: 28% belong to M3, followed by M2 (25%), M6 (21%), M35 (17%), M40 and M41 (3% each) and M39 (2%). The founder age of the M3 haplogroup among the Kamar was estimated to be 23 ± 6 ky, based on the high preponderance of the M3 maternal lineage.

The 23 N-haplogroup individuals of the Kamar were assigned to six haplogroups. Haplogroups R7 and R8 have the highest frequency (26% each), followed by haplogroup R (17%), R5 and U2 (13% each) and R30 (4%). Haplogroup R is a very extended and diversified macrohaplogroup. The coalescence time of haplogroup R was estimated to be 73,000 ± 20,900 years (Kivisild et al. 2003a, b). The subclade R0 within haplogroup R occurs commonly in the Arabian Peninsula, with its highest frequency observed among the Socotri (Černý et al. 2009). Moderate frequencies are found in North Africa, the Horn of Africa and Central Asia (Fig. 7.74).

Fig. 7.72 Y chromosome haplogroups of Kamar

Molecular Diversity

Molecular diversity indices are shown in Table 7.17. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated directly from the DNA sequences. Nucleotide diversity in the Kamar population is 0.001380 ± 0.000684. It is a measure of genetic variation that is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships.

Figure 7.75 shows the mismatch distribution of the Kamar population. The smooth line is the distribution expected under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.01937992) and Harpending’s raggedness index (r = 0.02533936) also support a recent demographic expansion of the Kamar.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Kamar population, the value of −0.5762 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−24.11926) is further evidence of a recent population expansion in the Kamar community.

Fig. 7.73 mtDNA phylogenetic tree of M-haplogroup among the Kamar

Fig. 7.74 mtDNA phylogenetic tree of N-haplogroup among the Kamar

Kanikkar

Some historical accounts suggest that the Kanikkar migrated to Travancore from the Tirunelveli district of Tamil Nadu. According to the 2011 census, their population was 21,251, of whom 9975 were males and 11,276 were females (sex ratio = 1130). Of these, 19,402 live in rural areas. They inhabit the hills of Neyyattinkara and Nedumangadu Taluk of Trivandrum district and also live in the adjoining district of Quilon in Kerala. They speak Kanikkarkkar or Malampashi, which is close to the Dravidian language. The Kanikkar accept food and water from almost all their neighbours but not from local scheduled caste people. Traditionally the Kanikkar were hunters, gatherers and shifting cultivators. Fishing, hunting and gathering are still practised occasionally. The present-day occupation of the community is settled cultivation. Monogamy is the most common form of marriage, but cases of polygyny have been reported. The rules of community endogamy and clan exogamy are followed. Marriages among cross cousins are allowed. Junior sororate and junior levirate are also permitted. The majority of them are Hindu (Singh 1994). The Kanikkar are short-statured and dolichocephalic, with round or oval faces and a broad nose (Singh 1994). Regarding the ABO blood group, the O blood group is the most frequent (65.86%), followed by A (20.95%), B (11.97%) and AB (0.59%) (Buchi 1953). According to Buchi (1953), the Kanikkar appear to be typical representatives of the Malid section, who do not seem to bear any physiological resemblance to the Negritoes but indicate a relationship with the Australian aborigines (Australoids). He further emphasized that the blood gene distribution of the Kanikkar puts them closest to the Caucasoids. For the present study, Kanikkar blood samples were collected from the state of Kerala.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the eight Kanikkar individuals were assigned to two haplogroups (Fig. 7.76) by screening the Y-SNPs. Haplogroup F* has the highest frequency (87%), followed by haplogroup H* (13%). Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H is mainly found in South Asia, in India, Sri Lanka, Nepal and Pakistan, with lower frequency in Afghanistan.

Table 7.17 Molecular Diversity Indices among the Kamar
Index                                   Value                P
Mean number of pairwise differences     22.86666 ± 10.219
Nucleotide diversity                    0.001380 ± 0.0006
Sum of squared deviation (SSD)          0.01937              0.000
Harpending’s raggedness index (HRI)     0.02533              0.00000
Theta Pi                                22.86666 ± 11.336
Theta S                                 27.31983 ± 7.703
Tajima’s D                              −0.5762              0.302
Fu’s Fs                                 −24.1192             0.000

Fig. 7.75 Mismatch distribution of nucleotide differences of Kamar population

Fig. 7.76 Y chromosome haplogroups of Kanikkar

Karen

The Karen community are Burmese immigrants. They were brought to North Andaman by the British Government of India in March 1925. They speak the Karen language, which is their own dialect. Over the years the Karen have developed contact with other communities such as the Burmese, Ranchi and Bengali. They are primarily agriculturists. Hunting and fishing are usually practised as group enterprises. Monogamy is considered the ideal. Inter-religious marriages are on the rise. They follow Christianity and are members of the Baptist Church. In the ABO blood group system, A (34.24%) is found to be preponderant over B (27.85%) and O (25.53%) (Roy 1980). For the present study, Karen blood samples (92) were collected from the Andaman and Nicobar Islands.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 25 Karen individuals were assigned to four haplogroups (Fig. 7.77) by screening the Y-SNPs. Haplogroup O3a3c has the highest frequency (44%), followed by haplogroup N (32%), haplogroup O2a (16%) and haplogroup O3 (8%). Haplogroup O2a reflects an admixture of Austro-Asian genes, whereas O3 is mainly found in South East Asia.

Fig. 7.77 Y chromosome haplogroups of Karen

Kathodi

The Kathodi community originated from the Raigarh and Thane districts of Maharashtra. According to the 2011 census, their population was 13,632, of whom 6787 were males and 6845 were females (sex ratio = 1009). They are distributed in the Surat, Bharuch, Sabarkantha and Dang districts of Gujarat. They speak a dialectal form of Marathi, an Indo-European language, but use Gujarati for intergroup communication. They have a cultivator-labourer relation with the Koli, Garasia and Gamit. Their traditional occupation is catechu-making. Apart from this, they also work in cultivation, animal husbandry and wage labour. Both monogamy and polygamy are practised, as are marriage by capture and negotiated marriage. They believe in their traditional tribal religion; however, Hinduism has influenced their religious practice and life-cycle rituals (Singh 1994). They are short-statured, round-headed people with a broad nasal and facial profile (Karve and Dandekar 1951). The Kathodi are also known as Katkari or Kathodia. For the present study, Kathodi blood samples were collected from the state of Gujarat.

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of 43 Kathodi individuals for Y-SNPs revealed five haplogroups: H*, H1a*, F*, J* and P*. H* is the most frequent (49%), followed by H1a* (33%), F* (9%), and J* and P* (5% each) (Fig. 7.78). The F* haplogroup is mainly distributed in North, Central, Western and South India, Sri Lanka, Nepal, Borneo, Java, Sulawesi and Lemdada. Haplogroup H1a* is predominant among the Dravidian and central Indian communities and is considered to be a major indigenous Indian haplogroup.

Fig. 7.78 Y chromosome haplogroups of Kathodi

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 120 individuals were scanned for maternal lineages in the population (Fig. 7.79). In total, 44 mtDNA genomes were selected for complete sequencing based on HVR I motifs. A total of eight maternal lineages belonging to haplogroup M were found among the Kathodi population: 25% belong to M2, 20% each to M3 and M30, 15% to M35, and 5% each to M5, G, M31 and M39. The founder age of the Kathodi was estimated to be 64 ± 13 ky, based on the high preponderance of the M2 maternal lineage.

The seven N-haplogroup individuals of the Kathodi were assigned to four haplogroups. U2 is the most frequent (46%), followed by U7 (17%), R6 (12%), R1 and R30 (8% each), and R5 and U2 (4% each) (Fig. 7.80).

Fig. 7.79 mtDNA phylogenetic tree of M-haplogroup among the Kathodi

Fig. 7.80 mtDNA phylogenetic tree of N-haplogroup among the Kathodi

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated directly from the DNA sequences. Nucleotide diversity in the Kathodi population is 0.001391 ± 0.000714. It is a measure of genetic variation that is usually reported with other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.18).

Figure 7.81 shows the mismatch distribution of the Kathodi population. The smooth line is the distribution expected under the hypothesis of constant population size. The mismatch distribution is mostly unimodal, which indicates a recent expansion. The small sum of squared deviations (0.01437378) and Harpending’s raggedness index (r = 0.02146814) also support a recent demographic expansion of the Kathodi.

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. In the Kathodi population, the value of −1.19122 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−6.57961) is further evidence of a recent population expansion in the Kathodi community.

Table 7.18 Molecular Diversity Indices among the Kathodi
Index                                   Value                P
Mean number of pairwise differences     23.04210 ± 10.587
Nucleotide diversity                    0.00139 ± 0.0007
Sum of squared deviation (SSD)          0.014373             0.110
Harpending’s raggedness index (HRI)     0.02146              0.150
Theta Pi                                23.04210 ± 11.824
Theta S                                 32.4150 ± 11.233
Tajima’s D                              −1.1912              0.095
Fu’s Fs                                 −6.5796              0.011

Fig. 7.81 Mismatch distribution of nucleotide differences of Kathodi population

Katkari

The Katkari originated from the Raigarh and Thane districts of Maharashtra. According to the 2011 census, their population was 285,334, of whom 142,619 were males and 142,715 were females (sex ratio = 1009). They are distributed in the Surat, Bharuch, Sabarkantha and Dang districts of Gujarat. They speak a dialectal form of Marathi, an Indo-European language, but use Gujarati for intergroup communication. They have a cultivator-labourer relation with the Koli, Garasia and Gamit. Their traditional occupation is catechu-making. Apart from that, they also work in cultivation, animal husbandry and wage labour. Both monogamy and polygamy are practised, as are marriage by capture and negotiated marriage. They believe in their traditional tribal religion; however, Hinduism has influenced their religious practice and life-cycle rituals. They are short-statured, round-headed people with a broad nasal and facial profile. They are also known as Kathodi or Kathodia (Singh 1994). The phenotype and allele frequencies for the ABO blood group system among the Katkari have been studied by Mukherjee et al. (1979). The ‘B’ blood group predominates with the highest frequency (41.30%), followed by ‘O’ (25.7%) and ‘A’ (21.1%). The ‘AB’ blood group is the least frequent among the Katkari (11.9%). The haptoglobin gene frequency (Mukherjee et al. 1977) is 8.0 for the HP1 gene and 92.0 for the HP2 gene. For the present study, Katkari blood samples were collected from the state of Maharashtra.

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 36 Katkari individuals for Y-SNPs revealed four haplogroups: H1a*, H*, F* and P*. H1a* is the most frequent (50%), followed by H* (25%), F* (17%) and P* (8%). Haplogroup H* is predominantly a Central Asian lineage (Fig. 7.82).

Fig. 7.82 Y chromosome haplogroups of Katkari

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 46 Katkari individuals were scanned for maternal lineages. A total of 33 mtDNA genomes were selected for complete sequencing under the M- and N-haplogroups based on HVR 1 motifs. A total of six maternal lineages belonging to haplogroup M were found among the Katkari population: 47% belong to M2, 16% to M3, 5% to M33, and 10% each to M37 and M38. The founder age of the Katkari was estimated to be 64 ± 13 ky based on the high preponderance of the M2 maternal lineage (Fig. 7.83). The 14 Katkari individuals under the N-haplogroup were assigned to five haplogroups. U2 is the most frequent (43%), followed by R30 (21%), R1 and R5 (14% each) and R7 (7%) (Fig. 7.84).

Molecular Diversity

Molecular diversity indices are shown in Table 7.19. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Katkari population is 0.001470 ± 0.000755. It is a measure of genetic variation, is usually reported alongside other statistical measures of population diversity, and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships.

Figure 7.85 shows the mismatch distribution of the Katkari population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Katkari population, which indicates a recent expansion. The small values of the sum of squared deviation (0.00392531) and Harpending’s raggedness index (r) (0.00721590) also support a recent demographic expansion of the Katkari. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Katkari population, the value of −0.81844 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−5.70868) is further evidence of a recent population expansion of the Katkari community.

Fig. 7.83 mtDNA phylogenetic tree of M-haplogroup among the Katkari

Fig. 7.84 mtDNA phylogenetic tree of N-haplogroup among the Katkari

Table 7.19 Molecular Diversity Indices among the Katkari
Mean number of pairwise differences: 24.35672 ± 11.200
Nucleotide diversity: 0.00147 ± 0.0007
Sum of squared deviation (SSD): 0.00392 (P = 0.920)
Harpending’s raggedness index (HRI): 0.00721 (P = 0.920)
Theta Pi: 24.35672 ± 12.516
Theta S: 30.3284 ± 10.661
Tajima’s D: −0.8184 (P = 0.198)
Fu’s Fs: −5.70868 (P = 0.015)

Fig. 7.85 Mismatch distribution of nucleotide differences of Katkari population

Kattunayakan

The Kattunayakan are believed to be the descendants of the Hidamasura of the Mahabharata epic. According to the 2011 census, their population was 46,672, of whom 23,360 were males and 23,312 were females (sex ratio = 998). They are mainly distributed in the Wayanad district of Kerala and the Nilgiri Hills of Tamil Nadu. Within the family and with other kin groups they speak their own dialect, which is close to Kannada and belongs to the Dravidian family of languages. They speak Tamil and Malayalam with others. The Kattunayakan exchange food and water with the Marathi, Kurumba, Wayanad Pulayan and a few other communities. They are engaged in hunting and gathering, fishing, trapping of birds and animals, and labour. They also work in agriculture, animal husbandry, coffee and pepper plantations and private services. Monogamy and sororal polygyny are the forms of marriage. Marriage with the father’s sister’s daughter or the mother’s brother’s daughter is permissible. They prefer marriage by capture, service, elopement, exchange and negotiation. The majority of them are Hindu (Singh 1994). Haptoglobin: Ananthakrishnan et al. (1970) found an HP1 gene frequency of 5.0 and an HP2 gene frequency of 95.0 among the Kattunayakan community of Tamil Nadu. Transferrin system: Ananthakrishnan and Kirk (1969) reported a gene frequency of 100.0 for the Tfc variant of transferrin among the Kattunayakan of Tamil Nadu. The Kattunayakan are referred to by different names such as KattuNaiakan, Kadu Kurumba and Jenu Kurumba. For the present study, Kattunayakan blood samples were collected from the state of Tamil Nadu.

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 16 Kattunayakan individuals for Y-SNPs revealed four haplogroups: H1a*, H*, F* and R2. H1a* is the most frequent (50%), followed by H* (31%), F* (13%) and R2 (6%) (Fig. 7.86). Haplogroup H* is predominantly a Central Asian lineage.

Maternal Lineage (mtDNA Haplogroups)

A total of three maternal lineages belonging to haplogroup M were found in the Kattunayakan population. Haplogroup M36 has the highest frequency (76%), followed by M8CZ (22%) and M3 (3%). Haplogroup M8CZ is distributed among Eurasian populations and haplogroup M3 is distributed in South East Asia. The founder age of the Kattunayakan population was 45 ± 12 ky (Fig. 7.87).

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Kattunayakan population is 0.000843 ± 0.000430. It is a measure of genetic variation, is usually reported alongside other statistical measures of population diversity, and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.20).

Figure 7.88 shows the mismatch distribution of the Kattunayakan population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Kattunayakan population, which indicates a recent expansion. The small values of the sum of squared deviation (0.04062913) and Harpending’s raggedness index (r) (0.04647215) also support a recent demographic expansion of the Kattunayakan. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Kattunayakan population, the value of 0.16755 is taken to indicate that the population is recovering from a bottleneck. The negative value of Fu’s Fs (−24.35999) is further evidence of a recent population expansion of the Kattunayakan community.

Fig. 7.86 Y chromosome haplogroups of Kattunayakan

Fig. 7.87 mtDNA phylogenetic tree of M-haplogroup among the Kattunayakan

Kutia Kondh

Ethnically, the Kutia Kondh closely resemble the Proto-Australoid stock with considerable Mongoloid admixture. According to the 2011 census, their population was 1,627,486, of whom 790,559 were males and 836,927 were females (sex ratio = 1059); 1,574,980 reside in rural areas. The Kutia Kondh live in the Belghar region of Phulboni district of Orissa. They speak ‘Kui’, a dialect that belongs to the Dravidian family, but nowadays they also understand and speak Odia. The Kutia Kondh have a reciprocal relationship in terms of food and water with the milkman, blacksmith, Sundhi, Siali, Pano and Ghasi communities, but they avoid the services of barbers and washermen. A few community members are still involved in shifting cultivation, hunting and food gathering. The Hinduised section is more inclined to settled cultivation. Monogamy is the rule, but there are also cases of polygamous marriage. Levirate and sororate marriages are also prevalent. A widow can marry a widower. Village exogamy is practised. They are Hindu (Singh 1994). For the present study, Kutia Kondh blood samples were collected from the state of Odisha. They have dark brown to nearly black skin, are of medium stature and are narrow-headed (dolichocephalic). Abnormal haemoglobin: carrier = 4.3%, S gene frequency 2.14. ABO blood group system: A = 21%, B = 37%, AB = 22%, O = 20% (Ahmed 1977). Haptoglobin: Papiha et al. (1988) worked on the Kutia Kondh of Berhampur, Odisha, and found an HP1 gene frequency of 17.10% and an HP2 gene frequency of 82.90%. Transferrin system: Papiha et al. (1988) reported a gene frequency of 89.6% for the Tfc variant and 10.4% for the Tfd variant among the Kutia Kondh community of Odisha. The Kutia Kondh are one of the sub-sections of the Kondh.

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the 49 Kutia Kondh individuals to three Y chromosome haplogroups (Fig. 7.89). Haplogroup H* has the highest frequency (82%), followed by haplogroup O2A (10%) and haplogroup M (8%). Haplogroup H* is widely distributed in South Asia (India, Sri Lanka, Nepal and Pakistan), with lower frequency in Afghanistan. Haplogroup O2A is an Austro-Asian lineage found mainly in South East Asia. Haplogroup M is distributed in Papua New Guinea, neighbouring Melanesia and Indonesia, and among indigenous Australian aborigines.

Table 7.20 Molecular Diversity Indices among the Kattunayakan
Mean number of pairwise differences: 13.96396 ± 6.410
Nucleotide diversity: 0.00084 ± 0.0004
Sum of squared deviation (SSD): 0.04062 (P = 0.690)
Harpending’s raggedness index (HRI): 0.04647 (P = 0.740)
Theta Pi: 13.963964 ± 7.123
Theta S: 12.9354 ± 4.128
Tajima’s D: 0.16755 (P = 0.617)
Fu’s Fs: −24.3599 (P = 0.00)

Fig. 7.88 Mismatch distribution of nucleotide differences of Kattunayakan

Fig. 7.89 Y chromosome haplogroups of Kutia Kondh

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 92 Kutia Kondh individuals were scanned for maternal lineages. Kutia Kondh maternal lineages comprise 80% Asian macro-haplogroup M and 20% European macro-haplogroup N. A total of 11 maternal lineages belonging to haplogroup M were found in the Kutia Kondh population. Haplogroup M45 has the highest frequency (16%), followed by M2 (15%), M5 and M42 (13% each), M49 (12%), M53 (11%), M38 (8%), a new lineage (4%), M4 and M43 (3% each) and M39 (1%). Haplogroups M2, M3, M4, M5 and M39 are found in South Asian countries (Fig. 7.90), while haplogroup M42 is distributed among Australian aborigines. Haplogroup M49 has been found in ancient specimens from the Euphrates valley. The founder age of the Kutia Kondh population was 29 ± 10 ky.


Fig. 7.90 mtDNA phylogenetic tree of M-haplogroup among the Kutia Kondh

The 18 Kutia Kondh individuals under the N-haplogroup were assigned to seven haplogroups (Fig. 7.91). Haplogroup U2 has the highest frequency (33%), followed by R30 (17%); R*, R2, R8, HV and W occur at an equal frequency of 11%, and R6 at 6%. The coalescence age between Indian and Western Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp.

Fig. 7.91 mtDNA phylogenetic tree of N-haplogroup among the Kutia Kondh

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Kutia Kondh population is 0.002292 ± 0.001114. It is a measure of genetic variation, is usually reported alongside other statistical measures of population diversity, and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.21).

Figure 7.92 shows the mismatch distribution of the Kutia Kondh population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is slightly ragged but largely unimodal in the Kutia Kondh population, which indicates a recent expansion. The small values of the sum of squared deviation (0.00045392) and Harpending’s raggedness index (r) (0.0004415) also support a recent demographic expansion of the population. Tajima’s D is a widely used statistic in population genetics that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Kutia Kondh population, the value of −2.139 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−24.0725) is further evidence of a recent population expansion.
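The comparison that Tajima’s D makes can be illustrated with a short Python sketch. It uses the standard published formula for D and two toy alignments that are invented purely for illustration; it is not the software or data used in this study. The sign of D follows the difference between the two diversity estimates, which agrees with Table 7.21, where Theta Pi (37.85) lies well below Theta S (101.28).

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gaps or missing data assumed)."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for i in range(length) if len({seq[i] for seq in seqs}) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: average number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Constants from the standard formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy data: every mutation is a singleton (rare variants).
rare = ["TAAAAAAA", "ATAAAAAA", "AATAAAAA", "AAATAAAA", "AAAAAAAA", "AAAAAAAA"]
# Hypothetical toy data: every mutation is at intermediate frequency.
balanced = ["TTTTAAAA"] * 3 + ["AAAAAAAA"] * 3

print(tajimas_d(rare))      # negative: excess of rare mutations
print(tajimas_d(balanced))  # positive: excess of intermediate-frequency mutations
```

An excess of rare variants lowers the pairwise-difference estimate relative to the segregating-site estimate, which is why a negative D is read as a signature of expansion.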

Konda Reddis

The Konda Reddis claim their origin from Bhima, one of the Pandavas of the epic Mahabharata. According to the 2011 census, their population was 107,747, of whom 53,244 were males and 54,503 were females (sex ratio = 1024). They are distributed in the East and West Godavari districts and Khammam district of Andhra Pradesh. Their mother tongue is Telugu, which belongs to the Dravidian family of languages. They are shifting cultivators. Agricultural labour is another important source of livelihood. Marriage by service, capture, negotiation, courtship, elopement and exchange are common. In Tamil Nadu, the Konda Reddis refer to themselves as Ganjama Reddi and use the title Reddiyar. The Konda Reddis are divided into exogamous septs or intiperulu, some of which are Allalu, Poteruy, Kadapala, Sayanta and Kathula. They are Hindu (Singh 1994). For the present study, Konda Reddis blood samples were collected from the state of Andhra Pradesh. They are characterized by a long and narrow head, a broad facial profile with a short and moderately broad nose, and short stature. Abnormal haemoglobin: the HbA gene frequency was found to be 98.84 and the S gene frequency 1.16. ABO blood group system: Naidu et al. (1990) found A = 30.3%, B = 21.5%, AB = 6.8%, O = 41.4%.
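ABO phenotype percentages such as those quoted above can be converted into approximate allele frequencies. The Python sketch below uses Bernstein’s simple square-root estimates under Hardy-Weinberg proportions. It is only an illustration: the cited studies may have used a different estimator, and the function name is hypothetical.

```python
import math


def abo_allele_frequencies(a, b, ab, o):
    """Approximate ABO allele frequencies (p for A, q for B, r for O).

    Inputs are phenotype percentages or counts; Bernstein's uncorrected
    square-root estimates are used, so p + q + r is only close to 1.
    """
    total = a + b + ab + o
    fa, fb, fo = a / total, b / total, o / total
    r = math.sqrt(fo)             # O phenotype frequency equals r squared
    p = 1 - math.sqrt(fb + fo)    # B + O phenotypes equal (q + r) squared
    q = 1 - math.sqrt(fa + fo)    # A + O phenotypes equal (p + r) squared
    return p, q, r


# Konda Reddis phenotype percentages as quoted in the text (Naidu et al. 1990).
p, q, r = abo_allele_frequencies(a=30.3, b=21.5, ab=6.8, o=41.4)
print(round(p, 3), round(q, 3), round(r, 3))
```

The three estimates need not sum exactly to 1; the size of the departure gives a rough check on how well the phenotype data fit Hardy-Weinberg expectations.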

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the 18 Konda Reddis individuals to three Y chromosome haplogroups (Fig. 7.93). Haplogroup C5* has the highest frequency (39%), followed by haplogroup F* (33%) and haplogroup O2A* (28%). Haplogroup C5* is found at high frequency among Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, Russia, Polynesia and Australia, and occurs at moderate frequency in the Korean peninsula and Manchuria. It also occurs at high frequency in modern Indian populations. Haplogroup F* is found mainly among Europeans and Native Americans and in South Asia and Central Asia. Haplogroup O2A is an Austro-Asian lineage found mainly in South East Asia.

Table 7.21 Molecular Diversity Indices among the Kutia Kondh
Mean number of pairwise differences: 37.85413 ± 16.605
Nucleotide diversity: 0.00229 ± 0.001
Sum of squared deviation (SSD): 0.00045 (P = 0.950)
Harpending’s raggedness index (HRI): 0.00044 (P = 1.0000)
Theta Pi: 37.85413 ± 18.394
Theta S: 101.2845 ± 24.981
Tajima’s D: −2.139 (P = 0.001)
Fu’s Fs: −24.0725 (P = 0.000)


Fig. 7.92 Mismatch distribution of nucleotide differences of Kutia Kondh population

Fig. 7.93 Y chromosome haplogroups of Konda Reddis

Koraga

According to the 2011 census, the Koraga population was 14,794, of whom 7210 were males and 7584 were females (sex ratio = 1052). They are mainly concentrated in the Dakshin Kannad district of Karnataka. They speak Koraga among themselves and Tulu and Kannada with others. The Koraga accept cooked food and water from all communities, but none of those communities has been reported to reciprocate. For subsistence they are engaged in basketry, agricultural labour, scavenging and sweeping. A Koraga man may marry his father’s sister’s daughter or his mother’s brother’s daughter. Marriage alliances are negotiated, and they usually marry after attaining adulthood. They profess Hinduism. The groups existing among the Koraga are Kuntu, Chippi and Vanti (Singh 1994). For the present study, Koraga blood samples were collected from the state of Karnataka.

Fig. 7.94 Y chromosome haplogroups of Koraga

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the 14 Koraga individuals to two Y chromosome haplogroups (Fig. 7.94). Haplogroup H1a* has the highest frequency (93%), followed by haplogroup R2 (7%). Haplogroup H1a* is widely distributed among Dravidian and Central Indian communities. Haplogroup R2 is of Central Asian lineage.
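The haplogroup percentages reported in these sections are simple proportions of the screened individuals. The Python sketch below tabulates them from per-individual haplogroup calls. The counts (13 and 1) are an assumption inferred from the reported 93% and 7% among 14 individuals; the text itself gives only the percentages.

```python
from collections import Counter


def haplogroup_percentages(calls):
    """Percentage of individuals per haplogroup, most frequent first."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {hg: round(100 * n / total) for hg, n in counts.most_common()}


# Assumed per-individual calls for the 14 Koraga samples (hypothetical counts).
koraga_calls = ["H1a*"] * 13 + ["R2"] * 1
print(haplogroup_percentages(koraga_calls))  # {'H1a*': 93, 'R2': 7}
```

With small samples such as this one, a single individual shifts a frequency by about seven percentage points, so the percentages should be read together with the sample size.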

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 64 Koraga individuals were scanned for maternal lineages. Koraga maternal lineages comprise 39% Asian macro-haplogroup M and 61% European macro-haplogroup N; genomes were selected for complete sequencing based on HVR1 motifs. A total of seven maternal lineages belonging to haplogroup M were found in the Koraga population (Fig. 7.95). Haplogroup M30 has the highest frequency (60%), followed by M3 (20%), M2 (7%), and M19, M36, M40 and M65 (3% each). Haplogroup M30 is found mainly in India, the Middle East and North Africa. Haplogroups M2 and M3 are distributed among the Batak people of Palawan. Haplogroup M40 is found in South Asian countries. The founder age of the Koraga population was 20 ± 6 ky. The 34 Koraga individuals under the N-haplogroup were assigned to six haplogroups. U1 has the highest frequency (82%), followed by U2 (6%); N5, U, H and N* each have a frequency of 3% (Fig. 7.96).


Fig. 7.95 mtDNA phylogenetic tree of M-haplogroup among the Koraga

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Koraga population is 0.001447 ± 0.000723. It is a measure of genetic variation, is usually reported alongside other statistical measures of population diversity, and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.22).
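As a rough illustration of how nucleotide diversity is obtained directly from sequences, the Python sketch below averages the number of differences over all pairs of aligned sequences and divides by the sequence length. The four short sequences are invented for illustration, and the sketch ignores gaps, missing data and the sampling variance reported in the tables.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Hypothetical toy alignment of four sequences, ten sites each.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
print(mean_pairwise_differences(toy))  # 1.0
print(nucleotide_diversity(toy))       # 0.1
```

The same relationship links the first two rows of each molecular diversity table: the nucleotide diversity is the mean number of pairwise differences expressed per site.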

Figure 7.97 shows the mismatch distribution of the Koraga population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is slightly ragged and often multimodal in the Koraga population, which is taken to indicate recent expansion. The small values of the sum of squared deviation (0.01183290) and Harpending’s raggedness index (r) (0.00847415) also support a recent demographic expansion of the population. Tajima’s D is a widely used statistic in population genetics that compares the average number of pairwise differences with the number of segregating sites. Among the Koraga population, the value of 0.45873 indicates that the population suffered a recent bottleneck or may be decreasing. The negative value of Fu’s Fs (−20.78208) is nevertheless evidence of a recent population expansion.

Fig. 7.96 mtDNA phylogenetic tree of N-haplogroup among the Koraga

Table 7.22 Molecular Diversity Indices among the Koraga
Mean number of pairwise differences: 23.2930 ± 10.476
Nucleotide diversity: 0.00144 ± 0.0007
Sum of squared deviation (SSD): 0.01183 (P = 0.390)
Harpending’s raggedness index (HRI): 0.00847 (P = 0.5100)
Theta Pi: 23.29303 ± 11.641
Theta S: 20.70646 ± 6.365
Tajima’s D: 0.4587 (P = 0.742)
Fu’s Fs: −20.782 (P = 0.000)

Fig. 7.97 Mismatch distribution of nucleotide differences of Koraga population

Korku

The Korku derive their name from the combination of the words koru, meaning man, and ku, which marks the plural (Russell and Hiralal 1975). Their main habitat is the central highland of the Satpuras, which falls mainly in Madhya Pradesh. According to the 2011 census, their population was 264,492, of whom 134,931 were males and 129,561 were females (sex ratio = 960). They are concentrated in the districts of Betul, Chhindwara, Khandwa, Hoshangabad and Dewas of Madhya Pradesh. Their mother tongue is Korku, which belongs to the Northern Munda branch of the Austro-Asiatic family of languages. They speak Hindi with others. They maintain a commensal and connubial distance from Bondeya groups. The Korku are primarily cultivators. Some of them work as farm labourers on daily wages and some work in various government and non-government agencies. Negotiation and mutual consent are common forms of marriage. The groups of Korku are Bopchi, Bawaria, Mouasi, Nihal, Nahul, Bondi and Bondeya, each of which is endogamous. They are Hindu (Singh 1994). For the present study, Korku blood samples were collected from the state of Maharashtra. They are of below-medium stature and round-headed, with a broad facial profile and nasal shape. With regard to the ABO blood group, A is the most prevalent (32.1%), followed by B (27.7%) and O (22.6%); AB is the least prevalent (17.5%). Sickle cell haemoglobin has been detected among the Korku, with an incidence of about 9% heterozygous HbAS (Negi and Maitra 1974).
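The sex ratios quoted in the community profiles are consistent with the usual census convention of females per 1000 males. The minimal Python sketch below shows that arithmetic, assuming that convention and rounding to the nearest integer; the function name is hypothetical.

```python
def sex_ratio(males: int, females: int) -> int:
    """Females per 1000 males, rounded to the nearest integer."""
    if males <= 0:
        raise ValueError("males must be positive")
    return round(1000 * females / males)


# Census figures as quoted in the text.
print(sex_ratio(males=134_931, females=129_561))  # Korku: 960
print(sex_ratio(males=289_025, females=301_714))  # Koya: 1044
```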

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of two Korku individuals for Y-SNPs revealed a single haplogroup, J*, which therefore has a frequency of 100% (Fig. 7.98). Haplogroup J* is predominantly of Central Asian lineage.


Fig. 7.98 Y chromosome haplogroups of Korku

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 49 Korku individuals were scanned for maternal lineages. The maternal lineages comprise 61% Asian macro-haplogroup M and 39% European macro-haplogroup N; genomes were selected for complete sequencing based on HVR I motifs. From the M sequences, nine maternal lineages were found among the Korku population (Fig. 7.99): 32% belong to M2, 23% to M5, 9% each to M38, M45 and M56, and 4% each to M6, M3, M33 and M40. The founder age of M2 among the Korku is estimated to be 64 ± 13 ky based on the high preponderance of the M2 maternal lineage. A total of nine maternal lineages belonging to haplogroup N were found among the Korku population (Fig. 7.100). Among these, R8 is the most frequent (21%), followed by U2 (16%), W, R30 and N5 (11% each), R6 and U7 (10%), and R7 and R5 (5% each).

Molecular Diversity

Molecular diversity indices are shown in Table 7.23. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Korku population is 0.001756 ± 0.000891. It is a measure of genetic variation, is usually reported alongside other statistical measures of population diversity, and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships.

Figure 7.101 shows the mismatch distribution of the Korku population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Korku population, which indicates a recent expansion. The small values of the sum of squared deviation (0.01357122) and Harpending’s raggedness index (r) (0.01328686) also support a recent demographic expansion of the Korku. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Korku population, the value of −1.31886 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−6.48827) is further evidence of a recent population expansion of the Korku community.

Fig. 7.99 mtDNA phylogenetic tree of M-haplogroup among the Korku

Fig. 7.100 mtDNA phylogenetic tree of N-haplogroup among the Korku

Table 7.23 Molecular Diversity Indices among the Korku
Mean number of pairwise differences: 29.09523 ± 13.229
Nucleotide diversity: 0.00175 ± 0.0008
Sum of squared deviation (SSD): 0.01357 (P = 0.170)
Harpending’s raggedness index (HRI): 0.01328 (P = 0.3800)
Theta Pi: 29.09523 ± 14.760
Theta S: 43.0684 ± 14.481
Tajima’s D: −1.3188 (P = 0.088)
Fu’s Fs: −6.488 (P = 0.006)

Fig. 7.101 Mismatch distribution of nucleotide differences of Korku population

Kota

According to Kota legend, the Toda, Kota and Kurumba were brothers and were the earliest inhabitants of the Nilgiri Hills. According to the 2011 census, their population was 308, of whom 155 were males and 153 were females (sex ratio = 1330). The Kota are one of the four tribal groups of the Nilgiri Hills in Tamil Nadu. They are also known as Kotha, Kotar, Koter, Kohatur and Kuof. Among themselves they speak their own language, ‘Kota’, which belongs to the Dravidian family of languages, and they use the regional language, Tamil, for intergroup communication. The Kota inter-dine with their neighbours and share sources of water and public places with them. Traditionally the Kota were potters. At present, agriculture is the mainstay of livelihood for many of them. They practise both shifting and settled cultivation. Animal husbandry is also a major economic pursuit. Monogamy is the form of marriage. Marriage to one’s mother’s brother’s daughter or father’s sister’s daughter is common. Marriages are arranged by elopement, exchange and negotiation. They are Hindu (Singh 1994). For the present study, Kota blood samples were collected from the state of Tamil Nadu. ABO blood group system: Mohan Raj et al. (1986) found A = 0.09%, B = 53.3%, AB = 0.09%, O = 44.8%. Rh system: the d gene frequency is 13.73%. Haptoglobin: Ghosh (1977) worked on the Kota tribal community of the Nilgiri Hills, Tamil Nadu, and found an HP1 gene frequency of 14.70 and an HP2 gene frequency of 85.30. Transferrin system: Ghosh (1977) found a gene frequency of 100.0 for the Tfc variant of transferrin among the Kota of the Nilgiri Hills, Tamil Nadu. An interesting feature of this group is the absence of the sickle-cell trait. According to successive census figures, this population is gradually declining.

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the 51 Kota individuals to six Y chromosome haplogroups (Fig. 7.102). Haplogroup H1a* has the highest frequency (37%), followed by haplogroup J2B1 (22%), haplogroup F* (6%) and haplogroup L1 (6%). Haplogroup H1a* is distributed among Dravidian and Central Indian communities. Haplogroup J2B1 is found in Italy, Czechoslovakia and Germany. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup L1 is typically found among Dravidian communities of India.

Fig. 7.102 Y chromosome haplogroups of Kota

Koya

The Koya claim their origin from Bhima, one of the Pandavas of the epic Mahabharata. According to the 2011 census, their population was 590,739, of whom 289,025 were males and 301,714 were females (sex ratio = 1044). They inhabit the agency tracts of the East and West Godavari districts and the adjoining Khammam and Adilabad districts of Andhra Pradesh. They speak Telugu among themselves and with other communities. Traditionally they were hunters and gatherers. Now they are engaged in agriculture. Monogamy is the form of marriage, though instances of polygyny exist. They are mainly Hindu, but some of them have embraced Christianity (Singh 1994). For the present study, Koya blood samples were collected from the state of Andhra Pradesh. They exhibit a short stature with a medium or lean body type, a long and narrow head shape and a tendency towards a broader nasal and facial profile with a short chin. Abnormal haemoglobin: AS 12.58%, SS 0.06%, A gene frequency 93.08, S gene frequency 4.02 (Negi 1976). ABO blood group system: A = 30.0%, B = 23.3%, AB = 9.0% and O = 37.7% (Rao 1971). Rh system: the d gene frequency of this population is 17.78 (Pingle et al. 1981). Colour blindness: the prevalence of colour blindness is 2.06 (Naidu 1989). Haptoglobin: Goud and Rao (1980) studied the Koya-Dora of Andhra Pradesh and found gene frequencies of 8.80 for HP1 and 91.20 for HP2. Transferrin: Goud and Rao (1980) studied the gene frequency of transferrin variants among the Koya-Dora population of Venkatapuram and Badra Challan, Andhra Pradesh, and found gene frequencies of 96.6 (Venkatapuram) and 95.5 (Badra Challan) for the Tfc variant, and 3.4 (Venkatapuram) and 4.5 (Badra Challan) for the Tfd variant. GC system: Walter et al. (1984) reported gene frequencies of 85.70 for the GC1 variant and 14.30 for the GC2 variant of the Group-Specific Component system among the Koya community of Andhra Pradesh.

Fig. 7.103 Y chromosome haplogroups of Koya

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the 14 Koya individuals to four Y chromosome haplogroups (Fig. 7.103). Haplogroup F* has the highest frequency (43%), followed by haplogroup H1a* (22%), haplogroup O2A* (21%) and haplogroup H* (14%). Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H1a* is distributed among Dravidian and Central Indian communities. Haplogroup O2A is an Austro-Asian lineage found mainly in South East Asia. Haplogroup H* is distributed mainly in South Asia (India, Sri Lanka, Nepal and Pakistan), with lower frequency in Afghanistan.

Lachungpa

The total population of the Lachungpa was 69,598, of whom 35,224 were males and 34,374 were females (sex ratio = 976); the rural population was 50,856 (Census of India 2011). They are distributed in the North Sikkim district. They speak a language that belongs to the Tibeto-Burman family of languages. For the present study, Lachungpa blood samples were collected from the state of Sikkim.

Fig. 7.104 Y chromosome haplogroups of Lachungpa

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 37 Lachungpa individuals for Y-SNPs revealed eight haplogroups: D1a*, O3A3C1*, O2A*, Q1*, F*, K*, H1a* and N1. Among these, D1a* predominates (54%), followed by O3A3C1* (14%), O2A* (8%), Q1*, F* and K* (5%), and H1a* and N1 (3%). Haplogroup H* is predominantly of Central Asian lineage (Fig. 7.104).

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 104 Lachungpa individuals were scanned for maternal lineages. A total of 36 mtDNA genomes were selected for complete sequencing under the M- and N-haplogroups based on HVR I motifs. A total of five maternal lineages belonging to haplogroup M were found among the Lachungpa population (Fig. 7.105): 32% belong to M8, 28% to the D lineage, 24% to G, and 8% each to M9 and M61. The founder age of the Lachungpa was estimated to be 45 ± 15 ky based on the high preponderance of the M49 maternal lineage. Among the N-haplogroup lineages, F1 has the highest frequency (55%), followed by A14 (18%); R2, B4'5 and R0 each have a frequency of 9% (Fig. 7.106). Sub-haplogroup R2 is rare in the Indian sample, and the age of this sub-clade is calculated to be 11,400 ± 9000 ybp.

Fig. 7.105 mtDNA phylogenetic tree of M-haplogroup among the Lachungpa

Fig. 7.106 mtDNA phylogenetic tree of N-haplogroup among the Lachungpa

Molecular Diversity

Molecular diversity indices are shown in Table 7.24. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Lachungpa population is 0.001749 ± 0.000882. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.107 shows the mismatch distribution of the Lachungpa population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Lachungpa population, which indicates a recent expansion. The small values of the sum of squared deviations (0.00534387) and Harpending’s raggedness index (r) (0.00523333) also indicate that the Lachungpa have undergone a recent demographic expansion.
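The relation between the mean number of pairwise differences and nucleotide diversity can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names and the toy alignment are hypothetical, and the sequence length of 16,569 bp (the revised Cambridge Reference Sequence) is an assumption that the text does not state. Under that assumption, dividing the mean number of pairwise differences reported for the Lachungpa (28.97333) by the sequence length gives approximately the reported nucleotide diversity.

```python
from itertools import combinations

MTDNA_LENGTH = 16569  # assumed rCRS length in bp; not stated in the source


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pairwise_differences(sequences):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


# Toy alignment (hypothetical data, for illustration only).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(mean_pairwise_differences(toy))
print(nucleotide_diversity(toy))

# Consistency check against the values reported for the Lachungpa.
print(round(28.97333 / MTDNA_LENGTH, 6))
```

The same division applied to the other communities in this section (for example 25.40000 for the Lepcha) can be used as a quick check that the tabulated indices agree with each other.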

Table 7.24 Molecular Diversity Indices among the Lachungpa

Index                                  Value                   P
Mean number of pairwise differences    28.97333 ± 13.111
Nucleotide diversity                   0.00174 ± 0.0008
Sum of squared deviation (SSD)         0.00534                 0.570
Harpending’s raggedness index (HRI)    0.00523                 0.9300
Theta Pi                               28.97333 ± 14.611164
Theta S                                37.8711 ± 12.412
Tajima’s D                             −0.932                  0.187
Fu’s Fs                                −8.264                  0.005

Fig. 7.107 Mismatch distribution of nucleotide differences of Lachungpa population

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Lachungpa, the value of −0.93207 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−8.26493) is further evidence of a recent population expansion of the Lachungpa community.
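The comparison that Tajima’s D makes can be sketched in a few lines. The function below follows the standard formula of Tajima (1989); its name and the input values are illustrative assumptions, because the text reports Theta Pi and Theta S but not the sample size or the number of segregating sites used in the analysis. The sign of D is the sign of Theta Pi minus Theta S, so the two tabulated theta values already show whether D is negative.

```python
import math


def tajimas_d(n, segregating_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, number of segregating sites S and
    the mean number of pairwise differences (Theta Pi)."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4")
    if segregating_sites < 1:
        raise ValueError("need at least one segregating site")
    s = segregating_sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_s = s / a1  # Watterson's estimator (Theta S)
    variance = e1 * s + e2 * s * (s - 1)
    return (mean_pairwise_diff - theta_s) / math.sqrt(variance)


# Hypothetical inputs: 10 sequences, 16 segregating sites.
print(tajimas_d(10, 16, 3.888889))  # excess of rare variants, D < 0
print(tajimas_d(10, 16, 7.0))       # excess of common variants, D > 0

# Sign check with the theta values tabulated for the Lachungpa.
print(28.97333 - 37.8711 < 0)
```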

Lepcha

The Lepcha claim to be the early inhabitants of Sikkim and believe that their homeland was the legendary kingdom of Mayel near Mount Kangchenzonga. According to the 2011 census, their population strength was 42,909, of which 21,614 were males and 21,295 were females (sex ratio = 1015). They are concentrated in the Dzongu district of Sikkim and the Darjeeling district of West Bengal. Their language is Lepcha, which belongs to the Tibeto-Burman family of languages. They accept all types of food and water from their neighbouring communities but do not share their crematoria. Traditionally they were hunters and gatherers and pastoralists. Nowadays they are primarily engaged in cultivation. Marriages with the Bhotia are not discouraged. Inter-clan marriage is allowed. Monogamy is the norm. They are mainly Buddhist; some of them are Christian and Hindu. For the present study, Lepcha blood samples were collected from the state of Sikkim. They are characterized by short stature and show a tendency towards a broad head shape and a broad facial profile with a short and often narrow nasal feature.

ABO Blood Group: A = 34.1%, B = 21.9%, AB = 6.8%, O = 37.0%. Gene frequencies: p = 26.00, q = 20.00, r = 54.00 (Bhattacharjee 1968). Colour Blindness: 2.41%. Haptoglobin: Bhasin et al. (1986) found an HP1 gene frequency of 10.8 and an HP2 gene frequency of 89.2 among the Lepcha population of Sikkim. GC System: Bhasin et al. (1986) studied the GC system among the Lepcha community of North and South Sikkim and found gene frequencies for GC1 of 87.10 (North Sikkim) and 89.10 (South Sikkim) and for GC2 of 12.90 (North Sikkim) and 10.90 (South Sikkim).

In PTC substance taste ability, the non-taster gene (t) value is 36%.

Fig. 7.108 Y chromosome haplogroups of Lepcha

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 39 Lepcha individuals for Y-SNPs revealed the presence of five haplogroups: O3a3a1*, N1*, K*, O2a* and P*. O3a3a1* predominates with 62%, followed by N1* (21%), K* (13%), and O2a* and P* (3% each) (Fig. 7.108). The haplogroup O3a is predominant among the Tibeto-Burman communities of South East Asia.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 109 Lepcha individuals were scanned for maternal lineages in the population. A total of 29 mtDNA genomes were selected for complete sequencing under the M- and N-haplogroups based on HVR I motifs. A total of four maternal lineages belonging to haplogroup M were found among the Lepcha population (Fig. 7.109). Of these, M33 predominates (35%), followed by M9 (25%), and M8 and D (20% each). The founder age of the Lepcha was estimated to be 51 ± 9 ky based on the high preponderance of the M33 maternal lineage. Within the N-haplogroup, sub-haplogroups A and B are equally distributed (50%). Haplogroup A is found in Central and East Asia, as well as among Native Americans. A total of four lineages belonging to haplogroup N were found among the Lepcha population (Fig. 7.110). The highest frequency is that of B4′5 (44%), followed by A17 and A21 (22% each) and U2 (11%).

Molecular Diversity

Molecular diversity indices are shown in Table 7.25. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Lepcha population is 0.001533 ± 0.000784. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships.


Fig. 7.109 mtDNA phylogenetic tree of M-haplogroup among the Lepcha

Figure 7.111 shows the mismatch distribution of the Lepcha population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Lepcha population, which indicates a recent expansion. The small values of the sum of squared deviations (0.05857460) and Harpending’s raggedness index (r) (0.10612188) also indicate that the Lepcha have undergone a recent demographic expansion.
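How a mismatch distribution and its raggedness are obtained from a set of sequences can be sketched as below. This is an illustrative reading of Harpending’s index as the sum of squared differences between adjacent classes of the distribution; the function names and the toy sequences are hypothetical and are not the data or software of the study. A smooth, unimodal distribution gives a small r, whereas a ragged, multimodal one gives a larger value.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(
        sum(1 for a, b in zip(s, t) if a != b)
        for s, t in combinations(sequences, 2)
    )
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(max(counts) + 1)]


def raggedness_index(freqs):
    """Sum of squared differences between adjacent mismatch classes,
    including the empty class after the largest observed difference."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy sequences (hypothetical data, for illustration only).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
dist = mismatch_distribution(toy)
print(dist)
print(raggedness_index(dist))
```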

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Lepcha, the value of 0.00513 is taken to indicate that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−6.07251) is further evidence of a recent population expansion of the Lepcha community.


Fig. 7.110 mtDNA phylogenetic tree of N-haplogroup among the Lepcha

Ma Thakur

The Rajput elements in the Ma Thakur community are said to have been brought by fugitives from Gujarat, who took refuge among them during the regime of Mohammad Begade. According to the 2011 census, their population strength was 567,968, of which 287,764 were males and 280,204 were females (sex ratio = 974). They are concentrated in Thane district of Maharashtra and speak a corrupt form of Marathi, an Indo-European language. These people interdine with the Kunbi and Koli, but not with the Katkari, Varli and some scheduled castes. Most of these people subsist on agriculture and wage labour. Cross-cousin marriages are allowed among them, and marriage alliances are contracted mostly through negotiation. The Ma Thakur have their own methods of maintaining social control. Monogamy is the usual form of marriage. They are mainly followers of Hinduism (Singh 1994). The Ma Thakur are short-statured people and show a tendency towards a broad head shape (Gupta 1907). For the present study, Ma Thakur blood samples were collected from the state of Maharashtra.

Table 7.25 Molecular Diversity Indices among the Lepcha

Index                                  Value                   P
Mean number of pairwise differences    25.40000 ± 11.638
Nucleotide diversity                   0.00153 ± 0.0007
Sum of squared deviation (SSD)         0.05857                 0.000
Harpending’s raggedness index (HRI)    0.106121                0.000
Theta Pi                               25.40000 ± 12.998081
Theta S                                25.36826 ± 8.869
Tajima’s D                             0.00513                 0.557
Fu’s Fs                                −6.072                  0.009

Fig. 7.111 Mismatch distribution of nucleotide differences of Lepcha population

Fig. 7.112 Y chromosome haplogroups of Ma Thakur

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the Y chromosomes of the 47 individuals to five haplogroups (Fig. 7.112). Haplogroup F* has the highest frequency (36%), followed by haplogroup H1a* (30%), H* (23%), P* (6%) and haplogroup J* (4%). Haplogroup H1a is found at a higher frequency among Dravidian and Central Indian communities and represents the major indigenous Indian haplogroup. The presence of about 4% of J indicates episodes of Neolithic migration from Central Asia.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 116 Ma Thakur individuals were scanned for maternal lineages in the population. Ma Thakur maternal lineages comprise 35% Asian macro-haplogroup M and 65% European macro-haplogroup N. A total of four maternal lineages belonging to haplogroup M were found in the Ma Thakur population. All the M lineages (Fig. 7.113), M44 (53%), M2 (27%), M30 (13%) and M65 (7%), are autochthonous to India. The Ma Thakur population includes the M2 lineage, whose founder age was estimated at 64 ± 13 ky. Within the N macro-haplogroup (Fig. 7.114), the highest frequency is that of U2 (21%), followed by R8 and R6 (18% each), R5 and R30 (14% each) and U1 (11%).

Fig. 7.113 mtDNA phylogenetic tree of M-haplogroup among the Ma Thakur


Fig. 7.114 mtDNA phylogenetic tree of N-haplogroup among the Ma Thakur


Molecular Diversity

Molecular diversity indices are shown in Table 7.26. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Ma Thakur population is 0.001051 ± 0.000555. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.115 shows the mismatch distribution of the Ma Thakur population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Ma Thakur population, which indicates a recent expansion. The small values of the sum of squared deviations (0.02168970) and Harpending’s raggedness index (r) (0.02730159) also indicate that the Ma Thakur have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Ma Thakur, the value of −0.02973 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−4.81994) is further evidence of a recent population expansion of the Ma Thakur community.

Table 7.26 Molecular Diversity Indices among the Ma Thakur

Index                                  Value                   P
Mean number of pairwise differences    17.40952 ± 8.201
Nucleotide diversity                   0.00105 ± 0.0005
Sum of squared deviation (SSD)         0.021689                0.560
Harpending’s raggedness index (HRI)    0.02730                 0.590
Theta Pi                               17.40952 ± 9.196
Theta S                                17.53003 ± 6.675
Tajima’s D                             −0.02973                0.536
Fu’s Fs                                −4.819                  0.016

Fig. 7.115 Mismatch distribution of nucleotide differences of Ma Thakur population

Madia

Etymologically, Madia means living in trees (mar or mad means a tree) (Singh 1994). The Madia are mainly distributed in four tehsils (Aheri, Bhamragarh, Etapally and Sironcha) of Gadchiroli district in eastern Maharashtra. They speak the Madia dialect of Gondi, a South-Central Dravidian language, and they communicate with outsiders in Hindi and Marathi. The Madia are divided into a number of exogamous phratries, which are in turn divided into a number of clans, such as Mandavi and Bagh (Singh 1998). Hunting and gathering are their main occupation. Apart from this, they collect forest produce, which they sell in the weekly market (Reddy et al. 2012). They practise cross-cousin, junior sororate and junior levirate marriages (Singh 1994). Living megalithic practices have been reported among the Madia Gonds.

The Madia are of short to below-medium height and show a dolichocephalic head with a round or nearly linear facial profile, a short and narrow forehead and a broad nasal shape. On average they show a higher incidence of gene B (27–31%) than of gene A (17–18%) in the ABO blood group system. Non-secretors of ABH substance in the saliva are found in relatively higher proportions (59–70%) (Singh 1998).

Microsatellite diversity was analysed in the Gondi-speaking Madia Gonds of Maharashtra and three other Marathi-speaking Proto-Australoid tribal groups, to understand their genetic structure and to identify the congruence between language and gene pool. Allele frequency data at 15 short tandem repeat (STR) loci in the studied tribes were compared with data from 22 Indo-European- and Dravidian-speaking caste and tribal populations using heterozygosity, allele size variance, analysis of molecular variance (AMOVA), the GST (coefficient of gene differentiation) estimate, a PC plot and the Mantel correlation test. The results demonstrate that the Gondi tribes comprising the Madia-Gond, a hunter-gatherer population, harbour lower diversity than the Marathi tribal groups, which are culturally and genetically distinct. The Proto-Australoid tribal populations were genetically differentiated from castes of similar morphology, suggesting that different evolutionary mechanisms have operated upon these populations. The populations showed genetic and linguistic similarity, barring a few groups with varied migratory histories. The microsatellite variation clearly demonstrates the interplay of sociocultural factors, including linguistic and geographical contiguity, and microevolutionary processes in shaping the genetic diversity of populations in contemporary India (Gaikwad et al. 2006).

With regard to the transferrin system, Mukherjee et al. (1979) reported that the gene frequency of the Tfc variant was 100.0% among the Gond-Madia of Maharashtra. For the present study, Madia blood samples (140) were collected from the state of Maharashtra.
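Two of the measures named in the microsatellite comparison above, heterozygosity and GST, can be illustrated with a short sketch. It assumes that GST is Nei’s coefficient of gene differentiation, computed for a single locus with equally weighted populations; the function names and the allele frequencies are hypothetical and are not taken from Gaikwad et al. (2006).

```python
def expected_heterozygosity(allele_freqs):
    """H = 1 - sum(p_i^2) for one locus in one population."""
    return 1.0 - sum(p * p for p in allele_freqs)


def gst(populations):
    """G_ST = (H_T - H_S) / H_T for one locus, equal population weights.

    `populations` is a list of allele-frequency lists, one per population,
    all in the same allele order.
    """
    n_pops = len(populations)
    n_alleles = len(populations[0])
    h_s = sum(expected_heterozygosity(p) for p in populations) / n_pops
    mean_freqs = [
        sum(p[i] for p in populations) / n_pops for i in range(n_alleles)
    ]
    h_t = expected_heterozygosity(mean_freqs)
    if h_t == 0:
        raise ValueError("locus is monomorphic in the pooled sample")
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at one locus in two populations.
pop_a = [0.7, 0.2, 0.1]
pop_b = [0.3, 0.3, 0.4]
print(expected_heterozygosity(pop_a))
print(expected_heterozygosity(pop_b))
print(gst([pop_a, pop_b]))
```

A population with lower heterozygosity harbours less diversity at the locus, and a GST near zero means that most of the diversity lies within rather than between populations.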

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 31 Madia individuals for Y-SNPs revealed the presence of eight haplogroups: F*, H1a*, O2a*, H*, R2, K*, O3a3c1* and P*. H1a* predominates with 39%, followed by F* (26%), O2a* (13%), H* and R2 (7% each), and K*, O3a3c1* and P* (3% each) (Fig. 7.116). The haplogroup H1a* is predominant among the Dravidian and Central Indian communities. It is considered to be a major indigenous Indian haplogroup.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 140 Madia individuals were scanned for maternal lineages in the population. Among the genomes selected for complete sequencing based on HVR I motifs, 76% belong to the Asian macro-haplogroup M and 24% to the European macro-haplogroup N. A total of eight maternal lineages belonging to haplogroup M were found among the Madia population. Of these, M35 and M63 predominate (24% each), followed by M2 and M42 (16% each), M39 (8%), and M3, M33 and M41 (4% each) (Fig. 7.117). The founder age of the Madia was estimated to be 64 ± 13 ky based on the high preponderance of the M33 and M63 maternal lineages. Within the N-haplogroup, sub-haplogroup R6 has the highest frequency (37%), followed by B (25%) and R8, R30 and U2 (13% each) (Fig. 7.118).

Fig. 7.116 Y chromosome haplogroups of Madia

Molecular Diversity

Molecular diversity indices are shown in Table 7.27. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Madia population is 0.001341 ± 0.000681. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.119 shows the mismatch distribution of the Madia population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Madia population, which indicates a recent expansion. The small values of the sum of squared deviations (0.00601605) and Harpending’s raggedness index (r) (0.00755556) also indicate that the Madia have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Madia, the value of −1.09351 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−10.21667) is further evidence of a recent population expansion of the Madia community.

Fig. 7.117 mtDNA phylogenetic tree of M-haplogroup among the Madia

Mal Paharia

The Mal Paharia are the inhabitants of the Dumka and the southern portion of the Sahebganj districts of the erstwhile Santal Parganas of Jharkhand. They are also found in West Bengal and Bihar. In these places they are recognized as Scheduled Tribes. The Mal Paharia consider themselves higher in social status than the two other groups of the Paharia, the Sauria Paharia and the Kumarbhag Paharia. These groups neither intermarry nor accept cooked food from each other. They are mostly agricultural labourers. A few members of the community still follow the traditional occupation of hunting and gathering. Both child and adult marriages are practised. Marriages take place through negotiation and by mutual consent. The religion of the Mal Paharia may be described as a combination of their traditional religion and Hinduism. They perform Dharma Puja, and celebrate the festivals of Baisakh Sankranti, Kali Puja and Maghipuja. The Mal Paharia do not visit any pilgrimage centre, and only youngsters attend a few fairs held at nearby villages (Singh 1994). According to the 2011 census, the total population is 135,797, of which 67,791 are males and 68,006 are females (sex ratio = 1003). The majority of the community follow Hinduism; a few follow Christianity and Buddhism. For the present study, Mal Paharia blood samples were collected from the state of Jharkhand.

Fig. 7.118 mtDNA phylogenetic tree of N-haplogroup among the Madia

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 42 Mal Paharia individuals for Y-SNPs revealed the presence of eight haplogroups: C5*, H*, H1a*, H2*, J2a, O2a*, R1a1* and F*. C5* and H* predominate with 26% each, followed by H2* (14%), H1a*, J2a, O2a* and R1a1* (7% each) and F* (5%) (Fig. 7.120). The F* haplogroup is mainly distributed in North, Central, Western and South India, Sri Lanka, Nepal, Borneo, Java, Sulawesi and Lemdada. The haplogroup H1a*, however, is predominant among the Dravidian and Central Indian communities. It is considered to be a major indigenous Indian haplogroup.

Table 7.27 Molecular Diversity Indices among the Madia

Index                                  Value                   P
Mean number of pairwise differences    22.22000 ± 10.129
Nucleotide diversity                   0.00134 ± 0.0006
Sum of squared deviation (SSD)         0.00601                 0.520
Harpending’s raggedness index (HRI)    0.00755                 0.7400
Theta Pi                               22.2200 ± 11.288
Theta S                                30.7206 ± 10.136
Tajima’s D                             −1.093                  0.149
Fu’s Fs                                −10.216                 0.002

Fig. 7.119 Mismatch distribution of nucleotide differences of Madia population

Fig. 7.120 Y chromosome haplogroups of Mal Paharia

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 114 individuals were scanned for maternal lineages in the population. Among the genomes selected for complete sequencing based on HVR I motifs, 58% belong to the Asian macro-haplogroup M and 42% to the European macro-haplogroup N. A total of seven maternal lineages belonging to haplogroup M were found among the Mal Paharia population. Of these, M2 and M33 predominate (26% each), followed by M41 (21%), M5 (10%), and M39, M40 and M49 (5% each) (Fig. 7.121). The founder age of the Mal Paharia was estimated to be 34 ± 9 ky based on the high preponderance of the M2 and M33 maternal lineages. The N-haplogroup sequences of the 14 Mal Paharia individuals were all assigned to three haplogroups: U2, R6 and R8. U2 predominates with 79%, followed by R6 (14%) and R8 (7%) (Fig. 7.122). In the Indian scenario, since haplogroup R and its sub-haplogroup U are predominant, the coalescence age of Indian and Western-Eurasian U2 lineages was estimated to be 53,000 ± 4000 ybp.

Fig. 7.121 mtDNA phylogenetic tree of M-haplogroup among the Mal Paharia

Fig. 7.122 mtDNA phylogenetic tree of N-haplogroup among the Mal Paharia

Molecular Diversity

Molecular diversity indices are shown in Table 7.28. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Mal Paharia population is 0.001813 ± 0.000930. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.123 shows the mismatch distribution of the Mal Paharia population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Mal Paharia population, which indicates a recent expansion. The small values of the sum of squared deviations (0.04620027) and Harpending’s raggedness index (r) (0.09457901) also indicate that the Mal Paharia have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Mal Paharia, the value of −0.83913 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−4.30507) is further evidence of a recent population expansion of the Mal Paharia community.

Mara

The Mara people are the native inhabitants of Mizoram, living primarily in the area of the Mara Autonomous District Council. Significant numbers of Maras are also found in the south-eastern part of Burma, in Chin State and Rakhine State, which border the district (Doungel 2019). They were earlier known to outsiders as the Lakher, the name given to them by the Lusei, and the new name Mara replaced the old one in the list of Scheduled Tribes of Mizoram state in 1978 (Doungel 2019). The Mara language belongs to the Tibeto-Burman family. It is spoken by the Mara people, who live in a contiguous area of Mizoram state, India, and Chin State, Myanmar. Mara is also closely related to other Mizo and Chin languages widely spoken in the area. All ethnic Mara people claim to be 100% Christian, mostly Evangelical. According to the 2011 census, the Lakher of Mizoram have a total population of 42,855, of which 21,402 are males and 21,453 are females (sex ratio = 1002). For this study, blood samples were collected from the state of Mizoram.

Table 7.28 Molecular Diversity Indices among the Mal Paharia

Index                                  Value                   P
Mean number of pairwise differences    30.04575 ± 13.777
Nucleotide diversity                   0.00181 ± 0.000930
Sum of squared deviation (SSD)         0.04620                 0.010
Harpending’s raggedness index (HRI)    0.09457                 0.000
Theta Pi                               30.04575 ± 15.406
Theta S                                37.50487 ± 13.268
Tajima’s D                             −0.8391                 0.207
Fu’s Fs                                −4.305                  0.021

Fig. 7.123 Mismatch distribution of nucleotide differences of Mal Paharia population

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the Y chromosomes of the 55 Mara individuals to two haplogroups (Fig. 7.124). Haplogroup O3A3C1 has the highest frequency (56%), followed by haplogroup O2A* (44%). Haplogroup O3A3C1 is mostly found in South East Asia. Haplogroup O2A* is an Austro-Asian lineage found mainly in South East Asia.

Mathur

The Mathur are said to have originated from Lord Chitragupta, who was born from the body of Brahma. They are distributed in the Jodhpur, Nagaur, Ajmer, Jaipur and Bhilwara districts of Rajasthan, speak Marwari among themselves and are also well conversant in Hindi. The Mathur use the Devanagari script. They have 16 exogamous clans and claim a status equivalent to that of the Kshatriya. The Mathur are an endogamous community that has recently widened to include other sub-groups of the Kayastha. Monogamy is the form of marriage. Divorce is permitted among them, though discouraged. The extended family is common, though nuclear families also exist. The Mathur are mainly engaged in the government or private sectors. They are settled in urban areas. The other castes with whom the Mathur have linkages are the occupational groups of the Nai, Kahar and Kumhar and the artisan castes. For the present study, Mathur blood samples were collected from the state of Rajasthan.

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the Y chromosomes of the Mathur individuals to four haplogroups (Fig. 7.125). Haplogroup H1a* has the highest frequency (38%), followed by haplogroup Q1 (33%), F* (24%) and J2B1 (5%). The haplogroup H1a* is predominant among the Dravidian and Central Indian communities. It is considered to be a major indigenous Indian haplogroup. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup J2B1 is distributed in Italy, Czechoslovakia and Germany.


Fig. 7.124 Y chromosome haplogroups of Mara

Fig. 7.125 Y chromosome haplogroups of Mathur


Mina

The Mina trace their descent to Minavatar, believed to be an incarnation of Vishnu in the form of a fish. According to the 2011 census, their population strength was 4,345,528, of which 2,264,021 were males and 2,081,507 were females (sex ratio = 919); the rural population was 4,033,132. They are concentrated in the Jaipur, Alwar, Bharatpur, Sawai Madhopur, Tonk and Bundi districts of Rajasthan. They speak Mewari, which belongs to the Indo-European language family. They accept the superiority of Brahmins, but equate themselves with the Rajput, Jat, Thakur, Mahajan and Gujar of the area. Inter-dining with them is not forbidden. The Mina economy is predominantly based on agriculture. They also depend on animal husbandry, labour and government service for their livelihood. Most of the Meena/Mina groups practise endogamy. They are Hindu. For the present study, Meena/Mina blood samples were collected from the state of Rajasthan.
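The sex ratios quoted in these community profiles appear to follow the census convention of females per 1000 males; that convention is an assumption here, although it reproduces the figures given for the Mina, Lachungpa and Mizo. A minimal sketch, with a hypothetical function name:

```python
def sex_ratio(males, females):
    """Females per 1000 males, rounded to the nearest integer."""
    return round(1000 * females / males)


# Counts as quoted in the text (Census of India 2011).
print(sex_ratio(2264021, 2081507))  # Mina
print(sex_ratio(35224, 34374))      # Lachungpa
print(sex_ratio(363397, 371513))    # Mizo
```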

Fig. 7.126 Y chromosome haplogroups of Mina


Dermatoglyphics: They have a preponderance of loops (55.9%) over whorls (41.1%) and arches (3.9%). PII is 13.71. ABO Blood Group System: A = 17.5%, B = 36.25%, AB = 7.5%, O = 38.75% (Papiha et al. 1982).
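Allele frequencies can be estimated from ABO phenotype percentages such as those above. The sketch below uses Bernstein’s square-root method under Hardy-Weinberg assumptions, without his later correction, so p, q and r need not sum exactly to 1. It is offered only as an illustration: the function name is hypothetical, and the method is not necessarily the one used by the cited authors.

```python
import math


def abo_allele_frequencies(a, b, ab, o):
    """Uncorrected Bernstein estimates of the ABO allele frequencies.

    Arguments are phenotype percentages (or counts); returns (p, q, r)
    for the A, B and O alleles.
    """
    total = a + b + ab + o
    freq_a, freq_b, freq_o = a / total, b / total, o / total
    p = 1.0 - math.sqrt(freq_b + freq_o)  # (q + r)^2 = B + O
    q = 1.0 - math.sqrt(freq_a + freq_o)  # (p + r)^2 = A + O
    r = math.sqrt(freq_o)                 # r^2 = O
    return p, q, r


# Phenotype percentages reported for the Mina (Papiha et al. 1982).
p, q, r = abo_allele_frequencies(17.5, 36.25, 7.5, 38.75)
print(round(p, 4), round(q, 4), round(r, 4))
```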

Paternal Lineage (Y Chromosomal Haplogroups)

Screening of Y-SNPs assigned the Y chromosomes of the 53 Meena/Mina individuals to 11 haplogroups (Fig. 7.126). Haplogroup L1 has the highest frequency (36%), followed by haplogroup R1A1* (32%), haplogroups F* and R2 (8% each), haplogroup J2B1* (4%) and haplogroups H1A*, H1b and H* (2% each). Haplogroup L1 is mostly found among the Dravidian communities of India, and R1A1* is widely distributed in Eurasia. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups in Eurasia. Haplogroup R2 is of Central Asian lineage. Haplogroup J2 is associated with Neolithic migrations from Central Asia. Haplogroup H1A* is distributed among Dravidian and Central Indian communities.

Melacheri

The word Melacheri is derived from the combination of two words, ‘Mela’ meaning west and ‘Cheri’ meaning village, because they were natives of a place in the western part of the islands. The Melacheri inhabit all the islands of Lakshadweep except Minicoy. They speak Dweep Bhasha, a dialect of Malayalam, which belongs to the Dravidian language family. The Melacheri are primarily agriculturists, among whom land is owned individually or by the tarward. Making coir products, coconut plantation and fishing are their other major sources of income. They are also engaged in business and government service, and some are self-employed. The community follows Islam and belongs to the Shafei School of the Sunni sect. Some individuals and families are affiliated to a Pir. They are monogamous. Marriage to one’s mother’s brother’s daughter or father’s sister’s daughter is preferred. Spouses are acquired through negotiation or by exchange. Remarriage is permissible. For the present study, Melacheri blood samples were collected from Lakshadweep. According to Bhattacharya and Biswas (1978), the Melacheri show a relatively higher occurrence of the O blood group (34.4%), followed by B (31.3%), A (28.2%) and AB (6.1%) in the ABO blood group system. G6PD-deficient and colour-blind individuals are rare or absent among them.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 21 individuals were scanned for maternal lineages in the Melacheri population of the Lakshadweep islands. Its maternal lineages comprise 14% Asian macro-haplogroup M and 86% European macro-haplogroup N. The M maternal lineage consists only of the M30 haplogroup (100%) (Fig. 7.127). The founder age of the Melacheri was estimated to be 34 ± 9 ky based on the preponderance of the M30 maternal lineage. The N-haplogroup sequences of the 18 Melacheri individuals were all assigned to a single haplogroup, R30, distributed between R30 (94.44%) and R30b (5.56%) (Fig. 7.128).
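The proportions in this paragraph can be checked from the sample sizes given in the text: 21 individuals in total, of whom 18 carry N-haplogroup lineages. The split of those 18 into 17 R30 and 1 R30b is an inference from the quoted percentages, not a count stated in the source, and the helper name is hypothetical.

```python
def percent(count, total, digits=0):
    """Share of `count` in `total`, as a percentage rounded to `digits`."""
    return round(100 * count / total, digits)


total = 21
n_count = 18               # N-haplogroup individuals, from the text
m_count = total - n_count  # remaining individuals carry M lineages

print(percent(m_count, total), percent(n_count, total))
print(percent(17, n_count, 2), percent(1, n_count, 2))  # inferred counts
```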

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Melacheri population is 0.000751 ± 0.000400. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships (Table 7.29). Figure 7.129 shows the mismatch distribution of the Melacheri population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is slightly ragged and often multimodal in the Melacheri population, which is taken to indicate a recent expansion. The small values of the sum of squared deviations (0.02865) and Harpending’s raggedness index (r) (0.04544218) also indicate that the population has undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. It is widely used in population genetics. Among the Melacheri, the value of 1.477 is taken to indicate that the population is expanding and has a chance of further increase. The negative value of Fu’s Fs (−13.84939) is further evidence of a recent population expansion.


7

Genomic Diversity of 75 Communities in India

Fig. 7.127 mtDNA phylogenetic tree of M-haplogroup among the Melacheri

Mizo The Mizos believe that they originally came out of a cave (Chhingulang), perhaps a place in the east, and that they later moved on to Tibet and Burma. Around AD 1700 they migrated to Mizoram. According to the 2011 census, their population was 734,910, of which 363,397 were males and 371,513 were females (sex ratio = 1022); the rural population was 272,397. They are concentrated in Mizoram. They speak the Lushai or Mizo dialect, which belongs to the Tibeto-Burman family of languages. They are divided into several sub-groups, but none of these is considered a separate group with an identity of its own. Clan exogamy is practised. Courtship, mutual consent and elopement are the popular modes of marriage. The finger dermatoglyphic characters show a higher proportion of loops (53%) than whorls in both sexes, with a pattern intensity index of 14 (Chakravartti and Mukherjee 1962). The Mizos exhibit a higher incidence of blood group A (44.7%) than B in the ABO blood group system (Mitra 1936). For the present study, Mizo blood samples (104) were collected from the state of Mizoram.
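The sex ratio of 1022 quoted for the Mizo is consistent with the usual census convention of females per 1,000 males. A minimal sketch, assuming that convention (the text does not state the formula) and using a hypothetical helper name:

```python
def sex_ratio(females, males):
    """Females per 1,000 males, rounded to the nearest whole number."""
    return round(1000 * females / males)


# 2011 census figures for the Mizo as quoted in the text.
mizo_ratio = sex_ratio(females=371_513, males=363_397)
print(mizo_ratio)
```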

Paternal Lineage (Y Chromosomal Haplogroups) The Y chromosome haplogroup of the 23 Mizo individuals were all assigned to two haplogroups (Fig. 7.130) through screening the Y-SNPs. Haplogroup O2A* has the highest frequency (61%) followed by haplogroup O3A3C1* (39%). Haplogroup O2A* is an Austro-Asian gene. Haplogroup O3A3C1* is mainly distributed in South East Asia.


Fig. 7.128 mtDNA phylogenetic tree of N-haplogroup among the Melacheri

Mullu Kurumba They are a group of the Kurumba, mainly concentrated in the Nilgiri district of Tamil Nadu. They speak a dialect of Kannada among themselves and use the Tamil language and the Tamil script with others. They have exogamous kulams, namely Vadakku, Villippa, Kathipa and Vengada. The Mullu Kurumba consider the tribal groups of Paniyan, Kattunyakan and Urali Kurumba to be lower than them in social status and do not accept food from them. Though hunting and agriculture were the traditional economic activities, at present most of them have lost their land, and agricultural labour has become their main source of livelihood. The thali is the symbol of marriage, and the traditional bride price, or kaanon, is paid by the groom’s party, with which the bride can procure a set of silver jewellery. The remarriage of widows and divorcees is allowed. They are Hindu by religion, and the DeivaPerai is the centre for the celebration and observation of all the rites and rituals connected with their life cycle. They celebrate Hindu festivals like Sanskriti, Onam etc. They carry the sickle cell gene, with genotype frequencies of 37.62% (AS) and 1% (SS). The A gene frequency is 80.20% and the S gene frequency is 19.80% (Sastry 1990). The ABO blood group frequency among this population is A = 41.0%, B = 17.2%, AB = 6.9%, O = 34.5% (Saha

Table 7.29 Molecular Diversity Indices among the Melacheri
Mean number of pairwise differences: 9.63809 ± 4.601
Nucleotide diversity: 0.000751 ± 0.0004
Sum of squared deviation: SSD = 0.028658, P = 0.210
Harpending’s raggedness index: HRI = 0.04544, P = 0.100
Theta: Pi = 9.63809 ± 5.136; S = 15.28737 ± 5.425
Tajima’s D: D = −1.477, P = 0.037
Fu’s Fs: Fs = −13.8493, P = 0.000


Fig. 7.129 Mismatch distribution of nucleotide differences of Melacheri population

Fig. 7.130 Y chromosome haplogroups of Mizo

et al. 1976). Kirk et al. (1962) found a 100% gene frequency of the Tfc variant of transferrin among the Mullu Kurumba community of Tamil Nadu. For the present study, Mullu Kurumba blood samples were collected from the state of Tamil Nadu.
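The sickle cell gene frequencies quoted for the Mullu Kurumba (A about 80.20, S about 19.80) can be recovered from the genotype frequencies by allele counting: each SS individual carries two S alleles and each AS individual carries one. The sketch below is illustrative, with a hypothetical function name; it uses the rounded percentages printed in the text, so the result differs from the published figure in the second decimal.

```python
def allele_frequencies(het_pct, hom_variant_pct):
    """Allele counting: variant allele % = homozygote % + half the heterozygote %."""
    variant = hom_variant_pct + het_pct / 2
    normal = 100 - variant
    return normal, variant


# Genotype frequencies as quoted: AS = 37.62%, SS = 1%.
a_freq, s_freq = allele_frequencies(het_pct=37.62, hom_variant_pct=1.0)
print(round(a_freq, 2), round(s_freq, 2))
```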

Paternal Lineage (Y Chromosomal Haplogroups) The Y chromosomes of the 38 individuals were assigned to six haplogroups (Fig. 7.131) by screening Y-SNPs. Haplogroup L1 has the highest frequency (40%), followed by haplogroup F* (21%), haplogroup H1a* (16%), haplogroups R1a1* and R2 (11% each) and haplogroup C5* (3%). Haplogroup L1 is mostly found among Dravidian communities of India, and R1a1* is widely distributed in Eurasia. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H1a* is distributed among the Dravidian and Central Indian communities. Haplogroup R2 is distributed among Central Asian lineages. Haplogroup J2 is associated with Neolithic migrations from Central Asia. Haplogroup C5* is found at high frequency among Australian aborigines.

Fig. 7.131 Y chromosome haplogroups of Mullu Kurumba

Maternal Lineage (mtDNA Haplogroups) Mitochondrial genomes of 64 individuals were scanned for maternal lineages in the population. The maternal lineages comprise 86% Asian macrohaplogroup M and 14% European macrohaplogroup N; samples were selected for complete sequencing based on HVR I motifs. Two maternal lineages belonging to haplogroup M were found in the Mullu Kurumba population (Fig. 7.132). Haplogroup M35 has the highest frequency (96%), followed by a new haplogroup (4%). Haplogroup M35 is distributed in South Asia. The founder age of the Mullu Kurumba population was 32 ± 8 ky. The nine Mullu Kurumba individuals carrying macrohaplogroup N were all assigned to a single haplogroup, U1 (100%) (Fig. 7.133).

Molecular Diversity Nucleotide diversity measures the degree of polymorphism within a population. Nucleotide diversity can be calculated by examining the DNA sequences


Fig. 7.132 mtDNA phylogenetic tree of M-haplogroup among the Mullu Kurumba


Fig. 7.133 mtDNA phylogenetic tree of N-haplogroup among the Mullu Kurumba

directly. Nucleotide diversity in the Mullu Kurumba population is 0.000923 ± 0.000468. Nucleotide diversity is a measure of genetic variation. It is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. These statistics may be used to monitor diversity within or between populations and to determine evolutionary relationships (Table 7.30). Figure 7.134 shows the mismatch distribution of the Mullu Kurumba population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Mullu Kurumba population is interpreted as indicating recent expansion. The sum of squared deviations (0.1188) and Harpending’s raggedness index (r = 0.006828) are also taken to support recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. Among the Mullu Kurumba, a Tajima’s D of −1.2518 indicates that the population is expanding. A negative value of Fu’s Fs (−24.360) is also evidence of recent population expansion.

Munda The Munda tribe mainly inhabits the state of Jharkhand, although they are well spread in the

Table 7.30 Molecular Diversity Indices among the Mullu Kurumba
Mean number of pairwise differences: 12.6537 ± 5.781
Nucleotide diversity: 0.00092 ± 0.0004
Sum of squared deviation: SSD = 0.11887, P = 0.000
Harpending’s raggedness index: HRI = 0.00682, P = 1.0000
Theta: Pi = 12.6537 ± 6.410; S = 19.88043 ± 5.546
Tajima’s D: D = −1.2518, P = 0.089
Fu’s Fs: Fs = −24.360, P = 0.000


Fig. 7.134 Mismatch distribution of nucleotide differences of Mullu Kurumba population

states of West Bengal, Chhattisgarh, Orissa and Bihar. Munda generally means headman of the village. The Mundas speak the Mundari language, which belongs to the Austro-Asiatic family of languages. They have short curly hair. Common surnames used among the Mundas include Topno, Barla, Aind, Hemrom, Guria, Herenge, Surin, Horo, Sanga and Samad. The Munda people are likely descended from Austroasiatic migrants from Southeast Asia. They were formerly wanderers and hunters occupying India’s tribal belt and have since become settled agriculturists. Most of them do not have land of their own and depend largely on labour in the fields to earn their livelihood. The Munda people excel in basket work and weaving. Mundari folk legends refer to the beliefs and practices strictly indigenous to the Munda people, including ancestral worship, worship of indigenous gods, and local festivals (Singh 1994). For the present study, Munda blood samples were collected from the state of Jharkhand. The Mundas are mostly short statured with a long head tending towards a round shape. They show mesoprosopic faces, with broad and long faces in equal proportion. The nasal shape is short and wide. In the ABO blood group system, the frequency of O is highest (59%), followed by gene A (28%) and then by B (20%) (Kumar and Bhattacharjee 1976). The frequency of haemoglobin E is 5%.

Paternal Lineage (Y Chromosomal Haplogroups) The screening of 49 Munda individuals for Y-SNPs revealed the presence of five haplogroups: O2a*, H*, H1a*, R1a1 and J2a*. O2a* is the most frequent (76%), followed by H* and H1a* (8% each), R1a1 (6%) and J2a* (2%) (Fig. 7.135). Haplogroup H1a* is predominant among the Dravidian and Central Indian communities and is considered to be a major indigenous Indian haplogroup.

Maternal Lineage (mtDNA Haplogroups) Mitochondrial genomes of 103 Munda individuals were scanned for maternal lineages in the population. The maternal lineages comprise 81% Asian macro-haplogroup M and 19% European macro-haplogroup N; samples were selected for complete sequencing based on HVR I motifs. A total


Fig. 7.135 Y chromosome haplogroups of Munda

of 12 maternal lineages belonging to haplogroup M were found among the Munda population. M40 predominates (15%), followed by M2, M45 and M58 (11% each), M5, M6 and M43 (9% each), M38 and M39 (6% each) and M18, M34 and M35 (3% each) (Fig. 7.136). The founder age of the Munda was estimated to be 35 ± 11 ky based on the high preponderance of the M40 maternal lineage. The Munda individuals carrying macrohaplogroup N were assigned to three haplogroups: R7 (63%), followed by R6 (25%) and R8 (13%) (Fig. 7.137).

Molecular Diversity Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Munda population is 0.001900 ± 0.000944. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. These statistics may be used to monitor diversity within or between populations and to determine evolutionary relationships (Table 7.31).

Figure 7.138 shows the mismatch distribution of the Munda population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Munda population, which indicates a recent expansion. The small sum of squared deviations (0.00697003) and Harpending’s raggedness index (r = 0.01047949) are also consistent with recent demographic expansion of the Munda. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. Among the Munda, a Tajima’s D of −1.86518 indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (−14.32218) is also evidence of a recent population expansion of the Munda community.
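The mismatch distribution and raggedness index discussed here can be sketched in a few lines. The example below builds the distribution of pairwise differences for a toy alignment and computes Harpending's raggedness index as the sum of squared differences between neighbouring frequency classes. The sequences are invented and the function names are hypothetical; the study's own values were presumably produced by dedicated population-genetics software, which the text does not name.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    diffs = [sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)]
    counts = Counter(diffs)
    return [counts.get(i, 0) / len(diffs) for i in range(max(diffs) + 1)]


def raggedness(freqs):
    """Harpending's r: sum over i = 1..d+1 of (x_i - x_(i-1))^2, with x_(d+1) = 0."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy alignment (invented, not study data).
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
dist = mismatch_distribution(toy)
print(dist, raggedness(dist))

# A smooth single-peaked curve gives a lower r than a jagged multi-peaked one.
smooth = [0.1, 0.2, 0.4, 0.2, 0.1]
jagged = [0.4, 0.0, 0.3, 0.0, 0.3]
print(raggedness(smooth), raggedness(jagged))
```

Low raggedness therefore goes with smooth, unimodal curves, which is why the text reads small r values as support for expansion.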

Nayaka According to one legend, the Nayaka are the descendants of Rupakhati. According to the 2011 census, their population was 459,908, of which 232,965 were males and 226,943 were females (sex ratio = 974). They are


Fig. 7.136 mtDNA phylogenetic tree of M-haplogroup among the Munda

distributed in the Vadodara, Panch Mahal, Ahmedabad, Kutch and Saurashtra regions of Gujarat. Their dialect is ‘Naiki’, which belongs to the Dravidian family of languages. Their primary occupation is agriculture. Nayaka are also engaged as carpenters. They are monogamous and practise clan exogamy. They are patrilocal. A widow can remarry her deceased husband’s younger brother. Nayaka are mainly followers of Hinduism (Singh 1994). They record a moderate to high incidence of the sickle cell trait (8–12%). They have more or less equal frequencies of the A and B genes (A = 29.2%, B = 30.9%, AB = 6.4%, O = 33.3%) (Vyas et al. 1962). For the present study, Nayaka blood samples were collected from the state of Gujarat.


Fig. 7.137 mtDNA phylogenetic tree of N-haplogroup among the Munda

Paternal Lineage (Y Chromosomal Haplogroups) The Y chromosomes of the 29 Nayaka individuals were assigned to ten haplogroups (Fig. 7.139) by screening Y-SNPs. Haplogroup O2a has the highest frequency (24%), followed by haplogroup H2* (21%), haplogroup R1a1* (14%), haplogroup H1a* (14%), haplogroup H* (10%) and haplogroups J2b1, J2b2*, L1, P* and R2 (4% each). Haplogroup O2a is an Austro-Asiatic lineage found in South East Asia. Haplogroup L1 is mostly found among Dravidian communities of India, and R1a1* is widely distributed in Eurasia. Haplogroup H1a* is distributed among Dravidian and Central Indian communities. Haplogroup R2 is distributed among Central Asian lineages.

Nicobarese The Car Nicobarese are of Mongoloid origin. Bonnington (1932) described them as a branch of the Mon race. According to the 2011 census, the total number of Car Nicobarese was 22,886, of which 11,766 were males and 11,120 were females (sex ratio = 945), while 22,886 live in rural areas. The majority of them inhabit Car Nicobar Island. Nicobarese of different islands speak different dialects of the Nicobarese language, and these languages belong

Table 7.31 Molecular Diversity Indices among the Munda
Mean number of pairwise differences: 31.4789 ± 14.074
Nucleotide diversity: 0.00190 ± 0.0009
Sum of squared deviation: SSD = 0.00697, P = 0.040
Harpending’s raggedness index: HRI = 0.01047, P = 0.000
Theta: Pi = 31.47899 ± 15.646; S = 61.9201 ± 18.635
Tajima’s D: D = −1.865, P = 0.012
Fu’s Fs: Fs = −14.322, P = 0.000


Fig. 7.138 Mismatch distribution of nucleotide differences of Munda population

Fig. 7.139 Y chromosome haplogroups of Nayaka


to the Mon-Khmer Nicobarese sub-group of the Austro-Asiatic family of languages. Hindi is used for intergroup communication. The Nicobarese of Car Nicobar consider themselves superior to their counterparts on other islands. The Nicobarese are horticulturists and practise animal husbandry. They are endogamous, and a member of this community is free to marry anyone provided the two are not related consanguineally for at least two ascending generations. Monogamy is the rule of marriage. The Car Nicobarese are expert at weaving mats from pandanus leaf. The majority of them are Christians; some are Hindus and Muslims (Singh 1994). For the present study, Nicobarese blood samples were collected from the Andaman and Nicobar Islands. According to the anthropometric data, they are predominantly short to below medium in height with a fairly broad head shape. They have an excess of concave noses (Ganguly 1976). The ABO blood groups studied in Car Nicobar show a very high frequency of the O blood group (83.56%) and low frequencies of the A and B blood groups (Sarkar 1952).

Fig. 7.140 Y chromosome haplogroups of Nicobarese


Paternal Lineage (Y Chromosomal Haplogroups) The screening of 25 Nicobarese individuals for Y-SNPs revealed the presence of three haplogroups: O2a*, H1a* and R1a1*. O2a* is the most frequent (92%), followed by H1a* and R1a1* (4% each) (Fig. 7.140). O2a* is considered to be the major South East Asian lineage. H1a is considered a haplogroup characteristic of the Dravidian communities.

Maternal Lineage (mtDNA Haplogroups) The whole mitochondrial genomes of 18 Nicobarese individuals were sequenced to trace the maternal lineages in the population. The maternal lineages belong entirely (100%) to European macro-haplogroup N; samples were selected for complete sequencing based on HVR I motifs. The Nicobarese individuals were all assigned to three N haplogroups, namely B4’5 (61%), followed by F1 (33%) and R22 (6%) (Fig. 7.141).


Fig. 7.141 mtDNA phylogenetic tree of N-haplogroup among the Nicobarese


Molecular Diversity Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Nicobarese population is 0.001305 ± 0.000678. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. These statistics may be used to monitor diversity within or between populations and to determine evolutionary relationships (Table 7.32). Figure 7.142 shows the mismatch distribution of the Nicobarese population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Nicobarese population is slightly ragged and bimodal, which is interpreted here as indicating recent expansion. The sum of squared deviations (0.05698153) and Harpending’s raggedness index (r = 0.04260381) are also taken to support recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. Among the Nicobarese, Tajima’s D is 0.1828 (P = 0.614), a value close to zero that is not significant and so does not by itself indicate expansion. A negative value of Fu’s Fs (−5.111), however, is evidence of recent population expansion.
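The sign of Tajima's D can be checked against the two theta estimates printed in the diversity tables, because the numerator of D is the difference between the estimate based on pairwise differences (Theta Pi) and the estimate based on segregating sites (Theta S). The sketch below uses the theta values from Tables 7.29 to 7.34; the identification of the tabulated "S" column with the segregating-sites estimate is an assumption, and the function name is hypothetical.

```python
# (Theta Pi, Theta S) as printed in Tables 7.29-7.34.
THETAS = {
    "Melacheri": (9.63809, 15.28737),
    "Mullu Kurumba": (12.6537, 19.88043),
    "Munda": (31.47899, 61.9201),
    "Nicobarese": (21.60294, 20.7055),
    "Nihal": (26.294, 44.555),
    "Padhar": (27.57003, 55.2109),
}


def expected_sign_of_d(theta_pi, theta_s):
    """Sign of Tajima's D implied by the numerator theta_pi - theta_s."""
    diff = theta_pi - theta_s
    if diff < 0:
        return "negative"
    if diff > 0:
        return "positive"
    return "zero"


signs = {name: expected_sign_of_d(*pair) for name, pair in THETAS.items()}
for name, sign in signs.items():
    print(f"{name}: {sign}")
```

Only the Nicobarese have Theta Pi above Theta S, which agrees with the small positive D (0.1828) listed for them; the other five populations should all have negative D values.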

Nihal The term Nahal in the Nahali language means a tiger or a lion, and it probably has the same significance as the term Singh (tiger), which the Rajput and members of some other communities affix to their personal names. Russell and Hiralal (1975) mentioned that the Nihal or Nahal are a mixture of the Bhil and Korku. They were also identified with the ancient community called Nahalka, mentioned in the Padma Purana as an


offshoot of the Nishada. In Maharashtra the Nihals are concentrated in Melghat hills of the Amravati District, where they are surrounded by other major tribal communities, like the Bhil, Korku and Gond. The Nihals used to speak Nihali language which has been described as a mixed language having Austric speech allied to Korku as its base with borrowed elements from Dravidian and Indo-Aryan Languages. They mainly subsist on agricultural labour both on a daily and an annual payment basis. Their mode of acquiring mates is by negotiation. Marriages by elopement, intrusion and capture are also in vogue. Monogamy is the prevailing practice and payment of bride price is obligatory. Divorce is permitted. The Nihals are Hindu by religion. Their major festivals are Rakhi, Holi, Jiroti, Nagpanchami and Diwali (Singh 1994). The Nihal are mostly short-statured with a long head tending towards a round shape and possess a short and broad nose (Weninger 1952). For the present study Nihal blood samples (107) were collected from the state of Maharashtra.

Paternal Lineage (Y Chromosomal Haplogroups) The Y chromosome haplogroup of the 23 Nihal individuals were all assigned to four haplogroups (Fig. 7.143) through screening the Y-SNPs. Haplogroup H* has the highest frequency (74%) followed by haplogroup H1a* (13%), H2* (9%) and haplogroup K* (4%). Haplogroup H1a was found at a higher frequency among the Dravidian and Central Indian communities, representing the major indigenous Indian haplogroup.

Maternal Lineage (mtDNA Haplogroups) Mitochondrial genomes of 112 Nihal individuals were scanned for maternal lineages in the population. Nihal maternal lineages comprise 67% Asian macro-haplogroup M and 33% European macro-haplogroup N. A total of 13 maternal lineages belonging to haplogroup M were found in the Nihal population. All the M lineages: M2

Table 7.32 Molecular Diversity Indices among the Nicobarese
Mean number of pairwise differences: 21.6029 ± 10.026
Nucleotide diversity: 0.00130 ± 0.0006
Sum of squared deviation: SSD = 0.05698, P = 0.000
Harpending’s raggedness index: HRI = 0.04260, P = 0.1100
Theta: Pi = 21.60294 ± 11.221; S = 20.7055 ± 7.587
Tajima’s D: D = 0.1828, P = 0.614
Fu’s Fs: Fs = −5.111, P = 0.023


Fig. 7.142 Mismatch distribution of nucleotide differences of Nicobarese population

Fig. 7.143 Y chromosome haplogroups of Nihal

(23%), M3 (19%), M30 and M37 (10% each), M5, M40 and M64 (6% each) and M33, M34, M35, M57 and M58 (3% each) (Fig. 7.144) are autochthonous to India. The Nihal population carries the M2 lineage, whose founder age was 64 ± 13 ky.

The Nihal individuals carrying macrohaplogroup N were assigned to seven haplogroups: R30 (24%), R5 (20%), K2 and R8 (13% each), and B4’5 and U7 (7% each) (Fig. 7.145).


Fig. 7.144 mtDNA phylogenetic tree of M-haplogroup among the Nihal


Fig. 7.145 mtDNA phylogenetic tree of N-haplogroup among the Nihal

Molecular Diversity Molecular diversity indices are shown in Table 7.33. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Nihal population is 0.001587 ± 0.000796. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. These statistics may be used to monitor diversity within or between populations and to determine evolutionary relationships.

Figure 7.146 shows the mismatch distribution of the Nihal population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Nihal population, which indicates a recent expansion. The small sum of squared deviations (0.01049138) and Harpending’s raggedness index (r = 0.01688981) are also consistent with recent demographic expansion of the Nihal. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. Among the Nihal, a Tajima’s D of −1.57561 indicates that the population is under expansion,

Table 7.33 Molecular Diversity Indices among the Nihal
Mean number of pairwise differences: 26.294 ± 11.84
Nucleotide diversity: 0.001587 ± 0.000796
Sum of squared deviation: SSD = 0.01049138, P = 0.110
Harpending’s raggedness index: HRI = 0.01688981, P = 0.020
Theta: Pi = 26.294 ± 13.18; S = 44.555 ± 13.86
Tajima’s D: D = −1.575, P = 0.037
Fu’s Fs: Fs = −13.232, P = 0.00000


Fig. 7.146 Mismatch distributions of nucleotide differences of Nihal population

meaning that it is recovering from a bottleneck. A negative value of Fu’s Fs (−13.23232) is also evidence of a recent population expansion of the Nihal community.

Nishi The Nishi tribe is one of the principal inhabitants of Arunachal Pradesh in North-Eastern India. Nyi refers to “a man” and the word shi denotes “a being”, which together mean a civilized human being (Ramya and Ramjuk 2018). They are spread across six districts of Arunachal Pradesh, viz. Papum Pare, part of Lower Subansiri, Kurung Kumey, East Kameng, parts of Upper Subansiri and the recently created district Kra Daadi, and are also found in the Sonitpur and North Lakhimpur districts of the neighbouring state of Assam. Their population of around 300,000 makes them the most populous tribe of Arunachal Pradesh, closely followed by the combined tribes of the Adis and the Galos (Abors), who were the most populous in the 2001 census. The Nishi language belongs to the Sino-Tibetan family; however, its origin is disputed. Polygyny is prevalent among the Nishi. It signifies one’s social status and economic stability and also proves useful during hard times such as clan wars, and in communal hunting and other social activities. This institution, however, is being challenged by the trend towards modernization and by the spread of Christianity. They trace their descent patrilineally and are divided into several clans. Before the modern economy reached them, they used the barter system (Ramya 2012). They greatly valued generalized reciprocity and also balanced reciprocity in their economic system. The Nishi people, traditionally dependent on the forest, also include fruits, roots, bamboo shoots, wild animals, fish and wild leafy vegetables in their diet. Most Nishis were converted to Christianity by Christian missionaries in the 1970s, particularly in the Papum Pare region, and Christianity is the major religion among the Nishis. Small groups of Hindus also exist among the Nishi (Singh 1994). Physically the Nishis are of short stature with a mesocephalic head form. In the ABO blood group system, the incidence of blood group ‘O’ (33.88%) is highest among them, followed by blood group A (32.23%). Dark brown head hair is their unique character


Fig. 7.147 Y chromosome haplogroups of Nishi

(Goswami and Das 1990). For the present study Nishi blood samples (115) were collected from the state of Arunachal Pradesh.

Paternal Lineage (Y Chromosomal Haplogroups) The Y chromosome haplogroup of the 15 individuals were all assigned to two haplogroups (Fig. 7.147) through the screening of Y-SNPs. Haplogroup O3a3C1* has the highest frequency (93%) followed by haplogroup R1a1* (7%). Haplogroup O3a3C1* is mainly distributed in South East Asia. Haplogroup R1a1* is widely distributed in Eurasia.

Padhar The Padhar are distributed in six villages of Surendranagar and four villages of Ahmedabad district of Gujarat. The villages occupied by the Padhar are Shahpur, Sahiyal, Dharji, Devadthalnamkatchi, Ranaghat, Ralal, Parali, Paranala, Godi and Anadpur. They belong to the Scheduled Tribe category and are treated as one of the five primitive tribal groups in this state. The Padhar are Hindu. According to the 2011 census, their population was 30,932, of which 15,911 were males and 15,021 were females (sex ratio = 944). A total of 29,746 live in rural areas. They speak Gujarati, which belongs to the Indo-Aryan family of languages (Singh 1994). A dermatoglyphic study of the Padhar by Krishan (1987) observed that loops (58.80%) were preponderant over whorls (36%), with a pattern intensity index (PII) of 13.38. They are of medium height with a platyrrhine nose and mesocephalic head. For the present study, Padhar blood samples were collected from the state of Gujarat.

Paternal Lineage (Y Chromosomal Haplogroups) The Y chromosome haplogroup of the 47 Padhar individuals were all assigned to eight haplogroups (Fig. 7.148) through the screening of Y-SNPs.


Fig. 7.148 Y chromosome haplogroups of Padhar

Haplogroup R1A1* has the highest frequency (43%) followed by haplogroup L1 (15%), haplogroup H1b (13%), haplogroup J2B1 (9%), haplogroup H2* and R2 (each 6%), haplogroup H* and J2b2* (each 4%). Haplogroup R1A1* is widely distributed in Eurasia and haplogroup L1 is mostly found among Dravidian communities of India. Haplogroup H is mainly found in south Asia. Haplogroup J2B1 is distributed in Italy, Czechoslovakia and Germany. Haplogroup R2 is distributed among the Central Asian lineage.

Maternal Lineage (mtDNA Haplogroups) Mitochondrial genomes of 85 individuals were scanned for maternal lineages in the population. The maternal lineages comprise 80% Asian macrohaplogroup M and 20% European macrohaplogroup N; samples were selected for complete sequencing based on HVR I motifs. A total of four maternal lineages belonging to haplogroup M were found in the Padhar population. Haplogroup M45 has the highest frequency (40%), followed by M39 (38%), M3 (21%) and M6 (2%) (Fig. 7.149). Haplogroup M4 is distributed in South Asia, with a low concentration in Eastern Saudi Arabia. Haplogroup M39 is distributed among the populations of South Asia; M3 is also distributed in South Asia, with the highest concentration in West and North-West India. Haplogroup M6 is distributed in South Asia, with the highest concentration in mid-Eastern India and Kashmir. The founder age of the Padhar population was 33 ± 5 ky. The 17 Padhar individuals carrying macrohaplogroup N were assigned to four haplogroups: U7 has the highest frequency (59%), followed by R5 (24%), R13 (12%) and HV (6%) (Fig. 7.150). The coalescence age of U7 was calculated to be 32,000 ± 5500 years.

Molecular Diversity Nucleotide diversity measures the degree of polymorphism within a population. Nucleotide diversity can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Padhar population is 0.001664 ± 0.000815. Nucleotide diversity is a measure of genetic variation. It is usually associated with other statistical measures of


Fig. 7.149 mtDNA phylogenetic tree of M-haplogroup among the Padhar


Fig. 7.150 mtDNA phylogenetic tree of N-haplogroup among the Padhar

population diversity, and is similar to the expected heterozygosity. These statistics may be used to monitor diversity within or between populations and to determine evolutionary relationships (Table 7.34). Figure 7.151 shows the mismatch distribution of the Padhar population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Padhar population is slightly ragged and multimodal, which is interpreted here as indicating recent expansion. The small sum of squared deviations (0.00321) and Harpending’s raggedness index (r = 0.001755) are also consistent with recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. A large number of rare mutations gives a negative Tajima’s D. Among the Padhar, a Tajima’s D of −1.704 indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (−23.9862) is also evidence of recent population expansion.
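The nucleotide diversity reported for the Padhar is consistent with dividing the mean number of pairwise differences (27.57003, Table 7.34) by the length of the complete mitochondrial genome. The sketch below assumes a sequence length of 16,569 bp, the length of the revised Cambridge Reference Sequence; the text does not state which length was used, so this is an illustration and not a description of the authors' procedure. The function name is hypothetical.

```python
MTDNA_LENGTH_BP = 16_569  # assumption: length of the rCRS mitochondrial genome


def nucleotide_diversity(mean_pairwise_differences, sequence_length=MTDNA_LENGTH_BP):
    """Per-site nucleotide diversity: mean pairwise differences / sequence length."""
    return mean_pairwise_differences / sequence_length


# Mean number of pairwise differences for the Padhar, from Table 7.34.
padhar_pi = nucleotide_diversity(27.57003)
print(round(padhar_pi, 6))
```

Under this assumption the result rounds to the 0.001664 quoted in the text.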

Table 7.34 Molecular Diversity Indices among the Padhar
Mean number of pairwise differences: 27.57003 ± 12.183
Nucleotide diversity: 0.00166 ± 0.000815
Sum of squared deviation: SSD = 0.00321, P = 0.590
Harpending’s raggedness index: HRI = 0.00175, P = 0.770
Theta: Pi = 27.57003 ± 13.496; S = 55.2109 ± 13.867
Tajima’s D: D = −1.70491, P = 0.013
Fu’s Fs: Fs = −23.986, P = 0.001


Fig. 7.151 Mismatch distribution of nucleotide differences of Padhar population

Paite

The Paite are a people dwelling in Bangladesh, Burma and India, and they are known by different names in each country. In India they are recognized as ‘Paite’, a Scheduled Tribe in the states of Manipur and Mizoram. Their main concentration is in Manipur, where 55,542 Paite speakers were recorded in the 2011 census, of whom 27,309 were males and 28,233 were females (sex ratio = 1034). Earlier they followed a tribal religion, but a majority of them have now converted to Christianity. They also worship their deity, Pathian. They speak the Paite language, which belongs to the Kuki-Chin group of the Tibeto-Burman family. The Paite are mainly agriculturists. Finger patterns among the Paite show a preponderance of whorls (51.31%) over loops (46.93%), with arches at about 1.68%, as studied by Chakravarti and Mukherjee in 1962. The Pattern Intensity Index (PII) is 12.95; high values of PII characterize the Tibeto-Burman linguistic groups. Studies among the tribes of Manipur by Buchi in 1959 showed 70.3% tasters and 29.7% non-tasters, with a T gene frequency of 45.48 and a t gene frequency of 54.52.

HbS. Saha et al. studied the Paite in 1976 and found the sickle cell trait (HbAS) in 20.9% of them, with an S gene frequency of 10.46. ABO system: among the Paite of Manipur the O gene has the highest frequency (59.00), followed by the ‘A’ gene and then the ‘B’ gene, as studied by Saha in 1973. Rh(d) blood group system: the Rh-negative gene, Rh(d), shows considerable variation in India, with high frequencies among Indo-Aryan and a few Austro-Asiatic linguistic communities, and average to zero frequencies among North East, Austro-Asiatic and Dravidian communities. The d gene is totally absent in this population. For the present study, Paite blood samples were collected from the state of Manipur.
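The gene frequencies quoted above can be approximately reproduced from the phenotype percentages. The following Python sketch is only an illustration: it assumes Hardy-Weinberg proportions, that non-tasting is the recessive phenotype (tt), and that no HbSS homozygotes were present in the sample. The function names are hypothetical and are not taken from the cited studies.

```python
import math


def recessive_allele_frequency(recessive_phenotype_fraction):
    """Under Hardy-Weinberg, the recessive phenotype has frequency q^2, so q = sqrt(q^2)."""
    if not 0.0 <= recessive_phenotype_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return math.sqrt(recessive_phenotype_fraction)


def allele_frequency_from_carriers(carrier_fraction, homozygote_fraction=0.0):
    """Gene counting: each carrier holds one copy, each homozygote two copies."""
    return homozygote_fraction + carrier_fraction / 2.0


# Taster data: 29.7% non-tasters (assumed genotype tt)
t_freq = recessive_allele_frequency(0.297)
T_freq = 1.0 - t_freq

# Sickle cell data: 20.9% HbAS carriers, no HbSS assumed
s_freq = allele_frequency_from_carriers(0.209)

print(f"t = {100 * t_freq:.2f}%, T = {100 * T_freq:.2f}%, S = {100 * s_freq:.2f}%")
```

Under these assumptions the sketch gives values close to the reported 54.52, 45.48 and 10.46; the small differences presumably reflect rounding of the published percentages.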

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 20 Paite individuals were assigned to three haplogroups (Fig. 7.152) through the screening of Y-SNPs. Haplogroup O2A* has the highest frequency (90%), followed by haplogroup O3A3C1* (5%) and haplogroup R1A1* (5%). Haplogroup O2A* is an Austro-Asiatic lineage distributed in South East Asia. Haplogroup O3A3C1* is also distributed in South East Asia. Haplogroup R1A1* is widely distributed in Eurasia.

Fig. 7.152 Y chromosome haplogroups of Paite

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 23 Paite individuals were scanned for maternal lineages in the population. The maternal lineages, selected for complete sequencing on the basis of HVR I motifs, comprise 78% Asian macro-haplogroup M and 22% European macro-haplogroup N. A total of 12 maternal lineages belonging to haplogroup M were found in the Paite population. Haplogroup D has the highest frequency (33%), followed by M13 (11%) and M8CZ, M10, M31, M33, M38, M46, M49, M62, M74 and M76 (6% each) (Fig. 7.153). Haplogroup M13 is distributed in Tibet, Mongolia and Siberia. Haplogroup M8CZ is distributed among Eurasian populations. M10 is distributed in East Asia, South East Asia, Central Asia, Southern Siberia and Belarus, while M31 is found among the Onge of the Andaman Islands. Haplogroup M33 is found in South Asia, Belarus and Southern China. Haplogroup M49 has been found among the ancient specimens in the Euphrates valley. The founder age of the Paite population was 45 ± 15 ky. The N-haplogroup lineages of the five Paite individuals were assigned to four haplogroups. The frequencies are B4’5 (40%), followed by K6 (20%), K (20%) and R22 (20%) (Fig. 7.154).

Fig. 7.153 mtDNA phylogenetic tree of M-haplogroup among the Paite

Molecular Diversity

Molecular diversity indices are shown in Table 7.35. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Paite population is 0.002364 ± 0.001191. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships. Figure 7.155 shows the mismatch distribution of the Paite population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Paite population, which indicates a recent expansion. The small sum of squared deviations (0.00454370) and Harpending’s raggedness index (r) (0.00796763) also confirm that the Paite have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Paite, the value of -2.08756 indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (-5.90917) is further evidence of a recent population expansion of the Paite community.
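The comparison that Tajima’s D makes can be illustrated with a short Python sketch. It follows the standard formulation of Tajima (1989) and is not the software used in this study; the toy sequences and function names are hypothetical. D contrasts the mean number of pairwise differences (Theta Pi) with the number of segregating sites scaled by a sample-size constant (Theta S), so an excess of rare variants makes D negative.

```python
from itertools import combinations
from math import sqrt


def pairwise_differences(seqs):
    """Number of differing sites for every pair of aligned, equal-length sequences."""
    return [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)]


def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D for a list of aligned sequences (needs n >= 4 and at least one variable site)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("Tajima's D is undefined for n < 4 or without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    diffs = pairwise_differences(seqs)
    theta_pi = sum(diffs) / len(diffs)   # mean number of pairwise differences
    theta_s = s / a1                     # Watterson's estimator from segregating sites
    return (theta_pi - theta_s) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: every variant is carried by a single sequence (rare mutations)
rare = ["AAAA", "TAAA", "ATAA", "AATA"]
# Toy data: every variant is carried by half of the sequences
common = ["AA", "AA", "TT", "TT"]

print(tajimas_d(rare), tajimas_d(common))
```

The first toy sample gives a negative D and the second a positive D. The same logic applies to Table 7.35, where Theta Pi (35.90909) is smaller than Theta S (73.96743), which is consistent with a negative Tajima’s D for the Paite.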

Fig. 7.154 mtDNA phylogenetic tree of N-haplogroup among the Paite

Paniyan

The term ‘Paniya’ indicates that they earn their livelihood from labour, as ‘Pani’ in Malayalam means ‘labour’. Thus the word ‘Paniya’ literally means ‘labourer’ or ‘worker’. Traditionally they were engaged in bonded labour (Kundalpani). Nowadays the economy of the Paniyas is closely tied to that of the non-tribal settlers, who control their occupational pattern and thereby influence their economic conditions. Marriage to one’s father’s sister’s daughter (FSD) or mother’s sister’s daughter (MSD) is preferred. Acquiring spouses through negotiation is common, but instances of marriage by service and elopement also exist (Singh 1994). The sex ratio of this population is 936 (Census of India 2011). They show the highest frequency of blood group A found among Indian populations (gene frequency 42–48%) and very low values of B gene frequency (7–8%) in the ABO blood group system (Das and Ghosh 1954). The linguistic affinity of the Kattupaniyan with the Paniyan tribe is quite evident: they speak the same language and use the same kinship terms. The Paniyan tribe is concentrated mainly in Wayanad district, with a spillover into the adjoining Kannur, Kozhikode and Malappuram districts. For the present study, Paniyan blood samples were collected from the state of Kerala.

Table 7.35 Molecular Diversity Indices among the Paite

Index                                  Value                   P
Mean number of pairwise differences    35.90909 ± 16.223
Nucleotide diversity                   0.00236 ± 0.001191
Sum of squared deviation (SSD)         0.00454                 0.570
Harpending’s raggedness index (HRI)    0.00796                 0.58000
Theta (Pi)                             35.90909 ± 18.092
Theta (S)                              73.96743 ± 24.355
Tajima’s D                             -2.08756                0.004
Fu’s Fs                                -5.9091                 0.016

Fig. 7.155 Mismatch distribution of nucleotide differences of Paite population

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 41 Paniyan individuals were assigned to four haplogroups (Fig. 7.156) through the screening of Y-SNPs. Haplogroup F* has the highest frequency (66%), followed by haplogroup C5* (24%), haplogroup R1A1* (7%) and haplogroup H* (2%). Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup H* is mainly found in South Asia (India, Sri Lanka, Nepal and Pakistan), with a lower frequency in Afghanistan. Haplogroup C5 is found at high frequency among the Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, Russia, Polynesia and Australia, and occurs at moderate frequency in the Korean peninsula and Manchuria. It is also found at high frequency in modern Indian populations.

Fig. 7.156 Y chromosome haplogroups of Paniyan


Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 86 Paniyan individuals were scanned for maternal lineages in the population. The maternal lineages, selected for complete sequencing on the basis of HVR I motifs, comprise 81% Asian macro-haplogroup M and 19% European macro-haplogroup N. A total of six maternal lineages belonging to haplogroup M were found in the Paniyan population. Haplogroup M3 has the highest frequency (56%), followed by M2 (30%), M36 (7%), M35 (4%), and M6 and M52 (1% each) (Fig. 7.157). Haplogroup M3 is distributed in South Asia, with its highest concentration in West and North-West India. Haplogroup M2 is found in South East India and Bangladesh. Haplogroups M35 and M6 are both distributed in South Asia. The founder age of the Paniyan population was 23 ± 6 ky. The N-haplogroup lineages of the 16 Paniyan individuals were assigned to four haplogroups. The highest frequency was for R (50%), followed by U4 (38%), R30 (6%) and R6 (6%) (Fig. 7.158).

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Paniyan population is 0.001361 ± 0.000671. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.36). Figure 7.159 shows the mismatch distribution of the Paniyan population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Paniyan population is somewhat ragged and often multimodal, and is taken to indicate a recent expansion. The small sum of squared deviations (0.01824453) and Harpending’s raggedness index (r) (0.009577) also confirm that the population has undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. Among the Paniyan, the value of -1.0174 indicates that the population is expanding and has scope for further increase. A negative value of Fu’s Fs (-24.00145) is further evidence of a recent population expansion.
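The relation between the mean number of pairwise differences and nucleotide diversity can be shown with a small worked example. The Python sketch below is illustrative only: it assumes that the per-site diversity is the mean pairwise difference divided by the sequence length, and that the complete mitochondrial genome length is 16,569 bp (the length of the rCRS). Neither the function names nor the divisor are stated in this chapter.

```python
from itertools import combinations

MTDNA_LENGTH = 16569  # assumption: length of the revised Cambridge Reference Sequence


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    diffs = [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)]
    return sum(diffs) / len(diffs)


def nucleotide_diversity(mean_differences, n_sites):
    """Per-site nucleotide diversity: mean pairwise differences divided by sequence length."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return mean_differences / n_sites


# Toy alignment of four sequences with three singleton variants
toy_mean = mean_pairwise_differences(["AAAA", "TAAA", "ATAA", "AATA"])

# Paniyan value reported in Table 7.36 (mean number of pairwise differences)
pi_paniyan = nucleotide_diversity(22.5543, MTDNA_LENGTH)
print(f"toy mean = {toy_mean}, Paniyan nucleotide diversity = {pi_paniyan:.6f}")
```

Under these assumptions the Paniyan figure comes out close to the reported 0.001361, which suggests, without confirming, that the published indices were computed per site over the whole mitochondrial genome.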

Pauri Bhuinya

The name Bhuya or Bhuyan is derived from the Sanskrit word Bhumi, meaning land. They consider themselves to be the children and owners of the land and hence are known as Bhuyan. The Paudi Bhuyan are a section of the Bhuyan. They are distributed in Bhuiyapir in the Keonjhargarh district, the Nagira hills in Dhenkanal district and the Bondo hills in the Sundargarh district of Odisha. Their mother tongue is Odia, and the Odia language and script are used for intergroup communication. The Pauri Bhuinyans are divided into a number of lineages (khilli), which regulate marriage alliances. The Bhuyans are mainly cultivators and agricultural labourers. They practise shifting cultivation, called Toilachasa or poduchasa, on hilltops or slopes. They grow paddy, gingili, mustard, ginger, maize, jalli, ragi and other crops extensively. Among women, weaving mats from the wild date palm and making broomsticks are common crafts. Men generally know rope making, and a very few of them work as carpenters. Collection of forest products is the major occupation of the community. They observe community endogamy and lineage (khilli) exogamy. Cross-cousin marriage and junior sororate are reported. Adult marriage is practised. Elopement (dharipala) and capture (ghicha) are the popular methods of selecting a spouse, and marriage through negotiation (mangibibha) is now in practice too.

Fig. 7.157 mtDNA phylogenetic tree of M-haplogroup among the Paniyan [tree labels: Paniyan samples PNN01 to PNN98 with rCRS; clades M, M2a, M3a, M6b, M35a, M36c and M52a]

Fig. 7.158 mtDNA phylogenetic tree of N-haplogroup among the Paniyan

Divorce and remarriage are permitted with the consent of the Pradhan and dehuri. Widow remarriage is allowed, preferably with the husband’s younger brother. As per the 2011 census, the total population of Paudi Bhuyan in Odisha is 5788. As per the 1981 census, 99.93% of the Paudi Bhuyan population follow Hinduism, 0.04% follow Christianity and the rest follow their own tribal religion. The sex ratio for this population is calculated as 979. The literacy rate in Odisha has reached 73.45%, but for the Paudi Bhuyan it is only 19.24% (Action Aid data 2016). Hemoglobinopathies (7.9%) and G-6-PD deficiency (19.0%) were major public health problems in the Bhuyan tribe of Hemgiri block in Sundargarh district of Orissa. Both beta thalassemia trait (10.2%) and sickle cell disorders (4.1%) were common in the community. The frequency of the Rhesus negative blood group was very low (0.3%) among the Paudi Bhuyan of Lahunipara block in Sundargarh district of Orissa. The frequency of blood group O (23.0%) is lower than that of A (31.0%) and B (33.1%) in the Paudi Bhuyan (ICMR report 2003–2004). Three cases of a rare blood group, the Bombay (Oh) phenotype, were detected in the Bhuyan tribe of Sundargarh district in North-Western Orissa (2 out of 244 Khandayat Bhuyan and 1 out of 379 Paudi Bhuyan, from Hemgiri and Lahunipara blocks, respectively), giving an incidence of 1 in 122 in the Khandayat Bhuyan and 1 in 379 in the Paudi Bhuyan, with an average of 1 in 278 among the Bhuyan tribal population. This incidence is high in comparison with earlier studies reported from India (Balgir 2007). For the present study, Pauri Bhuinya blood samples (120) were collected from the state of Orissa.

Table 7.36 Molecular Diversity Indices among the Paniyan

Index                                  Value                   P
Mean number of pairwise differences    22.5543 ± 10.028
Nucleotide diversity                   0.00136 ± 0.0006
Sum of squared deviation (SSD)         0.01824                 0.020
Harpending’s raggedness index (HRI)    0.00957                 0.0400
Theta (Pi)                             22.5543 ± 11.110
Theta (S)                              32.2340 ± 8.314
Tajima’s D                             -1.0174                 0.127
Fu’s Fs                                -24.001                 0.00


Fig. 7.159 Mismatch distribution of nucleotide differences of Paniyan population

Fig. 7.160 Y chromosome haplogroups of Pauri Bhuinya

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 55 Pauri Bhuinya individuals for Y-SNPs revealed the presence of six haplogroups (Fig. 7.160): O2a*, H1a*, R2, R1a1*, F* and C5*. O2a* is the most frequent (60%), followed by H1a* and R2 (15% each), R1a1* (6%), F* (4%) and C5* (2%). O2a* is considered to be the major haplogroup of the South East Asian lineage.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 126 individuals were scanned for maternal lineages in the population. The maternal lineages, selected for complete sequencing on the basis of HVR I motifs, comprise 57% Asian macro-haplogroup M and 43% European macro-haplogroup N. A total of 14 maternal lineages belonging to haplogroup M were found among the Pauri Bhuinya population. M5 predominates (22%), followed by M40 (19%), M2 and M49 (11% each), M6 (8%), M12 and M37 (5% each), and M18, M34, M38, M39, M42, M49 and M56 (3% each) (Fig. 7.161). The founder age of the M5 haplogroup among the Pauri Bhuyan was estimated to be 56 ± 14 ky based on the high preponderance of the M6 maternal lineage. The N-haplogroup lineages of the 25 Pauri Bhuiya individuals were assigned to five haplogroups. The highest frequency was recorded for R9 (32%), followed by R6 and R7 (25% each), U2 (11%) and HV (7%) (Fig. 7.162).

Fig. 7.161 mtDNA phylogenetic tree of M-haplogroup among the Pauri Bhuinya

Fig. 7.162 mtDNA phylogenetic tree of N-haplogroup among the Pauri Bhuinya


Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Pauri Bhuyan population is 0.001791 ± 0.000890. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.37). Figure 7.163 shows the mismatch distribution of the Pauri Bhuyan population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Paudi Bhuyan population, which indicates a recent expansion. The small sum of squared deviations (0.01185260) and Harpending’s raggedness index (r) (0.00585270) also confirm that the Paudi Bhuyan have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Pauri Bhuyan, the negative value of Tajima’s D (-1.89857) indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (-16.56672) is further evidence of a recent population expansion of the Pauri Bhuyan community.
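The mismatch distribution and the raggedness index mentioned above can be sketched in a few lines of Python. This is an illustration on toy data, not the procedure used for Fig. 7.163: it assumes the usual definition of Harpending’s index, r = Σ (x_i − x_{i−1})² summed over i = 1 to d + 1, where x_i is the relative frequency of sequence pairs differing at i sites and d is the largest observed number of differences. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs showing 0, 1, 2, ... pairwise differences."""
    diffs = [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)]
    counts = Counter(diffs)
    total = len(diffs)
    return [counts.get(i, 0) / total for i in range(max(diffs) + 1)]


def raggedness_index(freqs):
    """Harpending's r: sum of squared changes between adjacent mismatch classes."""
    padded = list(freqs) + [0.0]  # class d + 1 is empty by definition
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy alignment: three pairs differ at one site, three pairs differ at two sites
toy = ["AAAA", "TAAA", "ATAA", "AATA"]
dist = mismatch_distribution(toy)
r = raggedness_index(dist)
print(dist, r)
```

A smooth, unimodal distribution changes little between adjacent classes and so yields a small r, whereas a ragged, multimodal distribution yields a larger one. Real data sets with many sequences give far smaller values than this four-sequence toy example.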

Porja

The term Porja, as Thurston suggested, is derived from the Odia words ‘po’, meaning ‘son’, and ‘raja’, meaning ‘king’; they are thus the sons of the raja. According to another view, the root word is the Sanskrit Paroja, or sons of the soil. The Porja are mainly distributed near the hill slopes of Munchangiputtu, Ananthagiri and Pedabayalu mandals of Visakhapatnam district of Andhra Pradesh. The Porja living in Andhra Pradesh belong to the Parengi Porja. They are said to be a section of the Kondh/Gadaba tribe of Ganjam district. They have their own dialect and, in addition, speak Telugu as well as Adivasi Oriya. They belong to the Austro-Asiatic linguistic family and migrated recently from Odisha. Their population as per the 2011 census is 36,502, of whom 17,741 are males and 18,761 are females (sex ratio = 1057). They are recognized as a Particularly Vulnerable Tribal Group (PVTG). The Porja community is a conglomeration of several endogamous divisions. Cross-cousin and widow marriages are permissible among them. Monogamy is the rule, but marriages are sometimes polygamous (having multiple spouses). Agriculture is their main source of livelihood. They practise both shifting cultivation on hill slopes and plough cultivation on flat fields and irrigated terraces. A large landless section of them work as agricultural and industrial labourers (Singh 1994). The total literacy rate among the Porja is 35.0 according to the 2011 census. The Porja are generally short-statured, long and narrow headed people with a tendency towards a round head shape. They show a moderate to broad facial profile and a flat nasal form. Serological studies showed a marked excess of gene B (27%) over gene A (Das et al. 1962, 1966). The sickle cell trait is absent among them (Devi and Naidu 1987). A recent study suggests that females of the Porja community are more prone to prehypertension and stage I hypertension than males (Rao et al. 2014a). For the present study, Porja blood samples (147) were collected from the state of Andhra Pradesh.
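The sex ratio quoted above can be checked directly from the census counts. The short Python sketch below assumes the convention of females per 1000 males, rounded to the nearest integer; the function name is illustrative.

```python
def sex_ratio(females, males):
    """Sex ratio expressed as females per 1000 males, rounded to the nearest integer."""
    if males <= 0:
        raise ValueError("number of males must be positive")
    return round(1000 * females / males)


# Porja, 2011 census counts quoted in the text
porja_males, porja_females = 17741, 18761
porja_ratio = sex_ratio(porja_females, porja_males)

# The two counts should add up to the reported total population
porja_total = porja_males + porja_females

print(porja_ratio, porja_total)
```

With the counts given in the text, the ratio and the total agree with the reported figures of 1057 and 36,502.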

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 60 Porja individuals were assigned to four haplogroups (Fig. 7.164) through the screening of Y-SNPs. Haplogroup O2A* has the highest frequency (76%), followed by haplogroups H1A* and R2 (11% each) and haplogroup H* (2%). Haplogroup O2A is an Austro-Asiatic lineage distributed in South East Asia.

Table 7.37 Molecular Diversity Indices among the Pauri Bhuinya

Index                                  Value                   P
Mean number of pairwise differences    29.67867 ± 13.270
Nucleotide diversity                   0.00179 ± 0.000890
Sum of squared deviation (SSD)         0.01185                 0.020
Harpending’s raggedness index (HRI)    0.005852                0.400
Theta (Pi)                             29.67867 ± 14.747
Theta (S)                              59.8865 ± 17.825
Tajima’s D                             -1.8985                 0.013
Fu’s Fs                                -16.566                 0.001


Fig. 7.163 Mismatch distribution of nucleotide differences of Pauri Bhuinya population

Fig. 7.164 Y chromosome haplogroups of Porja

Haplogroup H1a is predominant among Dravidian and Central Indian communities. Haplogroup R2 belongs to the Central Asian lineage.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 88 Porja individuals were scanned for maternal lineages in the population. The maternal lineages, selected for complete sequencing on the basis of HVR I motifs, comprise 60% Asian macro-haplogroup M and 40% European macro-haplogroup N. A total of 16 maternal lineages belonging to haplogroup M were found in the Porja population (Fig. 7.165). Haplogroups M2 and M6 have the highest frequency (15%), followed by M5 and M45 (11% each), M40 (9%), M63 (6%), M3, M30, M34, M35, M39, M49 and New (4% each), and M33, M38 and M41 (2% each). Haplogroups M2, M3, M5, M6, M34, M35, M39, M40 and M41 are all distributed in South and South East Asia. Haplogroup M30 is mainly found in India, the Middle East and North Africa. Haplogroup M33 is found in South Asia, Belarus and Southern China. Haplogroup M49 has been found among the ancient specimens in the Euphrates valley. The founder age of the Porja population was 64 ± 13 ky. The N-haplogroup lineages of the 35 Porja individuals were assigned to seven haplogroups (Fig. 7.166). The highest frequency was for R8 (46%), followed by R* (17%), R5 (11%), R0 (6%), R32 (6%), U2 (9%) and W (6%).

Fig. 7.165 mtDNA phylogenetic tree of M-haplogroup among the Porja

Fig. 7.166 mtDNA phylogenetic tree of N-haplogroup among the Porja

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Porja population is 0.002106 ± 0.001026. It is a measure of genetic variation that is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.38). Figure 7.167 shows the mismatch distribution of the Porja population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Porja population, which indicates a recent expansion. The small sum of squared deviations (0.00059364) and Harpending’s raggedness index (r) (0.00050124) also confirm that the Porja have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D is negative. Among the Porja, the value of -2.0563 indicates that the population is expanding, that is, recovering from a bottleneck. A negative value of Fu’s Fs (-24.05500) is further evidence of a recent population expansion of the Porja community.


Rabha

Rabha is a Scheduled Tribe community of Assam, Meghalaya and West Bengal. The language spoken by the Rabha people bears the same name. In Assam, the Rabhas live mostly in Goalpara and Kamrup districts. In Meghalaya, they are mostly found in the Garo Hills. In West Bengal, they live mainly in Jalpaiguri and Cooch Behar districts; almost 70% of them live in Jalpaiguri district. The whole area of the Eastern and Western Dooars may be termed the cradle land of the Rabhas. The Rabhas refer to themselves as Koch and assert a connection to the historical Koch Kingdom. Their traditional economy is based on agriculture, forest-based activities and weaving. The Rabha traditionally practise a few animistic rituals; today, however, they more often follow a faith that blends some Hindu and a few animistic rituals (Singh 1994). Das (1958) and Das et al. (1980) show about 51–55% of gene O, and a slightly higher value of gene A (24–28%) than gene B (21%) in this population. For the present study, Rabha blood samples were collected from the state of Assam.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 21 Rabha individuals were assigned to four haplogroups (Fig. 7.168) through the screening of Y-SNPs. Haplogroup O3A3C1* has the highest frequency (81%), followed by haplogroup O2A* (9%) and haplogroups Q1* and R1A1 (5% each). Haplogroup O3A3C1* is found mostly in South East Asia. Haplogroup O2A* is an Austro-Asiatic lineage that is also distributed in South East Asia. Haplogroup Q1* is widely distributed in Central Siberia, Central Asia and Native America, and R1A1 is mainly distributed in Eurasia.

Raji

The Raji are a very small Scheduled Tribe of the Kumaon hills who were cave dwellers and nomadic hunter-gatherers. Within Kumaon, they are more popularly called ‘Ban Rawat’, which literally means ‘King of the Forest’. The Raji are confined to Pithoragarh district and are also found in contiguous areas of south-west Nepal across the river Kali. In India, the Raji are distributed in only nine villages, falling under the Dharchula, Didihat and Champawat tehsils of Pithoragarh district. According to the 2011 census report, the Raji population is 1295. However, a survey conducted by the U.P. Harijan and Samaj Kalyan Vibhag during April 1981 enumerated the Raji population as 371 persons. The Raji speak their own language, which belongs to the Tibeto-Burman language family. They have been included in the list of Particularly Vulnerable Tribal Groups prepared by the Government of India. Raji society is divided into a number of patrilineages, locally called ‘rath’, each of which is an exogamous patrilineal grouping of families.

Table 7.38 Molecular Diversity Indices among the Porja

Index                                  Value                   P
Mean number of pairwise differences    34.89341 ± 15.338
Nucleotide diversity                   0.002106 ± 0.001026
Sum of squared deviation (SSD)         0.00059                 0.840
Harpending’s raggedness index (HRI)    0.00050                 1.0000
Theta (Pi)                             34.8934 ± 16.993
Theta (S)                              87.1483 ± 21.762
Tajima’s D                             -2.0563                 0.005
Fu’s Fs                                -24.05                  0.00


Fig. 7.167 Mismatch distribution of nucleotide differences of Porja population

Fig. 7.168 Y chromosome haplogroups of Rabha

The Raji are endogamous. They are monogamous, and the practice of levirate marriage is prevalent. Cross-cousin marriage also exists. The traditional occupation of the Raji is hunting and gathering, which in the past was supplemented by intermittent slash-and-burn agriculture and by invisible trade, under which they bartered their crudely manufactured wooden vessels for cereals with neighbouring Kumaoni peasants. At present, apart from traditional food gathering, they follow a variety of occupations such as agriculture, animal husbandry, resin tapping and agricultural labour. Their religious faith is an admixture of traditional animism and local Hinduism. They believe in many deities and spirits, whom they worship periodically or in the event of an emergency (Singh 1994). The average stature of the Raji is 159.42 cm and the Cephalic Index is 73.99 (Tiwari and Bhasin 1975). Alam et al. have shown a 45.7% prevalence of underweight among the Raji, with 55.8% among males and 37.3% among females. Further, the mean haemoglobin level of males and females is 13.17 and 10.70, respectively. For the present study, Raji blood samples were collected from the state of Uttarakhand.

Fig. 7.169 Y chromosome haplogroups of Raji


Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 44 Raji individuals were assigned to four haplogroups (Fig. 7.169) through the screening of Y-SNPs. Haplogroup O3A3C1* has the highest frequency (91%), followed by haplogroup C5* (5%) and haplogroups Q1* and H1A* (2% each). Haplogroup O3A3C1* is found mostly in South East Asia. Haplogroup C5* is found at high frequency among the Australian aborigines. Haplogroup C attains its highest frequency among the indigenous populations of Mongolia, Russia, the Far East, Polynesia and Australia, and occurs at moderate frequency in the Korean Peninsula and Manchuria. It is also found at high frequency in modern Indian populations. Haplogroup Q1* is distributed in Central Siberia, Central Asia and Native America. Haplogroup H1A* is predominant among Dravidian and Central Indian communities.


Saharia

The Saharia, also known as Sahar, Sehariya or Sahariya, are an indigenous Munda-speaking tribe of Madhya Pradesh. The Saharias are mainly found in the Morena, Sheopur, Bhind, Gwalior, Datia, Shivpuri, Vidisha and Guna districts of Madhya Pradesh and the Baran district of Rajasthan (Sati 2015). The history of the Saharia tribe is patchy and in many places completely lost. The older generations of the Saharia cannot give any account of their history, and written records of ancestry are virtually non-existent. Traditionally, they trace their beginnings to the days of the Ramayana and beyond, and their origin to Shabri of the Ramayana (Rajak 2016). The members of the tribe follow folk Hinduism. Child marriage is not favoured, although there are some arranged marriages, and marriage takes place only after the age of 15 years. Widow marriage, called ‘nat’, is permitted, but only to a widower or a divorcee. Polygamy is reserved for males. The Sahariyas are expert woodsmen and gatherers of forest products. They are particularly skilled in making catechu from Khair trees. Their main business is gathering and selling forest wood, gum, tendu leaf, honey, mahua and medicinal herbs (Rajak 2016). Their traditional occupations also include basket making, mining and quarrying, and stone breaking. Some Sahariyas are settled cultivators (Mandal and Bera 2007). For this study, blood samples of the Saharia were collected from the state of Madhya Pradesh.

Fig. 7.170 Y chromosome haplogroups of Saharia

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 61 individuals were assigned to eight haplogroups (Fig. 7.170) through the screening of Y-SNPs. Haplogroup H1a* has the highest frequency (30%), followed by haplogroup O2A* (21%), haplogroup R* (16%), haplogroups F* and R2 (12% each), haplogroup P* (5%), haplogroup L* (3%) and haplogroup J* (2%). Haplogroup H1a* is distributed among Dravidian and Central Indian communities. Haplogroup O2A* indicates an admixture of Austro-Asiatic lineages. Haplogroup R* is found in Central Siberia, Central Asia and Native America and is also common in parts of West Asia, Africa and North America. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup P* is distributed among European, Native American, South Asian and Central Asian populations, while haplogroup L is found among the Neolithic migrants from Central Asia. Haplogroup J* is predominant in North Africa, the Caucasus, South East Europe, Central Asia, Iran, Pakistan and Western India.

Savaras

In Sanskrit, Savara means a mountaineer, barbarian or savage. The Savaras live in the states of Andhra Pradesh, Madhya Pradesh, Bihar, Maharashtra, Orissa and West Bengal. Their mother tongue is Saora, an Austro-Asiatic language. They are divided into 25 divisions based on occupation. There are two broad territorial divisions among them, namely the Hill Savaras and the Savaras of the low country. The Savaras have exogamous totemic clans. A woman does not change her lineage membership even after her marriage. Adult marriage is practised in the community. They often practise polygyny. Levirate and sororate marriages are also prevalent. The Savaras avail themselves of the employment opportunities created through various rural and tribal development programmes. They depend upon moneylenders and businessmen for credit. The Savaras are followers of Hinduism (Elwin 1951; Singh 1994). In the ABO blood group system, they show more or less equal proportions of the A (24%) and B blood groups (Sarkar et al. 1960). A study found that Savara females show higher blood pressure than males, which may lead to a higher incidence of cardiovascular disease among them (Rao et al. 2014b). For the present study, Savaras blood samples were collected from the state of Andhra Pradesh.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 40 Savaras individuals were assigned to four haplogroups (Fig. 7.171) through the screening of Y-SNPs. Haplogroup O2A* has the highest frequency (83%), followed by haplogroup R2 (8%), haplogroup H1a* (7%) and haplogroup H1* (2%). Haplogroup O2A* reflects an admixture of Austro-Asiatic genes, and haplogroup R2* is of Central Asian lineage. Haplogroup H1a* is distributed among Dravidian and Central Indian communities.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 105 individuals were scanned for maternal lineages in the population, and samples were selected for complete sequencing based on HVR I motifs. The maternal lineages comprise 86% Asian macrohaplogroup M and 14% European macrohaplogroup N. A total of 13 maternal lineages belonging to haplogroup M were found in the Savaras population (Fig. 7.172). Haplogroup M45 has the highest frequency (19%), followed by M5 (16%), M31 (11%), M2, M35, M49 and New (7% each), M42 (6%), M3, M6, M38 and M53 (4% each) and M18 (3%). Haplogroups M2, M3, M5, M6, M35 and M39 are all distributed in South and South East Asia. Haplogroup M31 is found among the Onge of the Andaman Islands. Haplogroup M18 is distributed among the Tharus of southern Nepal and tribal populations of Andhra Pradesh. Haplogroup M42 is found among Australian Aborigines, and haplogroup M49 has been found in ancient specimens from the Euphrates Valley. The founder age of the Savaras population was estimated at 36 ± 10 ky. The 15 individuals belonging to macrohaplogroup N were assigned to three haplogroups (Fig. 7.173). The highest frequency was recorded for R8 (40%), followed by R5 (33%) and U2 (27%).


Fig. 7.171 Y chromosome haplogroups of Savaras

Molecular Diversity

Molecular diversity indices are shown in Table 7.39. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Savaras population is 0.001895 ± 0.000924. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships. Figure 7.174 shows the mismatch distribution of the Savaras population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Savaras population, which indicates a recent expansion. The small values of the sum of squared deviations (0.0064) and Harpending’s raggedness index (r = 0.001126) also support the conclusion that the Savaras have undergone a recent demographic expansion.
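As a consistency check on the figures quoted for the Savaras, per-site nucleotide diversity can be recovered by dividing the mean number of pairwise differences by the number of sites compared. The sketch below assumes that complete mitochondrial genomes were aligned to the 16,569-bp revised Cambridge Reference Sequence; that length, the constant name and the function name are assumptions of this example, not details stated in the text.

```python
# Assumed alignment length: the rCRS human mitochondrial genome (16,569 bp).
RCRS_LENGTH = 16_569


def nucleotide_diversity(mean_pairwise_differences: float,
                         sequence_length: int = RCRS_LENGTH) -> float:
    """Per-site nucleotide diversity: mean pairwise differences / sites compared."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return mean_pairwise_differences / sequence_length


# Mean number of pairwise differences reported for the Savaras (Table 7.39).
pi_savaras = nucleotide_diversity(31.40072)
print(f"{pi_savaras:.6f}")  # prints 0.001895
```

Under that assumption the result agrees with the nucleotide diversity reported for the Savaras, which suggests the two table entries are the same quantity expressed per genome and per site.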

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. In the Savaras population, the value of −1.9150 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−23.96201) is further evidence of a recent population expansion of the Savaras community.
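The comparison that Tajima’s D makes can be written out explicitly. The following is a minimal sketch of the standard formula (Tajima 1989) and not necessarily the software used in the study; the function name and the input values (20 sequences, 50 segregating sites, 5.0 mean pairwise differences) are hypothetical and chosen only for illustration.

```python
import math


def tajimas_d(n: int, segregating_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences k."""
    if n < 4:
        raise ValueError("n must be at least 4; the variance term vanishes for smaller samples")
    if segregating_sites < 1:
        raise ValueError("at least one segregating site is required")

    s = segregating_sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_s = s / a1  # estimate of theta from the number of segregating sites
    # Numerator: difference between the two estimates; denominator: its standard deviation.
    return (mean_pairwise_diff - theta_s) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical inputs, for illustration only.
d = tajimas_d(n=20, segregating_sites=50, mean_pairwise_diff=5.0)
print(f"Tajima's D = {d:.3f}")
```

D is negative whenever the mean number of pairwise differences falls below the estimate based on segregating sites, which is the pattern produced by an excess of rare mutations.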

Fig. 7.172 mtDNA phylogenetic tree of M-haplogroup among the Savaras (branch labels: M49, M35, M3, M45, M42, M6, M19’53, M2, M31a3, M18, M38, M*, M5)

Fig. 7.173 mtDNA phylogenetic tree of N-haplogroup among the Savaras

Sherdukpen

Sherdukpen folk tales suggest that the community originated from the union of a local prince and a princess from Assam, possibly of Kachari origin. They inhabit the Bomdila subdivision of the West Kameng district of Arunachal Pradesh. They speak a language known as Sherdukpen or Ngnok, which belongs to the NEFA subgroup of the Tibeto-Burman linguistic family. They are primarily agriculturists and practise shifting cultivation, terrace cultivation and wet paddy cultivation. Nowadays they have discarded shifting cultivation in favour of wet paddy cultivation. They are generally monogamous, but some of them also practise polygyny. Marriage to the late husband’s brother or the late wife’s sister, as well as cross-cousin marriage, is permitted. The Sherdukpen belong to the Lamaistic sect of Tibetan Buddhism of the Mahayana school, which is blended with local beliefs and shamanism. A small section of them have also embraced Hinduism. As per the 2011 Census, their total population is 3463, of which 1678 are males and 1785 are females (sex ratio = 1064). According to the 1981 census, 20.99% of them are literate, the male literacy rate (29.22%) being higher than the female literacy rate (13.04%) (Singh 1994). In respect of the ABO blood group system, the Sherdukpen show the highest frequency of the O gene (54.3%), followed by the A gene (32.2%) and the B gene (13.4%) (Goswami and Das 1990). For the present study, Sherdukpen blood samples were collected from the state of Arunachal Pradesh.

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 43 individuals for Y-SNPs revealed the presence of four haplogroups: O3A3C1*, K*, J* and J2A*. O3A3C1* is the most frequent (88%), followed by K* (7%), J* (2%) and J2A* (2%) (Fig. 7.175). The haplogroup H1a* is predominant among the Dravidian and central Indian communities. It is considered to be a major indigenous Indian haplogroup.

Table 7.39 Molecular Diversity Indices among the Savaras
Mean number of pairwise differences: 31.40072 ± 13.815
Nucleotide diversity: 0.00189 ± 0.0009
Sum of squared deviation (SSD): 0.00646 (P = 0.130)
Harpending’s raggedness index (HRI): 0.00112 (P = 1.000)
Theta Pi: 31.40072 ± 15.301
Theta S: 72.59343 ± 17.688
Tajima’s D: −1.9150 (P = 0.007)
Fu’s Fs: −23.962 (P = 0.00)

Fig. 7.174 Mismatch distribution of nucleotide differences of Savaras population

Fig. 7.175 Y chromosome haplogroups of Sherdukpen

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 103 Sherdukpen individuals were scanned for maternal lineages in the population. A total of 19 samples were selected for complete sequencing based on HVR I motifs. A total of five maternal lineages belonging to haplogroup M were found among the Sherdukpen population (Fig. 7.176). Of these, 47% belong to D, 27% to M13, 13% to M5, and 7% each to M43 and M67. The founder age of the Sherdukpen was estimated to be 59 ± 12 ky based on the high preponderance of the D maternal lineage.

Fig. 7.176 mtDNA phylogenetic tree of M-haplogroup among the Sherdukpen


Fig. 7.177 mtDNA phylogenetic tree of N-haplogroup among the Sherdukpen

The four individuals belonging to macrohaplogroup N were assigned to two haplogroups (Fig. 7.177). Among the Sherdukpen, sub-haplogroup F1 accounts for 50%, and A17 and A11 for 25% each. Haplogroup A is found in Central and East Asia, as well as among Native Americans.

Molecular Diversity

Molecular diversity indices are shown in Table 7.40. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Sherdukpen population is 0.001311 ± 0.000687. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships. Figure 7.178 shows the mismatch distribution of the Sherdukpen population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Sherdukpen population, which indicates a recent expansion. The values of the sum of squared deviations (0.07580449) and Harpending’s raggedness index (r = 0.13941043) are also taken as support for a recent demographic expansion of the Sherdukpen.
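The mismatch distribution and the raggedness index discussed above can be illustrated with a small worked example. This is a minimal sketch, assuming aligned sequences of equal length and Harpending’s definition of r as the sum of squared differences between adjacent mismatch classes; the function names and the four toy sequences are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # Count pairwise differences for every unordered pair of sequences.
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = sum(counts.values())
    return [counts.get(i, 0) / n_pairs for i in range(max(counts) + 1)]


def raggedness_index(freqs):
    """Harpending's r: sum of squared differences between adjacent classes."""
    padded = list(freqs) + [0.0]  # the class beyond the observed maximum is empty
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy alignment, for illustration only.
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
freqs = mismatch_distribution(toy)
print(freqs)
print(raggedness_index(freqs))
```

A smooth, single-peaked distribution gives small differences between adjacent classes and therefore a low r, whereas a jagged, multi-peaked distribution gives a higher value.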

Table 7.40 Molecular Diversity Indices among the Sherdukpen
Mean number of pairwise differences: 21.71428 ± 10.150
Nucleotide diversity: 0.00131 ± 0.0006
Sum of squared deviation (SSD): 0.07580 (P = 0.000)
Harpending’s raggedness index (HRI): 0.13941 (P = 0.0000)
Theta Pi: 21.71428 ± 11.380
Theta S: 22.7582 ± 8.560
Tajima’s D: −0.1997 (P = 0.473)
Fu’s Fs: −3.999 (P = 0.036)


Fig. 7.178 Mismatch distribution of nucleotide differences of Sherdukpen population

Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. In the Sherdukpen population, the value of −0.19978 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−3.99986) is further evidence of a recent population expansion of the Sherdukpen community.

Soliga

The Soliga tribe trace their origin to Karayya, son of Lord Maleya Mahadeshwara, swamy of the Maleya Mahadeshwara Hills, Karnataka (Singh 2006). The tribe is mainly distributed in the hilly part of the Mysore district of Karnataka. They speak Kannada, a Dravidian language. Their mother tongue is the Sholaga language (Soliganudi), a member of the Dravidian family that is most closely related to Kannada, with several Tamil influences (Buchanan 1870). The Soliga have five divisions, namely Urali, Kadu, Male, Urubathi (or Dasayya or Burude) and Pujari. Nanjundayya and Ananthakrishna Iyer (1930) reported four endogamous divisions among the Soliga, namely Urali, Male, Kadu and Urubatti, which are further divided into exogamous clans. The traditional occupations of the Soliga were shifting cultivation and the collection of minor forest produce. Nowadays they have taken to forest and agricultural labour. Marriage to one’s mother’s brother’s daughter (MBD) or father’s sister’s daughter (FSD) is preferred. They practise adult marriage and generally arrange alliances through negotiation. Some cases of marriage by elopement have also taken place. The Soliga follow Hindu practices, and their main deities are Madeshwara, Rangaswamy of the Biligirirangana Hills, Karayya, Kyate Devaru and Jadeswamy. Other deities worshipped by them include Basaveshwara, Nanjundeshwara and Sri Alamelu Ranganayaki smetha Sri Ranganatha (Singh 2006). They have a sex ratio of 1006 (16,860 males and 16,959 females) in a total population of 33,819. The Soliga are of below-medium stature, with a long and narrow head shape, an oval face and a broad nose (Karve 1954). Morlote et al. (2011) examined the phylogenetic relationships of the Soliga in relation to 29 worldwide, geographically targeted reference populations, using a battery of 15 hypervariable autosomal short tandem repeat loci as markers. The Soliga tribe was found to be remarkably different from other Indian populations, including other southern Dravidian-speaking tribes. In contrast, the Soliga exhibited genetic affinity to two Australian aboriginal populations. This genetic similarity could be attributed to the ‘Out of Africa’ migratory wave(s) along the southern coast of India that eventually reached Australia. Alternatively, the genetic affinity may be explained by more recent migrations from the Indian subcontinent into Australia. For the present study, Soliga blood samples were collected from the state of Karnataka.
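The sex ratio quoted for the Soliga is consistent with the convention of expressing it as females per 1,000 males. The short sketch below assumes that convention and rounding to the nearest whole number; the function name is hypothetical.

```python
def sex_ratio(females: int, males: int) -> int:
    """Females per 1,000 males, rounded to the nearest whole number."""
    if males <= 0:
        raise ValueError("males must be positive")
    return round(1000 * females / males)


# Soliga counts quoted in the text: 16,860 males and 16,959 females.
print(sex_ratio(females=16_959, males=16_860))  # prints 1006
print(16_959 + 16_860)                          # prints 33819
```

Both results match the figures given for the Soliga, so the counts, the total and the ratio are mutually consistent.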

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 43 Soliga individuals were assigned to nine haplogroups (Fig. 7.179) through the screening of Y-SNPs. Haplogroup H1a* has the highest frequency (42%), followed by haplogroup F* (26%), haplogroup H* (9%), haplogroups C5* and R2* (7% each) and haplogroups G1*, H2*, J2B1 and Q1 (2% each). Haplogroup H1a* is distributed among the Dravidian and Central Indian communities. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup H* is distributed mainly in South Asia (India, Sri Lanka, Nepal and Pakistan), with a lower frequency in Afghanistan. While haplogroup C5* is found at high frequency among the Australian Aborigines, haplogroup R2* is of Central Asian lineage. Haplogroup G1 is distributed in Iran and the countries adjoining Iran to the west. Haplogroup H2 is primarily European; it is also found among Armenians and Iranians, as well as among the people of India and southern Asia. While haplogroup J2B1 is distributed mainly in Italy, Czechoslovakia and Germany, haplogroup Q1 is distributed in Central Siberia, Central Asia and Native America.

Fig. 7.179 Y chromosome haplogroups of Soliga

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 39 Soliga individuals were scanned for maternal lineages in the population, and samples were selected for complete sequencing based on HVR I motifs. The maternal lineages comprise 41% Asian macro-haplogroup M and 59% European macro-haplogroup N. A total of three maternal lineages belonging to haplogroup M were found in the Soliga population (Fig. 7.180). Haplogroup M3 has the highest frequency (81%), followed by M8CZ (12%) and M36 (6%). Haplogroup M3 is distributed in South Asia, with its highest concentration in West and North-West India. Haplogroup M8CZ is distributed among Eurasian populations. The founder age of the Soliga population was estimated at 23 ± 6 ky. The 23 individuals belonging to macro-haplogroup N were assigned to four haplogroups (Fig. 7.181). The highest frequency is that of R5 (74%), followed by U2 (17%), and N1 and U1 (4% each). The coalescence time of R5 was estimated to be 66,100 ± 22,000 years. Haplogroup N1 is found in West Eurasia.
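Because the haplogroup frequencies are reported as rounded percentages of small samples, it can help to convert them back into whole-number counts and check that they add up to the stated sample size. The sketch below does this for the 23 Soliga individuals assigned to macro-haplogroup N; the counts it produces are inferred from the percentages and are not reported in the text, and the function name is hypothetical.

```python
def counts_from_percentages(percentages: dict, n: int) -> dict:
    """Nearest whole-number count for each rounded percentage of a sample of n."""
    return {name: round(pct * n / 100) for name, pct in percentages.items()}


# Percentages quoted in the text for the Soliga N-haplogroup samples (n = 23).
soliga_n = {"R5": 74, "U2": 17, "N1": 4, "U1": 4}
counts = counts_from_percentages(soliga_n, 23)
print(counts)                # prints {'R5': 17, 'U2': 4, 'N1': 1, 'U1': 1}
print(sum(counts.values()))  # prints 23
```

When the inferred counts sum to the stated sample size, as they do here, the percentages are internally consistent; a mismatch would point to a rounding or transcription problem.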


Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Soliga population is 0.001846 ± 0.000915. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.41). Figure 7.182 shows the mismatch distribution of the Soliga population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution of the Soliga population is slightly ragged and multimodal, and is interpreted as indicating a recent expansion. The small values of the sum of squared deviations (0.0028846) and Harpending’s raggedness index (r = 0.00223) also support the conclusion that the population has undergone a recent demographic expansion. Tajima’s D is a statistic, widely used in population genetics, that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. In the Soliga population, the value of −1.002 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−17.7895) is further evidence of a recent population expansion.

Fig. 7.180 mtDNA phylogenetic tree of M-haplogroup among the Soliga

Sonowal Kachari

The Sonowal Kacharis are of Mongoloid origin and are descendants of the ‘Hammusa’ family. They predominantly inhabit the Dhemaji, Lakhimpur, Tinsukia and Dibrugarh districts of Assam. They are also scattered in the districts of Sibsagar, Jorhat and Golaghat in Assam in North-East India (Das 2015). The original mother tongue of the Kachari is Bodo, a division of the Tibeto-Burman linguistic family, but at present they speak Assamese, an Indo-European language (Nath 2019). The Sonowal are broadly divided into seven territorial khel groups. At present they have taken up agriculture in addition to their traditional occupation of gold washing. The rule of clan exogamy is strictly followed. Adult marriage is the rule; elopement (chorbiah) and arranged (barbiah) marriages prevail in the community. Polygyny is rarely practised. Divorce is not socially accepted. Earlier, the Sonowal followed their traditional religion, but at present they are followers of Mahapurushia Vaishnavism (a Hinduised sect) propounded by Srimanta Sankaradeva during the fifteenth and sixteenth centuries A.D. However, their traditional socio-religious faith has not been abandoned completely, and elements of both traditional and Vaishnavite faiths have been observed in their religious ideology (Singh 1994). For the present study, Sonowal Kachari blood samples were collected from the state of Assam.

Fig. 7.181 mtDNA phylogenetic tree of N-haplogroup among the Soliga (branch labels: R5a, R5a2, R5a2b, U, U2b1, N, N1a1b1)

The Sonowal Kachari present a characteristically high incidence of homozygous haemoglobin E (Hb E, 31.3%) and of the Hb AE phenotype (47.33%), a higher frequency of blood group B than A in the ABO system and a negligible frequency of Rh negative (Das et al. 1975, 1980).

Paternal Lineage (Y Chromosomal Haplogroups)

The screening of 35 individuals for Y-SNPs revealed the presence of five haplogroups (HG): O3A3C1*, R1A1*, P*, L1 and O2A. Among these, O3A3C1* predominates at 74%, followed by R1A1* (17%), P* (4%), and L1 and O2A (2%) (Fig. 7.183). Haplogroup O3A is predominant among the Tibeto-Burman communities of South East Asia.

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 112 Sonowal Kachari individuals were scanned for maternal lineages in the population, and samples were selected for complete sequencing based on HVR I motifs. The maternal lineages comprise 63% Asian macro-haplogroup M and 37% European macro-haplogroup N. A total of six maternal lineages belonging to haplogroup M were found among the Sonowal Kachari population (Fig. 7.184). Among these, 37% belong to M9, 21% to D, 16% to M49, 11% each to M60 and M8, and 5% to M33. The founder age of the Sonowal Kachari was estimated to be 56 ± 14 ky based on the high preponderance of the M6 maternal lineage. The 11 individuals belonging to macro-haplogroup N were assigned to seven haplogroups (Fig. 7.185). The highest frequency is that of haplogroup U2 (27%), followed by H and F1 (18% each) and U7, R5, A14 and A (9% each). Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup F is fairly common in East Asia and Southeast Asia. Higher frequencies occur in some areas, such as Nicobar (50%) and Arunachal Pradesh (31%) in India and among the Shors of Siberia (44%). It also occurs at notable frequency among Taiwanese aborigines and in Guangdong (China), Maluku (Indonesia), Thailand and Vietnam. Haplogroup B is believed to have arisen in Asia some 50,000 years before present. Its ancestral haplogroup was haplogroup R. Its greatest diversity is in China, and haplogroup B may have undergone its earliest diversification in southern China and/or Southeast Asia. Haplogroup B is found frequently in southeastern Asia. R5 is distributed across groups of the subcontinent, especially in Madhya Pradesh (17%). Its coalescence time was estimated to be 66,100 ± 22,000 years. Haplogroup HV is a west Eurasian haplogroup found mainly throughout the Middle East, including Iran. It is also found in North Africa, Central Asia and South Asia.

Table 7.41 Molecular Diversity Indices among the Soliga
Mean number of pairwise differences: 30.58029 ± 13.647
Nucleotide diversity: 0.00184 ± 0.0009
Sum of squared deviation (SSD): 0.00288 (P = 0.970)
Harpending’s raggedness index (HRI): 0.00223 (P = 0.980)
Theta Pi: 30.58029 ± 15.162
Theta S: 41.8647 ± 12.435
Tajima’s D: −1.0025 (P = 0.157)
Fu’s Fs: −17.7895 (P = 0.00)

Fig. 7.182 Mismatch distribution of nucleotide differences of Soliga population

Fig. 7.183 Y chromosome haplogroups of Sonowal Kachari

Fig. 7.184 mtDNA phylogenetic tree of M-haplogroup among the Sonowal Kachari

Fig. 7.185 mtDNA phylogenetic tree of N-haplogroup among the Sonowal Kachari

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Sonowal Kachari population is 0.001660 ± 0.000850. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.42). Figure 7.186 shows the mismatch distribution of the Sonowal Kachari population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Sonowal Kachari population, which indicates a recent expansion. The values of the sum of squared deviations (0.02105455) and Harpending’s raggedness index (r = 0.05536746) are also taken as support for a recent demographic expansion of the Sonowal Kachari. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. In the Sonowal Kachari population, the value of −1.20139 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−5.15386) is further evidence of a recent population expansion of the Sonowal Kachari community.

Table 7.42 Molecular Diversity Indices among the Sonowal Kachari
Mean number of pairwise differences: 27.50877 ± 12.607
Nucleotide diversity: 0.00166 ± 0.0008
Sum of squared deviation (SSD): 0.021054 (P = 0.020)
Harpending’s raggedness index (HRI): 0.05536 (P = 0.01000)
Theta Pi: 27.50877 ± 14.089
Theta S: 38.62541 ± 13.479
Tajima’s D: −1.2013 (P = 0.106)
Fu’s Fs: −5.1538 (P = 0.020)

Fig. 7.186 Mismatch distribution of nucleotide differences of Sonowal Kachari population

Tai Ahom

The Ahom are the descendants of the ethnic Tai people who accompanied the Tai prince Su-Ka-Phaa into the Brahmaputra valley in 1228 and ruled the area for six centuries (Gogoi 1968). The modern Ahom people and their culture are a syncretic blend of the original Tai culture, the indigenous Tibeto-Burmans and Hinduism. Some ethnic groups, including the Tibeto-Burman-speaking Borahi people, were completely subsumed into the Ahom community (Gogoi 2017). The Ahoms now form the largest Mongoloid community of Assam and North East India and are in the majority in Upper Assam (Boruah 2018). The Tibeto-Burman locals living near the Ahoms gave them the name “Ahom” (Gogoi 1968). Many Tai Ahoms practise Hinduism. The Ahom people also practise the ancient and distinct Indian religion Furalung. Given the non-dogmatic nature of Indian religious traditions, an adherent may identify with both Furalung and Hinduism. Similarly, many among the Ahom people practise Buddhism as well. Mohung, Changbun and Moplong are the three priestly Ahom clans. In the ABO blood group system among the Tai Ahom, the O gene frequency is the highest (59.87%), followed by the B gene (20.78%) and the A gene (19.35%) (Flatz et al. 1972). For the present study, Tai Ahom blood samples were collected from the state of Assam.

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 62 individuals were assigned to seven haplogroups (Fig. 7.187) through the screening of Y-SNPs. Haplogroup O3A3C1* has the highest frequency (56%), followed by haplogroups O2A* and R1A1* (13% each), haplogroup H* (10%), haplogroup F* (5%) and haplogroups Q1* and R2 (2% each). Haplogroup O3A3C1* is distributed mainly in South East Asia. Haplogroup O2A* represents Austro-Asiatic genes found in South East Asia, while haplogroup R1A1* is widely distributed in Eurasia. Haplogroup H* is distributed mainly in South Asia (India, Sri Lanka, Nepal and Pakistan), with a lower frequency in Afghanistan. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. While haplogroup R2* is of Central Asian lineage, haplogroup Q1 is distributed in Central Siberia, Central Asia and Native America.

Fig. 7.187 Y chromosome haplogroups of Tai Ahom

Maternal Lineage (mtDNA Haplogroups)

Mitochondrial genomes of 65 Tai Ahom individuals were scanned for maternal lineages in the population, and samples were selected for complete sequencing based on HVR I motifs. The maternal lineages comprise 63% Asian macro-haplogroup M and 37% European macro-haplogroup N. A total of 16 maternal lineages belonging to haplogroup M were found in the Tai Ahom population (Fig. 7.188). Haplogroup D has the highest frequency (19%), followed by M9 and M38 (12% each), M9 and New (10% each), M5, M33, M35, M48, M54, M60 and M71 (5% each) and M30, M41, M44, M45 and M49 (2% each). Haplogroup D is found in Eastern Eurasia, among Native Americans, in Central Asia and occasionally also in West Asia and Northern Europe. Haplogroup M9 is found in East Asia and Central Asia, especially in Tibet. Haplogroup M33 is found in South Asia, Belarus and Southern China. Haplogroup M48 is found in Saudi Arabia. Haplogroups M5, M35 and M41 are found mainly in South Asia. Haplogroup M30 is distributed in India, the Middle East and North Africa. Haplogroup M49 has been found in ancient specimens from the Euphrates Valley. The founder age of the Tai Ahom population was estimated at 59 ± 12 ky. The 23 individuals belonging to macro-haplogroup N were assigned to 14 haplogroups (Fig. 7.189). The most frequent haplogroups are U7, R6 and A (12% each), followed by T, U9, F2 and A14 (8% each) and F1, N8, R5, R30, R32, H and X (4% each). Haplogroup A is found in Central and East Asia, as well as among Native Americans. The subclade R0 within haplogroup R occurs commonly in the Arabian Peninsula, with its highest frequency observed among the Socotri (Černý et al. 2009). Moderate frequencies are also found in North Africa, the Horn of Africa and Central Asia. Haplogroup F is fairly common in East Asia and Southeast Asia. Higher frequencies occur in some areas, such as Nicobar (50%) and Arunachal Pradesh (31%) in India and among the Shors of Siberia (44%). It also occurs at notable frequency among Taiwanese aborigines and in Guangdong (China), Maluku (Indonesia), Thailand and Vietnam. The coalescence age was estimated to be 16.7 ± 5.6 kya. Haplogroup R6 has its most important presence among Austroasiatic language speakers from India. Haplogroup U7 is considered a West Eurasian-specific haplogroup, believed to have originated in the Black Sea area. The coalescence time of haplogroup U7 was estimated to be 32,000 ± 5500 years. Haplogroup T has its highest frequency in the Caspian region (the Caucasus, Northern Iran and Turkmenistan). It is well represented in Europe (almost 10%), the Middle East, Central Asia, Pakistan and North Africa. It is also found at low frequency in the Horn of Africa and India. Haplogroup N8 is found in China. Haplogroup R5 is distributed across groups of the subcontinent, especially in Madhya Pradesh (17%). The coalescence time of haplogroup R5 was estimated to be 66,100 ± 22,000 years. Haplogroup R30 is found in Andhra Pradesh and Uttar Pradesh (India), among the Tharu people of Nepal and among the Sinhalese people of Sri Lanka.

Fig. 7.188 mtDNA phylogenetic tree of M-haplogroup among the Tai Ahom

Fig. 7.189 mtDNA phylogenetic tree of N-haplogroup among the Tai Ahom

Molecular Diversity

Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Tai Ahom population is 0.002153 ± 0.001053. As a measure of genetic variation, it is usually reported alongside other statistical measures of population diversity and is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations and to determine evolutionary relationships (Table 7.43).

Table 7.43 Molecular Diversity Indices among the Tai Ahom
Mean number of pairwise differences: 35.6567 ± 15.724
Nucleotide diversity: 0.00215 ± 0.001
Sum of squared deviation (SSD): 0.001979 (P = 0.510)
Harpending’s raggedness index (HRI): 0.00175 (P = 0.41000)
Theta Pi: 35.6567 ± 17.434
Theta S: 117.3791 ± 30.967
Tajima’s D: −2.4695 (P = 0.000)
Fu’s Fs: −24.1690 (P = 0.000)


Fig. 7.190 Mismatch distribution of nucleotide differences of Tai Ahom population

Figure 7.190 shows the mismatch distribution of the Tai Ahom population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Tai Ahom population, which indicates a recent expansion. The small values of the sum of squared deviations (0.00197948) and Harpending’s raggedness index (r = 0.00175924) also support the conclusion that the Tai Ahom have undergone a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there are many rare mutations, Tajima’s D takes a negative value. In the Tai Ahom population, the value of −2.4695 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (−24.16909) is further evidence of a recent population expansion of the Tai Ahom community.

Tai Khampti

The Tai Khampti are a Tai-Shan community who inhabit the plains of the Lohit district of Arunachal Pradesh. A small section of them is also found in the Lakhimpur district of Assam. Mackenzie (1884) writes that Bor-Khampti, Moonglair-Khampti, Kamjangs, Phakials, Shan and Aitonia are synonyms of the Khampti. The Tai Khampti speak the Khamti language, which belongs to the Siamese-Burman language family. They are mainly cultivators, businessmen and timber merchants. Monogamy is the common practice, but polygyny is also permitted. Negotiation, courtship and elopement are the usual modes of acquiring mates. They are followers of Theravada Buddhism (Singh 1994). Physically, the Tai Khampti are of short stature. The dolichocephalic head form is predominant (39%), followed by the mesocephalic head form (37%). With regard to the ABO blood group system, the incidence of the O blood group (65.8%) is the highest among them, followed by the B blood group (18.1%) and the A blood group (16.1%) (Goswami and Das 1990). For the present study, Tai Khampti blood samples were collected from the state of Arunachal Pradesh.

Paternal Lineage (Y Chromosomal Haplogroups)
The Y chromosomes of the 55 individuals were assigned to five haplogroups (Fig. 7.191) through the screening of Y-SNPs. Haplogroup O3a3c1* has the highest frequency (46%), followed by haplogroup O2a* (41%), haplogroups N1* and R1a1* (each 4%) and haplogroup Q1* (2%). Haplogroup O3a3c1* is distributed mainly in South East Asia, haplogroup O2a* is an Austro-Asiatic lineage found in South East Asia, and haplogroup R1a1* is widely distributed in Eurasia. Haplogroup N1* is found chiefly in North-Eastern Europe, particularly in Finland, Lapland, Estonia, Latvia, Lithuania and Northern Russia. Haplogroup Q1 is distributed in Central Siberia and Central Asia and among Native Americans.

Fig. 7.191 Y chromosome haplogroups of Tai Khampti

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 17 individuals were scanned for maternal lineages in the population. Its maternal lineages comprise 82% Asian macrohaplogroup M and 18% European macrohaplogroup N, selected for complete sequencing based on HVR I motifs. A total of five maternal lineages belonging to haplogroup M were found in the Tai Khampti population (Fig. 7.192). New and M49 haplogroup has the highest frequency (39%), followed by haplogroup D (21%), M54 (14%) and M60 (7% each). Haplogroup D is found in Eastern Eurasia, among Native Americans, in Central Asia and occasionally also in West Asia and Northern Europe. M49 is found among ancient specimens from the Euphrates Valley. Founder age of Tai Ahom population was 59 ± 12 ky. The N-haplogroup sequences of the three individuals of the Tai Khampti were assigned to two haplogroups (Fig. 7.193). The highest frequency is R9 (67%), followed by New (33%). Haplogroup R9 appears mostly in Southeast Asia; it is found all over Indonesia, in Indochina and Malaysia, and in Aboriginal Malays such as the Semelai at 28% and the Temuan at 21%.

Fig. 7.192 mtDNA phylogenetic tree of M-haplogroup among the Tai Khampti

Molecular Diversity
Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Tai Khampti population is 0.000713 ± 0.000381. It is usually associated with other statistical measures of population diversity, and it is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships (Table 7.44). Figure 7.194 shows the mismatch distribution of the Tai Khampti population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is slightly ragged and mostly unimodal in the Tai Khampti population, which indicates recent expansion. The low sum of squared deviations (0.0153) and Harpending’s raggedness index (r = 0.0459) are also consistent with a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. An excess of rare mutations gives a negative Tajima’s D. Among the Tai Khampti, the value of -0.5271 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (-8.62915) is also evidence for a recent population expansion.
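The calculation of nucleotide diversity directly from sequences can be sketched as follows. This is a minimal illustration, assuming aligned sequences of equal length with no gaps or missing data; the function names and the toy sequences are hypothetical and are not part of the data set analysed here.

```python
from itertools import combinations

def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pairwise mean / sequence length)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])

# Toy alignment of three four-base sequences (illustrative only).
toy = ["ACGT", "ACGA", "TCGA"]
k = mean_pairwise_differences(toy)   # (1 + 2 + 1) / 3 pairs
pi = nucleotide_diversity(toy)       # k divided by 4 sites
```

The first function corresponds to the "mean number of pairwise differences" row of the molecular diversity tables, and the second to the per-site nucleotide diversity.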


Fig. 7.193 mtDNA phylogenetic tree of N-haplogroup among the Tai Khampti

Table 7.44 Molecular Diversity Indices among the Tai Khampti
Mean number of pairwise differences: 11.1029 ± 5.307
Nucleotide diversity: 0.0007 ± 0.0003
Sum of squared deviation (SSD): 0.015313 (P = 0.340)
Harpending’s raggedness index (HRI): 0.04590 (P = 0.0900)
Theta Pi: 11.10294 ± 5.940
Theta S: 12.71915 ± 4.795
Tajima’s D: -0.5271 (P = 0.316)
Fu’s Fs: -8.6291 (p = 0.003)

Fig. 7.194 Mismatch distribution of nucleotide differences of Tai Khampti population

Tharu
The Tharu seem to be an offshoot of the Thadau Kuki. They live along the foothills adjacent to the border of Nepal, from district Nainital to district Lakhimpur-Kheri. They speak the languages of their respective regions, viz. Braj, Kanauji, Awadhi and Bhojpuri, and use the Devanagari script. They are divided into several groups, namely the Rana, Katheri and Dangura. The Rana Tharu proudly claim high status in the regional hierarchy. They are aware of the Varna system and place themselves equal to the Kshatriya. These groups are endogamous. The endogamous groups are divided into exogamous units called kurma. They maintain lineage exogamy. Consanguineous marriages of cross cousins, sororate and levirate, and divorcee and widow remarriages are permissible. Monogamy is the rule. The Tharu are largely agriculturists, and land is the main resource. At present they have to depend on settled cultivation. Those who do not own land depend on others as farm labourers. They are Hindu and worship deities of the wider pantheon (Singh 1994). The Tharu are on average below medium height, with a round head tending towards a broader shape. They have a short nose and a round or oval face (Mahalanobis et al. 1949). In the ABO blood group system, the B blood group shows the highest prevalence (51.3%), followed by A (20.5%), O (18.9%) and AB (10.3%) (Srivastava 1965). In the Rh blood group system, the d gene frequency is 15.62. The prevalence of colour blindness is 3.7. Chopra (1970) worked on haptoglobin among the Tharu population of the Kumaon region of Uttarakhand and showed that the HP1 and HP2 gene frequencies were 22.04 and 77.96, respectively. Chopra (1970) also reported gene frequencies of 64.80 for the GC1 variant and 35.20 for the GC2 variant of the group-specific component (GC) system among the Tharu community of the Kumaon region. Kaur et al. (1977) reported a gene frequency of 99.6 for the Tfc variant and 0.4 for the Tfd variant among the Tharu tribal community of the Khatima subdivision of Uttarakhand. Mukherjee et al. (2015a, b) found that the extent of undernutrition (BMI < 18.5) was moderately high (22.2%) among the Tharu, especially among older individuals, and that there was a significant difference in the prevalence of undernutrition between males (26.2%) and females (18.9%). However, on the basis of waist-to-hip ratio (WHR), 29% of adult Tharu were at high risk of obesity, and 47.2% were at the pre-hypertension stage regardless of gender. For the present study, Tharu blood samples were collected from the state of Uttarakhand.

Paternal Lineage (Y Chromosomal Haplogroups)
The Y chromosomes of the 47 individuals were assigned to nine haplogroups (Fig. 7.195) through the screening of Y-SNPs. Haplogroup R1a1* has the highest frequency (26%), followed by haplogroup D1a* (17%), haplogroups J2b1 and R2 (each 15%), haplogroups H* and L* (each 9%), haplogroups H1a* and O3a3c1* (each 4%) and haplogroup H2* (2%). Haplogroup R1a1* is widely distributed in Eurasia. Haplogroup D1a* is distributed mainly in Japan, Central Asia and the Andaman Islands in the Bay of Bengal. Haplogroup J2b1 is found in Italy, Czechoslovakia and Germany, while haplogroup R2 is of Central Asian lineage. Haplogroup H* is distributed mainly in South Asia (India, Sri Lanka, Nepal and Pakistan), with lower frequency in Afghanistan. Haplogroup L is associated with Neolithic migrations from Central Asia. Haplogroup O3a3c1* is distributed mainly in South East Asia.


Fig. 7.195 Y chromosome haplogroups of Tharu

Toda
According to the Todas, the goddess Teikirshy and her brother first created the sacred buffalo and then the first Toda man (Mohanty 2004a, b). The first Toda woman was created from the right rib of the first Toda man. The Toda are distributed in the Nilgiri Hills of Tamil Nadu. The Toda language is a member of the Dravidian family. The language is typologically aberrant and phonologically difficult. Linguists have classified Toda (along with its neighbour Kota) as a member of the southern subgroup of the historical family Proto-South Dravidian. It split off from South Dravidian after Kannada and Telugu, but before Malayalam. In modern linguistic terms, the aberration of Toda results from a disproportionately high number of syntactic and morphological rules, of both early and recent derivation, which are not found in the other South Dravidian languages. They are divided into two endogamous groups, viz., Toroas and Towfily (Mohanty 2004a, b). Importantly, they have a dual clan structure. They are particularly agriculturists. But now their sole occupation is cattle herding and dairy work. Holy dairies are built to store the buffalo milk. They once practised fraternal polyandry, a practice in which a woman marries all the brothers of a family, but no longer do so. All the children of such marriages were deemed to descend from the eldest brother. In the Toda tribe, families arrange contracted child marriages for couples (Singh 1994). The Toda are tall-statured, long-headed people with a narrow nose form, and their physical features conform to the Mediterranean type. They show a high gene B frequency of 38–40% with a lower value of gene O, 35–53%, in the ABO blood groups and a higher Ms combination in the MNSs system (Kirk et al. 1962). For the present study, Toda blood samples were collected from the state of Tamil Nadu.


Fig. 7.196 Y chromosome haplogroups of Toda

Paternal Lineage (Y Chromosomal Haplogroups)
The Y chromosomes of the 22 individuals were assigned to eight haplogroups (Fig. 7.196) through the screening of Y-SNPs. Haplogroup F* has the highest frequency (23%), followed by haplogroups J2b1 and L* (each 18%), haplogroup L1 (14%), haplogroups H1b and R2* (each 9%) and haplogroups H* and R1a1* (each 5%). Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burmese and Turkic linguistic groups throughout Eurasia. Haplogroup J2b1 is found in Italy, Czechoslovakia and Germany. Haplogroup L* is found mainly in Pakistan. Haplogroup R2 is of Central Asian lineage. Haplogroup H* is distributed mainly in South Asia (India, Sri Lanka, Nepal and Pakistan), with lower frequency in Afghanistan. Haplogroup R1a1* is widely distributed in Eurasia.

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 90 individuals were scanned for maternal lineages in the population. Its maternal lineages comprise 22% Asian macrohaplogroup M and 78% European macrohaplogroup N, selected for complete sequencing based on HVR I motifs. A total of two maternal lineages belonging to haplogroup M were found in the Toda population (Fig. 7.197). Haplogroup M2 has the highest frequency (75%), followed by M30 (25%). Haplogroup M2 is distributed mainly in South East Asia. Haplogroup M30 is found mainly in India, the Middle East and North Africa. The founder age of the Toda population was 64 ± 13 ky. The N-haplogroup sequences of the 70 individuals of the Toda were assigned to two haplogroups (Fig. 7.198). The highest frequency is R5 (99%), followed by U7 (1%). Haplogroup U7 is considered West Eurasian-specific and is believed to have originated in the Black Sea area. In India, haplogroup U7 has a significant presence in Gujarat and Punjab, and it is also present in Pakistan. The possible homeland of this haplogroup spans Indian Gujarat (highest frequency, 12%) and Iran, because from there its frequency declines steeply both to the east and to the west. The coalescence time of haplogroup U7 was estimated to be 32,000 ± 5500 years. The subclade R5 of haplogroup R is mainly distributed in the Indian subcontinent. In India, haplogroup R5 was distributed across groups of the subcontinent, especially in Madhya Pradesh at 17%. The coalescence time was estimated to be 66,100 ± 22,000 years.

Fig. 7.197 mtDNA phylogenetic tree of M-haplogroup among the Toda

Fig. 7.198 mtDNA phylogenetic tree of N-haplogroup among the Toda

Molecular Diversity
Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Toda population is 0.000841 ± 0.000422. It is usually associated with other statistical measures of population diversity, and it is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships (Table 7.45). Figure 7.199 shows the mismatch distribution of the Toda population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Toda population is slightly ragged and often multimodal, which is taken to indicate recent expansion. The low sum of squared deviations (0.03905088) and Harpending’s raggedness index (r = 0.05325) are also consistent with a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. An excess of rare mutations gives a negative Tajima’s D. Among the Toda, the value of -0.4196 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (-24.19543) is also evidence for a recent population expansion.
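The mismatch distribution and Harpending’s raggedness index can be illustrated with a short sketch. It assumes the usual definition of the index as the sum of squared differences between successive classes of the observed mismatch distribution, with the class one step beyond the largest observed difference set to zero. The function names and the toy sequences are hypothetical, and the code is not the program used to produce the figures and tables in this chapter.

```python
from collections import Counter
from itertools import combinations

def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    return [counts.get(i, 0) / n_pairs for i in range(max(counts) + 1)]

def raggedness_index(freqs):
    """Harpending's r: sum of squared steps between adjacent classes,
    including the final step down to zero after the last class."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))

# Toy alignment (illustrative only): pairwise differences are 1, 2 and 1.
toy = ["ACGT", "ACGA", "TCGA"]
freqs = mismatch_distribution(toy)   # classes 0, 1 and 2 differences
r = raggedness_index(freqs)
```

A smooth, single-peaked distribution changes little between adjacent classes and so gives a small r, whereas a ragged or multimodal distribution gives a larger one.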

Table 7.45 Molecular Diversity Indices among the Toda
Mean number of pairwise differences: 13.93982 ± 6.313
Nucleotide diversity: 0.00084 ± 0.0004
Sum of squared deviation (SSD): 0.03905 (P = 0.600)
Harpending’s raggedness index (HRI): 0.05325 (P = 0.4100)
Theta Pi: 13.9398 ± 6.994
Theta S: 15.9717 ± 4.265
Tajima’s D: -0.4196 (P = 0.3980)
Fu’s Fs: -24.1954 (p = 0.000)

Fig. 7.199 Mismatch distribution of nucleotide differences of Toda

Toto
They inhabit three hillocks, Subhapara, Dhuchipara and Panchayatpara, on the Indo-Bhutan frontier. At present they are settled in Totopara, Jalpaiguri District, West Bengal. According to the 2011 census, their population was 66,627. They speak the Toto language, which belongs to the Himalayan group of the Tibeto-Burman family of languages. They are also conversant with Bengali and Nepali and use the Bengali script. The Toto have 13 exogamous clans, among which Damku Be is accorded the highest status. Marriage within the mother’s clan is prohibited. Monogamy is commonly practised, and polygyny is rare. Widow remarriage is allowed. Sororate is also practised. The Toto subsist on agriculture, horticulture, poultry farming, animal husbandry and pig rearing. Their religion has been greatly influenced by Hinduism. The most important deity of the Toto is the goddess Ishapa or Mahakali, who is represented by two sacred drums which hang from the ceiling of the temple (Singh 1994). The Toto have a flat nose, small eyes, broad cheeks and thick lips. Chaudhuri et al. (1964), who studied a sample of the Toto from Jalpaiguri district, reported the lowest frequency of blood group O (3%) ever recorded in the Indian subcontinent for the ABO system, together with a very high frequency of the heterozygous phenotype AB (15%). This community is also reported to have a very high percentage (47%) of the B gene. The A2 allele has been found to be significantly high (8%). A significant feature in the MNSs system is that the genes M and N differ very little in their frequencies. S has an equal association with M and N. In the Rh blood group system they show a very high frequency of the Rz (7%) and R (9%) haplotypes, with an absence of r and a very high R1 (70%), a trend closer to other Mongoloid groups of North-East India. For the present study, Toto blood samples were collected from the state of West Bengal.

Paternal Lineage (Y Chromosomal Haplogroups)
The screening of 35 individuals for Y-SNPs revealed the presence of three haplogroups: O3a3c1*, N1* and K*. Among these, O3a3c1* predominates (86%), followed by N1* (11%) and K* (3%) (Fig. 7.200). Haplogroup O3a is predominant among the Tibeto-Burman communities of South East Asia.


Fig. 7.200 Y chromosome haplogroups of Toto

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 102 individuals were scanned for maternal lineages in the population. A total of 31 samples were selected for complete sequencing under the M- and N-haplogroups based on HVR I motifs. A total of four maternal lineages belonging to haplogroup M were found among the Toto population (Fig. 7.201). Among these, 75% belong to D, 14% to M35, 7% to M60 and 4% to M33. The founder age of the Toto was estimated to be 59 ± 12 ky based on the high preponderance of the D maternal lineage. The N-haplogroup sequences of the three individuals of the Toto were assigned to only one haplogroup, F (100%) (Fig. 7.202).

Fig. 7.201 mtDNA phylogenetic tree of M-haplogroup among the Toto

Molecular Diversity
Molecular diversity indices are shown in Table 7.46. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Toto population is 0.000997 ± 0.000510. It is usually associated with other statistical measures of population diversity, and it is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.203 shows the mismatch distribution of the Toto population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Toto population, which indicates a recent expansion. The low sum of squared deviations (0.01919247) and Harpending’s raggedness index (r = 0.01807060) are also consistent with a recent demographic expansion of the Toto. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there is an excess of rare mutations, Tajima’s D takes a negative value. Among the Toto, Tajima’s D is 0.07746 (P = 0.609); since Theta Pi (16.51851) slightly exceeds Theta S (16.18931), this value is close to zero and positive, and on its own it does not indicate expansion. The negative value of Fu’s Fs (-15.63764), however, is evidence for a recent population expansion of the Toto community.
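Because Tajima’s D is the difference between Theta Pi and Theta S divided by a positive standard deviation, its sign can be read directly from the two theta estimates. The short sketch below applies this to the Toto values reported in Table 7.46; the function name is hypothetical, and the code only checks the sign, since the sample size and number of segregating sites needed for the full statistic are not restated here.

```python
def tajima_numerator(theta_pi, theta_s):
    """Numerator of Tajima's D; the sign of D follows the sign of this value."""
    return theta_pi - theta_s

# Theta Pi and Theta S for the Toto, as given in Table 7.46.
toto_numerator = tajima_numerator(16.51851, 16.18931)
toto_sign = "negative" if toto_numerator < 0 else "non-negative"
```

For the Toto the numerator is slightly above zero, in line with the small positive D of 0.07746 given in the table.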

Fig. 7.202 mtDNA phylogenetic tree of N-haplogroup among the Toto

Urali Kuruman
The Urali Kuruman inhabit the states of Kerala and Tamil Nadu. The term ‘uraly’ is also spelt Urali, and it is derived from the words ur, meaning a village, and al, meaning person. They claim that they are descendants of Mutturaja. They were formerly employed as soldiers, and eventually spread into the districts adjoining Madurai. The district of Madurai is considered their original home (Thurston 1909). The Urali Kuruman are distributed in the Manantody and Sultan Battery taluks of the Wayanad district of Kerala. These Kuruman tribes are said to descend from progenitors in the Nilgiris. These ancestors were chiefly collectors, though some also practised farming. The Urali Kuruman converse with each other in a language of the Dravidian family that is a blend of Kannada and Malayalam. They prefer to marry their mother’s brother’s daughter or father’s sister’s daughter. Monogamy is the norm. Sororate and levirate are practised. Many of them earn their livelihood as daily-wage labourers. The community follows Hinduism (Singh 1994). With regard to their dermatoglyphic patterns, there is a preponderance of loops (54.4%) over whorls (44.3%) and arches (1.3%). Their pattern intensity index is 14.3 (Chakravartti and Gupta 1960). Banerjee et al. (1988) studied their ABO blood group system and observed the highest frequency for the O blood group (58.24%), followed by the B blood group (29.67%), the A blood group (8.79%) and the AB blood group (3.29%). For the present study, Urali Kuruman blood samples were collected from the state of Kerala.
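The community descriptions in this chapter report both blood group (phenotype) frequencies and gene frequencies, and the two are related but not the same. As an illustration only, the sketch below converts ABO phenotype percentages into approximate gene frequencies using Bernstein’s square-root estimates, which assume Hardy-Weinberg proportions. The function name is hypothetical, and this is not presented as the method used by Banerjee et al. (1988) or in the present study.

```python
import math

def abo_gene_frequencies(a, b, ab, o):
    """Bernstein's uncorrected estimates of the ABO gene frequencies
    (p for A, q for B, r for O) from phenotype counts or percentages."""
    total = a + b + ab + o
    a, b, o = a / total, b / total, o / total
    r = math.sqrt(o)              # O phenotype frequency is r squared
    p = 1.0 - math.sqrt(b + o)    # B + O phenotypes together are (q + r) squared
    q = 1.0 - math.sqrt(a + o)    # A + O phenotypes together are (p + r) squared
    return p, q, r

# Phenotype percentages reported for the Urali Kuruman:
# A 8.79, B 29.67, AB 3.29, O 58.24.
p, q, r = abo_gene_frequencies(8.79, 29.67, 3.29, 58.24)
```

The three estimates need not sum exactly to one; the size of that deviation is a rough indication of how far the sample departs from the assumed proportions.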

Table 7.46 Molecular Diversity Indices among the Toto
Mean number of pairwise differences: 16.51851 ± 7.583
Nucleotide diversity: 0.00099 ± 0.0005
Sum of squared deviation (SSD): 0.01919 (P = 0.190)
Harpending’s raggedness index (HRI): 0.01807 (P = 0.310)
Theta Pi: 16.51851 ± 8.442
Theta S: 16.18931 ± 5.378
Tajima’s D: 0.07746 (P = 0.609)
Fu’s Fs: -15.637 (p = 0.00)

Fig. 7.203 Mismatch distribution of nucleotide differences of Toto population

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 23 individuals were scanned for maternal lineages in the population. Its maternal lineages comprise 52% Asian macrohaplogroup M and 48% European macrohaplogroup N, selected for complete sequencing based on HVR I motifs. The M-haplogroup sequences of the three individuals of the Urali Kuruman were assigned to only one haplogroup, M2 (100%) (Fig. 7.204). The N-haplogroup sequences of the 11 individuals of the Urali Kuruman were assigned to two haplogroups (Fig. 7.205). The highest frequency is R30 (82%), followed by U1 (18%). The subclade R0 within haplogroup R occurs commonly in the Arabian Peninsula, with its highest frequency observed among the Socotri (Černý et al. 2009). Moderate frequencies are found in North Africa, the Horn of Africa and Central Asia. Haplogroup U1 is found at very low frequency throughout Europe. It is found more often in Eastern Europe, Anatolia and the Near East. It is also found at low frequencies in India. Haplogroup U1 is a very ancient haplogroup, with an estimated age of about 32,000 years.

Molecular Diversity
Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Urali Kuruman population is 0.000699 ± 0.000388. It is usually associated with other statistical measures of population diversity, and it is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships (Table 7.47). Figure 7.206 shows the mismatch distribution of the Urali Kuruman population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution in the Urali Kuruman population is slightly ragged and often multimodal, which is taken to indicate recent expansion. The low sum of squared deviations (0.01300465) and Harpending’s raggedness index (r = 0.0176) are also consistent with a recent demographic expansion. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites, and it is widely used in population genetics. An excess of rare mutations gives a negative Tajima’s D. Among the Urali Kuruman, the value of -1.756 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (-5.0174) is also evidence for a recent population expansion.

Fig. 7.204 mtDNA phylogenetic tree of M-haplogroup among the Urali Kuruman

Wancho
Elwin (1965) described the Wancho, who live in the Tirap district of Arunachal Pradesh, as the most versatile and picturesque people. The Wancho migrated via Burma and part of the Tuensang district of Nagaland (before India’s independence in 1947, the Tuensang area was under NEFA), which is locally called “LongphohSangnua”. They migrated from the Tangnu and Tsangnu areas of Nagaland to their present area. They are also known as Konyak. They speak the Wancho language, which belongs to the Naga group of the Tibeto-Burman family of languages. They use the Roman and Devanagari scripts. There are several exogamous clans among the Wancho. Both monogamy and polygyny are in vogue. In order to maintain the purity of blood, the chief (Wangham) must bring one of his wives from the family of the chief of another village. The Wancho mostly practise slash-and-burn cultivation. The Wancho have had trade relations with Nagaland and also with the Nocte people from time immemorial (Singh 1994). Blood group studies report gene A at a frequency of 18% and gene B at 19% among the Wancho, while gene O shows a moderate value (Sengupta and Dutta 1980). The dermatoglyphic characters exhibit a high value of whorls as compared to loops. For the present study, Wancho blood samples were collected from the state of Arunachal Pradesh.

Fig. 7.205 mtDNA phylogenetic tree of N-haplogroup among the Urali Kuruman

Paternal Lineage (Y Chromosomal Haplogroups)
The screening of 41 individuals for Y-SNPs revealed the presence of three haplogroups: O3a3c1*, K* and N1*. Among these, O3a3c1* predominates (90%), followed by K* and N1* (5% each) (Fig. 7.207). Haplogroup O3a is predominant among the Tibeto-Burman communities of South East Asia.

Table 7.47 Molecular Diversity Indices among the Urali Kuruman
Mean number of pairwise differences: 9.80303 ± 4.830
Nucleotide diversity: 0.00069 ± 0.0003
Sum of squared deviation (SSD): 0.013004 (P = 0.920)
Harpending’s raggedness index (HRI): 0.01767 (P = 0.9900)
Theta Pi: 9.80303 ± 5.439
Theta S: 15.8946 ± 6.431
Tajima’s D: -1.756 (P = 0.02)
Fu’s Fs: -5.0174 (p = 0.010)

Fig. 7.206 Mismatch distribution of nucleotide differences of Urali Kuruman population

Fig. 7.207 Y chromosome haplogroups of Wancho

Maternal Lineage (mtDNA Haplogroups)
Mitochondrial genomes of 125 individuals were scanned for maternal lineages in the population. A total of 39 samples were selected for complete sequencing under the M- and N-haplogroups based on HVR I motifs. A total of seven maternal lineages belonging to haplogroup M were found among the Wancho population (Fig. 7.208). Among these, 48% predominantly belongs to M8, 17% belongs to D, 13% belongs to M9 (10%), M9 (4%) belongs to M11, G and M49 lineages. The founder age of the Wancho was estimated to be 91 ± 15 ky based on the high preponderance of the M8 maternal lineage. The N-haplogroup sequences of the 16 individuals of the Wancho were assigned to five haplogroups (Fig. 7.209). The highest frequency is F1 (37%), followed by N9 (25%), A6 (12%), B4’5 (19%) and R30 (6%). Haplogroup F is fairly common in East Asia and Southeast Asia. Higher frequencies occur in some areas, such as Nicobar at 50% and Arunachal Pradesh at 31% (India), and among the Shors people of Siberia at 44%. There is also an important frequency among Taiwanese aborigines and in Guangdong (China), Maluku (Indonesia), Thailand and Vietnam. The coalescence age was estimated to be 16.7 ± 5.6 kya. Haplogroup N9 is found in the Far East. Haplogroup A is found in Central and East Asia, as well as among Native Americans. Haplogroup I is found in West Eurasia and South Asia. The coalescence age of the Indian ‘N’ cluster is estimated to be 45,405 ± 7752 ybp (Barnabas et al. 2005). R30 was found in South East Asia and the Far East.

Fig. 7.208 mtDNA phylogenetic tree of M-haplogroup among the Wancho

Fig. 7.209 mtDNA phylogenetic tree of N-haplogroup among the Wancho

Molecular Diversity
Molecular diversity indices are shown in Table 7.48. Nucleotide diversity measures the degree of polymorphism within a population and can be calculated by examining the DNA sequences directly. Nucleotide diversity in the Wancho population is 0.001851 ± 0.000936. It is usually associated with other statistical measures of population diversity, and it is similar to the expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, and to determine evolutionary relationships. Figure 7.210 shows the mismatch distribution of the Wancho population. The smooth line is the expected distribution under the hypothesis of constant population size. The mismatch distribution is mostly unimodal in the Wancho population, which indicates a recent expansion. The low sum of squared deviations (0.00533351) and Harpending’s raggedness index (r = 0.00917059) are also consistent with a recent demographic expansion of the Wancho. Tajima’s D is a statistic that compares the average number of pairwise differences with the number of segregating sites. When there is an excess of rare mutations, Tajima’s D takes a negative value. Among the Wancho, the value of -1.44609 indicates that the population is expanding, that is, recovering from a bottleneck. The negative value of Fu’s Fs (-6.77728) is also evidence for a recent population expansion of the Wancho community.

Yanadi The Yanadi are one of the major scheduled tribes of Andhra Pradesh. The Yanadi were notified as a criminal tribe during colonial rule, but were denotified after independence. Thurston (1909) noted that the people were natives of Sriharikota island and suggested that they derived their name

Mean number of pairwise differences 30.52173  13.837

Nucleotide diversity 0.001851  0.000936

Sum of squared deviation SSD P 0.00533 0.710

Table 7.48 Molecular Diversity Indices among the Wanchoo Harpending’s Raggedness index HRI P 0.00917 0.64000 Theta Pi 30.5217  15.432

S 47.4150  15.743

Tajima’s P D P 1.4460 0.047

Fu’s Fs Fu’s Fs 6.7772

p 0.013

420 7 Genomic Diversity of 75 Communities in India

Yanadi

421

Fig. 7.210 Mismatch distribution of nucleotide differences of Wancho population

from the Sanskrit word Anadi, denoting those whose origin is unknown. Their mother tongue is Telegu, and they use the Telegu script (Singh 1994). The Yanadi are divided into two endogamous divisions, Challa and Manchi. In Telegu, Manchi means good and Challa means unclean. The Yanadis, who are settled in the rehabilitation colonies, practise agriculture. The government has allotted them cultivable land. But those who live outside the colonies depend on agricultural labour and food gathering. Following the tradition of so many tribal communities of the entire region, this Yenadi tribal community have adept to the profession of hunting. In fact this Yenadi hunters, shikari as they are popularly referred as, have gained acclamation in entrapping several animals like hares, rats, cobras and leopards. Apart from Andhra Pradesh, these Yenadi tribal communities also are found in other state of Indian subcontinent as well. Festivals, performing art forms, crafts and several artworks have contributed in making the culture and tradition of this Yenadi tribal community quite enriched. Girls of this community are married soon after they attain puberty. Bride price as well as dowry is in vogue among these people. Most of their families are of the nuclear type.

The Yanadi show several distinguishing physical features. They are below medium stature and long headed, and the median value of their cephalic index is among the lowest observed in the state. They have a broad facial profile with a short chin, and short and broad nasal features (Shreenath and Ahmad 1989). They have a dark complexion and a thin build; their muscles are described as soft and flaccid, and their cheekbones as prominent. They differ significantly from most other tribal communities of Andhra Pradesh. On the basis of a multivariate distance analysis, they occupy a position midway between the scheduled caste groups and the tribal communities, and are closer to the scheduled castes. With regard to the ABO blood group, the Yanadi have a very high frequency of gene O (73.60%), with lower frequencies of gene B (16.20%) and gene A (10.20%) (Negi and Maitra 1974). Reddy et al. (1980) observed that 2.26% of individuals are deficient in the G-6-PD enzyme. In the haptoglobin system, the frequency of the HP1 gene is 8.93% and that of the HP2 gene is 91.07% (Reddy et al. 1982). For the present study, Yanadi blood samples were collected from the state of Andhra Pradesh.
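The ABO gene frequencies quoted above (O 73.60, A 10.20, B 16.20) can be turned into expected blood group proportions. The following is a minimal sketch, assuming the quoted values are allele frequencies in percent and that the population is in Hardy-Weinberg equilibrium; neither assumption is stated in the source, the function name `abo_phenotype_expectations` is hypothetical, and the resulting phenotype proportions are illustrative rather than values reported by Negi and Maitra (1974).

```python
def abo_phenotype_expectations(p_a, q_b, r_o):
    """Expected ABO phenotype proportions under Hardy-Weinberg equilibrium.

    p_a, q_b, r_o are the frequencies of alleles A, B and O (as fractions).
    """
    total = p_a + q_b + r_o
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"allele frequencies must sum to 1, got {total}")
    return {
        "A": p_a ** 2 + 2 * p_a * r_o,   # genotypes AA and AO
        "B": q_b ** 2 + 2 * q_b * r_o,   # genotypes BB and BO
        "AB": 2 * p_a * q_b,             # genotype AB
        "O": r_o ** 2,                   # genotype OO
    }


# Allele frequencies quoted in the text, converted from percent to fractions.
yanadi_abo = abo_phenotype_expectations(p_a=0.1020, q_b=0.1620, r_o=0.7360)

for phenotype, proportion in yanadi_abo.items():
    print(f"{phenotype}: {100 * proportion:.2f}%")
```

Under these assumptions a little over half of the individuals would be expected to have blood group O, because the O phenotype requires two copies of the O allele (0.736 squared).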


Fig. 7.211 Y chromosome haplogroups of Yanadi

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 38 individuals were assigned to eight haplogroups (Fig. 7.211) through the screening of Y-SNPs. Haplogroup F* has the highest frequency (26%), followed by haplogroup R2 (24%), haplogroup R1A1* (16%), haplogroups J2B1 and L1 (11% each), haplogroup C5* (8%) and haplogroups H1A* and H2* (3% each). Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burman and Turkic linguistic groups throughout Eurasia. Haplogroup R2 is of Central Asian lineage. Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup J2B1 is distributed mainly in Italy, Czechoslovakia and Germany. Haplogroup L1 is distributed typically among Dravidian communities of India. Haplogroup C5* is found in high frequency among the Australian aborigines. While haplogroup H1A* is found among the Dravidian and Central Indian communities, H2* is distributed primarily in Europe, chiefly Western Europe, and is also found among Armenians and Iranians as well as among the peoples of India and South Asia.
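The rounded Yanadi percentages (26, 24, 16, 11, 11, 8, 3 and 3) add up to 102, so it can help to convert them back to whole-number counts for the 38 individuals. The sketch below does this and then computes Nei's unbiased haplogroup diversity, h = n/(n - 1) × (1 - Σ p_i²). The counts are inferred from the published percentages and are not reported in the source; the diversity statistic is a standard one that the source does not report for this sample, and the function names are hypothetical.

```python
SAMPLE_SIZE = 38

# Haplogroup percentages as given in the text (rounded to whole percent).
yanadi_percent = {
    "F*": 26, "R2": 24, "R1A1*": 16, "J2B1": 11,
    "L1": 11, "C5*": 8, "H1A*": 3, "H2*": 3,
}


def counts_from_percentages(percent, n):
    """Infer whole-number counts by rounding percent * n / 100."""
    return {hg: round(p * n / 100) for hg, p in percent.items()}


def haplogroup_diversity(counts):
    """Nei's unbiased diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = sum(counts.values())
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


yanadi_counts = counts_from_percentages(yanadi_percent, SAMPLE_SIZE)
yanadi_h = haplogroup_diversity(yanadi_counts)

print(yanadi_counts)
print(f"inferred total = {sum(yanadi_counts.values())}, h = {yanadi_h:.4f}")
```

Here the inferred counts sum to exactly 38, and each count rounds back to the published percentage, which suggests the excess over 100% is a rounding effect rather than a tabulation error.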

Yerukulas

The Yerukulas (also Yerukala, Erukala or Erukula) are a scheduled tribe native to Andhra Pradesh and Telangana. The Yerukulas derive their name from Eruku, which means acquaintance or knowledge. They believe that Renuka was brought to life with the head of a man and came to be known as “Ellamma”, the patron deity of the community. According to the 2011 census, their total population is 519,337, of whom 259,169 are males and 260,168 are females (sex ratio = 1004). They have a total literacy rate of 55.1% in the state of Andhra Pradesh. The Yerukulas speak their own dialect, known as Kurru basha or Kulavatha, which belongs to the Dravidian language family. They rear pigs and sell pork. They are also involved in basket making and work as daily-wage labourers. The women of the community are fortune tellers and tattoo artists. The Yerukulas are also a denotified tribal community. For this study, Yerukulas blood samples were collected from the state of Andhra Pradesh.

Fig. 7.212 Y chromosome haplogroups of Yerukulas

Paternal Lineage (Y Chromosomal Haplogroups)

The Y chromosomes of the 39 individuals were assigned to seven haplogroups (Fig. 7.212) through the screening of Y-SNPs. Haplogroup L1 has the highest frequency (40%), followed by haplogroup R1A1* (13%), haplogroup R2 (12%), haplogroup H1A* (8%), haplogroup J2B1 (5%) and haplogroups F* and C5* (3% each). Haplogroup L1 is distributed typically among Dravidian communities of India. Haplogroup R1A1* is widely distributed in Eurasia. Haplogroup R2 is of Central Asian lineage. Haplogroup J2B1 is distributed mainly in Italy, Czechoslovakia and Germany. Haplogroup F* is found mostly among the Dravidian, Indo-European, Sino-Tibetan, Tibeto-Burman and Turkic linguistic groups throughout Eurasia. Haplogroup C5* is found in high frequency among the Australian aborigines. Haplogroup H1A* is found among the Dravidian and Central Indian communities.

References

Action Aid data (2016). http://www.actionaid.org/india/what-we-do/odisha/working-paudi-bhuyan-tribessundargarh. Accessed 23 May 2016
Ahmed SN (1977) ABO blood groups and sickle cell trait among the Kondh and Nuka Dora of Koraput district. In: Rakshit HK (ed) Bio-anthropological research in India. Anthropological Survey of India, Calcutta, pp 119–126
Ananthakrishnan R, Kirk RL (1969) The distribution of some serum protein and enzyme group systems in two endogamous groups in south India. Indian J Med Res 57:1011
Ananthakrishnan R, Blake NM, Kirk RL, Baxi AJ (1970) Further studies on the distribution of genetic variants of lactate dehydrogenase in India. Med J Aust 2(17):787–789. https://doi.org/10.5694/j.1326-5377.1970.tb63171.x

Balgir RS (2007) Identification of a rare blood group “Bombay (oh) phenotype” in Bhuyan tribe of Northwestern Orissa, India. Indian J Hum Genet 13(3):109–113
Banerjee MK, Banerjee DK (1967) ABO blood among the Jaunsaris. Man India 47:133–138
Banerjee S, Roy M, Dey B, Mukherjee DN, Bhattacharjee SK (1988) Genetic polymorphism of red cell antigen, enzyme, haemoglobin and serum proteins in 15 endogamous groups of South India. J Indian Anthropol Soc 23:250–259
Bareh H (1967) The history and culture of the Khasi people. Spectrum Publications, Guwahati
Barnabas S, Shouche Y, Suresh CG (2005) High-resolution mtDNA studies of the Indian population: implications for palaeolithic settlement of the Indian subcontinent. Ann Hum Genet 70:42–58
Bhasin V (1994) People, health and disease: the Indian scenario. Kamla Raj Enterprises, Delhi
Bhasin MK et al (1986) Biology of the people of Sikkim, studies on the variability of genetic markers. Zeitschrift fur Morphologie und Anthropologie 77(1):49–86
Bhattacharjee PN (1968) Sero genetic variations in the Lepcha of Darjeeling district with special reference to Indian Mongoloid. Bull Anthropol Surv India 17(4):382–392
Bhattacharjee PN (1975) Blood groups of the Austro-Asiatic and Tibeto-Burman speaking people in eastern India: a review. Proceedings Annual Conf. Ind J Hum Genet
Bhattacharya KK, Biswas SK (1978) Select genetic marker in Lakhshadweep. Man India 58(2):65–70
Bonnington MCC (1932) Census of India 1931. The Andaman and Nicobar Islands, vol II. Government of India, Calcutta
Boruah BH (2018) The Tai Ahom movement in northeast India: a study of all Assam Tai Ahom student union. IOSR J Human Social Sci 23(7):45–50
Buchanan F (1870) A journey from Madras through the countries of Mysore, Canara, and Malabar. Higginbotham and Co., Madras. ISBN 1108116302
Buchi EC (1953) ABO, MN, Rh blood groups and secretor factor in Kanikkar: a genetic survey in South Travancore and a contribution to the race problem in India. Bull Anth Surv India 2:83–98
Buchi EC (1959) Blut, Geschmack und Farbensinn bei den Kurumba (Nilgiri, Südindien). Archiv der Julius Klaus-Stiftung für Vererbungsforschung, Sozialanthropologie und Rassenhygiene, 310–316
Census of India (2011) Office of the Registrar General and Census Commission. Government of India, New Delhi
Černý V et al (2009) Out of Arabia—the settlement of island Soqotra as revealed by mitochondrial and Y chromosome genetic diversity. Am J Phys Anthropol 138(4):439–447
Chakrabartti MR, Mukherjee DP (1964) Dermatoglyphic study of tribe and castes of Nilgiri Hills, Tamil Nadu (Madras State). Zeitschrift fur Morphologie und Anthropologie 55(3):335–336


Chakravartti MR, Gupta P (1960) Dermatoglyphics of the Uralis of Kerala. Man India 40:36–51
Chakravartti MR, Mukherjee DP (1962) Dermatoglyphic study on the tribes and castes of Nagaland and Manipur states. Bull Anthropol Surv India 11(3&4):233–234
Chaubey G, Karmin M, Metspalu E, Metspalu M, Selvi-Rani D, Singh VK et al (2008a) Phylogeography of mtDNA haplogroup R7 in the Indian peninsula. BMC Evol Biol 8:227. https://doi.org/10.1186/1471-2148-8-227
Chaubey G, Metspalu M, Karmin M, Thangaraj K, Rootsi S et al (2008b) Language shift by indigenous population: a model genetic study in South Asia. Int J Hum Genet 8(1–2):41–50
Chaudhuri S et al (1964) A study of haematological factors, blood groups, anthropometric measurements and genetics of some of the tribal and caste groups of 1. South India-Kerala, 2. North Eastern India-TotoPara. Bibl Haemat 19:186–205
Chopra VP (1970) Studies on serum groups in Kumaon Region, India. Hum Genet 10:35–43
Clio DS et al (2013) Ancient DNA reveals prehistoric gene-flow from Siberia in the complex human population history of North East Europe. PLoS Genet 9(2):e1003296
Das BM (1958) Blood groups of the Rabhas. Man India 38:213–215
Das BM (1978) Variation in physical characteristics in the Khasi population of North East India. D. K. Publishers’ Distributors, New Delhi
Das P (2015) Oral traditions of the Sonowal Kacharis of Assam. Int J Interdiscip Res Sci Soc Cult 1(2):89–99
Das SR, Ghosh L (1954) A genetic survey among the Paniyan, a south Indian Aboriginal Tribe: ABO, MN Blood group, secretor factor and taste ability. Bull Dept Anthropol Govt India 3(1):65–72
Das SR et al (1962) Blood groups (ABO, MN, Rh) and ABH secretion, Sickle Cell Trait and Colour Blindness in Gadaba and Bareng Paroja of Koraput District in Orissa. Bull Dept Anthropol Govt India 11(3 and 4):145–151
Das SR et al (1966) Blood groups (A1, A2, BO, MN, Rh) and ABH secretion in the Pareng Gadaba, Ollaro Gadaba and the Konda Paroja of Koraput District in Orissa. Acta Genet Stat Med 16:169–189
Das BM, Deka R, Flatz G (1975) Predominance of the HbE gene in a Mongoloid population in Assam (India). Humangenetik 30(2):187–191
Das BM, Deka R, Das R (1980) ABO blood groups and Rh factor in six populations of Assam. Bull Anthropol Surv India 29(1&2):1–3
Devi SS, Naidu JM (1987) Sickle cell trait among Bagatha, Konda Dora and Parangi Poraja tribes. Proceedings of Indian Society of Human Genetics Annual Conference, Calcutta
Doungel J (2019) Ethnicity in Mizoram: a case study of ethnic issues in the Sixth Schedule Area of the state. Conference: International Level Seminar on Ethnicity, Identity & Literature at Guwahati, Assam

Elwin V (1951) The tribal art of middle India. Oxford University Press, London
Elwin V (1965) A philosophy for NEFA. North-East Frontier Agency, Shillong
Flatz G, Chakravarty MR, Das BM, Delbruck H (1972) Genetic survey in the population of Assam, I. ABO blood groups, glucose-6-phosphate dehydrogenase and haemoglobin type. Hum Hered 22:323–330
Gaikwad S, Vasulu TS, Kashyap VK (2006) Microsatellite diversity reveals the interplay of language and geography in shaping genetic differentiation of diverse Proto-Australoid populations of west-central India. Am J Phys Anthropol 129(2):260–267
Ganguly P (1976) Physical anthropology of the Nicobarese. Anthropological Survey of India, Calcutta
Ghosh AK (1977) The distribution of genetic variation of glyoxalase 1, esterase-D and carbonic anhydrase I and II in Indian populations. Indian J Phys Anthropol Hum Genet 32(2):73–83
Goud JD, Rao PR (1979) Genetic distance among the five tribal populations of Andhra Pradesh, South India. Anthropol Anz 37:1–9
Goud JD, Rao PR (1980) Transferrin, haptoglobin and group specific component types in tribal populations of Andhra Pradesh. Hum Hered 30:12–17
Gogoi P (1968) The Tai and the Tai kingdoms. Gauhati University, Guwahati
Gogoi J (2017) Socio-economic characteristics of Tai Ahom people in Assam a geographical analysis. Int J Interdiscip Res Sci Soc Cult 3(1):257–266
Goswami MC, Das PB (1990) The people of Arunachal Pradesh: a physical survey Directorate of research. Government of Arunachal Pradesh, Itanagar
Gupta BA (1907) Anthropometric data from Bombay: Ethnographic Survey of India. Government Press, Calcutta
Kapoor AK, Vaid NK (1977) Red cell glucose-6-phosphate dehydrogenase deficiency in the Bhotias of Central Himalayas. Anthropologist 24:86–89
Karve I (1954) Anthropometric measurements in Karnataka and Orissa and a comparison of these two regions with Maharashtra. J Anthropol Soc Bombay 8(1):45–75
Karve I, Dandekar V (1951) Anthropometric measurements of Maharashtra, Deccan College Monograph Series No. 8. Deccan College, Postgraduate and Research Institute, Pune
Kaur H, Sehajpal PK, Srivastava PK (1977) Genetic studies among the Tharu tribals. Acta Anthropogenetica 1(3):35–37
Kirk RL, Lai LYC, Vos GH, Wickremasinghe RL, Perera DJB (1962) The blood and serum groups of selected populations in South India and Ceylon. Am J Phys Anthropol 20:485–489
Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K et al (2003a) The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet 72:313–332

Kivisild T, Rootsi S, Metspalu M, Metspalu E, Parik J et al (2003b) Genetics of the language and farming spread in India. In: Renfrew C, Boyle K (eds) Examining the farming/language dispersal hypothesis, McDonald Institute monographs series. McDonald Institute for Archaeological Research, Cambridge, pp 215–222
Krishan G (1987) Finger dermatoglyphics of three tribes of Gujarat. Vanyajati 35(1):3–6
Kumar N, Bhattacharjee PN (1976) Blood group of the Munda of Ranchi District in Bihar (India). Indian J Phys Anthropol Hum Genet 2(2):201–214
Mackenzie A (1884) History of the relations of the Government with the hill tribes of North-East Frontier of Bengal. Cambridge University Press, Cambridge
Mahalanobis PC, Majumdar DN, Rao CR (1949) Anthropometric survey of the United Provinces, 1941: a statistical study. Sankhya 9:90–324
Majumdar DN, Krishen K (1947) Blood group distribution in the United Provinces; Report on the Serological survey of the U.P. Census Operations-1941. East Anthropol 1:8–15
Mandal D, Bera GK (2007) Sahariya-II. In: Meheta PC (ed) Cultural heritage of Indian tribes. Discovery Publishing House, New Delhi, pp 201–208
Mishra PK (1977) The nomadic Gadulia Lohar of Eastern Rajasthan. Director: Anthropological Survey of India, Calcutta
Mitra PN (1936) Blood groups of the Angami Naga and the Lushai tribes. Indian J Med Res 23:685–686
Mohan Raj BK, Karutha Pandian S, Damodaran C, Chandrasekharan P (1986) ABO, Rh(D), MN blood groups and ABH secretion in Kotas and Badagas of Nilgiri Hills, South India. Indian J Phys Anthropol Hum Genet 12:213–217
Mohanty RP (2004a) Ethnographic profile of highland Bonda of Orissa. Tribal Tribune 1(3)
Mohanty PK (2004b) Encyclopedia of primitive tribes in India, vol 2. Kalpaz Publication, New Delhi
Morlote D, Gayden T, Arvind P et al (2011) The Soliga, an isolated tribe from Southern India: genetic diversity and phylogenetic affinities. J Hum Genet 56:258–269
Mukherjee BN, Malhotra KC, Kate SL (1977) Distribution of PTC threshold among 13 endogamous castes and tribes Maharastra. J Indian Anthropol Soc 12:157–166
Mukherjee BN, Malhotra KC, Das SK, Majumdar PP, Roy M et al (1979) Genetic polymorphism analysis among nine endogamous population groups of Maharashtra, India. J Hum Evol 8:555–566
Mukherjee K, Harashawaradhana VPN, Alam A, Rawat B (2015a) Body mass index and chronic energy deficiency among adults of Tharu population, Uttarakhand, India. Int J Biomed Res 6(7):475–478
Mukherjee K, Harashawaradhana VPN, Alam A, Rawat B (2015b) Anthropometric characteristics and nutritional status of adult Tharu: a tribal population of Uttarakhand, India. Int J Biol Pharm Res 6(9):759–764
Naidu JM (1989) Red further data on colour blindness among Andhra population. Man India 69(4):413–415

Naidu JM, Naidu Y, Sachidevi S (1990) Further data and Rhesus blood groups among Andhra population. J Hum Ecol 1(1):53–54
Nandy B (2016) The Bondo of Odisha, book of PVTGS in India. AnSI, Kolkata
Nanjundayya HV, Ananthakrishna Iyer LK (1930) The Mysore tribes and castes, vol IV. The Mysore University, Mysore, pp 592–599
Natarajan N (1977) The missionary among the Khasis. Sterling Publishers Pvt Limited, New Delhi, p 24
Nath A (2019) Ethnic dialects of the Assamese language. Int J Sci Technol Res 8(12):3081–3088
Nayak AN (2010) Primitive tribal groups of Orissa: an evaluation of census data. Orissa Rev (Census Special):202–205
Negi RS (1968) Sickle cell trait in India: a review of known distribution. Bull Anthropol Surv India 17(4):439–449
Negi TS (1976) Scheduled tribes of Himachal Pradesh: a profile. Raj Printers, Meerut
Negi RS (1990) ABO blood groups in the North-Western India; a regional round up. In: Human variations in India. Anthropological Survey of India, Kolkata, pp 71–81
Negi RS, Maitra A (1974) ABO blood groups in some western and southern Indian tribes
Olga AD et al (2002) Traces of early Eurasians in the Mansi of Northwest Siberia revealed by mitochondrial DNA analysis. Am J Hum Genet 70(4):1009–1014
Page HR Jr (2020) Bondo. Encyclopedia of world culture dictionary. https://www.encyclopedia.com/humanities/encyclopedias-almanacs-transcripts-and-maps/bondo
Papiha SS, Mukherjee BN, Chahal SMS, Malhotra KC, Roberts DF (1982) Genetic heterogeneity and population structure in north-west India. Ann Hum Biol 9:235–251
Papiha SS, White I, Singh BN, Agarwal SS, Shah KC (1987) Gc subtypes in the Indian subcontinent. Hum Hered 37:250–254
Papiha SS, Roberts DF, Mishra SC (1988) Serogenetic studies among an urban and two tribal populations of Orissa, India. Ann Hum Biol 15(2):143–152
Pingle U, Mukherjee BN, Das SK (1981) A genetical study of the five tribal groups of Andhra Pradesh, India. Z Morphol Anthropol 72(3):339–348
Quintana-Murci L, Chaix R, Wells S, Behar D, Sayar H et al (2004) Where West meets East: the complex mtDNA landscape of the Southwest and Central Asian corridor. Am J Hum Genet 74(5):827–845
Rajak J (2016) Nutritional and socio-economic status of Saharia tribes in Madhya Pradesh. Int J Human Social Sci 6(1):79–85
Ramesh A et al (1979) Genetic studies on the Kolams of Andhra Pradesh, India. Hum Hered 29:147–153
Ramesh A, Blake NM, Vijaykumar M, Murty JS (1980) Genetic studies on the Chenchu tribe of Andhra Pradesh. Hum Hered 30:291–298


Ramya T (2012) Traditional religious beliefs, practices and impacts of Christianity among the Nyishis. Asian J Res Social Sci Human II(VI):15–28
Ramya T, Ramjuk T (2018) Changing cultural practices among the Nyishis of Arunachal Pradesh: a contextual study. Int J Res Anal Rev 5(2):619–624
Rao PM (1971) ABO blood group studies of Koya Doras of Rampachodavaram taluk, East Godavari District, AP, M.Sc dissertation (unpublished), Waltair, Andhra Pradesh
Rao VV, Ramesh A, Vijaykumar M, Murty JS (1983) Digital dermatoglyphics in some tribal populations of A.P., India. Acta Anthropogenet 7:27–34
Rao IA, Chandrasekhar A, Pulamgatta V, Das S, Bose K (2014a) Sexual dimorphism in blood pressure and hypertension among Adult Parengi Porjas of Visakhapatnam, Andhra Pradesh, India. J Anthropol 2014:1–5
Rao IA, Chandrasekhar A, Pulamgatta V, Das S, Bose K (2014b) Blood pressure and hypertension among adult Savara tribals of Visakhapatnam, Andhra Pradesh, India: a public health concern. Afro Asian J Anthropol Soc Policy 5(1):96–101
Reddy AP, Mukherjee BN, Ramachandraiah T (1980) Distribution of ABO and Rh (D) blood groups among four endogamous groups of Andhra Pradesh. Hum Hered 30:31–33
Reddy AP, Mukherjee BN, Malhotra KC et al (1982) A serological and biochemical genetic study among the coastal and plateau Yanadi: a tribal population of Andhra Pradesh. Homo 33:174–218
Reddy BM, Chilke AM, Tummawar SD, Bacher SS (2012) Traditional equipments of Madia Gond of Devda Village, Gadchiroli District (Maharashtra), India. Golden Res Thoughts 2(3):1–3
Roy SB (1980) A note on the ABO, Rh(D) blood groups among the Karens of Andaman Islands. Indian J Phys Anthropol Hum Genet 6(2):151–152
Ruhela SP (1968) The Gadulia Lohars of Rajasthan, a study in the sociology of nomadism. Impex India, New Delhi
Russell RV, Hiralal RB (1975) The tribes and castes of the central provinces of India. Macmillan and Company, London
Saha N (1973) Haemoglobinopathies in the Indian subcontinent: a review article. Acta Genet Med Gemellol 2:117–138
Saha N, Kirk RL, Shanbhag S, Joshi SR, Bhatia HM (1976) Population genetic studies in the Kerala and Nilgiris (South West India). Hum Hered 28:175–197
Sarkar SS (1952) Blood groups from Andaman and Nicobar islands. Bull Anthropol Surv India 1(1):25–30
Sarkar BN (1985) The Jarawa of Andaman Islands: an anthropometric study. Hum Sci 34:186–194
Sarkar SS et al (1960) Further studies on ABO blood groups, Orissa. Sci Cult 25:694–695
Sastry DB (1990) Sickle cell disease in some tribes of South India. In: Human variations in India. Anthropological Survey of India, Calcutta

Sati VP (2015) Sahariya tribe: society, culture, economy and habitation. Ann Nat Sci 1(1):26–31
Sengupta S, Dutta MN (1980) ABO blood groups in the Kibarta, Ahom and the Wancho of North-East India. Acta Anthropogenet 4(3&4):245–246
Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA et al (2006) Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of central Asian pastoralists. Am J Hum Genet 78(2):202–221
Seth PK, Seth S (1971) Biogenetical studies of Nagas, G-6-PD deficiency in Angami Nagas. Hum Biol 43:4
Seth PK, Seth S (1973) Genetical study of Angamai Nagas (Nagaland, India), A1, A2, Bo, Mn, Rh blood groups, ABO(H) secretion, PTC taste sensitivity and colour blindness. Hum Biol 45:457–468
Shreenath J, Ahmad SH (1989) All India anthropometric survey: south zone, Andhra Pradesh (analysis of data). Anthropological Survey of India, Calcutta
Singh KS (1994) People of India: the scheduled tribes, vol 3. Oxford University Press, Delhi
Singh KS (1998) India’s communities A-G. Anthropological Survey of India. Oxford University Press, New Delhi, pp 922–923
Singh NK (2006) Global encyclopaedia of the south Indian dalit’s ethnography, vol 1. Global Vision Publishing House, New Delhi, pp 759–763
Sirajuddin SM (1993) Human biology of the Chenchus of Andhra Pradesh. Anthropological Survey of India, Calcutta
Srivastava RP (1965) Blood groups in the Tharus of Uttar Pradesh and their bearing on ethnic and genetic relationship. Hum Biol 37:1–12

Tehrani NH (2015) The ethnographic narration of Gadulia Lohar Tribe of Udaipur, Rajasthan: with the special reference to the ethnoarchaeological perspective and traditional iron tool technology. Ancient Asia 6(2):1–11. https://doi.org/10.5334/aa.12321
Thurston E (1909) Castes and tribes of southern India. Government Press, Madras
Tiwari SC (1952) Anthropometric study of the Bhotias of Almora district, U.P. Man India 32:148
Tiwari SC, Bhasin MK (1975) Ethnic composition of some general Himalayan population. Man India 55(2):128–136
Tiwari VK, Pradhan PK, Agarwal S (1980) Haemoglobin in scheduled castes and scheduled tribes of Raipur (M.P.): a preliminary report. Indian J Med Res 71:397–401
Undevia JV, Gulati RK, Sukumaran PK, Bhatia HM, Bhatia HM et al (1981) Genetic variation in Tamil Nadu. In: Sanghvi LD et al (eds) Biology of the people of Tamil Nadu. Indian Society of Human Genetics, Pune, The Indian Anthropological Society, Calcutta, pp 75–102
Vyas GN, Bhatia HM, Sukumaran PK, Balakrishnan V, Sanghvi LD (1962) Study of blood group, abnormal haemoglobin and genetical character in tribes of Gujarat. Am J Phys Anthropol 20:255–265
Walter H, Dannewitz A, Veerraju P, Goud JD (1984) Gc subtyping in South Indian tribal and caste populations. Hum Hered 34:250–254
Weninger M (1952) cf. Gupta P, Dutta PC (1966) Anthropometry in India. Anthropological Survey of India, Calcutta


© Springer Nature Singapore Pte Ltd. 2021 Anthropological Survey of India, Genomic Diversity in People of India, https://doi.org/10.1007/978-981-16-0163-7
