Anja Mays · André Dingelstedt · Verena Hambauer · Stephan Schlosser · Florian Berens · Jürgen Leibold · Jan Karem Höhne (Hrsg.)
Grundlagen – Methoden – Anwendungen in den Sozialwissenschaften Festschrift für Steffen-M. Kühnel
Hrsg. Anja Mays Ruhr-Universität Bochum Bochum, Deutschland
André Dingelstedt Institut für Qualitätssicherung und Transparenz im Gesundheitswesen (IQTIG) Berlin, Deutschland
Verena Hambauer Methodenzentrum Sozialwissenschaften Universität Göttingen Göttingen, Deutschland
Stephan Schlosser Methodenzentrum Sozialwissenschaften Universität Göttingen Göttingen, Deutschland
Florian Berens Methodenzentrum Sozialwissenschaften Universität Göttingen Göttingen, Deutschland
Jürgen Leibold Methodenzentrum Sozialwissenschaften Universität Göttingen Göttingen, Deutschland
Jan Karem Höhne Universität Mannheim Mannheim, Deutschland RECSM-Universitat Pompeu Fabra Barcelona, Spanien

ISBN 978-3-658-15628-2
ISBN 978-3-658-15629-9 (eBook)
https://doi.org/10.1007/978-3-658-15629-9

Die Deutsche Nationalbibliothek verzeichnet diese Publikation in der Deutschen Nationalbibliografie; detaillierte bibliografische Daten sind im Internet über http://dnb.d-nb.de abrufbar.

© Springer Fachmedien Wiesbaden GmbH, ein Teil von Springer Nature 2020

Das Werk einschließlich aller seiner Teile ist urheberrechtlich geschützt. Jede Verwertung, die nicht ausdrücklich vom Urheberrechtsgesetz zugelassen ist, bedarf der vorherigen Zustimmung des Verlags. Das gilt insbesondere für Vervielfältigungen, Bearbeitungen, Übersetzungen, Mikroverfilmungen und die Einspeicherung und Verarbeitung in elektronischen Systemen.

Die Wiedergabe von allgemein beschreibenden Bezeichnungen, Marken, Unternehmensnamen etc. in diesem Werk bedeutet nicht, dass diese frei durch jedermann benutzt werden dürfen. Die Berechtigung zur Benutzung unterliegt, auch ohne gesonderten Hinweis hierzu, den Regeln des Markenrechts. Die Rechte des jeweiligen Zeicheninhabers sind zu beachten.

Der Verlag, die Autoren und die Herausgeber gehen davon aus, dass die Angaben und Informationen in diesem Werk zum Zeitpunkt der Veröffentlichung vollständig und korrekt sind. Weder der Verlag, noch die Autoren oder die Herausgeber übernehmen, ausdrücklich oder implizit, Gewähr für den Inhalt des Werkes, etwaige Fehler oder Äußerungen. Der Verlag bleibt im Hinblick auf geografische Zuordnungen und Gebietsbezeichnungen in veröffentlichten Karten und Institutionsadressen neutral.

Planung/Lektorat: Katrin Emmerich

Springer VS ist ein Imprint der eingetragenen Gesellschaft Springer Fachmedien Wiesbaden GmbH und ist ein Teil von Springer Nature. Die Anschrift der Gesellschaft ist: Abraham-Lincoln-Str. 46, 65189 Wiesbaden, Germany
Vom Nerd zum Direktor des Methodenzentrums und zurück
Festschriften enthalten, wie Beerdigungen, normalerweise nur die Erfolge und guten Taten eines Menschen. Zudem enthalten sie selten besonders spannende oder provozierende Aufsätze. Dies wäre jedoch genau der falsche Ansatz bei einer Festschrift für Steffen Kühnel.
In der ganzen Zeit seines beruflichen Werdegangs und seiner beruflichen Tätigkeit hat er sehr konsequent seine Meinung und seine Forschungslinie vertreten. Damit begann er bereits als Student der Sozialwissenschaften an der Universität Hamburg, wo er in den Seminaren sowohl Karl-Dieter Opp (Soziologie) als auch Walter Kristof (Statistik) immer wieder deutlich widersprach. Auf Fotos sowie aufgrund seiner Persönlichkeit könnte man sagen, er wirkt(e) wie ein echter Nerd. Nach dem Studium führte Steffen Kühnels Weg über eine Tätigkeit beim Rechenzentrum in Siegen zur Universität Bremen. Während dieser Zeit lernten wir Steffen im Zusammenhang mit der Auswertung der ZUMA-Test-Retest-Studie (gemeinsam mit Wolfgang Jagodzinski) kennen, und wir wurden über die Jahre Freunde. Gegenstand dieser Studie war die Analyse des dreiwelligen Panels mit alternativen autoregressiven Modellen. „Sklaventreiber" wären nichts im Vergleich zu einer Zusammenarbeit mit Steffen und Wolfgang.
Steffen hat diese Haltung stets beibehalten, und Nächte, Tage, wochenlanges Arbeiten und Urlaubsverzicht waren in seinen Augen eine durchaus angemessene Arbeitshaltung. Die Zusammenarbeit an der ZUMA-Test-Retest-Studie führte ebenso wie diejenige am Statistikbuch zu jeweils sehr erfolgreichen Resultaten. Im ersten Fall war es eine englischsprachige Publikation zum „socratic effect" in Sociological Methods and Research 1987, die der Ausgangspunkt einer Debatte mit Willem W. Saris war und zu weiteren Publikationen (1988–1990) führte, die sich mit den Problemen äquivalenter Modelle bei gleichen Daten auseinandersetzten. Im zweiten Fall resultierte (Jahre später) die Zusammenarbeit mit Steffen Kühnel in einem grundlegenden Statistikbuch für Sozialwissenschaftler, das sich vielfältiger Anwendung in der Lehre erfreut. Über lange Jahre war Steffen ein Guru der Strukturgleichungsmodellierung mit LISREL, womit nicht nur seine perfekte Beherrschung des Programms, sondern vor allem seine profunden Kenntnisse der statistischen Grundlagen des Verfahrens gemeint sind.

In seiner Dissertation, die unter anderem von Erwin K. Scheuch und Wolfgang Jagodzinski betreut wurde, wendete er die Theorie des geplanten Verhaltens (eine Variante des Rational-Choice-Modells) auf die Daten des großen Volkszählungsprojektes von 1987 an und diskutierte die komplexen Modellierungsprobleme. Die Arbeit mit dem Titel „Zwischen Boykott und Kooperation" erschien 1993 und gilt immer noch als Standardwerk. In den 80er und 90er Jahren waren in der deutschen Politikwissenschaft komplexe multivariate Verfahren wie Struktur(ver)gleichungsmodelle, Logit und Probit sowie logistische Regression eher atypisch und galten als Fremdkörper. Das hielt Steffen jedoch nicht davon ab, mit großer Energie diese Verfahren besonders im Bereich der politischen Soziologie kontinuierlich anzuwenden. Dabei befasste er sich mit der Messung und Wirkung der Links-Rechts-Orientierung auf die Erklärung von Partizipation, Wahlverhalten und der Einstellung zu Minderheiten. Sein großes Interesse an kategorialen Verfahren führte dann zur Publikation eines wichtigen Lehrbuchs mit dem Titel „Analyse von Tabellen und kategorialen Daten" (1997), das er gemeinsam mit Hans-Jürgen Andreß und Jacques Hagenaars verfasste. Eine Neuauflage ist in Arbeit.

Nach seiner Tätigkeit an der Universität Bremen wechselte Steffen Kühnel zum Zentralarchiv für empirische Sozialforschung (damals ZA, heute eine GESIS-Abteilung) in Köln. Seine Tätigkeit dort bot ihm die Möglichkeit, diese Art von Forschung in Verbindung mit Lehrtätigkeiten im Rahmen des Frühjahrsseminars kontinuierlich weiterzuführen. In dieser Zeit nahm er eine Dozententätigkeit bei der Essex Summer School for Social Science Data Analysis and Collection (bis 2000) auf und führte zusätzlich nationale und internationale Kurse
zu Strukturgleichungsmodellen durch. Bei all diesen Aktivitäten – ob Lehre an der Uni oder bei Workshops oder bei der Vorbereitung von Publikationen – leitete ihn sein hoher Anspruch an wissenschaftliche, vor allem mathematische Exaktheit, und wir hatten zuweilen unterschiedliche Auffassungen, wobei wir meist zugunsten der Verständlichkeit und er zugunsten der Grundlagen und Komplexität der Argumente plädierten.

1996 wurde Steffen Kühnel als Vertretung auf die Professur für empirische Sozialforschung im Bereich der Politikwissenschaft an die Justus-Liebig-Universität Gießen berufen. Bis zum Ablauf dieser Vertretungsprofessur hat Steffen sich intensiv für das Institut für Politikwissenschaft und den Fachbereich 03 engagiert, indem er den Haushaltsplan des gesamten Fachbereichs strukturierte und organisierte. Danach vertrat Steffen mehrere Professuren, unter anderem an der Universität Mannheim. 2000 erhielt er einen Ruf an die Universität Göttingen als Professor für quantitative Methoden der Sozialwissenschaften. Dort baute er in sehr kurzer Zeit und mit hohem persönlichem Einsatz das Methodenzentrum der Georg-August-Universität Göttingen auf und leitet dieses bis heute als Direktor. Mit der Berufung von Steffen Kühnel wurde die Sozialforschung am Methodenzentrum dahin gehend neu ausgerichtet, dass quantitative und qualitative Methoden durch Professuren vertreten sind und die Ausbildung in beiden Gebieten der Sozialforschung gleichermaßen garantiert wird. Dies ist ein wichtiger Schritt zur Überwindung des unfruchtbaren Streits zwischen qualitativ und quantitativ Forschenden, was zu einem großen Teil das Verdienst von Steffen Kühnel ist. Eine weitere wichtige Innovation war seine Beteiligung am interdisziplinären DFG-Graduiertenkolleg Statistik der Universität Göttingen, welches einer Reihe von Doktoranden eine hervorragende Entwicklungsmöglichkeit gab und gibt.

Wie schon in der Zeit seiner Vertretungsprofessur in Gießen gehörte es für Steffen zu den Pflichten eines Professors, sich intensiv für die akademische Selbstverwaltung einzusetzen. Dies hat er mit dem ihm eigenen Engagement realisiert, indem er über einen längeren Zeitraum das Amt eines Dekans und auch das eines Studiendekans übernahm. Darüber hinaus arbeitete er zehn Jahre als Fachgutachter im Fachkollegium der DFG. Alle, die eine dieser Tätigkeiten ausgeübt haben, wissen, wie viel Aufwand und Nerven dies kostet und wie sehr dadurch die für eigene Forschung verfügbare Zeit immer geringer wird. In der Sprache des Rational-Choice-Ansatzes hat Steffen Kühnel wichtige Beiträge für die Erstellung der Kollektivgüter Lehrqualität, Fachbereichsperformance und Forschungsqualität geleistet. Der Nerd, als der uns Steffen einst vorkam, ist längst ein vollkommen un-nerdischer Freund geworden, dessen systematischen
Arbeitsstil, verbunden mit seinen unkonventionellen Denkansätzen und Umgangsformen, wir sehr schätzen. Wir hoffen und wünschen, dass Steffen mit der Beendigung seiner Tätigkeiten als Dekan, Studiendekan und Mitglied des Fachkollegiums die nächsten Jahrzehnte an der Ostsee viele provokante und diskussionsauslösende Artikel schreibt und damit die Rückverwandlung in einen wissenschaftlichen Nerd vollzieht.

Im November 2018
Dagmar Krebs
Peter Schmidt
Inhaltsverzeichnis
Statistische Grundlagen

Notes on Comparative and Causal Analyses Using Loglinear, Logit, Logistic, and Other Effect Coefficients . . . 3
Jacques Hagenaars and Hans-Jürgen Andreß

Panel Conditioning or SOCRATIC EFFECT REVISITED: 99 Citations, but is there Theoretical Progress? . . . 25
Peter Schmidt, Maria-Therese Friehs, Daniel Gloris and Hannah Grote

Mixture Models in Longitudinal Research Designs . . . 67
Jost Reinecke

Der Mythos von der „gefährlichen" Multikollinearität bei der Schätzung von Interaktionseffekten . . . 85
Jochen Mayerl und Dieter Urban

Mit Thomas Müller in die statistische Bildung: Grundvorstellungen und Begriffsbildung am Beispiel der Lagemaße . . . 97
Florian Berens

Methoden

Soziologisches Erklären . . . 117
Nina Baur

Ein zweistufiges Modell zur Erklärung sozialen Handelns – Methodologische Grundlagen, statistische Modellierung und Anwendung auf kriminelles Handeln . . . 137
Stefanie Eifler und Heinz Leitgöb

Die Renaissance der „Unobtrusive Methods" im digitalen Zeitalter . . . 161
Andreas Diekmann

Mixed Methods und die Qualität standardisierter Daten . . . 173
Bettina Langfeldt, Udo Kelle und Brigitte Metje

‚Quanti' und ‚Quali' – zwei unversöhnliche Lager oder sich ergänzende Perspektiven? Zur Relevanz des selten und des häufig auftretenden Falls für die Forschung . . . 197
Gabriele Rosenthal und Nicole Witte

Fälschungen von Umfragedaten . . . 211
Jörg Blasius

Antwortskalenrichtung und Umfragemodus . . . 231
Dagmar Krebs und Jan Karem Höhne

Interest in Science: Response Order Effects in an Adaptive Survey Design . . . 247
Uwe Engel

Endgerätespezifische und darstellungsabhängige Bearbeitungszeit- und Antwortverhaltensunterschiede in Webbefragungen . . . 263
Stephan Schlosser und Henning Silber

Verwendung von Zensus-Paradaten unter besonderer Berücksichtigung von Namensverteilungen . . . 283
Rainer Schnell

Harmonisierung von sozio-demografischen Hintergrundvariablen, dargestellt am Beispiel des „privaten Haushalts" . . . 293
Jürgen H.P. Hoffmeyer-Zlotnik und Uwe Warner

Funktionale Äquivalenz von Messinstrumenten in heterogenen Gesellschaften – Eine Prüfung der Stabilität der Big-Five-Messung im Sozio-oekonomischen Panel . . . 307
Jürgen Leibold, Julia Lischewski, Stefan Kanis und Antje Rosebrock

Antisemitismus und Autoritarismus – Eine traditionell stabile Beziehung? Eine empirische Studie unter Berücksichtigung von Messinvarianz anhand der ALLBUS-Daten 1996/2006/2012/2016 . . . 327
Aribert Heyder und Marcus Eisentraut

Der Protest diktiert die Mittel. Über Methoden zur Erforschung neuer Protestformationen in liberalen Demokratien . . . 345
Stine Marg

Demokratie und Gesellschaftswandel. Zur Bedeutung von Partizipationsforderungen und ihrer Analyse . . . 363
Felix Butzlaff

Anwendungen

Eindeutige Ergebnisse? Methodische Überlegungen und Untersuchungen zur Rolle der Komplexität der Zustandsdefinition in der Sequenzdatenanalyse . . . 379
Okka Zimmermann

Zwischen Heuristik und ideologischer Konzeptualisierungsfähigkeit: die Links-Rechts-Dimension und politisches Faktenwissen . . . 399
Bettina Westle

Are Individuals Utility Maximizers? Empirical Evidence and Possible Alternative Decision Algorithms . . . 421
Karl-Dieter Opp

Kommunikation von Religion und der Einfluss der Opportunitätsstruktur. Eine empirische Analyse der Fidesnachrichten des Vatikans . . . 441
Dieter Ohr

Enttäuschte Aufstiegshoffnungen? Eine Untersuchung zur Einkommensentwicklung in Deutschland 1995–2015 . . . 459
Karin Kurz, Jörg Hartmann und Wolfgang Knöbl

Einstellungsfunktionen zum freiwilligen Engagement bei den „jungen Alten" am Beispiel von drei Modellregionen in Niedersachsen . . . 481
Johannes Laukamp, Elisabeth Leicht-Eckardt und Cornelius Frömmel

Wird man im Alter konservativer? Bundestagswahlen einer Kohorte ehemaliger Gymnasiasten bis zum 56. Lebensjahr . . . 509
Klaus Birkelbach und Heiner Meulemann

Der Einfluss der physischen Attraktivität der Wahlkreiskandidaten bei den Bundestagswahlen 2005, 2009 und 2013 auf das Zweitstimmen-Wahlkreisergebnis ihrer Partei . . . 531
Ulrich Rosar und Markus Klein

Soziale und emotionale Dispositionen der AfD-Anhängerschaft . . . 547
Anja Mays, Verena Hambauer und Valentin Gold

Einstellungen und Verhalten gegenüber geflüchteten Menschen: Ist die räumliche Distanz von Bedeutung? . . . 561
Felix Wolter, Jürgen Schiener und Peter Preisendörfer

Ein großer Unterschied mit kleinen Folgen? Einwanderungsskeptische Einstellungen von Frauen und Männern im Zeitverlauf . . . 579
Claudia Diehl, Michael Blohm und Daniel Degen

Trendumkehr oder unvermeidlicher Niedergang? Sozialdemokratische Reformprozesse und Krisenerscheinungen in Deutschland, Österreich und Schweden . . . 605
Jens Gmeiner und Matthias Micus
Herausgeber- und Autorenverzeichnis
Über die Herausgeber

Dr. Anja Mays Sektion für Soziologie, Ruhr-Universität Bochum, Bochum, Deutschland

Dr. André Dingelstedt Institut für Qualitätssicherung und Transparenz im Gesundheitswesen (IQTIG), Berlin, Deutschland

Verena Hambauer Methodenzentrum Sozialwissenschaften, Universität Göttingen, Göttingen, Deutschland

Stephan Schlosser Methodenzentrum Sozialwissenschaften, Universität Göttingen, Göttingen, Deutschland

Florian Berens Methodenzentrum Sozialwissenschaften, Universität Göttingen, Göttingen, Deutschland

Dr. Jürgen Leibold Methodenzentrum Sozialwissenschaften, Universität Göttingen, Göttingen, Deutschland
Dr. Jan Karem Höhne Universität Mannheim, Mannheim, Deutschland; RECSM-Universitat Pompeu Fabra, Barcelona, Spanien
Autorenverzeichnis Hans-Jürgen Andreß was Professor Emeritus, Methods of Empirical Social and Economic Research, University of Cologne, Germany. Research interests: social research, statistics, and multivariate methods, social inequality, labor market research, social and family policy. Nina Baur ist Professorin für Methoden der empirischen Sozialforschung an der Technischen Universität Berlin, Ko-Leiterin des „Global Center of Spatial Methods for Urban Sustainability“ (GCSMUS) und Past President des „Research Committee on Logic and Methodology in Sociology“ (RC33) der International Sociology Association (ISA). Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung (insbesondere prozessorientierte Methodologie, Methoden der Raumforschung und Mixed Methods), Soziologie der Prozesse, Innovationen und Risiken, Raumsoziologie, Marktsoziologie. Kontaktadresse: [email protected] Florian Berens ist wissenschaftlicher Mitarbeiter am Methodenzentrum der Universität Göttingen am Lehrstuhl für Quantitative Sozialforschung. Arbeitsgebiet und Forschungsschwerpunkte: Einstellungen und Überzeugungen von Studierenden in und zu ihrer methodischen und statistischen Ausbildung, Probleme mathematischer und hochschulischer Didaktik, Messung und Effekte von Multitasking in Onlineumfragen. Kontaktadresse: [email protected] Klaus Birkelbach ist Prof. (apl.) für Soziologie an der Universität Duisburg-Essen. Arbeitsgebiete und Forschungsschwerpunkte: neben der allgemeinen Soziologie u. a. empirische Sozialforschung, Bildungssoziologie, Soziologie des Lebenslaufs. Kontaktadresse: [email protected] Jörg Blasius ist seit 2001 Professor im Institut für Politische Wissenschaft und Soziologie, Abt. Soziologie, der Universität Bonn. Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung, der angewandten Statistik (hier insbesondere in den Bereichen Korrespondenzanalyse und Fälschungen von Umfragedaten), der Stadtsoziologie, sowie der sozialen Ungleichheit und der Lebensstile. Kontaktadresse: [email protected] Michael Blohm ist seit 2000 Mitarbeiter bei GESIS – Leibniz-Institut für Sozialwissenschaft. Studium der Soziologie in Mannheim. Von 1999 bis 2000 wissenschaftlicher Mitarbeiter am Mannheimer Zentrum für Europäische
Sozialforschung (MZES). Arbeitsgebiet und Forschungsschwerpunkte: Methoden der Umfrageforschung. Kontaktadresse: [email protected] Felix Butzlaff ist Universitätsassistent am Institut für Gesellschaftswandel und Nachhaltigkeit (IGN) an der Wirtschaftsuniversität Wien. Studium der Politikwissenschaften, VWL und Öffentliches Recht an den Universitäten Göttingen und Santiago de Chile. Arbeitsgebiet und Forschungsschwerpunkte: Demokratieforschung, aktuelle und historische Entwicklung von Gesellschaften, Parteien und Parteiensystemen, Bürgerproteste. Kontaktadresse: [email protected] Daniel Degen ist seit 2015 wissenschaftlicher Mitarbeiter an der Universität in Konstanz. Studium der Soziologie und der Philosophie in Konstanz. Arbeitsgebiet und Forschungsschwerpunkte: Wohlfahrtsstaatseinstellungen (insbesondere von Migranten), Migration und Integration, fremdenfeindliche Einstellungen, Methoden der empirischen Sozialforschung. Kontaktadresse: [email protected] Claudia Diehl ist seit 2013 Professorin für Soziologie an der Universität Konstanz. Promotion in Mannheim. Forschungsschwerpunkte: Migration, Integration von Einwanderern, ethnische Grenzziehungen und Diskriminierung. Kontaktadresse: [email protected] Andreas Diekmann ist Seniorprofessor für Soziologie an der Universität Leipzig und Professor em. der ETH Zürich. Zuvor war er Direktor des Instituts für Soziologie der Universität Bern (1990–2003) und von 2003 bis 2016 Professor an der ETH Zürich. 2017/18 war er Fellow am Wissenschaftskolleg in Berlin. Er forscht über soziale Dilemmas, Kooperation und soziale Normen und hat mit Unterstützung des Schweizerischen Nationalfonds (SNF) Maßnahmen zur Reduktion fossilen Energieverbrauchs und Umweltbelastungen in urbanen Ballungsgebieten untersucht. Arbeitsgebiet und Forschungsschwerpunkte: experimentelle Spieltheorie, Theorien sozialer Kooperation, sozialwissenschaftliche Umweltforschung, Methodik empirischer Sozialforschung. Kontaktadresse: [email protected] André Dingelstedt ist Projektleiter am Fachbereich Befragung des Instituts für Qualitätssicherung und Transparenz im Gesundheitswesen (IQTIG). Er war von 2008 bis 2016 Lehrkraft für besondere Aufgaben sowie wissenschaftlicher Mitarbeiter am Lehrstuhl für Quantitative Methoden des Methodenzentrums Sozialwissenschaften an der Universität Göttingen. Studium der Sozialwissenschaften mit anschließender Promotion an der Universität Göttingen. Arbeitsgebiet und
Forschungsschwerpunkte: Antwortqualität, Incentives, Kausalität, Methoden der empirischen Sozialforschung. Stefanie Eifler ist seit 2013 Professorin für Soziologie und empirische Sozialforschung an der Katholischen Universität Eichstätt-Ingolstadt. Studium der Soziologie, Psychologie und Erziehungswissenschaft an den Universitäten Bonn, Köln und Bielefeld, Promotion und Habilitation an der Fakultät für Soziologie der Universität Bielefeld; 2009–2013 Professorin für Quantitative Methoden der Sozialwissenschaften an der Universität Halle-Wittenberg; Forschungsschwerpunkte: Handlungstheorie, Kriminalsoziologie, Experimentelle Methoden, Messen in den Sozialwissenschaften. Kontaktadresse: [email protected] Marcus Eisentraut ist wissenschaftlicher Mitarbeiter bei „GESIS – Leibniz-Institut für Sozialwissenschaften", Abteilung: Datenarchiv für Sozialwissenschaften. Studium der Politikwissenschaft in Marburg. Arbeitsgebiet und Forschungsschwerpunkte: Vorurteilsforschung, Autoritarismusforschung. Kontaktadresse: [email protected] Uwe Engel ist seit April 2000 Inhaber der Professur für Soziologie mit dem Schwerpunkt Statistik und empirische Sozialforschung am Fachbereich Sozialwissenschaften der Universität Bremen und Leiter des dortigen Methodenzentrums. Nach dem Studium der Erziehungswissenschaften, Soziologie und Psychologie erfolgte die Promotion zum Dr. phil. an der Universität Hannover und die Habilitation an der Universität Duisburg (Venia legendi für Soziologie). Von 1981 bis 1986 wissenschaftlicher Mitarbeiter an der Universität Hannover. Danach von 1986 bis 1988 wissenschaftlicher Mitarbeiter am Sonderforschungsbereich 227 „Prävention und Intervention im Kindes- und Jugendalter" der Universität Bielefeld. Wissenschaftlicher Assistent, zunächst in Bielefeld, dann an der Universität Duisburg von 1988 bis 1994. Anschließend Wechsel an die Technische Universität Chemnitz-Zwickau. Vertretung der Professur für Sozialstrukturanalyse an der Universität Potsdam 1994 bis 1995, danach Übernahme dieser Professur bis zum Wechsel an die Universität Bremen. Arbeitsgebiet und Forschungsschwerpunkte: Survey-Methodologie, angewandte Sozialforschung, Datenanalyse, Computational Social Science. Kontaktadresse: [email protected] Maria-Therese Friehs is a research associate at the University of Koblenz-Landau, Germany. She currently pursues her PhD on stereotyping, and holds B.Sc. and M.Sc. (with focus on Clinical and Social Psychology) from the Philipps-Universität Marburg. Her main areas of interest are the study of
intergroup relations, with a focus on intergroup contact and stereotypes, and advanced data analysis. Kontaktadresse: [email protected] Cornelius Frömmel war von 2012 bis 2016 Gründungsprofessor für Orthobionik an der Universität Göttingen. Studium der Medizin in Berlin mit anschließender Promotion. Danach Stipendium mit einjährigen Aufenthalt am Max-Planck-Institut für medizinische Forschung in Heidelberg. Von 1988 bis 2005 verantwortlich für die institutionelle Entwicklung der Forschung an der Charité. 1994 Berufung als Professor für Biochemie (Schwerpunkt Proteinstrukturtheorie und Bioinformatik) an die Humboldt-Universität Berlin. Ab 2005 Dekan und Vorstandssprecher der Universitätsmedizin Göttingen. Gründungsvorsitzender der Gesundheitsregion Göttingen. Kontaktadresse: [email protected] Daniel Georg Gloris is lecturer at the TU Dortmund, Germany. He currently pursues his Doctoral Degree on political cohesion and holds a M.A. in Political Science and a M.A. in Philosophy from the Philipps-Universität Marburg. His main areas of research are empirical Social Research and Political Theory/ Philosophy. Kontaktadresse: [email protected] Jens Gmeiner ist seit 2017 wissenschaftlicher Mitarbeiter am Göttinger Institut für Demokratieforschung. Studium der Politikwissenschaften und der Skandinavischen Philologie in Göttingen und Schweden. Von 2012 bis 2015 wissenschaftliche Hilfskraft am Göttinger Institut für Demokratieforschung. Von 2013 bis 2017 Promotionsstipendiat der Friedrich-Ebert-Stiftung. Arbeitsgebiet und Forschungsschwerpunkte: Parteienforschung, politische Systeme und Kulturen Skandinaviens. Kontaktadresse: [email protected] Valentin Gold ist Akademischer Rat (a. Zt.) am Methodenzentrum Sozialwissenschaften der Universität Göttingen. Studium der Politik- und Verwaltungswissenschaften an der Universität Konstanz. Arbeitsgebiet und Forschungsschwerpunkte: Quantitative Textanalyse, Deliberative Kommunikation, kollektive Entscheidungsfindung, Methoden der empirischen Sozialforschung. Kontaktadresse: [email protected] Hannah Grote graduated 2018 in Psychology at the Philipps-University of Marburg (B.Sc.). Currently she is studying Psychology at the Philipps-University of Marburg (M.Sc.) In 2017 student assistant in the team of
Peter Schmidt (Prof. i. R.). From 2017 to 2020 student assistant at the Kinder- und Jugendlichen-Psychotherapie-Ambulanz. Kontaktadresse: [email protected] Jacques Hagenaars is emeritus professor of Methodology of the Social Sciences at Tilburg University, Faculty of Social and Behavioral Sciences. Research interests: categorical data analysis, longitudinal research, loglinear modeling, latent class analysis. Kontaktadresse: [email protected] Verena Hambauer ist wissenschaftliche Mitarbeiterin an der Sektion für Soziologie an der Ruhr-Universität Bochum. Zuvor war sie u.a. wissenschaftliche Mitarbeiterin am Methodenzentrum der Universität Göttingen am Lehrstuhl für Quantitative Sozialforschung. Ihre Arbeits- und Forschungsgebiete sind: Politische Soziologie, Psychologie und Sozialisationsforschung, Einstellungs- und Werteforschung, Partizipationsforschung, Sozialstrukturanalyse, Geschlechter- und Migrationsforschung sowie Methoden der empirischen Sozialforschung. Kontaktadresse: [email protected] Jörg Hartmann ist seit 2009 wissenschaftlicher Mitarbeiter an der Universität Göttingen. Nach dem Studium der Soziologie, Mathematik und Betriebswirtschaftslehre in Leipzig (2003–2009) folgte 2016 die Promotion in Göttingen. Arbeitsgebiet und Forschungsschwerpunkte: Migration und Integration, Arbeitsmarkt, Lebensläufe und soziale Ungleichheiten. Kontaktadresse: [email protected] Aribert Heyder ist Akademischer Oberrat am Institut für Politikwissenschaft der Universität Marburg. Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung, Antisemitismus, Rechtsextremismus, Einstellungs-/Vorurteilsforschung. Kontaktadresse: [email protected] Jürgen H.P. Hoffmeyer-Zlotnik war Leiter der Stabsabteilung Wissensvermittlung von GESIS – Leibniz-Institut für Sozialwissenschaften in Mannheim und ist apl. Professor an der Justus-Liebig-Universität Gießen. Arbeitsgebiet und Forschungsschwerpunkte: Standardisierung und Harmonisierung sozio-demografischer Variablen für den Vergleich nationaler und internationaler sozialwissenschaftlicher Umfragen. Kontaktadresse: [email protected] Jan Karem Höhne ist Postdoc am Sonderforschungsbereich 884 „Politische Ökonomie von Reformen" an der Universität Mannheim und permanenter
Gastwissenschaftler am „Research and Expertise Centre for Survey Methodology (RECSM)“ an der Universitat Pompeu Fabra (Barcelona). Arbeitsgebiet und Forschungsschwerpunkte: Umfragemethodologie, Onlineumfragendesign, passive Datenerhebungstechniken und Eyetracking. Kontaktadresse: [email protected] Stefan Kanis ist seit 2016 wissenschaftlicher Mitarbeiter am Institut für interdisziplinäre Konflikt- und Gewaltforschung (IKG) der Universität Bielefeld. Studium der Soziologie an der Georg-August-Universität Göttingen. Arbeitsgebiete und Forschungsschwerpunkte: Jugendgewalt, islamistische Einstellungen und Radikalisierung Jugendlicher, Cross-National Homicide Studies – Makro-Level Determinanten von Gewalt, kriminologische Dunkelfeldforschung. Kontaktadresse: [email protected] Udo Kelle ist seit 2010 Professor für Methoden der empirischen Sozialforschung und Statistik an der Fakultät für Geistes- und Sozialwissenschaften der Helmut-Schmidt-Universität/Universität der Bundeswehr Hamburg. Nach dem Studium der Soziologie und der Psychologie in Hannover, Bielefeld und Bremen (Dipl.-Psych.) erfolgte die Promotion an der Universität Bremen (Dr. phil.) und die Habilitation an der Universität Bremen. Von 1989 bis 1997 wissenschaftlicher Mitarbeiter am Sonderforschungsbereich 286 der DFG („Statuspassagen und Risikolagen im Lebensverlauf“) an der Universität Bremen, dazwischen 1996 Visiting Research Fellow am Department of Sociology der University of Surrey, Großbritannien. Von 1997 bis 2005 wissenschaftlicher Mitarbeiter und Dozent am Institut für Interdisziplinäre Gerontologie der Universität in Vechta, dazwischen 2001 Vertretungsprofessor für qualitative Forschungsmethoden an der Fakultät für Soziologie der Universität Bielefeld. Von 2005 bis 2010 Professor für Methoden empirischer Sozialforschung am Institut für Soziologie der Philipps-Universität Marburg. Arbeitsgebiet und Forschungsschwerpunkte: Empirische Sozialforschung, ihre wissenschafts- und sozialtheoretischen Grundlagen und Methodenprobleme. Kontaktadresse: [email protected] Markus Klein ist seit 2008 Inhaber des Lehrstuhls für Politische Soziologie und Politische Sozialstrukturanalyse der Gottfried Wilhelm Leibniz–Universität Hannover. Nach dem Studium der Politikwissenschaft, Publizistik und Pädagogik (M.A.) sowie der Volkswirtschaftslehre (Dipl.) in Mainz erfolgte die Promotion und Habilitation in Köln. Von 1993 bis 1997 wissenschaftlicher Mitarbeiter am
Institut für Politikwissenschaft der Johannes Gutenberg–Universität Mainz. Von 1997 bis 2008 wissenschaftlicher Mitarbeiter und wissenschaftlicher Assistent am Zentralarchiv für Empirische Sozialforschung der Universität zu Köln. Arbeitsgebiet und Forschungsschwerpunkte: Wahl-, Werte- und Attraktivitätsforschung. Kontaktadresse: [email protected] Wolfgang Knöbl ist seit 2015 Direktor des Hamburger Instituts für Sozialforschung und seit 2017 nebenberuflicher Professor für politische Soziologie und Gewaltforschung an der Leuphana Universität Lüneburg. Nach dem Studium der Soziologie, der Politikwissenschaft und der Neueren Geschichte in Erlangen-Nürnberg und Aberdeen erfolgte die Promotion und Habilitation an der Freien Universität Berlin. Von 1990 bis 1995 wissenschaftlicher Mitarbeiter und von 1997 bis 2002 wissenschaftlicher Assistent am John F-Kennedy-Institut für Nordamerikastudien der Freien Universität Berlin. Danach von 2002 bis 2015 Professor für international vergleichende Sozialwissenschaft an der Universität Göttingen. Kontaktadresse: [email protected] Dagmar Krebs ist Professorin (im Ruhestand) für Methoden der empirischen Sozialforschung und Statistik an der Justus-Liebig-Universität Gießen. Arbeitsgebiet und Forschungsschwerpunkte: Umfragemethodologie sowie Antwortformat und Antwortverhalten. Kontaktadresse: [email protected] Karin Kurz ist seit 2008 Professorin an der Universität Göttingen. Nach dem Studium der Soziologie in Mannheim und Madison, Wisconsin (USA) erfolgte die Promotion in Mannheim und die Habilitation in Bamberg. Von 1986 bis 1989 und 1991 bis 2006 wissenschaftliche Mitarbeiterin an den Universitäten Mannheim, Bremen, Bielefeld und Bamberg. Professorin für Soziologie von 2006– 2008 an der Universität Leipzig. Arbeitsgebiet und Forschungsschwerpunkte: Lebensläufe, Bildung, Arbeitsmarkt, Familie und soziale Ungleichheiten. Kontaktadresse: [email protected] Bettina Langfeldt ist seit 2018 Vertretungsprofessorin für Methoden der empirischen Sozialforschung am Fachbereich Gesellschaftswissenschaften der Universität Kassel. Nach dem Studium der Politikwissenschaft, der Soziologie, der Psychologie und der Rechtswissenschaft an der Philipps-Universität Marburg (Dipl.-Pol.) erfolgte die Promotion an der Justus-Liebig-Universität Gießen (Dr. rer. soc). Von 1997 bis 2000 wissenschaftliche Mitarbeiterin bei GESIS in Mannheim. Von 2000 bis 2007 wissenschaftliche Mitarbeiterin am Institut für Soziologie der Justus-Liebig-Universität Gießen. Seit 2007 zunächst Lehrkraft
für besondere Aufgaben, danach wissenschaftliche Mitarbeiterin (Postdoc) an der Fakultät für Geistes- und Sozialwissenschaften der Helmut-Schmidt-Universität/ Universität der Bundeswehr Hamburg. Arbeitsgebiet und Forschungsschwerpunkte: Methoden empirischer Sozialforschung, empirische Bildungs- und Hochschulforschung, geschlechtersensible Arbeits- und Organisationssoziologie. Kontaktadresse: [email protected] Johannes Laukamp ist seit 2013 tätig im Wissens- und Technologie-Transfer der Osnabrücker Hochschulen, Tätigkeitsschwerpunkt Fördermanagement. Studium der Physiotherapie (B.Sc.) und Management im Gesundheitswesen (M.A.) an der Hochschule Osnabrück. Anschließend von 2009 bis 2013 wissenschaftlicher Mitarbeiter an der Hochschule Osnabrück. Arbeitsgebiet und Forschungsschwerpunkte: Einsamkeit und Ehrenamt im Alter, evidenzbasierte Praxis, Gesundheitsförderung und Prävention. Kontaktadresse: [email protected] Jürgen Leibold ist wissenschaftlicher Mitarbeiter am Methodenzentrum der Universität Göttingen am Lehrstuhl für Quantitative Sozialforschung. Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung, Migrations- und Vorurteilsforschung. Kontaktadresse: [email protected] Elisabeth Leicht-Eckardt hat seit 2013 eine haushaltswissenschaftliche Eckprofessur für die Lehramtsausbildung Ökotrophologie für Berufsbildende Schulen mit dem Schwerpunkt Dienstleistungsmanagement inne. Studium Ökotrophologie (Haushalts- und Ernährungswissenschaften) an der TU München – Weihenstephan. Promotion über haushaltsspezifische und physiologische Aspekte des Wohnens. Von 1991 bis 1996 Professorin für Haushaltswissenschaften, Schwerpunkt Großhaushalt, an der FH Fulda. Von 1996 bis 2013 Professorin für Haushalts- und Wohnökologie an der Hochschule Osnabrück, Fakultät Agrarwissenschaften und Landschaftsarchitektur, im Studienprogramm Ökotrophologie, vor allem für Bauen und Wohnen und Arbeitswissenschaften. Initiatorin und zunächst wissenschaftliche Leiterin des WABE-Zentrums, des ökotrophologischen Lehr- und Versuchsbetriebs der Hochschule Osnabrück. Seit 2000 Sprecherin des Arbeitskreises „Wohnen und Leben im Alter“ der Lokalen Agenda 21 der Stadt Osnabrück. Kontaktadresse: [email protected] Heinz Leitgöb ist seit 2017 akademischer Rat a. Z. am Lehrstuhl für Soziologie und empirische Sozialforschung der Universität Eichstätt-Ingolstadt. Studium der
Soziologie und Promotion an der Johannes Kepler Universität Linz, Österreich; Forschungsschwerpunkte: Surveymethodologie, nichtlineare Modelle, Soziologie des abweichenden Verhaltens, Handlungstheorie Kontaktadresse: [email protected] Julia Lischewski ist wissenschaftliche Mitarbeiterin am Methodenzentrum der Universität Göttingen am Lehrstuhl für Quantitative Sozialforschung. Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung, Bildungsforschung (insbesondere Weiterbildung), Vorurteilsforschung. Kontaktadresse: [email protected] Stine Marg ist geschäftsführende Leiterin des Göttinger Instituts für Demokratieforschung. Sie absolvierte ein Studium der Politikwissenschaft sowie Mittleren und Neueren Geschichte. Arbeitsgebiet und Forschungsschwerpunkte: politische Kulturforschung und Analyse politischer Deutungsmuster und Demokratievorstellungen, der Protestforschung sowie den Methoden der qualitativen Sozialforschung. Kontaktadresse: [email protected] Jochen Mayerl ist Univ.-Professor für Soziologie mit Schwerpunkt Empirische Sozialforschung am Institut für Soziologie der Technischen Universität Chemnitz. Arbeitsgebiet und Forschungsschwerpunkte: Moderatoren und Mediatoren der Einstellungs-Verhaltens-Relation, Umfrageforschung und Strukturgleichungsmodellierung. Kontaktadresse: [email protected] Anja Mays ist wissenschaftliche Mitarbeiterin an der Sektion für Soziologie an der Ruhr-Universität Bochum. Zuvor war sie u.a. wissenschaftliche Mitarbeiterin am Methodenzentrum der Universität Göttingen am Lehrstuhl für Quantitative Sozialforschung. Ihre Arbeits- und Forschungsgebiete sind: Politische Soziologie, Psychologie und Sozialisationsforschung, Einstellungs- und Werteforschung, Partizipationsforschung, Sozialstrukturanalyse Geschlechter- und Migrationsforschung sowie Methoden der empirischen Sozialforschung. Kontaktadresse: [email protected] Brigitte Metje ist seit 2010 wissenschaftliche Mitarbeiterin (Postdoc) an der Helmut-Schmidt-Universität Hamburg. Nach dem Studium der Gerontologie an der Hochschule Vechta (Dipl.-Gerontologin) erfolgte die Promotion an der Philipps-Universität Marburg (Dr. phil.). 2005 wissenschaftliche Mitarbeiterin am Institut für Erziehungswissenschaften an der Hochschule Vechta.
Von 2005 bis 2010 wissenschaftliche Mitarbeiterin am Institut für Soziologie der Philipps-Universität Marburg. Arbeitsgebiet und Forschungsschwerpunkte: Validitätsprobleme von Lehrveranstaltungsevaluationen, Mixed-Methods in der Evaluationsforschung. Kontaktadresse: [email protected] Heiner Meulemann war bis 2013 Professor für Soziologie an der Universität zu Köln. Von 2000 bis 2010 war er im Projektleiterteam des deutschen Teils des European Social Survey. Arbeitsgebiete und Forschungsschwerpunkte: Allgemeine Soziologie, Bildungssoziologie, Sozialer Wandel, Lebenslaufforschung, Internationaler Vergleich. Kontaktadresse: [email protected] Matthias Micus ist seit 2016 wissenschaftlicher Leiter der Forschungsstelle Fodex. Studium der Politikwissenschaft, Soziologie und Mittleren und Neueren Geschichte in Göttingen. Anschließend Promotion in Göttingen. Von 2010 bis 2016 Akademischer Rat am Institut für Demokratieforschung. Arbeitsgebiete und Forschungsschwerpunkte: Parteien, politische Führung, Radikalismus. Kontaktadresse: [email protected] Dieter Ohr ist seit 2006 Universitätsprofessor, „Methoden der empirischen Sozialforschung“, an der Freien Universität Berlin, Fachbereich Politik- und Sozialwissenschaften. Nach dem Studium der Politikwissenschaft und Volkswirtschaftslehre in Bamberg und Gießen erfolgte die Promotion in Köln. Von 1989 bis 1993 wissenschaftlicher Mitarbeiter, Justus-Liebig-Universität Gießen; von 1991 bis 1993 Stipendiat der Studienstiftung des Deutschen Volkes (Promotionsstipendium) und von 1993 bis 2006 Wissenschaftlicher Mitarbeiter/Universitätsassistent an der Universität zu Köln. Arbeitsgebiet und Forschungsschwerpunkte: Empirische Wahlforschung, Religionssoziologie. Kontaktadresse: [email protected] Karl-Dieter Opp ist Professor Emeritus an der Universität Leipzig und Affiliate Professor an der University of Washington (Seattle). Mitglied der European Academy of Sociology, der European Academy of Sciences and Arts und der Akademie für Soziologie. Arbeitsgebiet und Forschungsschwerpunkte: Soziologische Theorie, kollektives Handeln und politischer Protest, Normen und Institutionen, abweichendes Verhalten und Methodologie der Sozialwissenschaften. Kontaktadresse: [email protected]
Peter Preisendörfer ist Professor für Soziologie am Institut für Soziologie der Johannes Gutenberg-Universität Mainz. Arbeitsgebiete und Forschungsschwerpunkte: Organisationsforschung, Entrepreneurship, sozialwissenschaftliche Umweltforschung, Methoden der empirischen Sozialforschung. Kontaktadresse: [email protected] Jost Reinecke ist seit 2004 Professor für Quantitative Methoden der empirischen Sozialforschung in der Fakultät für Soziologie der Universität Bielefeld. Nach dem Studium der Soziologie, Geschichte und Sozialpädagogik an der Universität-GH-Duisburg erfolgte die Promotion in Gießen und die Habilitation in Münster. Von 1990 bis 2000 wissenschaftlicher Mitarbeiter an der Universität Münster. Zwischen 1997 und 2001 Vertretungsprofessuren an den Universitäten Dresden und Münster. Danach von 2001 bis 2004 Professor für Methoden der empirischen Sozialforschung an der Universität Trier. Arbeitsgebiet und Forschungsschwerpunkte: Rational-Choice Theorien in den Sozialwissenschaften: Theoretische und empirische Bedeutung, Methodologie und Anwendung von Klassifikations- und Strukturgleichungsmodellen im Querschnitt und Längsschnitt, Verfahren zur mehrfachen Ersetzung von fehlenden Werten in komplexen Datensätzen, Entwicklung der Jugendkriminalität im Längsschnitt. Kontaktadresse: [email protected] Ulrich Rosar ist seit 2010 Inhaber des Lehrstuhls Soziologie II am Institut für Sozialwissenschaften der Heinrich-Heine-Universität Düsseldorf. Nach dem Studium der Soziologie, der Politikwissenschaft und der Psychologie in Düsseldorf erfolgte die Promotion in Bamberg und die Habilitation in Köln. Von 1995 bis 2010 wissenschaftlicher Mitarbeiter und Geschäftsführer am Forschungsinstitut für Soziologie der Universität zu Köln. Arbeitsgebiete und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung, Ungleichheitsforschung und politische Soziologie. Kontaktadresse: [email protected] Antje Rosebrock ist wissenschaftliche Mitarbeiterin am Mannheimer Zentrum für Europäische Sozialforschung (MZES) und Promotionsstudentin des Center for Doctoral Studies in Social and Behavioral Sciences (CDSS) an der Graduate of Economic and Social Sciences (GESS) der Universität Mannheim. Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung (insbesondere Surveymethoden und interkulturelle empirische Forschung), Berufskodierung, Migrationssoziologie. Kontaktadresse: [email protected]
Gabriele Rosenthal ist Soziologin und seit 2001 Professorin für Qualitative Methoden am Methodenzentrum der Sozialwissenschaftlichen Fakultät der Universität Göttingen. Die geografischen Schwerpunkte ihrer Forschungsprojekte und Gastdozenturen lagen oder liegen u. a. in Israel, Palästina, Brasilien, Ghana, Uganda und den spanischen Exklaven. Neben ihren methodischen Schwerpunkten im Bereich der Biografie- und Generationenforschung konzentriert sich ihre Forschung auf die Themenfelder Migration, Ethnizität, sozio-politische Konflikte, kollektive Gewalt und kollektive Traumabearbeitung. Kontaktadresse: [email protected] Jürgen Schiener ist Akademischer Direktor als Lehrkraft für besondere Aufgaben am Institut für Soziologie der Johannes Gutenberg-Universität Mainz. Arbeitsgebiete und Forschungsschwerpunkte: Soziale Ungleichheiten, Bildung, Ausbildung und Arbeitsmarkt, Kompetenzmessung, Methoden der Datenerhebung und -analyse. Kontaktadresse: [email protected] Stephan Schlosser ist wissenschaftlicher Mitarbeiter und Studiengangsbeauftragter am Methodenzentrum Sozialwissenschaften der Georg-August-Universität Göttingen. Arbeitsgebiete und Forschungsschwerpunkte: Web-Survey-Design und passive Datenerhebung. Kontaktadresse: [email protected] Peter Schmidt is emeritus professor at the University of Giessen. He studied Sociology, Statistics and Philosophy of Science at the University of Mannheim. Ph.D. in Sociology and Philosophy of Science at the University of Mannheim (1977). Research Associate at the University of Mannheim from 1970 to 1972. Lecturer in Sociology and Social Research at the University of Hamburg from 1972 to 1979. Project Director at ZUMA (now GESIS Mannheim) for the first general social survey (ALLBUS) in Germany (1979–1981). Professor for empirical research and methodology at the University of Giessen from 1981 to 1993 and from 2000 to 2008. Program Director for societal Monitoring at GESIS Mannheim from 1994 to 2000. Codirector of the International Laboratory of Socio-Cultural Research at the State Research University Higher School of Economics (HSE) in Moscow (2011 to 2013). Honorary Humboldt Research Fellow at the Cardinal Wysczinski University Warsaw (2015–2018). Kontaktadresse: [email protected]
Rainer Schnell ist seit 2008 Inhaber des Lehrstuhls für Sozialwissenschaftliche Methoden/Empirische Sozialforschung im Fachbereich Gesellschaftswissenschaften der Universität Duisburg-Essen. Zuvor war er von 1997 bis 2008 Professor für Methoden der empirischen Politik- und Verwaltungsforschung an der Universität Konstanz und von 2015–2017 Direktor des Centres for Comparative Social Surveys an der City University London. Seit 2017 ist er zusätzlich Adjunct Professor, Faculty of Health Sciences an der Curtin University, Perth (Australien). Arbeitsgebiet und Forschungsschwerpunkte: Stichprobenkonstruktion, Nonresponse und Erhebungsprobleme bei Befragungen und Zensen sowie die Entwicklung kryptografischer Verfahren zur Zusammenführung administrativer Daten. Kontaktadresse: [email protected] Henning Silber ist seit 2015 wissenschaftlicher Mitarbeiter bei GESIS – Leibniz-Institut für Sozialwissenschaften in Mannheim. Seit 2017 ist er dort Leiter des Survey Operations Teams. Studium der Soziologie und der Deutsche Philologie an der Georg-August-Universität Göttingen. Promotion in Sozialwissenschaften an der Georg-August-Universität Göttingen. Forschungsaufenthalt von 2012–2014 an der Stanford University in Kalifornien, USA. Kontaktadresse: [email protected] Dieter Urban ist Univ.-Professor für Soziologie am Institut für Sozialwissenschaften der Universität Stuttgart. Arbeitsgebiet und Forschungsschwerpunkte: theoretische und statistische Modellierung von sozialen Strukturen und Prozessen. Kontaktadresse: [email protected] Uwe Warner war wissenschaftlicher Mitarbeiter am Centre d’Etudes de Populations, de Pauvreté et de Politiques Socio Economiques/International Network for Studies in Technology, Environment, Alternatives, Development in Luxemburg und ist Gastwissenschaftler am Methodenzentrum Sozialwissenschaften (MZS) an der Georg-August Universität Göttingen. Arbeitsgebiet und Forschungsschwerpunkte: Standardisierung und Harmonisierung sozio-demografischer Variablen für den Vergleich nationaler und internationaler sozialwissenschaftlicher Umfragen. Kontaktadresse: [email protected] Bettina Westle ist Professorin für Methoden der Politikwissenschaft und Empirische Demokratieforschung an der Philipps-Universität Marburg. Arbeitsgebiet und Forschungsschwerpunkte: politische Kognitionen und Einstellungen,
politische Kultur, nationale und europäische Identität, Wahl- und Partizipationsforschung. Kontaktadresse: [email protected] Nicole Witte ist Soziologin und wissenschaftliche Mitarbeiterin am Methodenzentrum der sozialwissenschaftlichen Fakultät der Georg-August-Universität Göttingen. Neben ihren methodischen Schwerpunkten im Bereich der Biografieforschung sowie der Methodenkombinationen konzentriert sich ihre aktuelle stadtsoziologische Forschung auf Zugehörigkeitskonstruktionen sowie In- und Exklusionsprozesse in urbanen Räumen – insbesondere im Mittleren Osten. Kontaktadresse: [email protected] Felix Wolter ist wissenschaftlicher Mitarbeiter am Institut für Soziologie der Johannes Gutenberg-Universität Mainz. Arbeitsgebiet und Forschungsschwerpunkte: Methoden der empirischen Sozialforschung, insbes. Survey-Methodologie und sensitive Fragen in Surveys, Bildungschancen, Weiterbildung, Arbeitsmarkt, Rational Choice- und Spieltheorie, Kompetenzdiagnostik, Paraglaube und Parawissenschaft. Kontaktadresse: [email protected] Okka Zimmermann ist wissenschaftliche Mitarbeiterin am Institut für Soziologie der TU Braunschweig am Lehrstuhl für Sozialstrukturanalyse und empirische Sozialforschung. Arbeitsgebiet und Forschungsschwerpunkte: Lebenslaufforschung, Familiensoziologie, Methoden der empirischen Sozialforschung, Geschlechterforschung, Soziale Ungleichheit. Kontaktadresse: [email protected]
Statistische Grundlagen
Notes on Comparative and Causal Analyses Using Loglinear, Logit, Logistic, and Other Effect Coefficients Jacques Hagenaars and Hans-Jürgen Andreß
Abstract
Der Vergleich von Koeffizienten aus logistischen Regressionsmodellen zwischen verschiedenen Stichproben oder zwischen verschachtelten Gleichungen innerhalb einer Stichprobe ist aufgrund der Skalierungsproblematik der Koeffizienten bzw. der Aggregierung der Daten über einzelne Variablen schwierig. Die Ursprünge der Schwierigkeiten werden ebenso diskutiert wie einige Lösungen, und zwar getrennt für logistische Regressionsmodelle, die als LVM (latente Variablenmodelle) interpretiert werden, und für logistische Regressionsmodelle, die als DRM (diskrete Responsemodelle) interpretiert werden. Besonderes Augenmerk wird auf die mögliche kausale Interpretation der logistischen Koeffizienten gelegt.

Keywords
Logistic regression · Latent variable model · Discrete response models · Group comparisons · Comparing nested equations · Scaling/collapsing/confounding
Hans-Jürgen Andreß ist leider verstorben. Wir trauern um einen guten Freund und geschätzten Kollegen. J. Hagenaars (*) Universität Tilburg, Evoisterwijk, Netherlands E-Mail: [email protected] H.-J. Andreß Universität zu Köln, Köln, Germany
© Springer Fachmedien Wiesbaden GmbH, ein Teil von Springer Nature 2020 A. Mays et al. (Hrsg.), Grundlagen – Methoden – Anwendungen in den Sozialwissenschaften, https://doi.org/10.1007/978-3-658-15629-9_1
1 Introduction and Background

(Too) Many years ago, in our first discussions about a possible revision of our book on categorical data analysis (Andreß et al. 1997), we decided to expand the book by elaborating on the comparative uses of loglinear/logit/logistic coefficients, a topic that deserved more attention in a methodologically oriented book on categorical data analysis. We soon discovered that part of the confusions and disagreements that occurred in our first discussions came from the fact that we did not sufficiently distinguish related but not identical aspects of the ‘comparability’ problems. Further, we wondered whether the exclusive attention to the comparability of logistic regression coefficients was justified and had the feeling that some similar issues might play a role regarding other association and effect measures. Therefore, several short notes were written. Steffen liked them, some even a lot, so he said (and of course contributed substantively and substantially). So we (from here: the above authors) decided to submit these notes as part of this ‘liber amicorum’, convincing ourselves that what was useful at a certain moment for us might also be useful for others, even when no startling new facts would be provided. The notes were written in a very informal, sketchy style, with an emphasis on methodological, less so on purely statistical aspects. We kept them here mainly the way they were, although restructuring and updating them a little bit. A much more extensive and formal version of our conclusions may be found in the forthcoming revised edition of ‘our German book’ and in a forthcoming paper (Andreß et al., forthcoming).

There are two main modes of comparing coefficients that should be distinguished. On the one hand, there is the comparison of association or effect coefficients over subgroups. On the other hand, there is the comparison of corresponding coefficients in two or more nested equations for the same population with the related decomposition of a marginal relationship into its direct and ‘indirect’ (or ‘spurious’) parts. Both these main issues may be discussed from a more descriptive or a more causal point of view.

In principle, all kinds of association measures may function in corroborating or refuting a causal statement (for a discussion of several association or effect measures, see Kühnel and Krebs 2001). We see no a priori reason for favoring one coefficient over another to investigate a causal relationship. For example, if a causal account is closely linked to additive (regression) models and cannot handle logit equations, this is not a priori regarded as a weakness of the logit model but might well be a ‘shortcoming’ of the pertinent causal account.
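As a small numerical illustration of the point that the same bivariate table supports several equally legitimate effect coefficients, the following Python sketch computes the percentage difference (‘epsilon’), the odds ratio, and the corresponding log-odds effect for a hypothetical 2 × 2 table; the cell counts are invented for illustration only and are not taken from any study discussed here.

```python
import numpy as np

# Hypothetical 2 x 2 table: rows = groups (X = 0, 1), columns = outcome (Y = 0, 1).
# Counts are invented for illustration only.
table = np.array([[60, 40],    # X = 0: P(Y = 1) = 0.40
                  [30, 70]])   # X = 1: P(Y = 1) = 0.70

p = table[:, 1] / table.sum(axis=1)   # conditional probabilities P(Y = 1 | X)

epsilon = p[1] - p[0]                 # percentage difference ("epsilon")
odds = p / (1 - p)                    # conditional odds of Y = 1
odds_ratio = odds[1] / odds[0]        # odds ratio
log_odds_effect = np.log(odds_ratio)  # logit/loglinear effect

print(f"epsilon (linear probability effect): {epsilon:.2f}")        # 0.30
print(f"odds ratio:                          {odds_ratio:.2f}")     # 3.50
print(f"log-odds (logit) effect:             {log_odds_effect:.2f}")  # 1.25
```

All three numbers summarize the same table; they only become competing choices once model restrictions are imposed, because restricting epsilons leads to the linear probability model while restricting (log) odds ratios leads to the loglinear/logit models discussed below.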
It also means that in principle association coefficients are taken in their own right. For example, odds ratios are interesting and useful as they are and not, as is sometimes suggested especially in a ‘causal’ context, as substitutes for percentage differences (‘epsilons’) as the ‘proper’ causal effect measures. If the coefficient of interest is the percentage difference, then the model restrictions should be defined through the linear probability model; if the odds ratio is the focal measure, loglinear (logit/logistic regression) models are appropriate. For some applications, the linear probability model and the loglinear (logit, logistic) model are identical, in that they imply the same restrictions on the data and yield identical expected frequencies, but often not. For example, imposing linearity restrictions on the coefficients or assuming the absence of particular interaction effects takes on different forms in these two models and leads to different sets of expected frequencies. Regarding causality, we take a rather liberal view. As is obvious from the pertinent philosophical literature, there are many ways to conceive causality and deliver a causal account, characterized by such terms as: action, manipulation, regularity, counterfactuals, necessity/sufficiency, capacities, or processes. There are definitely common elements among these various accounts, but at the same time they emphasize different aspects, as nicely and concisely demonstrated by Kim (1999). Also when looking at the research practices of the social sciences, different meaningful notions of causality can be encountered, as well as different useful ways of confirming or refuting a causal relationship (see edited volumes such as Morgan 2013; Vayda and Walters 2011; Diamond and Robinson 2010). Data from randomized experiments are used, as are survey data applying structural equations or graphical models, but also comparative case studies, process analyses, etc. In the end, it is all about telling a convincing causal story about the social world we live in, a story that is explicitly and systematically linked by as many crucial points as possible to the empirical world. This view on causality does not advocate a loose and ad hoc way of investigating causal relationships. For a story to be ‘convincing’, strict methodological guidelines must have been followed regarding theory, measurement, data collection, analysis and interpretation. Explicitly formulating alternative explanations for the empirical associations found other than ‘X causes Y’ and refuting them somehow is perhaps the most important and difficult requirement. Whenever it is possible and meaningful to formulate the causal story in a formal framework such as Rubin’s potential outcome model, closely linked to randomized experiments, or Pearl’s ‘do’ or ‘set’ operators in the context of graphical modeling, strict, formal procedures are available to warrant causal conclusions (Imbens and Rubin 2015; Pearl 2009). When this is not possible
or meaningful, less formalized and less ‘routine’ procedures must be used or invented to still make methodologically convincing statements about causal relationships (In that sense, one might say that qualitative research is more difficult and demanding than quantitative research: if you are not good enough to carry out quantitative research, don’t even think about going into the qualitative field.). The logistic regression model with an observed categorical dependent variable Y can be understood in two ways, or as one might argue: there are essentially two different models. (This is also true for the closely related probit regression model which will not be discussed here because of the more insightful interpretation the logistic model can be given.) First, there is the latent variable interpretation, denoted here further as LVM. In LVM, it is assumed that the observed categorical dependent variable Y is a realization of an underlying unobserved latent variable Y* that depends linearly on the independent variables. On this latent continuum Y* there are certain ‘thresholds’ and depending on the respondent’s position on the latent continuum and the positions of the thresholds, the respondent will choose a category of the observed categorical dependent variable Y. The resulting probability distribution of Y as well as the sizes of logistic effects of the independent variables on Y further depend on the assumed distribution of the latent variable Y*. The coefficients of interest are essentially the unstandardized regression coefficients b in the underlying regression equation and the ‘problematic’ part of LVM is what can be inferred from the coefficients c in the logistic regression equation about these ‘latent’ unstandardized regression coefficients: what can ‘c’ tell us about ‘b’? In the DRM—Discrete Response Model—interpretation, there is no underlying latent dependent variable but the logistic regression model is essentially seen as a standard logit model with Y as the dependent variable and linearly restricted effects of the independent variables on Y. Odds ratios are the coefficients of interest in the DRM, which are simple, direct functions of the logistic regression coefficients c. Although the LVM ‘variant’ of the logistic regression model is widely used, especially in economics, it poses some serious challenges. First, the threshold response model to justify the latent variable approach makes fundamentally untestable assumptions about the response process and the distribution of the underlying latent variable. Second, the meaning of the underlying latent variable is not well defined. For example, for the categorical response variable employed/ unemployed, the latent variable might be denoted as ‘propensity to work’, but is this willingness, capability, preference, possibility or opportunity to work, or what? Moreover, using a dichotomy or just a few categories to represent a
continuous variable is a very (too) crude way of ‘measuring’. If the interest lies in the underlying latent variable, one should define it more precisely and measure it accordingly and directly. (It is much like in the notorious Pearson-Yule disputes on the tetrachoric correlation where Pearson viewed 1) Vaccinated and 2) Not vaccinated as the realization of the continuous variable ‘Degree of effective vaccination’ or as ‘Degree of immunization’ while 1) Dead and 2) Alive were seen as ‘Strength to resist small-pox when incurred’, interpretations which Yule strongly rejected.) Finally, rather than just modeling the possible ‘discrepancies’ between the propensity to work and the actual work status in this way, it often is sociologically more interesting to investigate empirically why this discrepancy might exist (and take the categorical response variable as it is). Therefore, one should not too easily jump to the LVM variant in the interpretation of the logistic regression outcomes and, in our view, should have good reasons not to use the DRM ‘interpretation’.
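To make the LVM reading tangible, the following minimal simulation sketch generates data according to the threshold response process just described and shows that the fitted logistic coefficient recovers the underlying effect b only up to the scale of the error term, the point elaborated in Sect. 2.2 below. The variable names, effect sizes and the choice of statsmodels are our own illustrative assumptions, not part of the original notes.

```python
# Minimal sketch of the LVM (threshold) interpretation; all numbers are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)

b, s = 1.0, 2.0                                   # latent effect b and error scale s
y_star = b * x + rng.logistic(scale=s, size=n)    # underlying regression for Y*
y = (y_star > 0).astype(int)                      # threshold: only Y is observed

c = sm.Logit(y, sm.add_constant(x)).fit(disp=0).params[1]
print(round(c, 3), b / s)                         # c is approximately b/s (= 0.5), not b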
2 Comparisons Among Subgroups and Generalizations

2.1 General Issues

Let's start with a two-dimensional probability distribution or table XY with a particular association and two subgroups A and B to be compared. Each association coefficient for such a bivariate distribution measures association, i.e. the distance of the data from the situation of statistical independence, in a particular way and from a certain perspective, where the values obtained depend on many features of the data. Whether the outcomes of a particular coefficient in two different subgroups can be meaningfully compared has to do with the purposes of the comparison and with the question of which of the features that influence the outcomes are considered to be relevant and which to be a nuisance. To illustrate, take the whole family of ordinal association coefficients (Kendall's tau variants, Goodman and Kruskal's gamma, Somers' d coefficients etc.). It only makes sense to compare them for two tables if one is interested in answering the question to what extent the two populations differ regarding the directions and strengths of the monotonically increasing or decreasing relationship. Further, the presence of ties in the tables may cause difficulties for the computation and interpretation of these coefficients; ties are handled in different ways for different ordinal coefficients. Thus comparability problems may arise if the amount or pattern of ties in one population differs from the ties in the cross-classification
in the other population. However, if the given definition and handling of ties is accepted and makes sense ‘theoretically’, then, given that definition and handling (e.g., as in gamma: ignore all tied pairs within rows, columns and cells), there are in principle no problems of comparability.

Most association measures are also not variation independent of the marginal distributions, e.g., their actual range depends on the marginal distributions. Coefficients of the family of simple ‘additive’ asymmetric measures, like the raw differences between the means or between conditional probabilities or the unstandardized regression coefficients in standard linear regression, are often considered the least problematic for comparison purposes. They can be given simple, direct and comparable interpretations for the two comparison tables in terms of the original scores. However, the values of these coefficients, too, are influenced by the marginal distributions. Given their asymmetric nature, the X distribution can be regarded as given/fixed. But if the (empirical) variance or the (empirical) range of the Y distribution is very small, percentage differences, unstandardized regression coefficients etc. cannot take on their theoretical maximum value. Does this hinder the comparison of two tables, one in which the Y-range is more severely restricted than in the other? In a purely descriptive sense, it is what it is: one is just smaller than the other. But do we want a marginal characteristic to influence our association or effect measure? If the smaller range in Y is ‘caused’ by a weaker effect of X, the comparison is sound. But what if the restricted range in Y is imposed by other factors that the researcher does not want to influence the comparison? The answer is then less clear. Attempts have been made in the groundbreaking work on associations by Goodman and Kruskal (1979) and also by Mosteller (1968) to free coefficients from marginal restrictions and make them in this sense comparable among tables with different marginal constellations, but this is impossible without imposing additional assumptions that might or might not make sense.

Standardized measures such as correlation or standardized regression coefficients (betas) are even by definition sensitive to the marginal distributions. In most empirical situations, the correlation coefficient, for example, cannot take on its theoretical maximum value of plus/minus one. But perhaps more importantly, these standardized measures are by definition ‘influenced’ by the variance of the marginal distributions and expressed in standardized scores (z-scores). Does this feature cause difficulties for a meaningful comparison? Again, it depends to a large extent on the nature of our research questions. If the researcher is just interested in comparing the spread around the regression lines, there is no comparison problem when using correlation coefficients. And if it is thought that not the raw differences between persons in X- and Y-values
in terms of their original scores are relevant but their differences relative to the standard deviation (z-scores), then standardized measures such as betas are the right ones to choose, also for comparison purposes. However, often the interest lies essentially with the original scores, for example, and certainly, for a variable like Gender. Even then, standardized regression coefficients are still used, but often in an automatic and thoughtless way. However, the comparisons of standardized measures across groups can be very misleading when the real interest is actually in the unstandardized effects.

The only ‘regular’ association coefficient that has marginal variation independence is the odds ratio (and direct functions of it). No matter what the marginals look like, the odds ratio in a 2 × 2 (sub)table can approach its maximum values. In that sense, the odds ratio is a true association measure, independent of the marginal distributions. However, it has a very particular perfect association pattern: one empty cell in a 2 × 2 table. This feature can be given a nice interpretation in terms of sufficient or necessary conditions. But is this what a researcher wants/needs? And in terms of comparisons: the same value can indicate that A is a necessary condition for B in one subgroup table and that A is a sufficient condition for B in the other subgroup table.

In making subgroup comparisons, very often an implicit linearity assumption is involved. For example, if the probability of Y = 1 increases in one subgroup from .05 for X = 1 to .25 for X = 2 and in another from .50 to .70, can these two percentage differences (both of .20) be regarded as indicating an equally strong effect of X on Y? Obviously only if the ‘Y = 1 starting level/point’ (for X = 1) is considered to be irrelevant, in other words only if an underlying linear relation is assumed. But even then, the two increases (from .05 to .25 and from .50 to .70) are only the same if linearity is defined in terms of an additive model for the frequencies or probabilities and not in terms of an additive model for the logs of the probabilities (in other words, not for a multiplicative model for the probabilities or odds). Which coefficient to choose when comparing, the percentage difference (epsilon) or the odds ratio, is in a way arbitrary, because both coefficients are simple, direct functions of the same conditional probabilities. So, from a strictly mathematical point of view, it may be all the same, but if the two coefficients are used to estimate and compare the influence of a treatment in the two subgroups, different substantive conclusions may be arrived at. For example, it is not strange to discover that in terms of the percentage differences, the influence is the same in both subgroups but in terms of odds ratios, the influence is much bigger in subgroup A than in B. This is essentially a particular form of the general truism that the presence and degree of an interaction effect, i.e. the degree to which the conditional X-Y
association differs across the categories of Z, depends on the association measure used.

Serious problems of comparability may also arise if latent variables are involved (as in the LVM variant of the logistic regression model, further dealt with below). The scores on the latent variables are by definition not known and their essential distribution characteristics have to be determined by other means than by direct observation. If the latent variable is considered as discrete (e.g., in latent class models), then, for a given, fixed number of latent classes, the ‘scale’ of the latent variable is known and its entire distribution can in principle be estimated. However, for continuous latent variables, e.g. in factor analysis or latent trait models, this is not possible. The solution is often found in assuming a particular (often: normal) distribution for the latent variable and assigning arbitrary values to its mean and variance, often 0 and 1 respectively. As long as the supposed measurement level of the latent variable is considered to be interval rather than ratio level, assigning an arbitrary mean value does not in general influence the interpretation of the effects or their comparability. However, assigning arbitrary values to the variance does. It makes, for example, the regression coefficients involving latent variables by definition (half) standardized, giving rise to the kinds of comparability problems mentioned above when discussing standardized and unstandardized regression effects. There are other ways of assigning means or variances to continuous latent variables. In (unstandardized) factor analysis (or, more generally, in structural equation models, SEMs), for example, a solution is sometimes found by setting the unstandardized effect of the latent variable on one particular indicator equal to one. In this way, the scale of the latent variable is determined by this particular indicator. The comparability of the subgroup outcomes then depends on the validity of the assumption that the effect of the latent variable on this indicator is indeed the same in both subgroups. Other identifying means such as introducing extra equality restrictions on particular parameters are also used, but of course comparability then rests on these extra model restrictions being true in both populations.

When the analysis is taken from comparing bivariate to multivariate distributions, the same type of considerations still applies, albeit now in the form of direct, partial relationships. For example, it is only meaningful to compare corresponding partial betas among subgroups, i.e. the partial standardized regression coefficients, if the standardized z-scores are regarded as the relevant differences between the research units rather than the original variable scores; in the latter case the unstandardized partial regression coefficients should be compared.
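The warning about standardized coefficients can be illustrated with a small simulation (a sketch under assumptions of our own; the effect size, sample sizes and group variances are invented for illustration): the unstandardized slope is identical in two subgroups, but because the variance of X differs between them, the standardized coefficients diverge considerably.

```python
# Identical unstandardized slope, diverging betas when var(X) differs across groups.
import numpy as np

rng = np.random.default_rng(4)

def slope_and_beta(sd_x, n=100_000, b=0.5):
    x = rng.normal(scale=sd_x, size=n)
    y = b * x + rng.normal(size=n)                 # same unstandardized effect b in both groups
    slope = np.polyfit(x, y, 1)[0]                 # unstandardized regression coefficient
    beta = slope * x.std() / y.std()               # standardized coefficient
    return round(slope, 3), round(beta, 3)

print(slope_and_beta(sd_x=1.0))   # roughly (0.5, 0.45)
print(slope_and_beta(sd_x=3.0))   # roughly (0.5, 0.83): same b, very different beta
```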
All of the scattered notes above do not directly address the core of the present discussions around the applicability of logistic regression coefficients in comparative research, especially not in comparing causal relations, but they do provide a more encompassing framework for this discussion and put it in perspective. A still wider perspective is of course possible, as in cross-national research where comparability is discussed in terms of possibly different meanings of the concepts in the populations to be compared, different measurement instruments used, different degrees of measurement error etc.
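Before turning to logistic coefficients specifically, the numerical example given earlier (an increase from .05 to .25 in one subgroup and from .50 to .70 in the other) can be made concrete. The helper function below is our own and only restates the definitions of the two coefficients.

```python
# Same percentage difference (epsilon), very different odds ratios.
def epsilon_and_odds_ratio(p1, p2):
    epsilon = p2 - p1
    odds_ratio = (p2 / (1 - p2)) / (p1 / (1 - p1))
    return round(epsilon, 2), round(odds_ratio, 2)

print(epsilon_and_odds_ratio(0.05, 0.25))   # (0.2, 6.33)
print(epsilon_and_odds_ratio(0.50, 0.70))   # (0.2, 2.33)
```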
2.2 Comparing Logistic Regression Coefficients Among Subgroups

LVM

As stated above, in the LVM variant of the logistic regression equation for Y, the effect measures of interest are the unstandardized regression coefficients in the underlying regression equation for Y*. If, for some reason, the focus of the subgroup comparisons is really on the standardized coefficients, these standardized effects (Y* standardized or fully standardized for both Y* and X; Long 1997) can be estimated from the logistic regression equations and the outcomes for the subgroups compared. We will focus here on the comparison of the underlying unstandardized coefficients.

The underlying unstandardized regression coefficients will be denoted by b (bi for the effect of Xi) and the corresponding observed logistic effects by c (or ci). Directly related to the arbitrary distribution given to Y*, more specifically to the error variance, the logistic coefficients c are a scaled function of the underlying unstandardized effects b: ci = bi/s, where s is an (unknown) scale factor. The scale factor is the same for all coefficients within a particular logistic regression equation. Regarding statistical significance and the sign of the effects (+ or −), c tells the same story as b. However, as a consequence of the scale factor, the differences among the unstandardized coefficients within an equation cannot be determined from the logistic coefficients or, stated otherwise, only up to the unknown scale factor s: c1 − c2 = b1/s − b2/s = (b1 − b2)/s. However, their ratios can be estimated: c1/c2 = (b1/s)/(b2/s) = b1/b2. Therefore, one can estimate from the pertinent logistic coefficients c within a particular logistic regression equation how many times larger the underlying unstandardized effect bi of Xi is than the effect bi' of Xi', but not how big their difference is. Alas, it is differences that really count in a regression equation: with an identical scoring of the variables, the ratio of the values for two b coefficients of .04 and .02 would yield the same
ratio 2 as the values of 8 and 4, but their differences (.02 and 4) would lead to a different substantive conclusion about the similarity of the effects.

In terms of what interests us most here, viz. the comparison of the corresponding coefficients in subgroups A and B, the relations between the logistic coefficients c and the unstandardized b are: ciA = biA/sA and ciB = biB/sB, with the subscripts A and B referring to the effects b and scaling factors s in the subgroups A and B. Because the scale factors generally will not be the same in both subgroups, there is no direct way in which the comparison of the logistic coefficients will lead to a meaningful comparison of the unstandardized coefficients, neither for their difference nor for their ratio: ciA − ciB = biA/sA − biB/sB and ciA/ciB = (biA/sA)/(biB/sB) = (biA/biB) × (sB/sA). Essentially, one can hardly derive anything useful about the comparison of the b coefficients in the two subgroups from comparing the c coefficients without making stringent additional and essentially non-testable assumptions. For example, if the logistic effect ci is larger in group A compared to group B, the opposite might be true for the unstandardized effects bi.

One possible ‘solution’ is to assume that the scale factors in the subgroups, essentially the error variances in the underlying regression equations, are the same: sA = sB = s. Although the difference in the b coefficients can still not be determined from the difference in the c coefficients, cA − cB = bA/s − bB/s = (bA − bB)/s, their ratio can be estimated given the equal scale factor: cA/cB = (bA/s)/(bB/s) = bA/bB. Another ‘solution’ is to assume that the effect bi for a particular independent variable Xi is the same in both groups A and B and therefore the ratio biA/biB equals one. This assumption does not help in identifying differences between the (other) b coefficients, but it does identify the ratios of the b coefficients for other variables than Xi. Given biA/biB = 1, the ratio ciA/ciB equals the ratio of the scale factors sB/sA and, once sB/sA is ‘known’, the ratio bA/bB for the other variables in the equation can be determined from cA/cB and the estimated sB/sA. However, these are rather stringent assumptions that cannot be empirically tested within the framework of the logistic regressions but just have to be assumed to be true.

In sum, within the LVM approach, it is very hard or next to impossible to make meaningful group comparisons about the unstandardized b coefficients on the basis of c. This of course extends to comparisons of the causal effects in the subgroups. Forgetting for a moment that the unstandardized regression coefficients are ‘latent’, no further specific problems arise within LVM regarding the causal interpretation of the unstandardized coefficients, i.e., not different from the conditions under which the unstandardized coefficients in an ordinary observed regression equation can be given a causal interpretation. But of course,
the ‘latent’ character of b does pose the comparison problems discussed above: where the unstandardized coefficients cannot be compared in a descriptive sense among subgroups, neither can the causal effects.

DRM

Because in the DRM approach there is no latent, underlying Y* and no unknown underlying effects to be determined from the logistic coefficients, no special difficulties arise in comparing the odds ratios in two or more subgroups, given of course that the logistic coefficients and the corresponding odds ratio measure the association in the desired way. For example, computing differences in logistic coefficients among groups and the corresponding differences in log odds ratios makes perfect sense, at least in a descriptive fashion. A more disputed issue is whether these differences reflect differences in causal effects, related to the more general, fundamental question of whether or not odds ratios are appropriate measures for causal effects. In a very general sense, in which the causal theory (‘story’) predicts particular (partial) associations to empirically exist and such predictions turn out to be true, odds ratios may support causal statements. But is the odds ratio also an appropriate and precise measure of the strength of a causal effect? As stated in the beginning, we see no a priori reason for ‘deviations from statistical independence’ to be expressed in just one particular form, e.g., preferring differences between conditional probabilities (ε) over odds ratios. In that general sense, the odds ratio can be used as a measure of the strength of a causal effect. Nevertheless, there are some peculiarities of odds ratios that deserve attention.

First, odds ratios do not squarely fit into the potential outcome framework. Very roughly speaking, in this framework, a causal effect is defined as the difference in reactions (the scores on the dependent variable) a particular individual shows when confronted with the experimental stimulus and when simultaneously subjected to the control situation. In the end, under additional conditions, the mean differences in Y between the (randomized) experimental and control groups can be interpreted as the expected individual causal effect. However, the logistic regression model is a multiplicative model or, stated differently, it is an additive model defined on the log of the probabilities rather than the probabilities themselves. Given the properties of logarithms (in which, e.g., log a/b is not equal to log a/log b), the (log) odds ratio as the outcome of the randomized experiment cannot be interpreted as the expected causal effect in the form of the expected individual (log) odds ratio. The odds ratio obtained is what is usually called a population-averaged measure and denotes what happens at the
population level. Staying within the potential outcome framework, one might say that what happens at the population level is the result of all individual causal effects where the odds ratio obtained in the randomized experiment provides a lower bound for the expected individual odds ratio. Explicitly introducing into the logistic regression equation more and more independent variables that influence Y, the expected individual odds ratio can be approximated. If there are several observations per individual available, as in panel analysis, random or fixed individual effects can be added to the equation to get an even better estimate of the individual, non-averaged effects, given that the necessary assumptions such as the time constancy of the individual effects are valid. A second peculiarity of logistic effects and odds ratios, compared to ‘ordinary’ regression analysis is that odds ratios change when an additional independent variable Z is added that is statistically independent of the independent variables already in the equation. Say, in the simple logistic regression equation there is only one independent variable X, having a certain effect on Y. If Z-Gender is added to the equation, where Z influences Y but is orthogonal to X, the effect of X will nevertheless change (generally getting stronger) in the extended equation compared to the simple equation. If in the first simple equation, with X and Y dichotomous, the odds ratio equals 3, one might somewhat loosely say that the odds of scoring Y = 1 rather than Y = 2 are expected to be three times larger for an individual randomly chosen from X = 1 compared to another (not the ‘same’) randomly chosen individual from X = 2. In the extended equation with Z-Gender included, the odds ratio might become 4. The two persons randomly chosen are now conditioned on having the same gender, so within the category Men or within the category Women. If more orthogonal independent variables are added, the odds ratio still represents the strength of X but at different subpopulation levels. Whether one should use the marginal odds ratio in the simple equation with only X or the partial odds ratio(s) controlling for the orthogonal variable Z depends on the research question: Does one want to make the comparison for persons with known characteristics on Z or not? Further important ‘causal’ consequences of this behavior of logistic regression effects are discussed below.
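The change of the odds ratio when an orthogonal Z is added can be reproduced in a small simulation (a sketch; the particular effect sizes, the binary Z and the use of statsmodels are our own choices, not taken from the text): the conditional odds ratio exceeds the marginal one even though X and Z are generated independently of each other.

```python
# Non-collapsibility: adding an orthogonal Z changes the X-Y odds ratio.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400_000
x = rng.integers(0, 2, size=n)
z = rng.integers(0, 2, size=n)                 # generated independently of x

logit_p = -1.0 + 1.4 * x + 1.4 * z             # conditional log odds ratio of x is 1.4
y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

c_marginal = sm.Logit(y, sm.add_constant(x)).fit(disp=0).params[1]
c_partial = sm.Logit(y, sm.add_constant(np.column_stack([x, z]))).fit(disp=0).params[1]
# the marginal OR is noticeably smaller than the conditional OR of exp(1.4), roughly 4.06
print(round(np.exp(c_marginal), 2), round(np.exp(c_partial), 2))
```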
3 Comparing Logistic Regression Coefficients from Nested Equations; Decomposing Marginal Relationships

From the old times of Lazarsfeld (1955) and Blalock (1964) to the present day, decomposing marginal relationships into direct and ‘indirect’ (confounding or mediating) parts has been a core business of quantitative social science research.
These decompositions play an important role in causal modeling but are also used in a more descriptive sense. To what extent the decompositions can be accomplished within the framework of the logistic regression model will be evaluated in this section.

LVM

For standard SEMs, consisting of a recursive set of standard regression equations (with or without latent variables), it is well known how to decompose marginal relationships (correlations, covariances or unstandardized regression coefficients) into direct, indirect, and spurious parts, where the indirect parts are due to mediating variables and the spurious parts due to confounding variables. Pearl, especially, has comprehensively shown how to transform many of the basic ideas of SEMs into general graphical models, often in the form of DAGs (Directed Acyclic Graphs), and how and when to attach causal meanings to these graphs. Here it will be discussed which special difficulties are encountered when the DAGs take the form of logistic regression equations and the LVM variant is applied.

To simplify matters, just two nested logistic regression equations, a restricted and a full one, will be dealt with. The first is the simple, ‘restricted’ equation for a categorical dependent variable Y with only one independent variable X. As before, the logistic regression coefficient will be denoted by c, now with subscript R: cR, because it refers to the coefficient in the restricted equation. The restricted equation of interest within the LVM is the underlying regression equation with the unstandardized regression coefficient bR for the effect of X on Y*. The scaling factor is denoted as sR and so cR = bR/sR. The second equation is the ‘full’ equation in which a variable Z is added to the restricted equation. The partial logistic effect of X in the full equation is denoted by cF, the corresponding underlying partial unstandardized effect of X by bF and the scaling factor for this equation as sF, and so cF = bF/sF. It will be assumed throughout that there are no interaction effects of X and Z on Y or Y*. Moreover, it will be ignored that, when in the full logistic regression equation the effects of X and Z on Y follow a logistic curve, collapsing over Z will not result in a logistic curve for the effect of X on Y, regardless of what is assumed in the restricted equation. However, the deviations from the logistic curve are generally small and can usually be neglected. It will also be ignored that, in more complicated cases with more independent variables, collapsing over one of the independent variables yields small higher order interaction effects on Y among the remaining variables even when they are not present in the original
full equation; these interaction terms are also ignored in the more restricted equation(s), generally without serious substantive consequences. In both nested logistic regression equations, the restricted and the full one, the error terms are assumed to follow the same standard logistic distribution with mean zero and the same (arbitrary) variance π²/3. However, the error variances of the underlying restricted and full equation will generally differ from each other, being smaller in the full than in the restricted underlying equation. Therefore, the scaling factors sR and sF will not be the same in the two equations.

Crucial in the decomposition procedure is the comparison of the effects in the full and the restricted equation to determine the direct effects and to identify the confounding/indirect effects. The effects of interest in LVM are bR and bF. However, because of the different scaling factors, the difference between the marginal and partial logistic effects c of X does not correctly reflect the difference between the underlying b coefficients: cR − cF = bR/sR − bF/sF. Both ‘scaling and confounding’, i.e., both the difference between the scaling factors and the difference between the underlying b coefficients, contribute to the difference between the marginal and the partial logistic effects cR and cF. Note that in the phrase ‘scaling and confounding’, ‘confounding’ is to be understood as ‘confounding or mediation’. (An analogous problem in the DRM approach is discussed in the literature under the heading of ‘collapsing and confounding’; see below.) Within the logistic regression model, there is actually no (known) way to disentangle the scaling and confounding parts of the differences between the logistic coefficients in order to arrive at the relevant differences in b. This situation is similar to what was seen in the subgroup comparisons: the subgroup differences in b could only be determined up to a scale factor.

Dealing with the ratios of the coefficients was a bit more successful in the subgroup comparisons, although only under stringent assumptions. ‘Ratios’ turn out to be even more useful in comparisons of marginal and partial coefficients; and this is the case without (too) stringent assumptions. Taking the ratio of the marginal and partial logistic coefficients yields: cR/cF = (bR/sR)/(bF/sF) = (bR/bF) × (sF/sR). If somehow the ratio of the scaling factors could be determined, the ratio of the b coefficients could be estimated and one might be able to conclude how many times smaller or bigger the partial coefficient is compared to the marginal one or, in other words, what percentage the direct effect is of the marginal one. Luckily, estimates of this ratio can be obtained in several different but related ways. Given that the logistic error variances are set to the same known value in the two nested logistic equations, that the explained part of the variance can be estimated (as a particular weighted sum of the variances of the independent variables) and that the total variances of Y* (or Y) are the same
in the two equations (being the same dependent variable), a direct estimation of the ratio (sF/sR) is possible.

Breen and Karlson (2013; see also Karlson et al. 2012) proposed a somewhat different (and probably more stable) method, adding a third underlying equation for Y* to the restricted and full equations discussed so far and, correspondingly, a third logistic regression equation. This new equation is like the full equation, but with Z replaced by Ze, where Ze is the residualized part of Z, i.e., Z is regressed on X and the error term of this regression is denoted here as Ze = Z − Z', where Z' is the predicted value of Z. This new ‘residualized’ full equation, subscripted by RF, with Ze and X as the independent variables, has two useful properties which are very ingeniously used by Breen and Karlson to obtain an estimate of (sF/sR). First, because Ze and X are by definition not correlated with each other, the partial effect b of X on Y* in the underlying residualized full equation is the same as the marginal effect of X on Y* in the underlying restricted equation with only the effect of X on Y*. However, this is not true for the coefficients c in the corresponding logistic equations, because the restricted logistic regression equation and the residualized full logistic equation have different scale factors. The partial and marginal logistic effects c of X in these two equations will not be the same but reflect the different sizes of the scale factors. Actually, the ratio of the c coefficients in the restricted logistic equation (cR) and in the residualized full logistic equation (cRF) reflects the ratio of the scale factors in these two equations: cR/cRF = (bR/sR)/(bRF/sRF) = (bR/bRF) × (sRF/sR) = 1 × (sRF/sR) = sRF/sR. (Alas, the difference (cR − cRF) is of no use (again): (cR − cRF) = (bR/sR) − (bRF/sRF) = (bR/sR) − (bR/sRF).) Secondly, the error variance in the underlying ‘residualized’ full equation is the same as in the underlying full equation for Y* and hence so are the scale factors: sRF = sF. Therefore, cR/cRF = sF/sR. Knowing the ratio of the scale factors for the restricted and the full equation, the ratio of the underlying b coefficients can be estimated and it can be found out how many times larger or smaller bF is than bR. As Breen et al. (2013) explicitly show, in this way it can be determined what percentage of the marginal effect b is due to the direct effect and what percentage to the confounding or indirect effects (and how to evaluate the statistical significance of these effects). Nevertheless, although these insights about the relative contributions of direct, indirect and confounding effects are very important, it must be remembered that the absolute sizes of the effects and their differences are still not known. Given that it is differences between the unstandardized regression coefficients that ultimately and really count in regression analysis, interpretations of LVM still face serious problems.
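The residualization idea can be sketched in a few lines of simulated code (our own illustration with invented effect sizes; dedicated implementations of the Karlson-Holm-Breen method handle standard errors, several Z variables and further details properly). The coefficients of X in the full and in the residualized full equation share the same scale factor, so their comparison is free of rescaling.

```python
# Sketch of the residualization ('KHB') logic on simulated LVM data; numbers are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 300_000
x = rng.normal(size=n)
z = 0.6 * x + rng.normal(size=n)                       # Z depends on X (mediation/confounding)
y_star = 1.0 * x + 1.0 * z + rng.logistic(size=n)      # underlying equation for Y*
y = (y_star > 0).astype(int)

X_r = sm.add_constant(x)
c_R = sm.Logit(y, X_r).fit(disp=0).params[1]           # restricted equation: X only

z_e = z - sm.OLS(z, X_r).fit().fittedvalues            # residualize Z on X
c_RF = sm.Logit(y, sm.add_constant(np.column_stack([x, z_e]))).fit(disp=0).params[1]
c_F = sm.Logit(y, sm.add_constant(np.column_stack([x, z]))).fit(disp=0).params[1]

# c_RF and c_F are on the same scale, so c_F/c_RF estimates bF/bR (the share of the
# marginal effect that is direct), whereas c_R/c_RF only reflects the two scale factors.
print(round(c_R, 3), round(c_RF, 3), round(c_F, 3), round(c_F / c_RF, 3))
```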
DRM

At the core of the DRM is the odds ratio. So the focus here will be on the ‘decomposition’ of the observed odds ratios without any reference to underlying latent variables and coefficients. Further, effects in the form of differences between odds rather than their ratios or in the form of relative risks or percentage differences will not be considered. If one is interested in effects expressed in terms of these other coefficients, models should be chosen in which these coefficients are modeled. Although some of these ‘other models’ may yield the same outcomes in terms of expected frequencies, this is not true in general. Moreover, the possibilities and ways of decomposing marginal associations are different for different coefficients. For example, percentage differences in linear probability models behave like unstandardized regression coefficients with their clear and simple decomposition rules, contrary to odds ratios. The exposition will be in terms of simple logistic/logit/loglinear models for three dichotomous variables (where it is good to remember that in the DRM interpretation the logistic model is a logit model, which is identical to a loglinear model). The problems involved will mainly be exemplified; what is necessary in the near future is to derive the decomposition results more formally and generally.

As analogously discussed for LVM, the reason that the decomposition of marginal relationships and the comparative interpretation of outcomes of nested equations are different and less intuitive in logit models compared to additive regression models comes ultimately from the fact that these models are additive in the logs of the probabilities and not in the probabilities themselves; stated differently: a multiplicative model is defined for the cell probabilities rather than an additive one. Confounding and collapsing do not behave in the same way in additive and multiplicative models. If an ‘orthogonal’ variable Z is added to a regression equation in a linear probability model, the coefficients (percentage differences) in the full equation are the same as in the original restricted equation. Collapsing the probability distribution (or cross-classification) over an orthogonal Z does not change the (other) effects and Z does not play a ‘confounding’ role. Moreover, in linear probability models, analogous to standard regression equations and SEMs, the marginal coefficients can be simply and exactly decomposed in terms of direct, indirect, and spurious effects, where the total effect is equal to the sum of the direct and indirect effects and the indirect effects are simply products of the separate direct effects. All this is not true in the same way for logit models. This can be shown following the same kind of algebra as in showing that the sum of logistic curves is itself not a logistic curve. Rather than following this
somewhat tedious path, the main points can be nicely illustrated by the (fictitious) data in Table 1.

Table 1  Cross-classification ABC. (Source Author's own calculation)

Panel a)
AB    C = 1     C = 2     100% =
11    90.00%    10.00%    55
12    40.00%    60.00%    15
21    40.00%    60.00%    5
22    4.71%     95.29%    25

lambda:   Partial    Marginal
AB        0.3950     0.7272
AC        0.6506     0.8688
BC        0.6506     0.8304

Panel b)
AB    C = 1     C = 2     100% =
11    90.00%    10.00%    42
12    40.00%    60.00%    28
21    40.00%    60.00%    18
22    4.71%     95.29%    12

lambda:   Partial    Marginal
AB        0.3322     0.0000
AC        0.6506     0.4748
BC        0.6506     0.4935

Panel c)
AB    C = 1     C = 2     100% =
11    90.00%    10.00%    48
12    40.00%    60.00%    22
21    40.00%    60.00%    12
22    4.71%     95.29%    18

lambda:   Partial    Marginal
AB        0.0358     0.2964
AC        0.6506     0.6306
BC        0.6506     0.6331

In panel a) of Table 1, a cross-classification ABC is constructed in such a way that the independent variables A and B have equal direct positive loglinear effects on the dependent variable C, without a higher order loglinear interaction effect; moreover, A and B are marginally positively associated, where arbitrarily it will be assumed that A influences B rather than vice versa. The lower part of panel a) presents the obtained values of the effect-coded loglinear lambda coefficients for this table; the effect-coded logit or logistic regression effects c would be twice and the log odds ratios four times the value of lambda. The column Partial contains the partial two-variable lambdas for model {AB,AC,BC} applied to table ABC and the column Marginal the marginal two-variable lambdas for the marginal tables AB, AC, and BC.

In panel a), the partial direct effect of A on B controlling for C is smaller than the marginal effect A–B, but given the (arbitrarily chosen) causal order, this partial effect is also meaningless. It does not make substantive sense to control for a dependent variable. The effect of A on B must be found in the marginal table AB. Potentially more interesting is that the direct, partial effects of A and B on C are .6506. These direct effects are smaller than the corresponding marginal effects of .8688 for A–C and .8304 for B–C. In a naïve interpretation, a researcher might conclude that this reduction of .8688 − .6506 = .2182 (for A–C) and .8304 − .6506 = .1798 (for B–C) is due to the fact that part of the marginal relationship B–C is spurious because of the positive effects of A on B and on C and that part of the marginal relation A–C represents an indirect effect through B. (Note that in the comparisons of the partial and the marginal effects, differences between lambdas will be used. The lambdas are the logarithms of the (multiplicative) log-linear parameters. Hence, when differences between the lambdas are used, this is equivalent to evaluating the (ratios of) odds ratios.)

However, looking at the outcomes in panel b) in Table 1, these naïve interpretations, borrowed from the normal analogous practices in standard linear SEMs, appear less obvious. For the construction of the data in panel b), the data in panel a) have been used as the starting point. The data are the same as in panel a) except that now A and B are made statistically independent of each other in marginal table AB. How that has been achieved can be explained in two equivalent ways. Given the assumed causal order of the variables and following Goodman's ‘modified path analysis approach’, the joint probability p(ABC = ijk) can be tautologically written as: p(ABC = ijk) = p(A = i)p(B = j|A = i)p(C = k|AB = ij). Making A and B statistically independent of each other means that p(B = j|A = i) = p(B = j) and therefore: p(ABC = ijk) = p(A = i)p(B = j)
p(C = k|AB = ij). This is no longer a tautological decomposition of the joint probability p(ABC = ijk) but constitutes a restricted model for the data. This last model with A and B independent of each other is then applied to the fictitious data in panel a) and the estimated (relative) frequencies as the outcomes of this modified path model are used as (fictitious) ‘observed’ frequencies in panel b). Another, equivalent, way of constructing the data in panel b) is to keep in panel b) the conditional percentage distributions of C within the categories of AB the same as in panel a), and therefore also the partial, direct loglinear effects of A and B on C; however, the marginal cross-classification AB (in the total column 100% =) is changed in such a way that, taking the marginal one-variable distributions of A and of B as given and as in panel a), variables A and B are made statistically independent of each other. A and B are then marginally orthogonal and the marginal log odds ratio AB equals zero.

As seen in panel b), the partial effects of A and B on C are now larger than the marginal ones. This difference of .4748 − .6506 = −.1758 for A–C and .4935 − .6506 = −.1571 for B–C would in a naïve interpretation be attributed to the fact that A now works as a suppressor variable for the relation B–C; in other words, that there is a positive direct effect of A on C offset by a negative indirect effect through B, resulting in a smaller marginal effect. However, this interpretation is nonsensical from a substantive, ‘causal’ point of view because there is no (marginal) nonzero relationship between A and B: A does not influence B. Therefore, B cannot act as a mediating variable and A cannot act as a confounding variable. The decrease from the partial to the marginal effects of A and B is now purely the result of collapsing, due to properties of working with logarithms, analogous to the scaling factors discussed above for the LVM. In that sense, given the way the table was constructed, the outcomes in panel b) tell us what would have been expected to happen to the effects of A and B on C if only collapsibility had had its influence.

The partial effect of B on C (and analogously the partial effect of A on C) found in panel a) may still be given a causal meaning as the direct effect of B on C, as explained above when discussing subgroup comparisons in DRM. However, the importance of the confounding role of A is underestimated. The total difference (TD) in panel a) between the marginal coefficient B–C and the corresponding partial one of .8304 − .6506 = .1798 is the result of collapsing (CL) and confounding/spuriousness (CS): TD = CL + CS. If it were justified to set the pure collapsibility part (CL), as in panel b), to .4935 − .6506 = −.1571, the actual confounding influence (CS) of A would be: CS = TD − CL = .1798 − (−.1571) = .3369. Without the collapsibility effects, the marginal effect B–C of .8304 would have decreased due to controlling for A to .8304 − .3369 = .4935 rather than to the (partial effect) value .6506 actually found. Or, stating the same
thing differently, one might say that the direct effect of .6506 would have resulted in a marginal effect B–C of .9875 rather than the found .8304, if only the spurious part due to A were added to the partial effect B–C, ignoring the collapsibility consequences. These ‘computations’ are mainly intended as illustrations of the underestimation of the mediating or confounding influences of ‘third’ variables in logistic regression analyses, because of the opposite effects of collapsing. Much more work must be done to obtain general and much more formal results.

As follows from the standard conditions of collapsibility in loglinear modeling (see, e.g., Andreß et al. 1997, pp. 180 ff.), which indicate when loglinear effects remain the same after a multidimensional table is collapsed over one or more variables, a sufficient condition for collapsibility regarding the effects of B on C in cross-classification ABC, given that there are no three-variable interactions, is that A does not have a direct influence on C. If that sufficient condition is satisfied, as in loglinear models {AB, BC} or {A, BC}, the effect B–C in marginal table BC will be the same as in the full table ABC. Collapsing or not over A will yield the same and correct conclusions about the size of the effects of B on C. In other words, adding a variable that has no direct influence on the dependent variable does not change the effects. Another sufficient condition for collapsibility is that the partial, direct effect of A on B, controlling for dependent variable C, is absent, as in loglinear model {AC, BC}. Although, as said above, such a condition does not make sense from a more causal perspective because a dependent variable is held constant, it may give rise to misleading outcomes that seem to imply that a particular independent variable is causally irrelevant as a confounding or mediating variable.

An example is given in panel c) of Table 1. The data in panel c) have been constructed in much the same way as the data in panel b) but now imposing the condition that there is a nonzero marginal relationship between A and B, smaller than in panel a) but larger than in panel b). The direct effects of A and B on C are as before. As it turns out, the partial effect A–B is almost zero, very close to fulfilling the collapsibility condition. Therefore, the marginal effect and the partial effect A–C will be (almost) the same, and the same applies to B–C. The (total) difference TD in panel c) between the partial and the marginal effect of B on C (and of A on C) is almost zero. A researcher might be inclined to conclude that A is irrelevant for the effects of B on C. In a way, this is true in that the size of the direct partial effect is (almost) identical to the marginal effect. But, as before, the difference between the partial and marginal effects is the result of collapsing and confounding/spuriousness: TD = CL + CS. An evaluation of the causal importance of A for the relation B–C should focus on the CS part. This CS part is not to be ignored; it is just of the same
size, but in a different direction from CL, yielding a near-zero value for TD. Analogous calculations might be applied as before to ‘isolate’ the confounding part. But, as said, these calculations were mainly intended to illustrate the problem. How to calculate and evaluate such kinds of indirect and spurious effects must be investigated further and much more formally.
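As a check on the numbers discussed above, the partial and marginal B–C effects of panel a) can be recomputed directly from the cell entries of Table 1. The short script below is ours; the layout of the frequencies follows the table as reproduced above.

```python
# Recomputing the panel a) effects of Table 1: effect-coded lambda = log odds ratio / 4.
import numpy as np

p_c1 = np.array([0.90, 0.40, 0.40, 0.0471])              # P(C = 1) for AB = 11, 12, 21, 22
n_ab = np.array([55.0, 15.0, 5.0, 25.0])                 # the 100% = column of panel a)
f = np.column_stack([n_ab * p_c1, n_ab * (1 - p_c1)])    # expected cell frequencies of ABC

def lam(t):                                              # lambda from a 2 x 2 table
    a, b, c, d = t.ravel()
    return np.log(a * d / (b * c)) / 4

partial_bc = lam(f[[0, 1], :])                           # B-C within A = 1 (A = 2 gives about the same)
marginal_bc = lam(np.vstack([f[0] + f[2], f[1] + f[3]])) # collapse the table over A
print(round(partial_bc, 4), round(marginal_bc, 4))       # ~0.65 and ~0.83: the Partial and Marginal B-C values
```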
4 Conclusion

What summary conclusions to draw? Comparability problems may arise not just in logistic regression models, but also regarding other effect or association measures. Nevertheless, the comparability problems in logistic regression are definitely serious. Regarding LVM, if a researcher is really interested in effects on the underlying latent variable Y*, the best possible advice is still: measure that variable directly. Regarding DRM, odds ratios can be given sensible interpretations as direct causal effects. Evaluating the indirect and spurious parts is more problematic. If a researcher is actually interested in percentage differences as (causal) effect measures, use linear probability models, for which the appropriate estimation procedures are available. On the other hand, in the social sciences, next to attempting to obtain precise estimates of the sizes of the causal effects, causal ‘analysis’ is often about telling a (methodologically) convincing causal story that at crucial points is validly linked to the appropriate empirical associations. For these purposes, logistic or logit models might be very useful and to be preferred, for example, for statistical reasons. Finally, a very large part of social science research is ‘explanatory’ in a more descriptive sense, e.g., trying to figure out what kinds of people vote for particular political parties, what the income consequences are of divorce for men and women, etc. It is also often about debunking seemingly obvious explanations of social phenomena. Logistic regression, if used appropriately, may be of great use for these purposes, as many empirical studies show, despite the actual and potential misinterpretations in many other publications.
References

Andreß, H.-J., J.A. Hagenaars, and S. Kühnel. 1997. Analyse von Tabellen und Kategorialen Daten; Log-Lineare Modelle, latente Klassenanalyse, logistische Regression und GSK Ansatz. Berlin: Springer.
Andreß, H.-J., J.A. Hagenaars, and S. Kühnel. Forthcoming. Interpreting effects in logistic regression, logit and log-linear models: Comparing coefficients within a single equation, among subgroups, and among equations. Quantitative Applications in the Social Sciences. Thousand Oaks: Sage.
Blalock, H.M. 1964. Causal inferences in nonexperimental research. Chapel Hill: University of North Carolina Press.
Breen, R., and K.B. Karlson. 2013. Counterfactual causal analysis and nonlinear probability models. In Handbook of causal analysis for social research, Ed. S.L. Morgan, 167–187. Dordrecht: Springer.
Breen, R., K.B. Karlson, and A. Holm. 2013. Total, direct, and indirect effects in logit and probit models. Sociological Methods and Research 42:164–191.
Diamond, J., and J.A. Robinson. 2010. Natural experiments of history. Cambridge: Belknap.
Goodman, L.A., and W.H. Kruskal. 1979. Measures of association for cross classifications. New York: Springer (collecting articles from JASA 1954, 1959, 1963, 1972).
Imbens, G.W., and D.B. Rubin. 2015. Causal inference for statistics, social and biomedical sciences: An introduction. New York: Cambridge University Press.
Karlson, K.B., A. Holm, and R. Breen. 2012. Comparing regression coefficients between same-sample nested models using logit and probit: A new method. Sociological Methodology 42 (1): 286–313.
Kim, J. 1999. Causation. In The Cambridge dictionary of philosophy, 2nd ed., Ed. R. Audi, 125–127. Cambridge: Cambridge University Press. http://stoa.usp.br/rdeangelo/files/-1/10954/Cambridge+Dictionary+of+Philosophy.pdf.
Kühnel, S.M., and D. Krebs. 2001. Statistik für die Sozialwissenschaften; Grundlagen und Methoden. Reinbek bei Hamburg: Rowohlt.
Lazarsfeld, P.F. 1955. Interpretation of statistical relations as a research operation. In The language of social research, Eds. P.F. Lazarsfeld and M. Rosenberg, 115–125. Glencoe: Free Press.
Long, S.J. 1997. Regression models for categorical and limited dependent variables. Thousand Oaks: Sage.
Morgan, S.L., Ed. 2013. Handbook of causal analysis for social research. Dordrecht: Springer.
Mosteller, F. 1968. Association and estimation in contingency tables. Journal of the American Statistical Association 65:35–48.
Pearl, J. 2009. Causality: Models, reasoning, and inference, 2nd ed. Cambridge: Cambridge University Press.
Vayda, A.P., and B.B. Walters, Eds. 2011. Causal explanation for social scientists. Lanham: AltaMira.
Panel Conditioning or SOCRATIC EFFECT REVISITED: 99 Citations, but is there Theoretical Progress?

Peter Schmidt, Maria-Therese Friehs, Daniel Gloris and Hannah Grote

P. Schmidt (*): Zentrum für Entwicklung und Umwelt (ZEU), Universität Giessen, Gießen, Germany, and Psychosomatische Medizin, Universität Mainz, Mainz, Germany. E-Mail: [email protected]
M.-T. Friehs: Arbeitseinheit Entwicklungspsychologie und Pädagogische Psychologie, Universität Koblenz-Landau, Landau, Germany. E-Mail: [email protected]
D. Gloris: Institut für Philosophie und Politikwissenschaft, TU Dortmund, Dortmund, Germany. E-Mail: [email protected]
H. Grote: Fachbereich 04 Psychologie, Philipps-Universität Marburg, Marburg, Germany

Electronic supplementary material: The electronic version of this chapter contains supplementary material that is available to authorized users: https://doi.org/10.1007/978-3-658-15629-9_2.

Abstract

In a paper published as early as 1987 by Jagodzinski, Kühnel and Schmidt on attitude measurement in a three-wave panel study, we empirically established a general orientation toward foreign employees („Gastarbeiter") in Western Germany. These items have been used continuously from 1980 until now in the ALLBUS studies (Wasmer and Hochman 2019). In this paper, we have analyzed how the citation, explanation and modeling of the Socratic effect for explaining changes in panel data developed over time, starting with the original paper of Jagodzinski et al. (1987). According
to Google Scholar, retrieved on 24.1.2019, 99 citations were found, all of which are listed in the Online Supplementary Material. From the beginning, there were discussions of eight different alternative model specifications derived from varying theoretical backgrounds, which all fitted the data (Jagodzinski et al. 1987, 1988, 1990; Steyer and Schmitt 1990; Saris and van der Putte 1988; Saris and Hartmann 1990). Until 2018, these authors continued with their model specifications in their publications, whereas the other authors citing the Socratic effect completely ignored the issue of the most adequate model specification. They just used the standard autoregressive model and in most cases did not discuss in detail how the Socratic effect should guide the parameter restrictions in the model. In this paper, we take into account the criticism by Hamaker et al. (2015) of the autoregressive and the autoregressive cross-lagged model and their proposal of a random intercept autoregressive model as a more adequate alternative for separating within- and between-person variance. We have used the attitude toward foreigners module of the GESIS ACCESS panel (Wagner et al. 2014) to specify and test how the Socratic effect can be taken into account in this model. The differences between the results of the autoregressive model and the random intercept model are substantial. Those differences refer to the sign, the strength and the significance of the coefficients and are similar to those found by Hamaker et al. (2015) and Kühnel and Mays (2019).

Keywords

Confirmatory factor analysis · Method factor · Socratic effect · Panel conditioning · Autoregressive models · Random intercept and trait models · Attitude toward foreigners · True scores · Latent variables
1 Introduction

Social and individual change have been central topics of the social sciences. To monitor and explain social and individual change, more and more panel studies have been made available in the social and behavioral sciences as public use files. Examples are the Socio-Economic Panel, Understanding Britain, European Community Household Panel, and access panel studies like the GESIS Access Panel. At the same time, several different models to analyze panel data have been developed, including fixed effects panel analysis, autoregressive and autoregressive cross-lagged models, random intercept models, latent growth curves, latent change models and continuous time models with stochastic differential equations.
Furthermore, one has to take into account whether only observed variables are used or whether multiple indicator models with latent variables are specified. Bollen and coauthors have integrated fixed effects panel models, autoregressive models, autoregressive cross-lagged models and latent growth curves into their general latent curve model (Bollen and Brand 2010; Bollen and Curran 2006; Kühnel and Mays 2019). Usami et al. (2019) have generalized panel autoregressive and latent growth curve models even more and included latent change models and the newly developed random intercept model. Instructive overviews of most of the models are given by Voelkle and Wagner (2017) and by Mayerl and Andersen (2018). Longitudinal structural equation modeling has been the topic of a well-written textbook by Little (2013).

In the analysis of panel data, two especially important topics have been identified as early as 1940 by Lazarsfeld (1940):

1. Panel Attrition (Haunberger 2011; Little 2013), that is, the loss of persons over the different waves and its treatment (Enders 2014).

2. Panel Conditioning (in psychology called test-retest effects) (Kroh et al. 2016; Shadish et al. 2002; Burgard et al. 2020), which refers to the effects of filling out the questionnaire the first time on the response behavior at subsequent times. One possible explanation for these changes is McGuire's (1960, 1992) application of cognitive consistency theory, labeled the Socratic effect.

3. A third but neglected topic is the choice of the model specification for the panel data, and how this is connected to the research questions to be studied (Saris and Gallhofer 2020; Usami et al. 2019). This implies both the substantive conceptualization of the variables involved and the choice of the adequate formalization via the choice of the models.

Traditionally, autoregressive and cross-lagged models have been used, either only with observed variables or including latent variables, to study stability over time and „Granger causality" in the form of cross-lagged effects (Granger 1969; Little 2013). The use of latent variables representing latent constructs was, however, challenged by Saris and van der Putte (1988), who argued that every item contains a specific content and that latent variables are mostly not deduced via theory as theoretical constructs, but inductively formed on the basis of factor analysis. Furthermore, Steyer and Schmitt (1990) argued that still another model specification, the Latent State-Trait Model, would be most adequate for the measurement of attitudes, as it had been used in personality research. Finally, as an alternative to autoregressive and cross-lagged autoregressive models, Hamaker et al. (2015) proposed an additional model for analyzing both stable between-person differences and within-person processes, named the random intercept cross-lagged panel model (RI-CLPM). This latter model often produced
totally diverging results compared with autoregressive and cross-lagged autoregressive models (e.g., Kühnel and Mays 2019).
In this paper, we want to concentrate on the second aspect, namely the phenomenon of the Socratic effect, and on the third topic, the question of model specification. We want to review the use of one specific specification of the autoregressive model for short-term panel studies, the so-called "Socratic effect" model, applied for the first time by Jagodzinski et al. (1987). This model specification takes into account panel conditioning, which is defined as the effect of the answers given at the first measurement on later waves. In the 1987 paper, the authors studied whether there is a tendency of participants within the ZUMA Test-Retest Study to change their response behavior after the first wave and discussed which underlying social mechanisms could explain this. Furthermore, they used an autoregressive model with latent variables representing attitudes toward guest workers (foreigners) and additional latent variables to represent the specificity of the items instead of autocorrelated errors (see Fig. 1). The authors also employed one alternative model within a structural equation modeling framework, called the response bias model, to account for the difference between the response behavior at the first and the following waves (see Fig. 2). Saris and van der Putte (1988) responded
Fig. 1 Autoregressive Model with four specificity factors υ1–υ4. (Source Jagodzinski et al. 1987, p. 289)
Fig. 2 Standardized coefficients AR model response bias. (Source Jagodzinski et al. 1987, p. 290)
by discussing the adequate model specification and proposed another specification in the form of a true score model using the same data set (see Fig. 3). In addition, they argued that the four items represented four different variables and not one theoretically derived latent variable.
In this article, we want to elaborate how far the idea behind the Socratic effect was useful as an underlying social mechanism (Opp 2005) to explain panel conditioning, and how the two original model specifications and the competing true score model of Saris and van der Putte (1988) were applied in the papers citing the original paper on the Socratic effect. Furthermore, we want to examine whether researchers citing the Jagodzinski et al. (1987) paper on the Socratic effect decided for one of the three different model specifications or at least discussed them. Finally, we want to use the GESIS Access Panel to employ the classical model with a data set collected in 2016 and 2017 using slightly different items. As an alternative, we use the newly proposed random intercept autoregressive model¹ of Hamaker et al. (2015) to examine the robustness of the Socratic effect.
¹ Hamaker et al. (2015) called their model the "random-intercept cross-lagged panel model". However, as we will not specify any cross-lagged effects, we will refer to this model in the following as the "random intercept autoregressive model".
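Stated compactly, and in our own notation (a sketch under the assumption of a single latent attitude per wave and without cross-lagged paths, not the authors' original equations), the random intercept autoregressive model decomposes the latent attitude of person i at wave t into a stable between-person part and a wave-specific within-person part:
\[
\eta_{it} = \mu_{t} + \mathrm{RI}_{i} + w_{it}, \qquad
w_{it} = \beta_{t,t-1}\, w_{i,t-1} + \zeta_{it},
\]
where \(\mathrm{RI}_{i}\) is a person-specific random intercept capturing stable between-person differences and \(w_{it}\) is the within-person deviation at wave t, which follows an autoregressive process. In the classical autoregressive model, by contrast, the \(\mathrm{RI}_{i}\) term is absent, so that the stability coefficients mix between-person and within-person information.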
Fig. 3 True-score model. (Source Jagodzinski, W., Kühnel, S. M., & Schmidt, P. 1990. Searching for parsimony: are true-score models or factor models more appropriate? Quality and Quantity, 24(4), 447–470).
One major finding of the analysis of the 99 follow-up publications of Jagodzinski et al. (1987) identified via Google Scholar until 2019 was the following: the choice of the adequate model specification, a common factor model versus a true-score model (Saris and van der Putte 1988; Jagodzinski et al. 1988, 1990; Saris and Hartmann 1990), has not been discussed in the 96 later publications. However, both the choice between different model specifications, such as autoregressive or latent growth curve models (Schlüter et al. 2007), and the choice of constraints within a given specification, such as autoregressive models, require researchers to adequately formalize their substantive hypotheses (see generally Jaccard and Jacoby 2010; for growth curve models, Legge et al. 2008). Interestingly, in nearly all applications of panel models with latent variables and multiple indicators, the choice of the model specification is not discussed at all, or not in detail, in light of the available alternative specifications. In most cases, standard autoregressive models were chosen (Little 2013; Reinecke 2014). In psychology, especially developmental psychology, and in criminology, the use of latent growth curves as an alternative specification has increased substantially in the last ten years, especially for analyzing changes in mean values. Mostly, in both approaches, no additional restrictions derived from theory were used besides those necessary for identification. That is to say, the empirical analyses were mainly data-driven and not theory-driven. This refers to constraints on the estimation of factor loadings over time for testing measurement invariance (Seddig and Leitgöb 2018; Sosu and Schmidt 2016) and, for example, to equality constraints on the estimation of latent mean changes, the stabilities over time, and the cross-lagged effects (Little 2013). Our main research questions for the literature review have been as follows:
1. Could the Socratic effect model, which accounts for the effects of repeated measurement in short-term panel studies, be replicated in later studies which cited the original paper by Jagodzinski et al. (1987)?
2. What have been the major criticisms of the model, and which model specifications were used?
3. Can we replicate the original model, based on the ZUMA Test-Retest Study (Porst and Zeifang 1987), 32 years later, using the new GESIS Access Panel with four waves and similar items, applying both the classical autoregressive model and the alternative random intercept autoregressive model?
In the next part of the paper, we summarize the basic assumptions and hypotheses of the Socratic model, of alternative model specifications, and the empirical results for these different models. In the third part, we give an overview of all
papers citing the original Socratic effect article (Jagodzinski et al. 1987) in Sociological Methods and Research, discuss its use and replication, and try to answer the first and the second research question. In the fourth part, we use four waves of the GESIS Access Panel to compare the original model and the recent alternative, the model proposed by Hamaker et al. (2015), to answer the third research question. Finally, we give a summary and discuss recommended changes for future analyses of panel data.
2 Socratic Effect and its Model Specification
The starting point for the use of the concept of the Socratic effect was the observation in the ZUMA Test-Retest Study (Porst and Zeifang 1987) that in all attitude scales, the correlations between the items were consistently higher at measurement waves two and three than at the first measurement wave. The interval between the waves was one month. As one possible explanation, Jagodzinski et al. (1987) employed the concept of the "Socratic effect", developed originally by McGuire on the basis of consistency theory and tested in experiments, but not applied to panel studies (McGuire 1960, 1992); see also the comment of McGuire (1992) in Appendix 1. The content of the Socratic effect can be specified as follows: Over time, people become more consistent in their answers, and remembering the questions from the first wave helps them (memory effect); this effect was explained by McGuire (1960, 1992) using cognitive consistency theory as the underlying social mechanism. Alternatively, people might not have formed crystallized attitudes toward an object or might even hold non-attitudes (Converse 1964, 2000; Zaller 1992; Saris and Sniderman 2004), and therefore, the responses are less consistent at the first measurement wave. As participants are repeatedly stimulated to think about the topics of the questions, they give less random answers from the second panel wave on.
The specific hypotheses derived from the Socratic effect explanation are as follows. The effect was postulated for short-term panels (one month or less between waves):
H1 The standardized factor loadings (indicating the item's variance explained by the latent construct factor, i.e., its reliability) are significantly lower in the first wave than in the subsequent waves of short-wave panel studies.
H2 The standardized factor loadings of the items do not increase further across the subsequent waves.
Because respondents are sensitized by the first interview, their responses should, in general, be more consistent in subsequent waves.
H3 The variance of the random measurement error of the items is greater in the first wave than in the subsequent ones.
H4 The random measurement error does not decrease significantly further across the subsequent waves.
As we postulate that the underlying latent variable is an attitude and not a more fluid opinion, we expect a high consistency of responses in short-distance panels.
H5 In short-wave panels (with, for example, only a few weeks between waves), the inter-temporal consistency of a latent variable is nearly perfect, that is, the unstandardized and the standardized stability coefficients are close to unity.
Hypotheses H1–H5 form the core of the Socratic effect. In the next two Sects. 2.1 and 2.2, we report how these hypotheses were empirically tested and discuss the empirical results obtained with three different model specifications.
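For the three-wave case of the original study, the hypotheses can also be written as constraints on the parameters of a common factor panel model. The notation below is ours (the original paper states the hypotheses verbally):
\[
\text{H1/H2: } \lambda^{*}_{i1} < \lambda^{*}_{i2} \approx \lambda^{*}_{i3}, \qquad
\text{H3/H4: } \theta_{i1} > \theta_{i2} \approx \theta_{i3}, \qquad
\text{H5: } \beta_{21} \approx \beta_{32} \approx 1,
\]
where \(\lambda^{*}_{it}\) denotes the standardized loading of item i at wave t, \(\theta_{it}\) the corresponding error variance, and \(\beta_{t,t-1}\) the stability coefficient of the latent attitude between waves t−1 and t.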
2.1 Autoregressive Models
The input data came from the ZUMA Test-Retest Study of Porst and Zeifang (1987); details, including the correlation matrix, are given in Jagodzinski et al. (1987). The sample size was N = 152. The empirical results are shown in Fig. 1. As one can see, for each time point the authors specified a common factor model with reflective items representing the construct "attitude toward guest workers" as a latent variable. It is regarded as a general attitude toward a specific object (Eagly and Chaiken 1993) and specified as a first-order common factor model, postulating only direct effects between consecutive time points. To take the autocorrelated errors into account, four additional method factors were introduced, which represent the item-specific variance over the time points and can be regarded as a special case of a multitrait-multimethod (MTMM) specification (Brown 2015). From Fig. 1, one can see that the standardized factor loadings increase descriptively from wave 1 to wave 2 and, in three cases, slightly further in wave 3. The authors used no significance tests because of the low statistical power given the sample size.
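In equation form, the model in Fig. 1 can be sketched as follows (our notation; a simplified rendering of the specification described above, with item-specific factors instead of autocorrelated errors):
\[
y_{it} = \lambda_{it}\,\eta_{t} + \gamma_{it}\,\upsilon_{i} + \varepsilon_{it}, \qquad i = 1,\dots,4;\; t = 1,2,3,
\]
\[
\eta_{t} = \beta_{t,t-1}\,\eta_{t-1} + \zeta_{t}, \qquad t = 2,3,
\]
where \(y_{it}\) is item i at wave t, \(\eta_{t}\) the latent attitude toward guest workers at wave t, \(\upsilon_{i}\) the item-specific (method) factor of item i, and \(\varepsilon_{it}\) a random measurement error; the latent attitudes, the method factors, and the errors are assumed to be mutually uncorrelated.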
Furthermore, the stability coefficients between the latent attitudes (factors) at the different time points also increased, from .854 to .984. The error variances decreased from wave 1 to wave 2. Therefore, all hypotheses formulated above as H1 to H5 were confirmed. The alternative model specification proposed in Jagodzinski et al. (1987) differed from the model in Fig. 1 by introducing an additional exogenous latent variable, which represents the response bias at the first time point. Furthermore, to achieve an adequate model fit, the authors introduced a negative correlation between the response bias and the attitude-toward-guest-workers factor at wave 1. This model is not nested in the Attitude Change (Socratic effect) model. Therefore, only information-theoretic global fit measures (AIC, CAIC, BIC, and BCC) can be used for model selection; these were not available in the SEM program LISREL in 1987 and were consequently not reported in the original paper. The global fit measures available at that time did not allow a decision between the two models, because they had an approximately similar global fit and an equal probability level. The fit values for the Attitude Change model are: χ2 = 45.09, df = 48, p = .593. For the response bias model (see Fig. 2), we found: χ2 = 46.07, df = 49, p = .593.
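As a reminder of why information criteria would be needed here, their standard definitions are (added for clarity, not taken from the original paper): for a model with log-likelihood \(\ln L\), q free parameters, and sample size N,
\[
\mathrm{AIC} = -2\ln L + 2q, \qquad \mathrm{BIC} = -2\ln L + q\,\ln N .
\]
Because such criteria penalize model complexity instead of relying on a nested χ2 difference, they allow a comparison of the non-nested Attitude Change and response bias models; with the fit values reported above, the two models can hardly be distinguished.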
2.2 True Score Model as Alternative
In 1988, both models were critically evaluated by Saris and van der Putte. They argued that, for theoretical reasons, a true score model would be much more adequate. They criticized the original models for treating items on related issues as if they measured the same underlying latent variable and theoretical construct (Saris and van der Putte 1988), as the items measured different aspects of the attitude toward guest workers, such as adaptation, political participation, interethnic marriage, and workplaces. Furthermore, Saris and van der Putte challenged two major points: firstly, the conceptualization of measurement error and the existence of a Socratic effect; secondly, the attempt to measure a general attitude formalized as a latent variable in a common factor model. Saris and van der Putte (1988), in contrast, assumed that each of the four items used to measure the attitude toward guest workers was related to a specific opinion and not to a general attitude construct. These opinions would be fairly unstable between the first and the second wave. As an alternative model, they specified a pure true score model (see Fig. 3). In this model, each item is related to a single true score, which is regarded as an opinion. The fit of their model was slightly better.
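A schematic rendering of this specification in our notation (a sketch, not the authors' original equations) is
\[
y_{it} = \tau_{it} + \varepsilon_{it},
\]
with one true score \(\tau_{it}\) (interpreted as an opinion) per item and wave, and with the true scores of the same item linked over time instead of being joined in a common attitude factor; the exact linking of the true scores follows Fig. 3 and Saris and van der Putte (1988).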
Thus, it can be concluded that no goodness-of-fit measure or other statistical criterion was available at that time to decide between their model and the two original models (Jagodzinski et al. 1988). The fit values were only slightly better than for the former two models: χ2 = 48.18, df = 53, p = .662. The results of the two papers led to the following research questions, which were dealt with in a follow-up of the 1987/1988 discussion published in 1990 (Jagodzinski et al. 1990; Saris and Hartmann 1990):
1. Which of the model specifications for attitude stability and opinion formation is superior from a theoretical, methodological, and empirical point of view?
2. Is a common factor model based on attitude theory more appropriate than a true score model, which denies the existence of an underlying latent variable?
3. Is it reasonable to assume a Socratic effect in short-wave panels, with its effects of decreasing error variances, increasing reliabilities, and a stabilization of the underlying opinions?
In the 1990 paper (Jagodzinski et al. 1990), it was argued that a major advantage of the common factor model becomes apparent when we switch from CFA models to full SEM models with theory-driven predictors of attitudes toward guest workers as a latent variable. The authors demonstrated this by introducing age and education as determinants of attitudes toward guest workers and were able to show that the expanded model was much more parsimonious and restrictive than the true score model. Consequently, the authors argued that if the recommendations of Saris and van der Putte (1988) and Saris and Hartmann (1990) were followed, one would have to formulate a separate hypothesis for every single opinion/item, which is not desirable as it would lead to a plethora of hypotheses. While the model of Saris and van der Putte (1988) showed the best fit in the total sample, the revised model with response bias had the best fit for a reduced sample excluding all respondents older than 70, whose response behavior has often been found to diverge (Herzog and Rodgers 1988). But also in the model of Saris and van der Putte (1988), the decrease of the measurement error from wave 1 to wave 2 was significant, which was one of the three core hypotheses of the Socratic effect. Therefore, the debate remained unresolved, and to this day the two parties have applied their preferred model specifications. In the next section, we want to summarize the papers which mentioned the Socratic effect and analyze how far the model specifications were tested and whether the authors referred to the discussion between the true score model and the common factor model.
3 Citations and Replications
To count the number of articles and books mentioning the 1987 paper by Jagodzinski et al. on the Socratic effect, we registered its Google Scholar citations. We took into account all publications until 31.12.2018. The results are displayed in Fig. 4 (below). Originally, we intended to perform a meta-analysis. However, as the number of published correlation matrices found in the publications was very small, we had to change this plan and conducted a systematic review instead. In total, we found 99 papers which cited the article. Most of the 99 publications merely cited the paper of Jagodzinski et al. (1987) without providing much additional information. Seventeen publications, including those supplying correlation matrices, were finally available for a more detailed analysis. As criteria for coding, we used: a) correlation matrix available; b) time distance between the waves; c) number of waves; d) sample size; e) type of sample; f) topic of the constructs; g) topic of the study; h) CFA/SEM vs. test-retest correlations; i) data collection mode; j) same SEM model specification as our models 1 and 2 in Jagodzinski et al. (1987) vs. not the same specification as models 1 or 2. All 17 publications were coded. The criteria used are given in Table 1. Only five publications of the total number of 99 contained correlation matrices in the text or appendix (see Table 2). For a meta-analysis, this is by far too small.
Fig. 4 Citations per year. (Source author’s own presentation)
Table 1 Table of criteria. (Source author's own presentation)
Study
Correlation matrix available: a.–n.a.
Time distance between waves: in weeks
Number of waves
Type of sample: Students (S), Representative sample (Rs), Rest (R)
Sample size
Measured constructs: Values (V), Personality traits (P), Attitudes (A), Opinion (O)
Area of survey: Politics (P), Prejudice (Pr), Health (H)
CFA/SEM vs. time indices (test-retest correlation)
Method of data collection: Person-to-person/CAPI (P), Internet survey (I), Written survey (W), Telephone (T)
Model variation: yes–no
Table 2 Overview of the studies. (Source author's own presentation) Codes as in Table 1; additional codes appearing in this table: Competence (C) for measured constructs; Health/Psychology (Hp) and Webdesign (W) for area of survey; a. = correlation matrix available (page given), n.a. = not available.

Cernat, A. 2015. The impact of mixing modes on reliability in longitudinal studies. Sociological Methods & Research 44: 427–457. Correlation matrix: n.a.; time distance: 52 weeks; waves: 4; sample: Rs, N = 2384 (w1)–1621 (w3); constructs: A; area: P, H; CFA/SEM vs. time indices: SEM (QSMS, LMC); data collection: P, T; model variation: no.

Courvoisier, D.S., M. Eid, and F.W. Nussbeck. 2007. Mixture distribution latent state-trait analysis: Basic ideas and applications. Psychological Methods 12: 80–104. Correlation matrix: a. (p. 84); time distance: 3 weeks; waves: 4; sample: R, N = 501; constructs: Ms; area: Hp; CFA/SEM vs. time indices: SEM (LST); data collection: W; model variation: no.

Courvoisier, D.S. 2006. Unfolding the constituents of psychological scores: Development and application of mixture and multitrait-multimethod LST models (Doctoral dissertation, University of Geneva). Two samples: (a) Correlation matrix: a. (p. 42); time distance: 3 weeks; waves: 4; sample: R, N = 501; constructs: Ms; area: Hp; CFA/SEM vs. time indices: SEM (LST); data collection: W; model variation: no. (b) Correlation matrix: a. (p. 139); time distance: 26 weeks; waves: 4; sample: R, N = 375; constructs: Ms; area: Hp; CFA/SEM vs. time indices: SEM (MTMM, LST); data collection: W; model variation: no.

Ferrando, P.J. 2003. Analyzing retest increases in reliability: A covariance structure modeling approach. Structural Equation Modeling 10: 222–237. Correlation matrix: n.a.; time distance: 18 weeks; waves: 2; sample: S, N = 218; constructs: P; area: Hp; CFA/SEM vs. time indices: SEM; data collection: W; model variation: no.

Finn, A. 2007. Doing a double take: Accounting for occasions in service performance assessment. Journal of Service Research 9: 372–387. Correlation matrix: n.a.; time distance: ca. 5 weeks; waves: 2; sample: R, N = 17; constructs: O; area: W; CFA/SEM vs. time indices: ANOVA, G-coefficients; data collection: I; model variation: no.

Jagodzinski, W., S.M. Kühnel, and P. Schmidt. 1987. Is there a "Socratic effect" in non-experimental panel studies? Consistency of an attitude toward guest workers. Sociological Methods & Research 15: 259–302. Correlation matrix: a. (p. 282); time distance: 4 weeks; waves: 3; sample: Rs, N = 152; constructs: A; area: Pr; CFA/SEM vs. time indices: SEM; data collection: P; model variation: no.

Jagodzinski, W., S.M. Kühnel, and P. Schmidt. 1990. Searching for parsimony: Are true-score models or factor models more appropriate? Quality and Quantity 24: 447–470. Correlation matrix: a. (p. 467); time distance: 4 weeks; waves: 3; sample: Rs, N = 137; constructs: A; area: Pr; CFA/SEM vs. time indices: SEM; data collection: P; model variation: yes.

Kroh, M., F. Winter, and J. Schupp. 2016. Using person-fit measures to assess the impact of panel conditioning on reliability. Public Opinion Quarterly 80: 914–942. Correlation matrix: n.a.; time distance: 52 weeks; waves: ca. 18; sample: Rs, N = 49522; constructs: P, A; area: P, H, Hp; CFA/SEM vs. time indices: SEM; data collection: P, W; model variation: yes.

Maag, G. 2013. Gesellschaftliche Werte: Strukturen, Stabilität und Funktion. Wiesbaden: Springer VS. Correlation matrix: a. (p. 211); time distance: ca. 52 weeks; waves: 3; sample: Rs, N = 231–204; constructs: V; area: P; CFA/SEM vs. time indices: EFA, time indices; data collection: P, W; model variation: no.

Reuband, K.H. 1999. Kriminalitätsfurcht: Stabilität und Wandel. Neue Kriminalpolitik 11 (2): 15–20. Correlation matrix: a. (p. 17); time distance: ca. 52 weeks; waves: 3; sample: R, N = 429; constructs: A, O; area: Pr, Hp; CFA/SEM vs. time indices: time indices (test-retest correlation); data collection: W; model variation: no.

Saris, W.E., and B. van der Putte. 1988. True score or factor models: A secondary analysis of the ALLBUS test-retest data. Sociological Methods & Research 17: 123–157. Correlation matrix: a. (p. 126); time distance: ca. 4 weeks; waves: 3; sample: Rs, N = 152; constructs: A; area: Pr; CFA/SEM vs. time indices: SEM; data collection: P; model variation: yes.

Shevlin, M., G. Adamson, and K. Collins. 2003. The Self-Perception Profile for Children (SPPC): A multiple-indicator multiple-wave analysis using LISREL. Personality and Individual Differences 35: 1993–2005. Correlation matrix: a. (p. 2001); time distance: 18 weeks; waves: 4; sample: R, N = 155; constructs: P, C; area: Hp; CFA/SEM vs. time indices: SEM; data collection: W; model variation: yes.

Sprengers, M. 1992. Explaining unemployment duration: An integrative approach (Unpublished doctoral dissertation). University of Utrecht, Netherlands. Correlation matrix: a. (p. 133); time distance: 27 weeks (t1–t2, t2–t3), 312 weeks (t3–t4); waves: 4; sample: R, N = 221; constructs: (education, unemployment, history, loyalty); area: P, H; CFA/SEM vs. time indices: SEM; data collection: P; model variation: no.

Steyer, R., and M.J. Schmitt. 1990. Latent state-trait models in attitude research. Quality and Quantity 24: 427–445. Correlation matrix: a. (p. 438); time distance: 4 weeks; waves: 3; sample: Rs, N = 152; constructs: A; area: Pr; CFA/SEM vs. time indices: SEM; data collection: P; model variation: yes.

Sturgis, P., C. Roberts, and N. Allum. 2005. A different take on the deliberative poll: Information, deliberation, and attitude constraint. Public Opinion Quarterly 69 (1): 30–65. Correlation matrix: n.a.; time distance: ca. 52 weeks; waves: 5; sample: Rs, N = 5122; constructs: A; area: P; CFA/SEM vs. time indices: Cronbach's alpha, fixed effects models; data collection: P; model variation: no.

Sturgis, P., N. Allum, and I. Brunton-Smith. 2009. Attitudes over time: The psychology of panel conditioning. In Methodology of longitudinal surveys, ed. P. Lynn, 113–126. Hoboken, NJ: Wiley. Correlation matrix: n.a.; time distance: ca. 52 weeks; waves: 10; sample: Rs, N = 224–300; constructs: A; area: P, Hp; CFA/SEM vs. time indices: CFA, Pearson's r; data collection: no indication; model variation: no.

Van der Zouwen, J., and T. Van Tilburg. 2001. Reactivity in panel studies and its consequences for testing causal hypotheses. Sociological Methods & Research 30: 35–56. Correlation matrix: n.a.; time distance: ca. 25–52 weeks; waves: 2; sample: Rs, N = 3107; constructs: C; area: H, Hp; CFA/SEM vs. time indices: t-tests, ANOVA; data collection: P; model variation: no.
Although the article has been cited 99 times, in the papers listed it is the Socratic effect model, and not the response bias model or the true score model, that is mentioned most of the time. Furthermore, the issue of latent variables as common factors versus single items combined with true score models, as well as the question of the equivalence of the different models (Asparouhov and Muthén 2019), was not discussed at all. An exception is the monograph by Saris and Gallhofer (2014), who have applied true score models in many experiments in the European Social Survey. Although we found 99 citations in Google Scholar, there has been no systematic replication with new data over the period of 31 years from 1987 until 2018, with the exception of six publications: the dissertation by Sprengers (1992), the book by Maag (2013), the study by van der Zouwen and van Tilburg (2001), the studies by Sturgis et al. (2005, 2009), and the large study by Kroh et al. (2016). Sturgis et al. (2009) started by citing Holt (1989), who argued that the existing research in the area of panel conditioning has been largely atheoretical, with a primary focus on estimating the direction and magnitude of possible biases as opposed to a specification of the underlying causal mechanisms. The authors proposed an additional theoretical model, which they call the cognitive stimulus hypothesis (CS model). This model assumes that most respondents will answer in their first interview without giving much thought to their positions on a variety of political and social issues. This corresponds to the concept of non-attitudes, which Converse (1964, 2000) first developed when comparing the "mass" subsample with the elite subsample in cross-sectional data. This line of argument was continued by Zaller (1992), who argued that each answer to a survey question reveals a snapshot of the considerations at the top of a respondent's head at the time when the question is being answered. Being questioned about these issues may stimulate respondents to think and reflect more intensively about them. Alternatively, or additionally, it may lead to more discussions with friends and in social networks, which was confirmed in a qualitative study by Waterton and Lievesley (1989). They used signal detection theory to deduce that the ratio of signal to noise for individual items should increase over time. Sturgis et al. (2009) predicted that this should decrease the random component in panel studies over time. Kroh et al. (2016), using data from 30 waves of the Socio-Economic Panel and approx. 50,000 respondents, however, demonstrated that even in panels with a distance of one year between waves, reliability increases over time. The authors were, however, skeptical concerning the underlying process of consistency formation, which is the basis of the Socratic effect, referring to the results of a study by Saris and van Meurs (1991). The latter authors showed that a gap of 25 minutes between replications is sufficient to obtain independent measures and that therefore no test-retest effect takes place.
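The signal-to-noise argument of Sturgis et al. (2009) and the reliability gains reported by Kroh et al. (2016) can be stated in classical test theory terms (a standard formalization added here for clarity, not taken from the cited studies): the reliability of an item at wave t is
\[
\rho_{t} = \frac{\sigma^{2}_{T,t}}{\sigma^{2}_{T,t} + \sigma^{2}_{E,t}},
\]
where \(\sigma^{2}_{T,t}\) is the true score ("signal") variance and \(\sigma^{2}_{E,t}\) the error ("noise") variance. A rising signal-to-noise ratio over waves is therefore equivalent to increasing reliability, which is what Kroh et al. (2016) report for the SOEP.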
Furthermore, Kroh et al. (2016) elaborated and applied the conceptual scheme of Tourangeau et al. (2000) to explain survey responses in terms of the underlying cognitive processes (Bergmann and Barth 2018). According to these authors, one needs to differentiate four phases: The first phase consists of the comprehension of a question. The second phase, called the retrieval phase, is characterized by drawing relevant information from short- and long-term memory based on the understanding of the item. Kroh et al. (2016) cite Bailar (1989) in this respect, who indicates that the number of errors in recalling retrospective events decreases over panel waves. They classify the original explanation in the same way, writing "for a related interpretation Jagodzinski et al. (1987)" (Kroh et al. 2016, p. 917). The third phase is defined by the respondent's interpretation of the retrieved material and the formation of a judgment. Here, the concept of satisficing (Krosnick and Alwin 1987; Krosnick 1991; Simon 1957) is introduced as an explanatory mechanism. This implies that respondents do not maximize their expected utility when answering questions in surveys, but stop once their aspiration level is reached. Finally, in the fourth phase, respondents report their latent judgment using the offered response categories. Summarizing the effects of survey experience, the authors predict specific effects for all four phases: it diminishes ambiguities in question comprehension, facilitates the retrieval of relevant information, reduces the need for heuristics of judgment, and makes haphazard misreporting less likely. As a consequence, Kroh et al. (2016) state that their most important finding is that general experience and exposure with a single multi-item instrument increases reliability robustly (p. 937). Furthermore, they state that the largest absolute gains in the reliability of instruments take place in the first four years (4 waves), in contrast to the Socratic effect model, which predicts an increase of the standardized factor loadings at the second time point and no further changes after wave 2. However, the predictions of the Socratic model were constrained to be valid only for short-term panels with a maximum length of three months. One also has to take into account that the time distance between the waves of the SOEP was one year and not one month, as in the data of the test-retest study of Jagodzinski et al. (1987). In the other papers, only the term Socratic effect was used, but no specific restrictions derived from the Socratic effect were specified and tested. The other three papers, besides Jagodzinski et al. (1987), involved in the original discussion show much lower but roughly equal citation numbers, retrieved from Google Scholar on 15.02.2019: 28 citations for the alternative true score model by Saris and van der Putte (1988), 25 for the improved common factor model in Jagodzinski et al. (1990), and 16 for the response model of Saris and Hartmann (1990).
4 Replication Study with the GESIS Access Panel
The following analyses are based on the GESIS Access Panel, a mixed-mode repeated survey of a sample representative for Germany with online as well as paper-and-pencil administration (Bosnjak et al. 2018). The basic population of the survey is the German-speaking population with permanent residence in Germany between 18 and 70 years of age, as defined in official records. Respondents were invited to participate in surveys every two months. For the analyses at hand, we used the survey module "A longitudinal multilevel approach to study causes and consequences of positive and negative attitudes towards ethnic minority groups in Germany" (Wagner et al. 2014), which focuses on attitudes towards migrant groups in the German population. The sample (N = 3341) was separated into four sub-samples (random groups), for which the target group of the survey questions differed (either Muslims, Foreigners, Refugees, or Sinti and Roma as target groups of the attitudes). Four waves of measurement were collected, with measurement intervals of six months (spring 2016, autumn 2016, spring 2017, autumn 2017). In contrast to the original study, we thus use one more wave, and the time distance is not one month but six months. The larger time distance between the waves might have an additional influence, although the large study by Kroh et al. (2016) found robust effects on reliability in the SOEP (Goebel et al. 2018) using data with a time distance of as much as one year. For the following analyses, we focus only on the sub-sample (n = 827) which answered survey questions on "Foreigners" as the target group, as out of the pool of target groups this group shows the most conceptual overlap with the target group "guest workers" that was used in the original analysis of the Socratic effect. The label "Foreigners" in Germany is associated with a number of different nationalities, but mainly with people of Turkish origin (Asbrock et al. 2014; Wasmer and Hochman 2019). Two items were used in all four waves to assess participants' attitudes towards foreigners: "How would you generally describe your feelings towards foreigners?" and "Altogether, how would you evaluate foreigners?", both answered on a scale from 1 = very negative to 5 = very positive. This is an additional difference compared with the items of the original test-retest study, which used a different operationalization of the attitudes toward guest workers. Descriptive statistics as well as item intercorrelations for these items for all waves are given in Table 3. We computed all following models using Mplus 8.0 (Muthén and Muthén 1998–2017) with robust maximum likelihood estimation (MLR), which takes into account missing information based on the assumption of a missing-at-random (MAR) process and accounts for non-normality of the data distribution.
Table 3 Item descriptive values and correlations for the two items measuring attitudes towards foreigners in all four waves of the GESIS Access Panel. (Source author's own presentation) Target group: Foreigners (n = 827)

#  Item           M     s2    (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)
1  Feelings T1    3.15  0.58  1.00
2  Evaluation T1  3.19  0.59  0.78  1.00
3  Feelings T2    3.17  0.45  0.51  0.54  1.00
4  Evaluation T2  3.16  0.47  0.53  0.54  0.71  1.00
5  Feelings T3    3.13  0.48  0.51  0.48  0.54  0.54  1.00
6  Evaluation T3  3.10  0.45  0.54  0.52  0.53  0.55  0.72  1.00
7  Feelings T4    3.10  0.46  0.51  0.48  0.57  0.59  0.60  0.58  1.00
8  Evaluation T4  3.10  0.45  0.50  0.51  0.55  0.59  0.53  0.56  0.72  1.00

Note M = mean value, s2 = variance. Feelings = How would you generally describe your feelings towards foreigners?; Evaluation = Altogether, how would you evaluate foreigners? Scaling from 1 = very negative to 5 = very positive. T1 = Wave 1 (spring 2016), T2 = Wave 2 (autumn 2016), T3 = Wave 3 (spring 2017), T4 = Wave 4 (autumn 2017)
4.1 Replication of the Socratic Effect Model as an Autoregressive Model
As a first step, we established a baseline model in which four attitude factors, representing the four waves of measurement and each measured by the two respective indicators, were modelled without any specified structural relations between the factors (i.e., all attitude factors were allowed to correlate with each other). This model is shown in Fig. 5. The model fit was excellent, χ2(14) = 17.358, p = .238; RMSEA = .017 [90% CI: .000–.040]; SRMR = .010; CFI = .999. An examination of the modification indices indicated that the specification of
Fig. 5 Baseline measurement model. Standardized coefficients. Non-significant parameters are depicted in dotted lines. Parameters are rounded. (Source author’s own presentation)
method factors to account for autocorrelated errors, as in the original model, was not required. As a second step, we established metric measurement equivalence. Generally, measurement equivalence tests whether an "instrument measures the same concept in the same way across various subgroups of respondents" (Davidov et al. 2014, p. 58), which is of course a vital precondition for all interpretations concerning a concept's stability (Brown 2015; Seddig and Leitgöb 2018). Metric measurement equivalence can be assumed if a model in which the unstandardized factor loadings of identical items are restricted to be equal across measurement waves does not show a substantially worse model fit than a more liberal model not imposing these restrictions. Metric measurement equivalence is required for a meaningful interpretation of latent correlation/regression coefficients (Brown 2015). The metric measurement equivalence model again showed an excellent model fit, χ2(17) = 19.958, p = .276; RMSEA = .015 [90% CI: .000–.036]; SRMR = .020; CFI = .999, and the comparison between the metric measurement equivalence model and the more liberal model above did not indicate a substantial decrease in model fit, Δχ2(3) = 2.384,² p = .497; ΔRMSEA = −.002; ΔSRMR = .010; ΔCFI = .000 (Satorra and Bentler 2010). This indicates that metric measurement equivalence, and thus the precondition for a valid interpretation of latent stability coefficients, can be assumed.
² The χ2 difference test was corrected for the use of the robust maximum likelihood estimator (MLR).
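For completeness, the constraint imposed in the metric model and the corrected difference test can be sketched as follows (our notation; the scaled difference formula follows the approach of Satorra and Bentler as commonly implemented for MLR and is added as a reader aid, not quoted from the original sources):
\[
\text{Metric equivalence: } \lambda_{j,T1} = \lambda_{j,T2} = \lambda_{j,T3} = \lambda_{j,T4} \quad \text{for each item } j \in \{\text{Feelings}, \text{Evaluation}\},
\]
\[
c_{d} = \frac{d_{0}\,c_{0} - d_{1}\,c_{1}}{d_{0} - d_{1}}, \qquad
\Delta\chi^{2} = \frac{T_{0} - T_{1}}{c_{d}},
\]
where \(T_{0}, d_{0}, c_{0}\) are the uncorrected ML χ2 value, the degrees of freedom, and the scaling correction factor of the more restrictive (metric) model, and \(T_{1}, d_{1}, c_{1}\) the corresponding values of the less restrictive baseline model.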
Fig. 6 Autoregressive stability model for the consecutive waves of measurement. Parameters are rounded. All depicted parameters are significant. (Source author’s own presentation)
Lastly, based on the metric model, we specified the structural relations between the four attitude factors by modeling autoregressive paths between consecutive waves of measurement (i.e., T1–T2, T2–T3, and T3–T4, but not T1–T3, T1–T4, or T2–T4), thus directly replicating the original model above. No equivalence restrictions were imposed on the stability coefficients. The stability model is depicted in Fig. 6. The model fit was good, χ2(20) = 82.695, p