First Observation of Fully Reconstructed B0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb (Springer Theses) 3031227522, 9783031227523

This book presents the latest results on the branching fraction and phase space distribution of B0 and Bs0 decays into f


English Pages 249 [241] Year 2023

Table of contents :
Supervisors’ Foreword
Abstract
Acknowledgements
Contents
1 Introduction
References
2 Theory Background
2.1 Spin, Helicity and Chirality
2.2 Symmetries and Gauge Bosons
2.3 The Standard Model
2.4 Spontaneous Symmetry Breaking
2.5 The CKM Matrix
2.6 The SM Lagrangian
2.7 Measurements of γ at LHCb
References
3 The LHCb Detector
3.1 The Large Hadron Collider
3.2 The LHCb Detector
3.2.1 Tracking
3.2.2 Particle Identification (PID) Systems
3.2.3 Calorimeters
3.2.4 The ECAL
3.2.5 The Muon Detector
3.2.6 PID Algorithms
3.2.7 The Trigger System
References
4 Analysis Strategy
References
5 Data Samples
5.1 Stripping
5.2 Auxiliary Samples
5.2.1 Samples from Other Stripping Lines
5.2.2 MC Generated Samples
5.3 Truth Matching
References
6 Candidate Selection
6.1 Trigger Selection
6.2 Preselection
6.3 D from B MVA
6.4 $\overline{D}^{0}$ Signal Window Definition
6.5 Multivariate Selection
6.6 Yields
References
7 Characterization of Backgrounds
7.1 Combinatorial Background
7.2 Partially Reconstructed Backgrounds
7.3 Misidentified Backgrounds
7.4 Partially Combinatorial Backgrounds
7.5 Misreconstructed Signal
7.6 Wrong $\overline{D}^{*}(2007)^{0}$ Decay Backgrounds
References
8 Simultaneous Fit
8.1 Double Crystal Ball Function
8.2 ARGUS Function Convolved with a Crystal Ball Function
8.3 Johnson Function and Double Crystal Ball Function
8.4 Exponential Function
8.5 Fit Strategy
8.6 Fit Results
References
9 Signal Efficiencies
9.1 The Weight Function $w_s$
9.2 Correlation Studies
9.3 Efficiency Determination
References
10 Systematic Uncertainties
10.1 Efficiency Systematic Uncertainties
10.1.1 PIDCORR Resampling
10.1.2 MC Statistics
10.1.3 SDP Binning for Efficiency Estimation
10.1.4 L0Hadron Trigger Systematics
10.1.5 Data/MC Disagreement
10.1.6 Data/MC Disagreement in $B_s^{0}$ Lifetime
10.1.7 Biases in the sWeights Procedure Due to Correlations
10.2 Yields Systematic Uncertainties
10.2.1 Fit Stability
10.2.2 Contributions from $\Lambda_b^{0}$ Backgrounds
10.2.3 Multiple and Duplicated Candidates
10.2.4 Alternative Background Models
10.3 Systematic Uncertainties Summary
References
11 Results
11.1 Relative Branching Fraction Measurements
11.2 Dalitz Plot Distributions
11.3 Final Conclusions
References
Appendix A Selection Variables Description
Appendix B Data/MC Agreement of BDT Input Variables
Appendix C Signal MC Correlation Studies
Appendix D Effects of Fit Constraints on Yield Statistical Uncertainties


Springer Theses Recognizing Outstanding Ph.D. Research


Aims and Scope

The series “Springer Theses” brings together a selection of the very best Ph.D. theses from around the world and across the physical sciences. Nominated and endorsed by two recognized specialists, each published volume has been selected for its scientific excellence and the high impact of its contents for the pertinent field of research. For greater accessibility to non-specialists, the published versions include an extended introduction, as well as a foreword by the student’s supervisor explaining the special relevance of the work for the field. As a whole, the series will provide a valuable resource both for newcomers to the research fields described, and for other scientists seeking detailed background information on special questions. Finally, it provides an accredited documentation of the valuable contributions made by today’s younger generation of scientists.

Theses may be nominated for publication in this series by heads of department at internationally leading universities or institutes and should fulfill all of the following criteria:
• They must be written in good English.
• The topic should fall within the confines of Chemistry, Physics, Earth Sciences, Engineering and related interdisciplinary fields such as Materials, Nanoscience, Chemical Engineering, Complex Systems and Biophysics.
• The work reported in the thesis must represent a significant scientific advance.
• If the thesis includes previously published material, permission to reproduce this must be gained from the respective copyright holder (a maximum 30% of the thesis should be a verbatim reproduction from the author’s previous publications).
• They must have been examined and passed during the 12 months prior to nomination.
• Each thesis should include a foreword by the supervisor outlining the significance of its content.
• The theses should have a clearly defined structure including an introduction accessible to new PhD students and scientists not expert in the relevant field.

Indexed by zbMATH.

Arnau Brossa Gonzalo

First Observation of Fully Reconstructed $B^{0}$ and $B_s^{0}$ Decays into Final States Involving an Excited Neutral Charm Meson in LHCb

Doctoral Thesis accepted by University of Warwick, Coventry, UK

Author
Dr. Arnau Brossa Gonzalo
Instituto Galego de Fisica de Altas Enerxias - IGFAE
Santiago de Compostela
La Coruña, Spain

Supervisors
Prof. Tim Gershon
Department of Physics
University of Warwick
Coventry, UK

Dr. Tom Latham
Department of Physics
University of Warwick
Coventry, UK

ISSN 2190-5053
ISSN 2190-5061 (electronic)
Springer Theses
ISBN 978-3-031-22752-3
ISBN 978-3-031-22753-0 (eBook)
https://doi.org/10.1007/978-3-031-22753-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Supervisors’ Foreword

The high energy proton-proton collisions at the Large Hadron Collider produce copious amounts of hadrons containing beauty or charm quarks. The production mechanism means that these tend to travel predominantly close to the beam line, which makes it difficult to distinguish their decay products from the even more copious lighter particles that are also produced in the collisions. The LHCb detector is designed to overcome this challenge, using precise tracking of charged particles to achieve excellent momentum resolution and the ability to distinguish the decay vertices of the beauty and charm hadrons from the primary proton-proton interaction positions. This approach has proved enormously successful, allowing LHCb to publish over 600 papers on topics including asymmetries between matter and antimatter, searches for physics beyond the Standard Model and discoveries of new types of hadrons. However, the majority of these results are based on final states involving charged particles only. For processes involving neutral particles, reconstruction of energy deposits in LHCb’s calorimeter system is required, making it even more challenging to separate signal from background. This is particularly true for low momentum neutral particles, such as the photon or neutral pion emitted in decays of excited neutral charm mesons, i.e. $\overline{D}^{*}(2007)^{0} \to \overline{D}^{0}\gamma$ or $\overline{D}^{0}\pi^{0}$. For this reason, until recently there was no published LHCb analysis involving reconstruction of $\overline{D}^{*}(2007)^{0}$ mesons. This left a hole in LHCb’s physics programme, since studies of processes involving these particles can provide additional insights into the fundamental physics questions being addressed. During his Ph.D. studies, Arnau Brossa Gonzalo has addressed and overcome the numerous issues that make reconstruction of processes involving $\overline{D}^{*}(2007)^{0}$ mesons so difficult. In particular, he has made the first observations of $B^{0}$ and $B_s^{0}$ meson decays to final states containing an excited neutral charm meson, a kaon and a pion, i.e. $B^{0} \to \overline{D}^{*}(2007)^{0} K^{+}\pi^{-}$ and $B_s^{0} \to \overline{D}^{*}(2007)^{0} K^{-}\pi^{+}$, and measurements of their branching fractions relative to that of the $B^{0} \to \overline{D}^{*}(2007)^{0}\pi^{+}\pi^{-}$ decay.

Two unique aspects of this analysis deserve particular mention. The first is the understanding of how to combine signals from LHCb’s electromagnetic calorimeter with tracking information, leading to a new classification for the different background sources that survive the selection algorithms, including misreconstructed signal processes ($\overline{D}^{*}(2007)^{0} \to \overline{D}^{0}\gamma$ reconstructed as $\overline{D}^{*}(2007)^{0} \to \overline{D}^{0}\pi^{0}$ and vice versa). The second is the implementation of a novel weighting method that allows the efficiency variation across the phase space of the signal decays to be corrected for when the distribution of the signal in the phase space has itself not previously been measured. Each of these is a major breakthrough, beyond the level that would normally be expected in a Ph.D., and marks Arnau’s thesis out as exceptional.

Coventry, UK
July 2022

Prof. Tim Gershon
Dr. Tom Latham

Abstract

Measurements of the angle $\gamma$ of the unitarity triangle formed from elements of the CKM matrix are achieved through the study of $B$ hadron decays featuring interference between $b \to c\bar{u}s$ and $b \to u\bar{c}s$ processes. Such measurements are affected by a systematic uncertainty due to the limited knowledge of yet to be observed partially reconstructed decays featuring soft neutral final state particles. Unless knowledge of these decays is improved, the precision of future determinations of $\gamma$ may become limited by this source of uncertainty. In this thesis such decays are studied, resulting in the first observation of $B^{0} \to \overline{D}^{*}(2007)^{0} K^{+}\pi^{-}$ and $B_s^{0} \to \overline{D}^{*}(2007)^{0} K^{-}\pi^{+}$ decays, and measurements of their branching fractions relative to that of $B^{0} \to \overline{D}^{*}(2007)^{0}\pi^{+}\pi^{-}$ decays. In addition, results on the phase space distribution of such decays are given, which directly benefit the understanding of these decays as partially reconstructed backgrounds in measurements of $\gamma$ in $B^{+} \to D K^{+}$ and $B^{0} \to D K^{+}\pi^{-}$ decay studies. The $B^{0} \to \overline{D}^{*}(2007)^{0} K^{+}\pi^{-}$ and $B_s^{0} \to \overline{D}^{*}(2007)^{0} K^{-}\pi^{+}$ decay channels also provide interesting insight for the study of resonances in the charm sector and could themselves be used in the future for determinations of $\gamma$. The study presented in this thesis is based on the data recorded by the LHCb experiment during the years from 2016 to 2018, corresponding to a sample of 5.4 fb$^{-1}$. It constitutes the first analysis within the LHCb collaboration featuring fully reconstructed $D^{*}(2007) \to D^{0}\gamma$ and $D^{*}(2007) \to D^{0}\pi^{0}$ decays, and pioneers the usage of a novel weighting method while implementing an event-by-event efficiency correction to account for resonant structures in the decays under study.


Acknowledgements

This thesis is the result of four years of hard work and would never have been possible without the help and support of many people. Thus, I want to express my gratitude and share a few words with them. Firstly, I would like to thank my supervisors Prof. Tim Gershon and Dr. Tom Latham for their support over these last four years. Their unlimited patience and constant efforts to keep me motivated have been without a doubt an essential piece for the completion of this thesis. I would also like to thank Dr. Matt Kenzie for his key contributions to this thesis, which were crucial to solving many of the challenges that this work presented. I would also like to thank my fellow Ph.D. students and postdocs, past and present, not only for sharing the Ph.D. struggles over these four years but also for making me feel welcome after moving to a new country. In particular, I would like to mention Dr. Edward Millard, for so many late night discussions, only interrupted to have a game of pool. I am also extremely grateful to Ross, Alice, Luismi and Marina for making me feel at home with their unconditional support during the last segment of my Ph.D., which despite being defined by the quarantine, was full of very fond memories.

I also want to thank everyone who, even though I began this stage far from home, has kept making me feel as if I had never left. To Clara, for the two-hour coffee breaks, the random calls and for cheering me up every time I needed it. To Ana, for filling the year in Geneva with climbing sessions, parties and gossip, often in equal parts. To Andreu, for all the meet-ups every time I came down to Barcelona looking for a little taste of home. Finally, to my mother, Mercé, for her support and the sacrifices without which none of this would have been possible.


Chapter 1

Introduction

Sometimes, science is so difficult it makes me sad. – Nathan W. Pyle, Strange planet

The Standard Model (SM) is a quantum field theory that works as a toolbox with all the necessary pieces to describe interactions of matter at the most fundamental level. Many contributions were added to improve the model throughout the second half of the 20th century, which led to predictions of experimental results to an incredible degree of precision. Several Nobel prizes have been awarded to the combined efforts of a large number of theorists for developing such tools in an effective and consistent manner, and to experimentalists for the corresponding discoveries. The first of them went to Feynman, Schwinger and Tomonaga in 1965 for the formalization of quantum electrodynamics [1]. In 1979, Glashow, Salam and Weinberg were awarded the prize for unifying weak and electromagnetic interactions [2, 3]. Twenty years later, in 1999, ’t Hooft and Veltman were given the prize for the quantum formalization of the weak interaction [4]. On the side of quantum chromodynamics (QCD), the prize went to Gell-Mann in 1969 for its initial formulation [5] and to Gross, Politzer and Wilczek in 2004 for its further developments [6]. Other improvements that amended some inherent flaws of the SM were also worthy of Nobel prizes. Kobayashi and Maskawa were given the prize in 2008 for the introduction of a quark flavour mixing mechanism in the SM, and in 2013 Higgs and Englert were awarded the prize for presenting the spontaneous symmetry breaking (SSB) mechanism that gives mass to all elementary particles [7, 8]. All these tools have proven to be extremely effective and have offered several predictions that have been widely tested by enormous experimental efforts in collider experiments. However, some experimental results still contradict the current state of the Standard Model, and several new models are still being tested in an attempt to build a complete theory.
Some of these unanswered questions are the quantization of the gravitational interaction, a possible description of dark matter or the inclusion of neutrino masses to account for the observed neutrino oscillations, whose discovery was also worthy of a Nobel prize in 2015, awarded to Kajita and McDonald [9]. Thus, the incomplete (though extremely successful) nature of the SM makes the search for experimental results that deviate from its current formulation crucial to push the boundaries of the SM and provide tests for the candidate models.

One such topic of interest is the difference between the amount of matter and antimatter in the early stages of the Universe. The precise mechanism for this asymmetry to arise, known as baryogenesis, requires that particle interactions exist that violate charge-parity (CP) symmetry [10]. While this requirement is satisfied by the SM, the amount of CP asymmetry provided by the SM is orders of magnitude smaller than required by matter-antimatter asymmetry observations. The only source of CP violation in the SM is through flavour changing transitions. Thus, the study of CP-violating processes in the flavour sector is one of the main areas of interest of many high energy physics experiments, in particular of the LHCb collaboration. The LHCb collaboration has already provided a large list of studies of known CP-violating processes and searches for new sources of CP violation, and currently leads the race for ultimate precision measurements in this sector. The processes studied in this thesis represent an additional step towards that objective.

In order to put this work into context, a summary of the different symmetries of the SM and the mechanism for CP-violating processes to occur are provided in Chap. 2, putting special emphasis on how measurements of CP-violating processes are performed in LHCb. With that in mind, a description of the LHCb detector and its performance is given in Chap. 3. In Chap. 4, the strategy of the analysis described in this thesis is summarized, which is detailed in the following chapters. In Chap. 5, the data samples used for this study are described. The selection of signal candidates is shown in Chap. 6, with the description of the different backgrounds presented in Chap. 7. The fitting strategy used to determine the yields of the signal channels is described in Chap. 8, with the calculation of the signal efficiency necessary for a branching fraction measurement given in Chap. 9. Afterwards, a discussion of the systematic uncertainties that affect this study is shown in Chap. 10. To conclude, the results of this thesis are presented in Chap. 11, and their relevance to other ongoing and future studies is described.

References

1. Schwinger JS (1948) Quantum electrodynamics. A covariant formulation. Phys Rev 74:1439. https://doi.org/10.1103/PhysRev.74.1439
2. Glashow SL (1961) Partial symmetries of weak interactions. Nucl Phys 22:579. https://doi.org/10.1016/0029-5582(61)90469-2
3. Salam A (1968) Weak and electromagnetic interactions. Conf Proc C 680519:367. https://doi.org/10.1142/9789812795915_0034
4. ’t Hooft G, Veltman MJG (1972) Regularization and renormalization of gauge fields. Nucl Phys B 44:189. https://doi.org/10.1016/0550-3213(72)90279-9
5. Fritzsch H, Gell-Mann M, Leutwyler H (1973) Advantages of the color octet gluon picture. Phys Lett B 47:365. https://doi.org/10.1016/0370-2693(73)90625-4
6. Gross DJ, Wilczek F (1973) Asymptotically free gauge theories - I. Phys Rev D 8:3633. https://doi.org/10.1103/PhysRevD.8.3633
7. Higgs PW (1964) Broken symmetries and the masses of gauge bosons. Phys Rev Lett 13:508. https://doi.org/10.1103/PhysRevLett.13.508
8. Englert F, Brout R (1964) Broken symmetry and the mass of gauge vector mesons. Phys Rev Lett 13:321. https://doi.org/10.1103/PhysRevLett.13.321
9. Super-Kamiokande, Fukuda Y et al (1998) Evidence for oscillation of atmospheric neutrinos. Phys Rev Lett 81:1562. https://doi.org/10.1103/PhysRevLett.81.1562, arXiv:hep-ex/9807003
10. Sakharov AD (1967) Violation of CP invariance, C asymmetry, and baryon asymmetry of the Universe. Pisma Zh Eksp Teor Fiz 5:32. https://doi.org/10.1070/PU1991v034n05ABEH002497

Chapter 2

Theory Background

In this chapter we will build all the elements of the SM, placing special emphasis on flavour mixing measurements in $B$ hadron decays. Although most of the developments of the SM have been experimentally driven, starting with the formulation of quantum electrodynamics, it is more instructive to construct the model from the context of quantum field theory and its underlying symmetries. To do this, we must first introduce one of the most crucial properties of quantum fields, and one that motivates their quantum formulation: the spin.

2.1 Spin, Helicity and Chirality

The spin of a particle describes an intrinsic angular momentum with no direct analogue in the macroscopic world. Contrary to classical angular momentum, only the magnitude of the spin of a particle ($S$) and its projection on a given axis ($S_z$) can be measured simultaneously. Also, these quantities can only take discrete values rather than any possible value (in units of $\hbar$, $S_z = N\,\tfrac{1}{2}$, with $N$ being an integer, and $S = \tfrac{1}{2}\sqrt{N(N+2)}$). On top of that, while the angular momentum of a macroscopic object can be measured through the movement of its parts with respect to a certain axis, the point-like nature of a particle (or even worse, its description as a probability density function) makes it impossible to measure its spin in the same way. As such, intrinsic spin is something quite different from classical angular momentum. Nonetheless, its value has crucial implications for the behaviour of particles. The wave function of a system of half-integer spin particles, also called fermions, is (up to a factor) completely antisymmetric under any swap of identical particles within the system. By contrast, it is completely symmetric for integer spin particles, also called bosons. This directly leads to Pauli's exclusion principle: a system with two indistinguishable particles in the same state is symmetric under the swap of those particles by definition, and can therefore only be populated by bosons.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_2

The impossibility of a complete measurement of all components of the spin of a particle arises from the uncertainty principle, which states that perpendicular components of the spin are not physically defined simultaneously, so that only its magnitude and its projection onto a certain axis can be measured. In the context of particle physics, this gives rise to the concept of helicity, defined as the projection of the spin of a particle onto the direction of its momentum. Helicity, however, is not invariant under Lorentz boosts for massive particles, and is therefore not a good candidate for a fundamental property of a particle. A Lorentz invariant analogue of helicity for massive particles is chirality. Its value is given by how a particle transforms under the operator $\gamma^5$. We denote a fermion as right-handed (left-handed) if it is completely symmetric (antisymmetric) under this transformation. One can decompose a wave function into its left-handed and right-handed components by using the associated projectors $P_L$ and $P_R$,

$$\psi_L = P_L \psi = \frac{1}{2}(1 - \gamma^5)\psi, \qquad \psi_R = P_R \psi = \frac{1}{2}(1 + \gamma^5)\psi. \tag{2.1}$$

The left- and right-handed components are linearly independent, and return the null state when the opposite projector is applied to them,

$$P_L \psi_R = 0, \qquad P_R \psi_L = 0. \tag{2.2}$$
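The projector properties behind Eqs. (2.1) and (2.2) can be checked numerically. The following is a minimal sketch, assuming the Weyl (chiral) representation in which $\gamma^5 = \mathrm{diag}(-1,-1,+1,+1)$; the helper names and the test spinor are illustrative and not part of the text.

```python
# Chirality projectors P_L, P_R built from gamma^5 (illustrative sketch).
# Assumption: Weyl (chiral) representation, gamma^5 = diag(-1, -1, +1, +1).

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(m, psi):
    """Apply a matrix to a column vector."""
    return [sum(m[i][k] * psi[k] for k in range(len(psi))) for i in range(len(m))]

N = 4
IDENTITY = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
ZERO = [[0.0] * N for _ in range(N)]
EIGENVALUES = [-1.0, -1.0, 1.0, 1.0]
GAMMA5 = [[EIGENVALUES[i] if i == j else 0.0 for j in range(N)] for i in range(N)]

# Eq. (2.1): P_L = (1 - gamma^5)/2, P_R = (1 + gamma^5)/2
P_L = [[0.5 * (IDENTITY[i][j] - GAMMA5[i][j]) for j in range(N)] for i in range(N)]
P_R = [[0.5 * (IDENTITY[i][j] + GAMMA5[i][j]) for j in range(N)] for i in range(N)]

psi = [1 + 2j, -0.5j, 3.0, 0.25 - 1j]  # arbitrary four-component spinor
psi_L = apply(P_L, psi)
psi_R = apply(P_R, psi)

# Eq. (2.2): the opposite projector annihilates each chiral component
print(apply(P_R, psi_L), apply(P_L, psi_R))
```

In this representation the left-handed component occupies the upper two entries of the spinor and the right-handed component the lower two, which makes the decomposition $\psi = \psi_L + \psi_R$ explicit.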

Chirality is one of the central concepts in the SM, as it is crucial to describe one of its underlying symmetries, $SU(2)_L$, which only acts on left-handed fermions. In the following, we will describe how $SU(2)_L$ and, more generally, any symmetry transformation can be interpreted as a fundamental interaction between the elementary particles of the model.

2.2 Symmetries and Gauge Bosons

Consider a Dirac field $\psi(x)$ that exists within a theory that is invariant under the following set of transformations of a local symmetry group,

$$\psi(x) \to \psi'(x) = e^{i T^a \alpha^a(x)}\,\psi(x). \tag{2.3}$$

Here $T^a$ are the generators of the symmetry group and $\alpha^a$ is a set of arbitrary phases that can depend on the space-time coordinates, $x$. There must then be a Lagrangian describing the theory that is also invariant under this transformation. A good place to start is the free Lagrangian of a fermion field given by the Dirac equation,

$$\mathcal{L}_0 = i\bar{\psi}\gamma^\mu \partial_\mu \psi - m\bar{\psi}\psi. \tag{2.4}$$

The second term is clearly invariant under this transformation, and so is the first term as long as the phases of the transformation are independent of the space-time coordinates or, in other words, as long as this is a global transformation. However, for a local transformation, in which this dependence exists, the Lagrangian must be modified to ensure its invariance. This can be achieved by defining a new covariant derivative

$$D_\mu = \partial_\mu - i g T^a A^a_\mu(x) \tag{2.5}$$

and introducing the new gauge fields $A^a_\mu(x)$, which transform as

$$A^a_\mu(x) \to A'^a_\mu(x) = A^a_\mu(x) + \frac{1}{g}\,\partial_\mu \alpha^a(x) \tag{2.6}$$

where $g$ is a dimensionless free parameter of the theory (the coupling strength). The expression for the gauge fields is not unique, as the only requirement is that they transform in this form. The new covariant derivative transforms in the same way as the Dirac fields,

$$D_\mu \psi \to (D_\mu \psi)' = e^{i T^a \alpha^a(x)}\,(D_\mu \psi), \tag{2.7}$$

therefore leaving all the terms in the Lagrangian invariant. Once the covariant derivative is introduced, the modified Lagrangian is

$$\mathcal{L} = i\bar{\psi}\gamma^\mu D_\mu \psi - m\bar{\psi}\psi = i\bar{\psi}\gamma^\mu(\partial_\mu - i g T^a A^a_\mu)\psi - m\bar{\psi}\psi = \mathcal{L}_0 + g\,\bar{\psi}\gamma^\mu T^a A^a_\mu \psi. \tag{2.8}$$

It contains the original $\mathcal{L}_0$, which describes the kinematics of a free Dirac field, plus an extra term that describes the interaction, with strength $g$, of this field with the newly introduced gauge fields. At this point our Lagrangian is completely invariant under the symmetry transformation defined in Eq. (2.3). Nevertheless, we must finish the construction of our locally invariant Lagrangian by at least adding a kinetic energy term for the new fields $A^a_\mu$ that also preserves invariance under this transformation. We can use our previous knowledge of how the covariant derivative transforms to construct this term. The derivative transforms in the same way as our fields, and consequently the commutator $[D_\mu, D_\nu]$ acting on $\psi$ will transform in the same way, making the term $\bar{\psi}[D_\mu, D_\nu]\psi$ completely invariant. The commutator can be expressed as

$$[D_\mu, D_\nu] = -i g\,(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu)\,T^a = -i g F^a_{\mu\nu} T^a \tag{2.9}$$

where $f^{abc}$ are the structure constants of the symmetry group, which satisfy the Jacobi identity. Equation (2.9), however, contains no derivatives acting on $\psi$: it is a purely multiplicative factor, and thus its transformation property must hold independently of $\psi$. Consequently, the kinetic terms of $A^a_\mu$ can be built from combinations of $F^a_{\mu\nu}$. The only possibility (including terms of up to dimension 4) is $(F^a_{\mu\nu})^2$, so, with the conventional normalisation $-\tfrac{1}{4}$, the final expression for the most general Lagrangian invariant under the transformation described in Eq. (2.3) is

$$\mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \mathcal{L}_0 + g\,\bar{\psi} T^a \gamma^\mu A^a_\mu \psi. \tag{2.10}$$

Finally, using Noether's theorem we can associate this symmetry with a set of conserved currents

$$j^a_\mu = \bar{\psi}\gamma_\mu T^a \psi \tag{2.11}$$

and their associated charges

$$Q^a = \int j^a_0 \, d^3x. \tag{2.12}$$

We have seen how a symmetry in our theory leads directly to a fundamental interaction between the fermions of the system and the gauge bosons, and gives rise to a charge that describes the interaction strength up to the factor $g$ (the coupling strength). We are now ready to visit the complete set of symmetries that define the Standard Model and present all the characters that play a role in the theory.
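As a concrete check of the construction in Eqs. (2.5)–(2.7), the covariance of $D_\mu\psi$ can be verified explicitly in the simplest case. The derivation below is a sketch for an Abelian $U(1)$ group, an assumption made here for brevity: there is a single generator $T = 1$ and a single phase $\alpha(x)$, so no commutator terms appear.

```latex
\begin{align}
\psi' &= e^{i\alpha(x)}\psi, \qquad
A'_\mu = A_\mu + \frac{1}{g}\,\partial_\mu\alpha, \qquad
D_\mu = \partial_\mu - i g A_\mu, \\
(D_\mu\psi)' &= \left(\partial_\mu - i g A_\mu - i\,\partial_\mu\alpha\right)
                \left(e^{i\alpha}\psi\right) \\
             &= e^{i\alpha}\left(i\,\partial_\mu\alpha\,\psi + \partial_\mu\psi
                - i g A_\mu\psi - i\,\partial_\mu\alpha\,\psi\right) \\
             &= e^{i\alpha}\left(\partial_\mu - i g A_\mu\right)\psi
              = e^{i\alpha}\,D_\mu\psi .
\end{align}
```

The term generated by the derivative acting on the phase is cancelled exactly by the shift of the gauge field, which is why the gauge field must transform with the inhomogeneous term $\frac{1}{g}\partial_\mu\alpha$.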

2.3 The Standard Model

The SM is a gauge theory based on the symmetry group

$$G_{SM} = SU(3)_C \times SU(2)_L \times U(1)_Y \tag{2.13}$$

where $SU(3)_C$ describes the strong interaction through the symmetry of color (C), and $SU(2)_L \times U(1)_Y$ describes the weak and electromagnetic interactions, which are given by the weak isospin symmetry ($I_3$) and the symmetry of hypercharge (Y). We have previously seen that each of these symmetries introduces a set of gauge bosons that carry the interactions driven by these symmetries. There are 8 generators associated with $SU(3)_C$, $\lambda_a/2$ for $a = 1, 2, \dots, 8$,¹ with 8 associated gauge bosons $G^a_\mu$, also referred to as gluons. This also provides particles with a color charge with 8 different eigenvalues in the adjoint representation (or, in the fundamental representation, a combination of three color charges).

¹ In their fundamental representation, $\lambda_a$ correspond to the Gell-Mann matrices. In that representation the color charge of a particle can be described using the three base vectors $r = (1, 0, 0)$, $g = (0, 1, 0)$, $b = (0, 0, 1)$.
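The role of the generators $\lambda_a/2$ and of the structure constants $f^{abc}$ of Eq. (2.9) can be illustrated with the Gell-Mann matrices. The sketch below assumes the standard convention $[\lambda_a, \lambda_b] = 2i f^{abc}\lambda_c$ with $\mathrm{Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$, from which $f^{abc} = \mathrm{Tr}([\lambda_a,\lambda_b]\lambda_c)/(4i)$; the function names are illustrative.

```python
import math

S3 = 1 / math.sqrt(3)

# The eight Gell-Mann matrices lambda_1 ... lambda_8 (standard convention).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(3))

def structure_constant(a, b, c):
    """f^{abc} = Tr([l_a, l_b] l_c) / (4i), with 1-based indices a, b, c."""
    la, lb, lc = (GELL_MANN[i - 1] for i in (a, b, c))
    ab_c = trace(matmul(matmul(la, lb), lc))
    ba_c = trace(matmul(matmul(lb, la), lc))
    return ((ab_c - ba_c) / 4j).real

print(structure_constant(1, 2, 3))  # SU(2) subgroup: same value as the Levi-Civita symbol
print(structure_constant(1, 4, 7), structure_constant(4, 5, 8))
```

The first three matrices close among themselves into an $SU(2)$ subalgebra, which is why $f^{123}$ coincides with the Levi-Civita symbol that appears as the structure constant of the weak isospin group.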

$SU(2)_L$ characterizes the symmetry of weak isospin and has 3 associated generators, $I_i$ with $i = 1, 2, 3$. In their natural representation these can be expressed in terms of the Pauli matrices, $I_i = \sigma_i/2$, with three complementary gauge bosons, $W^i_\mu$. $U(1)_Y$, with only one generator $Y$ (the identity) and one gauge boson $B_\mu$, describes the symmetry of hypercharge. The two latter symmetry groups are connected through the process of electroweak symmetry breaking, from which emerge the physical weak interaction gauge bosons $W^\pm$ and $Z^0$ and the photon $\gamma$ that mediates electromagnetism. It is of crucial importance to note that the weak isospin symmetry acts only on left-handed fields and introduces only left-handed charged currents. Thus, the left and right components of our fermion fields must enter separately in the SM. One can split the two components of a Dirac field using the projectors described in Eq. (2.1). With this structure in mind, we can finally introduce the complete list of particles, which can be found in Table 2.1. It consists of $SU(2)_L$ quark doublets Q and singlets U and D, plus the lepton doublets L and singlets E. For each fermion three different generations exist; these share all the quantum numbers associated with all the symmetries and, in the SM, differ only through their masses. We have chosen to describe these in the $SU(2)_L$ basis to detail the chiral nature of the SM. However, they can also be expressed in the flavour basis, in which $f_l = P_L f$ and $f_r = P_R f$, with $f = u, c, t, d, s, b, e, \mu, \tau, \nu^e, \nu^\mu, \nu^\tau$, taking into account that $\nu^i_r = 0$. These, however, are not eigenstates of $SU(2)_L$. They could also be represented in the $SU(3)_C$ color basis to exhibit the triplet structure of the quarks Q, U and D, while the leptons L and E would show up as singlets, as they do not take part in strong interactions. A total of 12 gauge bosons, as described above, complete the gauge structure of the SM (a final boson, the Higgs, will be introduced in the following section and will complete the SM picture).

Table 2.1 List of fields that populate the Standard Model and their charges associated with all symmetries. The $SU(3)_C$ and $SU(2)_L$ columns give the multiplicity of each fermion field under that group, while the $U(1)_Y$ column gives the hypercharge of each fermion field. The doublet structure arises from expressing these fields in the $SU(2)_L$ fundamental representation, so $u_l$ and $d_l$ have $+1/2$ and $-1/2$ weak isospin respectively, and similarly for the remaining doublets. The triplet structure of the fundamental representation of $SU(3)_C$ is not explicitly indicated. The table also includes the list of gauge bosons and their associated symmetries; here N stands for the number of gauge bosons of each type

| Fermionic fields | Generations | $SU(3)_C$ | $SU(2)_L$ | $U(1)_Y$ | $Q_e$ |
|---|---|---|---|---|---|
| Q | $(u_l, d_l)$, $(c_l, s_l)$, $(t_l, b_l)$ | 3 | 2 | +1/3 | +2/3, −1/3 |
| U | $u_r$, $c_r$, $t_r$ | 3 | 1 | +4/3 | +2/3 |
| D | $d_r$, $s_r$, $b_r$ | 3 | 1 | −2/3 | −1/3 |
| L | $(\nu^e_l, e_l)$, $(\nu^\mu_l, \mu_l)$, $(\nu^\tau_l, \tau_l)$ | 1 | 2 | −1 | 0, −1 |
| E | $e_r$, $\mu_r$, $\tau_r$ | 1 | 1 | −2 | −1 |

| Bosonic fields | Associated symmetry | N |
|---|---|---|
| $G^a_\mu$ | $SU(3)_C$ | 8 |
| $W^i_\mu$ | $SU(2)_L$ | 3 |
| $B_\mu$ | $U(1)_Y$ | 1 |

Following the same recipe as in the previous section, we can now build the SM Lagrangian. This will include the gauge boson terms given by

$$\mathcal{L}_{Bos} = -\frac{1}{4}\left(G^a_{\mu\nu} G^{a,\mu\nu} + W^i_{\mu\nu} W^{i,\mu\nu} + B_{\mu\nu} B^{\mu\nu}\right) \tag{2.14}$$

with

$$\begin{aligned}
G^a_{\mu\nu} &= \partial_\mu G^a_\nu - \partial_\nu G^a_\mu + g_s f^{abc} G^b_\mu G^c_\nu \\
W^i_{\mu\nu} &= \partial_\mu W^i_\nu - \partial_\nu W^i_\mu + g_w \epsilon^{ijk} W^j_\mu W^k_\nu \\
B_{\mu\nu} &= \partial_\mu B_\nu - \partial_\nu B_\mu
\end{aligned} \tag{2.15}$$

where $g_s$ and $g_w$ are the strong and weak couplings respectively, which govern the strength of these interactions, and $\epsilon^{ijk}$ are the structure constants of $SU(2)_L$, given by the Levi-Civita symbol. Since $SU(3)_C$ and $SU(2)_L$ are non-Abelian groups, Eq. (2.14) includes not only the free terms for the gauge bosons but also triple and quadruple boson interaction terms, which have the following form

$$i g\,\mathrm{Tr}\left\{(\partial_\nu V_\mu - \partial_\mu V_\nu)[V^\mu, V^\nu]\right\}, \qquad \frac{g^2}{2}\,\mathrm{Tr}\left\{[V_\mu, V_\nu][V^\nu, V^\mu]\right\}. \tag{2.16}$$

The interaction terms for the fermions can be obtained by simply inserting the covariant derivative into the free fermion Lagrangian, which for the SM fermions reads

$$\mathcal{L}_F = \bar{Q}\, i\slashed{D}\, Q + \bar{U}\, i\slashed{D}\, U + \bar{D}\, i\slashed{D}\, D + \bar{L}\, i\slashed{D}\, L + \bar{E}\, i\slashed{D}\, E. \tag{2.17}$$

The covariant derivative must now include three new gauge boson terms to preserve invariance under the complete symmetry group, and is

$$D_\mu \psi = \left(\partial_\mu - i\frac{g_s}{2}\lambda_a G^a_\mu - i\frac{g_w}{2}\sigma_i W^i_\mu - i\frac{g}{2} Y_q B_\mu\right)\psi. \tag{2.18}$$

Including both parts and removing the vanishing terms, which can be identified as the singlets under each of the symmetries, the Lagrangian takes the following form

$$\begin{aligned}
\mathcal{L}_{SM} = &-\frac{1}{4} G^a_{\mu\nu} G^{a,\mu\nu} - \frac{1}{4} W^i_{\mu\nu} W^{i,\mu\nu} - \frac{1}{4} B_{\mu\nu} B^{\mu\nu} \\
&+ \bar{Q}\, i\gamma^\mu\left(\partial_\mu - i\frac{g_s}{2}\lambda_a G^a_\mu - i\frac{g_w}{2}\sigma^i W^i_\mu - i\frac{g}{2} Y_q B_\mu\right) Q \\
&+ \bar{U}\, i\gamma^\mu\left(\partial_\mu - i\frac{g_s}{2}\lambda_a G^a_\mu - i\frac{g}{2} Y_u B_\mu\right) U \\
&+ \bar{D}\, i\gamma^\mu\left(\partial_\mu - i\frac{g_s}{2}\lambda_a G^a_\mu - i\frac{g}{2} Y_d B_\mu\right) D \\
&+ \bar{L}\, i\gamma^\mu\left(\partial_\mu - i\frac{g_w}{2}\sigma^i W^i_\mu - i\frac{g}{2} Y_l B_\mu\right) L \\
&+ \bar{E}\, i\gamma^\mu\left(\partial_\mu - i\frac{g}{2} Y_e B_\mu\right) E.
\end{aligned} \tag{2.19}$$

This Lagrangian describes a gauge theory invariant under the SM symmetry group for massless bosons and fermions. In the example of the previous section we trivially included the fermion mass terms $m\bar{\psi}\psi$, as they preserved the invariance of the Lagrangian. These terms, however, are not invariant under $SU(2)_L$, which only affects left-handed fermions. The mass terms for the boson fields must also be introduced very carefully to preserve this invariance. In the next section we will obtain these terms through spontaneous symmetry breaking using the Higgs mechanism. This will not only allow us to introduce mass terms for all fields, but will also allow us to give a physical interpretation of the gauge fields $W^i_\mu$ and $B_\mu$ by relating them to the weak bosons $W^\pm_\mu$, $Z^0_\mu$ and the photon.

2.4 Spontaneous Symmetry Breaking

Gauge invariant mass terms for the boson fields can be added by introducing a new scalar field, the Higgs, with a non-vanishing vacuum expectation value [1, 2]. That is, we can find a Lagrangian that maintains the symmetry of the symmetry group but has a vacuum state that is not invariant under the symmetry transformation (and hence spontaneously breaks one of the symmetries). Let us start with the SM Lagrangian including only the $SU(2)_L$ and $U(1)_Y$ terms of an $SU(2)_L$ scalar doublet,²

$$\mathcal{L} = -\frac{1}{4} W^i_{\mu\nu} W^{i,\mu\nu} - \frac{1}{4} B_{\mu\nu} B^{\mu\nu} + (D_\mu \Phi)^\dagger (D^\mu \Phi) - V(\Phi) \tag{2.20}$$

with

$$\Phi = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \tag{2.21}$$

² We can define this new scalar field as a singlet under $SU(3)_C$, so all interaction terms with $G^a_\mu$ vanish. This enormously simplifies our calculation, and thus the $SU(3)_C$ terms are not included in this section.

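being an $SU(2)_L$ doublet. The most general shape of $V(\Phi)$ that includes terms up to $\Phi^4$ is

$$V(\Phi) = -\mu^2 (\Phi^\dagger \Phi) + \lambda (\Phi^\dagger \Phi)^2 \tag{2.22}$$

where $\lambda > 0$. If $\mu^2 > 0$ this potential presents a local maximum at $\Phi = 0$ and a local minimum at

$$\sqrt{\Phi^\dagger \Phi} = \sqrt{\frac{\mu^2}{2\lambda}} \equiv \frac{v}{\sqrt{2}}. \tag{2.23}$$

One possibility is to select

$$\Phi_0 = \begin{pmatrix} 0 \\ \frac{v}{\sqrt{2}} \end{pmatrix} \tag{2.24}$$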

where $\Phi_0$ is the vacuum expectation value of $\Phi$, but infinitely many other possibilities can be reached by applying $SU(2)_L$ rotations. This choice is especially interesting as it is invariant under the transformation

$$\phi_0 \to \phi_0' = e^{i\alpha Q}\phi_0, \quad \text{with } Q = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{2.25}$$

This generator $Q$ can be described in terms of the $SU(2)_L$ and $U(1)_Y$ generators as

$$Q = \frac{\sigma_3}{2} + \frac{Y}{2}. \tag{2.26}$$
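The relation of Eq. (2.26) can be made concrete by evaluating $Q = T^3 + Y/2$ for each field. The sketch below uses exact fractions; the weak isospin and hypercharge assignments are the standard ones in this convention and are listed here as assumptions, including $Y = 1$ for the Higgs doublet and $Y = -2$ for the right-handed charged leptons.

```python
from fractions import Fraction as F

def electric_charge(t3, y):
    """Eq. (2.26): Q = T3 + Y/2."""
    return t3 + y / 2

# (T3, Y, expected electric charge) for each field; assumed standard assignments.
FIELDS = {
    "u_l": (F(1, 2), F(1, 3), F(2, 3)),
    "d_l": (F(-1, 2), F(1, 3), F(-1, 3)),
    "u_r": (F(0), F(4, 3), F(2, 3)),
    "d_r": (F(0), F(-2, 3), F(-1, 3)),
    "nu_l": (F(1, 2), F(-1), F(0)),
    "e_l": (F(-1, 2), F(-1), F(-1)),
    "e_r": (F(0), F(-2), F(-1)),
    "higgs_upper": (F(1, 2), F(1), F(1)),
    "higgs_lower": (F(-1, 2), F(1), F(0)),  # component carrying the vacuum value
}

charges = {name: electric_charge(t3, y) for name, (t3, y, _) in FIELDS.items()}
for name, q in charges.items():
    print(f"{name:12s} Q = {q}")
```

The lower Higgs component has $Q = 0$, which is the statement that the vacuum of Eq. (2.24) is left invariant by the transformation generated by $Q$.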

This establishes that $Q$ generates a new conserved quantity for our fields, whose associated charge is the electric charge.³ The non-vanishing vacuum expectation value of $\Phi$ introduces new terms into its covariant derivative, which can be interpreted as the gauge boson masses. In the $SU(2)_L$ basis this can be written as

$$D_\mu \Phi = \begin{pmatrix} -\dfrac{i g_w}{2\sqrt{2}}\,(W^1_\mu - i W^2_\mu)(v + \chi) \\[2ex] -\dfrac{i}{2\sqrt{2}}\,(g B_\mu - g_w W^3_\mu)(v + \chi) + \dfrac{1}{\sqrt{2}}\,\partial_\mu \chi \end{pmatrix}. \tag{2.27}$$

Here we have used

$$\Phi = \begin{pmatrix} 0 \\ \frac{v}{\sqrt{2}} + \frac{\chi}{\sqrt{2}} \end{pmatrix} \tag{2.28}$$

to define the field as a perturbation of the vacuum state $\phi_0$. Perturbations perpendicular to $\phi_0$ are not considered here. These correspond to the so-called Goldstone bosons, which can be absorbed by rotations in $SU(2)_L$ space. At this point, it is convenient to introduce the fields

$$\begin{aligned}
W^\pm_\mu &= \frac{1}{\sqrt{2}}\,(W^1_\mu \mp i W^2_\mu) \\
Z^0_\mu &= \frac{1}{\sqrt{g^2 + g_w^2}}\,(g_w W^3_\mu - g B_\mu) \\
A^0_\mu &= \frac{1}{\sqrt{g^2 + g_w^2}}\,(g_w B_\mu + g W^3_\mu)
\end{aligned} \tag{2.29}$$

³ From an empirical point of view the values of hypercharge for each field are given by this relation, instead of the other way around.

This new basis is an orthogonal transformation of the fields $W^i_\mu$ and $B_\mu$, which allows us to rewrite the first two self-interacting terms of the Lagrangian defined in Eq. (2.20) in terms of these new fields without modifying the underlying structure. Moreover, in this new basis the third term of the Lagrangian becomes

$$(D_\mu \Phi)^\dagger D^\mu \Phi = \frac{1}{2}(\partial_\mu \chi)^2 + \frac{g_w^2 v^2}{4}\, W^+_\mu W^{-,\mu} + \frac{1}{2}\,\frac{(g_w^2 + g^2)\,v^2}{4}\, Z^0_\mu Z^{0,\mu} + \mathrm{Int.} \tag{2.30}$$

where Int. stands for all interaction terms between the Higgs field $\chi$ and the gauge bosons. We can directly associate these terms with the masses of the boson fields,

$$m_W = \frac{1}{2}\, v\, g_w, \qquad m_Z = \frac{1}{2}\, v \sqrt{g_w^2 + g^2}, \qquad m_A = 0. \tag{2.31}$$

Finally, expanding the potential $V(\Phi)$ one obtains

$$V(\Phi) = -\frac{\lambda v^4}{4} + \lambda v^2 \chi^2 + \lambda v \chi^3 + \frac{\lambda}{4}\chi^4 \tag{2.32}$$

where the first term corresponds to the vacuum energy, that is, the minimum of the system. The second term can be identified as the mass term of the Higgs field, which also arises naturally from this symmetry breaking mechanism,

$$m_\chi = \sqrt{2\lambda}\, v. \tag{2.33}$$
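The expansion of Eq. (2.32) can be cross-checked numerically against the original potential of Eq. (2.22). The sketch below assumes the parametrization of Eq. (2.28), so that $\Phi^\dagger\Phi = (v+\chi)^2/2$, and uses $\mu^2 = \lambda v^2$ as implied by Eq. (2.23); the numerical inputs are arbitrary illustrative values.

```python
import math

def potential(chi, lam, v):
    """Eq. (2.22) evaluated on Phi = (0, (v + chi)/sqrt(2)), with mu^2 = lam * v^2."""
    mu2 = lam * v**2
    phi_sq = (v + chi) ** 2 / 2
    return -mu2 * phi_sq + lam * phi_sq**2

def expansion(chi, lam, v):
    """Right-hand side of Eq. (2.32)."""
    return (-lam * v**4 / 4 + lam * v**2 * chi**2
            + lam * v * chi**3 + lam * chi**4 / 4)

def higgs_mass(lam, v):
    """Eq. (2.33): the chi^2 coefficient lam * v^2 equals m^2 / 2."""
    return math.sqrt(2 * lam) * v

lam, v = 0.13, 2.0  # arbitrary illustrative values
for chi in (-1.5, -0.3, 0.0, 0.7, 2.2):
    print(chi, potential(chi, lam, v), expansion(chi, lam, v))
```

No term linear in $\chi$ survives, which confirms that $\chi = 0$ sits at the minimum of the potential.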

The last two terms in the potential expansion correspond to 3-point and 4-point Higgs boson self-interaction terms. We have seen how this mechanism generates mass terms for the gauge bosons of the SM. However, its implications are even more powerful than that. The chiral nature of the SM also forbids fermion mass terms. Those would have the structure $-m_f(\bar{\psi}^f_L \psi^f_R + \bar{\psi}^f_R \psi^f_L)$ in the flavour basis, the index $f$ standing for each possible fermion field. Fortunately, the same non-vanishing Higgs field can also be used to generate fermion masses. We can introduce gauge invariant interaction terms between the fermion and the Higgs fields, also called the Yukawa Lagrangian, as

$$\mathcal{L}_Y = -\lambda_i\, \bar{\psi}^i_L \Phi\, \psi^i_R \tag{2.34}$$

where $\lambda_i$ is a free parameter of the theory, which is different for each fermion field. When considering the non-zero vacuum expectation value in the Higgs field parametrization used in Eq. (2.28), this new interaction term leads to

$$\mathcal{L}_Y = -\frac{1}{\sqrt{2}}\lambda_i v\, \bar{\psi}^i_L \psi^i_R - \frac{1}{\sqrt{2}}\lambda_i \chi\, \bar{\psi}^i_L \psi^i_R. \tag{2.35}$$

The non-zero vacuum expectation value $v$ naturally generates mass terms for all the fermion fields, with masses proportional to it,

$$m_f = \frac{\lambda_f v}{\sqrt{2}} \tag{2.36}$$

which is proportional to the interaction strength of the fermions with the Higgs field, given by the second term in Eq. (2.35). Since these terms arise from Yukawa interactions with couplings $\lambda_i$, there are no theoretical constraints on these values, i.e. they are free parameters of the theory and have to be determined empirically. This is in contrast with the gauge bosons, whose masses are strictly linked to the vacuum expectation value $v$ and to the coupling constants $g_w$ and $g$ of Eq. (2.31). At this point, we have built all the components of a Lagrangian that contains all the essential elements of the SM, including mass terms for all the fields except the neutrinos. Furthermore, the addition of the Higgs field has allowed us to obtain a physical interpretation of the $SU(2)_L$ and $U(1)_Y$ symmetries in terms of the electromagnetic and weak interactions. As a final ingredient for our model, in the next section we will see how fermion mixing terms arise from an intrinsic difference between the flavour and mass eigenstates. This will enable mixing between fermion generations to appear in the weak charged currents.
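Equation (2.36) can be inverted to read off the Yukawa coupling of each fermion from its mass, $\lambda_f = \sqrt{2}\, m_f / v$. The sketch below is illustrative only: the value $v \approx 246.22$ GeV is an assumption that is not stated in the text, and the masses are central values in MeV as quoted in Table 2.2.

```python
import math

V_MEV = 246.22e3  # assumed vacuum expectation value in MeV (not given in the text)

# Central mass values in MeV, as quoted in Table 2.2.
MASSES_MEV = {
    "t": 172760.0,
    "b": 4180.0,
    "c": 1270.0,
    "s": 93.0,
    "d": 4.67,
    "u": 2.16,
    "e": 0.5109989461,
}

def yukawa_coupling(mass_mev, v_mev=V_MEV):
    """Invert Eq. (2.36): lambda_f = sqrt(2) * m_f / v."""
    return math.sqrt(2) * mass_mev / v_mev

couplings = {name: yukawa_coupling(m) for name, m in MASSES_MEV.items()}
for name, coupling in couplings.items():
    print(f"lambda_{name} = {coupling:.3e}")
```

Under this assumption the top quark coupling comes out close to one, while the electron coupling is smaller by more than five orders of magnitude, which illustrates how widely the free parameters $\lambda_i$ are spread.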

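2.5 The CKM Matrix

In order to see this mixing, let us take a step back and recover the weak fermion currents described in Eq. (2.19), but now in terms of the new basis defined by the gauge bosons $W^\pm_\mu$, $Z^0_\mu$ and $A_\mu$,

$$\begin{aligned}
\mathcal{L}_{EW} &= \bar{\psi}\gamma^\mu\left(-i\frac{g_w}{2}\sigma^i W^i_\mu - i\frac{g}{2} Y_q B_\mu\right)\psi \\
&= -\frac{g_w}{\sqrt{2}}\left(\bar{\psi}_1 \gamma^\mu \psi_2 W^+_\mu + \bar{\psi}_2 \gamma^\mu \psi_1 W^-_\mu\right) - \bar{\psi}_i\left[Z_\mu \frac{g_w}{\cos\theta_w}\left(T^3 - \sin^2\theta_w Q_e\right) + e Q_e A_\mu\right]\gamma^\mu \psi_i
\end{aligned} \tag{2.37}$$

where the fermion fields $\psi$ are expressed as $SU(2)_L$ doublets, $\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$. Here $\theta_w$ is the Weinberg mixing angle, which is defined by the relations

$$\cos\theta_w = \frac{g_w}{\sqrt{g_w^2 + g^2}}, \qquad \sin\theta_w = \frac{g}{\sqrt{g_w^2 + g^2}}, \tag{2.38}$$

where $Q_e$ is the electric charge of the fermion and $e \equiv g_w \sin\theta_w$ is the electromagnetic coupling. Let us pay special attention to the charged weak currents in the quark sector Q, which expressed in the flavour basis are

$$\frac{g_w}{\sqrt{2}}\,\left(\bar{u}_l, \bar{c}_l, \bar{t}_l\right)\gamma^\mu \begin{pmatrix} d_l \\ s_l \\ b_l \end{pmatrix} W^+_\mu + \mathrm{h.c.} \tag{2.39}$$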

This term describes the interaction between up-type quarks, down-type quarks and the $W^\pm_\mu$ bosons, but does not include any mixing among the different flavours. Nonetheless, the mass eigenstates are not required to be the same as the flavour eigenstates. Independently of how they are related, one can move from one set to the other by a rotation in flavour space,

$$u'^i_l = U^{ij}_{U_l} u^j_l, \qquad d'^i_l = U^{ij}_{D_l} d^j_l, \tag{2.40}$$

where we have used the notation $u^i_l = (u_l, c_l, t_l)$ and $d^i_l = (d_l, s_l, b_l)$. $U_U$ and $U_D$ are rotation matrices that relate the two bases. We want to choose rotation matrices that leave the mass terms diagonal, i.e. such that the new $u'^i$, $d'^i$ states correspond to the mass eigenstates. Since the mass terms contain both left- and right-handed components, this rotation fixes the values of $U_{U_l}$ with respect to $U_{D_l}$, which affects the weak charged currents. Expressing these in the new basis of mass eigenstates one gets

$$\frac{g_w}{\sqrt{2}}\,\bar{u}'^i_l \gamma^\mu\, U^{ij\dagger}_{U_l} U^{jk}_{D_l}\, d'^k_l\, W^+_\mu + \mathrm{h.c.} \tag{2.41}$$

The matrix $U^{ij\dagger}_U U^{jk}_D$ is completely fixed by the rotations defined to diagonalise the mass terms, and thus can be non-diagonal, introducing mixing between the quark mass eigenstates. This is in fact the Cabibbo-Kobayashi-Maskawa (CKM) matrix, $V^{ik}_{CKM} = U^{ij\dagger}_{U_l} U^{jk}_{D_l}$ [3, 4]. The elements of $V_{CKM}$ are, in general, complex, and thus it has a total of 18 real parameters. However, these can be reduced by considering that $V_{CKM}$ arises from a unitary change of basis and thus must satisfy the unitarity relation $V_{CKM} V^\dagger_{CKM} = 1$, bringing the number of independent parameters down to 9. All phases except one can also be removed by applying global phase rotations to the six quark fields.⁴ In conclusion, the mixing matrix $V_{CKM}$ can be described by three rotation angles in flavour space and a complex phase. Using the Euler notation to describe rotations in three-dimensional space, this is

$$V_{CKM} = \begin{pmatrix}
c_{12} c_{13} & s_{12} c_{13} & s_{13} e^{-i\delta} \\
-s_{12} c_{23} - c_{12} s_{23} s_{13} e^{i\delta} & c_{12} c_{23} - s_{12} s_{23} s_{13} e^{i\delta} & s_{23} c_{13} \\
s_{12} s_{23} - c_{12} c_{23} s_{13} e^{i\delta} & -c_{12} s_{23} - s_{12} c_{23} s_{13} e^{i\delta} & c_{23} c_{13}
\end{pmatrix}. \tag{2.42}$$

Here $c_{ij}$ and $s_{ij}$ correspond to $\cos\theta_{ij}$ and $\sin\theta_{ij}$ respectively, $\theta_{ij}$ being the quark mixing rotation angles. The terms in $V_{CKM}$ scale the weak interactions between quarks of different generations, whose overall strength is set by the coupling constant $g_w$. Thus, it is more common to write this in terms of coupling constants for each possible charged current as

$$\bar{u}^i_l \gamma^\mu \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} d^k_l\, W^+_\mu + \mathrm{h.c.} \tag{2.43}$$

It is worth mentioning that a non-zero complex phase introduces CP violation terms in flavour mixing interactions. The "amount" of CP violation introduced by $V_{CKM}$ can be related to the commutator relation given by

$$CP_{SM} = 2J\,(m_t^2 - m_c^2)(m_t^2 - m_u^2)(m_c^2 - m_u^2)(m_b^2 - m_s^2)(m_b^2 - m_d^2)(m_s^2 - m_d^2) \tag{2.44}$$

where $J$ is the Jarlskog parameter

$$J = \mathrm{Im}(V_{ud} V_{cs} V^*_{us} V^*_{cd}) = c_{12} c_{23} c_{13}^2\, s_{12} s_{13} s_{23} \sin\delta. \tag{2.45}$$
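The standard parametrization can be explored numerically. The sketch below builds $V_{CKM}$ from three angles and a phase, checks its unitarity, and compares the two expressions for the Jarlskog parameter in Eq. (2.45). The input values are assumptions chosen to be roughly the size of the measured mixing parameters; they are illustrative and not taken from the text.

```python
import cmath
import math

def ckm(s12, s23, s13, delta):
    """Standard parametrization of V_CKM from sines of the angles and the phase."""
    c12, c23, c13 = (math.sqrt(1 - s * s) for s in (s12, s23, s13))
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * e.conjugate()],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]

def unitarity_defect(v):
    """Largest |(V V^dagger - 1)_ij| over all matrix entries."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(complex(v[i][k]) * complex(v[j][k]).conjugate() for k in range(3))
            worst = max(worst, abs(entry - (1 if i == j else 0)))
    return worst

def jarlskog(v):
    """J = Im(V_ud V_cs V_us* V_cd*)."""
    return (v[0][0] * v[1][1] * complex(v[0][1]).conjugate()
            * complex(v[1][0]).conjugate()).imag

# Illustrative inputs (assumptions): sines of the mixing angles and the phase in rad.
S12, S23, S13, DELTA = 0.2250, 0.0418, 0.0037, 1.14

V = ckm(S12, S23, S13, DELTA)
c12, c23, c13 = (math.sqrt(1 - s * s) for s in (S12, S23, S13))
J_closed_form = c12 * c23 * c13**2 * S12 * S13 * S23 * math.sin(DELTA)

print("unitarity defect:", unitarity_defect(V))
print("J from elements :", jarlskog(V))
print("J closed form   :", J_closed_form)
```

Setting the phase to zero makes every element real and the Jarlskog parameter vanish, which is the sense in which a single phase carries all the CP violation of the quark sector.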

Another widely used notation to describe $V_{CKM}$ arises from the experimental fact that the couplings between quarks of the same generation are much larger than those between quarks of different generations. Thus, it is convenient to write $V_{CKM}$ as a perturbative expansion in the sine of the mixing angle $s_{12}$, using the Wolfenstein parametrization [5],

$$V_{CKM} = \begin{pmatrix}
1 - \frac{\lambda^2}{2} & \lambda & A\lambda^3(\rho - i\eta) \\
-\lambda & 1 - \frac{\lambda^2}{2} & A\lambda^2 \\
A\lambda^3(1 - \rho - i\eta) & -A\lambda^2 & 1
\end{pmatrix} + O(\lambda^4) \tag{2.46}$$

which results from defining the parameters

$$\lambda = s_{12}, \qquad A = \frac{s_{23}}{s_{12}^2}, \qquad \rho = \frac{s_{13}\cos\delta}{s_{12} s_{23}}, \qquad \eta = \frac{s_{13}\sin\delta}{s_{12} s_{23}}. \tag{2.47}$$

The expression for the Jarlskog parameter in this perturbative expansion is

$$J = A^2 \lambda^6 \eta\,(1 - \lambda^2/2) + O(\lambda^{10}). \tag{2.48}$$

⁴ $V_{CKM}$ is given by the relation between the flavour and mass eigenstates, but a global rotation of both can be chosen to reduce all phases except one to zero.
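The quality of the Wolfenstein expansion can be gauged numerically. The sketch below computes the parameters of Eq. (2.47) from a set of mixing angles, builds the truncated matrix of Eq. (2.46), and compares Eq. (2.48) with the exact Jarlskog parameter of Eq. (2.45). The input angles are the same kind of illustrative assumptions as before and are not taken from the text.

```python
import math

# Illustrative inputs (assumptions): sines of the mixing angles and the phase in rad.
S12, S23, S13, DELTA = 0.2250, 0.0418, 0.0037, 1.14

# Eq. (2.47)
lam = S12
A = S23 / S12**2
rho = S13 * math.cos(DELTA) / (S12 * S23)
eta = S13 * math.sin(DELTA) / (S12 * S23)

def wolfenstein(lam, A, rho, eta):
    """Eq. (2.46), truncated before the O(lambda^4) terms."""
    return [
        [1 - lam**2 / 2, lam, A * lam**3 * (rho - 1j * eta)],
        [-lam, 1 - lam**2 / 2, A * lam**2],
        [A * lam**3 * (1 - rho - 1j * eta), -A * lam**2, 1],
    ]

def unitarity_defect(v):
    """Largest |(V V^dagger - 1)_ij| over all matrix entries."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(complex(v[i][k]) * complex(v[j][k]).conjugate() for k in range(3))
            worst = max(worst, abs(entry - (1 if i == j else 0)))
    return worst

V_wolf = wolfenstein(lam, A, rho, eta)
defect = unitarity_defect(V_wolf)

c12, c23, c13 = (math.sqrt(1 - s * s) for s in (S12, S23, S13))
J_exact = c12 * c23 * c13**2 * S12 * S13 * S23 * math.sin(DELTA)  # Eq. (2.45)
J_wolf = A**2 * lam**6 * eta * (1 - lam**2 / 2)                   # Eq. (2.48)

print("lambda, A, rho, eta:", lam, A, rho, eta)
print("unitarity defect / lambda^4:", defect / lam**4)
print("J_wolf / J_exact:", J_wolf / J_exact)
```

The truncated matrix is unitary only up to terms of order $\lambda^4$, which is the expected size of the neglected corrections.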

A last widely used notation for the CKM mixing matrix arises from the unitary nature of $V_{CKM}$. In terms of the mixing couplings introduced in Eq. (2.43), the following 9 relations can be written

$$\sum_{k=1}^{3} V^*_{ki} V_{kj} = \delta_{ij} \tag{2.49}$$

where $\delta_{ij}$ is the Kronecker delta. One of these equations is

$$V_{ud} V^*_{ub} + V_{cd} V^*_{cb} + V_{td} V^*_{tb} = 0. \tag{2.50}$$

This can be interpreted as a triangle in the complex plane, as represented in Fig. 2.1, where all sides have been normalized by $V_{cd} V^*_{cb}$. The internal angles of this triangle are

$$\alpha \equiv \arg\left(-\frac{V_{td} V^*_{tb}}{V_{ud} V^*_{ub}}\right), \qquad \beta \equiv \arg\left(-\frac{V_{cd} V^*_{cb}}{V_{td} V^*_{tb}}\right), \qquad \gamma \equiv \arg\left(-\frac{V_{ud} V^*_{ub}}{V_{cd} V^*_{cb}}\right). \tag{2.51}$$

This notation is particularly remarkable, as the area of the triangle before the normalization is exactly half the Jarlskog parameter, giving a geometric interpretation of the amount of CP violation in the Standard Model. Moreover, this representation directly relates the complex phase $\delta$ to one of the angles of the unitarity triangle, $\gamma$,

$$\gamma = \delta + O(\lambda^4). \tag{2.52}$$
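The geometry of the unitarity triangle can be verified directly from the matrix elements. The sketch below rebuilds $V_{CKM}$ in the standard parametrization, forms the three sides of Eq. (2.50), and evaluates the angles of Eq. (2.51) together with the area of the unnormalized triangle. The input angles are illustrative assumptions and are not taken from the text.

```python
import cmath
import math

def ckm(s12, s23, s13, delta):
    """Standard parametrization of V_CKM from sines of the angles and the phase."""
    c12, c23, c13 = (math.sqrt(1 - s * s) for s in (s12, s23, s13))
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * e.conjugate()],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]

# Illustrative inputs (assumptions): sines of the mixing angles and the phase in rad.
S12, S23, S13, DELTA = 0.2250, 0.0418, 0.0037, 1.14
V = ckm(S12, S23, S13, DELTA)

# Sides of the triangle, Eq. (2.50): z_u + z_c + z_t = 0
z_u = V[0][0] * complex(V[0][2]).conjugate()  # V_ud V_ub*
z_c = V[1][0] * complex(V[1][2]).conjugate()  # V_cd V_cb*
z_t = V[2][0] * complex(V[2][2]).conjugate()  # V_td V_tb*

# Internal angles, Eq. (2.51)
alpha = cmath.phase(-z_t / z_u)
beta = cmath.phase(-z_c / z_t)
gamma = cmath.phase(-z_u / z_c)

# Area of the unnormalized triangle spanned by two of its sides
area = 0.5 * abs((z_u * z_c.conjugate()).imag)

c12, c23, c13 = (math.sqrt(1 - s * s) for s in (S12, S23, S13))
J = c12 * c23 * c13**2 * S12 * S13 * S23 * math.sin(DELTA)  # Eq. (2.45)

print("closure:", abs(z_u + z_c + z_t))
print("alpha + beta + gamma - pi:", alpha + beta + gamma - math.pi)
print("gamma - delta:", gamma - DELTA)
print("area / J:", area / J)
```

The three angles add up to $\pi$ only because the matrix is exactly unitary; a measured triangle that failed to close would signal mixing not accounted for by the three-generation matrix.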

Fig. 2.1 Triangle in the complex plane that arises from the unitarity relation of $V_{CKM}$

Fig. 2.2 Box (a) and Penguin (b) diagrams that describe a flavour changing quark transition through a neutral current mediated by two oppositely charged weak bosons. i, j, k and l correspond to flavour indices

It is worth noting that this derivation does not introduce any flavour mixing in the neutral currents at tree level, which are generated by $Z_\mu$ and $A_\mu$, because $U^\dagger_U U_U$ and $U^\dagger_D U_D$ are completely diagonal by definition. Flavour changing neutral currents (FCNC) can still be obtained through one-loop interactions involving two charged current vertices, as shown in Fig. 2.2, also called box and penguin diagrams. This makes $V_{CKM}$ the only source of flavour mixing and CP violation in the Standard Model, offering a great framework for searches beyond it. Deviations of CP violation measurements from the values expected from the $V_{CKM}$ parameters would indicate the existence of new physics beyond the Standard Model (BSM). Moreover, the closure of the CKM unitarity triangle is a crucial test of the three-generation model of the SM. An unclosed triangle would indicate unaccounted mixing between the mass eigenstates and could suggest the presence of a fourth quark generation, among other possible models. A similar strategy could be applied to the lepton sector. However, the absence of right-handed neutrinos leaves the choice of the rotation $U_{\nu_l}$ completely free, so that $U^\dagger_{e_l} U_{\nu_l}$ is allowed to be completely diagonal. In other words, a rotation in the flavour space of the neutrino fields is always possible such that the flavour eigenstates coincide with the mass eigenstates. This, however, would change with the addition of mass terms for the neutrinos. As we have seen in Eq. (2.35), one way to obtain these in the SM is through the Higgs mechanism and the addition of a Yukawa term

$$\mathcal{L}_Y = -\lambda_i\, \bar{\nu}^i_l \Phi\, \nu^i_r \tag{2.53}$$

with $i = e, \mu, \tau$. This would force the addition of right-handed neutrinos to the SM, which would constrain the permitted rotations $U_{\nu_l}$, naturally introducing flavour mixing in the lepton sector and the appearance of a matrix analogous to $V_{CKM}$, the so-called Pontecorvo-Maki-Nakagawa-Sakata matrix ($V_{PMNS}$) [6]. This makes lepton flavour violation measurements and the observation of neutrino oscillations prominent topics for pushing the boundaries of the SM.

2.6 The SM Lagrangian

Let us gather all the pieces we have developed to construct the full SM after the spontaneous symmetry breaking and the Yukawa terms are introduced (for massless neutrinos). The complete Lagrangian can be summarized as

$$\mathcal{L}_{SM} = \mathcal{L}_{QCD} + \mathcal{L}_{lep} + \mathcal{L}_V + \mathcal{L}_{EM} + \mathcal{L}_{weak} + \mathcal{L}_H + \mathcal{L}_{HV}. \tag{2.54}$$

$\mathcal{L}_{QCD}$ contains all the relevant terms for the strong interaction, including the kinematic terms for the gluons and the quarks. The mass terms for the quarks that appear from the Yukawa terms after SSB have also been included here. It can be written as

$$\mathcal{L}_{QCD} = -\frac{1}{4} G^a_{\mu\nu} G^{\mu\nu,a} + \bar{q}_i\left(i\gamma^\mu \partial_\mu - m_i - g_s \gamma^\mu \frac{\lambda_a}{2} G^a_\mu\right) q_i \tag{2.55}$$

where the index $a = 1, \dots, 8$ is the color index and $i = u, d, c, s, t, b$. Both indices are summed over in this expression. $\mathcal{L}_{lep}$ includes the kinematic and mass terms for the remaining fermions,

$$\mathcal{L}_{lep} = \bar{e}_i\,(i\gamma^\mu \partial_\mu - m_i)\,e_i + \bar{\nu}_i\, i\gamma^\mu \partial_\mu P_L\, \nu_i \tag{2.56}$$

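where $i = 1, 2, 3$ is the lepton generation, with $e_1 = e$, $e_2 = \mu$, $e_3 = \tau$, and similarly for the neutrinos. Note that we have explicitly included the projector $P_L$ to remove the right-handed neutrino components ($\nu^i_r$), and with them the mass terms for the neutrinos. $\mathcal{L}_V$ contains the kinematic terms for the $SU(2)_L \times U(1)_Y$ bosons, which, written in the basis defined by the weak and electromagnetic fields, is

$$\begin{aligned}
\mathcal{L}_V = &-\frac{1}{4} F_{\mu\nu} F^{\mu\nu} - \frac{1}{4} Z_{\mu\nu} Z^{\mu\nu} - \frac{1}{2} W^-_{\mu\nu} W^{+,\mu\nu} \\
&+ \frac{m_Z^2}{2} Z_\mu Z^\mu + m_W^2\, W^-_\mu W^{+,\mu} \\
&+ \frac{g_w^2}{4}\left(W^+_\mu W^-_\nu - W^+_\nu W^-_\mu\right)^2 \\
&+ \frac{i g_w}{2}\left(F_{\mu\nu}\sin\theta_w + Z_{\mu\nu}\cos\theta_w\right)\left(W^{-,\mu} W^{+,\nu} - W^{+,\mu} W^{-,\nu}\right)
\end{aligned} \tag{2.57}$$

where we have defined the following

$$\begin{aligned}
F_{\mu\nu} &= \partial_\mu A_\nu - \partial_\nu A_\mu \\
Z_{\mu\nu} &= \partial_\mu Z_\nu - \partial_\nu Z_\mu \\
W^\pm_{\mu\nu} &= \left(\partial_\mu - i g_w \sin\theta_w A_\mu - i g_w \cos\theta_w Z_\mu\right) W^\pm_\nu - (\mu \leftrightarrow \nu)
\end{aligned} \tag{2.58}$$

Note that this expression includes 4-vertex self-interaction terms for the bosons $W^\pm_\mu$ as well as triple gauge boson interactions. $\mathcal{L}_{EM}$ describes the interaction of charged fermions with the field $A_\mu$, the photon field,

$$\mathcal{L}_{EM} = e A_\mu Q_f\, \bar{f}\gamma^\mu f \tag{2.59}$$

where $f$ stands for all charged fermions with charge $Q_f$. Here we have also defined the electron charge $e \equiv g_w \sin\theta_w$ to highlight this term as the electromagnetic interaction. $\mathcal{L}_{weak}$ describes the weak interaction for the fermion fields, and includes the only source of flavour mixing present in the SM, as we have seen in the previous section. It is

$$\begin{aligned}
\mathcal{L}_{weak} = &\ \frac{g_w}{\sqrt{2}}\,\bar{\nu}_i \gamma^\mu W^\pm_\mu P_L e_i \\
&+ \frac{g_w}{\sqrt{2}}\,\bar{u}_i \gamma^\mu W^+_\mu V_{CKM} P_L d_i \\
&+ \frac{g_w}{\cos\theta_w}\,\bar{f}\left(T^3 - \sin^2\theta_w Q_f\right)\gamma^\mu f\, Z_\mu + \mathrm{h.c.}
\end{aligned} \tag{2.60}$$

where the first two terms are only relevant for left-handed fermions. $\mathcal{L}_H$ is the Higgs potential and includes the Higgs mass as well as its kinematic and self-interaction terms,

$$\mathcal{L}_H = \frac{1}{2}(\partial_\mu \chi)^2 - \frac{m_\chi^2}{2}\chi^2 - \lambda v \chi^3 - \frac{\lambda}{4}\chi^4 \tag{2.61}$$

with $\chi$ being the Higgs scalar field and $v/\sqrt{2}$ the vacuum expectation value. Finally, $\mathcal{L}_{HV}$ describes the interaction terms between the Higgs and the weak bosons that arise from SSB. Replacing $v \to v + \chi$ in the mass terms of Eq. (2.30) gives

$$\begin{aligned}
\mathcal{L}_{HV} = &\ \frac{v g_w^2}{2}\,\chi W^-_\mu W^{+,\mu} + \frac{v g_w^2}{4\cos^2\theta_w}\,\chi Z_\mu Z^\mu \\
&+ \frac{g_w^2}{4}\,\chi^2 W^-_\mu W^{+,\mu} + \frac{g_w^2}{8\cos^2\theta_w}\,\chi^2 Z_\mu Z^\mu
\end{aligned} \tag{2.62}$$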

In total, the SM Lagrangian has 18 free parameters. These include 3 coupling strengths and 2 parameters of the Higgs potential: $g_s$, $g_w$, $\theta_w$, $\lambda$ and $v$. These, however, can be related to different observables using

$$\begin{aligned}
m_Z &= \frac{v g_w}{2\cos\theta_w} = 91.1876 \pm 0.0021\ \mathrm{GeV}, \\
m_\chi &= \sqrt{2\lambda}\, v = 125.10 \pm 0.14\ \mathrm{GeV}, \\
\sin^2\theta_w &= 0.2312 \pm 0.00017, \\
\alpha &\equiv \frac{g_w^2 \sin^2\theta_w}{4\pi} \approx \frac{1}{137} = (7.2973535693 \pm 0.00000000015) \cdot 10^{-3}, \\
\alpha_s &\equiv \frac{g_s^2}{4\pi} = 0.1179 \pm 0.0085
\end{aligned} \tag{2.63}$$

which have been measured to an incredible precision through the combined efforts of multiple collaborations. Nine more parameters correspond to the masses of all the different fermions, which are shown in Table 2.2. For completeness, the charges of each fermion under each of the underlying symmetries of the SM have also been included.
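The relations of Eq. (2.63) can be inverted to estimate the Lagrangian parameters from the quoted observables. The following is a tree-level sketch only: it combines the low-energy value of $\alpha$ with $\sin^2\theta_w$ and $m_Z$ as quoted above and ignores the running of the couplings and all radiative corrections, so the resulting numbers are indicative rather than precise. The variable names are illustrative.

```python
import math

# Central values as quoted in Eq. (2.63)
M_Z = 91.1876             # GeV
M_HIGGS = 125.10          # GeV
SIN2_THETA_W = 0.2312
ALPHA = 7.2973535693e-3

sin_theta = math.sqrt(SIN2_THETA_W)
cos_theta = math.sqrt(1 - SIN2_THETA_W)

# alpha = g_w^2 sin^2(theta_w) / (4 pi)  ->  g_w
g_w = math.sqrt(4 * math.pi * ALPHA) / sin_theta
# Eq. (2.38): tan(theta_w) = g / g_w  ->  hypercharge coupling g
g = g_w * sin_theta / cos_theta
# m_Z = v g_w / (2 cos(theta_w))  ->  v
v = 2 * M_Z * cos_theta / g_w
# Eq. (2.31): m_W = v g_w / 2, which equals m_Z cos(theta_w) at tree level
m_W = v * g_w / 2
# Eq. (2.33): m_chi = sqrt(2 lambda) v  ->  lambda
lam = M_HIGGS**2 / (2 * v**2)

print(f"g_w = {g_w:.4f}, g = {g:.4f}")
print(f"v = {v:.1f} GeV, m_W = {m_W:.2f} GeV, lambda = {lam:.4f}")
```

The tree-level $W$ mass obtained this way differs from the measured one at the percent level, which gives a feeling for the size of the corrections that a precise treatment has to include.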

Table 2.2 List of fermions that populate the Standard Model, with their charges under each of the underlying symmetries of the SM. The mass values from their most recent combination are also included

| Name | $Q_f$ | $Y$ | $T^3$ | # Colours | Mass (MeV) |
|---|---|---|---|---|---|
| $u$ | +2/3 | 1/3 | 1/2 | 3 | $2.16^{+0.49}_{-0.26}$ |
| $d$ | −1/3 | 1/3 | −1/2 | 3 | $4.67^{+0.48}_{-0.17}$ |
| $c$ | +2/3 | 1/3 | 1/2 | 3 | $1270 \pm 20$ |
| $s$ | −1/3 | 1/3 | −1/2 | 3 | $93^{+11}_{-5}$ |
| $t$ | +2/3 | 1/3 | 1/2 | 3 | $172760 \pm 300$ |
| $b$ | −1/3 | 1/3 | −1/2 | 3 | $4180^{+30}_{-20}$ |
| $e$ | −1 | −1 | −1/2 | 1 | $0.5109989461 \pm 0.0000000031$ |
| $\nu_e$ | 0 | −1 | 1/2 | 1 | |
| $\mu$ | −1 | −1 | −1/2 | 1 | |
| $\nu_\mu$ | 0 | −1 | 1/2 | 1 | |
| $\tau$ | −1 | −1 | −1/2 | 1 | |
| $\nu_\tau$ | 0 | −1 | 1/2 | 1 | |

5000 MeV/c², < 6000 MeV/c²

min $\chi^2_{IP}$
0.99995
$(\chi^2/\mathrm{dof})_{vertex}$: < 10
$D^0$
$D^0$ daughters
B2C BDT output: > 0.05
daughters $p_T$: > 1.8 GeV/c
$m_{reconstructed}$: > 1764.84 MeV/c², < 1924.84 MeV/c²
$(\chi^2/\mathrm{dof})_{vertex}$: < 10
$\chi^2_{distance\,from\,PV}$: > 36
$\cos\theta_{dir}$: > 0.0
$(\chi^2/\mathrm{dof})_{track}$
100 MeV/c
$p$: > 1000 MeV/c
min $\chi^2_{IP}$: > 4
ghost probability: < 0.4
max(DOCA): < 0.5 mm
$K$ or $\pi$ from $D^0$ and $K$ or $\pi$ from $B^0_{(s)}$
$p_T$: > 500 MeV/c
$p$: > 5000 MeV/c
$D^*(2007)^0$
$|m_{reconstructed} - m_{D^0}|$: < 200 MeV/c²
$(K\pi)$ pair from $B^0_{(s)}$
$(\chi^2/\mathrm{dof})$
100 MeV/c
$p$: > 2000 MeV/c
min $\chi^2_{IP}$: > 4
ghost probability: < 0.4
$p_T$: > 1000 MeV/c
$m_{reconstructed}$ pair: < 5.2 GeV/c²
max(DOCA): < 0.5 mm
$p$: > 10000 MeV/c
$p_T$: > 1700 MeV/c
min $\chi^2_{IP}$: > 16
min IP: > 0.1 mm
$p_T$: > 400 MeV/c
Confidence level: > 0.25
$p_T$: > 500 MeV/c
$p$: > 1000 MeV/c
$\gamma_{CL}$: > 0.25
$|m_{reconstructed} - m_{\pi^0}|$: < 30 MeV/c²


5 Data Samples

Fig. 5.1 Sketch of a reconstructed $B^0 \to D^*(2007)^0 K^+ \pi^-$, $D^*(2007)^0 \to D^0 \gamma$ decay candidate. The red dotted line corresponds to the extrapolated momentum of the $D^0$ meson, used to compute its impact parameter with respect to all primary interaction vertices. The blue dotted line refers to the reconstructed $B^0$ momentum, obtained from the sum of the momenta of all final state particles. In contrast, the $B^0$ direction shown by the dashed line is obtained from the positions of the primary and secondary vertices. The angle between the two lines is referred to as $\theta_{dir}$. For clarity, the momenta of the $D^0$ meson and the $\gamma$ are drawn significantly apart; in practice they are usually closely aligned due to the small momentum release in the $D^*(2007)^0$ decay. A second primary vertex is included to illustrate the calculation of min $\chi^2_{IP}$, which only considers the vertex closest to a given track. A second primary vertex is expected in a typical LHCb event, which contains 1.8 collisions on average. However, no more than one vertex per event is expected to produce a $B$ hadron decay

5.2 Auxiliary Samples

One of the main purposes of the stripping is to reject events that, despite being selected by the trigger, do not contain any viable candidate for the decay under study. The stripping is extremely successful in this respect. However, contributions from backgrounds with topologies similar to the signal decays are also kept and need further study. A wide set of auxiliary samples is used in this analysis to characterize these backgrounds and ultimately reject them from the data sample. These samples fall into two categories: data samples from different stripping lines, used to study specific background topologies, and Monte Carlo (MC) generated samples.

5.2.1 Samples from Other Stripping Lines

Candidates created by the stripping line B2D0PiD2HHBeauty2CharmLine are used to build $B^+ \to D^0 \pi^+$ candidates from the data sample corresponding to the year 2018. The stripping version used is the same as for the data sample, s34. Due to the large number of candidates of this decay and the similarity of the data samples in all years, only the data sample from 2018 is used. Candidate $B^+ \to D^0 \pi^+$ decays are used to enhance the ability of the selection process to identify $D^0$ mesons that originate from $B$ meson decays. This helps to reject both backgrounds that do not contain a real $D^0$ meson (combinatorial background or charmless $B$ decays) and combinatorial background that contains a $D^0$ meson originating at the primary vertex.

5.2.2 MC Generated Samples

A variety of Monte Carlo (MC) simulated samples, listed in Table 5.5, are used to characterise the signal processes as well as the most relevant peaking backgrounds. When possible, MC samples have been generated using the PYTHIA8 [1, 2] program. PYTHIA8 is a C++ package designed to generate high energy collisions, comprising a coherent set of physics models for the evolution of multi-body hard processes to multi-hadronic final states. The hadronic decays are then handled by the EvtGen [3] package, which simulates the final state phase space distribution according to the provided decay model. Afterwards, the Geant4 [4, 5] package simulates the passage of the final state particles through the detector material to describe possible resolution effects due to multiple scattering. In order to match the data samples, different MC samples have been generated mimicking the LHCb detector configuration for each year. Samples of the same size have been generated for the years 2016, 2017 and 2018.

All decay channels available in the LHCb simulation framework have a unique 8-digit number in the format “gsdctnxu” that identifies them, also referred to as Event Type in Table 5.5, with each individual digit referring to a specific property of the decay:

• “g”: The general flag, indicating general properties of the mother particle such as the inclusion of a b or c quark.
• “s”: The selection flag, indicating specific requirements to be present in the event; this usually refers to the presence of a specific particle.
• “d”: The decay flag, indicating whether the decay chain is forced into a specific final state with a concrete number of intermediate states, or whether it instead includes an inclusive set of final states.
• “c”: The Charm-and-Lepton flag, indicating the number of charm mesons, electrons and muons present in the decay path, excluding inclusive decays.
• “t”: The track flag, indicating the number of stable charged particles in the final state.
• “n”: The neutral flag, indicating the number and type of neutral final state particles in the decay chain.
• “x” and “u”: The extra and user flags. These have no specific meaning in most cases and are instead used to differentiate samples that share the same previous six flags.

Full information on the meaning of the different values of each of the flags can be found in the Monte Carlo Event Type Definition Rules document [6].

Table 5.5 Monte Carlo generated samples used in the analysis. Numbers of events marked with * correspond to the numbers of generated events after stripping filtering; in these samples only events passing the stripping selection are saved, to avoid wasting disk space. Finally, all samples that include a $D^*(2007)^0$ meson have been generated including both $D^0\gamma$ and $D^0\pi^0$ decay modes, in the same ratio as their relative branching fractions [9]

| Event type | Decay | Configuration | Events per year |
|---|---|---|---|
| PYTHIA8 signal samples | | | |
| 11164611 | $B^0 \to D^*(2007)^0 K^+ \pi^-$ | SQDALITZ, ReDecay | 4000000* |
| 13164601 | $B^0_s \to D^*(2007)^0 K^+ \pi^-$ | SQDALITZ, ReDecay | 4000000* |
| 11164601 | $B^0 \to D^*(2007)^0 \pi^+ \pi^-$ | SQDALITZ, ReDecay | 4000000* |
| PYTHIA8 peaking background samples | | | |
| 11164621 | $B^0 \to D^*(2007)^0 K^+ K^-$ | SQDALITZ, ReDecay | 4000000* |
| 13164611 | $B^0_s \to D^*(2007)^0 K^+ K^-$ | SQDALITZ, ReDecay | 4000000* |
| 11164072 | $B^0 \to D^0 K^+ \pi^-$ | SQDALITZ, ReDecay | 4000000* |
| 13164062 | $B^0_s \to D^0 K^+ \pi^-$ | SQDALITZ, ReDecay | 4000000* |
| 11164063 | $B^0 \to D^0 \pi^+ \pi^-$ | SQDALITZ, ReDecay | 4000000* |
| 11164086 | $B^0 \to D^0 K^+ K^-$ | SQDALITZ, ReDecay | 1000000* |
| 13164076 | $B^0_s \to D^0 K^+ K^-$ | SQDALITZ, ReDecay | 1000000* |
| † 12163421 | $B^+ \to D^*(2007)^0 \pi^+$ | ReDecay | 4000000* |
| † 12163411 | $B^+ \to D^*(2007)^0 K^+$ | ReDecay | 4000000* |
| 12163011 | $B^+ \to D^0 \pi^+$ | ReDecay | 2000000 |
| 12163001 | $B^+ \to D^0 K^+$ | ReDecay | 2000000 |
| 15164601 | $\Lambda^0_b \to D^0 p \pi^-$ | SQDALITZ, ReDecay | 1000000* |
| 15164611 | $\Lambda^0_b \to D^0 p K^-$ | SQDALITZ, ReDecay | 1000000* |
| PGun samples | | | |
| | $B^0 \to D^*(2007)^0 K^+ \pi^-$ | Custom model | 10000000 |
| | $B^0_s \to D^*(2007)^0 K^+ \pi^-$ | Custom model | 10000000 |
| | $B^0 \to D^*(2007)^0 \pi^+ \pi^-$ | Custom model | 10000000 |
| | $B^0 \to D^*(2007)^0 K^+ K^-$ | Custom model | 10000000 |
| | $B^0_s \to D^*(2007)^0 K^+ K^-$ | Custom model | 10000000 |
| RapidSim samples | | | |
| | $B^0 \to D^*(2007)^0 K_1(1270)^0$ | PHSP | 10000000 |
| | $B^0_s \to D^*(2007)^0 K_1(1270)^0$ | PHSP | 10000000 |
| | $B^0 \to D^*(2007)^0 a_1(1260)^0$ | PHSP | 10000000 |
| | $B^0 \to D^*(2007)^0 \eta'(958)^0$ | PHSP | 10000000 |

Simulated samples are generated with a chosen phase space distribution. In most cases, the chosen distribution corresponds to the measured phase space distribution for a given decay. However, since most of the relevant backgrounds in this analysis (as well as the signal decays) have not been measured yet, the underlying phase space distribution for these decays is not known. Thus, the phase space distribution is chosen to be flat across the square Dalitz plot for all samples; this flat model is also referred to as SQDALITZ. This decision ensures that all regions of the phase space are populated, so that the selection efficiency can be measured with sufficiently low uncertainty everywhere in the phase space.

In order to fully simulate real data events, PYTHIA8 not only generates the desired decay chain of a multi-body process but also includes the full hadronic and radiative environment generated in proton-proton collisions, referred to as the rest-of-the-event. This, however, is extremely computationally demanding, and thus some simplifications are made to reduce the resources needed to generate and store some of these samples. First, the fast simulation option ReDecay has been used to generate most of the samples [7]. The ReDecay option reuses the events as generated by PYTHIA8 and only re-generates the signal and other heavy hadron decays using EvtGen. This reduces the generation time of MC samples by a factor of 10.

Another important constraint to take into account is the disk space required to store these samples. A large number of events is required for a good characterization of the different backgrounds. Nevertheless, only events with candidates from the aforementioned stripping lines will be studied, and thus not all the events need to be stored. It is common practice within LHCb to run the stripping on the generated samples as part of the production and to store only generated events with at least one candidate passing one specific stripping line (or a group of lines). This is known as filtered MC. Although this practice reduces the storage space required for the MC samples, it prevents the re-stripping of the generated samples, and also prevents these samples from being used to build candidates of stripping lines not included in the filtering selection.

Two extra sets of MC samples have been generated using different configurations to provide additional insight into specific parts of the analysis. First, additional samples for the signal decays as well as for misidentified backgrounds have been generated with a custom phase space distribution using the BParticleGun package (PGun). Contrary to the standard PYTHIA8 configuration, PGun, in conjunction with EvtGen and Geant4, only simulates the desired decay chain and not the rest-of-the-event. This dramatically reduces the generation time of the samples but typically introduces some mismodelling of detector resolution effects due to the difference in detector occupancy in the samples. These samples are however not used


to fully characterize the signal and misidentified background components, but are used instead to correct the full MC samples for any mismodelling effects due to the incorrect phase space distribution. More details on the procedure used to correct the phase space distribution in the full MC samples are given in Chap. 7. Secondly, a set of samples has been generated using the RapidSim package [8]. RapidSim proceeds similarly to PGun but does not fully simulate detector effects and does not rely on Geant4, which is the most computationally expensive stage in the simulation process. Instead, a parametric smearing is applied to the momenta of the final state particles in the simulated decay as a first approximation to the detector effects. The RapidSim package is used to generate samples of partially reconstructed backgrounds. Although the RapidSim samples do not provide an accurate representation of the data, they offer a good first approximation to characterize these decays. Since neither the PGun nor the RapidSim samples include the full event information, it is not possible to apply the stripping selection to these samples, and instead all generated events are used.
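The “gsdctnxu” Event Type convention described in this section can be illustrated with a short sketch that splits an 8-digit event type into its named flags. The function and key names are illustrative choices and not part of the LHCb software; the meaning of each digit value is defined in the rules document [6] and is deliberately not interpreted here.

```python
from typing import Dict

# Flag names follow the "gsdctnxu" digit order described in the text.
FLAG_NAMES = (
    "general", "selection", "decay", "charm_and_lepton",
    "track", "neutral", "extra", "user",
)


def split_event_type(event_type: int) -> Dict[str, int]:
    """Split an 8-digit event type into its individual flags."""
    digits = str(event_type)
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected an 8-digit event type, got {event_type!r}")
    return {name: int(digit) for name, digit in zip(FLAG_NAMES, digits)}


# Event type of the first signal sample listed in Table 5.5.
flags = split_event_type(11164611)
print(flags)
```

For this sample the track flag is 4, in line with the four stable charged particles of the final state (the $K^+\pi^-$ pair and the two $D^0$ decay products).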

5.3 Truth Matching

All full MC samples are produced so that they contain a specific decay channel, but since the full pp collision is simulated they also include many additional particles. Thus, it is possible that a candidate built by the stripping line is due to combinatorial background rather than the signal decay. This effect is particularly relevant when considering decays with a $\pi^0$ or $\gamma$ in the final state, for which the reconstruction efficiency is lower. Candidates reconstructed by the stripping using particles from the rest-of-the-event must then be excluded when trying to give a good description of the generated decay. This is done through a custom MC truth matching procedure. While standard MC truth matching procedures exist in LHCb, these have been designed and validated mainly for decays with only charged particles in the final state. For this analysis, a bespoke matching strategy has been defined that verifies that each reconstructed candidate has been built using only the particles generated in the signal decay chain, without involving any other particles in the pp collision event. The matching strategy consists of a set of requirements on the true generated information of the particles used to build the candidate. Details of this truth matching procedure are shown in Table 5.6. For the peaking background samples, additional subtleties must be accounted for and the matching strategy differs slightly depending on the background channel. These particularities are described later in Chap. 7, where all the peaking backgrounds considered in this analysis are discussed.

The truth matching procedure removes from the MC samples candidates that are formed solely from random combinations of particles, but also removes (more often) candidates obtained by combining some of the signal decay products with some random particles. Such “misreconstructed signal” will also be present in the data, and therefore such candidates in the MC samples are studied independently in order to describe this contribution properly. The complete treatment of these events is also discussed in Chap. 7. Note that this is only required for events in which the misreconstructed particle is a $\gamma$ or a $\pi^0$, as candidates with misreconstructed charged tracks are for the most part rejected in the selection procedure.

Table 5.6 Truth matching requirements applied to signal MC samples. The B prefix refers to either a $B^0$ or a $B^0_s$, while the Dk and Dpi prefixes refer to the $K^+$ and $\pi^-$ coming from the $D^0$ decay, respectively. The ID suffix corresponds to the PDG particle ID and is unique for each particle species. In the $D^*(2007)^0 \to D^0\pi^0$ decay, the final state photons are required to come from the $D^*(2007)^0$ vertex instead of the $\pi^0$ vertex, since the $\pi^0$ is simulated with negligible lifetime in the MC samples and so shares the $D^*(2007)^0$ decay vertex. The same strategy is applied for the normalisation channel, except that in the cuts involving the K a pion is considered instead

| Particle ID cuts | Vertex cuts |
|---|---|
| Common cuts for all signal channels | |
| B_TRUEID==B_ID | B_TRUEENDVERTEX==Dst_TRUEORIGINVERTEX |
| D_TRUEID==D_ID | B_TRUEENDVERTEX==K_TRUEORIGINVERTEX |
| Dst_TRUEID==Dst_ID | B_TRUEENDVERTEX==pi_TRUEORIGINVERTEX |
| K_TRUEID==K_ID | Dst_TRUEENDVERTEX==D_TRUEORIGINVERTEX |
| pi_TRUEID==pi_ID | D_TRUEENDVERTEX==Dk_TRUEORIGINVERTEX |
| Dk_TRUEID==Dk_ID | D_TRUEENDVERTEX==Dpi_TRUEORIGINVERTEX |
| Dpi_TRUEID==Dpi_ID | |
| Specific cuts for $D^*(2007)^0 \to D\gamma$ channels | |
| gamma_TRUEID==gamma_ID | Dst_TRUEENDVERTEX==gamma_TRUEORIGINVERTEX |
| Specific cuts for $D^*(2007)^0 \to D\pi^0$ channels | |
| pi0_TRUEID==pi0_ID | Dst_TRUEENDVERTEX==pi0_TRUEORIGINVERTEX |
| gam1_TRUEID==gam1_ID | Dst_TRUEENDVERTEX==gam1_TRUEORIGINVERTEX |
| gam2_TRUEID==gam2_ID | Dst_TRUEENDVERTEX==gam2_TRUEORIGINVERTEX |
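The requirements of Table 5.6 can be sketched as a single function acting on one candidate. This is a minimal illustration, assuming the candidate is available as a dictionary keyed by the branch names of Table 5.6 and that true vertices are stored as comparable tuples; the function name, the `mode` argument and the example values are hypothetical and do not reproduce the analysis code.

```python
# Particle prefixes and mother-daughter vertex links taken from Table 5.6.
COMMON_PREFIXES = ("B", "D", "Dst", "K", "pi", "Dk", "Dpi")
COMMON_LINKS = (
    ("B", "Dst"), ("B", "K"), ("B", "pi"),
    ("Dst", "D"), ("D", "Dk"), ("D", "Dpi"),
)
# Neutral particles to check for each D*(2007)0 decay mode.
NEUTRALS = {"gamma": ("gamma",), "pi0": ("pi0", "gam1", "gam2")}


def is_truth_matched(cand: dict, mode: str) -> bool:
    """Apply the ID and vertex requirements of Table 5.6 to one candidate."""
    neutrals = NEUTRALS[mode]
    ids_ok = all(
        cand[f"{p}_TRUEID"] == cand[f"{p}_ID"]
        for p in COMMON_PREFIXES + neutrals
    )
    # All neutrals are required to originate from the D*(2007)0 end vertex.
    links = COMMON_LINKS + tuple(("Dst", n) for n in neutrals)
    vertices_ok = all(
        cand[f"{mother}_TRUEENDVERTEX"] == cand[f"{daughter}_TRUEORIGINVERTEX"]
        for mother, daughter in links
    )
    return ids_ok and vertices_ok


def example_candidate() -> dict:
    """Build a toy, correctly matched D0 gamma candidate (illustrative values)."""
    ids = {"B": 511, "Dst": -423, "D": -421, "K": 321, "pi": -211,
           "Dk": 321, "Dpi": -211, "gamma": 22}
    cand = {}
    for prefix, pdg_id in ids.items():
        cand[f"{prefix}_ID"] = pdg_id
        cand[f"{prefix}_TRUEID"] = pdg_id
    b_vertex, d_vertex = (0.1, 0.0, 12.0), (0.3, 0.1, 25.0)
    cand["B_TRUEENDVERTEX"] = b_vertex
    cand["Dst_TRUEENDVERTEX"] = b_vertex   # the D* decays promptly
    cand["D_TRUEENDVERTEX"] = d_vertex
    for prefix in ("Dst", "K", "pi", "D", "gamma"):
        cand[f"{prefix}_TRUEORIGINVERTEX"] = b_vertex
    for prefix in ("Dk", "Dpi"):
        cand[f"{prefix}_TRUEORIGINVERTEX"] = d_vertex
    return cand


good = example_candidate()
bad = dict(good, gamma_TRUEORIGINVERTEX=(0.0, 0.0, 0.0))  # photon from the PV
print(is_truth_matched(good, "gamma"), is_truth_matched(bad, "gamma"))
```

The second candidate mimics a misreconstructed signal decay in which a random photon from the rest-of-the-event replaces the signal photon, so it fails only the photon vertex requirement.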

References

1. Sjöstrand T, Mrenna S, Skands P (2008) A brief introduction to PYTHIA 8.1. Comput Phys Commun 178:852. https://doi.org/10.1016/j.cpc.2008.01.036, arXiv:0710.3820
2. Belyaev I et al (2011) Handling of the generation of primary events in Gauss, the LHCb simulation framework. J Phys Conf Ser 331:032047. https://doi.org/10.1088/1742-6596/331/3/032047
3. Lange DJ (2001) The EvtGen particle decay simulation package. Nucl Instrum Meth A462:152. https://doi.org/10.1016/S0168-9002(01)00089-4
4. Geant4 collaboration, Agostinelli S et al (2003) Geant4: a simulation toolkit. Nucl Instrum Meth A506:250. https://doi.org/10.1016/S0168-9002(03)01368-8
5. Geant4 collaboration, Allison J et al (2006) Geant4 developments and applications. IEEE Trans Nucl Sci 53:270. https://doi.org/10.1109/TNS.2006.869826
6. Corti G et al (2014) Monte Carlo event type definition rules. LHCb-2005-034. http://cdsweb.cern.ch/search?p=LHCb-2005-034&f=reportnumber&action_search=Search&c=LHCb
7. Müller D, Clemencic M, Corti G, Gersabeck M (2018) ReDecay: a novel approach to speed up the simulation at LHCb. Eur Phys J C 78:1009. https://doi.org/10.1140/epjc/s10052-018-6469-6, arXiv:1810.10362
8. Cowan GA, Craik DC, Needham MD (2017) RapidSim: an application for the fast simulation of heavy-quark hadron decays. Comput Phys Commun 214:239. https://doi.org/10.1016/j.cpc.2017.01.029, arXiv:1612.07489
9. Particle Data Group, Zyla PA et al (2020) Review of particle physics. Prog Theor Exp Phys 2020:083C01. http://pdg.lbl.gov/, https://doi.org/10.1093/ptep/ptaa104

Chapter 6

Candidate Selection

The stripping selection is extremely successful at rejecting events with no candidates topologically similar to the signal decays. However, the selected candidates are still populated by background events that include similar structures and cannot be rejected so easily. Thus, a more detailed offline study of these events is required to identify and select signal candidates. This procedure is known as the offline selection and relies on MC samples as well as alternative data samples to characterise signal and background events. The offline selection in this analysis follows a similar strategy to previous LHCb Dalitz plot analyses of $B^0_{(s)} \to D K^\pm \pi^\mp$ decays [1–4], with modifications to account for the selection of $D^*(2007)^0$ mesons.

In this section, the details of the candidate selection strategy are described. The selection requirements are chosen to be identical in all channels when possible, in order to reduce the systematic uncertainties when computing the relative branching fraction between channels. Thus, the requirements are assumed to be common to all channels unless stated otherwise. The key discrimination between the two signal channels and the control channel in the selection procedure comes from charged hadron identification, which needs different treatment in the control channel due to the different final state.

The offline selection can be divided into three steps. First, the trigger selection, in which only candidates that fired specific trigger lines are kept, in order to select events with specific topologies. Secondly, the preselection, consisting of a set of rectangular cuts on the most discriminating variables. Finally, a multivariate analysis, where different machine learning techniques are used to further reduce the number of background candidates.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B 0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_6


6.1 Trigger Selection

Once a candidate has been built by the stripping, the reconstructed particles can be associated with the objects in the detector that triggered the L0 lines. This makes it possible to identify candidates that contain the particle that fired the L0 trigger. These candidates are classified as trigger on signal (TOS) for the given trigger line. Alternatively, if the particles that fired the line do not form part of the candidate, the event is classified as trigger independent of signal (TIS) for that line. Note that a candidate can be both TIS and TOS for the same trigger line if it belongs to an event that was triggered at least twice: once due to a particle from the candidate and once due to a rest-of-the-event particle.

In this analysis, only candidates classified as L0Hadron_TOS or L0Global_TIS in the level 0 (L0) trigger are kept, where L0Global refers to any L0 trigger line. This requirement is widely used in other analyses studying hadronic $B$ decays within LHCb. It selects candidates in which at least one of the final state particles has fired the L0Hadron line, or which belong to an event that includes another high-$p_T$ particle that is not part of the reconstructed candidate. Events in which the particle that fired the L0 line is ambiguous, for example if two tracks are very close together and only one of them belongs to the candidate, are classified as trigger on both (TOB) and are rejected, since the acceptance rate for these candidates is typically not well reproduced by the MC samples. Further differences in the acceptance rate of L0Hadron_TOS between the MC and data samples are expected, since it is difficult to simulate the calorimeter performance and its variation with detector occupancy. This may affect the calculation of the signal efficiency required for the measurement of the branching fraction. However, differences are expected to cancel out in the computation of the relative branching fractions. Nevertheless, such potential discrepancies are studied as a possible source of systematic uncertainty in Chap. 10.

After the L0 trigger selection, High Level Trigger (HLT) requirements are applied to select specific topologies. These requirements are also standard for analyses of multibody hadronic $B$ decays in LHCb, specifically requiring at least one of the following lines to have fired: Hlt2Topo2Body_TOS, Hlt2Topo3Body_TOS or Hlt2Topo4Body_TOS. These select candidates containing two-, three- or four-body decays with a decay vertex significantly displaced from all primary vertices.
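The trigger requirement described above amounts to a simple boolean combination, sketched below. The candidate is assumed to be a dictionary of boolean decisions keyed by the line names used in the text; the exact branch names in an analysis ntuple may differ, so these keys are illustrative.

```python
def passes_trigger(cand: dict) -> bool:
    """Return True if the candidate satisfies the L0 and HLT2 requirements."""
    # L0: the candidate fired L0Hadron (TOS), or any L0 line fired
    # independently of the candidate (TIS).
    l0_ok = cand["L0Hadron_TOS"] or cand["L0Global_TIS"]
    # HLT2: at least one of the topological lines fired on the candidate.
    hlt2_ok = any(cand[f"Hlt2Topo{n}Body_TOS"] for n in (2, 3, 4))
    return l0_ok and hlt2_ok


tos_candidate = {
    "L0Hadron_TOS": True, "L0Global_TIS": False,
    "Hlt2Topo2Body_TOS": False, "Hlt2Topo3Body_TOS": True,
    "Hlt2Topo4Body_TOS": False,
}
# A trigger-on-both (TOB) candidate is neither TOS nor TIS at L0,
# so it is rejected without needing a dedicated flag.
tob_candidate = dict(tos_candidate, L0Hadron_TOS=False)

print(passes_trigger(tos_candidate), passes_trigger(tob_candidate))
```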

6.2 Preselection

A preselection is applied to the candidates that are retained by the trigger selection, consisting of rectangular cuts on the variables with the most discriminating power. The objective is to reduce the data sample size by rejecting the most obvious backgrounds while retaining as high a signal efficiency as possible. A reduction of the data sample size is a requirement for the training of the multivariate methods used in the following steps of the selection. The different cuts applied in this step are defined by comparing the distributions of the observables between signal samples (obtained from MC) and data (which is dominated by combinatorial background candidates at this stage), as shown in Fig. 6.1.

First, a cut on the $B^0_{(s)}$ impact parameter is applied to ensure that the reconstructed candidate is consistent with having originated from the primary vertex. Similarly, the cosine of the angle between the $B$ momentum and the vector between primary and secondary vertices (also referred to as $\cos\theta_{dir}$ or DIRA) is required to be close to 1 for all candidates. A requirement on the $B$ decay vertex $\chi^2$ is also imposed to remove candidates where the decay products do not appear to have originated from a common vertex. To select events that contain true $D^0$ mesons, cuts on the $D^0$ mass and the $D^0$ flight distance along the beam axis (i.e. the difference in $z$ position between the $D^0$ and $B$ vertices) are applied. The cut on the $D^0$ mass is kept loose at this stage so that checks on contributions from charmless backgrounds can be made later. Finally, a cut on the difference between the reconstructed masses of the $D^*(2007)^0$ and $D^0$ candidates ($m_{rec}^{D^*(2007)^0} - m_{rec}^{D^0}$) is used to reject candidates with fake $D^*(2007)^0$ mesons. Nevertheless, the reconstruction of the neutral $D^*(2007)^0$ decay products is one of the most challenging aspects of this analysis, and thus this remains one of the most relevant backgrounds even after this cut has been applied.

With the exception of the variables that rely on the mass measurement of the $D^0$ and $D^*(2007)^0$ mesons, all of the aforementioned observables have been computed using the DecayTreeFitter [5] tool. This tool, used in many LHCb analyses, simultaneously varies the momenta of the final state particles within their uncertainties using a Kalman filter in order to build a $D^0$ meson with its mass fixed to the known value $m_{D^0} = 1864.83$ MeV/c² [6], also ensuring that the tracks of the reconstructed final state particles originate from a common vertex. Simultaneously, the DecayTreeFitter tool also modifies the momenta of the $\gamma/\pi^0$ (also within their uncertainties) to build a $D^*(2007)^0$ meson with its mass fixed to the known value $m_{D^*(2007)^0} = 2006.85$ MeV/c² [6], also ensuring that the $\gamma/\pi^0$ originate from a point within the $D^0$ trajectory (the $B^0_{(s)}$ decay vertex).

The usage of the DecayTreeFitter tool greatly improves the resolution of different observables in signal candidates, which is essential for their precise characterization. The variation required to satisfy the different constraints set by DecayTreeFitter is measured, in units of the uncertainties of the modified momenta, as $\chi^2_{DTF}$. This is also used as a variable in the preselection procedure, where only candidates with a low $\chi^2_{DTF}$ are kept.

Since the main purpose of the preselection is to reduce the data sample size, the exact cut values for all the parameters are chosen by eye. The complete list of the requirements applied in the preselection is given in Table 6.1. The cuts are chosen to be identical in each of the different channels, as all of the variables used in the preselection are similarly distributed in each. The exception to this is the distribution of $m_{rec}^{D^*(2007)^0} - m_{rec}^{D^0}$, which depends on the $D^*(2007)^0$ decay mode. Nonetheless, the signal populates a similar region for both cases so the cut is chosen to be the

Fig. 6.1 Distribution of the variables used in the preselection for data (dashed lines; assumed to be dominated by combinatorial background) and signal MC (solid lines) of each of the channels. In both signal and data samples, similar distributions are seen for each channel, with the exception of the $m_{rec}^{D^*(2007)^0} - m_{rec}^{D^0}$ variable, in which the distribution depends on the $D^*(2007)^0$ decay channel. However, the signal still falls in the same region and thus the same cut is used for all channels

Panels (vertical axes in arbitrary units): $B^0$ min $\chi^{2*}_{IP}$; $B^0$ $\cos\theta^*_{dir}$; $B^0$ $(\chi^2/\mathrm{dof})^*_{vertex}$; $m_{rec}^{D^0}$ [MeV/c²]; $D^0$ $z$ flight distance; $\chi^{2*}_{DTF}$; $m_{rec}^{D^*(2007)^0} - m_{rec}^{D^0}$ [MeV/c²]. Legend: MC and Bg curves for $B^0 \to D^*(2007)^0 K^+\pi^-$, $B^0_s \to D^*(2007)^0 K^-\pi^+$ and $B^0 \to D^*(2007)^0 \pi^+\pi^-$, each with $D^*(2007)^0 \to D^0\gamma$ and $D^*(2007)^0 \to D^0\pi^0$.


Table 6.1 Cuts applied in the preselection. Variables with the * label are computed after a mass constraint on the $D^0$ meson applied with the DecayTreeFitter tool. The $\chi^2_{DTF}$ variable corresponds to the quality of this fit.

| Particle | Variable | Cut value |
|---|---|---|
| $B^0_{(s)}$ | $m^*_{rec}$ | > 5000 MeV/c² and < 6000 MeV/c² |
| | min $\chi^{2*}_{IP}$ | |
| | $\cos\theta^*_{dir}$ | > 0.99999 |
| | $(\chi^2/\mathrm{dof})^*_{vertex}$ | |
| | $\chi^{2*}_{DTF}$ | |
| $D^0$ | $z$ flight distance | |
| | $m_{rec}^{D^0}$ | > 1814 MeV/c² and < 1914 MeV/c² |
| | $m_{rec}^{D^*(2007)^0} - m_{rec}^{D^0}$ | > 117.016 MeV/c² and < 167.016 MeV/c² |

Table 6.2 Signal efficiency and data retention of the offline preselection requirements for each of the decay modes

| Channel | Efficiency (%) | Data retention (%) |
|---|---|---|
| $B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$ | 72.9 | 13.4 |
| $B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\pi^0$ | 75.9 | 13.7 |
| $B^0_s \to D^*(2007)^0 K^-\pi^+$, $D^*(2007)^0 \to D^0\gamma$ | 74.3 | 13.4 |
| $B^0_s \to D^*(2007)^0 K^-\pi^+$, $D^*(2007)^0 \to D^0\pi^0$ | 77.0 | 13.6 |

same for this variable as well. The signal efficiencies and data retention rates for the preselection in each of the channels are given in Table 6.2.
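Two of the preselection ingredients can be written down compactly: the direction angle $\cos\theta_{dir}$ (DIRA) and the rectangular mass windows quoted in Table 6.1. The sketch below is only an illustration; the function names and the representation of vertices and momenta as 3-tuples are assumptions, and the remaining preselection variables are not included.

```python
import math


def cos_theta_dir(primary_vertex, decay_vertex, momentum):
    """Cosine of the angle between the flight direction and the momentum."""
    flight = [d - p for d, p in zip(decay_vertex, primary_vertex)]
    dot = sum(f * m for f, m in zip(flight, momentum))
    norm = math.sqrt(sum(f * f for f in flight)) * math.sqrt(
        sum(m * m for m in momentum)
    )
    return dot / norm


def passes_mass_cuts(m_b, m_d0, m_dst):
    """Mass windows of Table 6.1; all masses in MeV/c^2."""
    delta_m = m_dst - m_d0
    return (
        5000.0 < m_b < 6000.0
        and 1814.0 < m_d0 < 1914.0
        and 117.016 < delta_m < 167.016
    )


# A B candidate flying exactly along its momentum has cos(theta_dir) = 1.
dira = cos_theta_dir((0.0, 0.0, 0.0), (0.0, 0.0, 10.0), (0.0, 0.0, 50000.0))
print(dira, passes_mass_cuts(5280.0, 1864.83, 2006.85))
```

With the known masses quoted in the text, the mass difference is 142.02 MeV/c², which sits at the centre of the 117.016 to 167.016 MeV/c² window.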

6.3

D from B MVA

The signature of a charm hadron originating from a B decay (referred to as D from B), and hence inconsistent with originating from the primary vertex, provides important discrimination power against backgrounds for all beauty to open charm decays. It is particularly useful as it discriminates not only against combinatorial background, but also charmless B decay backgrounds. A standard tool was available for, and widely used in, Run 1 analyses within LHCb [7]. Here, the same approach is used to obtain a D from B multivariate discriminant that can be applied in this analysis, and potentially used in other similar analyses of beauty to open charm decays in future. A Neural Network (NN) is trained using the NeuroBayes package [8], which provides a framework to train MultiLayer Perceptron (MLP) NNs. The training is done using reconstructed B + → D 0 π + decays from the

Fig. 6.2 D 0 -candidate mass distribution obtained from B + → D 0 π + data, from which signal and background samples used to train the D from B MVA are obtained. Only a fraction of the signal sample is used for the training so the sizes of both samples are similar

6 Candidate Selection

A.U.

90

80000 70000 60000

Signal Sample (SWeighted)

Background Sample

50000 40000 30000 20000 10000 0

1800

1850

1900

0

1950

mDrec [MeV/ c2] B2D0PiD2HHBeauty2CharmLine stripping line in 2018 data, only considering D 0 mesons reconstructed as D 0 → K + π − . Moreover, the same trigger and, where appropriate, preselection requirements have been applied to these samples as for the signal channels. The background sample is selected from the D 0 mass sidebands in data, and the signal sample is selected from the D 0 mass signal window (1830–1905 MeV/c2 ), shown in Fig. 6.2. Since the signal mass window contains some background, sWeights [9] have been computed from a fit of the D 0 mass distribution. The fit includes a double Crystal Ball distribution (see Sect. 8.1) to describe the signal yield plus a flat distribution that accounts for the background. The events that populate the sidebands are considered to be pure background events and thus they are assigned a weight of −1. The NN has a topology of a single internal layer of 2N neurons, where N is the number of input variables used for the training, in this case 21. Since this NN is meant to provide a generic D from B selection, independently of the B hadron decay topology, all the variables used for the training are only related to the D 0 meson or its daughters. Moreover, since the NN is designed to be used in analyses where different D decay modes are used (e.g. in a measurement of γ using B 0 → D 0 K + π − analysis, in which all D 0 → h + h − combinations are included, where h ∈ π, K ), particle identification variables associated to the D 0 decay products are excluded from the training. The full list of variables used in the training is given in Table 6.3, with their detailed description given in Sect. 11.3. The obtained NN output (also referred to as D PID) distributions for signal and background are shown in Fig. 6.3. In order to check for any sign of overtraining in the NN, a Kolmogorov–Smirnoff test has been performed as shown in Fig. 6.4. 
A loose cut on the NN output (D PID > −0.5) is used as a final preselection requirement due to its strong discriminating power. However, this variable is also used as an input for the multivariate selection that follows, as described in Sect. 6.5. The associated signal efficiencies and data retention rates of the preliminary cut are shown in Table 6.4, with the effect on the D0 mass distribution for B → D*(2007)0K+π−, D*(2007)0 → D0γ candidates shown in Fig. 6.5.


Table 6.3 Variables used for the D from B NN training with B+ → D0π+ data

Particle              Variable
D0                    p
                      pT
                      χ2FD from PV
                      min χ2IP
                      χ2Vertex
                      χ2ZFD
                      χ2TFD
D0 decay products     p
                      pT
                      min χ2IP
                      VELO (χ2/dof)track
                      TT (χ2/dof)track
                      Track χ2
                      Ghost probability

Fig. 6.3 D from B NN output distributions for signal and background, as obtained after training. Distributions for both training and testing samples are shown as a check for NN over-training

6.4 D0 Signal Window Definition

As discussed in Sect. 6.2, candidates with a D0 mass in the sidebands of the distribution are kept in the preselection step. This is crucial to assess the effectiveness of the NN described in the previous section. However, after the NN is applied, these candidates need to be removed as they correspond to charmless background. Thus, all candidates with reconstructed D0 mass outside the signal window, defined as μD ± 2.5σD, are removed from the data sample. The values of μD and σD are computed, independently for each signal channel, by fitting the MC distribution with a double Crystal Ball function (see Sect. 8.1). An example for one of the channels is given in Fig. 6.6. The fit is used to estimate the peak position and width of the distribution, which define the signal window. Since the values obtained are consistent in all channels, the averages of the four results (μD and σD) are used to define a common signal window for all modes. The quantities obtained from each of the fits and their averages are listed in Table 6.5.

Fig. 6.4 Cumulative distribution of the D PID for the training and testing samples. The Kolmogorov–Smirnov statistics between the training and testing samples are also included. These are expected to be comparable to $\sqrt{(N+M)/(N \cdot M)}$ in the absence of overtraining, where N and M refer to the number of events in the training and testing samples, respectively
[Plot: A.U. versus D PID; curves: Signal Train sample, Signal Test sample, Bg Train sample, Bg Test sample. K-S statistic (Signal) = 0.0060, $\sqrt{(N+M)/(N \cdot M)}$ (Signal) = 0.0052; K-S statistic (Background) = 0.0047, $\sqrt{(N+M)/(N \cdot M)}$ (Background) = 0.0056]

Table 6.4 Signal efficiency and data retention rate of the D PID cut for each of the decay modes

Channel                                        Efficiency (%)   Data retention (%)
B0 → D*(2007)0K+π−, D*(2007)0 → D0γ            97.29            80.68
B0 → D*(2007)0K+π−, D*(2007)0 → D0π0           97.68            87.38
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0γ           97.36            81.42
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0π0          97.82            87.56

Fig. 6.5 Effect of the D PID > −0.5 requirement on the D0 mass distribution for B → D*(2007)0K+π−, D*(2007)0 → D0γ candidates
[Plot: A.U. versus mDrec [MeV/c2]; curves: No D PID cut, D PID cut applied]

Fig. 6.6 Double Crystal Ball fit to D0 mass in the B0 → D*(2007)0π+π−, D*(2007)0 → D0γ MC sample
[Plot: A.U. versus mDrec [MeV/c2]; legend: Data sample, Gaussian Fit]

Table 6.5 Peak positions and widths of the signal obtained from double Crystal Ball fits to MC samples. The average values, μD and σD, used to define the D0 signal window applied in the selection, are given in the last row

Channel                                        μD (MeV/c2)   σD (MeV/c2)
B0 → D*(2007)0K+π−, D*(2007)0 → D0γ            1865.67       7.06
B0 → D*(2007)0K+π−, D*(2007)0 → D0π0           1865.52       7.13
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0γ           1865.65       7.07
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0π0          1865.53       7.19
Average                                        1865.59       7.11
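As a worked illustration of the window definition μD ± 2.5σD, the sketch below averages the four fit results from Table 6.5 and derives the common window. The use of a plain arithmetic mean is an assumption (it is consistent with the averages quoted in the table), and the helper names are hypothetical.

```python
# Peak positions and widths (MeV/c^2) from the four MC fits in Table 6.5
FIT_RESULTS = {
    "B0,  D*0 -> D0 gamma": (1865.67, 7.06),
    "B0,  D*0 -> D0 pi0":   (1865.52, 7.13),
    "Bs0, D*0 -> D0 gamma": (1865.65, 7.07),
    "Bs0, D*0 -> D0 pi0":   (1865.53, 7.19),
}
N_SIGMA = 2.5  # half-width of the window in units of sigma_D

def common_window(results, n_sigma):
    """Average the per-channel (mu, sigma) and return (mu, sigma, (lo, hi))."""
    mus = [mu for mu, _ in results.values()]
    sigmas = [sigma for _, sigma in results.values()]
    mu_avg = sum(mus) / len(mus)
    sigma_avg = sum(sigmas) / len(sigmas)
    window = (mu_avg - n_sigma * sigma_avg, mu_avg + n_sigma * sigma_avg)
    return mu_avg, sigma_avg, window

def in_window(mass, window):
    """True if a reconstructed D0 mass lies inside the signal window."""
    lo, hi = window
    return lo <= mass <= hi

mu_avg, sigma_avg, window = common_window(FIT_RESULTS, N_SIGMA)
print(f"mu_D = {mu_avg:.2f}, sigma_D = {sigma_avg:.2f} MeV/c^2")
print(f"signal window = [{window[0]:.2f}, {window[1]:.2f}] MeV/c^2")
```

With these inputs the window spans roughly 1848 to 1883 MeV/c2, so a candidate in the far sidebands of the D0 mass distribution fails `in_window`.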

6.5 Multivariate Selection

After the preselection process, the data sample size has been reduced to a level which is suitable for multivariate analyses designed to reject dominant backgrounds. Furthermore, none of the aforementioned selection steps contains any particle identification requirements, and thus the data sample is expected to contain an important contribution from misidentified events. Two Boosted Decision Tree algorithms (BDTs) are used to further reduce the misidentified and combinatorial backgrounds, exploiting the correlations between various kinematic and PID variables to do this in an optimal way. Each of the two BDTs is trained specifically to remove one of these backgrounds. Unlike the D from B MVA, which is a NN trained with the NeuroBayes package, both BDTs have been trained using the XGBoost [10] package, as this was found to provide the best discriminating power among a selection of multivariate analysis software tools.


Misidentified background BDT

The first BDT, which specialises in the rejection of misidentified backgrounds, is trained exclusively with MC samples. This is a crucial step in the selection as no PID requirements have been applied in any previous step, and thus an important contribution from misidentified backgrounds is expected to populate the data sample. The signal sample is characterised by a combination of B0 → D*(2007)0K+π− and Bs0 → D*(2007)0K−π+ MC samples, while the background sample consists of misidentified background MC from B0 → D*(2007)0K+K−, B0(s) → D*(2007)0K+K− and B0 → D*(2007)0π+π− decays. The BDT includes a total of 16 variables as described in Table 6.6, covering kinematic, topological and PID properties. A detailed description of all the variables is given in Sect. 11.3.

In order for the BDT to be used to select signal events in the data sample, reasonable agreement in all variables between the MC and data samples is required. However, it is well known that the PID variables in MC do not provide a good description of the response in LHCb data. To overcome this, the PID variables in all MC samples are resampled using the PIDCORR package [11, 12]. The PIDCORR package utilises the momentum (p) and pseudorapidity (η) of each final state particle to determine the PID variables using a control data sample of D*+ → D0π+ decays as a reference template, hence resampling the PID variables of the candidate while using the PID variables in the MC to retain the relevant correlations in the variables used in the training. The remaining variables are expected to show reasonable agreement between the MC and data samples, as has been checked in Appendix B, where the signal MC samples are compared with background-subtracted distributions from data, obtained using the extended sWeights, as described in Chap. 9.
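The idea of correcting a simulated PID value against a data template while keeping its rank can be illustrated with a simple quantile mapping inside one (p, η) bin. The sketch below is a toy under stated assumptions: it is not the PIDCORR implementation, and the bin edges, template values and function names are all hypothetical.

```python
import bisect

# Hypothetical kinematic binning (not taken from the analysis)
P_EDGES = [0.0, 20e3, 50e3, 200e3]   # momentum bin edges in MeV/c
ETA_EDGES = [2.0, 3.0, 4.0, 5.0]     # pseudorapidity bin edges

def kinematic_bin(p, eta):
    """Index pair of the (p, eta) bin containing the track."""
    return (bisect.bisect_right(P_EDGES, p) - 1,
            bisect.bisect_right(ETA_EDGES, eta) - 1)

def quantile_map(value, mc_sorted, data_sorted):
    """Replace a simulated PID value by the calibration-data value at the
    same rank, so that the ordering of simulated values is preserved."""
    rank = bisect.bisect_left(mc_sorted, value)
    idx = min(rank * len(data_sorted) // len(mc_sorted), len(data_sorted) - 1)
    return data_sorted[idx]

def correct_pid(value, p, eta, mc_templates, data_templates):
    """Look up the templates of the track's kinematic bin and map the value."""
    key = kinematic_bin(p, eta)
    return quantile_map(value, mc_templates[key], data_templates[key])

# Toy sorted templates for a single kinematic bin
mc_templates = {(1, 1): [0.0, 1.0, 2.0, 3.0]}
data_templates = {(1, 1): [10.0, 20.0, 30.0, 40.0]}

corrected = correct_pid(1.0, p=30e3, eta=3.5,
                        mc_templates=mc_templates,
                        data_templates=data_templates)
print(corrected)
```

Because the mapping is monotonic within a bin, a track with a higher simulated PID value never receives a lower corrected value, which is one simple way to retain correlations with the other training variables.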
The approach of using the PID variables in a multivariate analysis is chosen in preference to rectangular cuts, since the discrimination against misidentified backgrounds is enhanced by exploiting the correlations between PID and kinematic variables. The discriminating power of each variable is shown in Fig. 6.7. Checks for overtraining have been performed using Kolmogorov–Smirnov tests, as shown in Fig. 6.8.

The control channel has different particles in the final state, and hence is expected to be affected by different misidentified backgrounds. Therefore a different BDT has been trained, using B0 → D*(2007)0π+π− MC to characterise the signal and a combination of B0 → D*(2007)0K+π− and Bs0 → D*(2007)0K−π+ MC to account for the possible misidentified backgrounds. The variables used in the control channel BDT are the same as those used in the signal channel BDT.

The output of the MisID background BDT is shown in Fig. 6.9 for the signal and background samples. When applied to the data, all events with a BDT response lower than 0.2 are rejected initially, to significantly reduce the misidentified background with a minimal loss of signal events. The optimal working point for this requirement on the BDT response is computed in parallel with that on the response of the combinatorial background BDT described in the following subsection.


Table 6.6 Variables used for the training of the misidentified and combinatorial background BDTs. All variables are common for the BDTs trained for both D*(2007)0 decay modes, unless specified otherwise. All variables used have been computed after a mass constraint on the D meson.

Misidentified background BDT
Particle                      Variables
B0(s)                         pT, χ2FD from PV, min χ2IP, (χ2/dof)vertex, cos θdir
D0                            χ2FD from PV, D PID, χ2ZFD, χ2TFD
D0 daughters combination      max(χ2IP), min(χ2IP)
(Kπ) pair from B0(s)          max(χ2IP), K PIDK, K PIDpi, π PIDK, π PIDpi

Combinatorial background BDT
Particle                      Variables
B0(s)                         pT, χ2FD from PV, min χ2IP, (χ2/dof)vertex, cos θdir, χ2DTF
D0                            χ2FD from PV, D PID, χ2ZFD, χ2TFD
D0 daughters combination      max(χ2IP), min(χ2IP)
(Kπ) pair from B0(s)          max(χ2IP)
γ/π0                          Average confidence level, pT
Isolation variables           Cone 1 pT asymmetry, Cone 2 pT asymmetry, Cone 3 pT asymmetry


Fig. 6.7 Power of the variables used in the MisID background BDT training to discriminate against misidentified backgrounds, for (top) signal and (bottom) control channels

Combinatorial background BDT

A second BDT is trained that specifically targets the combinatorial background. To train this BDT, a combination of B0 → D*(2007)0K+π− and Bs0 → D*(2007)0K−π+ MC is used as the signal sample, and data candidates with reconstructed B0(s) mass above 5450 MeV/c2 or below 5150 MeV/c2 are used to characterise the combinatorial background. The samples have been split according to the D*(2007)0 decay channel; the two channels are treated independently and used to train separate BDTs. This approach enables the inclusion of variables specific to each D*(2007)0 decay mode, such as the γ or π0 transverse momenta or the γ confidence level (CL). A total of 18 variables are included in the BDT, chosen for their discriminating power. Care is also taken to avoid the use of variables which are correlated strongly with position in the Dalitz plot or the B0(s) reconstructed mass, as these could bias the resulting distributions. The variables used include kinematic, topological and isolation variables, as listed in Table 6.6, with detailed descriptions for all variables given in Sect. 11.3.

As for the previous BDT, agreement between MC and data across all the variables is studied after the selection process; this study is presented in Appendix B. The discriminating power for each variable is shown in Fig. 6.10. The distribution of the combinatorial background BDT response, for signal and background samples, is shown in Fig. 6.11. Finally, a Kolmogorov–Smirnov test is used to verify the absence of overtraining, as shown in Fig. 6.12.

Fig. 6.8 Cumulative distribution of the MisID background BDT output for (top) signal and (bottom) control channels. The Kolmogorov–Smirnov statistics between the training and testing samples are also included. These are expected to be comparable to $\sqrt{(N+M)/(N \cdot M)}$ in the absence of overtraining, where N and M refer to the number of events in the training and testing samples, respectively
[Plots: A.U. versus MisID BDT output; curves: Signal Train sample, Signal Test sample, Bg Train sample, Bg Test sample. Signal channels: K-S statistic (Signal) = 0.0023, $\sqrt{(N+M)/(N \cdot M)}$ (Signal) = 0.0035; K-S statistic (Background) = 0.0010, $\sqrt{(N+M)/(N \cdot M)}$ (Background) = 0.0014. Control channel: K-S statistic (Signal) = 0.0045, $\sqrt{(N+M)/(N \cdot M)}$ (Signal) = 0.0049; K-S statistic (Background) = 0.0015, $\sqrt{(N+M)/(N \cdot M)}$ (Background) = 0.0024]

Fig. 6.9 MisID background BDT distributions for signal and background samples obtained for the (left) signal and (right) control channels. The samples are split into training and testing samples to check for over-training

Fig. 6.10 Power of the different observables, used in the combinatorial background BDT training, to discriminate between signal and background for (top) D*(2007)0 → D0γ and (bottom) D*(2007)0 → D0π0 modes

Fig. 6.11 Combinatorial background BDT distributions for signal and background samples, obtained for the (left) D*(2007)0 → D0γ and (right) D*(2007)0 → D0π0 modes. The samples are split into training and testing samples to check for over-training

BDT Cut Optimisation

Requirements on the outputs of both BDTs are optimised simultaneously in order to obtain the highest possible signal significance. This is done by maximising the Figure of Merit (FoM),

$$\mathrm{FoM} = \frac{S}{\sqrt{S+B}}, \qquad (6.1)$$

where S and B are defined as the number of expected signal and background events in the signal regions [5229.65, 5329.65] MeV/c2 for B0 decays and [5316.88, 5416.88] MeV/c2 for Bs0 decays, defined as ranges around the known B0(s) masses [6]. In order to obtain reference values of S and B, from which values at different cut values can be obtained to evaluate the FoM, preliminary fits are performed to the data. These are done with relatively tight BDT cuts (BDTMisID > 0.9 and BDTComb > 0.9) to obtain the reference estimates of S, and with no BDT cuts applied to obtain the reference estimates of B.

The simple fit model used to obtain the reference estimate of S includes a Gaussian component to describe the signal plus an exponential distribution that accounts for all background components. For the reference estimation of B, exponential and linear fits are tested on the upper mass sideband of the data sample with no BDT requirements applied. The model that fits best in each channel is used to obtain the B estimate. The results of these fits are shown in Figs. 6.13 and 6.14. The fit models used here are far more basic than those described in Chap. 8, and are not expected to give a proper representation of the data, but simply to give a reasonable estimate of the signal and background yields so that the FoM can be computed in a reliable manner.

The S value for each set of BDT cuts is extrapolated using the efficiencies from signal MC. Similarly, the B value is obtained by scaling the reference value using the given cut retention rate in the sideband window with reconstructed B0(s) mass above 5450 MeV/c2.
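The extrapolation of S and B to an arbitrary pair of BDT requirements can be written compactly as follows. The notation is introduced here for illustration and is an assumption rather than the author's: $c$ denotes a pair of requirements on the two BDT outputs, $c_{\mathrm{ref}}$ the tight reference requirements (both outputs above 0.9), $\varepsilon_{\mathrm{MC}}$ the signal MC efficiency and $r_{\mathrm{SB}}$ the retention rate in the upper mass sideband.

```latex
\begin{align}
  S(c) &= S_{\mathrm{ref}} \,
          \frac{\varepsilon_{\mathrm{MC}}(c)}{\varepsilon_{\mathrm{MC}}(c_{\mathrm{ref}})}, \\
  B(c) &= B_{\mathrm{ref}} \, r_{\mathrm{SB}}(c), \\
  \mathrm{FoM}(c) &= \frac{S(c)}{\sqrt{S(c) + B(c)}} .
\end{align}
```

Here $S_{\mathrm{ref}}$ is the signal yield fitted with the reference requirements applied and $B_{\mathrm{ref}}$ is the background estimate obtained with no BDT requirements, so that $r_{\mathrm{SB}} = 1$ when no requirement is applied.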
Fig. 6.12 Cumulative distribution of the combinatorial background BDT output for (top) D*(2007)0 → D0γ and (bottom) D*(2007)0 → D0π0 modes. The Kolmogorov–Smirnov statistics between the training and testing samples are also included. In the absence of overtraining these are expected to be comparable to $\sqrt{(N+M)/(N \cdot M)}$, where N and M refer to the number of events in the training and testing samples, respectively
[Plots: A.U. versus Combinatorial BDT output; curves: Signal Train sample, Signal Test sample, Bg Train sample, Bg Test sample. D*0 → D0γ: K-S statistic (Signal) = 0.0036, $\sqrt{(N+M)/(N \cdot M)}$ (Signal) = 0.0038; K-S statistic (Background) = 0.0065, $\sqrt{(N+M)/(N \cdot M)}$ (Background) = 0.0039. D*0 → D0π0: K-S statistic (Signal) = 0.0054, $\sqrt{(N+M)/(N \cdot M)}$ (Signal) = 0.0104; K-S statistic (Background) = 0.0114, $\sqrt{(N+M)/(N \cdot M)}$ (Background) = 0.0103]

Any mis-estimation of the signal and background yields from this fit will result in the obtained requirements not being as optimal as they could be, but will not bias the results of the analysis. Moreover, since the FoM distribution tends to have a rather broad plateau around its peak, the loss of sensitivity due to not being perfectly optimal is expected to be minimal.

In order to reduce systematic uncertainties on the final results, channels that share the same BDT are chosen to use the same cut. However, since the values of S and B are different for each of the decay modes, calculation of the figure of merit and determination of the optimal cut values has been done independently for each channel. This allows verification that the optimal requirements for samples that use the same BDT are consistent within uncertainty, as seen in Fig. 6.15. The final requirements on the BDT responses are chosen to be:

• for both B0(s) → D*(2007)0K±π∓ channels, misID background BDT output > 0.7,
• for the B0 → D*(2007)0π+π− channel, misID background BDT output > 0.25,
• for all D*(2007)0 → D0γ modes, combinatorial background BDT output > 0.7,
• for all D*(2007)0 → D0π0 modes, combinatorial background BDT output > 0.6.

The signal efficiencies from MC and data retention rates for these cuts are shown in Table 6.7.
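The simultaneous optimisation of the two requirements can be sketched as a grid scan of the FoM. The example below is a toy under stated assumptions: the efficiency and retention functions, the reference yields and the grid spacing are hypothetical stand-ins, not values or code from the analysis.

```python
import math

def fom(s, b):
    """Figure of merit S / sqrt(S + B); zero when no events are expected."""
    return s / math.sqrt(s + b) if s + b > 0 else 0.0

# Hypothetical smooth stand-ins for the signal MC efficiency and the
# sideband retention rate as functions of the two BDT requirements.
def signal_eff(c_misid, c_comb):
    return (1.0 - c_misid ** 2) * (1.0 - c_comb ** 2)

def sideband_retention(c_misid, c_comb):
    return (1.0 - c_misid) ** 3 * (1.0 - c_comb) ** 3

S_REF = 200.0    # hypothetical signal yield fitted with both outputs > 0.9
B_REF = 1.0e5    # hypothetical background yield with no BDT requirements
EFF_REF = signal_eff(0.9, 0.9)

def scan(cuts):
    """Return (best FoM, MisID cut, Comb cut) over a square grid of cuts."""
    best = None
    for c_misid in cuts:
        for c_comb in cuts:
            # Scale the reference yields to the requirements under test
            s = S_REF * signal_eff(c_misid, c_comb) / EFF_REF
            b = B_REF * sideband_retention(c_misid, c_comb)
            value = fom(s, b)
            if best is None or value > best[0]:
                best = (value, c_misid, c_comb)
    return best

cuts = [i / 20 for i in range(20)]   # 0.00, 0.05, ..., 0.95
best = scan(cuts)
print("best FoM = %.2f at MisID > %.2f, Comb > %.2f" % best)
```

Running the same scan separately for each channel, as described in the text, makes it possible to compare the preferred working points of channels that share a BDT before fixing a common requirement.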

Fig. 6.13 Simple fits to obtain reference values for the signal (S) estimates used in the FoM optimisation. Distributions for (top) B0 → D*(2007)0K+π−, (middle) Bs0 → D*(2007)0K−π+ and (bottom) B0 → D*(2007)0π+π− candidates satisfying BDTMisID > 0.9 and BDTComb > 0.9 are shown with the results of the fits superimposed, separately for (left) D*(2007)0 → D0γ and (right) D*(2007)0 → D0π0 modes
[Plots: events/(10 MeV/c2) versus m(D*(2007)0K+π−), m(D*(2007)0K−π+) and m(D*(2007)0π+π−) [MeV/c2]; legend: Data, Signal model, Background model, Full model]

Fig. 6.14 Linear and exponential fits to obtain reference values for the background (B) estimates used in the FoM optimisation. Distributions for (top) B0 → D*(2007)0K+π−, (middle) Bs0 → D*(2007)0K−π+ and (bottom) B0 → D*(2007)0π+π− candidates are shown with the results of the fits superimposed, separately for (left) D*(2007)0 → D0γ and (right) D*(2007)0 → D0π0 modes
[Plots: events/(10 MeV/c2) versus m(D*(2007)0K+π−), m(D*(2007)0K−π+) and m(D*(2007)0π+π−) [MeV/c2]; legend: Data, Exponential model, Linear model]

Fig. 6.15 Optimisation of the FoM with respect to the requirements on the misidentified and combinatorial background BDTs. FoM distributions are shown for (top) B0 → D*(2007)0K+π−, (middle) Bs0 → D*(2007)0K−π+ and (bottom) B0 → D*(2007)0π+π− channels, separately for (left) D*(2007)0 → D0γ and (right) D*(2007)0 → D0π0 modes
[Plots: Comb. background BDT cut versus MisID background BDT cut, with the FoM shown on the colour scale]

6.6 Yields

Following all selection requirements, the majority of the background is removed from the data samples, leaving the signal peaks clearly visible, as can be seen in Fig. 6.16. However, some backgrounds remain in the data sample. Thus, to determine the signal yields reliably it is necessary to understand the sources of background properly, and to model them appropriately in fits, as will be described in the following sections. The total numbers of selected events in the B0(s) candidate mass range shown in Fig. 6.16 and in signal windows of ±25 MeV/c2 around the appropriate known B0(s) mass [6] are presented in Table 6.8, giving a reference point for the yields in each channel.

Since the selection requirements do not exclude the possibility that a selected event contains a candidate in more than one final state, we explicitly check for such duplicate candidates. A large rate of duplicated candidates could lead to correlations between the yields that would need to be accounted for in the error propagation when the results are calculated. In Table 6.9 the rates of duplicate candidates between all channels are shown. The largest rates of ∼5% occur for candidates for D*(2007)0 → D0π0 decays that also appear in the sample with the same final state except with D*(2007)0 → D0γ. The duplication in the other direction is smaller, at around 1%. The duplication of candidates between other channels is around 1% or below. This level of duplicate candidates is considered acceptable, but since it provides a source of potential bias in the results it will be accounted for as a systematic uncertainty in Chap. 10.


Table 6.7 Signal efficiency and data retention of the optimal BDT cuts

Signal efficiency
Channel                                        MisID BDT (%)   Comb. BDT (%)   Combined (%)
B0 → D*(2007)0K+π−, D*(2007)0 → D0γ            68.23           70.72           49.63
B0 → D*(2007)0K+π−, D*(2007)0 → D0π0           73.83           82.87           59.45
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0γ           68.30           72.13           50.62
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0π0          73.87           84.18           63.04
B0 → D*(2007)0π+π−, D*(2007)0 → D0γ            93.38           70.65           66.83
B0 → D*(2007)0π+π−, D*(2007)0 → D0π0           96.02           78.73           75.96

Data retention rate
Channel                                        MisID BDT (%)   Comb. BDT (%)   Combined (%)
B0 → D*(2007)0K+π−, D*(2007)0 → D0γ            2.74            16.10           0.47
B0 → D*(2007)0K+π−, D*(2007)0 → D0π0           2.79            24.13           0.70
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0γ           5.30            15.75           1.06
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0π0          5.72            23.58           1.62
B0 → D*(2007)0π+π−, D*(2007)0 → D0γ            61.65           16.21           11.63
B0 → D*(2007)0π+π−, D*(2007)0 → D0π0           67.25           24.11           17.95

Moreover, it is also possible for one event to contain several candidates reconstructed in the same final state, referred to as multiple candidates. The rates for multiple candidates in the data are shown in Table 6.10. These rates are larger than typically seen in LHCb analyses involving only charged particles in the final state, due to the possibility of signal decays being both correctly reconstructed and misreconstructed (as discussed in Chap. 7). The rates of multiple candidates are seen to be rather similar in all D*(2007)0 → D0γ final states (3–4% in the full mass range) and in all D*(2007)0 → D0π0 final states (around 8% in the full mass range), and therefore it is expected that any associated effects in the determinations of the yields will cancel out, to a good approximation, when the ratios are calculated. Nonetheless, this is a source of systematic uncertainty that is considered in Chap. 10.
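The two rates discussed here, duplicate candidates across samples and multiple candidates within a sample, can be computed from event identifiers alone. The following is a minimal sketch assuming each candidate is labelled by a hypothetical (run, event) pair; the sample contents are toy values and the function names are illustrative.

```python
from collections import Counter

def multiple_candidate_rate(candidates):
    """Fraction of selected events that contain more than one candidate.
    `candidates` holds one (run, event) identifier per candidate."""
    per_event = Counter(candidates)
    if not per_event:
        return 0.0
    n_multiple = sum(1 for n in per_event.values() if n > 1)
    return n_multiple / len(per_event)

def duplicate_rate(sample, other):
    """Fraction of candidates in `sample` whose event also provides
    a candidate in `other`."""
    if not sample:
        return 0.0
    other_events = set(other)
    return sum(1 for event in sample if event in other_events) / len(sample)

# Toy samples of (run, event) identifiers, one entry per candidate
gamma_sample = [(1, 10), (1, 11), (1, 11), (2, 5)]
pi0_sample = [(1, 11), (3, 7)]

print(f"multiple-candidate rate (gamma) = {multiple_candidate_rate(gamma_sample):.2f}")
print(f"pi0 candidates also in gamma sample = {duplicate_rate(pi0_sample, gamma_sample):.2f}")
print(f"gamma candidates also in pi0 sample = {duplicate_rate(gamma_sample, pi0_sample):.2f}")
```

The duplicate rate is not symmetric between two samples, since it is normalised to the size of the first sample; this is why the two directions are quoted separately.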

Fig. 6.16 Distribution of the reconstructed B0(s) mass after all selection requirements are applied, for (left) B0 → D*(2007)0K+π−, (middle) Bs0 → D*(2007)0K−π+ and (right) B0 → D*(2007)0π+π− decays, separately for (top) D*(2007)0 → D0γ and (bottom) D*(2007)0 → D0π0 modes
[Plots: events/(10 MeV/c2) versus (D*(2007)0K+π−), (D*(2007)0K−π+) and (D*(2007)0π+π−) mass [MeV/c2]]


Table 6.8 Number of candidates retained for each channel after all selection requirements are applied, together with the numbers inside signal windows defined as ±25 MeV/c2 relative to the appropriate known B0(s) mass

Channel                                        Total candidates   In signal region
B0 → D*(2007)0K+π−, D*(2007)0 → D0γ            5326               1353
B0 → D*(2007)0K+π−, D*(2007)0 → D0π0           1257               390
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0γ           12628              4187
Bs0 → D*(2007)0K−π+, D*(2007)0 → D0π0          3121               1122
B0 → D*(2007)0π+π−, D*(2007)0 → D0γ            83490              18441
B0 → D*(2007)0π+π−, D*(2007)0 → D0π0           19556              5014

Table 6.9 Percentage of candidates in the sample indicated by the column heading also present in the sample indicated by the row heading
Column headings: B0, γ (%); B0, π0 (%); Bs0, γ (%); Bs0, π0 (%); Control γ (%); Control π0 (%)
Row headings: B0, γ; B0, π0; Bs0, γ; Bs0, π0; Control γ; Control π0

1.09 4.61 0.12 0.00 0.08 0.00

0.30 0.00

0.00 0.06 0.00 0.07

5.64 0.14 0.01

0.00 0.15 1.39

1.27 0.00 0.90 0.00

0.00 0.05

0.00 1.19 0.02 0.32 1.24

5.33

Table 6.10 Percentage of selected events with more than one candidate in each data sample. These rates of multiple candidates are given both in the complete selected samples and in the signal regions

Channel        Multiple candidates in dataset (%)   Multiple candidates in signal region (%)
B0, γ          3.17                                 1.76
B0, π0         8.59                                 7.09
Bs0, γ         2.98                                 1.58
Bs0, π0        8.84                                 7.09
Control γ      4.03                                 2.08
Control π0     8.98                                 7.22

References

1. LHCb collaboration, Aaij R et al (2014) Observation of overlapping spin-1 and spin-3 D̄0K− resonances at mass 2.86 GeV/c2. Phys Rev Lett 113:162001, arXiv:1407.7574
2. LHCb collaboration, Aaij R et al (2014) Dalitz plot analysis of Bs0 → D̄0K−π+ decays. Phys Rev D90:072003, arXiv:1407.7712
3. LHCb collaboration, Aaij R et al (2015) Amplitude analysis of B0 → D̄0K−π+ decays. Phys Rev D92:012012, arXiv:1505.01505
4. LHCb collaboration, Aaij R et al (2016) Constraints on the unitarity triangle angle γ from Dalitz plot analysis of B0 → DK+π− decays. Phys Rev D93:112018, Erratum ibid. D94:079902, https://doi.org/10.1103/PhysRevD.94.079902, arXiv:1602.03455
5. Hulsbergen WD (2005) Decay chain fitting with a Kalman filter. Nucl Instrum Meth A552:566, arXiv:physics/0503191
6. Particle Data Group, Zyla PA et al (2020) Review of particle physics. Prog Theor Exp Phys 2020:083C01, http://pdg.lbl.gov/
7. Williams M (2012) Generic D from B selections. LHCb-INT-2012-002, http://cdsweb.cern.ch/search?p=LHCb-INT-2012-002&f=reportnumber&action_search=Search&c=LHCb+Internal+Notes
8. Feindt MT, Kerzel U (2006) The NeuroBayes neural network package. Nucl Instrum Meth A559:190
9. Pivk M, Le Diberder FR (2005) sPlot: a statistical tool to unfold data distributions. Nucl Instrum Meth A555:356, arXiv:physics/0402083
10. Chen T, Guestrin C, XGBoost: a scalable tree boosting system. arXiv:1603.02754
11. Anderlini L et al (2016) The PIDCalib package. LHCb-PUB-2016-021, http://cdsweb.cern.ch/search?p=LHCb-PUB-2016-021&f=reportnumber&action_search=Search&c=LHCb+Notes
12. Aaij R et al, Selection and processing of calibration samples to measure the particle identification performance of the LHCb experiment in Run 2. EPJ Tech Instrum 6:1, arXiv:1803.00824

Chapter 7

Characterization of Backgrounds

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_7

The selection procedure succeeds in rejecting the majority of candidates that do not contain a signal decay. Nonetheless, several sources of significant background remain. As seen in Fig. 6.16, despite a clear signal peak, other components can be seen across the whole range of each of the mass spectra. In this section these different background sources, as well as the techniques used to characterise them, are described in detail. These components can be split into the following categories:

• Combinatorial background: Candidates which do not originate primarily from the decay products of a single b-hadron decay, but instead combine tracks from multiple sources. Potentially this could include pions and kaons that originate from the primary vertex, although such backgrounds are expected to be strongly suppressed by the selection requirements. More commonly, this is expected to be composed of particles originating from different b- or c-hadron decays which have been combined with other particles from the rest of the event.
• Partially reconstructed backgrounds: Candidates originating from a b-hadron decay to a final state that includes all the particles in the signal, plus an extra particle (usually a pion). The missing extra particle, which is not included in the reconstruction of the signal candidate, causes the B0(s)-candidate mass distribution to be shifted to lower values, and hence these backgrounds populate the leftmost part of the distribution. Since the missing particle can be charged or neutral, partially reconstructed backgrounds can originate from B+, as well as B0(s), decays. If the missing particle is more massive than a pion (e.g. a kaon) the shift in B0(s) candidate mass will be large enough that the background distribution will fall outside of the range of invariant mass considered by the fit.
• Misidentified backgrounds: Special emphasis in the selection has been placed on reducing background from misidentified decays. However, it is still possible that misidentified backgrounds remain, in which the B0(s) hadron decays to a final state that has the same number of charged tracks as the signal, but in which one of these tracks is of a different species than in the signal.¹ Misidentification of protons is also possible, so potential backgrounds from Λ0b decays must also be considered. The B0(s) candidate mass distributions for misidentified backgrounds are shifted and smeared due to the use of the wrong mass hypothesis in the reconstruction. This category also includes the so-called cross-feed backgrounds, where a decay that is signal in one final state is background in another.
• Partially combinatorial backgrounds: It is possible to have backgrounds where the majority, but not all, of the particles comprising the signal candidate originate from a b-hadron decay, and the remaining particles come from some other (random) source. These will retain some b-hadron properties, the distributions of which will typically be smeared due to the random particle. In the case that the random additional particle is soft, the smearing will be correspondingly smaller. The most important source of partially combinatorial background is due to B0(s) → Dh+h− decays in which an extra γ or π0 is used to build a fake D*(2007)0 candidate. Additionally, a contribution from B+ → D*(2007)0h+ decays, where an extra charged track is included to build a signal candidate, must also be considered.

In addition to having candidates from background decays, in this section we also consider cases where signal decays are not correctly reconstructed. This is necessary due to the very challenging reconstruction of the D*(2007)0 meson, in particular due to its neutral decay products. Depending on the nature of this misreconstruction of the signal, we define two different categories:

• Misreconstructed signal: Candidates in which the neutral decay product (γ or π0) is not the real neutral from the signal decay but instead is taken from the rest of the event. This category is present in both D*(2007)0 → D0γ and D*(2007)0 → D0π0 channels, but is more important in the latter, due to the challenges associated with reconstructing low-momentum π0 mesons in a typical LHCb event.
• Wrong D*(2007)0 decays: There is a non-negligible probability for a D*(2007)0 → D0γ decay to be reconstructed as a D*(2007)0 → D0π0 candidate, combining the signal γ with a random soft photon. Similarly, and more often, it is possible for one of the photons from a signal π0 decay to be used to build a D*(2007)0 → D0γ candidate.

These two categories are characterised using signal MC samples, in which the candidate final state particles are matched with the generated decay products, as described below. These categories could be considered as part of the signal rather than as backgrounds, but since they have much worse resolution than the correctly reconstructed signal decays they are treated as backgrounds in this analysis. We note that similar subcategories could also be defined for several of the background sources, such as the misidentified backgrounds. However, since these components are accounted for as background sources in this analysis, this distinction is only applied to signal for simplicity.

¹ In principle this could also occur with more than one misidentified track, but since the misidentification rates are low such backgrounds are expected to be negligible.


In the remainder of this section we discuss the properties of each of these background categories in turn, with particular attention to their distributions in B(s)0-candidate mass and in the Dalitz plot. These will be the crucial elements for the measurement of the relative branching fraction of the signal decays.

7.1 Combinatorial Background

As is seen in many LHCb analyses, combinatorial background is expected to have a smooth B(s)0-candidate mass distribution. Since there is a reasonably large release of momentum in the signal decays under study, and since there are typically more low-momentum tracks produced in pp collisions, the distribution is expected to fall as the B(s)0-candidate mass increases.

Since the combinatorial background is expected to be dominated by combinations of particles from b and c decays, some structure is expected to be found in the two-body mass distributions (or equivalently in the projections onto Dalitz plot axes) due to candidates containing real resonances. These structures would be smeared by the imposition of the B(s)0 mass constraint in the calculation of the Dalitz plot variables, but may still be visible.

7.2 Partially Reconstructed Backgrounds

Partially reconstructed backgrounds come from B0, Bs0 and B+ decays to final states identical to the signal except for an extra π0, π+ or γ that is not included in the reconstruction of the candidate. Since the signal decay final state contains particles of the same species as the extra particle, the reconstruction of the signal candidate can be done in many different ways, depending on which particles are used and which are left out. Nonetheless, candidates in which some misreconstruction occurs on top of missing a particle (e.g. the extra π0 from a K1(1270)0 → K+π−π0 resonance is used to build a D*(2007)0 candidate, instead of the correct one) are expected to present a B(s)0 mass distribution falling out of the range of this analysis, or to be rejected in the selection procedure. Thus, only partially reconstructed decays in which the extra particle is not used to build a signal candidate are considered.

Partially reconstructed backgrounds occur through intermediate resonances of excited states. As such there is a large number of channels to be considered. However, in most cases the reconstructed B(s)0-candidate mass distribution is localised outside the mass fit region, and therefore only a small selection of them needs to be included in the final fit model. Moreover, since various different potential backgrounds result in similar shapes in the B(s)0-candidate mass distribution, only a limited number need to be included in the final fit. The B(s)0-candidate mass distributions for all considered channels can be seen in Fig. 7.1. These distributions have been obtained from RapidSim samples of one million decays each; although they are not expected to give a perfect description of the background shapes, they give a good first approximation of their distributions in the data sample for the channels with an extra charged or neutral pion. The complete list of the partially reconstructed backgrounds that are included in the fit model, and the channels to which they contribute, can be found in Table 7.1.

Fig. 7.1 Distribution of the B(s)0-candidate mass for all considered partially reconstructed backgrounds with their most relevant reconstruction method, to the (top left) B0 → D*(2007)0 K+π−, (top right) Bs0 → D*(2007)0 K−π+, and (bottom) B0 → D*(2007)0 π+π− channels. The dashed lines indicate the limit of the mass fit region

For backgrounds with an extra photon, such as B0 → D*(2007)0 η′(958), η′(958) → π+π−γ, for which the reconstructed B mass distribution extends much closer to the signal region, a more accurate model is required. In this case, a sample of one million decays has been generated using EvtGen in conjunction with RapidSim. This enables the generation of intermediate resonant states with the appropriate lineshapes. This is a crucial step in this case, as the η′(958) decay is expected to be dominated by the η′(958) → ρ0(770)γ decay, which has an m(π+π−) distribution significantly different from phase space.2 The comparison of the B mass distribution obtained with only RapidSim and the model provided by EvtGen is shown in Fig. 7.2. The latter is used in the mass fit model.

2 A detailed study of the η′(958) → ρ0(770)γ decay dynamics has been published by the BESIII collaboration [1]. The EvtGen model reproduces the distribution sufficiently well for our purposes.
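As a rough kinematic cross-check (a sketch added for illustration; the masses are approximate PDG values that are not quoted in the text, and detector resolution is ignored), the upper endpoint of the reconstructed mass when one decay product is missed follows from energy-momentum conservation in the B rest frame:

\[
m_{\text{vis}}^{\max} = m_{B} - m_{\text{miss}} .
\]

For a B0 decay this gives

\[
\text{missing } \pi^0:\quad m_{\text{vis}}^{\max} \approx 5279.7 - 135.0 \approx 5144.7~\text{MeV}/c^2,
\qquad
\text{missing } \gamma:\quad m_{\text{vis}}^{\max} \approx 5279.7~\text{MeV}/c^2 .
\]

This is consistent with the statement that backgrounds with a missing pion lie mostly below the signal peak, while a missing (massless) photon allows the distribution to reach much closer to the signal region.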


Table 7.1 Partially reconstructed backgrounds considered. Only decays with a reconstructed B(s)0-candidate mass distribution falling inside the mass fit region, as shown in Fig. 7.1, are included

Channel
  Relevant for B0 → D*(2007)0 K+π−
    B0 → D*(2007)0 K1(1270)0
  Relevant for Bs0 → D*(2007)0 K−π+
    Bs0 → D*(2007)0 K1(1270)0
  Relevant for B0 → D*(2007)0 π+π−
    B0 → D*(2007)0 a1(1260)0
    B0 → D*(2007)0 η′(958)

Fig. 7.2 Comparison of the B-candidate mass distribution for B0 → D*0 η′(958) decays generated with RapidSim (red) and with EvtGen in conjunction with RapidSim (green). The right plot shows the same distributions within the range of B(s)0-candidate mass used in the mass fit (normalised within that range)

7.3 Misidentified Backgrounds

We define misidentified backgrounds as decays with exactly the same topology as our signal channel, in which one or more of the final state particles is of a different species than in the signal channel. This particle is assigned the wrong mass hypothesis when building the signal candidate, leading to a smearing and shifting of the B(s)0-candidate mass. Depending on the true species of the wrongly identified particle, we can break down this background category into two sub-groups:

• proton misidentification: This category comprises Λb0 decays that share the same final state as the signal decay channel except that they include a p which has been reconstructed as a K− or π−, i.e. Λb0 → D*(2007)0 p π+(K+). For completeness, in this category we have also studied contributions from Λb0 → D0 p π+(K+), in which an extra γ/π0 from other sources is used to build a D*(2007)0 candidate, in addition to the p being misidentified. Misidentified backgrounds from Ξb0 decays are also possible but are negligible due to the lower Ξb0 production rate and branching fractions to the relevant final states [2]. The complete list of channels in this category can be found in Table 7.2. For each channel the selection efficiency is given with respect to the signal channel it is misidentified as, as obtained from MC. In addition to these suppression factors, the relative Λb0 production rate should also be taken into account: this has been measured [3] to be f_Λb0/(f_u + f_d) = 0.259 ± 0.018 averaged over the LHCb acceptance, and falling with b-hadron pT (and hence reduced for L0HadronTOS triggers that favour high pT). Assuming f_u = f_d, this implies an additional factor of two suppression compared to the B0 signal and normalisation modes.3 These background suppression factors are considered small enough that backgrounds with proton misidentification do not need to be included in the simultaneous fit model described in the following section. Due to the broad B(s)0-candidate mass distributions of these components, any residual contribution from this source will be absorbed by the combinatorial background model. The possibility for residual background from Λb0 decays to influence the results is, however, considered as a source of systematic uncertainty.

• kaon or pion misidentification: This category includes backgrounds from fully reconstructed B meson decays with one misidentified particle: either a charged K or π. This includes cross-feed from the B(s)0 → D*(2007)0 K±π∓ signal channels to the B0 → D*(2007)0 π+π− normalisation channel and vice versa. The complete list of backgrounds considered, as well as their relative efficiencies with respect to the corresponding signal channel, can be seen in Table 7.3. Even though the relative efficiencies are comparable to those found in the case of misidentified protons, the B(s)0-candidate mass distributions for these channels exhibit narrower peaks and therefore they are included in the simultaneous fit model described in the following section.
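The shift caused by a wrong mass hypothesis can be illustrated with a minimal two-body sketch. This is not the analysis code: the momenta are hypothetical, the masses are approximate PDG values not quoted in the text, and only a K → π swap is shown. Because the size of the shift depends on the momenta, a population of such candidates is smeared as well as shifted.

```python
import math

# Approximate PDG masses in MeV/c^2 (assumed values, not taken from the text).
M_PION = 139.57
M_KAON = 493.68


def four_vector(px, py, pz, mass):
    """Return (E, px, py, pz) for a particle with the given momentum and mass."""
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(vectors):
    """Invariant mass of a list of (E, px, py, pz) four-vectors."""
    e = sum(v[0] for v in vectors)
    px = sum(v[1] for v in vectors)
    py = sum(v[2] for v in vectors)
    pz = sum(v[3] for v in vectors)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


# Hypothetical momenta (MeV/c) for a true kaon and a companion pion.
kaon_p = (1500.0, 0.0, 20000.0)
pion_p = (-1200.0, 300.0, 15000.0)

# Mass computed with the correct hypotheses.
true_mass = invariant_mass(
    [four_vector(*kaon_p, M_KAON), four_vector(*pion_p, M_PION)]
)
# Same momenta, but the kaon is wrongly assigned the pion mass.
misid_mass = invariant_mass(
    [four_vector(*kaon_p, M_PION), four_vector(*pion_p, M_PION)]
)

# Assigning a lighter mass lowers the energy, so the mass is shifted downwards.
shift = misid_mass - true_mass
```

Swapping in the opposite direction (a true pion given the kaon mass) raises the reconstructed mass by the same argument.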
3 One should also take into account the relative branching fractions of the decays in question, but B(Λb0 → D0 p π+) = (6.3 ± 0.7) × 10^−4 [2] is comparable to B(B0 → D0 π+π−) = (8.8 ± 0.5) × 10^−4 [4].

Table 7.2 List of backgrounds with p → π− or p → K− misidentification, as well as their reconstruction and selection efficiencies with respect to the signal channels

Channel                              Rel. eff. in Dγ channel   Rel. eff. in Dπ0 channel
Background to B0 → D*(2007)0 K+π−
  Λb0 → D*(2007)0 p K+               1.2 × 10^−2               1.3 × 10^−2
  Λb0 → D0 p K+                      5.4 × 10^−3               2.1 × 10^−3
Background to Bs0 → D*(2007)0 K−π+
  Λb0 → D*(2007)0 p π+               6.4 × 10^−2               5.4 × 10^−2
  Λb0 → D0 p π+                      2.6 × 10^−2               1.5 × 10^−2
Background to B0 → D*(2007)0 π+π−
  Λb0 → D*(2007)0 p π+               3.9 × 10^−2               4.6 × 10^−2
  Λb0 → D0 p π+                      1.6 × 10^−2               9.9 × 10^−3

Table 7.3 List of backgrounds with K/π misidentification, as well as their reconstruction and selection efficiency with respect to the signal channels

Channel                              Rel. eff. in Dγ channel   Rel. eff. in Dπ0 channel
Background to B0 → D*(2007)0 K+π−
  B0 → D*(2007)0 K+K−                3.7 × 10^−3               4.2 × 10^−3
  B0 → D*(2007)0 π+π−                2.0 × 10^−2               4.2 × 10^−2
  Bs0 → D*(2007)0 K+K−               1.2 × 10^−2               2.5 × 10^−3
Background to Bs0 → D*(2007)0 K−π+
  B0 → D*(2007)0 K+K−                4.5 × 10^−3               6.9 × 10^−3
  B0 → D*(2007)0 π+π−                7.3 × 10^−3               7.7 × 10^−3
  Bs0 → D*(2007)0 K+K−               1.5 × 10^−2               8.0 × 10^−3
Background to B0 → D*(2007)0 π+π−
  B0 → D*(2007)0 K+π−                1.8 × 10^−1               3.0 × 10^−2
  Bs0 → D*(2007)0 K−π+               8.7 × 10^−2               1.1 × 10^−1

Since the particle misidentification rate is relatively small, contributions from events in which more than one particle has been misidentified are considered to be negligible. Also, decays in which one of the D0 decay products is misidentified will be removed effectively by selection requirements, in particular that on the D0-candidate mass, and therefore do not need to be considered further.

In order to determine the B(s)0-candidate mass distributions of these background components, we use a set of MC samples which are generated with a flat distribution across the square Dalitz plot (see Table 5.5 for details). The square Dalitz plot (SDP) is defined by the following variables (see, e.g. Ref. [5]):

\[
m' = \frac{1}{\pi}\arccos\left(2\,\frac{m_{12} - m_1 - m_2}{m_B - m_1 - m_2 - m_3} - 1\right),
\qquad
\theta' = \frac{1}{\pi}\,\theta_{12},
\tag{7.1}
\]

where m_i is the mass of the particle numbered i (and equivalently for m_B), m_12 is the invariant mass of the 12 pair, and θ_12 is the helicity angle of the "12" system, i.e. the angle between the particle numbered 1 and that numbered 3 in the rest frame of the 12 pair. This formalism enables the description of the Dalitz plot phase space by bounded variables, m′ and θ′ ∈ [0, 1]. The definition is, however, dependent on the ordering of the particles: more than one SDP can be defined for a given three-body decay. The ordering of the particles that defines the SDPs used in the analysis to generate MC samples is given in Table 7.4.

Table 7.4 Particle ordering convention used to define the square Dalitz plots within which the Monte Carlo samples for these decays are generated with a flat distribution

Channel                        1             2     3
B0 → D*(2007)0 K+π−            D*(2007)0     K+    π−
Bs0 → D*(2007)0 K−π+           D*(2007)0     K−    π+
B0 → D*(2007)0 K+K−            D*(2007)0     K+    K−
B0 → D*(2007)0 π+π−            D*(2007)0     π−    π+
Bs0 → D*(2007)0 K+K−           D*(2007)0     K+    K−

The underlying phase space distribution affects both the reconstructed B(s)0-candidate mass shape and the misidentification rate, so all misidentified background samples need to be reweighted to more realistic models (compared to the flat SDP distribution used in the generation). Knowledge of the resonant structure of these channels is limited; however, reasonable approximations are used to describe their decay models, as described below. It should be stressed that these models are used only to determine the shapes of the misidentified components in the mass fit, as well as their relative selection efficiencies. Since the correlation between the phase space distribution and both the B(s)0-candidate mass distribution and the selection efficiency is relatively weak, it is not necessary to have highly accurate models.

• B0 → D*(2007)0 π+π−: This is the only B(s)0 → D*(2007)0 h+h− channel for which a full (albeit unpublished) amplitude analysis exists [6]. However, since the contribution from this and the other misidentified backgrounds is expected to be small, only an approximate description of its decay distribution is needed. Thus, only the most relevant resonances, shown in Table 7.5, are included. Interference effects are not included in these models, so the decay fractions quoted in Table 7.5 and following tables sum to 100% by definition.

Table 7.5 Model for B0 → D*(2007)0 π+π− decays

Resonant channel                 Decay fraction
B0 → D*(2007)0 ρ(770)            0.3419
B0 → D*(2007)0 f2(1270)          0.0962
B0 → D1(2420)− π+                0.3373
B0 → D2*(2460)− π+               0.2246

• Bs0 → D*(2007)0 K+K−: No previous results exist on the amplitude model for this three-body decay channel, although LHCb has published an observation of Bs0 → D*(2007)0 φ decays [7]. A model similar to that for B0 → D*(2007)0 π+π− decays is used, replacing X → π+π− and D*(2007)0 π− resonant states with corresponding X → K+K− and D*(2007)0 K− structures. The model is given in Table 7.6.

• B0 → D*(2007)0 K+K−: This channel has also not been observed yet and thus no results exist on its resonant structure. The assumed decay model is similar to that for the Bs0 → D*(2007)0 K+K− channel, but with D*(2007)0 K− resonances suppressed by a factor of 10, following the trend observed for D0K− resonances in B0 → D0 K+K− and Bs0 → D0 K+K− decays [8]. Similarly, the allowed K+K− resonances differ between the two cases, as the B0 decay is not expected to have any s s̄ component in its final state. The complete model is given in Table 7.7.

Table 7.6 Model for Bs0 → D*(2007)0 K+K− decays

Resonant channel                 Decay fraction
Bs0 → D*(2007)0 φ(1020)          0.3419
Bs0 → D*(2007)0 f2′(1525)        0.0962
Bs0 → Ds1(2536)− K+              0.3373
Bs0 → Ds2*(2573)− K+             0.2246

Table 7.7 Model for B0 → D*(2007)0 K+K− decays

Resonant channel                 Decay fraction
B0 → D*(2007)0 a0(980)           0.7366
B0 → D*(2007)0 f2′(1525)         0.2072
B0 → Ds1(2536)− K+               0.0337
B0 → Ds2*(2573)− K+              0.0225

• B0 → D*(2007)0 K+π−: Even though this channel has not been measured, some inspiration can be taken from a previous amplitude analysis of B0 → D0 K+π− decays [9]. The Kπ resonant structures are expected to be the same as for B0 → D0 K+π− decays, with only the most relevant states included in the model. The D*(2007)0 π− resonances are chosen to be the same as in the B0 → D*(2007)0 π+π− decay, maintaining the ratio between them. The complete model is given in Table 7.8.

Table 7.8 Model for B0 → D*(2007)0 K+π− decays

Resonant channel                 Decay fraction
B0 → D*(2007)0 K*(892)0          0.2860
B0 → D*(2007)0 K0*(1430)0        0.2140
B0 → D1(2420)− K+                0.3000
B0 → D2*(2460)− K+               0.2000

• Bs0 → D*(2007)0 K−π+: A similar approach is used to describe this channel, following the previous amplitude analysis of Bs0 → D0 K−π+ decays [10]. The ratio of contributions from Kπ resonances is taken from this previous work, considering again only the most relevant states. Similarly, the D*(2007)0 K− resonances in the model are chosen to be the same as those in the Bs0 → D0 K−K+ model. The complete model is shown in Table 7.9.

Table 7.9 Model for Bs0 → D*(2007)0 K−π+ decays

Resonant channel                 Decay fraction
Bs0 → D*(2007)0 K*(892)0         0.2579
Bs0 → D*(2007)0 K0*(1430)0       0.1802
Bs0 → Ds1(2536)− π+              0.3373
Bs0 → Ds2*(2573)− π+             0.2246

The square Dalitz plots for these models, following the ordering conventions of the MC generated samples shown in Table 7.4, are shown in Fig. 7.3. The samples for these models have been produced using the EvtGen package with the BParticleGun generator, which simulates only the decay chain of interest and does not simulate the rest of the pp collision event, therefore greatly enhancing the generation speed. Since only the square Dalitz plot distributions of these samples are used in the analysis, this is an effective way to generate the large samples needed to perform the SDP weighting.

Fig. 7.3 Square Dalitz plot distribution (θ′ versus m′) for the models used to describe misidentified backgrounds: (top left) B0 → D*(2007)0 π+π−, (top right) Bs0 → D*(2007)0 K+K−, (middle left) B0 → D*(2007)0 K+K−, (middle right) B0 → D*(2007)0 K+π− and (bottom) Bs0 → D*(2007)0 K−π+

The weights are obtained by normalising to the distributions obtained from another set of samples generated with the BParticleGun generator, which replicate the flat square Dalitz plot distributions in the full-simulation samples. The weights are determined from the BParticleGun samples, and are calculated as

\[
W(m', \theta') = \frac{f_{\text{New}}(m', \theta')}{f_{\text{Flat}}(m', \theta')},
\tag{7.2}
\]

where f_New(m′, θ′) and f_Flat(m′, θ′) are PDF values for the square Dalitz plot distributions in the newly developed and flat SDP models, respectively. The PDF values are obtained non-parametrically by binning the distributions with an adaptive binning scheme for each model, so that all bins have the same number of events. In order to maintain enough events in each bin and minimise the statistical uncertainties in the weighting procedure, this strategy is also used for the flat model. To account for the different binning scheme in the two samples, the PDF value for each distribution is computed as

\[
f_i(m', \theta') = \frac{F_i(m', \theta')}{\Delta_i(m', \theta')},
\tag{7.3}
\]

where F_i(m′, θ′) is the absolute value per bin and Δ_i(m′, θ′) is the bin size. f_i(m′, θ′) is normalised to unity after the adaptive binning is defined.

The B(s)0-candidate mass distributions of the signal and misidentified background components from the MC samples after this reweighting process is applied are illustrated in Fig. 7.4. The effect on correctly reconstructed decays is minimal, since the properties of the signal PDF do not depend strongly on SDP position. The effect on misidentified background components is larger, as expected. Moreover, channels with SDP distributions which differ greatly from the flat model, in particular B(s)0 → D*(2007)0 K+K− decays, are expected to exhibit a more divergent weighted distribution. Although other background components featuring three-body decays have also been generated with flat square Dalitz plot distributions, this correction is only expected to be sizeable for misidentified backgrounds.

Fig. 7.4 Effect of SDP reweighting on (top left) correctly reconstructed B0 → D*(2007)0 K+π− with D*(2007)0 → D0γ decays, and on (top right) B0 → D*(2007)0 π+π−, (bottom left) Bs0 → D*(2007)0 K+K−, and (bottom right) B0 → D*(2007)0 K+K− decays misreconstructed as B0 → D*(2007)0 K+π− (all with D*(2007)0 → D0γ). Each panel compares the flat generated MC with the weighted MC
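The square Dalitz plot transformation of Eq. (7.1) and the binned weights of Eqs. (7.2) and (7.3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the analysis: the adaptive binning is approximated by equal-population slices in m′ followed by equal-population bins in θ′, the class and function names are hypothetical, the bin counts are arbitrary, and the toy samples are made up. It also assumes continuous inputs, so that no bin has zero area.

```python
import numpy as np


def square_dalitz(m12, theta12, m_b, m1, m2, m3):
    """Square Dalitz plot variables (m', theta') following Eq. (7.1)."""
    x = 2.0 * (np.asarray(m12, dtype=float) - m1 - m2) / (m_b - m1 - m2 - m3) - 1.0
    m_prime = np.arccos(np.clip(x, -1.0, 1.0)) / np.pi
    theta_prime = np.asarray(theta12, dtype=float) / np.pi
    return m_prime, theta_prime


def quantile_edges(values, n_bins):
    """Bin edges on [0, 1] holding (approximately) equal numbers of entries."""
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0], edges[-1] = 0.0, 1.0
    return edges


def bin_index(edges, values):
    """Index of the bin containing each value, clipped to the valid range."""
    return np.clip(np.searchsorted(edges, values, side="right") - 1, 0, len(edges) - 2)


class EqualPopulationPdf:
    """Piecewise-constant PDF on the unit square: bin content over bin size (Eq. 7.3)."""

    def __init__(self, m_prime, theta_prime, n_m=4, n_theta=4):
        m_prime = np.asarray(m_prime, dtype=float)
        theta_prime = np.asarray(theta_prime, dtype=float)
        self.m_edges = quantile_edges(m_prime, n_m)
        slice_of = bin_index(self.m_edges, m_prime)
        self.theta_edges, self.density = [], []
        for i in range(n_m):
            theta_in_slice = theta_prime[slice_of == i]
            edges = quantile_edges(theta_in_slice, n_theta)
            counts, _ = np.histogram(theta_in_slice, bins=edges)
            area = (self.m_edges[i + 1] - self.m_edges[i]) * np.diff(edges)
            self.theta_edges.append(edges)
            # Fraction of events per bin divided by bin area: integrates to one.
            self.density.append(counts / (len(m_prime) * area))

    def __call__(self, m_prime, theta_prime):
        m_prime = np.atleast_1d(np.asarray(m_prime, dtype=float))
        theta_prime = np.atleast_1d(np.asarray(theta_prime, dtype=float))
        out = np.empty(len(m_prime))
        for k, i in enumerate(bin_index(self.m_edges, m_prime)):
            j = bin_index(self.theta_edges[i], theta_prime[k])
            out[k] = self.density[i][j]
        return out


def sdp_weights(pdf_new, pdf_flat, m_prime, theta_prime):
    """Per-event weights W = f_New / f_Flat of Eq. (7.2)."""
    return pdf_new(m_prime, theta_prime) / pdf_flat(m_prime, theta_prime)


# Toy demonstration with made-up samples (not analysis data).
rng = np.random.default_rng(1)
flat = rng.random((20000, 2))
new = np.column_stack([rng.beta(2.0, 5.0, 20000), rng.random(20000)])
pdf_flat = EqualPopulationPdf(flat[:, 0], flat[:, 1])
pdf_new = EqualPopulationPdf(new[:, 0], new[:, 1])
weights = sdp_weights(pdf_new, pdf_flat, flat[:, 0], flat[:, 1])
```

Because each model gets its own equal-population binning, dividing the bin content by the bin size is what makes the two PDFs comparable at a common (m′, θ′) point.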

7.4 Partially Combinatorial Backgrounds

In multibody signal channels, such as those studied here, it is possible to have a genuine "(n − 1)-body" B decay that produces the majority of the signal candidate, combined with a random particle from the rest of the pp collision event. In this analysis, a particularly important source of this is B(s)0 → D0 h+h− decays which are fully reconstructed and combined with a random γ/π0 from the rest of the event. These decays have almost identical topology to signal events and thus are not strongly suppressed by the selection: the only aspects of the selection that discriminate against this source of background are the γ/π0 quality requirements and the m_rec(D*(2007)0) − m_rec(D0) cut. Hence, the selection efficiency for these background sources is comparable to that for signal, as shown in Table 7.10. Background from fully reconstructed B+ → D*(2007)0 h+ decays combined with a random π− (K−) to build a signal candidate is also considered; these are discriminated against more effectively by the selection procedure due to the use of vertexing information. Nevertheless, their large branching fractions, compared to those of B(s)0 → D0 h+h− decays, make them worth studying. The reconstructed B(s)0-candidate mass distributions for all partially combinatorial backgrounds, after the selection is applied, can be seen in Fig. 7.5. These are used in the simultaneous mass fit model described in the next section, with the exception of B+ → D*(2007)0 π+, which is not included in any of the channels, as it is mostly rejected in the selection procedure and presents a B(s)0-candidate mass distribution at the limit of the analysis range.

Table 7.10 Relative efficiencies for all relevant partially combinatorial backgrounds with respect to the signal efficiency for each channel, as evaluated from MC

Channel                              Rel. eff. in Dγ channel   Rel. eff. in Dπ0 channel
Background to B0 → D*(2007)0 K+π−
  B0 → D0 K+π−                       0.420                     0.306
  B+ → D*(2007)0 K+                  0.073                     0.091
Background to Bs0 → D*(2007)0 K−π+
  Bs0 → D0 K−π+                      0.383                     0.288
  B+ → D*(2007)0 π+                  3.7 × 10^−3               2.1 × 10^−3
Background to B0 → D*(2007)0 π+π−
  B0 → D0 π+π−                       0.435                     0.371
  B+ → D*(2007)0 π+                  0.156                     0.210

Fig. 7.5 Reconstructed B(s)0-candidate mass distribution for partially combinatorial backgrounds when reconstructed as (top) B0 → D*(2007)0 K+π−, (middle) Bs0 → D*(2007)0 K−π+ and (bottom) B0 → D*(2007)0 π+π−. The left (right) plots correspond to the D*(2007)0 → D0γ (D0π0) channels. All distributions share the same normalisation to show the relative efficiencies, except that in the middle row the B+ → D*(2007)0 π+ yield has been multiplied by a factor of 10 to make it visible

For the partially combinatorial backgrounds to the B(s)0 → D*(2007)0 K±π∓ decays, modelling based on MC is sufficient as the yields of these backgrounds are found to be fairly low. However, the partially combinatorial background from B0 → D0 π+π− decays in the control channel is quite large, as a consequence of the larger yields in this final state (due to both the branching fractions of the decays involved and the effect of the particle identification requirements). The fit is therefore sensitive to the modelling of this decay, and in particular to potential data/MC differences due to the momentum spectrum of the soft neutral particle that is combined with the B0 → D0 π+π− decay to form the control channel candidate. Therefore, in order to properly model this background, a data-driven approach is used instead.

Data events with a reconstructed (D0 π+π−) mass in the region [5250, 5310] MeV/c2 are selected. These events are dominated by partially combinatorial background events, and contain a small contribution from fully combinatorial background events. Thus, the following simple background subtraction method is used to determine the B0-candidate mass distribution of the partially combinatorial background. First, the yield of fully combinatorial background in the [5250, 5310] MeV/c2 window is obtained from a linear fit of the m(D0 π+π−) sidebands. Then, the m(D*(2007)0 π+π−) distribution from B0 → D0 π+π− partially combinatorial background is obtained from the distribution of the candidates in the designated window by subtracting the fully combinatorial component, the shape of which is extracted from the sideband. This procedure is illustrated in Fig. 7.6. A comparison between the distribution obtained from the data-driven approach and that from MC is shown in Fig. 7.7.

Fig. 7.6 Background subtraction technique to extract the data distribution of B0 → D0 π+π− partially combinatorial background. (Left) Illustration of the linear fit used to estimate the yield of fully combinatorial background in the [5250, 5310] MeV/c2 window. (Right) m(D*(2007)0 π+π−) distribution for all events in the window (blue), and with the fully combinatorial background subtracted (green). The fully combinatorial background distribution (red) is extracted from the sideband of the left plots. The top (bottom) plots correspond to D*(2007)0 → D0γ (D*(2007)0 → D0π0) decays

Fig. 7.7 Comparison of the B0-candidate mass distributions from B0 → D0 π+π− partially combinatorial background to the control channel, obtained from MC (red) and from data before (blue) and after (green) background subtraction. The left (right) plots correspond to the D*(2007)0 → D0γ (D0π0) channels

The data-driven shape is considered more reliable than that from MC, as the momentum spectrum of additional soft particles may not be reliably modelled in simulation. However, it is only possible to use this approach for the control channel, where a reasonably large sample of the partially combinatorial background can be isolated. As the shapes are found to be quite similar, this provides confidence that the use of MC for other partially combinatorial backgrounds is sufficient and will not introduce large systematic uncertainties due to mismodelling.
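The two-step subtraction described above can be sketched as below. This is an illustration under stated assumptions rather than the analysis implementation: the sideband ranges, the bin width, the histogram binning and the function name `subtract_combinatorial` are all hypothetical (only the [5250, 5310] MeV/c2 window is taken from the text), and the toy data are made up.

```python
import numpy as np


def subtract_combinatorial(m_d0pipi, m_dstpipi, window, sidebands, bin_width, shape_edges):
    """Return the m(D* pi pi) histogram in the window with combinatorial background removed."""
    m_d0pipi = np.asarray(m_d0pipi, dtype=float)
    m_dstpipi = np.asarray(m_dstpipi, dtype=float)

    # 1) Linear fit to the binned m(D0 pi pi) spectrum, using sideband bins only.
    centres, counts = [], []
    for s_lo, s_hi in sidebands:
        edges = np.arange(s_lo, s_hi + 0.5 * bin_width, bin_width)
        c, _ = np.histogram(m_d0pipi, bins=edges)
        centres.extend(0.5 * (edges[:-1] + edges[1:]))
        counts.extend(c)
    slope, intercept = np.polyfit(centres, counts, 1)

    # 2) Integrate the fitted line over the window to get the combinatorial yield.
    lo, hi = window
    n_comb = (0.5 * slope * (hi**2 - lo**2) + intercept * (hi - lo)) / bin_width

    # 3) Subtract the sideband shape, scaled to that yield, from the window shape.
    in_window = (m_d0pipi >= lo) & (m_d0pipi < hi)
    in_side = np.zeros_like(in_window)
    for s_lo, s_hi in sidebands:
        in_side |= (m_d0pipi >= s_lo) & (m_d0pipi < s_hi)
    h_window, _ = np.histogram(m_dstpipi[in_window], bins=shape_edges)
    h_side, _ = np.histogram(m_dstpipi[in_side], bins=shape_edges)
    scale = n_comb / max(h_side.sum(), 1)
    return h_window - scale * h_side, n_comb


# Toy data (made up): flat combinatorial plus a peak near 5280 MeV/c^2.
rng = np.random.default_rng(7)
m_d0 = np.concatenate([rng.uniform(5200.0, 5350.0, 15000), rng.normal(5280.0, 8.0, 5000)])
m_dst = np.concatenate([rng.uniform(5200.0, 5800.0, 15000), rng.normal(5450.0, 50.0, 5000)])
subtracted, n_comb = subtract_combinatorial(
    m_d0,
    m_dst,
    window=(5250.0, 5310.0),
    sidebands=((5200.0, 5250.0), (5310.0, 5350.0)),
    bin_width=10.0,
    shape_edges=np.linspace(5200.0, 5800.0, 31),
)
```

In this toy the subtracted histogram recovers the peaking component, while `n_comb` estimates how many flat-background events sit under the peak in the window.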

7.5 Misreconstructed Signal

Reconstruction of neutral particles is a point of major importance in this analysis. While reconstruction of photons and neutral pions is always challenging in LHCb, it is especially so in this analysis due to their low average transverse momentum, which makes it harder to separate signal from background. Thus, it is relatively common to have candidates which come from a true signal decay, but in which the final state γ or π0 has not been reconstructed properly. This can occur because the neutral particle used to build the candidate (or, in the case of neutral pions, part of it) comes from a background cluster in the calorimeter, i.e. originates from decay products of other particles produced in the pp collision or other interactions in the same bunch crossing.

These misreconstructed signal candidates have much worse D*(2007)0-candidate mass resolution than the correctly reconstructed signal. This in turn has an impact on the reconstructed B(s)0-candidate mass distribution, and on the resolution of Dalitz plot variables. Thus, these candidates are considered to be a background source and are treated separately from the correctly reconstructed signal events. In order to charac-

22000

Signal

20000 18000

Misreconstructed signal

16000 14000

0

D (2007) →D π 0

*

12000

4000 3500

Signal

3000

Misreconstructed signal

2500

0

10000 8000

1500

6000

1000

4000

500

2000

5200

5400

5600

K +π −)

0

(D (2007) A.U.

*

5800

0

2

5200

Mass [MeV/c ] A.U.

0

0

D*(2007)0→D γ

2000

25000

Signal

5400

5600

5800

(D*(2007)0K +π −) Mass [MeV/c2]

3500

Signal

3000

20000

Misreconstructed signal

Misreconstructed signal 2500

15000

0

D (2007) →D π 0

*

0

D*(2007)0→D γ

0

2000 1500

10000

1000 5000 500

5400

5200

5600

K −π +)

0

(D (2007) A.U.

*

5800

0

2

5200

Mass [MeV/c ] A.U.

0

25000

Signal

20000

Misreconstructed signal

5400

5600

5800

(D*(2007)0K −π +) Mass [MeV/c2]

4000

Signal

3500

0

D (2007) →D π 0

*

15000

Misreconstructed signal

3000

0

D*(2007)0→D γ

2500

0

2000 10000

1500 1000

5000

500 0

5200

5600

5400

π +π −)

0

(D (2007) *

5800

0

2

Mass [MeV/c ]

5200

5400

5600

5800

(D*(2007)0π +π −) Mass [MeV/c2]

0 -candidate mass distribution for misreconstructed signal and wrong Fig. 7.8 Reconstructed B(s) ∗ 0 D (2007) decay backgrounds, compared to correctly reconstructed signal, for (top) B 0 → D ∗ (2007)0 K + π− , (middle) Bs0 → D ∗ (2007)0 K − π+ and (bottom) B 0 → D ∗ (2007)0 π+ π− candidates. The left (right) plots correspond to D ∗ (2007)0 → D 0 γ (D 0 π0 ) candidates

References

125

terise these candidates and to differentiate them from correctly reconstructed events, the truth matching selection described in Table 5.6 is modified to not impose any requirements on the γ or π0 originated in the D ∗ (2007)0 decay. Since the challenges of reconstructing the neutral final state particles are different for photons and neutral pions, the contribution from this background is expected to be different in the D ∗ (2007)0 → D 0 γ and D ∗ (2007)0 → D 0 π0 channels, both in shape and in yield 0 mass distribution for this with respect to the fully reconstructed signal. The B(s) component compared to fully reconstructed events is shown in Fig. 7.8.

7.6 Wrong D∗(2007)0 Decay Backgrounds

We consider as a separate category decays reconstructed through the wrong D∗(2007)0 decay channel, either by adding one extra combinatorial (or fake) photon to build a D∗(2007)0 → D0 π0 candidate from a D∗(2007)0 → D0 γ decay, or by missing one soft photon to do the inverse. This category has similarities to the misreconstructed signal category (and indeed it could also be considered a type of partially combinatorial or partially reconstructed background), but since the candidates end up in a different final state it is appropriate to treat them separately. Following a similar procedure as for misreconstructed signal, these candidates are characterised using the signal MC samples, which include both D∗(2007)0 decays, by selecting candidates that have the wrong D∗(2007)0 decay reconstructed and not imposing any truth matching requirements on the neutral particles. The reconstructed B0(s)-candidate mass distributions for these candidates as well as for misreconstructed signal events are shown in Fig. 7.8, where they are compared to fully reconstructed signal events. The relative yields of misreconstructed signal and wrong D∗(2007)0 decay components compared to the correctly reconstructed signal decays are not expected to be well modelled in MC. Thus, the ratio of yields between these components is set as a free parameter in the simultaneous fit model described in the next chapter.

References

1. BESIII collaboration, Ablikim M et al (2018) Precision study of η′ → γπ+π− decay dynamics. Phys Rev Lett 120:242003. arXiv:1712.01525
2. LHCb collaboration, Aaij R et al (2014) Study of beauty baryon decays to D0 p h− and Λc+ h− final states. Phys Rev D89:032001. arXiv:1311.4823
3. LHCb collaboration, Aaij R et al (2019) Measurement of b-hadron fractions in 13 TeV pp collisions. Phys Rev D100:031102(R). arXiv:1902.06794
4. Particle Data Group, Zyla PA et al (2020) Review of particle physics. Prog Theor Exp Phys 2020:083C01
5. Back J et al (2018) Laura++: a Dalitz plot fitter. Comput Phys Commun 231:198. arXiv:1711.09854
6. Belle collaboration, Study of B0 → D̄(∗)0 π+ π− decays. arXiv:hep-ex/0412072
7. LHCb collaboration, Aaij R et al (2018) Observation of the decay B0s → D̄∗0 φ and search for the mode B0 → D̄0 φ. Phys Rev D98:071103(R). arXiv:1807.01892
8. LHCb collaboration, Aaij R et al (2018) Observation of the decay B0s → D̄0 K+ K−. Phys Rev D98:072006. arXiv:1807.01891
9. LHCb collaboration, Aaij R et al (2015) Amplitude analysis of B0 → D̄0 K+ π− decays. Phys Rev D92:012012. arXiv:1505.01505
10. LHCb collaboration, Aaij R et al (2014) Dalitz plot analysis of B0s → D̄0 K− π+ decays. Phys Rev D90:072003. arXiv:1407.7712

Chapter 8

Simultaneous Fit

The B0(s)-candidate mass distributions for all signal and background contributions are modelled using analytic probability density functions (PDFs). The parameters that define the PDFs are mostly obtained from MC samples. Following the discussion in the previous chapter, the complete list of components considered in the simultaneous fit model is:

• B0 → D̄∗(2007)0 K+ π−, D̄∗(2007)0 → D̄0 γ / D̄0 π0:
  – Correctly reconstructed signal
  – Signal with misreconstructed neutral final state particles
  – Signal with wrong D∗(2007)0 decay
  – Partially reconstructed B0 → D∗(2007)0 K1(1400)0 decays, as a proxy for all partially reconstructed backgrounds in this channel
  – Misidentified backgrounds from the following decays:
    ∗ B0 → D∗(2007)0 K+ K−
    ∗ B0s → D∗(2007)0 K+ K−
    ∗ B0 → D∗(2007)0 π+ π−
  – Partially combinatorial B0 → D0 K+ π− decays
  – Partially combinatorial B+ → D∗(2007)0 K+ decays
  – Fully combinatorial background

• B0s → D∗(2007)0 K− π+, D∗(2007)0 → D0 γ / D0 π0:
  – Correctly reconstructed signal
  – Signal with misreconstructed neutral final state particles
  – Signal with wrong D∗(2007)0 decay
  – Partially reconstructed B0s → D∗(2007)0 K1(1400)0 decays, as a proxy for all partially reconstructed backgrounds in this channel
  – Misidentified backgrounds from the following decays:
    ∗ B0 → D∗(2007)0 K+ K−
    ∗ B0s → D∗(2007)0 K+ K−
    ∗ B0 → D∗(2007)0 π+ π−
  – Partially combinatorial B0s → D0 K− π+ decays
  – Fully combinatorial background

• B0 → D∗(2007)0 π+ π−, D∗(2007)0 → D0 γ / D0 π0:
  – Correctly reconstructed signal
  – Signal with misreconstructed neutral final state particles
  – Signal with wrong D∗(2007)0 decay
  – Partially reconstructed B0 → D∗(2007)0 a1(1260)0 decays, as a proxy for all partially reconstructed backgrounds with a missing pion in this channel
  – Partially reconstructed B0 → D∗(2007)0 η′(958) decays
  – Misidentified backgrounds from the following decays:
    ∗ B0 → D∗(2007)0 K+ π−
    ∗ B0s → D∗(2007)0 K− π+
  – Partially combinatorial B0 → D0 π+ π− decays
  – Partially combinatorial B+ → D∗(2007)0 π+ decays
  – Fully combinatorial background.

Most backgrounds are modelled independently using MC simulation. The exceptions are the fully combinatorial background and the partially combinatorial B0 → D0 π+ π− background in the control channel, which are fitted directly from the data. Some backgrounds are present in more than one channel. In these cases their B0(s)-candidate mass distributions are modelled using the same MC sample but reconstructed in the corresponding final state. This also applies to the different D∗(2007)0 decays, which have been modelled independently although they include the same signal and background components. Analytic PDFs are used to enable corrections due to data-simulation disagreement to be applied in a common way to all shapes. The analytic model chosen for each signal and background component is based on the nature of each decay. Thus, components that originate from similar sources tend to be modelled with the same PDF. The PDFs used in this analysis are described below.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_8

8.1 Double Crystal Ball Function

All of the signal components and most of the peaking backgrounds are modelled using double-sided Crystal Ball functions. The Crystal Ball function [1] has a Gaussian core, which is ideal to describe peaking contributions, while also including modified tails that help to account for stochastic reconstruction effects and final state radiation. Moreover, since these reconstruction effects may be different on the two sides of the mass distribution, a double-sided Crystal Ball function is used, allowing the shape to accurately describe the data on both sides of the distribution. Explicitly, the double Crystal Ball function used in the analysis is a combination of two Crystal Ball functions that share the same mean and width parameters of the Gaussian core, while having different tail parameters. The two Crystal Ball functions are combined as

$$ f_{\mathrm{DCB}}(m; F, \mu, \sigma, \alpha_1, n_1, \alpha_2, n_2) = F\, f_{\mathrm{CB}}(m; \mu, \sigma, \alpha_1, n_1) + (1 - F)\, f_{\mathrm{CB}}(m; \mu, \sigma, \alpha_2, n_2), \tag{8.1} $$

where F gives the fraction of the PDF contained in the first Crystal Ball function, and f_CB is the Crystal Ball PDF defined as

$$ f_{\mathrm{CB}}(m; \mu, \sigma, \alpha, n) = N \cdot \begin{cases} \exp\left(-\dfrac{(m-\mu)^2}{2\sigma^2}\right), & \text{for } \dfrac{m-\mu}{\sigma} > -\alpha, \\[2ex] A \cdot \left(B - \dfrac{m-\mu}{\sigma}\right)^{-n}, & \text{for } \dfrac{m-\mu}{\sigma} < -\alpha. \end{cases} \tag{8.2} $$

Here μ and σ are the mean and standard deviation of the Gaussian core, respectively. The parameters α and n describe the tail, with α giving the point at which the Crystal Ball modified tail starts and n controlling its shape. In Eq. (8.2) the tail is to low (high) mass values in the case that α is positive (negative). The A and B parameters are defined as

$$ A = \left(\frac{n}{|\alpha|}\right)^{n} \cdot \exp\left(-\frac{|\alpha|^2}{2}\right), \qquad B = \frac{n}{|\alpha|} - |\alpha|. \tag{8.3} $$

Finally, N is a normalisation factor given by

$$ N = \frac{1}{\sigma (C + D)}, \quad \text{with} \quad C = \frac{n}{|\alpha|} \cdot \frac{1}{n-1} \cdot \exp\left(-\frac{|\alpha|^2}{2}\right), \quad \text{and} \quad D = \sqrt{\frac{\pi}{2}} \left(1 + \operatorname{erf}\left(\frac{|\alpha|}{\sqrt{2}}\right)\right), \tag{8.4} $$

where erf(x) is the error function. Due to their peaking structure, all signal, misreconstructed signal and misidentified background components are described with this model. The parameters of the PDFs for each of these components, obtained from independent fits to the appropriate MC samples, can be found in Tables 8.1, 8.2, 8.3, 8.4, 8.5 and 8.6. Although the double Crystal Ball function is chosen to describe all of these components, in some cases a single Crystal Ball function is sufficient to describe the shape; in those cases F is equal to 1 or 0 and either the pair (α1 , n 1 ) or (α2 , n 2 ), respectively, is left


Table 8.1 Fitted parameters for all components that use a double Crystal Ball function in the B0 → D∗(2007)0 K+ π−, D∗(2007)0 → D0 γ channel. Units of MeV/c2 for μ and σ are implied

Component | μ | σ | α1 | n1 | α2 | n2 | F
Correctly rec. signal | 5283.8 ± 0.3 | 20.7 ± 0.3 | 1.9 ± 0.2 | 2.0 ± 0.5 | −1.5 ± 0.2 | 6 ± 2 | 0.60 ± 0.09
Misrec. signal | 5299 ± 2 | 96.8 ± 0.6 | | | −4 ± 3 | 29 ± 3 | 0
Wrong D∗(2007)0 decay | 5299.7 ± 1.0 | 72.4 ± 0.3 | 4.1 ± 1.4 | 2.4 ± 1.9 | | | 1
B0 → D∗(2007)0 K+ K− | 5216 ± 14 | 31 ± 4 | 1 ± 3 | 0.1 ± 0.2 | 0 ± 4 | 34 ± 2 | 0.8 ± 0.08
B0 → D∗(2007)0 π+ π− | 5329 ± 3 | 30 ± 4 | 0.5 ± 0.1 | 36 ± 4 | −0.6 ± 0.5 | 5 ± 3 | 0.6 ± 0.2
B0s → D∗(2007)0 K+ K− | 5241 ± 16 | 48 ± 30 | 0.5 ± 1.2 | 26 ± 8 | −1 ± 3 | 27 ± 16 | 0.9 ± 0.7

Table 8.2 Fitted parameters for all components that use a double Crystal Ball function in the B0 → D∗(2007)0 K+ π−, D∗(2007)0 → D0 π0 channel. Units of MeV/c2 for μ and σ are implied

Component | μ | σ | α1 | n1 | α2 | n2 | F
Correctly rec. signal | 5288.8 ± 0.5 | 20.9 ± 0.4 | 2.4 ± 1.1 | 0.1 ± 0.6 | −2.3 ± 0.4 | 4 ± 3 | 0.1 ± 0.3
Misrec. signal | 5298.1 ± 1.3 | 43.8 ± 1.4 | 1.4 ± 0.3 | 21.8 ± 1.6 | −1.5 ± 0.2 | 32 ± 3 | 0.4 ± 0.1
Wrong D∗(2007)0 decay | 5294 ± 6 | 90.0 ± 1.5 | 3 ± 3 | 0.3 ± 1.3 | −3 ± 3 | 11 ± 3 | 0.5 ± 0.1
B0 → D∗(2007)0 K+ K− | 5206 ± 8 | 83 ± 6 | 1.0 ± 1.6 | 0.1 ± 0.2 | | | 1
B0 → D∗(2007)0 π+ π− | 5336.1 ± 1.4 | 49.7 ± 1.0 | 8.6 ± 1.4 | 12.5 ± 1.5 | | | 1
B0s → D∗(2007)0 K+ K− | 5227 ± 28 | 51 ± 66 | 2 ± 4 | 0.1 ± 1.1 | | | 1

Table 8.3 Fitted parameters for all components that use a double Crystal Ball function in the B0s → D∗(2007)0 K− π+, D∗(2007)0 → D0 γ channel. Units of MeV/c2 for μ and σ are implied

Component | μ | σ | α1 | n1 | α2 | n2 | F
Correctly rec. signal | 5370.2 ± 0.6 | 21.4 ± 0.6 | 2.4 ± 0.4 | 1.5 ± 0.9 | −1.3 ± 0.8 | 4 ± 4 | 0.9 ± 0.1
Misrec. signal | 5388.9 ± 0.7 | 91.0 ± 0.5 | 9.4 ± 0.7 | 32 ± 23 | −1.4 ± 0.3 | 34 ± 27 | 0.34 ± 0.06
Wrong D∗(2007)0 decay | 5389.9 ± 1.5 | 68.4 ± 1.3 | 3.2 ± 1.5 | 1.6 ± 1.4 | −3.5 ± 1.2 | 0.4 ± 0.2 | 0.86 ± 0.05
B0 → D∗(2007)0 K+ K− | 5222 ± 21 | 22 ± 6 | 0.6 ± 0.8 | 0.1 ± 0.2 | −0.5 ± 1.1 | 37 ± 8 | 0.65 ± 0.02
B0 → D∗(2007)0 π+ π− | 5328 ± 13 | 26.9 ± 1.7 | 0.5 ± 0.8 | 12.0 ± 0.8 | −0.4 ± 1.0 | 3.2 ± 1.9 | 0.37 ± 0.07
B0s → D∗(2007)0 K+ K− | 5326.0 ± 0.1 | 76 ± 2 | 1.4 ± 0.2 | 0.7 ± 0.2 | | | 1

Table 8.4 Fitted parameters for all components that use a double Crystal Ball function in the B0s → D∗(2007)0 K− π+, D∗(2007)0 → D0 π0 channel. Units of MeV/c2 for μ and σ are implied

Component | μ | σ | α1 | n1 | α2 | n2 | F
Correctly rec. signal | 5372.7 ± 0.9 | 16.9 ± 1.3 | 1.3 ± 0.7 | 2.4 ± 1.4 | −1.1 ± 2.4 | 17 ± 21 | 0.16 ± 0.02
Misrec. signal | 5386 ± 39 | 46 ± 39 | −4.0 ± 1.0 | 1 ± 3 | | | 1
Wrong D∗(2007)0 decay | 5395 ± 7 | 80.6 ± 1.2 | 4.2 ± 1.2 | 3.6 ± 1.3 | −4 ± 3 | 12 ± 5 | 0.48 ± 0.05
B0 → D∗(2007)0 K+ K− | 5347 ± 9 | 43 ± 10 | 2 ± 2 | 12 ± 23 | −0.3 ± 0.3 | 43 ± 17 | 0.71 ± 0.04
B0 → D∗(2007)0 π+ π− | 5183 ± 14 | 48 ± 4 | | | −1.4 ± 1.9 | 23 ± 5 | 0
B0s → D∗(2007)0 K+ K− | 5258 ± 20 | 51 ± 12 | 0.0 ± 0.2 | 31.7 ± 1.4 | −0.5 ± 0.3 | 2.1 ± 1.3 | 0.38 ± 0.05


Table 8.5 Fitted parameters for all components that use a double Crystal Ball function in the B0 → D∗(2007)0 π+ π−, D∗(2007)0 → D0 γ channel. Units of MeV/c2 for μ and σ are implied

Component | μ | σ | α1 | n1 | α2 | n2 | F
Correctly rec. signal | 5283.3 ± 0.3 | 21.7 ± 0.3 | 1.9 ± 0.1 | 1.5 ± 0.3 | −1.4 ± 0.3 | 3.7 ± 0.3 | 0.80 ± 0.04
Misrec. signal | 5298 ± 5 | 93 ± 5 | 1.4 ± 0.4 | 15 ± 3 | −0.7 ± 1.9 | 10.1 ± 1.5 | 0.81 ± 0.08
Wrong D∗(2007)0 decay | 5298.1 ± 0.3 | 73.4 ± 0.2 | | | −3.12 ± 0.02 | 0.49 ± 0.04 | 0
B0 → D∗(2007)0 K+ π− | 5236 ± 2 | 25 ± 3 | 0.36 ± 0.09 | 50 ± 2 | −0.49 ± 0.09 | 27 ± 2 | 0.59 ± 0.09
B0s → D∗(2007)0 K+ π− | 5292 ± 3 | 51 ± 5 | 1.1 ± 1.3 | 0.10 ± 0.07 | −1.1 ± 1.1 | 12.8 ± 1.9 | 0.47 ± 0.03

Table 8.6 Fitted parameters for all components that use a double Crystal Ball function in the B0 → D∗(2007)0 π+ π−, D∗(2007)0 → D0 π0 channel. Units of MeV/c2 for μ and σ are implied

Component | μ | σ | α1 | n1 | α2 | n2 | F
Correctly rec. signal | 5287.9 ± 0.6 | 22.1 ± 0.6 | 2.2 ± 0.2 | 0.9 ± 0.4 | −1 ± 2 | 12 ± 2 | 0.95 ± 0.02
Misrec. signal | 5294.5 ± 1.5 | 46.6 ± 1.4 | 2.3 ± 1.3 | 2.1 ± 1.9 | −1.1 ± 1.1 | 4 ± 2 | 0.85 ± 0.03
Wrong D∗(2007)0 decay | 5308 ± 17 | 90 ± 17 | 0.5 ± 1.0 | 49 ± 7 | −1.5 ± 1.1 | 50 ± 7 | 0.28 ± 0.06
B0 → D∗(2007)0 K+ π− | 5238 ± 6 | 40 ± 31 | 0.6 ± 0.9 | 0.1 ± 1.2 | −1.2 ± 1.5 | 10 ± 2 | 0.17 ± 0.07
B0s → D∗(2007)0 K+ π− | 5321 ± 7 | 36 ± 29 | 0.1 ± 0.5 | 3.3 ± 1.3 | −0.8 ± 1.1 | 23 ± 4 | 0.44 ± 0.05

blank in the table. Figures 8.1, 8.2, 8.3, 8.4, 8.5, 8.6 and 8.7 show the results of these fits to the MC samples. The PDF offers a good description of the distribution in all components, with the exception of the misidentified B 0 → D∗ (2007)0 K + K − and B0s → D∗ (2007)0 K + K − backgrounds for which the MC samples are statistically limited and highly affected by the reweighting process presented in the previous section. The yields of these components are expected to be small so that this imperfection of the modelling will not have a major impact on the results. All the parameters given in Tables 8.1, 8.2, 8.3, 8.4, 8.5 and 8.6 are fixed in the global fit to data. A common shift of the mean and a common scale of the width of all shapes is included in the data fit to account for possible discrepancies between the MC samples and the data, which are expected to be shared across all components. The tail parameters are left unchanged by this procedure.
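As an illustration of Eqs. (8.1) to (8.4) and of the common mean shift and width scale, the following is a minimal pure-Python sketch rather than the fitting code used in the analysis. Two points are assumptions made here: a negative α is handled by mirroring the mass around μ, which is the conventional way to obtain a high-mass tail, and the parameter names `shift` and `scale` are hypothetical. The normalisation in Eq. (8.4) requires n > 1.

```python
import math


def crystal_ball(m, mu, sigma, alpha, n):
    """Normalised Crystal Ball PDF following Eqs. (8.2)-(8.4); needs n > 1."""
    t = (m - mu) / sigma
    if alpha < 0:
        t = -t  # assumption: mirror around mu so the tail sits at high mass
    a = abs(alpha)
    A = (n / a) ** n * math.exp(-0.5 * a * a)
    B = n / a - a
    C = n / a / (n - 1.0) * math.exp(-0.5 * a * a)
    D = math.sqrt(math.pi / 2.0) * (1.0 + math.erf(a / math.sqrt(2.0)))
    norm = 1.0 / (sigma * (C + D))
    if t > -a:
        return norm * math.exp(-0.5 * t * t)  # Gaussian core
    return norm * A * (B - t) ** (-n)         # power-law tail


def double_crystal_ball(m, F, mu, sigma, a1, n1, a2, n2, shift=0.0, scale=1.0):
    """Eq. (8.1), with a common mean shift and width scale applied to the
    shared Gaussian core; the tail parameters are left unchanged."""
    mu, sigma = mu + shift, sigma * scale
    return (F * crystal_ball(m, mu, sigma, a1, n1)
            + (1.0 - F) * crystal_ball(m, mu, sigma, a2, n2))
```

With `shift=0.0` and `scale=1.0` the function reduces to the shape fitted to MC; in a fit to data the same two values would be shared by every component.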

8.2 ARGUS Function Convolved with a Crystal Ball Function

Partially reconstructed backgrounds originate from b-hadron decays with a missing particle. Their invariant mass thus has a kinematic limit at m_Xb − m_miss, where m_Xb and m_miss are the masses of the decaying b hadron and the missing particle, respectively. The distribution extends to lower invariant mass values according to the momentum carried by the missing particle; in most cases of interest this leads to a long tail on the low mass side of the distribution. This shape, with a kinematic limit and a long tail, is modelled by the ARGUS function [2].

Fig. 8.1 B0(s)-candidate invariant mass distributions for correctly reconstructed signal, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. (Top) B0 → D∗(2007)0 K+ π−, (middle) B0s → D∗(2007)0 K− π+ and (bottom) B0 → D∗(2007)0 π+ π− channels, with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

Fig. 8.2 B0(s)-candidate invariant mass distributions for misreconstructed signal, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. (Top) B0 → D∗(2007)0 K+ π−, (middle) B0s → D∗(2007)0 K− π+ and (bottom) B0 → D∗(2007)0 π+ π− channels, with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

Fig. 8.3 B0(s)-candidate invariant mass distributions for signal with wrong D∗(2007)0 decay, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. (Top) B0 → D∗(2007)0 K+ π−, (middle) B0s → D∗(2007)0 K− π+ and (bottom) B0 → D∗(2007)0 π+ π− channels, reconstructed in the (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

Similarly to the signal and misidentified backgrounds, the distribution observed in data is smeared by the experimental resolution, so in particular the kinematic limit does not appear as a hard edge, but as a short tail instead. The resolution effect is expected to be similar to that of the signal, which as described above is modelled by a double-sided Crystal Ball function. To take these effects into account, partially reconstructed backgrounds are therefore modelled by the convolution of an ARGUS function with a Crystal Ball function. Since the resolution effect is only significant on the high mass tail of the distribution, it is sufficient to use a single, rather than double, Crystal Ball function in the convolution. The ARGUS PDF is given by

$$ f_{\mathrm{ARGUS}}(m; m_0, c, p) = N \cdot m \cdot \left[1 - \left(\frac{m}{m_0}\right)^{2}\right]^{p} \cdot \exp\left\{c \cdot \left[1 - \left(\frac{m}{m_0}\right)^{2}\right]\right\}, \tag{8.5} $$

where N is a normalisation factor, and m_0 indicates the threshold (the aforementioned kinematic limit) of the component. The parameters c and p define the position and the shape of the low mass tail of the distribution. Since the Crystal Ball function describes the smearing of the hard edge due to resolution effects, its mean is fixed to 0 (in the global fit, this mean is shifted by a common bias to account for MC-data disagreement as described previously). All other parameters of the shape are floated in the fits to MC samples, even though some of them can be predicted.
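The effect of the resolution smearing on the kinematic edge can be illustrated with a short numerical sketch. It is not the model used in the analysis: for brevity the resolution function is taken here to be a zero-mean Gaussian instead of a Crystal Ball function, the ARGUS shape is left unnormalised, and the function names, step size and truncation range are hypothetical choices.

```python
import math


def argus(m, m0, c, p):
    """Unnormalised ARGUS shape of Eq. (8.5); zero at and above the threshold m0."""
    if m <= 0.0 or m >= m0:
        return 0.0
    u = 1.0 - (m / m0) ** 2
    return m * u ** p * math.exp(c * u)


def smeared_argus(m, m0, c, p, sigma, step=1.0, nsig=5.0):
    """Riemann-sum convolution of the ARGUS shape with a zero-mean Gaussian
    of width sigma, truncated at +-nsig*sigma (a stand-in for the Crystal
    Ball resolution function)."""
    k = int(nsig * sigma / step)
    total = 0.0
    for i in range(-k, k + 1):
        d = i * step
        total += math.exp(-0.5 * (d / sigma) ** 2) * argus(m - d, m0, c, p)
    return total * step / (sigma * math.sqrt(2.0 * math.pi))
```

Evaluating `smeared_argus` slightly above `m0` gives a small but non-zero value, whereas `argus` is exactly zero there: the hard edge has become a short tail.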

Fig. 8.4 B0(s)-candidate invariant mass distributions for misreconstructed B0 → D∗(2007)0 K+ K− decays, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. These decays are reconstructed in the (top) B0 → D∗(2007)0 K+ π− and (bottom) B0s → D∗(2007)0 K− π+ channels, with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

The fitted parameters for these PDFs for all partially reconstructed backgrounds can be found in Table 8.7. As mentioned in the previous section, partially reconstructed background samples are generated using RapidSim, from which it is seen that there are no major differences in shape between the different D∗ (2007)0 decays. Thus, the same sample and therefore the same fitted parameters are used for both D∗ (2007)0 → D0 γ/π 0 channels. Fits of all components described with this model are shown in Fig. 8.8, which exhibit good agreement in all channels. Since the samples used to perform these fits have been generated with RapidSim rather than full simulation, it is expected that they do not give a perfect representation of the shape of the component in data.

Fig. 8.5 B0(s)-candidate invariant mass distributions for misreconstructed B0s → D∗(2007)0 K+ K− decays, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. These decays are reconstructed in the (top) B0 → D∗(2007)0 K+ π− and (bottom) B0s → D∗(2007)0 K− π+ channels, with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

Nonetheless, since these distributions fall predominantly outside of the fit range, it is only necessary to have a good approximation of the shape within the fit range, which is achieved with these samples. Possible mismodelling of these (and all) shapes is of course a source of systematic uncertainty and it is discussed in Chap. 10.

Fig. 8.6 B0(s)-candidate invariant mass distributions for misreconstructed B0 → D∗(2007)0 π+ π− decays, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. These decays are reconstructed in the (top) B0 → D∗(2007)0 K+ π− and (bottom) B0s → D∗(2007)0 K− π+ channels, with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

Fig. 8.7 B0(s)-candidate invariant mass distributions for misreconstructed decays that contribute to the spectrum for the control channel, obtained from MC samples, with results of fits to double Crystal Ball functions overlaid. (Top) B0 → D∗(2007)0 K+ π− and (bottom) B0s → D∗(2007)0 K− π+ decays, with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 decays

8.3 Johnson Function and Double Crystal Ball Function

Partially combinatorial backgrounds exhibit a wide distribution in the high mass region, with tail structures that present an important challenge for modelling. After testing several alternative approaches, it was found that a combination of a Johnson function [3] and two Crystal Ball functions gives a good description of all partially combinatorial backgrounds. The Johnson function provides a good base for these components as it features a Gaussian core with a wider tail in the high mass region. It is given by

$$ f_{\mathrm{J}}(m; \gamma, \delta, \mu, \sigma) = \frac{\delta}{\sigma\sqrt{2\pi}} \, \frac{1}{\sqrt{1 + \left(\frac{m-\mu}{\sigma}\right)^{2}}} \, \exp\left\{-\frac{1}{2}\left[\gamma + \delta \sinh^{-1}\left(\frac{m-\mu}{\sigma}\right)\right]^{2}\right\}, \tag{8.6} $$

where γ and δ describe the shape and the width of the high mass tail. The parameters μ and σ describe the mean and width of the Gaussian core. However, note that due to the tail shift given by γ, the mean of the distribution is not given by μ. The Johnson function is combined with a double Crystal Ball function to account for possible resolution effects in the distribution. The combination is done as

$$ f_{\mathrm{J+DCB}} = F_2 \, f_{\mathrm{J}} + (1 - F_2) \, f_{\mathrm{DCB}}, \tag{8.7} $$

where f_DCB is the function given in Eq. (8.1) except that the μ and σ parameters (as well as the tail parameters) of the two Crystal Ball functions are not required to be the same in this case. Thus this function has in total 14 parameters: four in f_J, four for each f_CB, and two fractions. The aforementioned data-MC disagreement is included by adjusting all μ and σ parameters (in both the Crystal Ball and Johnson functions) except in the shape describing B0 → D0 π+ π−, as this component is obtained through data-driven methods. The fitted parameters for the partially combinatorial components that use this model are shown in Tables 8.8 and 8.9. The model gives a good description of all partially combinatorial backgrounds, as shown in Figs. 8.9 and 8.10. It may be noted that the model used to characterize the relevant partially combinatorial backgrounds contains an unusually large number of parameters, while for some of these backgrounds a simpler model with fewer parameters would potentially have sufficed. However, in order to maintain consistency between all partially combinatorial components, the same model was used. Moreover, since these model parameters are fixed in the fit to data, the large number of parameters does not cause any issues with fit stability.
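The Johnson function of Eq. (8.6) and the statement that its mean differs from μ can be checked with a small sketch. The function names are hypothetical, and the closed-form mean is the standard result for the Johnson S_U distribution quoted here for illustration; it is not taken from the text. The helper `mixture` shows the two-component combination of Eq. (8.7) for arbitrary PDFs.

```python
import math


def johnson(m, gamma, delta, mu, sigma):
    """Johnson PDF of Eq. (8.6)."""
    t = (m - mu) / sigma
    z = gamma + delta * math.asinh(t)
    return (delta / (sigma * math.sqrt(2.0 * math.pi))
            / math.sqrt(1.0 + t * t) * math.exp(-0.5 * z * z))


def johnson_mean(gamma, delta, mu, sigma):
    """Closed-form mean of the Johnson S_U distribution; it equals mu
    only when gamma = 0."""
    return mu - sigma * math.exp(0.5 / delta ** 2) * math.sinh(gamma / delta)


def mixture(frac, pdf_a, pdf_b):
    """Two-component combination as in Eq. (8.7): frac*pdf_a + (1-frac)*pdf_b."""
    return lambda m: frac * pdf_a(m) + (1.0 - frac) * pdf_b(m)
```

A negative γ moves the mean above μ, which corresponds to the wider high-mass tail described above.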

22.5 ± 0.5

20.4 ± 0.9

25.8 ± 0.5

B0 → D∗(2007)0 η′(958)

B0 → D∗ (2007)0 a1 (1260)0

B0 → D∗ (2007)0 K 1 (1270)0

B0s → D∗ (2007)0 K1 (1270)0

σ

22.4 ± 0.5

Channel

−1.53 ± 0.06

−0.93 ± 0.05

−1.38 ± 0.05

−1.40 ± 0.05

α

8.7 ± 1.0

41 ± 10

8.8 ± 0.8

8.4 ± 0.7

n

5132.0 ± 0.4

5253.0 ± 0.5

5227.8 ± 0.3

5135.6 ± 0.7

m0

−2.25 ± 0.07

−3.80 ± 0.06

−3.80 ± 0.05

−3.86 ± 0.07

c

0.13 ± 0.01

0.23 ± 0.07

0.21 ± 0.01

0.21 ± 0.01

p

B0 → D∗ (2007)0 π + π −

B0 → D∗ (2007)0 π + π −

B0s → D∗ (2007)0 K− π +

B0 → D∗ (2007)0 K+ π −

Reconstruction channel

Table 8.7 Fitted parameters for the ARGUS function convolved with a Crystal Ball function used to model all partially reconstructed background components, together with the channel to which they contribute


Fig. 8.8 Fits to the distributions of partially reconstructed background components with an ARGUS function convolved with a Crystal Ball function. (Top left) B0 → D∗(2007)0 K1(1270)0, (top right) B0s → D∗(2007)0 K1(1270)0, (bottom left) B0 → D∗(2007)0 η′(958) and (bottom right) B0 → D∗(2007)0 a1(1260)0. These are reconstructed in the channels indicated in Table 8.7

8.4 Exponential Function

Finally, due to the falling distribution caused by the higher number of low momentum tracks within LHCb, an exponential distribution with a negative slope is used to model the fully combinatorial background, as is commonly done in LHCb. The PDF is given by

$$ f_{\mathrm{exp}}(m) = N \cdot \exp(c \cdot m), \tag{8.8} $$

where N is a normalisation factor and c represents the slope of the distribution. Contrary to the previous cases, no simulation samples can be used to describe this background, and instead the parameters of the combinatorial components are determined from data directly. Thus, the slope parameter for this component in each of the final states in the simultaneous fit is a floating variable. The fitted values are reported in Sect. 8.6.
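For a fit performed over a finite mass range, the normalisation factor N of Eq. (8.8) has a simple closed form. The sketch below is illustrative only: the function name and the range limits `lo` and `hi` are hypothetical, and the expression is written relative to `lo` to avoid overflow for masses of several thousand MeV/c2.

```python
import math


def exp_pdf(m, c, lo, hi):
    """Exponential PDF of Eq. (8.8), normalised to unity on [lo, hi].

    Uses N*exp(c*m) = c*exp(c*(m - lo)) / (exp(c*(hi - lo)) - 1).
    """
    if not lo <= m <= hi:
        return 0.0
    if c == 0.0:
        return 1.0 / (hi - lo)  # flat limit
    return c * math.exp(c * (m - lo)) / math.expm1(c * (hi - lo))
```

A negative `c` gives the falling shape described above; only `c` would float in the fit, since the normalisation is fixed by the range.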

Table 8.8 Fitted parameters for the two Crystal Ball functions that form part of the model used to describe partially combinatorial backgrounds. (Parameters of the Johnson function that complete the model are given in Table 8.9.) The entries of each column are listed in turn.

Channel: B+ → D∗(2007)0 π+; B+ → D∗(2007)0 π+; B+ → D∗(2007)0 K+; B+ → D∗(2007)0 K+; B0 → D0 π+π−; B0 → D0 π+π−; B0s → D0 K−π+; B0s → D0 K−π+; B0 → D0 K+π−; B0 → D0 K+π−

σ, μ: 41 ± 6; 10 ± 3; 5435 ± 4; 31 ± 42; 5421 ± 11; 36.0 ± 0.4; 5349 ± 33; 65 ± 8; 5460.1 ± 0.8; 62 ± 6; 5488 ± 15; 13 ± 9; 5499 ± 12; 44 ± 5; 5535 ± 9; 24 ± 4; 5444 ± 4; 30 ± 7; 5549 ± 9; 35 ± 3; 5482 ± 7; 35 ± 2; 5611 ± 6; 52 ± 5; 5412 ± 5; 30 ± 12; 5601 ± 17; 66 ± 106; 5637 ± 416; 5529 ± 31; 71.7 ± 0.5; 5605.5 ± 0.7; 26 ± 10; 5586 ± 23; 24.2 ± 0.9; 5399 ± 2; 39 ± 3; 69.6 ± 1.0; 5509 ± 3; 5494 ± 4

α: −0.1 ± 0.3; 1.6 ± 0.2; −0.00 ± 0.09; 1.44 ± 0.04; −0.34 ± 0.07; 2.8 ± 0.2; −3 ± 3; 0.9 ± 0.1; −4 ± 3; 3 ± 2; −4 ± 2; 3.7 ± 0.7; −3.5 ± 0.3; 0.9 ± 0.9; −4.0 ± 0.9; 1.24 ± 0.02; −4 ± 2; 1.2 ± 0.2; −2.8 ± 1.1; 1.46 ± 0.09

n: 0.4 ± 0.6; 1.2 ± 0.4; 20 ± 16; 2.22 ± 0.08; 20 ± 10; 0.0 ± 0.1; 7 ± 15; 20 ± 17; 17 ± 2; 17.3 ± 1.1; 1.91 ± 0.08; 1.8 ± 0.2; 0.0 ± 0.5; 3 ± 2; 19 ± 10; 10.8 ± 1.1; 19 ± 17; 4.4 ± 1.6; 3 ± 3; 6.8 ± 1.5

F: 0.42 ± 0.09; 0.39 ± 0.01; 0.94 ± 0.06; 0.20 ± 0.02; 0.7 ± 0.1; 0.75 ± 0.04; 0.3 ± 0.1; 0.4 ± 1.2; 0.43 ± 0.08; 0.66 ± 0.03

Reconstruction channel: B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0γ; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0γ; B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0γ; B0s → D∗(2007)0 K−π+, D∗(2007)0 → D0π0; B0s → D∗(2007)0 K−π+, D∗(2007)0 → D0γ; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0γ


Table 8.9 Fitted parameters for the Johnson functions that form part of the model used to describe partially combinatorial backgrounds. (Parameters of the two Crystal Ball functions that complete the model are given in Table 8.8.) The entries of each column are listed in turn.

μ: 5370 ± 5; 5363 ± 10; 5319.5 ± 0.4; 5380 ± 32; 5330 ± 13; 5324 ± 4; 5371 ± 5; 5434 ± 20; 5469.5 ± 0.9; 5388 ± 12

Channel: B0 → D0 K+π−; B0 → D0 K+π−; B0s → D0 K−π+; B0s → D0 K−π+; B0 → D0 π+π−; B0 → D0 π+π−; B+ → D∗(2007)0 K+; B+ → D∗(2007)0 K+; B+ → D∗(2007)0 π+; B+ → D∗(2007)0 π+

σ: 5.8 ± 1.2; 11.0 ± 0.2; 15 ± 39; 5.7 ± 0.6; 25 ± 5; 58 ± 13; 28 ± 37; 38.90 ± 0.08; 48 ± 12; 43 ± 3

γ: −7.4 ± 0.4; −3.77 ± 0.01; −1.8 ± 1.7; −6.8 ± 0.2; −8 ± 7; −4.6 ± 0.3; −11 ± 6; −9.70 ± 0.01; −4.3 ± 0.9; −3.7 ± 0.1

δ: 1.82 ± 0.08; 1.00 ± 0.01; 1 ± 4; 1.57 ± 0.04; 4 ± 3; 4.9 ± 0.2; 4.4 ± 0.9; 4 ± 8; 2.9 ± 0.2; 2.48 ± 0.06

F2: 0.42 ± 0.09; 0.39 ± 0.01; 0.94 ± 0.06; 0.20 ± 0.02; 0.7 ± 0.2; 0.46 ± 0.04; 0.3 ± 0.1; 0.4 ± 1.8; 0.43 ± 0.08; 0.66 ± 0.03

Reconstruction channel: B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0γ; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0γ; B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0γ; B0s → D∗(2007)0 K−π+, D∗(2007)0 → D0π0; B0s → D∗(2007)0 K−π+, D∗(2007)0 → D0γ; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0π0; B0 → D∗(2007)0 K+π−, D∗(2007)0 → D0γ

[Figure 8.9: six panels of Entries/(10 MeV/c2) versus the (D∗(2007)0 K+π−), (D∗(2007)0 K−π+) and (D∗(2007)0 π+π−) mass [MeV/c2].]

Fig. 8.9 Fits to partially combinatorial background components involving B0(s) → D0 h + h − decays. (Top) B0 → D0 K+ π − reconstructed as B0 → D∗ (2007)0 K+ π − , (middle) B0s → D0 K− π + reconstructed as B0s → D∗ (2007)0 K− π + , (bottom) B0 → D0 π + π − reconstructed as B0 → D∗ (2007)0 π + π − . The left (right) plots correspond to D∗ (2007)0 reconstructed as D0 γ (D0 π 0 ). Distributions for B0 → D∗ (2007)0 K+ π − and B0s → D∗ (2007)0 K− π + candidates are obtained from MC samples, while those for the control channels (bottom plots) are obtained from data as detailed in Sect. 7.4

[Figure 8.10: four panels of Entries/(10 MeV/c2) versus the (D∗(2007)0 K+π−) and (D∗(2007)0 π+π−) mass [MeV/c2].]

Fig. 8.10 Fits to partially combinatorial background components involving B+ → D∗ (2007)0 h + decays. (Top) B+ → D∗ (2007)0 K+ reconstructed as B0 → D∗ (2007)0 K+ π − , (bottom) B+ → D∗ (2007)0 π + reconstructed as B0 → D∗ (2007)0 π + π − . The left (right) plots correspond to D∗ (2007)0 reconstructed as D0 γ (D0 π 0 )

8.5 Fit Strategy

As seen in the previous sections, many background components contribute to the different data samples. This makes it difficult to establish a reliable fit model that is consistent across all channels. However, many background components are common to multiple channels or can be related to other similar backgrounds. Thus, a simultaneous unbinned extended maximum-likelihood fit to the B0(s)-candidate mass distributions of the 6 channels has been developed using RooFit [4], in order to separate signal from background robustly while taking advantage of the similarities between the channels.

The mass range for the fit is chosen to be $m_{D^*hh}$ = [5100, 5900] MeV/c2, where $m_{D^*hh}$ is the mass of the reconstructed B0(s) candidate computed while fixing the D∗(2007)0 and D0 masses to their known values [5] using the DecayTreeFitter tool [6]. This range is chosen in preference to the full range of the selected data sample, which comprises events with $m_{D^*hh}$ = [5000, 6000] MeV/c2, for the following reasons. First, fixing the D∗(2007)0 and D0 masses using the DecayTreeFitter tool causes small shifts of the reconstructed B0(s)-candidate mass. Whilst this improves the resolution of the signal peak, it also generates some edge effects, which can be removed by truncating the range. Secondly, in the low-mass region, partially reconstructed backgrounds dominate the distribution. Including this region could therefore increase the dependence of the results on the correct modelling of these components (see discussion in Sect. 8.2). Reducing the fit range in the low-mass region even further, in order to remove partially reconstructed backgrounds completely, was considered. However, it is crucial to have a wide enough range in the mass fit in order to distinguish the combinatorial background from other components. Thus, a fit range of $m_{D^*hh}$ = [5100, 5900] MeV/c2 is chosen: this provides a wide enough range to describe the combinatorial background properly, but does mean that partially reconstructed backgrounds must be considered.

The yields of many of the components used in the model can be constrained between different channels. Here, and in the following, the word “constrain” (and its variants) implies that certain quantities are forced to take consistent values between the different channels. In some cases quantities are constrained to a value that is external to the fit. In these cases, although the value is floated in the fit, a Gaussian penalty term is included in the likelihood, with its mean and width corresponding to the external constraint value and its uncertainty (the determination of the mean and width parameters of these Gaussian constraints is described below). Care is taken that the constraints applied do not lead to systematic biases on the signal yields and therefore on the results. The different constraints utilised in the simultaneous fit model are:

• Misreconstructed signal and wrong D∗(2007)0 decay ratios: The ratio between misreconstructed signal and fully reconstructed signal yields is expected to be the same in all channels and to depend only on the reconstructed D∗(2007)0 decay.
The yields of misreconstructed signal components are thus given by the product of the correctly reconstructed signal yield and the fraction $f^{\gamma}_{\rm mrs}$ or $f^{\pi^0}_{\rm mrs}$, depending on the D∗(2007)0 decay. These fractions are set to be equal in all channels that share the same D∗(2007)0 decay channel. The values of these fractions are floating parameters in the final model, as they are not expected to be properly described in MC samples. In a similar way we can define $f^{\gamma}_{wD^*}$ and $f^{\pi^0}_{wD^*}$ as the ratios between “wrong D∗(2007)0” decays and correctly reconstructed signal yields. These parameters are also floated in the simultaneous fit model. The constraints between the fully reconstructed signal and the misreconstructed and wrong D∗(2007)0 components can then be written as

$$
\begin{aligned}
N^{\gamma}_{wD^*}(X) &= f^{\gamma}_{wD^*}\, N^{\gamma}_{\rm sig}(X), \\
N^{\pi^0}_{wD^*}(X) &= f^{\pi^0}_{wD^*}\, N^{\pi^0}_{\rm sig}(X), \\
N^{\gamma}_{\rm mrs}(X) &= f^{\gamma}_{\rm mrs}\, N^{\gamma}_{\rm sig}(X), \\
N^{\pi^0}_{\rm mrs}(X) &= f^{\pi^0}_{\rm mrs}\, N^{\pi^0}_{\rm sig}(X),
\end{aligned}
\tag{8.9}
$$

where $N^{z}_{y}(X)$ is the yield of the y component in the X decay channel, where z can be γ or π0 depending on the D∗(2007)0 decay, and y can be sig (correctly reconstructed signal), mrs (misreconstructed signal) or wD∗ (wrongly reconstructed D∗(2007)0 decay).

• Misidentified background ratios: Misidentified backgrounds appear in more than one of the channels. This includes cross-feed backgrounds that are correctly reconstructed in one channel but misidentified in another. The relative yields in the different channels can therefore be related by the relative efficiencies, both of particle identification (corrected using the PIDCORR package) and of the requirement that the candidate be within the B0(s)-candidate mass range used in the fit. The yields of B0s → D∗(2007)0 K+K− and B0 → D∗(2007)0 K+K− in the B0s → D∗(2007)0 K−π+ channel are defined relative to the yields of these components in the B0 → D∗(2007)0 K+π− channel ($f_{sKK}$ and $f_{KK}$). These ratios are expected to be the same for both D∗(2007)0 decays and therefore the same ratio is used for both cases. The yields from misidentified B0 → D∗(2007)0 π+π− decays in both the B0 → D∗(2007)0 K+π− and B0s → D∗(2007)0 K−π+ channels are defined relative to the correctly identified control channel yields ($f_{\pi\pi}$ and $f_{s\pi\pi}$, respectively), taking into account that the background shapes, unlike the signal distributions, have not been split to account for the misreconstructed neutral particles or the wrong D∗(2007)0 decay. The misidentified backgrounds in the control channel correspond to the signals in the other channels and are thus constrained in exactly the same way (defining the $f_{K\pi}$ and $f_{sK\pi}$ ratios). In all cases, these fractions are floated in the global fit but have an additional Gaussian constraint, with the mean and width obtained from the relative efficiency of each decay between reconstruction channels and its uncertainty, which is computed from the simulated samples.
These constraints can be written as

$$
\begin{aligned}
N(B^0_s \to D^{*0}K^+K^- \mid B^0_s \to D^{*0}K^-\pi^+) &= f_{sKK}\, N(B^0_s \to D^{*0}K^+K^- \mid B^0 \to D^{*0}K^+\pi^-) \\
N(B^0 \to D^{*0}K^+K^- \mid B^0_s \to D^{*0}K^-\pi^+) &= f_{KK}\, N(B^0 \to D^{*0}K^+K^- \mid B^0 \to D^{*0}K^+\pi^-) \\
N(B^0 \to D^{*0}\pi^+\pi^- \mid B^0 \to D^{*0}K^+\pi^-) &= f_{\pi\pi}\, N(B^0 \to D^{*0}\pi^+\pi^- \mid B^0 \to D^{*0}\pi^+\pi^-) \\
N(B^0 \to D^{*0}\pi^+\pi^- \mid B^0_s \to D^{*0}K^-\pi^+) &= f_{s\pi\pi}\, N(B^0 \to D^{*0}\pi^+\pi^- \mid B^0 \to D^{*0}\pi^+\pi^-) \\
N(B^0 \to D^{*0}K^+\pi^- \mid B^0 \to D^{*0}\pi^+\pi^-) &= f_{K\pi}\, N(B^0 \to D^{*0}K^+\pi^- \mid B^0 \to D^{*0}K^+\pi^-) \\
N(B^0_s \to D^{*0}K^-\pi^+ \mid B^0 \to D^{*0}\pi^+\pi^-) &= f_{sK\pi}\, N(B^0_s \to D^{*0}K^-\pi^+ \mid B^0_s \to D^{*0}K^-\pi^+)
\end{aligned}
\tag{8.10}
$$

where N(X|Y) stands for the yield of the X component reconstructed as the Y decay channel. These constraints are applied separately, but with the same constraint parameters, for each D∗(2007)0 decay.

• B0 → D0 h+h− partially combinatorial background ratios: The yields of partially combinatorial B0 → D0 h+h− backgrounds in the different channels depend exclusively on the probability to form a D∗(2007)0 candidate by combining the D0 with a random γ or π0 candidate. Thus, the ratio of the yield of B0 → D0 h+h− background with an extra π0 to that with an extra γ is expected to be the same for each h+h− combination. For instance, the yields of B0 → D0 π+π− when reconstructed as B0 → D∗(2007)0 π+π− with the two possible D∗(2007)0 decays can be related by the same ratio as the yields of B0 → D0 K+π− when reconstructed as B0 → D∗(2007)0 K+π−. This ratio, which we define as $f_{B2Dhh}$, is determined from MC.

The complete list of constraints for B0 → D0 h+h− partially combinatorial backgrounds is:

$$
\begin{aligned}
N(B^0 \to D^0K^+\pi^- \mid \pi^0) &= f_{B2Dhh}\, N(B^0 \to D^0K^+\pi^- \mid \gamma) \\
N(B^0_s \to D^0K^-\pi^+ \mid \pi^0) &= f_{B2Dhh}\, N(B^0_s \to D^0K^-\pi^+ \mid \gamma) \\
N(B^0 \to D^0\pi^+\pi^- \mid \pi^0) &= f_{B2Dhh}\, N(B^0 \to D^0\pi^+\pi^- \mid \gamma)
\end{aligned}
\tag{8.11}
$$

where N(B0 → D0 h+h−|x) stands for the yield of B0 → D0 h+h− when reconstructed as B0 → D∗(2007)0 h+h− with the D∗(2007)0 → D0 x decay. As in the previous case, this ratio is floated in the simultaneous fit, with a Gaussian constraint obtained from the relative efficiencies in simulated samples.

• B+ → D∗(2007)0 h+ partially combinatorial backgrounds: Following the same approach as for the previous category, the yields of partially combinatorial B+ → D∗(2007)0 h+ backgrounds depend on the probability to make a combination with a random pion. Since this probability should be the same independent of the D∗(2007)0 decay mode, constraints similar to those in Eq. (8.11) can be applied:

$$
\begin{aligned}
N(B^+ \to D^{*0}K^+ \mid B^0 \to D^{*0}K^+\pi^- \mid \pi^0) &= f_{BuDsth}\, N(B^+ \to D^{*0}K^+ \mid B^0 \to D^{*0}K^+\pi^- \mid \gamma) \\
N(B^+ \to D^{*0}\pi^+ \mid B^0 \to D^{*0}\pi^+\pi^- \mid \pi^0) &= f_{BuDsth}\, N(B^+ \to D^{*0}\pi^+ \mid B^0 \to D^{*0}\pi^+\pi^- \mid \gamma)
\end{aligned}
\tag{8.12}
$$

where N(B+ → D∗0 h+|x|y) corresponds to the yield of B+ → D∗0 h+ when reconstructed in the x final state with D∗(2007)0 → D0 y. The parameter $f_{BuDsth}$ is defined in the same way as $f_{B2Dhh}$ in the case of partially combinatorial B0 → D0 h+h− backgrounds, and is floated with a Gaussian constraint obtained from simulation.

In addition, the relative yields of partially combinatorial backgrounds from B+ → D∗0 K+ and B+ → D∗0 π+ decays can be constrained, since their relative branching fractions and reconstruction probabilities are known. The relative branching fraction is [7–9]

$$
f^{\rm PDG}_{Bu} = \frac{\mathcal{B}(B^+ \to D^{*}(2007)^0 K^+)}{\mathcal{B}(B^+ \to D^{*}(2007)^0 \pi^+)} = (8.34 \pm 0.36)\% . \tag{8.13}
$$

Moreover, the probabilities for B+ → D∗0 K+ and B+ → D∗0 π+ decays to form partially combinatorial backgrounds do not need to be identical, which is accommodated in the constraint by including the parameter $f^{\rm MC}_{Bu}$. The constraints can thus be written as

$$
\begin{aligned}
N(B^+ \to D^{*0}K^+ \mid B^0 \to D^{*0}K^+\pi^- \mid \gamma) &= f^{\rm PDG}_{Bu}\, f^{\rm MC}_{Bu}\, N(B^+ \to D^{*0}\pi^+ \mid B^0 \to D^{*0}\pi^+\pi^- \mid \gamma) \\
N(B^+ \to D^{*0}K^+ \mid B^0 \to D^{*0}K^+\pi^- \mid \pi^0) &= f^{\rm PDG}_{Bu}\, f^{\rm MC}_{Bu}\, N(B^+ \to D^{*0}\pi^+ \mid B^0 \to D^{*0}\pi^+\pi^- \mid \pi^0)
\end{aligned}
\tag{8.14}
$$

where it may be noted that the same parameter $f^{\rm MC}_{Bu}$ is used for both D∗(2007)0 decay modes, since it was verified in MC not to depend on the decay mode. As in the previous cases, both $f^{\rm PDG}_{Bu}$ and $f^{\rm MC}_{Bu}$ are floated in the fit with external Gaussian constraints, obtained from Eq. (8.13) and from simulated samples, respectively.

The yields of all partially combinatorial B+ backgrounds are defined relative to each other by Eqs. (8.12) and (8.14), leaving one yield in this category to float in the fit (this is chosen to be the yield of the B+ → D∗(2007)0 π+ partially combinatorial background in the control channel with D∗(2007)0 → D0 γ). However, it is possible to constrain these components further, and it is desirable to do so, since otherwise the fit may have difficulty distinguishing the shape of this background from other components. This is achieved by adding an extra constraint, relating the yield of the B+ → D∗0 π+ background in the B0 → D∗(2007)0 π+π−, D∗(2007)0 → D0 γ control channel to the yield of the B0 → D0 π+π− partially combinatorial background in the same channel. Since the B0 → D0 π+π− partially combinatorial background has one of the most distinctive shapes, it is a good candidate to constrain yields from other backgrounds. As in Eq. (8.14), implementation of this constraint requires knowledge of both the relative branching fractions and reconstruction probabilities. The former is given by

$$
f^{\rm PDG}_{B^+/B^0} = \frac{\mathcal{B}(B^+ \to D^{*}(2007)^0 \pi^+)\, \mathcal{B}(D^{*}(2007)^0 \to D^0 \gamma)}{\mathcal{B}(B^0 \to D^0 \pi^+\pi^-)} = 1.97 \pm 0.14 , \tag{8.15}
$$

where the most precise single results for B(B+ → D∗(2007)0 π+), B(D∗(2007)0 → D0 γ) and B(B0 → D0 π+π−) are from Refs. [7, 10, 11], respectively. The relative selection efficiency of these two backgrounds, denoted $f^{\rm MC}_{B^+/B^0}$, is obtained from MC. In conclusion, this final constraint on partially combinatorial background is given by

$$
N(B^+ \to D^{*0}\pi^+ \mid B^0 \to D^{*0}\pi^+\pi^- \mid \gamma) = f^{\rm PDG}_{B^+/B^0}\, f^{\rm MC}_{B^+/B^0}\, N(B^0 \to D^0\pi^+\pi^- \mid \gamma) \tag{8.16}
$$

Note that in both Eqs. (8.14) and (8.16), the $f^{\rm PDG}$ and $f^{\rm MC}$ terms enter as a product, so the fit does not have sensitivity to them individually. Nonetheless, it does not cause any difficulty to consider them as separate constraints in the fit.

• Partially reconstructed background ratios: Since all partially reconstructed backgrounds contain the same D∗(2007)0 decay structure as the signal decay in their respective channel, their yields relative to the signal should be the same independent of the D∗(2007)0 decay. We define $f^{B^0}_{\rm PR}$, $f^{B^0_s}_{\rm PR}$, $f^{\rm Control}_{\rm PR1}$ and $f^{\rm Control}_{\rm PR2}$ as the ratios between each partially reconstructed background yield and the signal yield in the B0, B0s and control channels, respectively. In the fit, each of these is constrained to be the same for both D∗(2007)0 decay modes (in this case it is not constrained to any external input value). Since the control channel contains two different partially reconstructed backgrounds with distinctive distributions, two different ratios are allowed. These constraints can then be expressed as

$$
\begin{aligned}
N(B^+ \to D^{*}(2007)^0 K_1(1270)^+ \mid B^0 \mid X) &= f^{B^0}_{\rm PR}\, N(B^0 \to D^{*}(2007)^0 K^+\pi^- \mid X) \\
N(B^+ \to D^{*}(2007)^0 K_1(1270)^+ \mid B^0_s \mid X) &= f^{B^0_s}_{\rm PR}\, N(B^0_s \to D^{*}(2007)^0 K^-\pi^+ \mid X) \\
N(B^0 \to D^{*}(2007)^0 \eta'(958) \mid C \mid X) &= f^{\rm Control}_{\rm PR1}\, N(B^0 \to D^{*}(2007)^0 \pi^+\pi^- \mid X) \\
N(B^+ \to D^{*}(2007)^0 a_1(1260)^+ \mid C \mid X) &= f^{\rm Control}_{\rm PR2}\, N(B^0 \to D^{*}(2007)^0 \pi^+\pi^- \mid X)
\end{aligned}
\tag{8.17}
$$

where X represents the D∗(2007)0 decay and C is shorthand for the control channel.
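The way the ratio parameters tie yields together, and the way an external Gaussian constraint enters the likelihood, can be sketched numerically. The code below is a simplified illustration and not the RooFit model used in the analysis: the function names are hypothetical, only central values are propagated, the numbers are the fitted values and constraint parameters quoted in Table 8.10, and it assumes that the parameter listed there as f_Bu2Dsth is the ratio appearing in Eq. (8.12).

```python
def gaussian_penalty(value, mean, width):
    """Contribution of one Gaussian constraint to -ln L (constant term dropped)."""
    return 0.5 * ((value - mean) / width) ** 2


def tied_yield(reference_yield, *ratios):
    """Yield fixed to a reference yield through a product of ratio parameters."""
    result = reference_yield
    for ratio in ratios:
        result *= ratio
    return result


# (fitted value, constraint mean, constraint width), as quoted in Table 8.10
constrained = {
    "f_PDG_B+/B0": (2.0054, 1.97, 0.14),
    "f_MC_B+/B0": (0.3737, 0.3736, 0.0032),
    "f_PDG_Bu": (0.0832, 0.0834, 0.0036),
    "f_MC_Bu": (0.3521, 0.3521, 0.0012),
    "f_Bu2Dsth": (0.2239, 0.2197, 0.0068),
}
fitted = {name: values[0] for name, values in constrained.items()}

# N(B0 -> D0 pi+ pi- | Control, gamma), the floating reference yield (Table 8.10)
n_d0pipi_gamma = 2894.61

# Eq. (8.16): B+ -> D*0 pi+ background in the control channel, D*0 -> D0 gamma
n_bu_pi_gamma = tied_yield(n_d0pipi_gamma, fitted["f_PDG_B+/B0"], fitted["f_MC_B+/B0"])
# Eq. (8.14): B+ -> D*0 K+ background in the B0 channel, D*0 -> D0 gamma
n_bu_k_gamma = tied_yield(n_bu_pi_gamma, fitted["f_PDG_Bu"], fitted["f_MC_Bu"])
# Eq. (8.12): the same two backgrounds with D*0 -> D0 pi0
n_bu_pi_pi0 = tied_yield(n_bu_pi_gamma, fitted["f_Bu2Dsth"])
n_bu_k_pi0 = tied_yield(n_bu_k_gamma, fitted["f_Bu2Dsth"])

# Sum of the penalty terms that these five constraints add to -ln L
total_penalty = sum(gaussian_penalty(*values) for values in constrained.values())

for label, value in [
    ("B+ -> D*0 pi+ | gamma", n_bu_pi_gamma),
    ("B+ -> D*0 pi+ | pi0", n_bu_pi_pi0),
    ("B+ -> D*0 K+  | gamma", n_bu_k_gamma),
    ("B+ -> D*0 K+  | pi0", n_bu_k_pi0),
]:
    print(f"N({label}) = {value:.1f}")
print(f"penalty added to -ln L = {total_penalty:.3f}")
```

In this picture none of the four B+ background yields is a free parameter: each is a function of one floating yield and of ratio parameters that are themselves pulled towards their external values by the penalty terms.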

8.6 Fit Results

Accounting for all the aforementioned constraints, the global simultaneous fit has a total of 49 degrees of freedom, 12 of which are not completely floated but have Gaussian constraints applied. The 37 completely floating parameters are: the 6 signal yields of the signal and control channels, 4 ratios of misreconstructed signal and wrong D∗(2007)0 decays relative to correctly reconstructed signal, 6 combinatorial background yields, 6 combinatorial background slope parameters, 4 yields of misidentified background from B0(s) → D∗(2007)0 K+K− decays to the D∗(2007)0 K+π− final states, 3 yields of partially combinatorial B0(s) → D0 h+h− backgrounds, 4 ratios related to the yields of partially reconstructed decays, and 2 shift and 2 scale parameters that quantify differences in the signal shape between data and MC. The 12 parameters with external Gaussian constraints are composed of 6 that constrain the misidentified background yields, and 6 that constrain the partially combinatorial background yields.

The complete set of results obtained from the simultaneous fit is shown in Table 8.10. The fit results are also displayed in Fig. 8.11 (linear scale) and Fig. 8.12 (log scale). The χ2/Nbins values quantifying the agreement between the fit models and the six spectra included in the simultaneous fit are given in Table 8.11. In general, the fit model gives a good description of the data sample. In particular, the estimates of the signal yields, which are the observables used to calculate the relative branching fractions that are the primary objectives of this analysis, are robust. This is partially thanks to the decomposition of the signal component into fully reconstructed and misreconstructed candidates. Some discrepancies between the data and the fit model can be observed in certain regions. A modest excess can be observed in the B0s → D∗(2007)0 K−π+ channels around 5300 MeV/c2.
This is thought to be associated with mismodelling of the contributions from B0s → D∗(2007)0 K+K− decays, which proved to be extremely challenging to model due to the lack of simulated events (a consequence of the extremely low selection retention rate) and the limited knowledge of this channel. On the other hand, the fit model appears to be extremely successful in the high-mass region. This is partially thanks to the

Table 8.10 Results of the simultaneous fit. For quantities that have Gaussian constraints applied, the parameters of the constraint are also indicated. Units of MeV/c2 on the shift parameters μ, and of (MeV/c2)−1 on the slope parameters p0, are implied. Uncertainties are statistical only

| Parameter | Fitted value | Gaussian constraint (μ ± σ) |
|---|---|---|
| N(B0 \| γ) | 946.37 ± 53.39 | |
| N(B0 \| π0) | 184.66 ± 17.04 | |
| N(B0s \| γ) | 3744.32 ± 76.85 | |
| N(B0s \| π0) | 632.72 ± 45.75 | |
| N(Control \| γ) | 15020.91 ± 217.84 | |
| N(Control \| π0) | 2591.49 ± 189.70 | |
| $f^{\gamma}_{wD^*}$ | 0.3848 ± 0.0055 | |
| $f^{\pi^0}_{wD^*}$ | 0.1453 ± 0.0069 | |
| $f^{\gamma}_{\rm mrs}$ | 0.3322 ± 0.0413 | |
| $f^{\pi^0}_{\rm mrs}$ | 1.5218 ± 0.1630 | |
| N(Combinatorial Bg \| B0, γ) | 1758.16 ± 211.21 | |
| N(Combinatorial Bg \| B0, π0) | 416.26 ± 41.77 | |
| N(Combinatorial Bg \| B0s, γ) | 5944.49 ± 558.01 | |
| N(Combinatorial Bg \| B0s, π0) | 1258.17 ± 93.84 | |
| N(Combinatorial Bg \| Control, γ) | 23886.5 ± 1686.82 | |
| N(Combinatorial Bg \| Control, π0) | 6310.04 ± 349.01 | |
| Combinatorial slope p0 in B0, γ | −0.003804 ± 0.000308 | |
| Combinatorial slope p0 in B0, π0 | −0.005086 ± 0.000372 | |
| Combinatorial slope p0 in B0s, γ | −0.003915 ± 0.000227 | |
| Combinatorial slope p0 in B0s, π0 | −0.004901 ± 0.000240 | |
| Combinatorial slope p0 in Control, γ | −0.003231 ± 0.000156 | |
| Combinatorial slope p0 in Control, π0 | −0.004766 ± 0.000153 | |
| N(B0s → D∗(2007)0 K+K− \| B0, γ) | 374.33 ± 169.78 | |
| N(B0s → D∗(2007)0 K+K− \| B0, π0) | 23.54 ± 17.23 | |
| N(B0 → D∗(2007)0 K+K− \| B0, γ) | 97.66 ± 50.81 | |
| N(B0 → D∗(2007)0 K+K− \| B0, π0) | 29.88 ± 11.19 | |
| $f_{sKK}$ | 0.7433 ± 0.2685 | 0.4324 ± 0.2518 |
| $f_{KK}$ | 1.6276 ± 0.3220 | 1.3880 ± 0.3498 |
| $f_{\pi\pi}$ | 0.00997 ± 0.00075 | 0.0097 ± 0.0008 |
| $f_{s\pi\pi}$ | 0.00338 ± 0.000299 | 0.0034 ± 0.0003 |
| $f_{K\pi}$ | 0.1586 ± 0.00409 | 0.1583 ± 0.0041 |
| $f_{sK\pi}$ | 0.0686 ± 0.00339 | 0.0619 ± 0.0034 |
| N(B0 → D0 K+π− \| B0, γ) | 207.22 ± 43.81 | |
| N(B0s → D0 K−π+ \| B0s, γ) | 264.45 ± 54.05 | |
| N(B0 → D0 π+π− \| Control, γ) | 2894.61 ± 169.59 | |
| $f_{B2Dhh}$ | 0.1396 ± 0.0012 | 0.1390 ± 0.0012 |
| $f^{\rm MC}_{Bu}$ | 0.3521 ± 0.0012 | 0.3521 ± 0.0012 |
| $f^{\rm PDG}_{Bu}$ | 0.0832 ± 0.0036 | 0.0834 ± 0.0036 |
| $f_{Bu2Dsth}$ | 0.2239 ± 0.00673 | 0.2197 ± 0.0068 |
| $f^{\rm MC}_{B^+/B^0}$ | 0.3737 ± 0.00320 | 0.3736 ± 0.0032 |
| $f^{\rm PDG}_{B^+/B^0}$ | 2.0054 ± 0.1368 | 1.97 ± 0.14 |
| $f^{B^0}_{\rm PR}$ | 0.1168 ± 0.0569 | |
| $f^{B^0_s}_{\rm PR}$ | 0.0544 ± 0.0884 | |
| $f^{\rm Control}_{\rm PR1}$ | 0.4841 ± 0.0164 | |
| $f^{\rm Control}_{\rm PR2}$ | 0.4201 ± 0.0389 | |
| $\mu^{\rm MC}_{\gamma}$ | −0.9402 ± 0.1775 | |
| $\mu^{\rm MC}_{\pi^0}$ | −0.4096 ± 0.4460 | |
| $\sigma^{\rm MC}_{\gamma}$ | 1.0067 ± 0.0121 | |
| $\sigma^{\rm MC}_{\pi^0}$ | 0.9947 ± 0.0356 | |

utilization of data-driven techniques to model contributions from the B0 → D0 π+π− background in the control channel, which play an important role in constraining similar backgrounds in the other channels. In general, the fit model describes the data sample to a satisfactory level of agreement, while the large number of backgrounds and the limited knowledge of them make it very challenging to improve the model further. Systematic uncertainties due to the imperfection of the fit model are discussed in Sect. 10.

The yields of both B0 → D∗(2007)0 K+π− and B0s → D∗(2007)0 K−π+ decays seen in Fig. 8.11 and reported in Table 8.10 are highly significant, with the smallest being N(B0|π0) = 184.7 ± 17.0. Since the significance of both signals is clearly far in excess of 5σ, quantification of the significance is unnecessary. Both B0 → D∗(2007)0 K+π− and B0s → D∗(2007)0 K−π+ decays are observed for the first time.

Due to the nature of the simultaneous fit, with various constraints between different components, significant correlations between fit parameters are to be expected. The correlation factors between the measured signal yields are given in Table 8.12. The correlations between the modes with D∗(2007)0 → D0 γ decays are relatively low, the largest being O(20%). However, larger correlations, up to almost 90%, are seen between yields with D∗(2007)0 → D0 π0 decays. Investigations with pseudoexperiments in which different fit constraints are floated show that this is

[Figure 8.11: six panels of Candidates/(10 MeV/c2) versus m(D∗(2007)0 K+π−), m(D∗(2007)0 K−π+) and m(D∗(2007)0 π+π−) [MeV/c2]. The legends list the signal, D∗(2007)0 → D0 π0 or D∗(2007)0 → D0 γ events, misreconstructed signal, the misidentified, partially combinatorial and partially reconstructed background components, and the combinatorial background.]

Fig. 8.11 Results of the simultaneous fits with linear y-axis scale. Distributions of (top) B0 → D∗(2007)0 K+π−, (middle) B0s → D∗(2007)0 K−π+, and (bottom) B0 → D∗(2007)0 π+π− with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 candidates are shown. The total fit result is shown as a solid blue line. All channels also include components for correctly reconstructed signal (solid green line), misreconstructed signal (dashed dark green line) and signal with the wrong D∗(2007)0 decay reconstructed (dashed light green line). Misidentified B0 → D∗(2007)0 π+π− and B0s → D∗(2007)0 K+K− decays are indicated by blue and purple lines, respectively. In the control channel, the lighter and darker blue lines indicate misidentified B0 → D∗(2007)0 K+π− and B0s → D∗(2007)0 K−π+ decays, respectively. Partially combinatorial backgrounds are shown in orange (B0(s) → D0 h+h−) and brown (B+ → D∗(2007)0 h+), while partially reconstructed backgrounds are grey and black. The combinatorial background component is shown in red

[Figure 8.12: six panels of Candidates/(10 MeV/c2), on a logarithmic scale, versus m(D∗(2007)0 K+π−), m(D∗(2007)0 K−π+) and m(D∗(2007)0 π+π−) [MeV/c2].]

Fig. 8.12 Results of the simultaneous fits with log y-axis scale. Distributions of (top) B0 → D∗(2007)0 K+π−, (middle) B0s → D∗(2007)0 K−π+, and (bottom) B0 → D∗(2007)0 π+π− with (left) D∗(2007)0 → D0 γ and (right) D∗(2007)0 → D0 π0 candidates are shown. The total fit result is shown as a solid blue line. All channels also include components for correctly reconstructed signal (solid green line), misreconstructed signal (dashed dark green line) and signal with the wrong D∗(2007)0 decay reconstructed (dashed light green line). Misidentified B0 → D∗(2007)0 π+π− and B0s → D∗(2007)0 K+K− decays are indicated by blue and purple lines, respectively. In the control channel, the lighter and darker blue lines indicate misidentified B0 → D∗(2007)0 K+π− and B0s → D∗(2007)0 K−π+ decays, respectively. Partially combinatorial backgrounds are shown in orange (B0(s) → D0 h+h−) and brown (B+ → D∗(2007)0 h+), while partially reconstructed backgrounds are grey and black. The combinatorial background component is shown in red

Table 8.11 χ2/Nbins values from each of the spectra fitted in the simultaneous fit

| Mode | χ2/Nbins |
|---|---|
| B0, γ | 1.03 |
| B0, π0 | 0.69 |
| B0s, γ | 1.98 |
| B0s, π0 | 1.22 |
| Control, γ | 2.04 |
| Control, π0 | 2.28 |
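The definition of χ2/Nbins is not spelled out here. A minimal sketch of one common convention is given below, assuming a Pearson-type χ2 in which each bin contributes the squared pull (n − f)²/f, with n the observed count and f the fitted expectation. The function name and the toy counts are hypothetical and are not taken from the analysis.

```python
import math


def chi2_per_bin(observed, expected):
    """Pearson chi2 divided by the number of bins with a positive expectation."""
    pulls = [
        (n - f) / math.sqrt(f)
        for n, f in zip(observed, expected)
        if f > 0.0  # empty-expectation bins are skipped in this sketch
    ]
    if not pulls:
        raise ValueError("no bins with positive expectation")
    return sum(p * p for p in pulls) / len(pulls)


# Toy example: two bins, each one standard deviation away from the model.
toy_observed = [110.0, 90.0]
toy_expected = [100.0, 100.0]
print(f"chi2/Nbins = {chi2_per_bin(toy_observed, toy_expected):.2f}")
```

With this convention, values close to one indicate that the typical bin deviates from the model by about one standard deviation.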

Table 8.12 Correlation factors between the signal yields from the simultaneous fit

| | N(B0, γ) | N(B0, π0) | N(B0s, γ) | N(B0s, π0) | N(Control, γ) | N(Control, π0) |
|---|---|---|---|---|---|---|
| N(B0, γ) | – | 0.018 | 0.223 | 0.018 | 0.094 | 0.056 |
| N(B0, π0) | 0.018 | – | 0.064 | 0.646 | −0.025 | 0.675 |
| N(B0s, γ) | 0.223 | 0.064 | – | 0.100 | 0.113 | 0.093 |
| N(B0s, π0) | 0.018 | 0.646 | 0.100 | – | −0.035 | 0.866 |
| N(Control, γ) | 0.094 | −0.025 | 0.113 | −0.035 | – | −0.001 |
| N(Control, π0) | 0.056 | 0.675 | 0.093 | 0.866 | −0.001 | – |

mainly due to the shared parameter $f^{\pi^0}_{\rm mrs}$, which fixes the yield of misreconstructed signal relative to correctly reconstructed signal and takes a fairly large value (around 1.5). These correlations are taken into account when calculating the ratios of branching fractions, as described in Chap. 11.
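The effect of such correlations on a ratio of yields can be illustrated with first-order error propagation. The sketch below is a generic illustration, not the procedure of Chap. 11: the function name is hypothetical, efficiencies are ignored, and the inputs are the statistical values quoted in Tables 8.10 and 8.12 for the D∗(2007)0 → D0 π0 yields of the B0s and control channels.

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b, rho=0.0):
    """Ratio a/b and its first-order uncertainty for correlation coefficient rho."""
    ratio = a / b
    rel_a = sigma_a / a
    rel_b = sigma_b / b
    rel_var = rel_a ** 2 + rel_b ** 2 - 2.0 * rho * rel_a * rel_b
    # Guard against tiny negative values from rounding when rho is close to 1.
    return ratio, abs(ratio) * math.sqrt(max(rel_var, 0.0))


n_bs_pi0, err_bs_pi0 = 632.72, 45.75          # N(B0s | pi0), Table 8.10
n_ctrl_pi0, err_ctrl_pi0 = 2591.49, 189.70    # N(Control | pi0), Table 8.10
rho = 0.866                                   # correlation factor, Table 8.12

r_corr, err_corr = ratio_with_uncertainty(n_bs_pi0, err_bs_pi0, n_ctrl_pi0, err_ctrl_pi0, rho)
r_unc, err_unc = ratio_with_uncertainty(n_bs_pi0, err_bs_pi0, n_ctrl_pi0, err_ctrl_pi0, 0.0)

print(f"ratio = {r_corr:.4f}")
print(f"uncertainty with correlation    = {err_corr:.4f}")
print(f"uncertainty ignoring correlation = {err_unc:.4f}")
```

For a positive correlation the uncertainty on the ratio is smaller than the uncorrelated estimate, which is why neglecting the correlations would misstate the precision of the ratio.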

References

1. Skwarnicki T (1986) A study of the radiative cascade transitions between the Upsilon-prime and Upsilon resonances. PhD thesis, Institute of Nuclear Physics, Krakow. DESY-F31-86-02
2. ARGUS collaboration, Albrecht H et al (1990) Search for hadronic b → u decays. Phys Lett B241:278. https://doi.org/10.1016/0370-2693(90)91293-K
3. Johnson NL (1949) Bivariate distributions based on simple translation systems. Biometrika 36:297. https://doi.org/10.1093/biomet/36.3-4.297
4. Verkerke W, Kirkby DP (2003) The RooFit toolkit for data modeling. C0303241:MOLT007. arXiv:physics/0306116
5. Particle Data Group, Zyla PA et al (2020) Review of particle physics. Prog Theor Exp Phys 2020:083C01. http://pdg.lbl.gov/, https://doi.org/10.1093/ptep/ptaa104
6. Hulsbergen WD (2005) Decay chain fitting with a Kalman filter. Nucl Instrum Meth A552:566. https://doi.org/10.1016/j.nima.2005.06.078, arXiv:physics/0503191
7. LHCb collaboration, Aaij R et al (2018) Measurement of CP observables in B± → D(∗)K± and B± → D(∗)π±. Phys Lett B777:16. https://doi.org/10.1016/j.physletb.2017.11.070, arXiv:1708.06370
8. BaBar collaboration, Aubert B et al (2005) Measurement of the ratio B(B− → D∗0K−)/B(B− → D∗0π−) and of the CP asymmetry of B− → D∗0(CP+)K− decays. Phys Rev D71:031102. https://doi.org/10.1103/PhysRevD.71.031102, arXiv:hep-ex/0411091
9. LHCb collaboration, Aaij R et al (2021) Measurement of CP observables in B± → D(∗)K± and B± → D(∗)π± decays using two-body D final states. JHEP 04:081. https://doi.org/10.1007/JHEP04(2021)081, arXiv:2012.09903
10. BESIII collaboration, Ablikim M et al (2015) Precision measurement of the D∗0 decay branching fractions. Phys Rev D91:031101. https://doi.org/10.1103/PhysRevD.91.031101, arXiv:1412.4566
11. LHCb collaboration, Aaij R et al (2015) Dalitz plot analysis of B0 → D̄0π+π− decays. Phys Rev D92:032002. https://doi.org/10.1103/PhysRevD.92.032002, arXiv:1505.01710

Chapter 9

Signal Efficiencies

As this analysis presents the first observation of these decays, no accurate data-tested models of the Dalitz plot distributions exist. Thus, the simulation samples used for the signal are generated flat across the square Dalitz plane. This choice at least ensures a sufficient population of events across the phase space to obtain efficiency estimates with reasonable uncertainty. However, without further corrections, using these samples to determine the signal efficiencies would bias the results significantly, because the efficiency varies across the phase space and the distribution of events over the phase space differs between signal and MC. To avoid this limiting systematic uncertainty, we reweight the simulated events using a data-driven method in which the fit results on the invariant mass of $B^0_{(s)}$ candidates are used to project out the signal distribution in the Dalitz space. This projection is then used to reweight the signal simulation samples. The approach is very similar to the commonly used sPlot method [1], which has been used for this purpose in numerous previous LHCb analyses [2–6]. Essentially, the ratio of branching fractions (illustrated here for decay modes A and B) can be expanded as

$$\frac{\mathcal{B}(A)}{\mathcal{B}(B)} = \frac{N(A)/\bar{\epsilon}(A)}{N(B)/\bar{\epsilon}(B)} = \frac{\sum_i w_A(m_B^i)/\epsilon_A(m'_i, \theta'_i)}{\sum_j w_B(m_B^j)/\epsilon_B(m'_j, \theta'_j)}, \tag{9.1}$$

where the indices i and j run over candidates in the A and B decay channels, respectively. The functions $w_{A(B)}$ and $\epsilon_{A(B)}$ are the signal weight and efficiency functions for A (B), respectively, and $m_B^{i(j)}$, $m'_{i(j)}$ and $\theta'_{i(j)}$ are the $B^0_{(s)}$-candidate mass and square Dalitz plot (SDP) variables for candidate i (j). Any other phase-space variable could have been used instead of the SDP to describe the efficiency, as long as it is uncorrelated with the $B^0_{(s)}$-candidate mass. However, the SDP variables are chosen because the MC samples have been generated flat in this plane, allowing for a more convenient representation of the efficiency maps. In terms of these weights, the average efficiency can be written as

$$\bar{\epsilon}(X) = \frac{\sum_i w_X(m_B^i)}{\sum_i w_X(m_B^i)/\epsilon_X(m'_i, \theta'_i)}. \tag{9.2}$$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_9

This definition is general for any efficiency correction that uses weights to project out the signal component in the data. However, the method used in this analysis differs from previous analyses in the definition of the signal weight function. This novel approach is based on a new derivation of the classic sPlot method, which has been developed within LHCb [7]. The new sWeight method has a few distinct advantages over the classic sPlot implementation. These arise because the new method only requires a description of the signal and background shapes in the discriminating variable (in this case the invariant mass of the $B^0_{(s)}$ candidate) and their relative fractions. This means that
1. sWeights can be extracted in a different mass region to that which is used for fitting (useful in certain situations but not actually used in our case);
2. one can constrain yield parameters, yield fractions and indeed shape parameters and still extract meaningful sWeights (the latter is allowed in the classic sPlot method but the former two are not).
The latter point is absolutely vital in this analysis because many of the peaking background shapes have their relative normalisations fixed or constrained in the simultaneous mass fit.

9.1 The Weight Function $w_s$

The complete derivation of this method can be found in [7]. For brevity, this thesis only discusses its final results, as these are the necessary elements for the efficiency correction. In the presence of signal and a single background component, the weight function $w_s(m_B)$ for the signal can be written as

$$w_s(m_B) = \frac{\alpha_s g_s(m_B) + \alpha_b g_b(m_B)}{g_{\rm Total}(m_B)}, \tag{9.3}$$

where $g_s(m_B)$ ($g_b(m_B)$) is the signal (background) probability density function (PDF), $g_{\rm Total}$ is the complete model PDF, and the $\alpha$ parameters are obtained by solving the matrix equation

$$\begin{pmatrix} W_{ss} & W_{sb} \\ W_{sb} & W_{bb} \end{pmatrix} \cdot \begin{pmatrix} \alpha_s \\ \alpha_b \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{9.4}$$

where

$$W_{xy} = \int \frac{g_x(m_B)\, g_y(m_B)}{g_{\rm Total}(m_B)}\, dm_B. \tag{9.5}$$
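To make Eqs. 9.3–9.5 concrete, the following is a minimal numerical sketch, not the implementation used in the analysis. It tabulates the component PDFs on a mass grid, builds the $W$ matrix by trapezoidal integration, solves for the $\alpha$ factors and evaluates $w_s(m_B)$; it is written for N components, of which the two-component case is shown. The Gaussian signal, exponential background, their parameters, the 30%/70% fractions and all function names are assumptions chosen purely for illustration.

```python
import math


def trapezoid(values, grid):
    """Trapezoidal integral of tabulated values over the grid."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def signal_weights(grid, pdfs, fractions, signal_index=0):
    """Tabulate w_s(m_B) on the grid following Eqs. 9.3-9.5 for N components."""
    n_points = len(grid)
    shapes = []
    for pdf in pdfs:
        values = [pdf(m) for m in grid]
        norm = trapezoid(values, grid)  # normalise each PDF on the grid
        shapes.append([v / norm for v in values])
    # Complete model PDF: fraction-weighted sum of the components.
    total = [sum(f * s[i] for f, s in zip(fractions, shapes)) for i in range(n_points)]
    n = len(shapes)
    # W_xy of Eq. 9.5.
    w_matrix = [
        [trapezoid([shapes[x][i] * shapes[y][i] / total[i] for i in range(n_points)], grid)
         for y in range(n)]
        for x in range(n)
    ]
    # Right-hand side of Eq. 9.4: 1 for the signal component, 0 otherwise.
    unit = [1.0 if k == signal_index else 0.0 for k in range(n)]
    alphas = solve_linear(w_matrix, unit)
    # Eq. 9.3, generalised to a sum over all components.
    weights = [sum(a * s[i] for a, s in zip(alphas, shapes)) / total[i]
               for i in range(n_points)]
    return weights, shapes


# Hypothetical toy model on a 5200-5800 MeV/c^2 grid (0.5 MeV/c^2 steps).
grid = [5200.0 + 0.5 * i for i in range(1201)]
pdfs = [
    lambda m: math.exp(-0.5 * ((m - 5280.0) / 20.0) ** 2),  # toy signal peak
    lambda m: math.exp(-(m - 5200.0) / 300.0),              # toy background
]
fractions = [0.3, 0.7]
weights, shapes = signal_weights(grid, pdfs, fractions)

# By construction the weight function projects out the signal component.
signal_integral = trapezoid([w * g for w, g in zip(weights, shapes[0])], grid)
background_integral = trapezoid([w * g for w, g in zip(weights, shapes[1])], grid)
```

In this toy the weight function integrates to one against the signal PDF and to zero against the background PDF, which is the defining property that lets the weights project out the signal distribution in other variables.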

When more components are present, the 2 × 2 matrix equation of Eq. 9.4 is expanded to an N × N system of equations, with the $\alpha$ factors expanded to N elements, and the weight function is computed as the corresponding linear combination using the $\alpha_x$ factors of Eq. 9.4. The resulting weight function for the projection of the signal component, $w_s$, depends only on $m_B$ and is shown in Fig. 9.1. These weights can then be used to project the signal square Dalitz plot distributions, as can be seen in Fig. 9.2. All channels show different structures due to the resonances present in the three-body decay. The horizontal bands correspond to $h^+h^-$ resonances in all channels: these are $K^*(892)^0 \to K^\pm\pi^\mp$ in the $B^0 \to D^*(2007)^0 K^+\pi^-$ and $B_s^0 \to D^*(2007)^0 K^-\pi^+$ channels, and $\rho(770)^0 \to \pi^+\pi^-$ in the control channel. The vertical bands correspond to narrower resonances due to $D$ and $D_s$ decays. These are $D_1(2420)^- \to D^*(2007)^0\pi^-$ decays in the $B^0 \to D^*(2007)^0 K^+\pi^-$ and $B^0 \to D^*(2007)^0 \pi^+\pi^-$ channels, and $D_{s1}(2536)^- \to D^*(2007)^0 K^-$ decays in the $B_s^0 \to D^*(2007)^0 K^-\pi^+$ channel. Vertical bands in the $B^0 \to D^*(2007)^0 K^+\pi^-$ and $B^0 \to D^*(2007)^0 \pi^+\pi^-$ channels appear in different positions due to a different particle ordering used to define the SDP in each channel.

Fig. 9.1 Signal weight functions $w_s(m_B)$ (vertical axis: signal weight; horizontal axis: $D^*(2007)^0 h^+h^-$ mass in MeV/$c^2$) for all components in all channels: (top) $B^0 \to D^{*0}K^+\pi^-$, (middle) $B_s^0 \to D^{*0}K^-\pi^+$ and (bottom) $B^0 \to D^{*0}\pi^+\pi^-$, with (left) $D^*(2007)^0 \to D^0\gamma$ and (right) $D^*(2007)^0 \to D^0\pi^0$. The weight function peaks at the $B^0_{(s)}$ mass, although other peaks are present to compensate for the weight functions of other background components

Fig. 9.2 Signal square Dalitz plot distributions (in the $m'$, $\theta'$ plane) obtained by applying $w_s$ weights to the data sample, for (top) $B^0 \to D^{*0}K^+\pi^-$, (middle) $B_s^0 \to D^{*0}K^-\pi^+$ and (bottom) $B^0 \to D^{*0}\pi^+\pi^-$, with (left) $D^*(2007)^0 \to D^0\gamma$ and (right) $D^*(2007)^0 \to D^0\pi^0$. All channels show the expected structures. These distributions are effectively used to correct the efficiencies from the flat square Dalitz plot distributions used in MC generation

Fig. 9.3 Comparison between background-subtracted (signal-weighted) data and signal MC of the combinatorial BDT output distribution, in arbitrary units, for the $B^0 \to D^*(2007)^0\pi^+\pi^-$ channel. Only the $D^*(2007)^0 \to D^0\gamma$ mode is shown due to its larger yield. Good agreement can be seen between both distributions, which validates the implementation of the sWeights method

The structures are clearest in the case of $B_s^0 \to D^*(2007)^0 K^-\pi^+$, $D^*(2007)^0 \to D^0\gamma$ decays, corresponding to the fact that this mode has the fewest “oscillations” in its weight function (Fig. 9.1), which in turn reflects the fact that this mode has the cleanest signal (Fig. 8.12). Moreover, this channel is expected to be dominated by the narrow resonance $D_{s1}(2536)^- \to D^*(2007)^0 K^-$. As a sanity check, these distributions can be compared with Fig. 7.3 in Chap. 7, where decay models for the signal channels were created to account for the most relevant expected resonances. This check does not necessarily validate the obtained distributions, as the models used in Chap. 7 are not based on measurements and serve only as first approximations for these distributions. The check nonetheless serves to spot any major problems with either the method or the models used in Chap. 7. A further cross-check can be made by studying the projection of different variables after applying the signal and background weights, as long as those variables are independent of the invariant mass. One good candidate for this check is the output of the combinatorial background BDT used in the selection (see Sect. 6.5). The variables used to train this BDT were chosen to be independent of the invariant mass $m_B$, and thus the resulting output is expected to be independent as well. The distribution of the BDT output obtained after applying the signal weights in the $B^0 \to D^*(2007)^0\pi^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$ control channel is shown in Fig. 9.3, in comparison with the MC sample distribution. Since the signal weights are the crucial output of the weighting procedure, the agreement between the two distributions is interpreted as a validation of the method. Similar checks comparing the distributions of BDT input variables between sWeighted data and MC are given in Appendix B. The combinatorial background shape is also similar to the expected distribution, although it does not fall as rapidly with BDT output as might be expected in the range 0.70–0.95; the combinatorial background distribution in Fig. 9.3 is actually rising slightly in this region rather than falling. This may be due to correlations with other components such as partially combinatorial background.

9.2 Correlation Studies

As noted previously, independence between the control variable used to calculate the signal weights ($m_B$) and the projected variables (square Dalitz plot coordinates) must be satisfied by the signal and all background components. The square Dalitz plot variables $m'$ and $\theta'$ are not expected to be correlated with $m_B$ for signal, since the square Dalitz plot coordinates are calculated with a $B^0_{(s)}$-mass constraint applied, but for some backgrounds correlations may be induced by the reconstruction and selection. More specifically, for misidentified background components the shift in $m_B$ is related to the momentum of the particle that is misidentified, and the square Dalitz plot position is also correlated with particle momentum, so a correlation between $m_B$ and $(m', \theta')$ can occur. Similarly, for partially combinatorial background both the $m_B$ value and the square Dalitz plot position depend on the momentum of the random particle that is included in the candidate, so correlations between $m_B$ and $(m', \theta')$ are expected. Possible correlations between these variables are therefore checked for each component present in the mass fit model, using Kendall rank [8] correlations determined from MC samples. The Kendall rank coefficients, which quantify non-trivial correlations between variables, are computed as

$$\tau = \frac{(\text{concordant pairs}) - (\text{discordant pairs})}{n(n-1)/2}, \tag{9.6}$$

where n is the number of events in the sample and a concordant (discordant) pair is a pair of events for which the differences in the invariant mass and in the square Dalitz plot variable have the same (opposite) sign. Differences between pairs of events in the invariant mass $m_B$ and in each of the square Dalitz plot variables are plotted in Fig. 9.4 for correctly reconstructed and misreconstructed signal components, in Fig. 9.5 for misidentified background components and in Fig. 9.6 for partially combinatorial background components. Only the components in the $B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$ channel are shown for simplicity, as any correlation is expected to be related to the reconstruction of the different backgrounds, and thus similar correlations are expected in similar components in other channels. The Kendall rank coefficients for all samples are shown in Tables 9.1 and 9.2. As expected, the signal components, as well as the misreconstructed signal and wrong $D^*(2007)^0$ components, do not exhibit any significant correlation. However, significant correlations are observed for some misidentified backgrounds, and especially large coefficients are found for partially combinatorial $B^+ \to D^*(2007)^0 h^+$ decays. These correlations could bias the weights obtained with the sWeights procedure, and hence are a source of systematic uncertainty, as discussed in Chap. 10.

Fig. 9.4 Differences in $m_B$ plotted against corresponding differences in (left; blue) $m'$ and (right; red) $\theta'$ for each possible pair of events in MC samples corresponding to (top) correctly reconstructed signal, (middle) misreconstructed signal and (bottom) signal with wrong $D^*(2007)^0$ decays, all for the $B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$ channel

Although no significant correlation is found in the signal components, non-linear correlation effects could be overlooked by the Kendall rank coefficients. In Appendix C further correlation studies are shown, where the signal $B^0_{(s)}$-mass distribution is studied in different regions of the SDP. No major discrepancies in signal shape are observed between the different regions of the SDP.
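The Kendall rank coefficient of Eq. 9.6 can be sketched directly from its definition. The snippet below is an illustrative pure-Python version that loops over all pairs of events; tied pairs count as neither concordant nor discordant, which is an assumption since the text does not specify the treatment of ties. The function name and the toy values are hypothetical.

```python
def kendall_tau(xs, ys):
    """Kendall rank coefficient of Eq. 9.6 from an explicit loop over pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 events")
    concordant = 0
    discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Same sign of the two differences: concordant; opposite: discordant.
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical toy values standing in for m_B and one SDP variable.
toy_mass = [5270.0, 5275.0, 5280.0, 5290.0]
toy_sdp = [0.10, 0.30, 0.20, 0.40]
toy_tau = kendall_tau(toy_mass, toy_sdp)
```

The explicit double loop scales as n², which is adequate for a sketch; for large MC samples a library routine with an O(n log n) algorithm would normally be preferred.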

Fig. 9.5 Differences in $m_B$ plotted against corresponding differences in (left; blue) $m'$ and (right; red) $\theta'$ for each possible pair of events in MC samples corresponding to misidentified (top) $B^0 \to D^*(2007)^0\pi^+\pi^-$, (middle) $B^0 \to D^*(2007)^0 K^+K^-$ and (bottom) $B_s^0 \to D^*(2007)^0 K^+K^-$ decays, all for the $B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$ channel

Fig. 9.6 Differences in $m_B$ plotted against corresponding differences in (left; blue) $m'$ and (right; red) $\theta'$ for each possible pair of events in MC samples corresponding to partially combinatorial (top) $B^0 \to D^0 K^+\pi^-$ and (bottom) $B^+ \to D^*(2007)^0 K^+$ decays, all for the $B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$ channel

Table 9.1 Kendall rank coefficients between $m_B$ and $m'$ for each component in all the channels. These coefficients have been extracted from the MC samples used to describe each component

Component                            $B^0$, $\gamma$   $B^0$, $\pi^0$   $B_s^0$, $\gamma$   $B_s^0$, $\pi^0$   Control, $\gamma$   Control, $\pi^0$
Signal                               −0.010            −0.005           −0.007              0.015              0.001               −0.002
Misrec. signal                       −0.002            0.005            −0.010              0.006              −0.006              −0.001
Wrong $D^*(2007)^0$ decay            −0.008            0.006            0.002               −0.015             −0.009              0.025
$B^0 \to D^*(2007)^0 K^+K^-$         0.225             0.201            0.239               0.204              N.A.                N.A.
$B_s^0 \to D^*(2007)^0 K^+K^-$       0.363             0.448            0.362               0.248              N.A.                N.A.
$B^0 \to D^*(2007)^0 \pi^+\pi^-$     0.103             0.056            0.144               0.159              N.A.                N.A.
$B^0 \to D^0 h^+h^-$                 −0.033            −0.068           −0.047              −0.060             −0.027              −0.050
$B^+ \to D^*(2007)^0 h^+$            0.718             0.671            0.246               0.171              0.480               0.381
$B^0 \to D^*(2007)^0 K^+\pi^-$       N.A.              N.A.             N.A.                N.A.               0.162               0.140
$B_s^0 \to D^*(2007)^0 K^-\pi^+$     N.A.              N.A.             N.A.                N.A.               −0.197              0.175

Table 9.2 Kendall rank coefficients between $m_B$ and $\theta'$ for each component in all the channels. These coefficients have been extracted from the MC samples used to describe each component

Component                            $B^0$, $\gamma$   $B^0$, $\pi^0$   $B_s^0$, $\gamma$   $B_s^0$, $\pi^0$   Control, $\gamma$   Control, $\pi^0$
Signal                               0.012             0.014            0.014               0.014              0.013               0.012
Misrec. signal                       0.045             0.017            0.057               0.031              0.066               0.025
Wrong $D^*(2007)^0$ decay            0.056             0.041            0.060               0.030              0.058               0.049
$B^0 \to D^*(2007)^0 K^+K^-$         −0.020            0.035            0.054               −0.030             N.A.                N.A.
$B_s^0 \to D^*(2007)^0 K^+K^-$       −0.008            0.023            0.115               0.013              N.A.                N.A.
$B^0 \to D^*(2007)^0 \pi^+\pi^-$     0.130             0.086            0.131               0.049              N.A.                N.A.
$B^0 \to D^0 h^+h^-$                 0.137             0.172            0.138               0.183              0.112               0.156
$B^+ \to D^*(2007)^0 h^+$            0.079             0.071            −0.289              −0.284             −0.175              −0.191
$B^0 \to D^*(2007)^0 K^+\pi^-$       N.A.              N.A.             N.A.                N.A.               0.044               0.019
$B_s^0 \to D^*(2007)^0 K^-\pi^+$     N.A.              N.A.             N.A.                N.A.               −0.057              −0.056

9.3 Efficiency Determination

The efficiency function that enters Eq. 9.1 is determined in a conventional way. The total efficiency is described as

$$\epsilon_{\rm tot} = \epsilon_{\rm acc} \cdot \epsilon_{\rm rec} \cdot \epsilon_{\rm sel}, \tag{9.7}$$

where each of the three terms describes the contribution to the efficiency of a distinct part of the selection:
• $\epsilon_{\rm acc}$: Acceptance efficiency, obtained from signal MC samples, and evaluated as the number of events in which all signal decay products are inside the LHCb detector acceptance divided by the number of generated events.
• $\epsilon_{\rm rec}$: Reconstruction and stripping efficiency, obtained from signal MC samples, and evaluated as the total number of candidates¹ reconstructed by the stripping divided by the total number of events with all the decay products within the LHCb detector acceptance.
• $\epsilon_{\rm sel}$: Selection efficiency, evaluated as the number of candidates that passed the selection in the signal MC sample divided by the number of candidates passing the stripping selection in the same sample.
All these efficiencies are obtained from MC samples that are generated with flat square Dalitz plot distributions. This model ensures that enough candidates remain in the MC samples after the selection for the efficiency to be computed reliably in all parts of the phase space. The efficiency maps for $\epsilon_{\rm acc} \times \epsilon_{\rm rec}$ are shown in Fig. 9.7. These two cannot be disentangled since the generated MC samples have already been filtered by the stripping selection.

¹ This includes multiple candidates that originate from the same event, which could lead to a bias in the efficiency determination and is studied as a source of systematic uncertainty in Chap. 10.

Fig. 9.7 Acceptance and stripping efficiency ($\epsilon_{\rm acc} \times \epsilon_{\rm rec}$) as a function of square Dalitz plot position ($m'$, $\theta'$) for all the channels with (top) $B^0 \to D^*(2007)^0 K^+\pi^-$, (middle) $B_s^0 \to D^*(2007)^0 K^-\pi^+$, and (bottom) $B^0 \to D^*(2007)^0 \pi^+\pi^-$, with (left) $D^*(2007)^0 \to D^0\gamma$ and (right) $D^*(2007)^0 \to D^0\pi^0$

The efficiency maps for the selection ($\epsilon_{\rm sel}$) are shown in Fig. 9.8. Since the selection procedure has been chosen to be as independent of the SDP variables as possible, $\epsilon_{\rm sel}$ presents a relatively flat distribution, while most of the SDP efficiency inhomogeneities originate at the reconstruction and stripping level. Finally, the total efficiencies are shown as a function of square Dalitz plot position in Fig. 9.9. These are the efficiency functions that enter Eq. 9.1. It is notable that the efficiency functions have similar shapes in different channels, which is a consequence of the selection procedure being kept as similar as possible for the different final states.
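The factorisation of Eq. 9.7 can be illustrated with binned maps. The sketch below builds each efficiency as a bin-by-bin ratio of counts and multiplies the maps together. The 2 × 2 binning, all event counts and the function names are hypothetical; in particular, the acceptance and stripping steps are kept separate here only for illustration, whereas in the analysis they are only available as the combined map $\epsilon_{\rm acc} \times \epsilon_{\rm rec}$.

```python
def ratio_map(numerator, denominator):
    """Bin-by-bin ratio of two 2D histograms; empty denominator bins give 0."""
    return [
        [(num / den if den > 0 else 0.0) for num, den in zip(num_row, den_row)]
        for num_row, den_row in zip(numerator, denominator)
    ]


def product_map(*maps):
    """Bin-by-bin product of several efficiency maps with identical binning."""
    result = [[1.0] * len(maps[0][0]) for _ in maps[0]]
    for eff_map in maps:
        for i, row in enumerate(eff_map):
            for j, value in enumerate(row):
                result[i][j] *= value
    return result


# Hypothetical counts in a 2 x 2 binning of (m', theta').
generated = [[100000, 120000], [110000, 90000]]
in_acceptance = [[20000, 30000], [22000, 18000]]
stripped = [[400, 900], [550, 360]]
selected = [[40, 81], [44, 36]]

eps_acc = ratio_map(in_acceptance, generated)  # acceptance
eps_rec = ratio_map(stripped, in_acceptance)   # reconstruction and stripping
eps_sel = ratio_map(selected, stripped)        # offline selection
eps_tot = product_map(eps_acc, eps_rec, eps_sel)
```

Because the intermediate counts cancel, each bin of the total map reduces to the number of selected candidates divided by the number of generated events in that bin.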

Fig. 9.8 Selection efficiency ($\epsilon_{\rm sel}$) as a function of square Dalitz plot position ($m'$, $\theta'$) for all the channels with (top) $B^0 \to D^*(2007)^0 K^+\pi^-$, (middle) $B_s^0 \to D^*(2007)^0 K^-\pi^+$, and (bottom) $B^0 \to D^*(2007)^0 \pi^+\pi^-$, with (left) $D^*(2007)^0 \to D^0\gamma$ and (right) $D^*(2007)^0 \to D^0\pi^0$

In order to assess the impact of the square Dalitz plot distributions on the efficiency, Table 9.3 shows the average weighted efficiency, calculated following Eq. 9.2, compared to the value that is obtained directly from the MC (i.e. assuming a flat square Dalitz plot distribution). The impact on modes with $D^*(2007)^0 \to D^0\gamma$ decay is relatively small, but a significant reduction in average efficiency is observed for modes with $D^*(2007)^0 \to D^0\pi^0$ decay. This reduction is, however, similar for all three $B^0_{(s)}$ decays, indicating that the ratios of branching fractions for modes that share the same $D^*(2007)^0$ decay are robust against minor imperfections in the efficiency determination procedure.
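The weighted average of Eq. 9.2, and the ratio of Eq. 9.1 built from the same sums, can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the per-candidate weights and efficiencies are invented toy values, with each efficiency standing in for a lookup of the total efficiency map at the candidate's SDP position.

```python
def corrected_yield(weights, efficiencies):
    """Efficiency-corrected yield: sum over candidates of w_i / eps_i."""
    if len(weights) != len(efficiencies):
        raise ValueError("weights and efficiencies must have the same length")
    return sum(w / eps for w, eps in zip(weights, efficiencies))


def average_efficiency(weights, efficiencies):
    """Average efficiency of Eq. 9.2: sum_i w_i / sum_i (w_i / eps_i)."""
    return sum(weights) / corrected_yield(weights, efficiencies)


def branching_fraction_ratio(weights_a, eff_a, weights_b, eff_b):
    """Ratio of Eq. 9.1 for two decay modes A and B."""
    return corrected_yield(weights_a, eff_a) / corrected_yield(weights_b, eff_b)


# Hypothetical per-candidate signal weights (they may be negative) and the
# efficiency evaluated at each candidate's SDP position.
toy_weights = [1.2, 0.9, -0.1]
toy_efficiencies = [2.0e-4, 1.0e-4, 2.0e-4]
toy_average = average_efficiency(toy_weights, toy_efficiencies)
```

The average is a weighted harmonic mean, so candidates in low-efficiency regions of the SDP pull it down more strongly than an arithmetic mean would; for a uniform efficiency it reduces to that efficiency.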

Fig. 9.9 Total efficiency as a function of square Dalitz plot position ($m'$, $\theta'$) for all the channels with (top) $B^0 \to D^*(2007)^0 K^+\pi^-$, (middle) $B_s^0 \to D^*(2007)^0 K^-\pi^+$, and (bottom) $B^0 \to D^*(2007)^0 \pi^+\pi^-$, with (left) $D^*(2007)^0 \to D^0\gamma$ and (right) $D^*(2007)^0 \to D^0\pi^0$

Table 9.3 Total average efficiencies for each channel for a flat square Dalitz plot distribution and after weighting to the data

Channel                                                               Flat SDP                  Weighted to data
$B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$           $1.86 \cdot 10^{-4}$      $2.00 \cdot 10^{-4}$
$B^0 \to D^*(2007)^0 K^+\pi^-$, $D^*(2007)^0 \to D^0\pi^0$            $3.10 \cdot 10^{-5}$      $2.08 \cdot 10^{-5}$
$B_s^0 \to D^*(2007)^0 K^-\pi^+$, $D^*(2007)^0 \to D^0\gamma$         $1.99 \cdot 10^{-4}$      $2.18 \cdot 10^{-4}$
$B_s^0 \to D^*(2007)^0 K^-\pi^+$, $D^*(2007)^0 \to D^0\pi^0$          $3.38 \cdot 10^{-5}$      $2.17 \cdot 10^{-5}$
$B^0 \to D^*(2007)^0 \pi^+\pi^-$, $D^*(2007)^0 \to D^0\gamma$         $2.55 \cdot 10^{-4}$      $2.65 \cdot 10^{-4}$
$B^0 \to D^*(2007)^0 \pi^+\pi^-$, $D^*(2007)^0 \to D^0\pi^0$          $4.10 \cdot 10^{-5}$      $2.48 \cdot 10^{-5}$


References
1. Pivk M, Le Diberder FR (2005) sPlot: a statistical tool to unfold data distributions. Nucl Instrum Meth A555:356. arXiv:physics/0402083
2. LHCb collaboration, Aaij R et al (2012) Observation of $B^0 \to \bar{D}^0 K^+K^-$ and evidence for $B_s^0 \to \bar{D}^0 K^+K^-$. Phys Rev Lett 109:131801. arXiv:1207.5991
3. LHCb collaboration, Aaij R et al (2014) Dalitz plot analysis of $B_s^0 \to \bar{D}^0 K^-\pi^+$ decays. Phys Rev D90:072003. arXiv:1407.7712
4. LHCb collaboration, Aaij R et al (2015) Dalitz plot analysis of $B^0 \to \bar{D}^0 \pi^+\pi^-$ decays. Phys Rev D92:032002. arXiv:1505.01710
5. LHCb collaboration, Aaij R et al (2019) Measurement of CP observables in the process $B^0 \to DK^{*0}$ with two- and four-body D decays. JHEP 08:041. arXiv:1906.08297
6. LHCb collaboration, Aaij R et al (2018) Measurement of the CKM angle $\gamma$ using $B^\pm \to DK^\pm$ with $D \to K_S^0\pi^+\pi^-$, $K_S^0 K^+K^-$ decays. JHEP 08:176, Erratum ibid. 10:107. arXiv:1806.01202
7. Dembinski H, Kenzie M, Langenbruch C, Schmelling M. Custom orthogonal weight functions (COWs) for event classification. arXiv:2112.04574
8. Kendall MG (1938) A new measure of rank correlation. Biometrika 30:81

Chapter 10

Systematic Uncertainties

Although all the elements for the calculation of the relative branching fractions are available at this point, a thorough study of potential systematic biases in the results of this thesis is crucial. In this chapter, the sources of such biases are examined in detail, with the objective of quantifying the associated systematic uncertainties. Moreover, understanding these biases and their correlations between the different channels is essential for combining the measurements of the relative branching fractions obtained with each $D^*(2007)^0$ decay into a final result. The measurements of the relative branching fractions rely essentially on two main inputs: the yields, as obtained from the invariant mass fit described in Chap. 8, and the average efficiencies, obtained from simulation weighted according to the square Dalitz plot distributions observed in data. Thus, depending on the origin of the bias, the sources of systematic uncertainty can be divided into two categories. Some parts of this analysis, such as the weighting procedure, entangle these two inputs to some extent, as seen in Eq. (9.1). However, it is still useful to consider each source of systematic uncertainty as affecting one or the other of these inputs. There are also a small number of sources of systematic uncertainty associated with external inputs used in the determination of the results, which are also discussed in this chapter.

10.1 Efficiency Systematic Uncertainties

10.1.1 PIDCORR Resampling

The PID variables are a key feature in this analysis to reject misidentified backgrounds. Several PID variables are combined using a BDT in the selection (see Sect. 6.5) that relies on MC samples. However, it is a well-known feature of MC simulated samples within LHCb that PID variables do not describe the data to the required level of precision.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_10

To overcome that, this analysis makes use of the PIDCORR package [1], which resamples the PID variables in the MC samples using bins of $p$, $p_T$ and $\eta$ of the corrected particle and the number of reconstructed tracks in the event (nTracks), in templates from data samples of $D^{*+} \to D^0\pi^+$, $\Lambda \to p\pi^-$ and $\Lambda_c^+ \to pK^-\pi^+$ decays. Nevertheless, possible biases associated with this resampling procedure must be considered. For this purpose, the PIDCORR package provides alternative, statistically independent templates for the study of systematic uncertainties. The comparison of the standard and the alternative templates gives a good estimate of the systematic uncertainty associated with the finite size of the templates. Moreover, possible biases originating from the mismodelling of the variables used for the resampling also need to be considered. Reasonable agreement between MC and data in both $p$ and $p_T$ is expected and is studied thoroughly in Sect. 10.1.5. The nTracks variable, however, is not usually well modelled in the MC generated samples. Comparisons of the nTracks distributions between the MC samples and the background-subtracted data are shown in Fig. 10.1. The average numbers of tracks in the simulated and background-subtracted data samples for each channel are also shown in Table 10.1. The average is also given considering only channels with $D^*(2007)^0 \to D^0\gamma$ decays, which are cleaner and hence provide more reliable estimates in background-subtracted data. The nTracks distribution in each of the MC samples is scaled using the ratio of these averages, and the PID variables are resampled using the baseline configuration of PIDCORR.
To compute the systematic uncertainty on the signal efficiency for each of these two effects, it is sufficient to study the efficiency of the cuts applied on the misID BDT when using the three different approaches (standard template, statistically independent template and standard template with scaled nTracks), as it is the only selection step that uses PID variables. The efficiencies for the three approaches are given in Table 10.2. The biases associated with the branching fraction ratios and cross-checks can be obtained by combining the efficiencies that contribute to the different channels. These are shown in Tables 10.3 and 10.4. Since modifying the PID variables that enter the BDT is equivalent to using a new selection, ideally one would propagate these changes, repeat the mass fits, recompute the corrected efficiency and obtain the modified branching fractions. However, since the changes in the efficiency are $\mathcal{O}(0.1\%)$ or smaller, studying the changes in the efficiency is considered sufficient to estimate this systematic uncertainty.
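One plausible way of combining the efficiencies into an uncertainty on a branching fraction ratio is sketched below. It assumes that the yields are held fixed and that the relative uncertainty is taken as the deviation from unity of the double ratio of efficiencies; the text does not spell out the exact formula, so this is an assumption, as are the function and variable names. The input efficiencies are the $D^*(2007)^0 \to D^0\gamma$ values quoted in Table 10.2.

```python
def double_ratio_shift(eff_num_nominal, eff_num_varied, eff_den_nominal, eff_den_varied):
    """Relative change of B(num)/B(den) when both efficiencies vary, yields fixed.

    Since B is proportional to N / eps, the ratio B(num)/B(den) scales with
    (eps_num_nominal / eps_num_varied) / (eps_den_nominal / eps_den_varied).
    """
    shift_num = eff_num_nominal / eff_num_varied
    shift_den = eff_den_nominal / eff_den_varied
    return abs(shift_num / shift_den - 1.0)


# misID BDT efficiencies from Table 10.2 (D*(2007)0 -> D0 gamma modes).
standard = {"B0": 0.4827, "Bs0": 0.4944, "Control": 0.6400}
systematics_template = {"B0": 0.4822, "Bs0": 0.4937, "Control": 0.6393}
ntracks_scaled = {"B0": 0.4851, "Bs0": 0.4976, "Control": 0.6424}

# Shift on B(B0 -> D* K pi) / B(B0 -> D* pi pi) for each variation.
template_shift = double_ratio_shift(
    standard["B0"], systematics_template["B0"],
    standard["Control"], systematics_template["Control"],
)
ntracks_shift = double_ratio_shift(
    standard["B0"], ntracks_scaled["B0"],
    standard["Control"], ntracks_scaled["Control"],
)
```

Because the efficiencies in Table 10.2 are rounded to four digits, this sketch should only be expected to give the order of magnitude of the values in Tables 10.3 and 10.4, not to reproduce them exactly.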

Fig. 10.1 Comparison of the number of tracks per event (nTracks), in arbitrary units, in the signal MC samples and the background-subtracted data. The plots correspond to (top to bottom) $B^0 \to D^*(2007)^0 K^+\pi^-$, $B_s^0 \to D^*(2007)^0 K^-\pi^+$ and $B^0 \to D^*(2007)^0 \pi^+\pi^-$, with (left to right) $D^*(2007)^0 \to D^0\gamma$ and $D^*(2007)^0 \to D^0\pi^0$ decays

10.1.2 MC Statistics

The signal efficiency maps needed for the measurement of the relative branching fractions are obtained from MC samples, and hence have an uncertainty associated with the finite sizes of those samples. More specifically, the efficiency maps make use of the binned SDP of each of the signal MC samples. In order to estimate the associated uncertainty, Poisson fluctuations are applied to the number of events in each bin of the SDP. The signal efficiency is then computed using the modified SDP histograms. This procedure is repeated 1000 times to obtain a distribution of the signal efficiency for each channel, as shown in Fig. 10.2. The systematic uncertainty is taken from the width of this distribution, while the mean of the distribution can be used to check for possible biases caused by the efficiency correction procedure. The widths and means of these distributions, as well as the reference value for the corrected efficiency, are given in Table 10.5, together with the ratio between each distribution's width and its mean value. In addition, the uncertainties on the branching fraction ratios are given in Table 10.6.
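The toy procedure described above can be sketched as follows. This is a simplified illustration under several assumptions: the SDP is reduced to four bins, the efficiency in each bin is the ratio of selected to generated MC events, the data enter only through the summed signal weight per bin, and the corrected efficiency follows the binned form of Eq. 9.2. All counts, names and the fixed random seed are hypothetical.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw a Poisson-distributed integer with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def corrected_efficiency(selected, generated, data_weights):
    """Binned form of Eq. 9.2: sum_k s_k / sum_k (s_k / eps_k), eps_k = sel_k / gen_k."""
    denominator = 0.0
    for n_sel, n_gen, s_k in zip(selected, generated, data_weights):
        n_sel = max(n_sel, 1)  # toy-level guard: keep the ratio finite in empty bins
        denominator += s_k * n_gen / n_sel
    return sum(data_weights) / denominator


def run_toys(selected, generated, data_weights, n_toys=1000, seed=1):
    """Fluctuate the selected MC counts in each bin and recompute the efficiency."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_toys):
        fluctuated = [poisson(rng, n) for n in selected]
        results.append(corrected_efficiency(fluctuated, generated, data_weights))
    return results


# Hypothetical four-bin SDP: selected and generated MC events, and the summed
# signal weights of the data candidates falling in each bin.
selected = [120, 80, 150, 100]
generated = [600000, 500000, 700000, 550000]
data_weights = [40.0, 10.0, 30.0, 20.0]

nominal = corrected_efficiency(selected, generated, data_weights)
toys = run_toys(selected, generated, data_weights)
toy_mean = statistics.mean(toys)    # compared with the nominal value to check for bias
toy_width = statistics.stdev(toys)  # taken as the systematic uncertainty
```

Comparing `toy_mean` with `nominal` plays the role of the bias check mentioned in the text, while `toy_width / toy_mean` corresponds to the relative uncertainty quoted per channel.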

174

10 Systematic Uncertainties

Table 10.1 Mean numbers of tracks in MC samples and background-subtracted data, as well as their ratio, for each channel. The last column gives the average across channels with D*(2007)0 → D0 γ decays; this value is used as the scaling factor to assess the systematic uncertainty of the PIDCORR package due to mismodelling effects

Template             B0 γ    B0 π0   Bs0 γ   Bs0 π0  Control γ  Control π0  Average (γ)
MC sample            118.5   111.7   113.4   106.0   118.2      111.1       116.7
Bg-subtracted data   150.3   130.3   147.2   140.4   148.5      139.9       148.7
Ratio                1.27    1.17    1.30    1.32    1.26       1.26        1.27

Table 10.2 Efficiencies of the misID BDT cut on the data samples using the standard templates, the templates designed to study systematic uncertainties associated with the resampling procedure statistics, and the scaling of the number of tracks in the MC samples

Template                  B0 γ     B0 π0    Bs0 γ    Bs0 π0   Control γ  Control π0
Standard template         0.4827   0.6079   0.4944   0.6174   0.6400     0.7395
Systematics template      0.4822   0.6072   0.4937   0.6174   0.6393     0.7386
nTracks scaling           0.4851   0.6112   0.4976   0.6195   0.6424     0.7435
Ratio (syst. template)    1.0010   1.0011   1.0014   1.0000   1.0011     1.0012
Ratio (nTracks scaling)   0.9951   0.9945   0.9935   0.9966   0.9963     0.9947

Table 10.3 Relative systematic uncertainties on the relative branching fractions associated to the size of the templates used in the resampling of the PID variables

Branching fraction ratio                                 D* → D0 γ      D* → D0 π0
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        5.0 · 10^−5    1.8 · 10^−4
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       3.6 · 10^−4    1.25 · 10^−3
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       4.1 · 10^−4    1.06 · 10^−3


Table 10.4 Relative systematic uncertainties on the relative branching fractions associated to MC mismodelling effects in the resampling of the PID variables

Branching fraction ratio                                 D* → D0 γ      D* → D0 π0
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        1.3 · 10^−3    1.2 · 10^−4
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       2.7 · 10^−3    1.9 · 10^−3
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       1.5 · 10^−3    2.1 · 10^−3

[Figure 10.2: six histograms of the corrected efficiency (x-axis: Corrected efficiency, in units of 10^−3 in the left column and 10^−6 in the right column; y-axis: A.U.).]

Fig. 10.2 Distribution of corrected efficiency for each channel, obtained from 1000 MC samples that include Poisson fluctuations in each bin of the SDP. The plots correspond to (top to bottom) B0 → D*(2007)0 K+ π−, Bs0 → D*(2007)0 K− π+ and B0 → D*(2007)0 π+ π−, with (left to right) D*(2007)0 → D0 γ and D*(2007)0 → D0 π0 decays


Table 10.5 Mean and standard deviation of the corrected efficiency distributions obtained from 1000 toy MC samples generated with Poisson fluctuations from the original MC samples. The baseline value of the corrected efficiency for each channel is also included for reference, as well as the relative uncertainty, δε/ε = σ/μ

Template      B0 γ            B0 π0           Bs0 γ           Bs0 π0          Control γ       Control π0
μ             1.995 · 10^−4   1.981 · 10^−5   2.177 · 10^−4   2.093 · 10^−5   2.639 · 10^−4   2.403 · 10^−5
σ             1.95 · 10^−6    1.03 · 10^−6    2.07 · 10^−6    1.03 · 10^−6    1.60 · 10^−6    5.95 · 10^−7
Baseline ε    2.007 · 10^−4   2.075 · 10^−5   2.188 · 10^−4   2.165 · 10^−5   2.651 · 10^−4   2.476 · 10^−5
δε/ε          0.010           0.052           0.010           0.049           0.006           0.025

Table 10.6 Systematic uncertainties on the relative branching fractions associated with the finite size of the MC samples used to determine the efficiencies

Branching fraction ratio                                 D* → D0 γ      D* → D0 π0
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        1.2 · 10^−2    5.7 · 10^−2
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       1.2 · 10^−2    5.5 · 10^−2
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       1.4 · 10^−2    7.1 · 10^−2

10.1.3 SDP Binning for Efficiency Estimation

The SDP binning scheme used for the signal MC samples implicitly assumes that the variation of the efficiency within each bin is negligible. However, the binning cannot be too fine, since the number of events in each MC sample is limited. A scheme of 20 by 20 bins is therefore used as a reasonable compromise between these two factors. Possible biases caused by this choice are studied by computing the corrected signal efficiency in each channel with different binning schemes, using square grids from 2 by 2 bins up to 80 by 80 bins. The dependence of the corrected efficiency on the number of bins for each channel is shown in Fig. 10.3. The corrected efficiency is reasonably stable for numbers of bins around the baseline value of 20, confirming that this is a reasonable choice.

[Figure 10.3: six panels showing the corrected efficiency (y-axis, in units of 10^−3 in the left column and 10^−6 in the right column) as a function of the number of bins (x-axis: bins, 0 to 80).]

Fig. 10.3 Corrected efficiency as a function of the number of bins used in the SDP of the signal MC samples. The x-axis, labelled bins, corresponds to n, where the binning scheme is n × n. The plots correspond to (top to bottom) B0 → D*(2007)0 K+ π−, Bs0 → D*(2007)0 K− π+ and B0 → D*(2007)0 π+ π−, with (left to right) D*(2007)0 → D0 γ and D*(2007)0 → D0 π0 decays

Significant biases in the corrected efficiency can be observed at very small or very large numbers of bins. These, however, are not a reasonable estimate of the systematic uncertainty associated with the binning. Instead, the local variation of the corrected efficiency around the baseline choice is considered, as it reflects the change in the efficiency when the bin boundaries are modified. Since "oscillations" of the corrected efficiency can be observed as the number of bins is changed, the amplitude of these oscillations is taken as the estimate of the systematic uncertainty. The origin of these oscillations is related to the alignment of bin boundaries: changing from 20 to 21 bins changes all bin boundaries, but changing from 20 to, say, 30 bins leaves some of the boundaries unaffected. A statistical correlation between the average efficiencies obtained when some of the bin boundaries align may be expected, so that smaller deviations of the efficiencies are expected at regular intervals, leading to the appearance of oscillations. The amplitude is evaluated from the half-difference between two opposite points in the local oscillations, estimated "by eye". The systematic uncertainty in the efficiency estimation obtained with this method is shown in Table 10.7, with the effects on the branching fraction ratios given in Table 10.8.
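The half-difference estimate can be illustrated with a short Python sketch. Note that the thesis evaluates the amplitude "by eye"; the automated window around the baseline used here is a stand-in chosen for illustration, and the function name and the `half_window` parameter are hypothetical.

```python
import numpy as np


def local_amplitude(n_bins, efficiency, baseline=20, half_window=5):
    """Half-difference between the largest and smallest corrected efficiency
    found within +/- half_window bins of the baseline binning.

    This automates the by-eye estimate: the window size is an assumption.
    """
    n_bins = np.asarray(n_bins)
    efficiency = np.asarray(efficiency, dtype=float)
    nearby = np.abs(n_bins - baseline) <= half_window
    if not nearby.any():
        raise ValueError("no scan points close to the baseline binning")
    return 0.5 * (efficiency[nearby].max() - efficiency[nearby].min())
```

Dividing the returned amplitude by the baseline corrected efficiency gives the relative uncertainty; for example, an amplitude of 2.5 · 10^−6 on a baseline of 2.007 · 10^−4 corresponds to about 0.012.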


Table 10.7 Estimation of the systematic uncertainty due to the SDP binning in the signal MC efficiency maps

                  B0 γ            B0 π0           Bs0 γ           Bs0 π0          Control γ       Control π0
Local amplitude   2.5 · 10^−6     1.1 · 10^−6     4.0 · 10^−6     0.9 · 10^−6     1.9 · 10^−6     1.4 · 10^−6
Baseline ε        2.007 · 10^−4   2.075 · 10^−5   2.188 · 10^−4   2.165 · 10^−5   2.651 · 10^−4   2.476 · 10^−5
δε/ε              0.012           0.052           0.018           0.042           0.007           0.055

Table 10.8 Systematic uncertainties on the relative branching fractions associated to SDP binning of the signal efficiency maps

Branching fraction ratio                                 D* → D0 γ      D* → D0 π0
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        1.4 · 10^−2    7.6 · 10^−2
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       1.9 · 10^−2    6.9 · 10^−2
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       2.1 · 10^−2    6.7 · 10^−2

10.1.4 L0Hadron Trigger Systematics

It is a well-known feature of MC simulated samples within LHCb that the hardware trigger performance does not fully agree with that observed in data. This could affect the results of this thesis through the use of the L0Hadron trigger, and may potentially introduce sizeable biases in the efficiency estimation in each of the channels. In most cases, however, these differences are expected to cancel when computing ratios of branching fractions. Nonetheless, it is important to assess any possible systematic uncertainty related to this disagreement. To do so, the L0HadronTables package offers tables of the L0Hadron efficiency for pions, kaons and protons in bins of ET, which have been extracted from data templates [2]. The final-state charged particles in the MC can be reweighted using these tables to obtain the corrected efficiency. This can be compared with the MC L0Hadron_TOS efficiency to compute the systematic uncertainty associated with trigger efficiency biases. Candidates in MC, however, are required to pass the stripping selection before the weights are applied, which includes HLT2 trigger requirements. Thus, the L0Hadron_TOS efficiency is instead computed with respect to L0Global_TIS. That is,

\[
\epsilon_{\mathrm{L0Hadron\_TOS}} \approx \frac{N_{\mathrm{L0Global\_TIS}\,\&\,\mathrm{L0Hadron\_TOS}}}{N_{\mathrm{L0Global\_TIS}}} , \tag{10.1}
\]

with N being the number of selected candidates satisfying each specific trigger requirement.
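As an illustration of Eq. (10.1), the following Python sketch computes the ratio from per-candidate trigger flags and evaluates the relative difference between two efficiencies. The function names and the optional `weights` argument are hypothetical; how the L0HadronTables weights are actually derived and applied is not shown here.

```python
import numpy as np


def tistos_efficiency(l0global_tis, l0hadron_tos, weights=None):
    """Efficiency of L0Hadron_TOS measured relative to L0Global_TIS, Eq. (10.1).

    Both flag arguments are per-candidate booleans. The optional per-candidate
    weights are an assumption used to represent a data-driven correction.
    """
    tis = np.asarray(l0global_tis, dtype=bool)
    tos = np.asarray(l0hadron_tos, dtype=bool)
    w = np.ones(tis.shape) if weights is None else np.asarray(weights, dtype=float)
    denominator = w[tis].sum()
    if denominator <= 0:
        raise ValueError("no L0Global_TIS candidates in the sample")
    return w[tis & tos].sum() / denominator


def relative_difference(eff_mc, eff_corrected):
    """Relative change of the corrected efficiency with respect to the MC one."""
    return (eff_mc - eff_corrected) / eff_mc
```

For instance, an MC efficiency of 52.65% and a corrected efficiency of 47.45% give a relative difference of about 9.88%.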


Table 10.9 Efficiency of the L0Hadron trigger for the MC samples and its corrected version using the data templates provided by the L0HadronTables package. The relative difference between both values has also been included and is taken as the systematic uncertainty for the efficiency calculation in each channel

Channel       MC (%)   Corrected (%)   δε/ε (%)
B0, γ         52.65    47.45           9.88
B0, π0        75.96    68.55           9.76
Bs0, γ        53.16    47.59           10.48
Bs0, π0       76.51    68.50           10.47
Control γ     59.17    52.74           10.87
Control π0    80.12    72.60           9.39

Table 10.10 Systematic uncertainties on the relative branching fractions associated to mismodelling of the L0Hadron efficiency

Branching fraction ratio                                 D* → D0 γ      D* → D0 π0
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        1.1 · 10^−2    4.1 · 10^−3
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       4.4 · 10^−3    1.2 · 10^−2
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       6.7 · 10^−3    7.9 · 10^−3

Moreover, if the corrected efficiency differs from the MC efficiency at the trigger step, it is still possible for the selection to attenuate or accentuate these effects (for example, more events should have been triggered, but these would have been discarded by the selection). In Table 10.9, the efficiency ratio described in Eq. (10.1) is shown for the MC samples and compared to the corrected efficiency. The systematic uncertainty in the efficiency is taken as the relative difference between both values. Although notable differences can be seen between the MC and corrected efficiencies, these biases largely cancel when computing the ratio of efficiencies between different channels for the relative branching fraction measurements, as expected. The full effects of these biases on the results of this analysis are shown in Table 10.10.

10.1.5 Data/MC Disagreement

The selection procedure is designed with the minimisation of systematic uncertainties in mind. Input variables to the MVA, and the MVA output, are checked to have low correlations with both the B0(s)-candidate mass and the SDP variables, and moreover the same combinatorial MVA requirement is imposed on signal and control channels.


This ensures that the efficiency of the requirement is close to identical for signal and control channels and hence almost exactly cancels in the determination of the branching fraction ratio. Nonetheless, the efficiency of the MVA requirement is not exactly identical in all channels, and hence imperfect agreement between data and MC in the variables used in the MVA is a source of systematic uncertainty. Comparisons of the distributions of these variables in data and MC are shown in Appendix B. The data distributions are obtained by subtracting background using the sWeights method discussed in Chap. 9 (since this relies on the output of the fit to the B0(s)-candidate mass, this comparison can only be done post hoc). Excellent agreement between background-subtracted data and MC is found for most variables in all samples. Nevertheless, modest mismodelling effects can be found in some variables. In particular, the pT asymmetry variables show a small discrepancy. (Note that any disagreements between background-subtracted data and MC in the PID variables will be covered by the systematic uncertainty discussed in Sect. 10.1.1, and therefore do not need to be considered here.)

In order to estimate possible biases on the results caused by this mismodelling, the MC distribution of the pT asymmetry in a cone of 1.5 radians around the B direction is reweighted to match the background-subtracted data distribution. This is done using the B0 → D*(2007)0 π+ π− channel, since this has the largest statistics. This variable is chosen among the complete list of variables used in the BDTs since it exhibits the largest disagreement between MC and background-subtracted data. The comparison of the reweighted MC sample with the background-subtracted distribution is shown in Fig. 10.4.

[Figure 10.4: normalised distributions of the B pT asymmetry (x-axis: B PT Asymmetry (Cone1), −1 to 1.5; y-axis: A.U.) for signal-weighted data, signal MC and corrected signal MC.]

Fig. 10.4 pT asymmetry in a cone with opening angle 1.5 radians around the B direction in the B0 → D*(2007)0 π+ π− channel. The blue distribution corresponds to the unweighted signal MC, while the red points correspond to the background-subtracted data distribution. In the green distribution, the MC generated sample has been reweighted to reproduce the data distribution
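A one-dimensional reweighting of this kind is often implemented as a ratio of normalised histograms. The sketch below assumes that approach; the thesis does not state which reweighting technique is used, and the function name, binning and clipping of negative ratios are illustrative choices.

```python
import numpy as np


def histogram_ratio_weights(mc_values, data_values, data_sweights, edges):
    """Per-event MC weights from the ratio of normalised data and MC histograms.

    data_sweights are the background-subtraction weights of the data events.
    MC values outside the histogram range take the weight of the nearest edge bin.
    """
    mc_values = np.asarray(mc_values, dtype=float)
    mc_hist, _ = np.histogram(mc_values, bins=edges)
    data_hist, _ = np.histogram(data_values, bins=edges, weights=data_sweights)
    mc_pdf = mc_hist / mc_hist.sum()
    data_pdf = data_hist / data_hist.sum()
    # Leave the weight at 1 where the MC histogram is empty.
    ratio = np.divide(data_pdf, mc_pdf,
                      out=np.ones_like(data_pdf, dtype=float),
                      where=mc_pdf > 0)
    # sWeighted bins can fluctuate negative; clip so that weights stay physical.
    ratio = np.clip(ratio, 0.0, None)
    index = np.clip(np.digitize(mc_values, edges) - 1, 0, len(edges) - 2)
    return ratio[index]
```

The returned weights can then be passed to the BDT training and to any histogram of the MC sample, so that the weighted MC reproduces the data distribution in the chosen variable.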

[Figure 10.5: six panels comparing signal-weighted data with signal MC (y-axis: A.U.); x-axis: Combinatorial BDT output (left column) and Corrected Combinatorial BDT output (right column), 0.7 to 1.]

Fig. 10.5 Combinatorial BDT output in the (top) B0 → D*(2007)0 K+ π−, (middle) Bs0 → D*(2007)0 K− π+ and (bottom) B0 → D*(2007)0 π+ π− channels, in all cases with D*(2007)0 → D0 γ. The left (right) plots show the comparison between background-subtracted data and signal MC with the baseline (retrained) BDT

The BDT is retrained with the weighted MC, and the requirement on the BDT output (numerically unchanged from the baseline procedure) is applied to both signal MC and data samples. Comparisons of the BDT output between background-subtracted data and signal MC are shown in Figs. 10.5 and 10.6 for both the baseline BDT and the retrained, corrected BDT. In all cases, better agreement is seen with the BDT retrained with the weighted MC, and hence this procedure is considered to give a reliable estimate of the systematic uncertainty associated with disagreement between data and MC. Differences in the efficiency of the BDT output cut between the baseline and weighted BDT are shown in Table 10.11. These differences are used to evaluate the associated systematic uncertainty on the branching fraction ratios, shown in Table 10.12. The changes in efficiencies are at the level of a few percent for modes with D*(2007)0 → D0 γ decays, but are somewhat larger for modes with D*(2007)0 → D0 π0 decays. However, since all changes are in the same direction (reduction of efficiency), the systematic uncertainties on the ratios of branching fractions are typically smaller.

[Figure 10.6: six panels comparing signal-weighted data with signal MC (y-axis: A.U.); x-axis: Combinatorial BDT output (left column) and Corrected Combinatorial BDT output (right column), 0.6 to 1.]

Fig. 10.6 Combinatorial BDT output in the (top) B0 → D*(2007)0 K+ π−, (middle) Bs0 → D*(2007)0 K− π+ and (bottom) B0 → D*(2007)0 π+ π− channels, in all cases with D*(2007)0 → D0 π0. The left (right) plots show the comparison between background-subtracted data and signal MC with the baseline (retrained) BDT


Table 10.11 Signal efficiency of the cut on the combinatorial BDT with the baseline and corrected MC; the relative change in the efficiency is also included

Channel       Baseline (%)   Corrected (%)   δε/ε
B0, γ         70.65          69.35           1.9 · 10^−2
B0, π0        82.88          75.58           8.8 · 10^−2
Bs0, γ        72.14          71.10           1.4 · 10^−2
Bs0, π0       84.18          77.51           7.9 · 10^−2
Control γ     70.66          69.23           2.0 · 10^−2
Control π0    78.74          74.31           5.6 · 10^−2

Table 10.12 Systematic uncertainty on the relative branching fractions associated to the disagreement between data and MC generated samples

Relative BF                                                      δB/B
B(B0 → D*(2007)0 K+ π−, γ) / B(B0 → D*(2007)0 π+ π−, γ)          7.0 · 10^−4
B(B0 → D*(2007)0 K+ π−, π0) / B(B0 → D*(2007)0 π+ π−, π0)        3.4 · 10^−2
B(Bs0 → D*(2007)0 K− π+, γ) / B(B0 → D*(2007)0 π+ π−, γ)         5.9 · 10^−3
B(Bs0 → D*(2007)0 K− π+, π0) / B(B0 → D*(2007)0 π+ π−, π0)       2.4 · 10^−2
B(B0 → D*(2007)0 K+ π−, γ) / B(Bs0 → D*(2007)0 K− π+, γ)         5.2 · 10^−3
B(B0 → D*(2007)0 K+ π−, π0) / B(Bs0 → D*(2007)0 K− π+, π0)       9.5 · 10^−3

10.1.6 Data/MC Disagreement in Bs0 Lifetime

The Bs0 lifetime used to generate the simulated sample is 1.512 ps. However, Bs0 → D*(2007)0 K− π+ is a flavour-specific decay, and as such the current best knowledge of the appropriate lifetime is τfs = 1.527 ± 0.011 ps [3]. (See Ref. [4] for a discussion of Bs0 effective lifetimes.) Since the efficiency depends on decay time, this is a source of systematic uncertainty. The average efficiency can be written as an integral over decay time (here for the originally generated lifetime),

\[
\bar{\epsilon}_{\mathrm{gen}} = \int_0^{\infty} \epsilon(t)\, F_{\mathrm{gen}}(t)\, dt , \tag{10.2}
\]

where F(t) is the underlying decay-time distribution, which is a pure exponential, i.e. F(t) = (1/τ) exp(−t/τ). The decay-time-dependent efficiency function ε(t) is taken to be independent of the lifetime used in the MC from which it is obtained. Since the generated distribution Fgen(t) is known, ε(t) is easily obtained from the reconstructed decay-time distribution of the signal MC after all selection criteria are imposed, up to a multiplicative constant that cancels in the correction discussed below and is hence neglected.

The MC can be reweighted from the generated lifetime τgen to the world-average lifetime τWA by applying the weighting factors

\[
w(t) = \frac{F_{\mathrm{WA}}(t)}{F_{\mathrm{gen}}(t)}
     = \frac{\tau_{\mathrm{gen}}}{\tau_{\mathrm{WA}}} \exp\left[-t\left(\frac{1}{\tau_{\mathrm{WA}}} - \frac{1}{\tau_{\mathrm{gen}}}\right)\right]
     \approx \frac{\tau_{\mathrm{gen}}}{\tau_{\mathrm{WA}}} - t\,\frac{\tau_{\mathrm{gen}} - \tau_{\mathrm{WA}}}{\tau_{\mathrm{WA}}^{2}} , \tag{10.3}
\]

where the approximation is valid for $t\,\left|1/\tau_{\mathrm{WA}} - 1/\tau_{\mathrm{gen}}\right| \ll 1$, as is the case here. The corrected efficiency is then

\[
\bar{\epsilon}_{\mathrm{corr}} = \int_0^{\infty} \epsilon(t)\, w(t)\, F_{\mathrm{gen}}(t)\, dt , \tag{10.4}
\]

and since the reconstructed MC decay-time distribution is given by ε(t)Fgen(t), this can simply be multiplied by the weight factors and integrated to obtain the multiplicative systematic uncertainty. In Fig. 10.7, the distribution of ε(t)Fgen(t) in the MC samples is compared with that of ε(t)w(t)Fgen(t). Since the baseline efficiency from the MC samples can be computed as

\[
\bar{\epsilon}_{\mathrm{gen}} = \int_0^{\infty} \epsilon(t)\, F_{\mathrm{gen}}(t)\, dt , \tag{10.5}
\]

it is enough to compare the integrals of the two distributions to compute the relative systematic uncertainty in the efficiency due to the mismodelling of the lifetime distribution of Bs0 → D*(2007)0 K− π+ decays. These values are shown in Table 10.13, together with the associated effects on the measurements of the relative branching fractions.
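The weight of Eq. (10.3) and the resulting efficiency shift can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the input is an array of reconstructed decay times of selected signal MC candidates, expressed in the same units as the lifetimes (ps here).

```python
import numpy as np

TAU_GEN = 1.512  # ps, lifetime used in the MC generation (from the text)
TAU_WA = 1.527   # ps, flavour-specific lifetime quoted in the text


def lifetime_weight(t, tau_gen=TAU_GEN, tau_new=TAU_WA):
    """Exact weight F_new(t) / F_gen(t) for two exponential distributions."""
    t = np.asarray(t, dtype=float)
    return (tau_gen / tau_new) * np.exp(-t * (1.0 / tau_new - 1.0 / tau_gen))


def lifetime_weight_linear(t, tau_gen=TAU_GEN, tau_new=TAU_WA):
    """First-order expansion of the weight, as in the last step of Eq. (10.3)."""
    t = np.asarray(t, dtype=float)
    return tau_gen / tau_new - t * (tau_gen - tau_new) / tau_new**2


def relative_efficiency_shift(selected_decay_times):
    """Estimate of (eff_corr - eff_gen) / eff_gen from selected MC candidates.

    Selected MC events are distributed as eps(t) * F_gen(t), so the ratio of
    the two integrals reduces to the mean weight of the selected events.
    """
    return lifetime_weight(selected_decay_times).mean() - 1.0
```

Because the selected events already sample ε(t)Fgen(t), no explicit efficiency function or histogram is needed in this formulation.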

[Figure 10.7: two panels showing decay-time distributions (x-axis: B0s lifetime [ns], 0 to 0.01; y-axis: A.U.) for signal MC and corrected signal MC, each with a lower panel ranging from −5 to 5.]

Fig. 10.7 Variation of reconstructed decay-time distribution when correcting for mismodelling of the Bs0 lifetime in the Bs0 → D*(2007)0 K− π+ MC samples with (left) D*(2007)0 → D0 γ and (right) D*(2007)0 → D0 π0 decays


Table 10.13 Relative systematic uncertainties on the relative branching fractions associated to mismodelling of the Bs0 lifetime

          B0 γ    B0 π0   Bs0 γ          Bs0 π0         Control γ   Control π0
δε/ε      N.A.    N.A.    6.0 · 10^−3    4.9 · 10^−3    N.A.        N.A.

Branching fraction ratio                                 D* → D0 γ      D* → D0 π0
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        N.A.           N.A.
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       6.0 · 10^−3    4.9 · 10^−3
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       6.0 · 10^−3    4.9 · 10^−3

10.1.7 Biases in the sWeights Procedure Due to Correlations

The effectiveness of the sWeights procedure described in Chap. 9, which is needed to obtain the signal efficiency in each channel, relies on the absence of correlation between the SDP variables and the B0(s)-candidate mass. Correlations between these variables, for all the components included in the mass fit model, have been studied with MC samples in Sect. 9.2. For most components the correlations were found not to be important. However, in some cases, particularly for misidentified and partially combinatorial backgrounds, sizeable correlation effects were seen. These correlations may potentially introduce biases in the event-by-event efficiency correction procedure used to obtain the results of the analysis.

In order to quantify the impact of such biases on the measurements, two distinct sets of pseudo-experiments have been generated. In the first, the correlations between the SDP variables and the B0(s) mass are maintained by picking random events from the MC samples. In the second, all correlations are removed by picking events independently for the B0(s)-candidate mass and the SDP variables (see footnote 1). Rather than trying to replicate the full simultaneous fit, this is done for each channel separately, and only the D*(2007)0 → D0 γ mode is included due to limited MC statistics in the D*(2007)0 → D0 π0 channel.

The pseudo-experiments include most components from the baseline mass fit model, specifically correctly reconstructed signal, misreconstructed signal, signal with the wrong D*0 decay, misidentified B0(s) → D*(2007)0 K+ K− decays, partially combinatorial background from B0 → D0 h+ h− and B+ → D*(2007)0 h+ decays, and fully combinatorial background. Partially reconstructed backgrounds are not included, as there is no model available for the SDP distributions of these components. Misidentified B0 → D*(2007)0 π+ π− decays are also not included, as their observed yield is small and they do not exhibit important correlations that could contribute to biasing the results.

Footnote 1: Both SDP variables are picked from the same event in order to maintain the appropriate SDP distribution.
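The two sampling schemes can be sketched as below. This is a minimal illustration for a single component, assuming sampling with replacement from arrays of MC events; the function name and array layout are hypothetical.

```python
import numpy as np


def draw_toy(mass, sdp, n_events, correlated, rng):
    """Draw a toy sample of (mass, SDP) values from MC events.

    mass has shape (N,) and sdp has shape (N, 2), one row per MC event.
    correlated=True keeps the mass and SDP point of the same event.
    correlated=False picks the SDP point from an independently chosen event,
    keeping both SDP coordinates together (footnote 1).
    Sampling with replacement is an assumption of this sketch.
    """
    mass = np.asarray(mass, dtype=float)
    sdp = np.asarray(sdp, dtype=float)
    i_mass = rng.integers(0, len(mass), size=n_events)
    if correlated:
        i_sdp = i_mass
    else:
        i_sdp = rng.integers(0, len(mass), size=n_events)
    return mass[i_mass], sdp[i_sdp]
```

Calling the function once per fit component, with the number of events set by the fitted yields, and concatenating the outputs would give one pseudo-experiment of each type.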

[Figure 10.8: three panels showing mass distributions (x-axis: m(D*(2007)0 K+ π−) [MeV/c2], 5200 to 5800; y-axis: Events / 10 MeV/c2) for the data sample, the correlated toy sample and the uncorrelated toy sample, each with a lower panel ranging from −5 to 5.]

Fig. 10.8 B0(s)-candidate mass distribution of the two pseudo-experiments generated to evaluate biases due to correlations with the SDP variables for (top left) B0 → D*(2007)0 K+ π−, (top right) Bs0 → D*(2007)0 K− π+ and (bottom) B0 → D*(2007)0 π+ π− candidates, in all cases with D*(2007)0 → D0 γ decays. The distributions from data are included for comparison

All components are taken from full MC samples, except for the fully combinatorial background, for which the B0(s)-candidate mass distribution is the model used in the fit to data and the SDP distribution is that obtained from the sWeights procedure. Thus, uniquely, the fully combinatorial component is introduced without correlations in both pseudo-experiments, which is acceptable since the correlations in this component are not expected to be important. The correctly reconstructed signal, misreconstructed signal and misidentified background components take into account the reweighting procedure described in Sect. 7.3, so that the corrected SDP distribution is generated in the pseudo-experiments.

The pseudo-experiments have been generated with 10 times the number of events present in the data samples, maintaining the ratio between the yields of all the included components as measured in the simultaneous fit. The use of large-yield pseudo-experiments is preferred to an ensemble of data-sized samples since it is technically simpler and the result for the bias would be mathematically equivalent. A comparison of the B0(s)-candidate mass distributions of both pseudo-experiments (correlated and uncorrelated) with the data distribution is shown in Fig. 10.8, in which the pseudo-experiments have been normalised to the data yield to facilitate the comparison. The distributions are seen to be consistent except for the absence of partially reconstructed background at low B0(s)-candidate mass in the D*(2007)0 K+ π− and D*(2007)0 π+ π− final states, which however should not impact this study.

The pseudo-experiments are fitted independently, with each component modelled in the same way as in the baseline mass fit (see Chap. 8). Then, the sWeights procedure described in Chap. 9 is applied to both samples to obtain the efficiency-corrected yield and the corrected efficiency. The corrected efficiencies, as well as their relative difference, are shown in Table 10.14. The differences between the corrected efficiencies in the pseudo-experiments generated with and without correlations are taken as the estimate of the systematic uncertainty due to correlation effects. These effects are found to be below the percent level, and hence small compared to the leading uncertainties. The corresponding uncertainty on each of the relative branching fractions is also given, where this value is computed conservatively assuming that the biases on each of the efficiencies are fully correlated.

Table 10.14 Corrected efficiencies obtained from pseudo-experiments generated with and without correlations, as well as the value obtained using the results of the baseline fit to data. The relative efficiency difference between the results with and without correlations, which is taken as the systematic uncertainty, is also shown. The corresponding uncertainty on each of the relative branching fractions is also given, where this value is computed conservatively assuming that the biases on each of the efficiencies are fully correlated

            B0 γ              Bs0 γ             Control γ
ε corr      1.98763 · 10^−4   2.12246 · 10^−4   2.59798 · 10^−4
ε uncorr    1.96944 · 10^−4   2.13761 · 10^−4   2.59008 · 10^−4
ε data      2.00701 · 10^−4   2.18842 · 10^−4   2.65064 · 10^−4
δε/ε        0.0092            −0.0071           0.0031

Branching fraction ratio                                 δB/B (%)
B(B0 → D*(2007)0 K+ π−) / B(B0 → D*(2007)0 π+ π−)        0.61
B(Bs0 → D*(2007)0 K− π+) / B(B0 → D*(2007)0 π+ π−)       1.02
B(B0 → D*(2007)0 K+ π−) / B(Bs0 → D*(2007)0 K− π+)       1.62

Note that the corrected efficiencies for these samples are not expected to match exactly the values obtained from the fit to data, as given in Table 9.3, due to the approximations in the generation of pseudo-experiments. Nonetheless, reasonable agreement is found, which can be considered a further validation of the SDP models used to reweight the MC samples, as described in Sect. 7.3.

10.2 Yields Systematic Uncertainties

10.2.1 Fit Stability

The fit stability is studied to quantify any possible biases introduced by the fit model. An ensemble of 2500 toy experiments is generated according to the results of the fit to data. The number of events generated in each of the toy samples follows a Poisson distribution whose mean is the number of events in the data sample.


Fig. 10.9 Pull distributions of the signal yields obtained from an ensemble of 2500 toy experiments. The distributions are given for (top) B 0 → D ∗ (2007)0 K+ π− , (middle) Bs0 → D ∗ (2007)0 K− π+ and (bottom) B 0 → D ∗ (2007)0 π+ π− , with left (right) plots corresponding to channels with D ∗ (2007)0 → D 0 γ (D ∗ (2007)0 → D 0 π0 )

The toy samples are fitted using the same configuration as the data, and the distributions of results are compared to the values obtained in the fit to data. The pull distributions are shown in Fig. 10.9. The means of the signal yields from the ensemble, as well as the means and standard deviations of the pull distributions, are given in Table 10.15.

Table 10.15 Mean signal yields from the ensemble of 2500 toy experiments, compared to the baseline values, together with the mean (μ) and standard deviation (σ) of each pull distribution

                   B0 γ             B0 π0            Bs0 γ           Bs0 π0           Control γ        Control π0
Baseline yield     946.37           184.66           3744.32         632.72           15020.91         2591.49
Toys mean yield    940.62           184.12           3755.12         624.92           15016.05         2563.12
Pull μ             −0.154 ± 0.024   −0.094 ± 0.023   0.127 ± 0.025   −0.207 ± 0.029   −0.038 ± 0.030   −0.137 ± 0.035
Pull σ             1.137 ± 0.017    1.132 ± 0.017    1.082 ± 0.018   1.192 ± 0.020    1.260 ± 0.021    1.323 ± 0.025

The pull distributions are all seen to be close to the unit Gaussian distributions expected in the absence of any bias. Nonetheless, the means of the pulls have values as large as 10 or 20% (in units of the statistical uncertainty), corresponding to small biases that are a source of systematic uncertainty. These uncertainties are quantified by comparing the results for the branching fraction ratios when using the mean signal yields from the toy experiments instead of the yields obtained from data, as shown in Table 10.16.

Table 10.16 Results of the different branching fractions when using the signal yields obtained from the toy samples with respect to the baseline model

Branching fraction ratio                                         Baseline model   Modified model   δB        δB/B
B(B0 → D*(2007)0 K+ π−, γ) / B(B0 → D*(2007)0 π+ π−, γ)          0.0832           0.0827           −0.0005   −0.0058
B(B0 → D*(2007)0 K+ π−, π0) / B(B0 → D*(2007)0 π+ π−, π0)        0.0850           0.0857           0.0007    0.0081
B(Bs0 → D*(2007)0 K− π+, γ) / B(B0 → D*(2007)0 π+ π−, γ)         1.189            1.193            0.0038    0.0032
B(Bs0 → D*(2007)0 K− π+, π0) / B(B0 → D*(2007)0 π+ π−, π0)       1.099            1.098            −0.0015   −0.0014
B(B0 → D*(2007)0 K+ π−, γ) / B(Bs0 → D*(2007)0 K− π+, γ)         0.0700           0.0693           −0.0006   −0.0089
B(B0 → D*(2007)0 K+ π−, π0) / B(Bs0 → D*(2007)0 K− π+, π0)       0.0773           0.0757           0.0007    0.0095

It can also be noted that the widths of the pull distributions are slightly, but consistently, larger than unity. This corresponds to an underestimation of the statistical uncertainty by a small amount, varying from 10 to 30%, and is understood as a consequence of the simultaneous fit and its constraints. To account for this, the statistical uncertainty of each of the corrected yields has been scaled by the corresponding factor.
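The pull study and the subsequent rescaling of the statistical uncertainty can be illustrated with the following Python sketch. The function names are hypothetical, and the pull mean and width are taken here from simple sample moments with approximate Gaussian uncertainties, which may differ from the way the values in Table 10.15 were obtained.

```python
import numpy as np


def pull_summary(fitted_yields, fitted_errors, generated_yield):
    """Mean and width of the pull distribution, with approximate uncertainties.

    The pull of each toy is (fitted - generated) / fitted uncertainty.
    Returns (mu, mu_err, sigma, sigma_err) assuming Gaussian pulls.
    """
    fitted_yields = np.asarray(fitted_yields, dtype=float)
    fitted_errors = np.asarray(fitted_errors, dtype=float)
    pulls = (fitted_yields - generated_yield) / fitted_errors
    n = len(pulls)
    mu = pulls.mean()
    sigma = pulls.std(ddof=1)
    return mu, sigma / np.sqrt(n), sigma, sigma / np.sqrt(2.0 * (n - 1))


def scale_uncertainty(stat_error, pull_sigma):
    """Inflate a statistical uncertainty by the width of the pull distribution."""
    return stat_error * pull_sigma
```

With 2500 toys, a pull width close to 1.14 would carry an uncertainty of roughly 0.016 in this approximation, similar in size to the values quoted in Table 10.15.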

10.2.2 Contributions from Λ0b Backgrounds

Although many different sources of background are included in the baseline mass fit model, some components are excluded. This is typically done because the extra components are expected to have an almost negligible yield, and their inclusion would complicate the baseline fit.


The most relevant of such backgrounds consists of events that contain a Λ0b → D∗(2007)0 p h+ decay in which the anti-proton has been misidentified as a K− or a π−, as well as Λ0b → D0 p h+ decays in which, additionally, a random γ/π0 is used to build a D∗(2007)0 candidate. The justification for not including this contribution in the baseline was given in Sect. 7.3, but it is nonetheless important to quantify any possible bias introduced by this decision. To do so, an alternative fit model featuring these contributions has been developed, in which the Λ0b background components are modelled using a Johnson plus a double Crystal Ball function. The signal yields and the signal sWeights are obtained from this alternative model and used to compute the branching fraction ratios, which are then compared with the baseline results to quantify the systematic uncertainty. When introducing these components to the alternative mass fit model, the following constraints are included to keep the levels of these backgrounds consistent between the different channels.

• Both the channels Bs0 → D∗(2007)0 K−π+ and B0 → D∗(2007)0 π+π− may contain events from the same backgrounds Λ0b → D∗(2007)0 pπ+ and Λ0b → D0 pπ+. The yields of the backgrounds in the two channels can be written as

\begin{equation}
\begin{aligned}
N(\Lambda_b^0 \to D^0 p\pi^+ \,|\, B^0 \to D^{*}(2007)^0 \pi^+\pi^- \,|\, \gamma) &= f^{D^0}_{\Lambda_b^0}\, N(\Lambda_b^0 \to D^0 p\pi^+ \,|\, B_s^0 \to D^{*}(2007)^0 \pi^+ K^- \,|\, \gamma), \\
N(\Lambda_b^0 \to D^{*0} p\pi^+ \,|\, B^0 \to D^{*}(2007)^0 \pi^+\pi^- \,|\, \gamma) &= f^{D^{*0}}_{\Lambda_b^0}\, N(\Lambda_b^0 \to D^{*0} p\pi^+ \,|\, B_s^0 \to D^{*}(2007)^0 \pi^+ K^- \,|\, \gamma).
\end{aligned}
\tag{10.6}
\end{equation}

Here $f^{D^0}_{\Lambda_b^0}$ and $f^{D^{*0}}_{\Lambda_b^0}$ correspond to the relative rate for each Λ0b background to be reconstructed in the two final states. These constraints are determined from MC.

• The ratio between the yields of Λ0b → D∗(2007)0 p h+ in channels with different D∗(2007)0 decays depends only on the reconstruction efficiency of the D∗(2007)0 meson and thus must be the same in all the channels. The background yields in channels with different D∗(2007)0 decays can therefore be constrained:

\begin{equation}
\begin{aligned}
N(\Lambda_b^0 \to D^{*0} p h^+ \,|\, X \,|\, \gamma) &= f^{D^{*0}}_{\gamma/\pi^0}\, N(\Lambda_b^0 \to D^{*0} p h^+ \,|\, X \,|\, \pi^0), \\
N(\Lambda_b^0 \to D^{0} p h^+ \,|\, X \,|\, \gamma) &= f^{D^{0}}_{\gamma/\pi^0}\, N(\Lambda_b^0 \to D^{0} p h^+ \,|\, X \,|\, \pi^0).
\end{aligned}
\tag{10.7}
\end{equation}

Here $f^{D^{*0}}_{\gamma/\pi^0}$ and $f^{D^{0}}_{\gamma/\pi^0}$ correspond to the relative rates for each Λ0b background to be reconstructed in the two D∗(2007)0 channels. These constraints are determined from MC. Also, X is one of the three decay channels B0 → D∗(2007)0 K+π−, Bs0 → D∗(2007)0 K−π+ and B0 → D∗(2007)0 π+π−.

The constraints allow the Λ0b background components to be included in the fit model while increasing the number of degrees of freedom of the fit model by only 8, of which 4 (those corresponding to the aforementioned constraints) are not completely free in the fit.


The results of the mass fit when including these components are shown in Figs. 10.10 and 10.11. Note that the fit sets the Λ0b → D0 p h+ background yields to 0, consistent with the decision not to include this component in the baseline model. The Λ0b → D∗(2007)0 p h+ background yields are, however, non-zero, so the signal yields are modified from their baseline values. The results of the fit with the modified model are used to compute the corrected signal efficiency using sWeights as described in Chap. 9. The signal yields and the corrected efficiencies in each channel are given in Table 10.17, compared with the values from the baseline model. These are used to compute the relative branching fractions, which are then compared with the baseline results to obtain the systematic uncertainty associated with not including these background components. These uncertainties are presented in Table 10.18.
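The comparison between baseline and modified results can be sketched in a few lines. This is only an illustration of the arithmetic: it assumes that δB is the modified minus the baseline value and that δB/B is taken relative to the baseline, and the function name is hypothetical. The input values are the Bs0, γ ratio quoted in Table 10.18.

```python
def systematic_shift(baseline, modified):
    """Return the absolute shift and the shift relative to the baseline."""
    delta = modified - baseline
    return delta, delta / baseline


# B(Bs0 -> D*(2007)0 K- pi+, gamma) / B(B0 -> D*(2007)0 pi+ pi-, gamma)
delta_b, relative_delta_b = systematic_shift(1.189, 1.159)
```

With these inputs the shift is about −0.030 and the relative shift about −0.025, in line with the corresponding row of the table.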

10.2.3 Multiple and Duplicated Candidates

As discussed in Sect. 6.6, the data sample after selection contains non-negligible contributions from candidates that originate from the same event, referred to as multiple candidates. Similarly, about 1% of the candidates in the D∗(2007)0 → D0γ channels occur in events that also have at least one candidate in the D∗(2007)0 → D0π0 channel, referred to as duplicated candidates. (This fraction is around 5% when evaluated the other way around.) Both multiple and duplicated candidates arise mainly as a consequence of the challenging reconstruction of the final state neutral particles. Since these effects may differ between data and MC, they may bias both the yield estimation in the mass fit and the corrected efficiencies obtained from signal MC samples. Thus, these effects must be studied as a source of systematic uncertainty.

To do so, additional selection requirements are introduced to reduce to zero the rates of both multiple and duplicated candidates due to the neutral particle reconstruction. For each event with more than one candidate, in either the same channel or the channel with the other D∗(2007)0 decay, only the candidate with the lowest DecayTreeFitter χ2 is kept, while the remaining candidates are discarded. This is referred to subsequently as the multiple candidate veto, and removes all multiple candidates as well as duplicated candidates from the other D∗(2007)0 decay. This requirement does not significantly sculpt the B candidate mass distribution, since the DecayTreeFitter configuration uses constraints on only the D0 and D∗(2007)0 masses, not on the B0(s) mass. Note that this veto does not remove potential duplicated candidates due to misidentification of charged tracks, e.g. candidates selected as both B0 → D∗(2007)0 K+π− and B0 → D∗(2007)0 π+π−. However, such effects are at or below the 1% level and are considered negligible.
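The multiple candidate veto described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis implementation: the record fields (`run`, `event`, `b_channel`, `dstar_decay`, `chi2_dtf`) are hypothetical names, and candidates are grouped per event and per B decay channel so that the γ and π0 reconstructions of the same event compete, while candidates in different B channels are left untouched.

```python
def multiple_candidate_veto(candidates):
    """Keep, per event and B channel, only the candidate with the lowest chi2."""
    best = {}
    for cand in candidates:
        # Both D*(2007)0 decay modes share the same key, so duplicated
        # candidates compete with multiple candidates of the same mode.
        key = (cand["run"], cand["event"], cand["b_channel"])
        if key not in best or cand["chi2_dtf"] < best[key]["chi2_dtf"]:
            best[key] = cand
    return list(best.values())


# Hypothetical candidates: three in one event, one in another.
candidates = [
    {"run": 1, "event": 10, "b_channel": "Kpi", "dstar_decay": "gamma", "chi2_dtf": 12.4},
    {"run": 1, "event": 10, "b_channel": "Kpi", "dstar_decay": "pi0", "chi2_dtf": 8.1},
    {"run": 1, "event": 10, "b_channel": "Kpi", "dstar_decay": "gamma", "chi2_dtf": 15.0},
    {"run": 1, "event": 11, "b_channel": "Kpi", "dstar_decay": "gamma", "chi2_dtf": 9.9},
]
selected = multiple_candidate_veto(candidates)
```

In this example one candidate survives per event; the π0 candidate is retained in the first event because it has the lowest χ2 of the three.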
The effectiveness of the multiple candidate veto to remove background and select the correctly reconstructed candidate has been checked using truth matched MC. The results are shown in Table 10.19, which also includes the probability to select

[Figure 10.10: six panels, each with a residual panel below. x-axis: (D∗(2007)0 K+π−), (D∗(2007)0 K−π+) or (D∗(2007)0 π+π−) mass [MeV/c2]; y-axis: Entries/10 MeV/c2.]

Fig. 10.10 Results of the simultaneous fits with linear y-axis scale including partially combinatorial backgrounds from Λ0b decays (in pink). Distributions of (top) B0 → D∗(2007)0 K+π−, (middle) Bs0 → D∗(2007)0 K−π+, and (bottom) B0 → D∗(2007)0 π+π− with (left) D∗(2007)0 → D0γ and (right) D∗(2007)0 → D0π0 candidates are shown. The total fit result is shown as a solid blue line. All channels also include components for correctly reconstructed signal (solid green line), misreconstructed signal (dashed dark green line) and signal with the wrong D∗(2007)0 decay reconstructed (dashed light green line). Misidentified B0 → D∗(2007)0 π+π− and Bs0 → D∗(2007)0 K+K− decays are indicated by blue and purple lines, respectively. In the control channel, the lighter and darker blue lines indicate misidentified B0 → D∗(2007)0 K+π− and Bs0 → D∗(2007)0 K−π+ decays, respectively. Partially combinatorial backgrounds are shown in orange (B0(s) → D0 h+h−) and brown (B+ → D∗(2007)0 h+) while partially reconstructed backgrounds are grey and black. The combinatorial background component is shown in red

[Figure 10.11: six panels, each with a residual panel below. x-axis: (D∗(2007)0 K+π−), (D∗(2007)0 K−π+) or (D∗(2007)0 π+π−) mass [MeV/c2]; y-axis: Entries/10 MeV/c2 (logarithmic scale).]

Fig. 10.11 Results of the simultaneous fits with log y-axis scale including partially combinatorial backgrounds from Λ0b decays (in pink). Distributions of (top) B0 → D∗(2007)0 K+π−, (middle) Bs0 → D∗(2007)0 K−π+, and (bottom) B0 → D∗(2007)0 π+π− with (left) D∗(2007)0 → D0γ and (right) D∗(2007)0 → D0π0 candidates are shown. The total fit result is shown as a solid blue line. All channels also include components for correctly reconstructed signal (solid green line), misreconstructed signal (dashed dark green line) and signal with the wrong D∗(2007)0 decay reconstructed (dashed light green line). Misidentified B0 → D∗(2007)0 π+π− and Bs0 → D∗(2007)0 K+K− decays are indicated by blue and purple lines, respectively. In the control channel, the lighter and darker blue lines indicate misidentified B0 → D∗(2007)0 K+π− and Bs0 → D∗(2007)0 K−π+ decays, respectively. Partially combinatorial backgrounds are shown in orange (B0(s) → D0 h+h−) and brown (B+ → D∗(2007)0 h+) while partially reconstructed backgrounds are grey and black. The combinatorial background component is shown in red


Table 10.17 Signal yields and corrected signal efficiencies for the baseline model and the modified fit model that includes Λ0b background components

               Baseline model                        Modified model
Channel        Yield       Corrected efficiency      Yield       Corrected efficiency
B0, γ          946.37      2.01 · 10−4               913.09      2.03 · 10−4
B0, π0         184.66      2.08 · 10−5               172.43      1.97 · 10−5
Bs0, γ         3744.32     2.18 · 10−4               3473.56     2.18 · 10−4
Bs0, π0        632.72      2.16 · 10−5               599.6       2.03 · 10−5
Control, γ     15020.91    2.65 · 10−4               14361.6     2.65 · 10−4
Control, π0    2591.49     2.48 · 10−5               2525.13     2.49 · 10−5

Table 10.18 Branching fraction ratios for the baseline model and the modified fit model that includes Λ0b background components. The difference between these two values is taken as the associated systematic uncertainty

Branching fraction ratio                                  Baseline model   Modified model   δB         δB/B
B(B0→D∗(2007)0K+π−, γ) / B(B0→D∗(2007)0π+π−, γ)           0.0832           0.0830           −0.00302   0.003
B(B0→D∗(2007)0K+π−, π0) / B(B0→D∗(2007)0π+π−, π0)         0.0850           0.08633          0.0013     0.0154
B(Bs0→D∗(2007)0K−π+, γ) / B(B0→D∗(2007)0π+π−, γ)          1.189            1.159            −0.030     −0.025
B(Bs0→D∗(2007)0K−π+, π0) / B(B0→D∗(2007)0π+π−, π0)        1.099            1.147            0.048      0.044
B(B0→D∗(2007)0K+π−, γ) / B(Bs0→D∗(2007)0K−π+, γ)          0.070            0.072            0.0016     0.023
B(B0→D∗(2007)0K+π−, π0) / B(Bs0→D∗(2007)0K−π+, π0)        0.077            0.075            −0.0021    0.027

the correctly reconstructed candidate using other variables that were considered. The DecayTreeFitter χ2 is used as it exhibits the highest discriminating power. It is important to note that the ratios given in Table 10.19 correspond only to events with more than one candidate, which constitute a small fraction of the total events in the data sample. The overall signal efficiency of the veto is above 99% for channels with D∗(2007)0 → D0γ and above 96% for D∗(2007)0 → D0π0 channels. Moreover, the veto is expected to reduce misreconstructed signal candidates by 14% in the channels with D∗(2007)0 → D0γ and 6% in the D∗(2007)0 → D0π0 channels. The impact of the multiple candidate veto on the results of the analysis is used to quantify the systematic uncertainty associated with the presence of multiple candidates. Since this veto modifies the selection strategy, its impact on the MC samples used to define the shapes in the mass fit must be checked. Moreover, most of the constraints used in the mass fit rely on relative efficiencies determined from MC, and hence these must also be updated to account for the modified selection. In Table 10.20, the Gaussian constraints used in the baseline model are compared to


Table 10.19 Percentage of correctly reconstructed candidates maintained from events with multiple and duplicated candidates when using different veto variables

Discriminant variable               B0, γ (%)   B0, π0 (%)   Bs0, γ (%)   Bs0, π0 (%)   Control, γ (%)   Control, π0 (%)
χ2DTF (No constraints)              62          57           63           58            64               56
χ2DTF (D∗(2007)0 and D0 mass)       82          66           79           67            82               68
γ CL                                57          71           57           70            57               71
δ(mD∗(2007)0 − mD0)                 62          54           59           58            54               57
BDTcomb                             75          65           76           63            76               64

Table 10.20 Gaussian constraints used in the baseline model and its modified version using the veto to remove duplicate and multiple candidates

Parameter           Baseline constraint (μ ± σ)   Modified constraint (μ ± σ)
f_s^{KK}            0.4324 ± 0.2518               0.4810 ± 0.2633
f^{KK}              1.3880 ± 0.3498               1.4072 ± 0.3754
f^{ππ}              0.0097 ± 0.0008               0.0094 ± 0.0008
f_s^{ππ}            0.0034 ± 0.0003               0.0035 ± 0.0004
f^{Kπ}              0.1583 ± 0.0041               0.1583 ± 0.0042
f_s^{Kπ}            0.0619 ± 0.0034               0.0619 ± 0.0034
f_{B2Dhh}           0.1390 ± 0.0012               0.1307 ± 0.0012
f^{MC}_{Bu}         0.3521 ± 0.0049               0.3543 ± 0.0051
f_{Bu2Dsth}         0.2197 ± 0.0068               0.2059 ± 0.0035
f^{MC}_{B+/B0}      0.3736 ± 0.0032               0.3686 ± 0.0032

the values obtained with the modified selection that includes this veto. The results of the mass fit with the multiple candidate veto applied, including updated shapes and constraints, are shown in Figs. 10.12 and 10.13. The results of this fit are used to obtain branching fraction ratios, with candidate-by-candidate efficiency corrections applied as described in Chap. 9. The signal yields and the corrected efficiencies obtained from this procedure are compared to those from the baseline model in Table 10.22. The branching fraction ratios with the multiple candidate veto applied are compared to the baseline results, from which the corresponding systematic uncertainty is evaluated, as shown in Table 10.23 (Table 10.21).

[Figure 10.12: six panels, each with a residual panel below. x-axis: (D∗(2007)0 K+π−), (D∗(2007)0 K−π+) or (D∗(2007)0 π+π−) mass [MeV/c2]; y-axis: Entries/10 MeV/c2.]

Fig. 10.12 Results of the simultaneous fits with linear y-axis scale after the veto to remove duplicate and multiple candidates is applied. Distributions of (top) B0 → D∗(2007)0 K+π−, (middle) Bs0 → D∗(2007)0 K−π+, and (bottom) B0 → D∗(2007)0 π+π− with (left) D∗(2007)0 → D0γ and (right) D∗(2007)0 → D0π0 candidates are shown. The total fit result is shown as a solid blue line. All channels also include components for correctly reconstructed signal (solid green line), misreconstructed signal (dashed dark green line) and signal with the wrong D∗(2007)0 decay reconstructed (dashed light green line). Misidentified B0 → D∗(2007)0 π+π− and Bs0 → D∗(2007)0 K+K− decays are indicated by blue and purple lines, respectively. In the control channel, the lighter and darker blue lines indicate misidentified B0 → D∗(2007)0 K+π− and Bs0 → D∗(2007)0 K−π+ decays, respectively. Partially combinatorial backgrounds are shown in orange (B0(s) → D0 h+h−) and brown (B+ → D∗(2007)0 h+) while partially reconstructed backgrounds are grey and black. The combinatorial background component is shown in red

[Figure 10.13: six panels, each with a residual panel below. x-axis: (D∗(2007)0 K+π−), (D∗(2007)0 K−π+) or (D∗(2007)0 π+π−) mass [MeV/c2]; y-axis: Entries/10 MeV/c2 (logarithmic scale).]

Fig. 10.13 Results of the simultaneous fits with log y-axis scale after the veto to remove duplicate and multiple candidates is applied. Distributions of (top) B0 → D∗(2007)0 K+π−, (middle) Bs0 → D∗(2007)0 K−π+, and (bottom) B0 → D∗(2007)0 π+π− with (left) D∗(2007)0 → D0γ and (right) D∗(2007)0 → D0π0 candidates are shown. The total fit result is shown as a solid blue line. All channels also include components for correctly reconstructed signal (solid green line), misreconstructed signal (dashed dark green line) and signal with the wrong D∗(2007)0 decay reconstructed (dashed light green line). Misidentified B0 → D∗(2007)0 π+π− and Bs0 → D∗(2007)0 K+K− decays are indicated by blue and purple lines, respectively. In the control channel, the lighter and darker blue lines indicate misidentified B0 → D∗(2007)0 K+π− and Bs0 → D∗(2007)0 K−π+ decays, respectively. Partially combinatorial backgrounds are shown in orange (B0(s) → D0 h+h−) and brown (B+ → D∗(2007)0 h+) while partially reconstructed backgrounds are grey and black. The combinatorial background component is shown in red


Table 10.21 χ2/Nbins values from each of the spectra fitted in the simultaneous fit

Mode           χ2/Nbins
B0, γ          1.11
B0, π0         0.61
Bs0, γ         2.14
Bs0, π0        1.21
Control, γ     2.28
Control, π0    2.45

Table 10.22 Signal yields and corrected efficiencies for both the baseline results and the modified selection that includes the multiple candidate veto

               Baseline model                        Modified model
Channel        Yield       Corrected efficiency      Yield       Corrected efficiency
B0, γ          946.37      2.01 · 10−4               853.38      1.90 · 10−4
B0, π0         184.66      2.08 · 10−5               159.38      1.92 · 10−5
Bs0, γ         3744.32     2.18 · 10−4               3249.84     2.15 · 10−4
Bs0, π0        632.72      2.16 · 10−5               574.61      2.05 · 10−5
Control, γ     15020.91    2.65 · 10−4               13583.12    2.57 · 10−4
Control, π0    2591.49     2.48 · 10−5               2311.21     2.31 · 10−5

Table 10.23 Branching fraction ratios for both the baseline results and the modified selection that includes the multiple candidate veto. The difference between these two values is taken as a systematic uncertainty

Branching fraction ratio                                  Baseline model   Modified model   δB        δB/B
B(B0→D∗(2007)0K+π−, γ) / B(B0→D∗(2007)0π+π−, γ)           0.0832           0.0852           0.0020    0.024
B(B0→D∗(2007)0K+π−, π0) / B(B0→D∗(2007)0π+π−, π0)         0.0850           0.0829           0.0021    0.0251
B(Bs0→D∗(2007)0K−π+, γ) / B(B0→D∗(2007)0π+π−, γ)          1.189            1.130            −0.060    0.050
B(Bs0→D∗(2007)0K−π+, π0) / B(B0→D∗(2007)0π+π−, π0)        1.099            1.101            0.0018    0.002
B(B0→D∗(2007)0K+π−, γ) / B(Bs0→D∗(2007)0K−π+, γ)          0.070            0.075            0.0054    0.078
B(B0→D∗(2007)0K+π−, π0) / B(Bs0→D∗(2007)0K−π+, π0)        0.077            0.075            0.0021    0.027


10.2.4 Alternative Background Models

The description of several backgrounds relies on MC samples generated flat across the squared Dalitz plot. Great effort is made to improve the characterisation of such backgrounds by reweighting the samples using custom models as described in Sect. 7.3. However, the reliability of this method depends on the accuracy of the custom physical models of these decays and on the binning used for the reweighting procedure. While the choices made for the former are supported by the background distributions obtained with the sWeights procedure, the latter is a potential source of systematic uncertainty due to the limited number of events in the background MC samples after the selection is applied.

In the baseline analysis, an adaptive binning scheme of 20 × 20 bins with the same number of events per bin is used in the reweighting process. Note that the reweighting process affects both the background shape description and the overall background efficiency. Changes in the efficiency due to the binning scheme are studied by determining the corrected yields in the MC samples with different numbers of bins. The range of binnings considered is from 10 × 10 up to 100 × 100 bins, where in all cases an adaptive binning scheme is used to ensure the same number of events per bin. This not only allows the fluctuations of the background efficiency to be studied as a function of the number of bins, but also serves as a cross check to validate the binning chosen for the baseline model. The corrected yields as a function of the number of bins for all signal components are shown in Fig. 10.14. To evaluate the systematic uncertainty due to this effect, the invariant mass fit is repeated with background shapes obtained from reweighted MC using an 80 × 80 binning scheme.

As for the other changes to the fit model discussed previously, a modified fit model also affects the sWeights calculation and therefore the corrected efficiency. A comparison of the signal yields and corrected efficiencies from the new fit model with those from the baseline model is shown in Table 10.24. These alternative results are used to calculate the branching fraction ratios, as shown in Table 10.25, and the difference between the modified and baseline results is taken as the systematic uncertainty associated with the background modelling.
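One way to build an adaptive binning with the same number of events per bin is sketched below. It is an assumption-laden illustration rather than the scheme used in the analysis: events are first split into equally populated slices along one squared Dalitz plot coordinate, and each slice is then split into equally populated bins along the other. The function names, the uniform toy sample and the sample size are hypothetical.

```python
import random


def split_equal(items, n_parts):
    """Split a sorted list into n_parts consecutive chunks of near-equal size."""
    n = len(items)
    return [items[(i * n) // n_parts:((i + 1) * n) // n_parts]
            for i in range(n_parts)]


def adaptive_bins(points, n_x, n_y):
    """Equal-population 2D binning: slice in x, then slice each column in y."""
    bins = []
    for column in split_equal(sorted(points, key=lambda p: p[0]), n_x):
        bins.extend(split_equal(sorted(column, key=lambda p: p[1]), n_y))
    return bins


# Hypothetical sample of 4000 events, flat in the unit square.
rng = random.Random(7)
points = [(rng.random(), rng.random()) for _ in range(4000)]
bins = adaptive_bins(points, 20, 20)
populations = [len(b) for b in bins]
```

With 4000 events and 20 × 20 bins, every bin holds 10 events. Increasing the number of bins at fixed sample size lowers the population per bin, which is the trade-off probed by scanning from 10 × 10 to 100 × 100 bins.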

[Figure 10.14: six panels. x-axis: bins; y-axis: Corrected MC yield.]

Fig. 10.14 Corrected yield of the signal MC samples as a function of the number of bins of the reweighting templates used to correct for the flat distribution across the SDP used in the generated samples

Table 10.24 Signal yields and corrected signal efficiencies for both the baseline model and the modified version that includes the alternative model for the background components

               Baseline model                        Modified model
Channel        Yield       Corrected efficiency      Yield       Corrected efficiency
B0, γ          946.37      2.01 · 10−4               913.60      1.97 · 10−4
B0, π0         184.66      2.08 · 10−5               188.74      2.09 · 10−5
Bs0, γ         3744.32     2.18 · 10−4               3627.66     2.17 · 10−4
Bs0, π0        632.72      2.16 · 10−5               615.93      2.20 · 10−5
Control, γ     15020.91    2.65 · 10−4               13731.70    2.60 · 10−4
Control, π0    2591.49     2.48 · 10−5               2552.51     2.50 · 10−5

Table 10.25 Branching fraction ratios for the baseline model and the modified fit with alternative background models. The difference between these two values is taken as the associated systematic uncertainty

Branching fraction ratio                                  Baseline model   Modified model   δB        δB/B
B(B0→D∗(2007)0K+π−, γ) / B(B0→D∗(2007)0π+π−, γ)           0.0832           0.0879           0.0047    0.056
B(B0→D∗(2007)0K+π−, π0) / B(B0→D∗(2007)0π+π−, π0)         0.0850           0.0882           0.0032    0.038
B(Bs0→D∗(2007)0K−π+, γ) / B(B0→D∗(2007)0π+π−, γ)          1.189            1.1248           0.059     0.049
B(Bs0→D∗(2007)0K−π+, π0) / B(B0→D∗(2007)0π+π−, π0)        1.099            1.076            −0.0023   −0.021
B(B0→D∗(2007)0K+π−, γ) / B(Bs0→D∗(2007)0K−π+, γ)          0.070            0.070            0.0004    0.006
B(B0→D∗(2007)0K+π−, π0) / B(Bs0→D∗(2007)0K−π+, π0)        0.077            0.082            0.0046    0.060

10.3 Systematic Uncertainties Summary

The complete list of systematic uncertainties is shown in Table 10.26, where the statistical uncertainties and uncertainties due to external constraints are also given for comparison. The uncertainties are quoted as both absolute and relative uncertainties to allow a clearer comparison between the different relative branching fractions. The statistical uncertainties have been scaled as discussed in Sect. 10.2.1, and therefore differ from the values presented in Chap. 11. Systematic uncertainties are indicated as being considered to be either completely uncorrelated (†) or completely correlated (∗) between the ratios obtained with the D∗(2007)0 → D0γ and D∗(2007)0 → D0π0 decays; this is important for the combination of results between the two channels, discussed in Chap. 11.

Despite the challenges of this analysis, including the difficult reconstruction of the D∗(2007)0 mesons and the large number of background components that must be included in the mass fit, the individual results (each column in Table 10.26) generally have systematic uncertainties at the same level as or below the statistical uncertainties. An exception is the Bs0, γ mode, normalised to the control channel, where the high signal yield results in a statistical uncertainty of only 2.7%. The ratio of B0 to Bs0 branching fractions is also affected by some significant systematic uncertainties. In most cases the largest sources of systematic uncertainty are the disambiguation between multiple candidates and the background modelling. The limited size of the MC samples is also an important source of systematic uncertainty in the D∗(2007)0 → D0π0 channel as a consequence of the low reconstruction efficiency and correspondingly small samples of selected MC decays.
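The caption of Table 10.26 states that the total correlated and uncorrelated systematic uncertainties are obtained by summing the relevant sources in quadrature. A minimal sketch of that combination is given below. The function name is hypothetical, and the inputs are the relative uncertainties from MC statistics (7.1%) and SDP binning (6.7%) as read from the table for the ratio of B0 to Bs0 branching fractions in the D∗(2007)0 → D0π0 channel, the two sources marked as uncorrelated (†).

```python
import math


def quadrature_sum(uncertainties):
    """Combine independent uncertainties by summing them in quadrature."""
    return math.sqrt(sum(u * u for u in uncertainties))


# Relative uncertainties in percent: MC statistics and SDP binning.
total_uncorrelated = quadrature_sum([7.1, 6.7])
```

The result rounds to 9.8%, which agrees with the total uncorrelated systematic uncertainty quoted for that ratio.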

∗ σdata/MC

∗ σPID

† σMC stats

∗ σmult cand

σ∗ 0 Λb

∗ σmodel

∗ σfit bias

5.1 · 10−3

σstat

1.9 · 10−5 2.3 · 10−3 (3.4%)

(0.1%)

(1.2%) (0.0%)

4.8 · 10−3 (5.7%)

9.9 · 10−4

(0.10%) 5.8 · 10−5

(2.5%)

(2.4%)

1.1 · 10−4

2.1 · 10−3

(1.5%)

(0.3%)

2.0 · 10−3

1.3 · 10−3

(5.6%)

2.5 · 10−4

3.2 · 10−3 (3.8%)

4.7 · 10−3

6.9 · 10−4 (0.8%)

(0.6%)

(8.1%)

B(B 0 →D ∗00 K + π− ) π B(B 0 →D ∗00 π+ π− ) π 0.0850 6.9 · 10−3

4.8 · 10−4

(6.1%)

0.083

Central value

+ − B(B 0 →D ∗0 γ K π ) + π− ) B(B 0 →D ∗0 π γ

(0.6%)

7.0 · 10−3

(0.3%)

3.2 · 10−3

(1.2%)

1.4 · 10−2

(5.0%)

5.9 · 10−2

(2.5%)

2.9 · 10−2

(4.9%)

5.9 · 10−2

(0.3%)

3.8 · 10−3

(2.7%)

3.2 · 10−2

1.189

− + B(Bs0 →D ∗0 γ K π ) + π− ) B(B 0 →D ∗0 π γ

(2.4%)

2.7 · 10−2

(0.2%)

2.5 · 10−3

(5.5%)

6.0 · 10−2

(0.2%)

1.8 · 10−3

(4.4%)

4.8 · 10−2

(2.1%)

2.3 · 10−2

(0.1%)

1.5 · 10−3

(4.5%)

5.0 · 10−2

1.099

B(Bs0 →D ∗00 K − π+ ) π B(B 0 →D ∗00 π+ π− ) π

(0.5%)

3.6 · 10−4

(0.2%)

1.1 · 10−4

(1.4%)

9.8 · 10−4

(7.8%)

5.4 · 10−3

(2.3%)

1.6 · 10−3

(0.6%)

4.4 · 10−4

(0.9%)

6.3 · 10−4

(5.9%)

4.1 · 10−3

0.070

+ − B(B 0 →D ∗0 γ K π ) − π+ ) B(Bs0 →D ∗0 K γ

(1.0%)

7.3 · 10−4

(0.2%)

1.8 · 10−4

(7.1%)

5.4 · 10−3

(2.7%)

2.1 · 10−3

(2.7%)

2.1 · 10−3

(6.0%)

4.6 · 10−3

(1.0%)

7.3 · 10−4

(7.4%)

5.7 · 10−3

0.077

(continued)

B(B 0 →D ∗00 K + π− ) π B(Bs0 →D ∗00 K − π+ ) π

Table 10.26 Systematic uncertainties on the branching fraction ratios, with the soft neutral particle emitted in the D ∗ (2007)0 decay indicated as a subscript for brevity of notation. The sources of uncertainty are presented in the same order as described in the text, with central values and statistical uncertainties included to facilitate comparison. Values in parentheses are relative to the central values given in the first row. Systematic uncertainties are indicated as being considered to be either completely uncorrelated (†) or completely correlated (∗) between the ratios obtained with the D ∗ (2007)0 → D 0 γ and D ∗ (2007)0 → D 0 π0 decays. The total correlated and uncorrelated systematic uncertainties are obtained by summing the relevant sources in quadrature. The uncertainty due to the fragmentation fractions is quoted separately.

202 10 Systematic Uncertainties

σ ∗f / f s d

∗ total σsyst

† total σsyst

∗ σweights

στ∗ Bs0

† σSDP bins

∗ σtrigger

— —

—

—

5.1 · 10−3

(9.5%)

(1.8%) (6.1%)

8.1 · 10−3

1.5 · 10−3

(6.2%)

(0.6%)

(0.6%)

5.2 · 10−3

5.2 · 10−4

5.1 · 10−4

—

(7.6%)

(1.4%)

—

6.4 · 10−3

1.2 · 10−3 —

(0.4%)

(1.1%)

—

B(B 0 →D ∗00 K + π− ) π B(B 0 →D ∗00 π+ π− ) π 3.5 · 10−4

+ − B(B 0 →D ∗0 γ K π ) + π− ) B(B 0 →D ∗0 π γ 9.2 · 10−4

Table 10.26 (continued)

3.4 · 10−2 (3.1%)

(3.1%)

(5.3%)

6.3 · 10−2

(8.8%)

9.7 · 10−2

(1.0%)

3.7 · 10−2

(7.7%)

9.0 · 10−2

(2.3%)

2.7 · 10−2

(1.0%)

1.1 · 10−2

(0.5%)

(0.6%) 1.2 · 10−2

5.4 · 10−3

(6.9%)

7.6 · 10−2

(1.2%)

B(Bs0 →D ∗00 K − π+ ) π B(B 0 →D ∗00 π+ π− ) π 1.3 · 10−2

7.2 · 10−3

(1.9%)

2.3 · 10−2

(0.4%)

− + B(Bs0 →D ∗0 γ K π ) + π− ) B(B 0 →D ∗0 π γ 5.3 · 10−3

(3.1%)

2.2 · 10−3

(8.3%)

5.9 · 10−3

(2.5%)

1.8 · 10−3

(1.6%)

1.1 · 10−3

(0.6%)

4.2 · 10−4

(2.1%)

1.5 · 10−3

(0.7%)

$\mathcal{B}(B^0 \to D^{*0}_{\gamma} K^+\pi^-) / \mathcal{B}(B_s^0 \to D^{*0}_{\gamma} K^-\pi^+)$ 4.7 · 10−4

(3.1%)

2.4 · 10−3

(8.1%)

5.8 · 10−3

(9.8%)

7.5 · 10−3

(1.6%)

1.2 · 10−3

(0.5%)

3.8 · 10−4

(6.7%)

5.2 · 10−3

(0.8%)

$\mathcal{B}(B^0 \to D^{*0}_{\pi^0} K^+\pi^-) / \mathcal{B}(B_s^0 \to D^{*0}_{\pi^0} K^-\pi^+)$ 6.1 · 10−4


References

1. Aaij R et al. Selection and processing of calibration samples to measure the particle identification performance of the LHCb experiment in Run 2. EPJ Tech Instrum 6:1, arXiv:1803.00824
2. Sanchez AM, Robbe P, Schune M-H (2012) Performances of the LHCb L0 calorimeter trigger. LHCb-PUB-2011-026
3. Heavy Flavor Averaging Group, Amhis Y et al (2021) Averages of b-hadron, c-hadron, and τ-lepton properties as of 2018. Eur Phys J C81:226, arXiv:1909.12524, updated results and plots available at https://hflav.web.cern.ch
4. Fleischer R, Knegjens R (2011) Effective lifetimes of $B_s$ decays and their constraints on the $B_s^0$-$\bar{B}_s^0$ mixing parameters. Eur Phys J C71:1789, arXiv:1109.5115

Chapter 11

Results

11.1 Relative Branching Fraction Measurements

The relative branching fraction of two decay modes is determined using Eq. (9.1), which takes as input the signal weights obtained from the mass fits (which sum to the signal yield) and the efficiencies as functions of the square Dalitz plot coordinates. A summary of the yields and efficiencies for each channel is shown in Table 11.1. An additional factor is needed in Eq. (9.1) when the numerator and denominator of the ratio correspond to decays of different particles, since their different production rates must then be taken into account. In particular, for the ratio $\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)/\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)$ the right-hand side is modified to include a factor of $(f_s/f_d)^{-1}$, where $f_s/f_d$ is the ratio of fragmentation fractions for $B_s^0$ and $B^0$ mesons [1].

Splitting the data samples allows various consistency checks, which help build confidence in the robustness of the analysis procedures. Studying both $D^{*}(2007)^0$ decays independently not only means that two measurements are made of each of the relative branching fractions under study, but also enables determination of the relative $D^{*}(2007)^0$ branching fraction,

$$
\frac{\mathcal{B}(D^{*}(2007)^0 \to D^0\gamma)}{\mathcal{B}(D^{*}(2007)^0 \to D^0\pi^0)} . \tag{11.1}
$$

This measurement can be done with each of the signal $B^0 \to D^{*}(2007)^0 K^+\pi^-$, $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ and control $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$ channels, leading to the following results

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B 0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0_11


Table 11.1 Signal yields and corrected signal efficiencies for all different channels

Channel         Yield       Corrected efficiency
B0, γ           946.37      2.01 · 10−4
B0, π0          184.66      2.08 · 10−5
Bs0, γ          3744.32     2.18 · 10−4
Bs0, π0         632.72      2.16 · 10−5
Control, γ      15020.91    2.65 · 10−4
Control, π0     2591.49     2.48 · 10−5

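$$
\begin{aligned}
B^0 \to D^{*}(2007)^0 K^+\pi^- &: \quad \frac{\mathcal{B}(D^{*}(2007)^0 \to D^0\gamma)}{\mathcal{B}(D^{*}(2007)^0 \to D^0\pi^0)} = 0.53 \pm 0.06\,, \\
B_s^0 \to D^{*}(2007)^0 K^-\pi^+ &: \quad \frac{\mathcal{B}(D^{*}(2007)^0 \to D^0\gamma)}{\mathcal{B}(D^{*}(2007)^0 \to D^0\pi^0)} = 0.59 \pm 0.04\,, \\
B^0 \to D^{*}(2007)^0 \pi^+\pi^- &: \quad \frac{\mathcal{B}(D^{*}(2007)^0 \to D^0\gamma)}{\mathcal{B}(D^{*}(2007)^0 \to D^0\pi^0)} = 0.54 \pm 0.04\,,
\end{aligned}
$$

in which only the statistical uncertainty is included. All three measurements are compatible within one standard deviation with the latest measurements of this quantity from BESIII [2] and LHCb [3, 4],

$$
\begin{aligned}
\text{BESIII} &: \quad \frac{\mathcal{B}(D^{*}(2007)^0 \to D^0\gamma)}{\mathcal{B}(D^{*}(2007)^0 \to D^0\pi^0)} = 0.55 \pm 0.02\,, \\
\text{LHCb} &: \quad \frac{\mathcal{B}(D^{*}(2007)^0 \to D^0\gamma)}{\mathcal{B}(D^{*}(2007)^0 \to D^0\pi^0)} = 0.53 \pm 0.03\,,
\end{aligned}
$$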

where the uncertainties include both statistical and systematic contributions. The determinations of this quantity presented above are not intended to be competitive with the world-leading results quoted here, but their consistency provides an important cross check. The relative branching fraction between B 0 → D ∗ (2007)0 K+ π− decays and the control channel, and that between Bs0 → D ∗ (2007)0 K− π+ decays and the control channel, are obtained using samples that share the same D ∗ (2007)0 decay. Having two independent measurements of each relative branching fraction provides a cross check of the results, and allows a combined measurement with smaller uncertainty than from each channel separately. These relative branching fractions are determined using Eq. (9.1) to be

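$$
\begin{aligned}
D^{*}(2007)^0 \to D^0\gamma &: \quad \frac{\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-)}{\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)} = 0.083 \pm 0.005\,, \\
D^{*}(2007)^0 \to D^0\pi^0 &: \quad \frac{\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-)}{\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)} = 0.085 \pm 0.006\,,
\end{aligned}
$$

and

$$
\begin{aligned}
D^{*}(2007)^0 \to D^0\gamma &: \quad \frac{\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)}{\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)} = 1.19 \pm 0.03\,, \\
D^{*}(2007)^0 \to D^0\pi^0 &: \quad \frac{\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)}{\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)} = 1.10 \pm 0.04\,,
\end{aligned}
$$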

where only the statistical uncertainty is included, taking into account the statistical correlations given in Table 8.12. This is done by calculating the uncertainty on the ratio using

$$
\sigma^2_{\mathcal{B}(A)/\mathcal{B}(B)} = \left(\frac{\mathcal{B}(A)}{\mathcal{B}(B)}\right)^2 \left( \frac{\sigma^2_{N_A}}{N_A^2} + \frac{\sigma^2_{N_B}}{N_B^2} - 2\,\rho_{AB}\,\frac{\sigma_{N_A}}{N_A}\,\frac{\sigma_{N_B}}{N_B} \right) , \tag{11.2}
$$

where $A$ and $B$ correspond to the two different channels and $\rho_{AB}$ is the correlation between their fitted yields. Due to the high positive correlations between the yields in the $D^{*}(2007)^0 \to D^0\pi^0$ channels, the inclusion of the correlation term results in a notable decrease of the statistical uncertainties compared to the case of zero correlation. This is expected behaviour, as the fitted yields in these channels are strongly affected by the ratio of misreconstructed relative to correctly reconstructed signal, $f^{\pi^0}_{\text{mrs}} = 1.5218 \pm 0.1630$, which is shared among all channels. Therefore, the total statistical uncertainty on the ratio is driven by the sum of the correctly reconstructed and misreconstructed signal yields, which is taken into account through the correlation factor. Additional checks of the effects of the shared parameters in the fit on the statistical uncertainties of the signal yields are presented in Appendix D.

In addition, the ratio between the branching fractions of $B^0 \to D^{*}(2007)^0 K^+\pi^-$ and $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ decays can be obtained in the same manner, leading to

$$
\begin{aligned}
D^{*}(2007)^0 \to D^0\gamma &: \quad \frac{\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-)}{\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)} = 0.070 \pm 0.004\,, \\
D^{*}(2007)^0 \to D^0\pi^0 &: \quad \frac{\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-)}{\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)} = 0.077 \pm 0.006\,.
\end{aligned}
$$

For the measurements of the relative branching fractions that include the $B_s^0$ channel, the fragmentation fraction ratio $f_s/f_d = 0.2539$ [1] has been used in the calculation. The results are seen to be consistent between the $D^{*}(2007)^0 \to D^0\gamma$ and $D^{*}(2007)^0 \to D^0\pi^0$ channels in both cases.
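The arithmetic behind these ratios can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: Eq. (9.1) sums per-candidate weights divided by the local efficiency, whereas the sketch below divides the total yield by the averaged corrected efficiency of Table 11.1. The function names are hypothetical, and the yield uncertainties and correlation passed to `ratio_uncertainty` must come from the fit (Table 8.12), which is not reproduced here.

```python
import math

# (signal yield, corrected efficiency) per channel, transcribed from Table 11.1.
TABLE_11_1 = {
    ("B0", "gamma"): (946.37, 2.01e-4),
    ("B0", "pi0"): (184.66, 2.08e-5),
    ("Bs0", "gamma"): (3744.32, 2.18e-4),
    ("Bs0", "pi0"): (632.72, 2.16e-5),
    ("Control", "gamma"): (15020.91, 2.65e-4),
    ("Control", "pi0"): (2591.49, 2.48e-5),
}
FS_OVER_FD = 0.2539  # fragmentation fraction ratio quoted in the text


def corrected_yield(meson, mode):
    """Efficiency-corrected yield N / eps (assumption: averaged efficiency)."""
    n_sig, eff = TABLE_11_1[(meson, mode)]
    return n_sig / eff


def production_factor(numerator, denominator):
    """Correct for the different production rates of Bs0 and B0 mesons."""
    if numerator == "Bs0" and denominator != "Bs0":
        return 1.0 / FS_OVER_FD
    if denominator == "Bs0" and numerator != "Bs0":
        return FS_OVER_FD
    return 1.0


def branching_ratio(numerator, denominator, mode):
    """Ratio of branching fractions for one D*(2007)0 decay mode."""
    raw = corrected_yield(numerator, mode) / corrected_yield(denominator, mode)
    return raw * production_factor(numerator, denominator)


def ratio_uncertainty(ratio, n_a, sigma_a, n_b, sigma_b, rho_ab):
    """Statistical uncertainty on a ratio of correlated yields, Eq. (11.2)."""
    rel_var = (
        (sigma_a / n_a) ** 2
        + (sigma_b / n_b) ** 2
        - 2.0 * rho_ab * sigma_a * sigma_b / (n_a * n_b)
    )
    return abs(ratio) * math.sqrt(max(rel_var, 0.0))


if __name__ == "__main__":
    for mode in ("gamma", "pi0"):
        print(mode, branching_ratio("B0", "Control", mode),
              branching_ratio("Bs0", "Control", mode),
              branching_ratio("B0", "Bs0", mode))
```

With the Table 11.1 inputs, this approximation lands close to the central values quoted above, which suggests the averaged efficiencies capture most of the per-candidate correction. A positive `rho_ab` reduces the uncertainty returned by `ratio_uncertainty`, as described for the $D^0\pi^0$ channels.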


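With the consideration of the systematic uncertainties, the separate branching fraction ratios can be combined to provide a single measurement. The basis of the combination is the familiar equation that minimises the $\chi^2$,

$$
\mathcal{B}_{\text{Comb.}} = \frac{\dfrac{\mathcal{B}_{\gamma}}{\delta\mathcal{B}_{\gamma}^2} + \dfrac{\mathcal{B}_{\pi^0}}{\delta\mathcal{B}_{\pi^0}^2}}{\dfrac{1}{\delta\mathcal{B}_{\gamma}^2} + \dfrac{1}{\delta\mathcal{B}_{\pi^0}^2}}
\quad \text{with} \quad
\delta\mathcal{B}_{\text{Comb.}} = \frac{1}{\sqrt{\dfrac{1}{\delta\mathcal{B}_{\gamma}^2} + \dfrac{1}{\delta\mathcal{B}_{\pi^0}^2}}} . \tag{11.3}
$$

For what follows, it is helpful to note that the equation for $\mathcal{B}_{\text{Comb.}}$ is fundamentally a linear combination of $\mathcal{B}_{\gamma}$ and $\mathcal{B}_{\pi^0}$, and can be written

$$
\mathcal{B}_{\text{Comb.}} = F\,\mathcal{B}_{\gamma} + (1 - F)\,\mathcal{B}_{\pi^0} , \tag{11.4}
$$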

where $F$ depends only on the uncertainties $\delta\mathcal{B}_{\gamma}$ and $\delta\mathcal{B}_{\pi^0}$. By separating the sources of systematic uncertainty into those that are completely uncorrelated (i.e. 0% correlation) and those that are completely correlated (i.e. ±100% correlation), as done in Sect. 10.3, the combination to obtain the central value can be carried out including only the former sources. Specifically, $\delta\mathcal{B}_{\gamma,\pi^0} = \sqrt{\sigma_{\text{stat}}^2 + \left(\sigma_{\text{sys}}^{\dagger\,\text{total}}\right)^2}$, where the corresponding values are given in Table 10.26. This fixes the value of $F$ in Eq. (11.4), which does not change subsequently. Note that the separate contributions of the statistical and uncorrelated systematic uncertainties can be obtained using Eq. (11.4) with simple error propagation, e.g.

$$
(\delta\mathcal{B}_{\text{Comb.}})_{\text{stat}} = \sqrt{F^2\,(\delta\mathcal{B}_{\gamma})^2_{\text{stat}} + (1 - F)^2\,(\delta\mathcal{B}_{\pi^0})^2_{\text{stat}}} . \tag{11.5}
$$

Finally, we can include the correlated systematic uncertainties. This is trivial for sources such as $f_s/f_d$, where the relative uncertainty is identical for both the $\gamma$ and $\pi^0$ modes, as the relative uncertainty is also the same on the combined result. For the more general case, simple error propagation can again be used, but as the uncertainty is correlated the addition is linear rather than in quadrature,

$$
(\delta\mathcal{B}_{\text{Comb.}})_{\text{corr}} = F\,(\delta\mathcal{B}_{\gamma})_{\text{corr}} + (1 - F)\,(\delta\mathcal{B}_{\pi^0})_{\text{corr}} . \tag{11.6}
$$
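The combination procedure of Eqs. (11.3)–(11.6) can be sketched as follows. This is a minimal illustration, not the code used in the analysis: the function and field names are hypothetical, and the example call sets the uncorrelated systematic uncertainties to zero because the per-mode values of Table 10.26 are not reproduced here, so its output is not expected to match Table 11.2.

```python
import math
from collections import namedtuple

Combined = namedtuple("Combined", "central weight_f stat uncorr corr")


def combine(b_gamma, b_pi0, stat, uncorr, corr):
    """Combine the gamma and pi0 measurements of one branching fraction ratio.

    stat, uncorr and corr are (gamma, pi0) pairs of absolute uncertainties:
    statistical, uncorrelated systematic and fully correlated systematic.
    """
    # Only statistical and uncorrelated sources enter the weights.
    var_gamma = stat[0] ** 2 + uncorr[0] ** 2
    var_pi0 = stat[1] ** 2 + uncorr[1] ** 2
    # F of Eq. (11.4), derived from the inverse-variance weights of Eq. (11.3).
    f = (1.0 / var_gamma) / (1.0 / var_gamma + 1.0 / var_pi0)
    central = f * b_gamma + (1.0 - f) * b_pi0
    # Eq. (11.5): quadrature propagation for uncorrelated contributions.
    s_stat = math.sqrt((f * stat[0]) ** 2 + ((1.0 - f) * stat[1]) ** 2)
    s_uncorr = math.sqrt((f * uncorr[0]) ** 2 + ((1.0 - f) * uncorr[1]) ** 2)
    # Eq. (11.6): linear propagation for fully correlated contributions.
    s_corr = f * corr[0] + (1.0 - f) * corr[1]
    return Combined(central, f, s_stat, s_uncorr, s_corr)


if __name__ == "__main__":
    # Bs0 / control ratios quoted in the text (statistical uncertainties only).
    # A 3.1% relative correlated uncertainty stands in for the fs/fd source.
    result = combine(
        1.19, 1.10,
        stat=(0.03, 0.04),
        uncorr=(0.0, 0.0),  # assumption: omitted for illustration
        corr=(0.031 * 1.19, 0.031 * 1.10),
    )
    print(result)
```

Because the correlated term is propagated linearly with the same weights as the central value, a source with the same relative size in both modes keeps that relative size on the combined result, as stated above for $f_s/f_d$.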

The results of this combination procedure are presented in Table 11.2. The final results of the analysis are

$$
\frac{\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-)}{\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)} = 0.0836 \pm 0.0043 \pm 0.0056\,, \tag{11.7}
$$

$$
\frac{\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)}{\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)} = 1.178 \pm 0.029 \pm 0.091 \pm 0.037\,, \tag{11.8}
$$

$$
\frac{\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-)}{\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+)} = 0.0712 \pm 0.0035 \pm 0.0062 \pm 0.0022\,, \tag{11.9}
$$

Table 11.2 Combined results for the branching fraction ratios. The contributions from statistical and systematic uncertainties, and from $f_s/f_d$, are given separately, where the systematic uncertainty is the sum in quadrature of the correlated and uncorrelated sources, obtained as described in the text. The total uncertainty, σTOTAL, is the sum in quadrature of all sources

Ratio                  Comb. result   σstat            σsys             σfs/fd           σTOTAL
B(B0)/B(Control)       0.0836         0.0043 (5.1%)    0.0056 (6.7%)    -                0.0070 (8.4%)
B(Bs0)/B(Control)      1.178          0.029 (2.4%)     0.091 (7.7%)     0.037 (3.1%)     0.102 (8.6%)
B(B0)/B(Bs0)           0.0712         0.0035 (4.9%)    0.0062 (8.7%)    0.0022 (3.1%)    0.0074 (10.4%)

where the uncertainties are statistical, systematic and (where given) due to $f_s/f_d$. These correspond to the first observations of the $B^0 \to D^{*}(2007)^0 K^+\pi^-$ and $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ decays.

Finally, measurements of the absolute branching fractions for $B^0_{(s)} \to D^{*}(2007)^0 K^{\pm}\pi^{\mp}$ decays are provided by making use of the previous result on the branching fraction of the control channel, $\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-) = (6.2 \pm 1.2 \pm 1.8) \times 10^{-4}$ [5]. The absolute branching fractions are

$$
\begin{aligned}
\mathcal{B}(B^0 \to D^{*}(2007)^0 K^+\pi^-) &= (5.18 \pm 0.27 \pm 0.34 \pm 1.84) \times 10^{-5}\,, \\
\mathcal{B}(B_s^0 \to D^{*}(2007)^0 K^-\pi^+) &= (7.30 \pm 0.18 \pm 0.56 \pm 2.59 \pm 0.23) \times 10^{-4}\,,
\end{aligned}
$$

where the uncertainties are statistical, systematic, due to $\mathcal{B}(B^0 \to D^{*}(2007)^0 \pi^+\pi^-)$ and (where given) due to $f_s/f_d$. The large uncertainty on the control channel branching fraction dominates this result, which could motivate a future study of $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$ decays. Such an analysis would benefit strongly from most of the techniques and studies developed in this thesis.
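As a cross-check of the numbers above, the short sketch below recomputes the total uncertainties of Table 11.2 as quadrature sums and scales the combined ratios by the central value of the control channel branching fraction. The names are hypothetical, and the propagation of the control channel uncertainty is deliberately left out, since the text does not spell out how its components were combined.

```python
import math

CONTROL_BF = 6.2e-4  # central value of B(B0 -> D*(2007)0 pi+ pi-) from [5]

# (combined result, sigma_stat, sigma_sys, sigma_fs/fd) from Table 11.2;
# None marks a source that does not apply to the ratio.
TABLE_11_2 = {
    "B0/Control": (0.0836, 0.0043, 0.0056, None),
    "Bs0/Control": (1.178, 0.029, 0.091, 0.037),
    "B0/Bs0": (0.0712, 0.0035, 0.0062, 0.0022),
}


def total_uncertainty(key):
    """Sum in quadrature of all applicable uncertainty sources."""
    _, *sources = TABLE_11_2[key]
    return math.sqrt(sum(s * s for s in sources if s is not None))


def absolute_branching_fraction(key):
    """Scale a ratio to the control channel; returns (central, stat, sys, fs/fd)."""
    return tuple(
        None if value is None else value * CONTROL_BF
        for value in TABLE_11_2[key]
    )


if __name__ == "__main__":
    for key in TABLE_11_2:
        print(key, "total uncertainty:", total_uncertainty(key))
    print("B0 :", absolute_branching_fraction("B0/Control"))
    print("Bs0:", absolute_branching_fraction("Bs0/Control"))
```

The scaled central values and the statistical, systematic and $f_s/f_d$ terms reproduce the quoted absolute branching fractions to within rounding.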

11.2 Dalitz Plot Distributions

In addition to the branching fractions, further results can be obtained as a by-product of the sWeights procedure, which provides background-subtracted SDP distributions for all the studied decay channels. These are shown in Fig. 9.2. However, to interpret them in terms of the physical distributions, i.e. in terms of resonant contributions to the decays, efficiency corrections must also be applied. This is simple to do with the weighting procedure described in Sect. 9, which also allows the distributions obtained from the $D^{*}(2007)^0 \to D^0\gamma$ and $D^{*}(2007)^0 \to D^0\pi^0$ modes to be combined. The outcome is shown in Fig. 11.1 for both the SDP and conventional Dalitz plot representations of the two-dimensional phase space. The conventional Dalitz plots are made with $m^2(h^+h^-)$ on the x-axis and $m^2(D^{*0}h^-)$ on the y-axis, so that the expected resonance structures appear as horizontal and vertical bands in all three plots. The projections onto the two-body invariant masses are shown in Fig. 11.2, which provide more clarity on the nature of such structures.

Fig. 11.1 (Left) SDP and (right) conventional Dalitz plot distributions of background-subtracted and efficiency-corrected data, from $D^{*}(2007)^0 \to D^0\gamma$ and $D^{*}(2007)^0 \to D^0\pi^0$ combined. (Top) $B^0 \to D^{*}(2007)^0 K^+\pi^-$, (middle) $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ and (bottom) $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$

Fig. 11.2 The three two-body invariant mass combinations, as indicated by the axis labels, for background-subtracted and efficiency-corrected data from $D^{*}(2007)^0 \to D^0\gamma$ and $D^{*}(2007)^0 \to D^0\pi^0$ combined. (Top) $B^0 \to D^{*}(2007)^0 K^+\pi^-$, (middle) $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ and (bottom) $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$. Note that reflections from structures in one invariant mass combination also show up in the other distributions for the same $B^0_{(s)}$ decay

A clear $K^{\pm}\pi^{\mp}$ structure, corresponding to the $K^{*}(892)^0$ resonance, can be seen in both the $B^0 \to D^{*}(2007)^0 K^+\pi^-$ and $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ channels. Similarly, a $\pi^+\pi^-$ structure corresponding to the $\rho(770)^0$ resonance can be seen in the control channel. Narrow $D^{*0}h^-$ structures can also be seen in all channels, corresponding to the $D_1(2420)^-$ resonance in $B^0 \to D^{*}(2007)^0 K^+\pi^-$ and $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$ decays, and to the $D_{s1}(2536)^-$ resonance in $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$ decays. Broader structures also appear to be present in all $h^+h^-$ and $D^{*0}h^-$ combinations, but there are no evident additional narrow resonances. The structures observed here are reasonably consistent with those anticipated when developing models for these channels, which were needed to describe their shapes when they appear as misidentified backgrounds, as discussed in Sect. 7. This provides confidence that the shapes used do not introduce significant biases into the results of the fit to the invariant mass distributions.

While a full amplitude analysis of these distributions is beyond the scope of this thesis, this first look at the Dalitz plot distributions of these decays provides crucial input for several ongoing $B^0_{(s)} \to DX$ analyses within LHCb. Lack of knowledge of these distributions is a significant source of systematic uncertainty for various analyses aiming to measure the angle $\gamma$ of the CKM unitarity triangle. The information presented here can be used to help reduce these uncertainties in future measurements by using the efficiency-corrected phase space distribution as a data-driven template to characterize these decays. Moreover, the contributions observed in the $B^0 \to D^{*}(2007)^0 K^+\pi^-$ decay, with a prominent $K^{*}(892)^0$ resonant band overlapping with a $D_1(2420)^-$ component, indicate that this channel could itself be used in future to determine $\gamma$ using approaches similar to those proposed to study $B^0 \to DK^+\pi^-$ decays [6–9]. Although this appears extremely challenging with the currently available data, it opens a new window for $\gamma$ measurements in the next stage of the LHCb experiment.
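The background subtraction and efficiency correction described in this section amount to filling histograms with a per-candidate weight equal to the signal weight divided by the efficiency at the candidate's phase space position. A minimal one-dimensional sketch follows; the function name, the toy inputs and the choice to skip candidates with non-positive efficiency are assumptions made for illustration and are not taken from the analysis code.

```python
from bisect import bisect_right


def corrected_histogram(values, signal_weights, efficiencies, edges):
    """Fill a histogram with weights w_s / eps for each candidate.

    values         -- observable per candidate (e.g. a two-body invariant mass)
    signal_weights -- per-candidate signal weights from the mass fit
    efficiencies   -- efficiency evaluated at each candidate's SDP position
    edges          -- increasing bin edges; the last bin excludes its upper edge
    """
    contents = [0.0] * (len(edges) - 1)
    for value, weight, eff in zip(values, signal_weights, efficiencies):
        if eff <= 0.0:
            continue  # assumption: candidates without a valid efficiency are dropped
        index = bisect_right(edges, value) - 1
        if 0 <= index < len(contents):
            contents[index] += weight / eff
    return contents


if __name__ == "__main__":
    # Toy inputs only; signal weights can be negative for background-like candidates.
    toy = corrected_histogram(
        values=[0.1, 0.6, 0.7],
        signal_weights=[1.0, -0.2, 1.2],
        efficiencies=[0.5, 0.25, 0.5],
        edges=[0.0, 0.5, 1.0],
    )
    print(toy)
```

Since individual signal weights can be negative, bins with few signal candidates can end up with negative contents, which is consistent with the negative axis ranges visible in the figures.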

11.3 Final Conclusions

To conclude, this thesis presents the first observation of the decay channels $B^0 \to D^{*}(2007)^0 K^+\pi^-$ and $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$, and their branching fractions are measured with respect to the control channel $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$ with a total relative uncertainty of 8%. The uncertainty is dominated by systematic sources, owing to the number of backgrounds that need to be considered for this study. The measurement is based on the data sample recorded by the LHCb experiment during the years 2016–2018, which corresponds to 5.4 fb−1.

An absolute measurement of these branching fractions is also given, which relies on pre-existing knowledge of the control channel $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$ branching fraction. Its large uncertainty, of around 35%, dominates the uncertainty of the absolute measurement. This could motivate a future study of the branching fraction of the control channel with respect to, for example, the $B^+ \to D^{*}(2007)^0 \pi^+$ decay channel.

The impact of this study, however, goes beyond the first observation of these channels. It also provides their phase space distributions. These will provide crucial inputs to several analyses, such as the $\gamma$ measurement using $B^0 \to DK^+\pi^-$ decays, in which the studied decay is a source of partially reconstructed background. Using these phase space distributions to characterize these decays could dramatically decrease the systematic uncertainty of such studies. In addition, the results given in this thesis could motivate a further amplitude analysis, which could potentially lead to the observation of new resonant states in the charm and charm-strange sectors.

Moreover, a future amplitude analysis of these decays could be used as an independent measurement of the $\gamma$ angle. This, however, would require the inclusion of additional final states in the analysis (i.e. the CP eigenstates $D \to K^+K^-$ and $D \to \pi^+\pi^-$), which would prove extremely challenging without a larger data sample. Thus, the results of this thesis must also be considered from the perspective of the upcoming Run 3 period of data taking of the LHCb detector, serving as motivation for a future analysis providing an additional measurement of the $\gamma$ angle of the CKM matrix, which could contribute to the LHCb objective of improving its precision towards the sub-degree level.

Additionally, this analysis includes several improvements in different techniques that can easily be carried over to many other analyses. On the one hand, this study is the first analysis within LHCb featuring fully reconstructed $D^{*}(2007)^0$ decays. This opens a new window for CP measurements using $B$ to $D^{*}(2007)^0$ hadron decays, by giving crucial insight into the challenges that reconstructing soft neutral particles presents in LHCb, and could have a direct impact on the precision of the $\gamma$ angle, and hence of the CP parameters of the CKM matrix. On the other hand, this analysis pioneers the usage of the extended sWeights method, offering an alternative to the classical sWeight method that allows the inclusion of yield constraints in the invariant mass fit model. This method proved to be extremely useful and can be applied to a large number of analyses, within and outside the LHCb collaboration.

References

1. LHCb collaboration, Aaij R et al (2021) Precise measurement of the $f_s/f_d$ ratio of fragmentation fractions and of $B_s^0$ decay branching fractions. Phys Rev D104:032005, arXiv:2103.06810
2. BESIII collaboration, Ablikim M et al (2015) Precision measurement of the $D^{*0}$ decay branching fractions. Phys Rev D91:031101, arXiv:1412.4566
3. LHCb collaboration, Aaij R et al (2018) Measurement of CP observables in $B^{\pm} \to D^{(*)}K^{\pm}$ and $B^{\pm} \to D^{(*)}\pi^{\pm}$ decays. Phys Lett B777:16, arXiv:1708.06370
4. LHCb collaboration, Aaij R et al (2021) Measurement of CP observables in $B^{\pm} \to D^{(*)}K^{\pm}$ and $B^{\pm} \to D^{(*)}\pi^{\pm}$ decays using two-body D final states. JHEP 04:081, arXiv:2012.09903
5. Belle collaboration, Study of $B^0 \to D^{(*)0}\pi^+\pi^-$ decays. arXiv:hep-ex/0412072
6. Gershon T (2009) On the measurement of the Unitarity Triangle angle γ from $B^0 \to DK^{*0}$ decays. Phys Rev D79:051301, arXiv:0810.2706
7. Gershon T, Williams M (2009) Prospects for the measurement of the Unitarity Triangle angle γ from $B^0 \to DK^+\pi^-$ decays. Phys Rev D80:092002, arXiv:0909.1495
8. Gershon T, Poluektov A (2010) Double Dalitz plot analysis of the decay $B^0 \to DK^+\pi^-$, $D \to K_S^0\pi^+\pi^-$. Phys Rev D81:014025, arXiv:0910.5437
9. Craik D, Gershon T, Poluektov A (2018) Optimising sensitivity to γ with $B^0 \to DK^+\pi^-$, $D \to K_S^0\pi^+\pi^-$ double Dalitz plot analysis. Phys Rev D97:056002, arXiv:1712.07853

Appendix A

Selection Variables Description

The stripping and selection steps make use of geometric and kinematic properties of the different reconstructed particles to differentiate between signal and background candidates. Some of these variables are used in most HEP experiments. However, several of them are specific to LHCb, owing to the detector's geometry, which is unusual for a collider experiment. In this section, the complete list of variables used in this analysis is described:

• $p$ and $p_T$: momentum and transverse momentum, with respect to the beam direction, of a reconstructed particle, in MeV/c.
• $\tau_{\text{reconstructed}}$: lifetime of a reconstructed particle, in ps, measured as the distance between the origin and decay vertices of the particle, divided by $pc/m$.
• $m_{\text{reconstructed}}$: reconstructed mass of a particle, in MeV/c².
• Minimum impact parameter $\chi^2$ (min $\chi^2_{IP}$): minimum distance between the reconstructed momentum of a particle and the associated vertex, divided by its uncertainty. Only the vertex with the minimum distance is considered the associated vertex.
• $\cos\theta_{\text{dir}}$: cosine of the angle between the particle momentum, computed from the sum of its decay products, and the line between the origin and decay vertices.
• Vertex $\chi^2$/dof: sum of minimum distances between a reconstructed particle decay vertex and the reconstructed tracks of its decay products, divided by their uncertainties and the number of decay products.
• Distance from PV $\chi^2$: distance between the decay vertex of a particle and the primary vertex associated to the candidate, divided by its uncertainty.
• Track $\chi^2$/dof: sum of the squares of the distances between all recorded hits and the reconstructed track, divided by their squared uncertainties and the number of hits.
• Ghost probability: probability that a track is reconstructed from random hits instead of a real particle. For a long track to be considered to be reconstructed from a real particle, at least 3 hits in the VELO detector, 1 hit in each layer of the TT detector and two hits in each of the 3 T Stations must be used for the reconstruction (see Sect. 3.2.6).
• Distance of closest approach (DOCA): minimum distance between a pair of tracks, in mm.
• Photon confidence level (CL): probability that a reconstructed calorimeter cluster corresponds to a photon, computed from the shape of the cluster in the electromagnetic calorimeter [1] (see Sect. 3.2.6).
• B2C BDT output: output of a boosted decision tree trained to identify B hadron decays to charmed hadrons within LHCb [2].
• Flying distance $\chi^2$: square of the distance between the reconstructed decay and origin vertices, divided by the squared uncertainties of the vertices. $\chi^2_{ZDF}$ ($\chi^2_{TDF}$) refers to the distance projected along the beam (transverse to the beam) direction.
• VELO ($\chi^2$/dof track): sum of the squares of the distances between the VELO recorded hits and the reconstructed track, divided by their squared uncertainties and the number of hits.
• TT ($\chi^2$/dof track): sum of the squares of the distances between the TT recorded hits and the reconstructed track, divided by their squared uncertainties and the number of hits.
• D PID: output of the neural network trained to reject charmless background and candidates with charm mesons that originated from the primary vertex (see Sect. 6.3).
• A PIDB: probability of the particle A being properly identified as a particle of type B. This refers to the variable A_ProbNNB, which combines information from all particle identification subdetectors within the LHCb detector (see Sect. 3.2.6).
• max/min ($\chi^2_{IP}$) from particle combination: maximum/minimum of the min $\chi^2_{IP}$ values in a group of particles. It is computed from a collection of final-state particles and used in preference to min $\chi^2_{IP}$ in order to remove correlations with the reconstructed B mass.
• Cone 1, 2 and 3 $p_T$ asymmetry: $p_T$ asymmetry computed as

$$
p_T^{\text{asym}} = \frac{p_T^{B^0_{(s)}} - p_T^{\text{cone}}}{p_T^{B^0_{(s)}} + p_T^{\text{cone}}} , \tag{A.1}
$$

where $p_T^{\text{cone}}$ is the transverse component of the sum of all particle momenta within a cone with radius R = 1.5, 1.7 and 1 radians around the $B^0_{(s)}$ candidate. Here $R = \sqrt{\delta\eta^2 + \delta\phi^2}$, where $\delta\eta$ and $\delta\phi$ are the differences in pseudorapidity and in azimuthal angle around the beam direction, respectively, with respect to the $B^0_{(s)}$ candidate.
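The cone $p_T$ asymmetry of Eq. (A.1) can be sketched as below. This is an illustrative implementation only: the particle representation (a tuple of $p_x$, $p_y$ and pseudorapidity) and the function names are hypothetical, and it is assumed that the list of other particles already excludes the decay products of the candidate itself.

```python
import math


def delta_phi(phi_a, phi_b):
    """Azimuthal angle difference wrapped into (-pi, pi]."""
    diff = (phi_a - phi_b) % (2.0 * math.pi)
    return diff - 2.0 * math.pi if diff > math.pi else diff


def cone_pt_asymmetry(candidate, others, radius):
    """pT asymmetry of Eq. (A.1) for a cone of the given radius.

    candidate -- (px, py, eta) of the B candidate
    others    -- iterable of (px, py, eta) for the other particles in the event
                 (assumption: the candidate's own decay products are excluded)
    """
    cand_px, cand_py, cand_eta = candidate
    cand_phi = math.atan2(cand_py, cand_px)
    sum_px = sum_py = 0.0
    for px, py, eta in others:
        d_eta = eta - cand_eta
        d_phi = delta_phi(math.atan2(py, px), cand_phi)
        if math.hypot(d_eta, d_phi) < radius:
            # Vector sum of momenta inside the cone.
            sum_px += px
            sum_py += py
    pt_cone = math.hypot(sum_px, sum_py)
    pt_cand = math.hypot(cand_px, cand_py)
    return (pt_cand - pt_cone) / (pt_cand + pt_cone)


if __name__ == "__main__":
    b_candidate = (3000.0, 4000.0, 3.0)       # pT = 5000 MeV/c
    event = [(600.0, 800.0, 3.2),             # close to the candidate
             (-100.0, 0.0, 3.0)]              # far away in azimuth
    for r in (1.5, 1.7, 1.0):
        print(r, cone_pt_asymmetry(b_candidate, event, r))
```

An isolated candidate, with no other particles in the cone, gives an asymmetry of 1, while activity around the candidate pulls the value down.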


References

1. LHCb collaboration, Calvo Gomez et al (2015) A tool for γ/π0 separation at high energies. LHCb-PUB-2015-016. http://cdsweb.cern.ch/search?p=LHCb-PUB-2015-016&f=reportnumber&action_search=Search&c=LHCb+Notes
2. Gligorov VV, Williams M (2013) Efficient, reliable and fast high-level triggering using a bonsai boosted decision tree. JINST 8:P02013. https://doi.org/10.1088/1748-0221/8/02/P02013. arXiv:1210.6861

Appendix B

Data/MC Agreement of BDT Input Variables

Distributions of the variables used as inputs for both BDTs in the selection process are studied to check for possible disagreements between data and MC. Similarly, the BDT output is also compared between data and MC. The data distributions are obtained using the sWeight method discussed in Chap. 9 (since this relies on the output of the fit to the B0(s) -candidate mass, this comparison can only be done post hoc). This approach ought to be reliable, since only variables that do not exhibit strong correlations with the B0(s) -candidate mass are used as BDT inputs. However, it relies on having a sufficient signal yield and therefore is done only for the D∗ (2007)0 → D0 γ modes, except for the variables which are specific to the D∗ (2007)0 → D0 π 0 decay. Comparisons, between background-subtracted data and MC, of the BDT output distributions for the signal and control channels are shown in Fig. B.1. Generally good agreement is observed, albeit with a small discrepancy in the combinatorial background BDT output. Further comparisons for each of the BDT input variables are shown in Figs. B.2, B.3, B.4 for B0 → D∗ (2007)0 K+ π − decays, in Figs. B.5, B.6, B.7 for B0s → D∗ (2007)0 K− π + decays and in Figs. B.8, B.9, B.10 for B0 → D∗ (2007)0 π + π − decays. The agreement is excellent for most variables, but some discrepancy between background-subtracted data and MC can be seen in the pT asymmetry variables. This is considered as a source of systematic uncertainty in Sect. 10.1.5.


Fig. B.1 Comparisons, between background-subtracted data and MC, of the BDT output distributions for the (top) $B^0 \to D^{*}(2007)^0 K^+\pi^-$, (middle) $B_s^0 \to D^{*}(2007)^0 K^-\pi^+$, and (bottom) $B^0 \to D^{*}(2007)^0 \pi^+\pi^-$ channels. In all cases only the $D^{*}(2007)^0 \to D^0\gamma$ mode is shown. The left (right) plots show the misID (combinatorial background) BDT outputs

Fig. B.2 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the $B^0 \to D^{*}(2007)^0 K^+\pi^-$, $D^{*}(2007)^0 \to D^0\gamma$ decay. Variables shown: SUM($p_T$), B flight distance $\chi^2$, D flight distance $\chi^2$, B end vertex $\chi^2$, $\cos\theta_{\text{dir}}$ of the B candidate and D z flight distance $\chi^2$

Fig. B.3 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the $B^0 \to D^{*}(2007)^0 K^+\pi^-$, $D^{*}(2007)^0 \to D^0\gamma$ decay. Variables shown: D transversal flight distance $\chi^2$, D PID, Max(K-π) min IP $\chi^2$, Max(DK-Dπ) min IP $\chi^2$, Min(DK-Dπ) min IP $\chi^2$, B min IP $\chi^2$, K PIDK and K PIDpi

Fig. B.4 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the $B^0 \to D^{*}(2007)^0 K^+\pi^-$, $D^{*}(2007)^0 \to D^0\gamma$ decay, except for the bottom right plot, which corresponds to $B^0 \to D^{*}(2007)^0 K^+\pi^-$, $D^{*}(2007)^0 \to D^0\pi^0$. Variables shown: π PIDK, π PIDpi, γ CL, γ $p_T$, B $p_T$ asymmetry (Cone1, Cone2 and Cone3) and π0 $p_T$

[Figure B.5: panels comparing signal-weighted data with signal MC (vertical axes in A.U.) for SUM(PT), Bs flight distance χ², D flight distance χ², Bs end vertex χ², cos θdir (Bs) and D z flight distance χ².]

Fig. B.5 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the B0s → D*(2007)0 K− π+, D*(2007)0 → D0 γ decay

[Figure B.6: panels comparing signal-weighted data with signal MC (vertical axes in A.U.) for D transversal flight distance χ², D PID, Max(K-π) min IP χ², Max(DK-Dπ) min IP χ², Min(DK-Dπ) min IP χ², Bs min IP χ², K PIDK and K PIDπ.]

Fig. B.6 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the B0s → D*(2007)0 K− π+, D*(2007)0 → D0 γ decay

[Figure B.7: panels comparing signal-weighted data with signal MC (vertical axes in A.U.) for π PIDK, π PIDπ, γ CL, γ PT, Bs PT asymmetry (Cone1), Bs PT asymmetry (Cone2), Bs PT asymmetry (Cone3) and π0 PT.]

Fig. B.7 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the B0s → D*(2007)0 K− π+, D*(2007)0 → D0 γ decay, except for the bottom right plot, which corresponds to B0s → D*(2007)0 K− π+, D*(2007)0 → D0 π0

[Figure B.8: panels comparing signal-weighted data with signal MC (vertical axes in A.U.) for SUM(PT), B flight distance χ², D flight distance χ², B end vertex χ², cos θdir (B) and D z flight distance χ².]

Fig. B.8 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the B0 → D*(2007)0 π+ π−, D*(2007)0 → D0 γ decay

[Figure B.9: panels comparing signal-weighted data with signal MC (vertical axes in A.U.) for D transversal flight distance χ², D PID, Max(K-π) min IP χ², Max(DK-Dπ) min IP χ², Min(DK-Dπ) min IP χ², B min IP χ², π+ PIDK and π+ PIDπ.]

Fig. B.9 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the B0 → D*(2007)0 π+ π−, D*(2007)0 → D0 γ decay

[Figure B.10: panels comparing signal-weighted data with signal MC (vertical axes in A.U.) for π− PIDK, π− PIDπ, γ CL, γ PT, B PT asymmetry (Cone1), B PT asymmetry (Cone2), B PT asymmetry (Cone3) and π0 PT.]

Fig. B.10 Comparisons, between background-subtracted data and MC, of the input variables used in the BDTs. All distributions correspond to the B0 → D*(2007)0 π+ π−, D*(2007)0 → D0 γ decay, except for the bottom right plot, which corresponds to B0 → D*(2007)0 K+ π−, D*(2007)0 → D0 π0

Appendix C

Signal MC Correlation Studies

Since the absence of correlation between the reconstructed B0(s) mass and the SDP variables is a necessary requirement for the efficiency corrections discussed in Chap. 9, further studies have been performed to search for non-trivial correlations. The signal B0(s)-mass distribution is plotted in several independent regions of the SDP, as shown in Fig. C.1. These distributions are fitted with a simple Gaussian function, as shown in Fig. C.2, and the fitted parameters in the different regions are compared in Table C.1. No evident trend is observed in the means of the distributions. A small trend can be observed in the widths, with the narrowest values found in the SDP centre (m = [0.33, 0.66], θ = [0.33, 0.66]) being O(10%) lower than the values at the edges of the SDP. This variation of the signal shape is considered small enough that it will not significantly bias the weights obtained from the mass fit, and hence has negligible impact on the results of the analysis. This check does not guarantee that no correlation exists between the B0(s)-candidate mass and the SDP variables. Nonetheless, in conjunction with the Kendall coefficient study shown in Sect. 9, it gives confidence in the use of the extended sWeights technique and in the validity of the results obtained with it.
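The region-by-region check described above can be illustrated with a minimal sketch. It is not the analysis code: the toy sample, the function names and the region edges (0.33, 0.66) as coded here are assumptions made for illustration, and the sample mean and standard deviation stand in for the Gaussian fit (they are the maximum-likelihood estimates for an unbinned Gaussian sample).

```python
import random
import statistics


def sdp_region(m_prime, theta_prime, edges=(0.33, 0.66)):
    """Return the (i, j) indices of the 3x3 SDP region containing a point."""
    def index(x):
        # Number of edges at or below x: 0, 1 or 2.
        return sum(x >= e for e in edges)
    return index(m_prime), index(theta_prime)


def gaussian_fit_per_region(candidates):
    """Group candidate masses by SDP region and return {region: (mu, sigma)}.

    Each candidate is a tuple (mass, m_prime, theta_prime).
    """
    masses = {}
    for mass, m_prime, theta_prime in candidates:
        masses.setdefault(sdp_region(m_prime, theta_prime), []).append(mass)
    return {
        region: (statistics.fmean(values), statistics.pstdev(values))
        for region, values in masses.items()
    }


# Toy sample (hypothetical numbers): the mass is generated independently
# of the SDP position, so no trend in mu or sigma is expected.
rng = random.Random(1)
toy = [(rng.gauss(5280.0, 13.0), rng.random(), rng.random()) for _ in range(9000)]

results = gaussian_fit_per_region(toy)
for region in sorted(results):
    mu, sigma = results[region]
    print(region, round(mu, 2), round(sigma, 2))
```

Comparing the per-region values of mu and sigma printed by the loop mirrors the comparison made in Table C.1; a dependence of the width on the region would show up as a systematic difference between the central and the edge regions.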

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 A. B. Gonzalo, First Observation of Fully Reconstructed B 0 and Bs0 Decays into Final States Involving an Excited Neutral Charm Meson in LHCb, Springer Theses, https://doi.org/10.1007/978-3-031-22753-0


[Figure C.1: B0(s)-mass distributions (vertical axes in A.U.) as a function of the (D*(2007)0 K+ π−), (D*(2007)0 K− π+) and (D*(2007)0 π+ π−) mass in MeV/c2, overlaid for the nine SDP regions defined by M′ and Θ′ in the intervals [0, 0.33], [0.33, 0.66] and [0.66, 1].]

Fig. C.1 B0(s)-mass distributions in signal MC samples in different regions of the SDP, as indicated in the legend

[Figure C.2: Gaussian fits to the B0(s)-mass distributions (vertical axes in A.U.) as a function of the (D*(2007)0 K+ π−), (D*(2007)0 K− π+) and (D*(2007)0 π+ π−) mass in MeV/c2, overlaid for the nine SDP regions defined by M′ and Θ′ in the intervals [0, 0.33], [0.33, 0.66] and [0.66, 1].]

Fig. C.2 Gaussian fits of the B0(s)-mass distributions in signal MC samples in different regions of the SDP, as indicated in the legend

Table C.1 Results of Gaussian fits to the B0(s)-mass distributions for signal MC samples in different regions of the SDP. Units of the mean (μ) and width (σ) parameters are in MeV/c2. Entries are given as μ / σ

SDP region                          B0 γ             B0 π0            B0s γ            B0s π0           Control γ        Control π0
m = [0, 0.33], θ = [0, 0.33]        5280.74 / 13.29  5280.41 / 14.82  5367.86 / 13.73  5368.45 / 14.57  5280.59 / 14.96  5281.64 / 15.46
m = [0, 0.33], θ = [0.33, 0.66]     5280.66 / 13.83  5281.50 / 14.37  5367.79 / 13.54  5369.07 / 14.38  5280.59 / 14.89  5281.41 / 16.35
m = [0, 0.33], θ = [0.66, 1]        5280.86 / 13.35  5281.39 / 14.99  5367.94 / 13.73  5369.15 / 14.62  5280.34 / 14.57  5282.14 / 15.79
m = [0.33, 0.66], θ = [0, 0.33]     5280.30 / 12.37  5280.34 / 12.28  5367.26 / 12.67  5368.29 / 12.40  5279.85 / 13.82  5281.11 / 14.26
m = [0.33, 0.66], θ = [0.33, 0.66]  5280.65 / 12.29  5281.22 / 12.93  5367.41 / 12.63  5368.77 / 13.81  5280.53 / 13.66  5280.71 / 14.71
m = [0.33, 0.66], θ = [0.66, 1]     5280.90 / 12.66  5281.69 / 13.72  5368.19 / 12.76  5368.89 / 14.50  5280.85 / 13.53  5281.45 / 15.69
m = [0.66, 1], θ = [0, 0.33]        5280.18 / 12.73  5281.00 / 13.19  5367.43 / 12.77  5368.25 / 13.81  5280.35 / 14.27  5282.01 / 14.81
m = [0.66, 1], θ = [0.33, 0.66]     5280.37 / 12.92  5280.90 / 14.19  5367.77 / 13.15  5367.57 / 14.68  5280.43 / 15.01  5281.81 / 15.65
m = [0.66, 1], θ = [0.66, 1]        5280.35 / 13.49  5281.65 / 13.81  5367.88 / 13.19  5368.89 / 14.96  5280.54 / 14.91  5281.81 / 16.79

Appendix D

Effects of Fit Constraints on Yield Statistical Uncertainties

High correlation coefficients between the fitted yields in the channels with D*(2007)0 → D0 π0 decay result in a notable decrease of the statistical uncertainty of the branching fractions involving these channels. This is a direct consequence of the large contribution from misreconstructed signal events in these channels. In the fit, this contribution is set to be equal in all channels by introducing the shared parameter f_mrs^π0, which describes the ratio of misreconstructed to correctly reconstructed signal events. This parameter is found to take the value f_mrs^π0 = 1.5218 ± 0.1630 (Table 8.10), indicating that there is a larger yield of misreconstructed than correctly reconstructed signal. Since this parameter is shared, it introduces large correlations between the signal yields. Moreover, each signal yield with D*(2007)0 → D0 π0 decay itself has a large correlation with f_mrs^π0, as seen in Table 8.12. When assessing the statistical uncertainty on the ratios of the fitted signal yields, one has to take this correlation into account, effectively accounting for the fact that the uncertainty depends on the yield of both correctly reconstructed and misreconstructed signal.

As a cross-check, the simultaneous fit is repeated with different parameters fixed to their values from the baseline fit, and the effects on the uncertainties are studied. This study is shown in Table D.1. The uncertainties on the modes with D*(2007)0 → D0 γ decays are in general reasonably stable, and are not strongly affected by whether nuisance parameters are floated or not.¹ Larger changes in uncertainties are seen for the D*(2007)0 → D0 π0 modes. Fixing the ratios relating misidentified, partially combinatorial and partially reconstructed backgrounds has a minimal effect on the statistical uncertainties of the signal yields. This is expected, as most of these backgrounds have relatively small yields and thus are not expected to play a big role in the mass fit model. However, when fixing the ratios related to misreconstructed signal, the uncertainties of the signal yields in these channels diminish notably, as the statistical uncertainty is then related to the sum of the fully reconstructed and misreconstructed yields. These properties justify the decision to include the correlations between the yields when evaluating the statistical uncertainties on the ratios.

¹ An exception is the B0, γ case, where an unexpected increase in uncertainty is seen when the misreconstructed signal parameters are fixed. This behaviour is not understood but does not affect the conclusions of this study.

Table D.1 Yields and statistical uncertainties in different versions of the fit in which different constraints have been fixed. σstat corresponds to the statistical uncertainty from the baseline fit. σstat(mrs fixed) refers to a version of the fit in which the ratios of misreconstructed signal have been fixed. In σstat(Bg fixed), all ratios that relate misidentified, partially combinatorial and partially reconstructed backgrounds have been fixed. In σstat(All fixed), all the aforementioned ratios have been fixed. In all cases, the ratios have been fixed to the fitted values from the baseline configuration

Channel      Fitted yields  σstat           σstat(Bg fixed)  σstat(mrs fixed)  σstat(All fixed)  √N
B0, γ        946.37         53.40 (5.64%)   48.21 (5.09%)    72.05 (7.61%)     45.20 (4.78%)     30.76 (3.25%)
B0, π0       184.66         17.04 (9.23%)   16.21 (8.78%)    11.24 (6.09%)     11.03 (5.97%)     13.59 (7.36%)
Bs0, γ       3744.32        76.85 (2.05%)   72.12 (1.93%)    69.68 (1.86%)     66.37 (1.77%)     61.19 (1.63%)
Bs0, π0      632.72         45.78 (7.23%)   37.92 (5.99%)    19.08 (3.02%)     17.98 (2.84%)     25.15 (3.98%)
Control, γ   15020.90       217.84 (1.45%)  205.90 (1.37%)   206.18 (1.37%)    189.24 (1.26%)    122.56 (0.82%)
Control, π0  2591.49        189.70 (7.32%)  172.77 (6.67%)   51.33 (1.98%)     49.85 (1.92%)     50.91 (1.96%)
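The role of the correlation in the uncertainty on a ratio of yields can be illustrated with standard first-order error propagation. The sketch below is an illustration under stated assumptions rather than the procedure used in the analysis: the function name is hypothetical, the yields and uncertainties are the baseline values for the B0, π0 and Control, π0 channels quoted in Table D.1, and the correlation coefficients are hypothetical values chosen only to show the trend (the fitted ones are reported in Table 8.12).

```python
import math


def ratio_with_uncertainty(n_a, sigma_a, n_b, sigma_b, rho):
    """First-order error propagation for the ratio R = n_a / n_b.

    rho is the correlation coefficient between the two fitted yields.
    """
    ratio = n_a / n_b
    rel_var = (
        (sigma_a / n_a) ** 2
        + (sigma_b / n_b) ** 2
        - 2.0 * rho * sigma_a * sigma_b / (n_a * n_b)
    )
    # Guard against tiny negative values from rounding when rho is close to 1.
    return ratio, abs(ratio) * math.sqrt(max(rel_var, 0.0))


# Baseline yields and statistical uncertainties quoted in Table D.1.
n_sig, s_sig = 184.66, 17.04    # B0, pi0
n_ctl, s_ctl = 2591.49, 189.70  # Control, pi0

for rho in (0.0, 0.5, 0.9):  # hypothetical correlation coefficients
    r, s = ratio_with_uncertainty(n_sig, s_sig, n_ctl, s_ctl, rho)
    print(f"rho = {rho:.1f}: R = {r:.4f} +/- {s:.4f}")
```

For positively correlated yields the cross term reduces the uncertainty on the ratio relative to the uncorrelated case, which is the effect described in this appendix.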